Speech technology in the contact center: an open standards issue

A Service Oriented Architecture (SOA) gives companies the flexibility to treat elements of business processes and the underlying IT infrastructure as secure, standardized components (services) that can be rapidly reused and combined to address changing business priorities. These services help get the right information to the right people at the right time, enabling a business to respond quickly to threats and opportunities through integrated efforts with partners, suppliers and customers.

The contact center needs to be one of the doorways to the SOA. The keys to retaining customers, reducing costs and creating new cross-sell opportunities and revenue streams reside in the ability of the contact center to quickly offer self-service through multiple channels.

The contact center is transforming from a siloed cost center into an integral part of the company's business strategy through advances in speech technology. Open standards have enabled horizontal integration of the contact center into the mainstream IT infrastructure. Capabilities like speaker verification take interacting via speech to a whole new plane, doing for the telephone what fingerprint technology has done for laptops. With these advances in conversational self-service, contact centers can become profit centers: places to satisfy and retain current customers as well as serve new ones.

For contact centers to grab the attention of the CEO and become more than an expense center, they need to embrace open standards and the creation of reusable business components. SOAs require planning and skill to build, deploy and use these components in a managed and secure environment, and they require a strong middleware presence. The contact center takes advantage of that middleware presence to generate profits, retain customers and lower costs.

Across virtually all industries, companies with contact centers are transforming their customer service operations to reduce costs, increase customer satisfaction, grow revenue and attain competitive advantage. Fundamental to this transformation is the creation, adoption and proliferation of open standards. This phenomenon is happening now in contact centers. The legacy paradigm is that everything runs on the interactive voice response (IVR) platform. The business logic is written in a proprietary language that is unique to each vendor's IVR. Redundancy and failover are difficult and expensive, since everything runs on the same machine. Moving to a new IVR platform means a total rewrite of the applications in a new proprietary language.

Speech entered the picture through proprietary APIs that were different for each IVR vendor, so the availability of choice in speech vendors was conditional on the cooperation between the speech vendor and the IVR vendor.

With the advent and acceptance of VoiceXML, the IVR primarily takes on the job of answering the phone and passing off the call to an application server that sends a VoiceXML page to the IVR's voice browser, where it gets rendered. The business logic is separated from the IVR function of answering and transferring phone calls. Now we can monitor and launch Web applications and speech applications from the same application server. What once used to be two silos of technology is now one horizontal, integrated infrastructure.
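The split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the form id and the "balance" scenario are hypothetical, and a real voice browser does far more than pull prompt text out of a page. The point is simply that the business logic produces a VoiceXML page on the application-server side, and something else renders it.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

VXML_NS = "http://www.w3.org/2001/vxml"

# A minimal VoiceXML 2.0 page; the form id and wording are made up for this sketch.
PAGE_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="balance">
    <block>
      <prompt>Hello {name}. Your balance is {balance}.</prompt>
      <exit/>
    </block>
  </form>
</vxml>
"""


def render_balance_page(name: str, balance: str) -> str:
    """Application-server side: business logic fills in a VoiceXML page."""
    return PAGE_TEMPLATE.format(name=escape(name), balance=escape(balance))


def prompts_in(page: str) -> list[str]:
    """Voice-browser side, greatly simplified: collect the prompts to speak."""
    root = ET.fromstring(page.encode("utf-8"))
    return [(p.text or "").strip() for p in root.iter(f"{{{VXML_NS}}}prompt")]


page = render_balance_page("Pat", "42 dollars")
print(prompts_in(page))  # ['Hello Pat. Your balance is 42 dollars.']
```

Because the page is ordinary markup served by an ordinary application server, the same code could sit next to the Web application that answers the same question in a browser.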

VoiceXML is just one of the open standards that give the customer portability. The Speech Recognition Grammar Specification (SRGS) and Speech Synthesis Markup Language (SSML) are standards for grammar formats and text-to-speech tags, which make not only the applications portable but the associated grammars as well. VoiceXML was first submitted to the W3C on May 22, 2000, and the adoption rates have been through the roof.
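To make the grammar and text-to-speech formats concrete, here is a small Python sketch that emits an SRGS grammar in its XML form and an SSML prompt. The helper names, the rule id and the phrases are assumptions chosen for illustration; only the element names and namespaces come from the W3C specifications.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

SRGS_NS = "http://www.w3.org/2001/06/grammar"
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def srgs_choice_grammar(rule_id: str, phrases: list[str]) -> str:
    """Build an SRGS (XML form) grammar that accepts any one of the phrases.

    rule_id is assumed to be a plain XML name such as "account_type".
    """
    items = "\n".join(f"      <item>{escape(p)}</item>" for p in phrases)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<grammar version="1.0" xmlns="{SRGS_NS}"\n'
        f'         xml:lang="en-US" mode="voice" root="{rule_id}">\n'
        f'  <rule id="{rule_id}" scope="public">\n'
        "    <one-of>\n"
        f"{items}\n"
        "    </one-of>\n"
        "  </rule>\n"
        "</grammar>\n"
    )


def ssml_prompt(text: str, stressed: str) -> str:
    """Build an SSML prompt: plain text, a short pause, then an emphasized word."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang="en-US">\n'
        f'  {escape(text)} <break time="300ms"/>\n'
        f"  <emphasis>{escape(stressed)}</emphasis>\n"
        "</speak>\n"
    )


def phrases_in(grammar: str) -> list[str]:
    """Read the accepted phrases back out of the grammar."""
    root = ET.fromstring(grammar.encode("utf-8"))
    return [item.text for item in root.iter(f"{{{SRGS_NS}}}item")]


grammar = srgs_choice_grammar("account_type", ["checking", "savings"])
print(phrases_in(grammar))  # ['checking', 'savings']
print(ssml_prompt("Your account is", "overdrawn"))
```

Neither document names a recognizer or a synthesizer, which is where the portability comes from: any engine that reads these formats should be able to use them.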

There are thousands of VoiceXML applications running on platforms from nearly 100 different vendors. Not only has the standard taken off, but it has spawned a whole new category of tools vendors who specialize in application builders that generate VoiceXML.

Now that we have choice in the IVR vendor through portable applications built on open standards, how do we get choice in our speech vendor? The answer lies in a new Internet Engineering Task Force (IETF) standard called the Media Resource Control Protocol (MRCP). While the MRCP spec is only two years old, already all three of the major speech vendors claim support for it. With MRCP, the proprietary connectors between the IVR and the speech vendor are gone. Open standards once again create a climate for choice.

Call Control eXtensible Markup Language (CCXML) is a new proposed standard before the W3C that will standardize the call control functions of an IVR, such as "answer a call," "hang up," or "transfer a call."
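As a rough illustration, the Python sketch below embeds a small CCXML document and lists which call control action it takes for each event. The element and event names follow the CCXML 1.0 specification as later published by the W3C, so drafts from the period when this was written may differ. The file name balance.vxml and the helper function are hypothetical, and a transfer is not shown.

```python
import xml.etree.ElementTree as ET

CCXML_NS = "http://www.w3.org/2002/09/ccxml"

# Answer an incoming call, hand it to a VoiceXML dialog, hang up when it ends.
# 'balance.vxml' is a made-up dialog name for this sketch.
CCXML_DOC = """<?xml version="1.0" encoding="UTF-8"?>
<ccxml version="1.0" xmlns="http://www.w3.org/2002/09/ccxml">
  <eventprocessor>
    <transition event="connection.alerting">
      <accept/>
    </transition>
    <transition event="connection.connected">
      <dialogstart src="'balance.vxml'"/>
    </transition>
    <transition event="dialog.exit">
      <disconnect/>
    </transition>
    <transition event="connection.disconnected">
      <exit/>
    </transition>
  </eventprocessor>
</ccxml>
"""


def actions_by_event(doc: str) -> dict[str, list[str]]:
    """Map each handled event to the call control elements it triggers."""
    root = ET.fromstring(doc.encode("utf-8"))
    prefix = f"{{{CCXML_NS}}}"
    table: dict[str, list[str]] = {}
    for transition in root.iter(f"{prefix}transition"):
        event = transition.get("event")
        table[event] = [child.tag.removeprefix(prefix) for child in transition]
    return table


for event, actions in actions_by_event(CCXML_DOC).items():
    print(event, "->", actions)
```

Here "answer a call" corresponds to the accept element and "hang up" to disconnect; the dialog itself stays in VoiceXML, so call control and conversation are described in separate documents.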

As we adopt more open standards in the contact center space, the proprietary nature of IVRs disappears and prices go down. The contact center becomes horizontally integrated with the rest of the IT shop, and economies of scale take shape and drive down the total cost of customer care while improving the customer experience. There are still a lot of legacy, proprietary IVR systems out there from the Y2K buying binge. But as businesses see the need to improve customer care and create new channels of revenue, they will see the business justification for moving to open-standards-based systems that provide reduced cost per touch, portability and protection of their investment, and a merged relationship with their other channels of contact center customer communications.

Time will be the biggest challenge moving forward. It will take time for the call center to converge with the contact center; for the development of new methodologies to reduce the cost of implementing speech self-service applications; and for companies to become on-demand with SOAs. Consolidation within this industry is likely to come. The investment required in R&D for speech is very high, and the real challenge is about who has the staying power to get past the "curve of enlightenment" to mass adoption. And who has the vision to keep investing in breakthroughs that have yet to occur?

About the author:
Brian Garr is program director and segment manager for Contact Center Solutions in the Software Group of IBM. He has been with IBM for six years, and is an evangelist and speaker worldwide on machine translation, text-to-speech and speech recognition. Prior to joining IBM, Garr was a CTO and VP of two startup technology companies. He has a BA degree from Washington and Lee University. Garr received the Smithsonian Institution's "Heroes of Technology" designation in 1998 for his work in machine translation.