Design Implications for VoiceXML Server
This chapter covers the following topics:
•What is VoiceXML over HTTP?
•Differences in the Supported Web Application Servers
•Where to Install Unified CVP Studio
What is VoiceXML over HTTP?
Communication between the VoiceXML server and the Voice Browser is based on request-response cycles using VoiceXML over HTTP. VoiceXML documents are linked to one another by Uniform Resource Identifiers (URIs), a standardized way to reference resources within a network. User input is collected through forms similar to HTML web forms: a form contains input fields that the user fills in, and the results are sent back to the server.
Resources for the Voice Browser are located on the VoiceXML server. These resources are VoiceXML files, digital audio files, instructions for speech recognition (grammars), and scripts. Every exchange between the Voice Browser and the voice application must be initiated by the Voice Browser as a request to the VoiceXML server. To decide when to send such a request, the browser relies on the grammars in a VoiceXML file, which specify the words and phrases it expects to hear. A link contains the URL that refers to the voice application, and the browser connects to that URL as soon as it recognizes a match between spoken input and one of the grammars.
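The request-response cycle described above can be illustrated with a minimal sketch. The VoiceXML page below is hypothetical: the host name vxml.example.com, the file names, and the field name are assumptions for illustration and are not taken from a Unified CVP deployment. The Python code parses the page and lists every URI that the Voice Browser would have to request from the VoiceXML server in order to play the prompt, load the grammars, and continue the dialog.

```python
import xml.etree.ElementTree as ET

# Hypothetical VoiceXML 2.0 page; all URIs and names are illustrative only.
SAMPLE_DOCUMENT = """<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <link next="http://vxml.example.com/app/operator.vxml">
    <grammar src="http://vxml.example.com/grammars/operator.grxml"
             type="application/srgs+xml"/>
  </link>
  <form id="account">
    <field name="account_number">
      <prompt>
        <audio src="http://vxml.example.com/audio/enter_account.wav">
          Please say your account number.
        </audio>
      </prompt>
      <grammar src="http://vxml.example.com/grammars/digits.grxml"
               type="application/srgs+xml"/>
      <filled>
        <submit next="http://vxml.example.com/app/lookup"
                namelist="account_number"/>
      </filled>
    </field>
  </form>
</vxml>
"""


def referenced_uris(vxml_text):
    """Return (element, attribute, uri) for each resource the page references."""
    root = ET.fromstring(vxml_text)
    found = []
    for element in root.iter():  # document order
        tag = element.tag.split("}")[-1]  # drop the XML namespace part
        for attribute in ("src", "next"):
            uri = element.get(attribute)
            if uri:
                found.append((tag, attribute, uri))
    return found


if __name__ == "__main__":
    for tag, attribute, uri in referenced_uris(SAMPLE_DOCUMENT):
        print(f"{tag} {attribute} -> {uri}")
```

In this sketch the link element pairs a grammar with a target URL, so a match against operator.grxml would cause the browser to request operator.vxml, while a filled account_number field is submitted to the lookup URL as a new HTTP request.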
When gauging VoiceXML server performance, consider the following key aspects:
•QoS and network bandwidth between the Web application server and the voice gateway
See the section on Bandwidth Provisioning and QoS Considerations, page 9-1, for more details.
•Performance on the VoiceXML Server
The Hardware and System Software Specification for Cisco Unified CVP (formerly called the Bill of Materials), available at http://www.cisco.com/en/US/products/sw/custcosw/ps1006/prod_technical_reference_list.html, requires the Cisco MCS-7845 as a VoiceXML server. Adequate performance is required on the server side to respond to VoiceXML over HTTP requests.
•Use of prerecorded audio versus Text-to-Speech (TTS)
Voice user-interface applications tend to use prerecorded audio files wherever possible because recorded audio sounds much better than TTS. Choose the audio file quality so that it does not increase download time or slow browser interpretation. Make recordings in 8-bit mu-law 8 kHz format.
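The recommended recording format makes prompt sizes easy to estimate: 8,000 samples per second at 8 bits per sample is 8,000 bytes (64 kbit) of audio per second. The following sketch works through that arithmetic. It assumes mono recordings, ignores the small file header, and treats the link speed as a hypothetical input, so the results are rough planning figures only.

```python
SAMPLE_RATE_HZ = 8000   # 8 kHz sampling, as recommended above
BITS_PER_SAMPLE = 8     # 8-bit mu-law samples
CHANNELS = 1            # assumption: mono telephony prompts


def prompt_size_bytes(duration_seconds):
    """Audio payload size in bytes, ignoring the small file header."""
    bytes_per_second = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS // 8
    return int(duration_seconds * bytes_per_second)


def download_seconds(size_bytes, link_kbps):
    """Idealized transfer time, ignoring protocol overhead and latency."""
    return size_bytes * 8 / (link_kbps * 1000)


if __name__ == "__main__":
    size = prompt_size_bytes(10)          # a 10-second prompt
    print(size, "bytes")
    print(download_seconds(size, 256), "seconds over a 256 kbps link")
```

Under these assumptions, a 10-second prompt is about 80,000 bytes and would need roughly 2.5 seconds on an otherwise idle 256 kbps link, which is why higher-fidelity formats and uncached prompts can add noticeable delay.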
•Audio file caching
Make sure the voice gateway is set to cache audio content to prevent delays from having to download files from the media source. For more details about prompt management on supported gateways, see Configuring Caching and Streaming in Cisco IOS, page 12-2.
•Use of grammars
A voice application, like any user-centric application, is prone to certain problems that might be discovered only through formal usability testing or observation of the application in use. Poor speech recognition accuracy is one type of problem common to voice applications, and it is most often caused by poor grammar implementation. When users mispronounce words or say things that the grammar designer does not expect, the recognizer cannot match their input against the grammar. Poorly designed grammars containing many difficult-to-distinguish entries also result in many misrecognized inputs, leading to decreased performance on the VoiceXML server. Grammar tuning is the process of improving recognition accuracy by modifying a grammar based on an analysis of its performance.
Neither the Cisco IOS Voice Browser nor the Media Resource Control Protocol (MRCP) specification imposes restrictions on support for multiple languages. However, there might be restrictions on the automatic speech recognition (ASR) or TTS server. Check with your preferred ASR/TTS vendor about their support for your languages before preparing a multilingual application.
You can dynamically change the ASR server value by using the Cisco property com.cisco.asr-server in the VoiceXML script. This property overrides any previous value set by the VoiceXML script.
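As a rough illustration of the analysis step in grammar tuning, the sketch below computes overall recognition accuracy and the most frequently confused phrase pairs from a set of test results. The data and function names are hypothetical; a real tuning exercise would draw on the logs or tuning tools of the ASR vendor.

```python
from collections import Counter

# Hypothetical test-call results: (what the caller said, what was recognized).
# None means the recognizer returned no match.
SAMPLE_RESULTS = [
    ("billing", "billing"),
    ("billing", "building"),
    ("building", "billing"),
    ("sales", "sales"),
    ("sales", "sales"),
    ("support", None),
    ("billing", "building"),
    ("sales", "sales"),
]


def accuracy(results):
    """Fraction of utterances recognized exactly as spoken."""
    correct = sum(1 for spoken, recognized in results if spoken == recognized)
    return correct / len(results)


def confusions(results):
    """Count misrecognized (spoken, recognized) pairs, most frequent first."""
    counts = Counter(
        (spoken, recognized)
        for spoken, recognized in results
        if recognized is not None and spoken != recognized
    )
    return counts.most_common()


if __name__ == "__main__":
    print("accuracy:", accuracy(SAMPLE_RESULTS))
    for (spoken, recognized), count in confusions(SAMPLE_RESULTS):
        print(f"{spoken!r} heard as {recognized!r}: {count}")
```

In this made-up data set, "billing" and "building" account for every misrecognition, which would point the grammar designer toward renaming or removing one of those difficult-to-distinguish entries.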
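The following sketch shows one way such a property might be added to a VoiceXML page, assuming it is set with the standard VoiceXML property element at document scope. Only the property name com.cisco.asr-server comes from the text above; the page content, the helper name set_asr_server, and the server address are hypothetical, and the exact value format should be confirmed against the gateway documentation.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"
ASR_PROPERTY = "com.cisco.asr-server"  # property name from the text above

# Serialize VoiceXML elements without a namespace prefix.
ET.register_namespace("", VXML_NS)

# Hypothetical page used only for illustration.
PAGE = """<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="main">
    <field name="pin" type="digits">
      <prompt>Please say your PIN.</prompt>
    </field>
  </form>
</vxml>"""


def set_asr_server(vxml_text, server_uri):
    """Return the page with a document-level ASR server property added."""
    root = ET.fromstring(vxml_text)
    prop = ET.Element(f"{{{VXML_NS}}}property",
                      {"name": ASR_PROPERTY, "value": server_uri})
    root.insert(0, prop)  # first child of <vxml>, so it applies to all dialogs
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Assumed address of a language-specific recognizer.
    print(set_asr_server(PAGE, "rtsp://asr-es.example.com/recognizer"))
```

A multilingual application could use the same approach to point each language-specific page at the ASR server that supports that language.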
Differences in the Supported Web Application Servers
From a very high-level perspective, IBM WebSphere Application Server (http://www.ibm.com/websphere) is a full J2EE application server environment that includes an administration console and connection pooling. Tomcat (http://tomcat.apache.org/), by contrast, is a simple and basic environment that provides only a servlet engine and a JavaServer Pages engine. The decision to use Tomcat or WebSphere Application Server depends on your current enterprise infrastructure requirements. In many cases, Tomcat is more than sufficient. But if you already have WebSphere infrastructure and management capabilities or have a preference for WebSphere in general, you should use it for Unified CVP.
Performance tests conducted on the web application server showed only slight variations in processor performance between the two web application servers using metrics such as the following:
•Impact of call volume
•Impact of application size
•Impact of application complexity
Either Tomcat or WebSphere Application Server running Unified CVP VoiceXML can support up to 500 simultaneous calls per Cisco MCS-7845 server.
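The per-server limit above translates directly into a server count for a given peak load. The sketch below shows that arithmetic. The spare_servers parameter is an assumption added for illustration (for example, one extra server for redundancy) and is not a sizing rule from this document.

```python
import math

CALLS_PER_SERVER = 500  # stated limit per Cisco MCS-7845 server


def servers_required(peak_simultaneous_calls, spare_servers=0):
    """Servers needed for the peak load, plus optional spares (an assumption)."""
    if peak_simultaneous_calls < 0:
        raise ValueError("call count cannot be negative")
    base = math.ceil(peak_simultaneous_calls / CALLS_PER_SERVER)
    return base + spare_servers


if __name__ == "__main__":
    print(servers_required(1200))                   # peak load only
    print(servers_required(1200, spare_servers=1))  # with one spare server
```

For example, a peak of 1,200 simultaneous calls needs three servers under this limit, or four if one spare is added.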
Where to Install Unified CVP Studio
Unified CVP Studio is an Integrated Development Environment (IDE). As with any IDE, Unified CVP Studio should be installed in an environment that is conducive to development, such as workstations that are used for other software development or business analysis purposes. Because Unified CVP Studio is Eclipse-based, many other development activities (such as writing Java programs or building object models) can be migrated to this tool so that developers and analysts have one common utility for most of their development needs.
For non-production systems, the Unified CVP Studio can be installed on the Unified CVP VoiceXML server. If the intent is only to test applications in a non-load scenario, this co-resident configuration is acceptable.