Voice over IP for the Cisco 3600 Series Overview
The Voice over IP for the Cisco 3600 Series Software Configuration Guide shows you how to configure your Cisco 3600 series router to support voice transmission. Cisco's voice support is implemented using voice packet technology. In voice packet technology, voice signals are packetized and transported in compliance with H.323, the ITU-T specification for transmitting multimedia (voice, video, and data) across a local-area network.
This overview is divided into two parts:
The Software Configuration Guide Overview describes the chapter contents in the Voice over IP for the Cisco 3600 Series Software Configuration Guide.
The Voice Primer section provides supplementary information for users unfamiliar with voice telephony.
Voice over IP enables a Cisco 3600 series router to carry voice traffic (for example, telephone calls and faxes) over an IP network. In Voice over IP, the DSP segments the voice signal into frames, which are then coupled in groups of two and stored in voice packets. These voice packets are transported using IP in compliance with ITU-T specification H.323. Because Voice over IP is a delay-sensitive application, you need a well-engineered end-to-end network to use it successfully. Fine-tuning your network to adequately support Voice over IP involves a series of protocols and features geared toward quality of service (QoS). You must also take traffic shaping into account to ensure the reliability of the voice connection.
Voice over IP is primarily a software feature; however, to use this feature on a Cisco 3600 series router, you must install a Voice Network Module (VNM). The VNM can hold either two or four Voice Interface Cards (VIC), each of which is specific to a particular signaling type associated with a voice port.
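The Voice over IP for the Cisco 3600 Series Software Configuration Guide is divided into three parts: Configuring Voice over IP, Voice over IP Configuration Examples, and Voice over IP Commands.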
The following sections describe the chapter contents for each part of this configuration guide.
Configuring Voice over IP provides the information you need to configure Voice over IP on Cisco 3600 series routers. Sections include:
The key to understanding Cisco's voice implementation is to understand the use of dial peers. Dial peers describe the entities to and/or from which a call is established. All of the voice technologies use dial peers to define the characteristics associated with a call leg. A call leg is a discrete segment of a call connection that lies between two points in the connection, as shown in Figure 1 and Figure 2. An end-to-end call comprises four call legs: two from the perspective of the source router, as shown in Figure 1, and two from the perspective of the destination router, as shown in Figure 2. You use dial peers to apply specific attributes to call legs and to identify call origin and destination. Attributes applied to a call leg include quality of service (QoS), coder-decoder (CODEC), voice activity detection (VAD), and fax rate.
Figure 1: Dial Peer Call Legs from the Perspective of the Source Router
Figure 2: Dial Peer Call Legs from the Perspective of the Destination Router
There are basically two different kinds of dial peers with each voice implementation:
Voice port commands for the Cisco 3600 series define the characteristics associated with a particular voice-port signaling type. Voice ports for the Cisco 3600 series routers provide support for three basic voice signaling formats: FXO, FXS, and E&M.
The Cisco 3600 series currently provides only analog voice ports for its implementation of Voice over IP. The type of signaling associated with these analog voice ports depends on the interface module installed into the device.
The Voice over IP Configuration Examples chapter provides four scenarios, demonstrating how to configure Voice over IP for the following situations:
The Voice over IP Commands chapter provides an alphabetical listing of all commands used to configure Voice over IP for the Cisco 3600 series.
To understand Cisco's voice implementations, it helps to have some understanding of analog and digital transmission and signaling. This section provides basic, abbreviated voice telephony background to help you configure Voice over IP, Voice over Frame Relay, Voice over ATM, and Voice over HDLC. It includes the following topics:
The standard PSTN is a large, circuit-switched network. It uses a specific numbering scheme, which complies with the ITU-T E.164 recommendations. For example, in North America, the North American Numbering Plan (NANP) is used, which consists of an area code, an office code, and a station code. Area codes are assigned geographically, office codes are assigned to specific switches, and station codes identify a specific port on that switch. The format in North America is 1Nxx-Nxx-xxxx, with N = digits 2 through 9 and x = digits 0 through 9. Internationally, each country is assigned a one- to three-digit country code; the country's dialing plan follows the country code. In Cisco's voice implementations, numbering schemes are configured using the destination-pattern command.
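The NANP format described above can be checked mechanically. The following Python sketch is illustrative only and is not part of Cisco IOS: the function name parse_nanp is hypothetical, and treating the separators as optional is an assumption made for convenience.

```python
import re

# Pattern taken from the text: 1Nxx-Nxx-xxxx, with N = 2-9 and x = 0-9.
# Assumption: separators (hyphen, dot, or space) are optional.
NANP_PATTERN = re.compile(r"^1[-. ]?([2-9]\d{2})[-. ]?([2-9]\d{2})[-. ]?(\d{4})$")


def parse_nanp(number: str):
    """Return (area_code, office_code, station_code), or None if the
    string does not follow the 1Nxx-Nxx-xxxx format."""
    match = NANP_PATTERN.match(number.strip())
    return match.groups() if match else None


print(parse_nanp("1408-555-1212"))  # ('408', '555', '1212')
print(parse_nanp("1108-555-1212"))  # None (area code may not start with 0 or 1)
```

The three captured groups correspond to the area code, office code, and station code named in the text.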
Until recently, the telephone network was based on an analog infrastructure. Analog transmission is not particularly robust or efficient at recovering from line noise. Because analog signals degrade over distance, they need to be periodically amplified; this amplification boosts both the voice signal and ambient line noise, resulting in degradation of the quality of the transmitted sound.
In response to the limitations of analog transmission, the telephony network migrated to digital transmission using pulse code modulation (PCM) or adaptive differential pulse code modulation (ADPCM). In both cases, analog sound is converted into digital form by sampling the analog sound 8000 times per second and converting each sample into a numeric code.
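As a rough worked example, the bit rate of a sampled voice channel is the sample rate multiplied by the number of bits used to encode each sample. The sketch below takes the 8000 samples-per-second figure from the text; the bits-per-sample values (8 for standard PCM and 4 for 32-kbps ADPCM) are assumptions that the text does not state.

```python
SAMPLE_RATE_HZ = 8000  # samples per second, from the text


def bit_rate_bps(bits_per_sample: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Bit rate of a digitized voice channel in bits per second."""
    return bits_per_sample * sample_rate_hz


# Assumed code sizes: 8 bits per sample for PCM, 4 bits for 32-kbps ADPCM.
print(bit_rate_bps(8))  # 64000
print(bit_rate_bps(4))  # 32000
```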
PCM and ADPCM are examples of "waveform" CODEC techniques. Waveform CODECs are compression techniques that exploit the redundant characteristics of the waveform itself. In addition to waveform CODECs, there are source CODECs that compress speech by sending only simplified parametric information about voice transmission; these CODECs require less bandwidth. Source CODECs include linear predictive coding (LPC), code-excited linear prediction (CELP), and multi-pulse, multi-level quantization (MP-MLQ).
Coding techniques are standardized by the ITU-T in its G-series recommendations. The most popular coding standards for telephony and voice packet are:
In Cisco's voice implementations, compression schemes are configured using the codec command.
Each CODEC provides a certain quality of speech. The quality of transmitted speech is a subjective response of the listener. A common benchmark used to determine the quality of sound produced by specific CODECs is the mean opinion score (MOS). With MOS, a wide range of listeners judge the quality of a voice sample (corresponding to a particular CODEC) on a scale of 1 (bad) to 5 (excellent). The scores are averaged to provide the mean opinion score for that sample. Table 1 shows the relationship between CODECs and MOS scores.
Table 1: Compression Methods and MOS Scores
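The averaging step behind a mean opinion score can be sketched in a few lines of Python. The ratings below are hypothetical and are not measurements for any CODEC; the function name mean_opinion_score is likewise an illustrative choice.

```python
def mean_opinion_score(ratings):
    """Average listener ratings given on the 1 (bad) to 5 (excellent) scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must be between 1 and 5")
    return round(sum(ratings) / len(ratings), 2)


# Hypothetical ratings from eight listeners for one voice sample.
hypothetical_ratings = [4, 4, 5, 3, 4, 4, 3, 5]
print(mean_opinion_score(hypothetical_ratings))  # 4.0
```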
Although it might seem logical from a financial standpoint to convert all calls to low-bit-rate CODECs to save on infrastructure costs, you should exercise additional care when designing voice networks with low-bit-rate compression. There are drawbacks to compressing voice. One of the main drawbacks is signal distortion due to multiple encodings (called tandem encodings). For example, when a G.729 voice signal is tandem encoded three times, the MOS score drops from 3.92 (very good) to 2.68 (unacceptable). Another drawback is CODEC-induced delay with low-bit-rate CODECs.
One of the most important design considerations in implementing voice is minimizing one-way, end-to-end delay. Voice traffic is real-time traffic; if there is too long a delay in voice packet delivery, speech will be unrecognizable. Delay is inherent in voice-networking and is caused by a number of different factors. An acceptable delay is less than 200 milliseconds.
There are two kinds of delay inherent in today's telephony networks: propagation delay and handling delay. Propagation delay is caused by the finite speed at which signals travel through fiber-optic or copper media. Handling delay (sometimes called serialization delay) is caused by the devices that handle voice information. Handling delays have a significant impact on voice quality in a packetized network.
CODEC-induced delays are considered a handling delay. Table 2 shows the delay introduced by different CODECs.
Table 2: CODEC-Induced Delays
Another handling delay is the time it takes to generate a voice packet. In Voice over IP, the DSP generates a frame every 10 milliseconds. Two of these frames are then placed within one voice packet; the packet delay is therefore 20 milliseconds.
Another source of handling delay is the time it takes to move the packet to the output queue. Cisco IOS software expedites the process of determining packet destination and getting the packet to the output queue. The actual delay at the output queue is another source of handling delay and should be kept under 10 milliseconds whenever possible by using whatever queuing methods are optimal for your network. Output queue delays are a quality of service (QoS) issue in Voice over IP for the Cisco 3600 series and are discussed in the "Configure IP Networks for Real-Time Voice Traffic" section.
In Voice over Frame Relay, you need to make sure that voice traffic is not crowded out by data traffic. Strategies on how to manage Voice over Frame Relay voice traffic are discussed in "Configuring Voice over Frame Relay."
Jitter is another factor that affects delay. Jitter occurs when there is a variation between when a voice packet is expected to be received and when it actually is received, causing a discontinuity in the real-time voice stream. Voice devices such as the Cisco 3600 series and the Cisco MC3810 compensate for jitter by setting up a playout buffer to play back voice in a smooth fashion. Playout control is handled through RTP encapsulation, by selecting either adaptive or non-adaptive playout-delay mode. In either mode, the default value for nominal delay is sufficient.
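The idea of jitter as the gap between expected and actual arrival times can be illustrated with a short Python sketch. It models a simple fixed playout buffer and is not a description of how the Cisco DSP implements playout. The arrival times, the 10-millisecond playout delay, and both function names are hypothetical; the 20-millisecond packet interval assumes two 10-millisecond frames per packet.

```python
def arrival_jitter_ms(arrivals_ms, packet_interval_ms=20.0):
    """Deviation of each packet's arrival from its expected arrival time.

    The expected arrival of packet i is the first arrival plus
    i * packet_interval_ms.
    """
    first = arrivals_ms[0]
    return [t - (first + i * packet_interval_ms) for i, t in enumerate(arrivals_ms)]


def late_packets(arrivals_ms, playout_delay_ms, packet_interval_ms=20.0):
    """Indices of packets that miss their slot in a fixed playout buffer."""
    deviations = arrival_jitter_ms(arrivals_ms, packet_interval_ms)
    return [i for i, d in enumerate(deviations) if d > playout_delay_ms]


# Hypothetical arrival times in milliseconds for five packets.
arrivals = [0.0, 21.0, 39.0, 75.0, 80.0]
print(arrival_jitter_ms(arrivals))                    # [0.0, 1.0, -1.0, 15.0, 0.0]
print(late_packets(arrivals, playout_delay_ms=10.0))  # [3]
```

In this model a larger playout delay absorbs more jitter at the cost of added end-to-end delay.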
Calculating the end-to-end delay is not difficult if you know the end-to-end signal and data paths, the CODEC, and the payload size of the packets. Adding the delays from the end points to the CODECs at both ends, the encoder delay (which is 5 milliseconds for the G.711 and G.726 CODECs and 10 milliseconds for the G.729 CODEC), the packetization delay, and the fixed portion of the network delay yields the end-to-end delay for the connection.
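The sum described above can be written out as a small Python sketch. The encoder delays, the 10-millisecond frame interval, the two frames per packet, and the 200-millisecond target come from this overview; the end-point and fixed network delay values in the example call are hypothetical, as are the function names.

```python
# Encoder delays in milliseconds, from the text.
ENCODER_DELAY_MS = {"G.711": 5, "G.726": 5, "G.729": 10}

FRAME_INTERVAL_MS = 10      # the DSP generates a frame every 10 ms
FRAMES_PER_PACKET = 2       # two frames are placed in each voice packet
ACCEPTABLE_DELAY_MS = 200   # one-way target stated in this overview


def packetization_delay_ms(frames_per_packet=FRAMES_PER_PACKET,
                           frame_interval_ms=FRAME_INTERVAL_MS):
    """Time needed to collect enough frames to fill one voice packet."""
    return frames_per_packet * frame_interval_ms


def end_to_end_delay_ms(codec, endpoint_delay_ms, fixed_network_delay_ms):
    """Sum of the delay components listed in the text."""
    return (endpoint_delay_ms
            + ENCODER_DELAY_MS[codec]
            + packetization_delay_ms()
            + fixed_network_delay_ms)


# Hypothetical inputs: 4 ms total end-point delay, 60 ms fixed network delay.
delay = end_to_end_delay_ms("G.729", endpoint_delay_ms=4, fixed_network_delay_ms=60)
print(delay, delay < ACCEPTABLE_DELAY_MS)  # 94 True
```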
Echo is hearing your own voice in the telephone receiver while you are talking. When timed properly, echo is reassuring to the speaker; if the echo exceeds approximately 25 milliseconds, it can be distracting and cause breaks in the conversation. In a traditional telephony network, echo is normally caused by a mismatch in impedance from the 4-wire network switch conversion to the 2-wire local loop and is controlled by echo cancellers. In voice packet-based networks, echo cancellers are built into the low-bit-rate CODECs and are operated on each DSP. By design, echo cancellers are limited by the total amount of time they will wait for the reflected speech to be received, which is known as an echo trail. The echo trail is normally 32 milliseconds.
In Cisco's voice implementations, echo cancellers are enabled using the echo-cancel enable command. Echo trails are configured using the echo-cancel-coverage command. For example, Voice over IP has configurable echo trails of 16, 24, and 32 milliseconds.
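Choosing an echo trail can be thought of as picking the smallest configurable value that still covers the echo delay on the line. The Python sketch below illustrates that reasoning only; it is not a Cisco tool, the function name is hypothetical, and the measured delays in the example calls are made up.

```python
# Configurable Voice over IP echo trails in milliseconds, from the text.
ECHO_COVERAGE_OPTIONS_MS = (16, 24, 32)


def smallest_coverage_ms(echo_delay_ms):
    """Return the smallest echo trail that covers the given echo delay,
    or None if the delay exceeds every configurable option."""
    for option in ECHO_COVERAGE_OPTIONS_MS:
        if echo_delay_ms <= option:
            return option
    return None


print(smallest_coverage_ms(20))  # 24
print(smallest_coverage_ms(40))  # None
```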
Although there are various types of signaling used in telecommunications today, this document describes only those with direct applicability to Cisco's voice implementations. The first one involves access signaling, which determines when a line has gone off-hook or on-hook (in other words, dial tone). FXO and FXS are types of access signaling. There are two common methods of providing this basic signal:
In Cisco's voice implementations, access signaling is configured using the signal command.
Another signaling technique used mainly between PBXes or other network-to-network telephony switches is known as E&M. There are five types of E&M signaling, as well as two different wiring methods. Cisco's voice implementation supports E&M types I, II, III, and V, using both 2-wire and 4-wire implementations. In Cisco's voice implementations, E&M signal types are configured using the type command.