Playout Delay Enhancements for Voice over IP
Feature History
This document describes enhancements to the playout-delay command, which configures the jitter buffer to reduce delay variation on a VoIP network.
This document includes the following sections:
• Supported Standards, MIBs, and RFCs
Feature Overview
Voice packet networks, which transmit time-sensitive data, experience problems not seen in either traditional, circuit-based voice networks or in non-voice data networks. One problem is delay, which has two effects:
• Delay in an absolute sense can interfere with the rhythm of inquiry and reply in human conversation.
• Delay variations, also known as jitter, can create unexpected pauses that may impair the intelligibility of the speech itself and cause the quality of voice to be jerky.
Jitter, the more serious of these problems, is defined as the difference between when a packet is expected to arrive and when it actually is received. Jitter is due primarily to queuing delays and congestion in the packet network, which cause discontinuity in the real-time voice stream.
Packet voice calls need a steady, even stream of packets to reproduce human speech successfully. However, delivery of voice packets is often irregular because conditions in the network are always changing. During congested periods, buffers on a network can fill instantaneously, delaying some packets until there is room for them on the network. Other packets in the same voice stream may not be delayed, because there was no congestion when they passed over the network. Thus, various packets in the same call can experience different amounts of inter-arrival variance, or jitter, which is a variable component of the total end-to-end network delay.
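The definition of jitter above (actual arrival time minus expected arrival time) can be sketched in a few lines of Python. This is an illustration only: the function name and the 20 ms packet interval are assumptions made for the sketch, not values taken from the gateway.

```python
def arrival_jitter_ms(arrival_times_ms, packet_interval_ms=20.0):
    """Return, per packet, actual arrival time minus expected arrival time.

    The expected time of packet i is the first arrival plus i intervals.
    The 20 ms default interval is an assumption for this sketch.
    """
    if not arrival_times_ms:
        return []
    first = arrival_times_ms[0]
    return [
        actual - (first + i * packet_interval_ms)
        for i, actual in enumerate(arrival_times_ms)
    ]


# The third packet was held up by congestion; the others arrived on time.
arrivals = [0.0, 20.0, 55.0, 60.0, 80.0]
print(arrival_jitter_ms(arrivals))  # [0.0, 0.0, 15.0, 0.0, 0.0]
```

Only the delayed packet shows a nonzero value, which is the per-packet variation that the jitter buffer has to absorb.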
Cisco voice networks compensate for jitter by setting up a buffer, called the jitter buffer, on the gateway router at the receiving (far) end of the voice transmission (Figure 1 and Figure 2). From the IP network, the jitter buffer receives voice packets at irregular intervals, sometimes out of sequence. The jitter buffer holds the packets briefly, reorders them if necessary, and then plays them out at evenly spaced intervals to the decoder in the Digital Signal Processor (DSP) on the gateway. Algorithms in the DSP determine the size and behavior of the jitter buffer, based on user configuration and current network jitter conditions, to maximize the number of correctly delivered packets and minimize the amount of delay.
Figure 1 Jitter Buffer Placement for Human-to-Human Voice Call
Figure 2 Jitter Buffer Placement for Human-to-Machine (Automated) Voice Call
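The hold, reorder, and play-out behavior described above can be sketched as follows. This is a simplified model, not the DSP algorithm: the function name, the 40 ms playout delay, and the 20 ms spacing are assumptions chosen for illustration.

```python
def playout_schedule(packets, playout_delay_ms=40, interval_ms=20):
    """Order packets by sequence number and assign evenly spaced playout times.

    packets: list of (sequence_number, arrival_ms) tuples in arrival order.
    Returns (schedule, late), where schedule is a list of
    (sequence_number, playout_ms) and late lists the sequence numbers that
    arrived after their playout slot had already passed.
    """
    if not packets:
        return [], []
    first_seq, first_arrival = packets[0]
    schedule, late = [], []
    for seq, arrival in sorted(packets):
        # Each slot is one interval after the previous one, regardless of
        # how irregularly the packets actually arrived.
        slot = first_arrival + playout_delay_ms + (seq - first_seq) * interval_ms
        if arrival <= slot:
            schedule.append((seq, slot))
        else:
            late.append(seq)
    return schedule, late


# Packets 3 and 2 arrive out of order; packet 4 arrives too late to be played.
schedule, late = playout_schedule([(1, 0), (3, 38), (2, 45), (4, 130)])
print(schedule)  # [(1, 40), (2, 60), (3, 80)]
print(late)      # [4]
```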
The size of the jitter buffer, and therefore the amount of delay, is user-configurable with the playout-delay command. Proper configuration is critical: if voice packets are held for too short a time, variations in delay may cause the buffer to underrun (become empty) and cause gaps in speech. On the other hand, packets that arrive at a full buffer are dropped, also causing gaps in speech. To improve voice quality, the speech gaps are hidden by several different concealment techniques that synthesize packets to replace those that were lost or not received in time. Depending upon the contiguous duration of the gaps, the missing voice frames are replaced by prediction from the past frames (usually the last frame only), followed by silence if the condition persists (for example, more than 30 to 50 ms). Buffer overflow and concealment statistics are available in the show call active voice command output, and they give a good indication of the effect of the network on the quality of the audio.
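The concealment sequence described above (prediction from the last frame, then silence if the gap persists) can be sketched roughly as follows. Repeating the last good frame stands in for real prediction here, and the 20 ms frame size and 40 ms threshold are assumptions picked from within the 30 to 50 ms range mentioned in the text; the actual DSP techniques are not documented in this section.

```python
def conceal(frames, frame_ms=20, max_prediction_ms=40):
    """Fill missing frames (None) by repeating the last good frame, then silence.

    Repetition is a stand-in for prediction; the threshold after which
    silence is inserted is an assumed value for this sketch.
    """
    output = []
    last_good = None
    gap_ms = 0
    for frame in frames:
        if frame is not None:
            output.append(frame)
            last_good = frame
            gap_ms = 0
            continue
        gap_ms += frame_ms
        if last_good is not None and gap_ms <= max_prediction_ms:
            output.append(last_good)   # short gap: predict from the past frame
        else:
            output.append("silence")   # persistent gap: fall back to silence
    return output


print(conceal(["a", None, None, None, "b", None]))
# ['a', 'a', 'a', 'silence', 'b', 'b']
```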
In an example that demonstrates how packets can be lost, a jitter buffer is configured with a maximum playout delay of 40 ms. On the network, packets are delayed from their source (perhaps a media server stops sending packets for 60 ms or there is severe network congestion). The jitter buffer empties while waiting for input from the network that does not arrive until after the maximum playout delay time is reached, and there is a noticeable break in the voice transmission. The media server then sends packets at a faster rate than they are leaving the jitter buffer, which results in the jitter buffer filling up. Subsequent packets are discarded by the jitter buffer, resulting in a choppy voice signal.
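The scenario above can be reproduced with a small tick-based model. It is a sketch under stated assumptions: one tick is one 20 ms packet interval, the 40 ms maximum playout delay is modeled as a capacity of two packets, and the function name and starting fill are hypothetical.

```python
def simulate_buffer(arrivals_per_tick, capacity_packets=2, initial_fill=1):
    """Count underruns and overflow drops for a fixed-capacity jitter buffer.

    Each tick, newly arrived packets are queued (excess packets are dropped),
    then one packet is played out. An empty buffer at playout time is an
    underrun.
    """
    depth = initial_fill
    underruns = 0
    overflows = 0
    for arrived in arrivals_per_tick:
        accepted = min(arrived, capacity_packets - depth)
        overflows += arrived - accepted
        depth += accepted
        if depth > 0:
            depth -= 1
        else:
            underruns += 1
    return underruns, overflows


# Three empty ticks model the 60 ms stall; the burst of four packets that
# follows arrives faster than the buffer drains.
print(simulate_buffer([1, 0, 0, 0, 4, 1]))  # (2, 2)
```

The two underruns correspond to the break in speech during the stall, and the two overflow drops correspond to the packets discarded when the burst arrives.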
Even though the size of the jitter buffer is configurable, it is important to note that if the buffer size is configured too large, the overall delay on the connection may rise to unacceptable levels. You must weigh the benefit of improving jitter conditions against the disadvantage of increasing total end-to-end delay, which can also cause voice quality problems.
The playout-delay command allows you to select a jitter buffer mode (fixed or adaptive) and specify certain values used by the DSP algorithms to adjust the size of the jitter buffer. For any voice call, the algorithms read time stamps in the Real-Time Transport Protocol (RTP) headers of a sample of packets to determine the amount of delay the jitter buffer applies to an average packet; that is, as if there is no jitter at all in the network. This is called the average delay.
When you configure the playout-delay mode adaptive option, the DSP algorithms in the codec take samples throughout the voice call and adjust the value of the average delay as network jitter conditions change (Figure 3). The size of the jitter buffer (and the amount of delay applied) is adjusted upward or downward as needed to ensure smooth transmission of voice frames to the codec, within the minimum and maximum limits you configure. The algorithms are designed to reduce the amount of delay slowly and increase delay quickly during adjustment, so that voice quality is achieved at the risk of larger delay times.
Figure 3 Adaptive Mode Jitter Buffer
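The adaptive behavior described above (grow quickly, shrink slowly, stay within the configured limits) can be sketched as a single update step. This is not the DSP algorithm, which this document does not specify; the function name and the 1 ms decay step are assumptions, and the 40 ms and 200 ms limits are used only as example bounds.

```python
def adapt_delay(current_ms, observed_jitter_ms, minimum_ms=40, maximum_ms=200,
                decay_ms=1):
    """Return the next playout delay for one adjustment step.

    The delay jumps up immediately when observed jitter exceeds it, and
    otherwise decays by a small step. The result is clamped to the
    configured minimum and maximum.
    """
    if observed_jitter_ms > current_ms:
        candidate = observed_jitter_ms                       # increase quickly
    else:
        candidate = max(observed_jitter_ms, current_ms - decay_ms)  # decrease slowly
    return max(minimum_ms, min(maximum_ms, candidate))


delay = 40
for jitter in (90, 10, 10, 500):
    delay = adapt_delay(delay, jitter)
    print(delay)  # prints 90, 89, 88, 200 on successive lines
```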
When you configure the playout-delay mode fixed option, you can specify the nominal delay value, which is the amount of playout delay applied at the beginning of a call by the jitter buffer. This is also the maximum size of the jitter buffer throughout the call (Figure 4).
Figure 4 Fixed Mode Jitter Buffer
For most networks with an average amount of jitter, the defaults for the playout-delay command are adequate; the command does not need to be configured. Conditions that suggest configuration of playout-delay parameters include:
• Choppy audio, poor voice quality
• Overall network delay too large
• Noisy but well-understood network or interworking with an application that has a lot of jitter at the transmission end, like a unified messaging server or interactive voice response (IVR) application
See the "Troubleshooting Tips" section for guidelines about setting playout-delay parameters to improve voice quality.
Prior to Cisco IOS Release 12.1(5)T, the playout-delay command was configured in voice-port configuration mode. For Cisco IOS Release 12.1(5)T and later releases, in most cases playout delay should be configured in dial peer configuration mode on the VoIP dial peer that is on the receiving end of the voice traffic that is to be buffered. This dial peer senses network conditions and relays them to the DSPs, which then adjust the jitter buffer as necessary. When multiple applications are configured on the gateway, playout delay should be configured in dial peer configuration mode. When there are numerous dial peers to configure, it might be simpler to configure playout delay on a voice port. If conflicting playout delay values have been configured on a voice port and on a dial peer, the dial peer configuration takes precedence.
Benefits
The playout delay enhancements allow you to do the following:
• Configure playout delay on the VoIP dial peer, so that the size of the jitter buffer can adapt to changing network conditions
• Set the minimum playout delay easily to default, low, or high so that you can control the amount of variable delay in your network
Restrictions
The enhancements are not supported on the following platforms:
• MGCP platforms
• SGCP platforms
Related Features and Technologies
The playout control enhancements are related to voice quality of service. For a general discussion, see the following document:
• Quality of Service for Voice over IP at http://www.cisco.com/univercd/cc/td/doc/cisintwk/intsolns/qossol/qosvoip.htm
Related Documents
• Cisco IOS Multiservice Applications Configuration Guide, Release 12.1
• Cisco IOS Multiservice Applications Command Reference, Release 12.1
Supported Platforms
These enhancements are available on Cisco platforms that support voice and H.323 gateway functionality, or voice and SIP gateway functionality, including:
• Cisco 2600 series
• Cisco 3600 series
• Cisco MC3810
• Cisco AS5200
• Cisco AS5300
• Cisco AS5800
• Cisco 7200 series
Supported Standards, MIBs, and RFCs
Standards
No new or modified standards are supported by this feature.
MIBs
No new or modified MIBs are supported by this feature.
To obtain lists of supported MIBs by platform and Cisco IOS release, and to download MIB modules, go to the Cisco MIB website on Cisco.com at the following URL: http://www.cisco.com/public/sw-center/netmgmt/cmtk/mibs.shtml.
RFCs
No new or modified RFCs are supported by this feature.
Prerequisites
• Operational VoIP network
• Cisco IOS Release 12.1(5)T or later
Configuration Tasks
See the following section for the complete set of configuration tasks for the enhancements to the playout delay feature. Each task in the list is identified as either required or optional.
• Configuring Playout Delay (optional)
Configuring Playout Delay
To specify the type of jitter buffer playout delay and the size of the jitter buffer, complete these steps, beginning in global configuration mode.
Verifying Playout Delay Parameters
You can verify playout delay configuration with the show dial-peer voice command:
Router# show dial-peer voice 302
...
Playout Mode is set to adaptive,
Initial 30 ms, Max 80 ms
Playout-delay Minimum mode is set to low, value 10 ms
...
Troubleshooting Tips
The symptoms that lead you to adjust playout delay parameters are these:
• Gaps in speech patterns that produce choppy or jerky audio suggest that you should increase the minimum playout delay, increase the maximum playout delay, or both, if you are using adaptive mode. For fixed mode, increase the nominal value.
• High overall network delay suggests that you should reduce the maximum playout delay in adaptive mode, or reduce the nominal delay in fixed mode. Watch for loss of voice quality when you do so, because a smaller buffer trades lower delay for more late and discarded packets. The maximum delay value sets an upper limit on adaptive playout delay, which is in many cases the major contributor to the end-to-end delay. In many applications it may be preferable to have the system or the user terminate the call rather than to allow an arbitrarily large delay. Data received with jitter outside this limit shows up in the late packet count (LatePackets) in the show call active voice playout statistics.
• A noisy but well-understood network or interworking with an application that has lots of jitter at the transmission end, like a unified messaging server or interactive voice response (IVR) application, suggests selection of fixed mode.
• Several fields in the show call active voice command output can help you determine the actual size of the jitter problems you are experiencing.
  – ReceiveDelay consists of the playout delay for jitter compensation plus the average expected delay after the frame is available for playout to the decoder. For the receive delay, the current, the low-water mark, and the high-water mark statistics are available in the output.
  – GapFillWith... fields refer to the amount of concealment, or synthesizing of packets to replace voice packets lost or not received in time, that took place in this call.
  – LostPackets is the count of the number of packets that were lost; that is, the packets not received at the egress gateway. This is detected using the sequence number field in the RTP packets.
  – EarlyPackets is the count of the number of packets that arrived earlier than the current minimum delay packet. They cause the de-jitter algorithm to readjust the minimum delay packet used in jitter estimation.
  – LatePackets is the count of the number of packets that arrived later than the current playout delay setting. The information in these packets was discarded.
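When two captures of show call active voice output are taken during the same call, the fields listed above can be compared mechanically. The following is a minimal sketch that assumes the output has been saved as text with one Field=Value pair per line; the function names are hypothetical, and the values are a few of those shown in the samples in this section.

```python
import re

# Fields from the list above that indicate jitter, concealment, or loss.
FIELDS = (
    "GapFillWithSilence", "GapFillWithPrediction", "ReceiveDelay",
    "LostPackets", "EarlyPackets", "LatePackets",
)


def parse_playout_stats(output):
    """Extract the integer value of each field of interest from saved output."""
    stats = {}
    for line in output.splitlines():
        match = re.match(r"\s*(\w+)=(\d+)", line)
        if match and match.group(1) in FIELDS:
            stats[match.group(1)] = int(match.group(2))
    return stats


def increased_fields(before, after):
    """Return {field: (old, new)} for every field whose value went up."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and after[name] > before[name]
    }


first = parse_playout_stats(
    "GapFillWithSilence=850 ms\nGapFillWithPrediction=2590 ms\n"
    "ReceiveDelay=39 ms\nLostPackets=0\nEarlyPackets=0\nLatePackets=86"
)
second = parse_playout_stats(
    "GapFillWithSilence=1040 ms\nGapFillWithPrediction=3160 ms\n"
    "ReceiveDelay=43 ms\nLostPackets=0\nEarlyPackets=0\nLatePackets=105"
)
print(increased_fields(first, second))
```

Rising concealment, receive delay, and late packet values between captures suggest that jitter is increasing.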
The comparison of the following two show call active voice command samples from a Cisco 3640 shows jitter developing on a network. Poor voice quality was perceived at the same time that these samples were taken.
The first sample output displays average jitter statistics:
Router# show call active voice
...
VOIP:
ConnectionId[0xECDE2E7B 0xF46A003F 0x0 0x47070A4]
IncomingConnectionId[0xECDE2E7B 0xF46A003F 0x0 0x47070A4]
RemoteIPAddress=192.168.100.101
RemoteUDPPort=18834
RoundTripDelay=11 ms
SelectedQoS=best-effort
tx_DtmfRelay=inband-voice
FastConnect=TRUE
Separate H245 Connection=FALSE
H245 Tunneling=FALSE
SessionProtocol=cisco
SessionTarget=
OnTimeRvPlayout=417000
GapFillWithSilence=850 ms
GapFillWithPrediction=2590 ms
GapFillWithInterpolation=0 ms
GapFillWithRedundancy=0 ms
HiWaterPlayoutDelay=70 ms
LoWaterPlayoutDelay=29 ms
ReceiveDelay=39 ms
LostPackets=0
EarlyPackets=0
LatePackets=86
...
The next sample output shows larger concealment and receive delay values than the previous sample, indicating increasing jitter conditions on the network:
Router# show call active voice
...
VOIP:
ConnectionId[0xECDE2E7B 0xF46A003F 0x0 0x47070A4]
IncomingConnectionId[0xECDE2E7B 0xF46A003F 0x0 0x47070A4]
RemoteIPAddress=192.168.100.101
RemoteUDPPort=18834
RoundTripDelay=26 ms
SelectedQoS=best-effort
tx_DtmfRelay=inband-voice
FastConnect=TRUE
Separate H245 Connection=FALSE
H245 Tunneling=FALSE
SessionProtocol=cisco
SessionTarget=
OnTimeRvPlayout=482350
GapFillWithSilence=1040 ms <------------ Increased
GapFillWithPrediction=3160 ms <------------ Increased
GapFillWithInterpolation=0 ms
GapFillWithRedundancy=0 ms
HiWaterPlayoutDelay=70 ms
LoWaterPlayoutDelay=29 ms
ReceiveDelay=43 ms <------------ Increased
LostPackets=0
EarlyPackets=0
LatePackets=105 <------------ Increased
...
Configuration Examples
The following example shows the running configuration for a Cisco AS5300 with two dial peers configured for adaptive mode playout delay.
Note
IP addresses and host names used in this example are fictitious.
!
version 12.1
no service single-slot-reload-enable
service timestamps debug datetime msec localtime
service timestamps log datetime msec localtime
no service password-encryption
!
hostname Router1
!
no logging buffered
no logging buffered
logging rate-limit console 10 except errors
!
!
!
resource-pool disable
!
ip subnet-zero
no ip finger
no ip domain-lookup
ip host apple 192.168.254.254
ip host banana 192.168.254.253
!
no ip dhcp-client network-discovery
no mgcp timer receive-rtcp
isdn switch-type primary-5ess
isdn voice-call-failure 0
call rsvp-sync
!
!
!
!
!
fax interface-type vfc
mta receive maximum-recipients 0
!
partition flash 2 8 8
!
!
!
controller T1 0
 framing esf
 clock source line primary
 linecode b8zs
 ds0-group 0 timeslots 1-24 type fgd-eana mf ani-dnis
!
controller T1 1
 framing esf
 clock source line secondary 1
 linecode b8zs
 ds0-group 0 timeslots 1-24 type fgd-eana mf ani-dnis
!
controller T1 2
 framing sf
 linecode ami
 cablelength short 133
!
controller T1 3
 framing sf
 linecode ami
 cablelength short 133
!
!
interface Ethernet0
 ip address 10.13.125.5 255.255.0.0
 no ip route-cache
 no ip mroute-cache
!
interface FastEthernet0
 ip address 10.0.0.1 255.255.0.0
 no keepalive
 duplex full
 speed auto
 no cdp enable
 hold-queue 75 in
!
ip default-gateway 10.13.0.1
ip kerberos source-interface any
no ip classless
ip route 192.168.254.253 255.255.255.255 10.5.0.1
ip route 192.168.254.253 255.255.255.255 10.13.0.1
no ip http server
!
!
!
!
!
voice-port 1:0
!
dial-peer voice 302 voip
 destination-pattern 6......
 session target ipv4:10.13.101.2
 playout-delay maximum 80
 playout-delay nominal 30
 playout-delay minimum low
 codec g711ulaw
!
dial-peer voice 300 pots
 destination-pattern 5551002
 port 0:0
 prefix 6551002
!
dial-peer voice 301 pots
 incoming called-number 6551002
 direct-inward-dial
 port 1:0
!
dial-peer voice 14 pots
 destination-pattern 4.T
 port 0:0
 prefix 4
!
dial-peer voice 24 pots
 destination-pattern 4.T
 port 1:0
 prefix 4
!
dial-peer voice 34 pots
 destination-pattern 4.T
 port 0:0
 prefix 4
!
dial-peer voice 44 pots
 destination-pattern 4.T
 port 0:0
 prefix 4
!
dial-peer voice 1000 voip
 playout-delay maximum 80
 playout-delay nominal 30
 playout-delay minimum low
!
!
line con 0
 exec-timeout 0 0
 logging synchronous level all
 transport input none
line aux 0
line vty 0 4
 exec-timeout 60 0
 password lab
 login
!
end
Command Reference
This section documents modified commands. All other commands used with this feature are documented in the Cisco IOS Release 12.1 command reference publications.
playout-delay
To tune the playout buffer on Digital Signal Processors (DSPs) to accommodate packet jitter caused by switches in the WAN, use the playout-delay command in dial peer configuration mode. To restore the default value, use the no form of this command.
playout-delay {nominal value | maximum value | minimum {default | low | high}}
no playout-delay {nominal value | maximum value | minimum {default | low | high}}
Syntax Description
Defaults
The default for nominal is 60 ms.
The default for maximum is 200 ms.
The default for minimum is default (40 ms).
Command Modes
Dial peer configuration
Voice-port configuration
Command History
Usage Guidelines
Prior to Cisco IOS Release 12.1(5)T, the playout-delay command was configured in voice-port configuration mode. For Cisco IOS Release 12.1(5)T and later releases, in most cases, playout delay should be configured in dial peer configuration mode on the VoIP dial peer that is on the receiving end of the voice traffic that is to be buffered. This dial peer senses network conditions and relays them to the DSPs, which adjust the jitter buffer as necessary. When multiple applications are configured on the gateway, playout delay should be configured in dial peer configuration mode. When there are numerous dial peers to configure, it might be simpler to configure playout delay on a voice port. If conflicting playout delay values have been configured on a voice port and on a dial peer, the dial peer configuration takes precedence.
Playout delay is the amount of time that elapses between the time that a voice packet is received at the jitter buffer on the DSP and the time that it is played out to the codec. In most networks with normal jitter conditions, the defaults are adequate and you do not need to configure the playout-delay command.
For situations in which you want to improve voice quality by reducing jitter, or you want to reduce network delay, you can configure the playout-delay command parameters. The parameters behave slightly differently in each of the two playout delay modes, adaptive and fixed (see the playout-delay mode command).
In adaptive mode, the average playout delay for voice packets varies based on the amount of inter-arrival variation that packets have as the call progresses. The jitter buffer grows and shrinks to compensate for jitter and to keep voice packets playing out smoothly, within the maximum and minimum limits that have been configured. The maximum limit establishes the highest value to which the adaptive delay will be set. The minimum limit is the low-end threshold for the delay of incoming packets by the adaptive jitter buffer. The algorithms in the DSPs that control the growth and shrinkage of the jitter buffer are weighted toward the improvement of voice quality at the expense of network delay: jitter buffer size increases rapidly in response to spikes in network transmissions, and decreases slowly in response to reduced congestion.
In fixed mode, the nominal value is the amount of playout delay applied at the beginning of a call by the jitter buffer in the gateway, and is also the maximum size of the jitter buffer throughout the call.
As a general rule, if there is excessive break-up of voice due to jitter with the default playout delay settings, increase playout delay times. If your network is small and jitter is minimal, decrease playout delay times for a smaller overall delay.
When there is bursty jitter in the network, voice quality can be degraded even though the jitter buffer is actually adjusting the playout delay correctly. The constant readjustment of playout delay to erratic network conditions causes voice quality problems that are usually alleviated by increasing the minimum playout delay value in adaptive mode, or by increasing the nominal delay for fixed mode.
The values that you enter for the maximum, nominal, and minimum value arguments must be coordinated with the codec selection that you make. See the codec complexity and codec (dial peer) commands in the Cisco IOS Multiservice Applications Command Reference, Release 12.1.
When deciding what values to enter for maximum, nominal, and minimum, keep these rules in mind:
• For high complexity, the maximum playout delay should be no greater than 250 ms minus the voice time contained in the largest frame size that you expect to receive. The incoming frame size is controlled by the gateway on the other side of an IP link, in the bytes argument to the codec selection in the dial peer.
• For medium complexity, the maximum playout delay should be no greater than 150 ms minus the voice time contained in the largest frame size that you expect to receive.
• The maximum value should be greater than or equal to the nominal value, and the nominal value should be greater than or equal to the minimum value.
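The rules above can be checked before values are entered on the gateway. The following sketch is illustrative only: the function name and the returned messages are hypothetical, and the largest expected frame time is supplied by the caller in milliseconds.

```python
# Upper bounds taken from the rules above, keyed by codec complexity.
COMPLEXITY_LIMIT_MS = {"high": 250, "medium": 150}


def check_playout_values(minimum_ms, nominal_ms, maximum_ms, complexity,
                         largest_frame_ms):
    """Return a list of rule violations; an empty list means the values fit."""
    problems = []
    ceiling = COMPLEXITY_LIMIT_MS[complexity] - largest_frame_ms
    if maximum_ms > ceiling:
        problems.append(
            f"maximum {maximum_ms} ms exceeds {ceiling} ms for {complexity} complexity"
        )
    if maximum_ms < nominal_ms:
        problems.append("maximum must be greater than or equal to nominal")
    if nominal_ms < minimum_ms:
        problems.append("nominal must be greater than or equal to minimum")
    return problems


# Minimum 10 ms, nominal 30 ms, maximum 80 ms, with 20 ms frames.
print(check_playout_values(10, 30, 80, "high", 20))  # []
```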
Frame size is determined as follows. If you are using the default payload size, the frame size for any codec is 20 ms. If you configure a non-default payload size by changing the bytes value in the codec (dial peer) command, compute the number of milliseconds in a voice frame by multiplying the payload size by 8000 (the TDM bit rate) and dividing the result by the codec bit rate. For example, with a 40-byte payload size and a G.729 codec, the payload size in milliseconds is calculated by the equation: 40 * 8000 / 8000 = 40 ms.
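The frame size arithmetic above can be written as a one-line function. This is a sketch of the stated formula; the function name is an assumption, and the codec bit rate is given in bits per second (8000 for G.729, 64000 for G.711).

```python
def frame_duration_ms(payload_bytes, codec_bit_rate_bps):
    """Voice time in one frame: payload size times 8000, divided by bit rate.

    The factor 8000 is 8 bits per byte times 1000 ms per second, so the
    result is in milliseconds when the bit rate is in bits per second.
    """
    return payload_bytes * 8000 / codec_bit_rate_bps


print(frame_duration_ms(40, 8000))    # 40.0 (G.729 with a 40-byte payload)
print(frame_duration_ms(160, 64000))  # 20.0 (G.711 with a 160-byte payload)
```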
Use the show call active voice command to display the current delay, as well as high- and low-water marks for delay during a call. Other fields that can help determine the size of a jitter problem are ReceiveDelay, GapFillWith..., LostPackets, EarlyPackets, and LatePackets.
Router# show call active voice
...
VOIP:
ConnectionId[0xECDE2E7B 0xF46A003F 0x0 0x47070A4]
IncomingConnectionId[0xECDE2E7B 0xF46A003F 0x0 0x47070A4]
RemoteIPAddress=192.168.100.101
RemoteUDPPort=18834
RoundTripDelay=26 ms
SelectedQoS=best-effort
tx_DtmfRelay=inband-voice
FastConnect=TRUE
Separate H245 Connection=FALSE
H245 Tunneling=FALSE
SessionProtocol=cisco
SessionTarget=
OnTimeRvPlayout=417000
GapFillWithSilence=850 ms
GapFillWithPrediction=2590 ms
GapFillWithInterpolation=0 ms
GapFillWithRedundancy=0 ms
HiWaterPlayoutDelay=70 ms
LoWaterPlayoutDelay=29 ms
ReceiveDelay=39 ms
LostPackets=0
EarlyPackets=0
LatePackets=86
Examples
The following example uses the default adaptive mode for the playout-delay command, with a minimum playout delay of 10 milliseconds and a maximum playout delay of 60 milliseconds, on the VoIP dial-peer tagged 80. The size of the jitter buffer will be adjusted up and down based on the amount of jitter the DSP finds, but will never be smaller than 10 ms, and never larger than 60 ms.
dial-peer voice 80 voip
 playout-delay minimum low
 playout-delay maximum 60
The next example configures fixed mode on a Cisco 3640 voice port, with a nominal delay of 80 ms.
voice-port 1/1/0
 playout-delay mode fixed
 playout-delay nominal 80
Related Commands
Command                   Description
playout-delay mode        Selects fixed or adaptive mode for the jitter buffer on Digital Signal Processors (DSPs).
show call active voice    Displays active call information for voice calls.
playout-delay mode
To select fixed or adaptive mode for playout delay from the jitter buffer on Digital Signal Processors (DSPs), use the playout-delay mode command in dial peer configuration mode. To restore the default value, use the no form of this command.
playout-delay mode {adaptive | fixed [no-timestamps]}
no playout-delay mode {adaptive | fixed [no-timestamps]}
Syntax Description
Defaults
The default is adaptive.
Command Modes
Dial peer configuration
Voice-port configuration
Command History
Usage Guidelines
Prior to Cisco IOS Release 12.1(5)T, the playout-delay command was configured in voice-port configuration mode. For Cisco IOS Release 12.1(5)T and later releases, in most cases playout delay should be configured in dial peer configuration mode on the VoIP dial peer on the receiving end of the voice traffic to be buffered. This dial peer senses network conditions and relays them to the DSPs, which adjust the jitter buffer as necessary. When multiple applications are configured on the gateway, playout delay should be configured in dial peer configuration mode. When there are numerous dial peers to configure, it might be simpler to configure playout delay on a voice port. If conflicting playout delay values have been configured on a voice port and on a dial peer, the dial peer configuration takes precedence.
In most networks with normal jitter conditions, the default is adequate and you do not need to configure the playout-delay mode command.
The default is adaptive mode, in which the average delay for voice packets varies based on the amount of inter-arrival variation that packets have as the call progresses. The jitter buffer grows and shrinks to compensate for jitter and to keep voice packets playing out smoothly, within the maximum and minimum limits that have been configured.
Select fixed mode only when you understand your network conditions well, and when you have a network with very poor quality of service (QoS) or when you are interworking with a media server or similar transmission source that tends to create a lot of jitter at the transmission source. In most situations, it is better to configure adaptive mode and let the DSP size the jitter buffer according to current conditions.
Examples
The following example configures adaptive playout-delay mode, with the minimum delay set at high (80 ms), on a VoIP dial peer that has a tag of 80:
dial-peer voice 80 voip
 playout-delay mode adaptive
 playout-delay minimum high
The next example configures fixed mode on a Cisco 3640 voice port, with a nominal delay of 80 ms.
voice-port 1/1/0
 playout-delay mode fixed
 playout-delay nominal 80
Related Commands
Command                   Description
playout-delay             Tunes the jitter buffer on Digital Signal Processors (DSPs) for playout delay of voice packets.
show call active voice    Displays active call information for voice calls.
Glossary
concealment—The synthesizing of packets to replace those that were lost or not received in time for playout to the decoder.
dial peer—An addressable call endpoint in the gateway router, which describes the characteristics associated with one or more call legs that terminate at the endpoint. (A call leg is the discrete segment of a call connection that lies between two endpoints in the connection.)
DSP—Digital Signal Processor. CPU that is specialized to perform complex calculations and processing on digitized data that was once analog data.
gateway—A gateway allows H.323 terminals to communicate with non-H.323 terminals by converting protocols. A gateway is the point at which a circuit-switched call is encoded and repackaged into IP packets.
H.323—ITU-T specification for real-time multimedia applications.
jitter—On voice calls, the packet inter-arrival variability caused by queuing delays and congestion in the network; the difference between when a packet is expected to arrive and when it actually is received. Jitter causes discontinuity in the real-time voice stream, which we hear as a choppy audio signal. It is a variable component of the total end-to-end network delay.
jitter buffer—Storage area used for handling voice packets that are in transit between the network and a codec. The jitter buffer's primary function is to reduce jitter by evening out the variability in the delivery times for voice packets.
MGCP—Media Gateway Control Protocol. Means of controlling telephone gateways through external call-control elements.
playout buffer—See jitter buffer.
RTP—Real-Time Transport Protocol. Defined by RFC 1889, a session-layer protocol used in IP networks for real-time traffic such as voice and video.
SIP—Session Initiation Protocol. Protocol that enables end devices (endpoints or gateways) to signal the setup and control of voice and multimedia call sessions over IP networks.





