Cisco Unified Contact Center Enterprise

Heartbeats Dropped / Loss of Connectivity

Document ID: 116081

Updated: May 24, 2013

Contributed by Holger Esser and Jason Hutchinson, Cisco TAC Engineers.

Introduction

This document describes a loss of heartbeats between the Voice Response Unit Peripheral Interface Manager (VRU PIM) and the Customer Voice Portal (CVP) server. The loss of heartbeats caused a failover and intermittent issues.

Symptoms

  • On the PIM server, the logs show these errors:
    pim1 Error receiving data from VRU.
    Last API Error [10054]: An existing connection was
    forcibly closed by the remote host.

    pim1 TCP connection to VRU has been broken.
  • In the CVP Call Server error logs, the errors appear as:
    Mar 30 2013 19:36:46.105 -0500: 
    %CVP_8_5_ICM-1-LOGMSG_ICM_SS_STATE:
    Shutting down VRU PIM connection. Transition to
    partial service. [id:2006]

    Mar 30 2013 19:36:46.136 -0500:
    %CVP_8_5_MSGBUS-3-MESSAGING_LAYER:
    ConnectionServer(GED125)::
    terminateConnection on plugin(GED125)
    with connection(Socket[addr=/161.135.182.16,
    port=4335,localport=5000])
    due to: Plugin was stopped by the application [id:1]
  • In the CVP Call Server logs, the time stamps are more important than the errors themselves:

    Mar 30 2013 19:36:46.531 -0500: %CVP_8_5_IVR-7-CALL:  
    {Thrd=http-8000-1} VXMLManager:generateVXML:
    CALLGUID=E1D13C7998D111E288360013C39AE710
    Generated VXML from template 'PlayMediaIOS.template' for
    client: 161.135.211.38 clientType: IOS

    Mar 30 2013 19:36:57.328 -0500:
    %CVP_8_5_ICM-6-LOGMSG_ICM_SS_GENERAL_INFO: Missed 2 VRU PIM
    heartbeats. Closing session and waiting for new connection
    from PIM. [id:2007]

    Note: There is an 11-second gap in the CVP logs (19:36:46.531 to 19:36:57.328). This gap coincides with the PIM logs and the heartbeat loss.
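The gap in the excerpt above can also be located programmatically. This is a minimal sketch that assumes each log entry starts with a time stamp in the layout shown in the CVP excerpts. The function name find_log_gaps and the 5-second threshold are hypothetical choices for illustration, not values taken from CVP.

```python
import re
from datetime import datetime

# Time stamp layout seen in the CVP excerpts, e.g. "Mar 30 2013 19:36:46.531 -0500"
TS_PATTERN = re.compile(
    r"^\s*([A-Z][a-z]{2} \d{1,2} \d{4} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{4})"
)
TS_FORMAT = "%b %d %Y %H:%M:%S.%f %z"


def find_log_gaps(lines, threshold_seconds=5.0):
    """Return (previous, current, gap_seconds) for each pair of consecutive
    time stamps that are further apart than threshold_seconds."""
    gaps = []
    previous = None
    for line in lines:
        match = TS_PATTERN.match(line)
        if not match:
            continue  # continuation line without a time stamp
        current = datetime.strptime(match.group(1), TS_FORMAT)
        if previous is not None:
            gap = (current - previous).total_seconds()
            if gap > threshold_seconds:
                gaps.append((previous, current, gap))
        previous = current
    return gaps


# First lines of the CVP log entries quoted in this document.
sample = [
    "Mar 30 2013 19:36:46.105 -0500:",
    "Mar 30 2013 19:36:46.136 -0500:",
    "Mar 30 2013 19:36:46.531 -0500: %CVP_8_5_IVR-7-CALL:",
    "Mar 30 2013 19:36:57.328 -0500:",
]

gaps = find_log_gaps(sample)
for previous, current, gap in gaps:
    print(f"gap of {gap:.3f} s between {previous.time()} and {current.time()}")
```

With the four time stamps from this document, the only interval reported is the one between 19:36:46.531 and 19:36:57.328.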

Perfmon Collection from the CVP Side

Collect Perfmon logs (CSV format) from both affected servers. In this case, these were the CVP server and the Peripheral Gateway (PG) server where the affected VRU was hosted. Open the logs in Perfmon on a local system. Identify the time frame in which a heartbeat is missing or a gap in communication (logs) appears. Select the Deferred Procedure Call (DPC) Rate counter and check whether it rose during that time frame. In this scenario, the rate increased from 0 to 10 at the exact second the gap in logging appeared (refer to the figure). If the log gap coincides with DPC spikes, regardless of their size, then DPC processing is the probable cause of the dropped User Datagram Protocol (UDP) packets.
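The same correlation can be done outside of the Perfmon GUI. This is a minimal sketch that assumes the CSV export has the time stamp in the first column in MM/DD/YYYY HH:MM:SS.mmm form and a column whose name ends in "DPC Rate". The host name CVP1 and the sample rows are made up for illustration (they mirror the 0-to-10 increase described above), and the function names are hypothetical.

```python
import csv
import io
from datetime import datetime

# Assumed time stamp layout of a Perfmon CSV export (local server time).
PERFMON_TS_FORMAT = "%m/%d/%Y %H:%M:%S.%f"


def load_counter(csv_file, counter_suffix):
    """Return [(timestamp, value)] for the first column whose header
    ends with counter_suffix."""
    reader = csv.reader(csv_file)
    header = next(reader)
    matches = [i for i, name in enumerate(header) if name.endswith(counter_suffix)]
    if not matches:
        raise ValueError("counter not found: " + counter_suffix)
    samples = []
    for row in reader:
        try:
            stamp = datetime.strptime(row[0], PERFMON_TS_FORMAT)
            value = float(row[matches[0]])
        except (ValueError, IndexError):
            continue  # skip blank or malformed samples
        samples.append((stamp, value))
    return samples


def peak_in_window(samples, start, end):
    """Return the highest counter value sampled between start and end."""
    values = [value for stamp, value in samples if start <= stamp <= end]
    return max(values) if values else None


# Illustrative data only; not taken from the affected servers.
SAMPLE_CSV = r'''"(PDH-CSV 4.0) (Central Daylight Time)(300)","\\CVP1\Processor(_Total)\DPC Rate"
"03/30/2013 19:36:45.000","0"
"03/30/2013 19:36:46.000","0"
"03/30/2013 19:36:47.000","10"
"03/30/2013 19:36:52.000","10"
"03/30/2013 19:36:57.000","0"
'''

samples = load_counter(io.StringIO(SAMPLE_CSV), "DPC Rate")

# Log gap from the CVP Call Server logs in this document.
gap_start = datetime(2013, 3, 30, 19, 36, 46, 531000)
gap_end = datetime(2013, 3, 30, 19, 36, 57, 328000)

peak = peak_in_window(samples, gap_start, gap_end)
print("Peak DPC Rate during the log gap:", peak)
```

For a real export, replace io.StringIO(SAMPLE_CSV) with an open file handle and run the check against the logs from both the CVP server and the PG server.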

Cause/Problem Description

Deferred Procedure Calls

% DPC Time shows the percentage of time that the processor spent to receive and service deferred procedure calls (DPCs) in the sample interval time period. DPCs are interrupts that run at a lower priority than standard interrupts. % DPC Time is a component of % Privileged Time because DPCs are executed in privileged mode. They are counted separately and are not a component of the interrupt counters. This counter displays the average busy time as a percentage of the sample time.

Refer to Windows Server Processor Object (third-party website, not affiliated with Cisco).

How does DPC rate affect our communications and application?

The Ndis.sys driver queues the DPC routines at a low importance level on the same processor that services the interrupt service routine (ISR). Therefore, the UDP related DPC routine goes to the end of the queue, and this DPC routine is processed last. Additionally, the DPC queue of the processor may not be empty, and these DPCs for other I/O drivers are processed first. If the DPC rate is sufficiently high for all I/O drivers, not just for NDIS, there could be a noticeable delay.

Under a heavy stress situation, this delay could cause the system to drop packets when the Ethernet adapter's receive buffers fill while the receive buffers wait for the queued DPC routine to finish.
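The buffer exhaustion described above can be illustrated with simple arithmetic. This is a rough sketch, not a model of the NDIS driver: it assumes that packets arrive at a constant rate, that no receive buffer is drained while the DPC routine waits in the queue, and that the adapter has a fixed number of receive buffers. The function name and all of the numbers (20,000 packets per second, 256 buffers, delays of 5 ms and 20 ms) are hypothetical values chosen for illustration.

```python
def dropped_packets(arrival_rate_pps, dpc_delay_ms, rx_buffers):
    """Estimate how many packets are lost while the receive DPC is delayed.

    Packets that arrive during the delay occupy receive buffers; anything
    beyond the number of available buffers is dropped by the adapter.
    """
    arrived = arrival_rate_pps * dpc_delay_ms // 1000
    return max(0, arrived - rx_buffers)


# Hypothetical adapter with 256 receive buffers at 20,000 packets per second.
short_delay = dropped_packets(20000, 5, 256)   # 100 packets arrive, all fit
long_delay = dropped_packets(20000, 20, 256)   # 400 packets arrive, 144 do not fit
print(short_delay, long_delay)
```

Under these assumptions, a short delay is absorbed by the buffers, while a longer delay at the same packet rate overflows them.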

Refer to "Applications that use the UDP protocol may encounter poor performance on a computer that is running Windows Server 2003" (third-party website, not affiliated with Cisco).

Conditions/Environment

This affects UDP traffic only.

The usual suspects, such as NIC settings, TCP offloading, and VM snapshots, can also cause similar issues.

Resolution

Refer to "Applications that use the UDP protocol may encounter poor performance on a computer that is running Windows Server 2003" (third-party website, not affiliated with Cisco).

Hotfix Information

A supported hotfix is available from Microsoft. However, the hotfix is intended to correct only the problem that is described in this article. Apply this hotfix only to systems that experience the problem described in this article. This hotfix might receive additional testing. Therefore, if you are not severely affected by this problem, Cisco recommends that you wait for the next software update that contains this hotfix.

If the hotfix is available for download, there is a "Hotfix download available" section at the top of the Knowledge Base article. If the section does not appear, contact Microsoft Customer Service and Support to obtain the hotfix.

Note: If additional issues occur or if any troubleshooting is required, you might have to create a separate service request. The usual support costs apply to additional support questions and issues that do not qualify for this specific hotfix. For a complete list of Microsoft Customer Service and Support telephone numbers, or to create a separate service request, visit the Microsoft Support Contact page (third-party website, not affiliated with Cisco).

Note: The 'Hotfix download available' form displays the languages for which the hotfix is available. If you do not see your language, a hotfix is not available for that language.

Prerequisites

To apply this hotfix, your computer must run Windows Server 2003 Service Pack 2 (SP2).

Restart Requirement

You must restart the computer after you apply this hotfix.

Registry Information

You do not have to make any change to the registry.

File Information

The English version of this hotfix has the file attributes (or later file attributes) that are listed in these tables. The dates and times for these files are listed in Coordinated Universal Time (UTC). When you view the file information, it is converted to local time. To find the difference between UTC and local time, use the Time Zone tab in the Date and Time item in Control Panel.

For all Supported x86-based Versions of Windows Server 2003

File name   File version    File size   Date          Time    Platform
Ndis.sys    5.2.3790.4524   210,432     04-Jun-2009   13:29   x86

For all Supported x64-based Versions of Windows Server 2003 and of Windows XP

File name   File version    File size   Date          Time    Platform
Ndis.sys    5.2.3790.4524   361,984     04-Jun-2009   17:48   x64

For all Supported Itanium-based Versions of Windows Server 2003

File name   File version    File size   Date          Time    Platform
Ndis.sys    5.2.3790.4524   646,656     04-Jun-2009   17:49   IA-64
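To check whether a server already has the hotfix level of Ndis.sys, compare the installed file version (for example, from the Version tab of the file properties) with the version listed in the tables above. This is a minimal sketch: the helper names are hypothetical, and the installed versions passed in are example inputs rather than values read from a system. The versions are compared field by field as numbers because a plain string comparison sorts "10000" before "4524".

```python
HOTFIX_NDIS_VERSION = "5.2.3790.4524"  # Ndis.sys version from the file tables


def parse_version(text):
    """Split a dotted file version into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_hotfix_level(installed, required=HOTFIX_NDIS_VERSION):
    """Return True if the installed version is the hotfix version or later."""
    return parse_version(installed) >= parse_version(required)


# Example inputs only; read the real value from Ndis.sys on the server.
print(meets_hotfix_level("5.2.3790.3959"))   # older build, hotfix not present
print(meets_hotfix_level("5.2.3790.4524"))   # hotfix level
print(meets_hotfix_level("5.2.3790.10000"))  # later build, compared numerically
```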

Note: To work around the issue, enable the receive-side scaling (RSS) feature on the affected computer.
