Serviceability Best Practices Guide for Cisco Unified ICM/Contact Center Enterprise, Release 10.0(1)
Understanding Unified ICM/Unified CCE SNMP Notifications

Most Unified ICM/Unified CCE SNMP notifications are "stateful" events; each event correlates to a managed object. An object is defined as having dual state or single state.

Unified ICM/Unified CCE Notification Type

cccaIcmEvent

An ICM event is a notification sent by a functional component of the Cisco Unified Intelligent Contact Management (Unified ICM), Cisco Unified Contact Center Enterprise (Unified CCE), or Cisco Unified Contact Center Hosted (Unified CCH) contact center applications.

The following table describes the objects that make up the notification type:

Table 1 Unified ICM/Unified CCE Notification Type Objects

Object Name

Description

cccaEventComponentId

A unique identifier used to correlate multiple notifications generated by a single enterprise contact center application functional component or subcomponent. A functional component constructs its unique identifier based upon configured parameters; all notifications by that component include this event component ID.

cccaEventState

The state (not to be confused with severity) of the notification and potentially the current state of the functional component that generated the notification. The possible states are:

'clear' (0): The clear state indicates that the condition that generated a previous raise notification is resolved.

'applicationError' (2): The application error state alerts the recipient that an error exists in the enterprise contact center application but that the error does not affect the operational status of the functional component.

'raise' (4): A raise state identifies a notification received because of a health-impacting condition, such as a process failure. A subsequent clear state notification follows when the error condition is resolved.

'singleStateRaise' (9): The single state raise state indicates that a health-impacting error occurred and that a subsequent clear state notification is not forthcoming. An example of a single state raise condition is an application configuration error that requires the system to be stopped and the problem resolved by an administrator before the affected component functions properly.

cccaEventMessageId

The unique notification message identifier (value) that was assigned by the enterprise contact center application. This identifier is unique for each different notification but consistent for each instance of the same notification.

cccaEventOriginatingNode

The application-defined name of the enterprise contact center application functional component that generated this notification. This name varies, both in content and in format, based on the component that generated the notification. For example, the name for a Router component may be 'RouterA', a combination of the component identification and the 'side' identifier, while the name 'PG1A' is a combination of the peripheral gateway acronym followed by the peripheral gateway number and the ‘side’ identifier.

cccaEventOriginatingNodeType

The type of enterprise contact center application functional component or subcomponent that generated this notification. The node types are:

'unknown' (0): The notification originates from an unknown source.

'router' (1): The notification was generated by the Router functional component.

'pg' (2): The notification was generated by the peripheral gateway functional component.

'nic' (3): The notification was generated by the network interface controller functional component.

'aw' (4): The notification was generated by the administrator workstation functional component.

'logger' (5): The notification was generated by the Logger functional component.

'listener' (6): The notification was generated by the listener functional component. The listener is an enterprise contact center application process that collects event messages from the Logger for display in a Cisco proprietary event management application that is part of the Remote Management Suite (RMS).

'cg' (7): The notification was generated by the CTI gateway functional component.

'ba' (8): The notification was generated by the Blended Agent functional component. Blended Agent is an enterprise contact center 'outbound option' functional component that manages campaigns of Outbound Dialing.

cccaEventOriginatingProcessName

Each enterprise contact center application functional component includes one or more operating system processes, each of which performs a specific function. The event originating process object identifies the name of the application process that generated this notification.

cccaEventOriginatingSide

The enterprise contact center application functional component fault tolerant side (either 'A' or 'B') that generated this notification.

cccaEventDmpId

The Device Management Protocol (DMP) is a session layer protocol used for network communication between enterprise contact center application functional components. The DMP ID uniquely identifies the session layer addresses of an application functional component. A single component may have multiple DMP IDs because a functional component communicates with other functional components (or its duplex pair) via multiple physical network interfaces and maintains multiple DMP session connections on each interface. Should a communications failure occur, the event DMP ID identifies the physical and logical address at which the error occurred.

cccaEventSeverity

The severity level of this notification. The severity levels are:

'informational' (1): The notification contains important health or operational state information that is valuable to an administrator; however, the event itself does not indicate a failure or impairment condition.

'warning' (2): The notification contains serious health or operational state information that could be a precursor to system impairment or eventual failure.

'error' (3): The notification contains critical health or operational state information and indicates that the system has experienced an impairment and/or a functional failure.

cccaEventTimestamp

The date and time that the notification was generated on the originating node.

cccaEventText

The full text of the notification. This text includes a description of the event that was generated, component state information, and potentially a brief description of administrative action that may be necessary to correct the condition that caused the event to occur.
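
The enumerated values in Table 1 can be captured in code so that a monitoring script compares named constants rather than bare integers. The following is a minimal sketch in Python; the class names, field names, and the `expects_clear` helper are illustrative assumptions and are not part of the MIB or of any Cisco tool. Only the numeric values and their labels come from the table above.

```python
from dataclasses import dataclass
from enum import IntEnum


class EventState(IntEnum):
    """cccaEventState values listed in Table 1."""
    CLEAR = 0
    APPLICATION_ERROR = 2
    RAISE = 4
    SINGLE_STATE_RAISE = 9


class NodeType(IntEnum):
    """cccaEventOriginatingNodeType values listed in Table 1."""
    UNKNOWN = 0
    ROUTER = 1
    PG = 2
    NIC = 3
    AW = 4
    LOGGER = 5
    LISTENER = 6
    CG = 7
    BA = 8


class Severity(IntEnum):
    """cccaEventSeverity values listed in Table 1."""
    INFORMATIONAL = 1
    WARNING = 2
    ERROR = 3


@dataclass(frozen=True)
class IcmEvent:
    """Hypothetical in-memory form of one cccaIcmEvent notification."""
    component_id: str
    state: EventState
    message_id: int
    originating_node: str
    originating_node_type: NodeType
    originating_process_name: str
    originating_side: str  # 'A' or 'B'
    dmp_id: int
    severity: Severity
    timestamp: str
    text: str

    def expects_clear(self) -> bool:
        # Only a dual state raise is followed by a clear notification.
        return self.state is EventState.RAISE
```

Keeping state and severity as separate enumerations mirrors the distinction the table draws between the two: a `clear` notification can carry any severity, and a `warning` can be either a raise or a single state raise.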

Dual State Objects

Most objects are defined as dual state; they have either a raise or clear state. The raise state indicates that there is a problem or fault associated with the object. The clear state indicates the object is operating normally.

A dual state Unified ICM/Unified CCE SNMP notification contains a raise(4) or clear(0) value in the cccaEventState field. In some cases, multiple raise notifications can correlate to the same object. For example, an object can go offline for a variety of reasons: process termination, network failure, software fault, and so on. The SNMP notification cccaEventComponentId field specifies a unique identifier that you can use to correlate common raise and clear notifications to a single managed object.

The following example shows a pair of raise and clear notifications with the same cccaEventComponentId.


Note: The first notification has a raise state; the notification that follows has a clear state.

    snmpTrapOID.0 = cccaIcmEvent
    cccaEventComponentId = 4_1_CC-RGR1A_ICM\acme\RouterA
    cccaEventState = raise(4)
    cccaEventMessageId = 2701295877
    cccaEventOriginatingNode = CC-RGR1A\acme
    cccaEventOriginatingNodeType = router(1)
    cccaEventOriginatingProcessName = nm
    cccaEventOriginatingSide = sideA(1)
    cccaEventDmpId = 0
    cccaEventSeverity = warning(2)
    cccaEventTimestamp = 2006-03-31,14:19:42.0
    cccaEventText = The operator/administrator has shutdown the ICM software on ICM\acme\RouterA
  
    snmpTrapOID.0 = cccaIcmEvent
    cccaEventComponentId = 4_1_CC-RGR1A_ICM\acme\RouterA
    cccaEventState = clear(0)
    cccaEventMessageId = 1627554051
    cccaEventOriginatingNode = CC-RGR1A\acme
    cccaEventOriginatingNodeType = router(1)
    cccaEventOriginatingProcessName = nm
    cccaEventOriginatingSide = sideA(1)
    cccaEventDmpId = 0
    cccaEventSeverity = informational(1)
    cccaEventTimestamp = 2006-03-31,13:54:12.0
    cccaEventText = ICM\acme\RouterA Node Manager started. Last shutdown was by operator request.
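
A textual dump in the `name = value` form shown above can be turned into a dictionary for further processing. The sketch below assumes that exact layout, one varbind per line with enumerated values written as `label(number)`; real trap receivers format their output differently, so treat the function name `parse_varbinds` and the parsing rules as assumptions to adapt.

```python
import re

# Matches enumerated values such as "raise(4)" or "sideA(1)".
_ENUM_VALUE = re.compile(r"^(?P<label>[A-Za-z]\w*)\((?P<number>\d+)\)$")


def parse_varbinds(dump: str) -> dict:
    """Parse 'name = value' lines into a dict.

    Enumerated values become (label, number) tuples; everything else
    is kept as a stripped string.
    """
    fields = {}
    for line in dump.splitlines():
        name, sep, value = line.partition(" = ")
        if not sep:
            continue  # skip blank or malformed lines
        value = value.strip()
        match = _ENUM_VALUE.match(value)
        if match:
            fields[name.strip()] = (match["label"], int(match["number"]))
        else:
            fields[name.strip()] = value
    return fields


# A few lines in the layout of the example above.
SAMPLE = r"""
    snmpTrapOID.0 = cccaIcmEvent
    cccaEventComponentId = 4_1_CC-RGR1A_ICM\acme\RouterA
    cccaEventState = raise(4)
    cccaEventSeverity = warning(2)
    cccaEventDmpId = 0
"""

event = parse_varbinds(SAMPLE)
print(event["cccaEventState"])  # ('raise', 4)
```

Splitting on the first ` = ` only means that a cccaEventText value containing an equals sign is preserved intact.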

The CCCA-Notifications.txt file is installed in the icm\snmp directory as part of Unified ICM/Unified CCE installation. It contains the complete set of SNMP notifications, which you can use to identify grouped events. The Correlation ID is the data used to generate the cccaEventComponentId, which is determined at run time. The following entries correspond to the SNMP notifications in the preceding example.

Table 2 Example: Raise Notification

Field | Value / Description
NOTIFICATION | 1028105
cccaEventMessageId | 2701295877 (0xA1028105)
DESCRIPTION | Node Manager on the ICM node has been given the command to stop ICM services. This occurs when an operator/administrator stops ICM services using ICM Service Control, 'nmstop', 'netstop', Control Panel Services, or shuts down the node.
cccaEventState | Raise
SUBSTITUTION STRING | The operator/administrator has shut down the ICM software on %1.
ACTION | Contact the operator/administrator to determine the reason for the shutdown.
cccaEventComponentId | { cccaEventOriginatingNode %1 }
CorrelationId | { CLASS_NM_INITIALIZING cccaEventOriginatingNode %1 }

Table 3 Example: Clear Notification

Field | Value / Description
NOTIFICATION | 1028103
cccaEventMessageId | 1627554051 (0x61028103)
DESCRIPTION | The Node Manager successfully started. The last reason the Node Manager stopped was because a clean shutdown of the ICM code was requested by the operator.
cccaEventState | Clear
SUBSTITUTION STRING | %1 Node Manager started. Last shutdown was by operator request.
ACTION | No action is required.
cccaEventComponentId | { cccaEventOriginatingNode %1 }
CorrelationId | { CLASS_NM_INITIALIZING cccaEventOriginatingNode %1 }
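
In every example in this chapter, the NOTIFICATION value equals the low 28 bits of cccaEventMessageId written in hexadecimal (for instance, 0xA1028105 and 1028105). The sketch below relies on that observed pattern to look up a received message ID in CCCA-Notifications.txt; it is an inference from the examples, not a documented rule. The `expand` helper likewise assumes that `%1`, `%2`, and so on in a substitution string are replaced positionally. Both function names are hypothetical.

```python
import re


def notification_code(message_id: int) -> str:
    """Return the low 28 bits of a cccaEventMessageId as uppercase hex."""
    return format(message_id & 0x0FFFFFFF, "X")


def expand(template: str, *args: str) -> str:
    """Replace %1, %2, ... in a substitution string with positional args.

    Placeholders without a matching argument are left untouched.
    """
    def replace(match):
        index = int(match.group(1)) - 1
        return args[index] if 0 <= index < len(args) else match.group(0)

    return re.sub(r"%(\d+)", replace, template)


print(notification_code(2701295877))  # 1028105
print(notification_code(1627554051))  # 1028103
print(expand("%1 Node Manager started. Last shutdown was by operator request.",
             r"ICM\acme\RouterA"))
```

The last call reproduces the cccaEventText of the clear notification in the earlier example.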

Correlating Notifications

The cccaEventComponentId is the primary means of matching a clear event to a raise event. When a clear event is received, all pending raise events with the same alarm class and with a matching cccaEventComponentId should be cleared.

"Raise" Event:

cccaEventComponentId:"4_1_acme-rgr_ICM\acme\RouterA"

Event Class:CLASS_NM_INITIALIZING

cccaEventState:raise(4)

cccaEventMessageId:2701295877

cccaEventSeverity:warning(2)

cccaEventText:The operator/administrator has shutdown the ICM software on ICM\acme\RouterA.

"Clear" Event:

cccaEventComponentId:"4_1_acme-rgr_ICM\acme\RouterA"

Event Class:CLASS_NM_INITIALIZING

cccaEventState:clear(0)

cccaEventMessageId:1627554051

cccaEventSeverity:informational(1)

cccaEventText:ICM\acme\RouterA Node Manager started. Last shutdown was by operator request.

Upon receipt of a "Raise" event, categorize it by severity.

Upon receipt of a "Clear" event, match it to the "Raise" event using cccaEventComponentId.

In the above example notifications, a simple string comparison of the cccaEventComponentId values can suffice in matching the clear to the raise. The cccaEventComponentId value has the event class built in, and the rest of the string is crafted to be sufficiently unique to ensure that the appropriate raises are cleared by the clear notification. (Remember: Multiple raise notifications can be cleared by a single clear notification.)

Sample logic:

    if (cccaEventState == "clear") {
        set ID = cccaEventComponentId;
        for (all "raise" events where cccaEventComponentId == ID)
            Acknowledge();
    }
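
The correlation rule described in this section can be sketched in Python as follows. The `AlarmTable` class and its methods are hypothetical, and events are assumed to arrive as dictionaries keyed by the MIB object names with cccaEventState already reduced to its label (`"raise"` or `"clear"`).

```python
from collections import defaultdict


class AlarmTable:
    """Tracks pending raise events keyed by cccaEventComponentId."""

    def __init__(self):
        self._pending = defaultdict(list)

    def on_event(self, event: dict) -> list:
        """Process one notification.

        Returns the list of raise events acknowledged by this
        notification (empty unless it is a clear that matched).
        """
        state = event["cccaEventState"]
        component_id = event["cccaEventComponentId"]
        if state == "raise":
            self._pending[component_id].append(event)
            return []
        if state == "clear":
            # One clear acknowledges every pending raise with the same ID.
            return self._pending.pop(component_id, [])
        return []

    def pending(self, component_id: str) -> list:
        """Return the raises still open for a component ID."""
        return list(self._pending.get(component_id, []))


table = AlarmTable()
cid = r"4_1_acme-rgr_ICM\acme\RouterA"
table.on_event({"cccaEventComponentId": cid, "cccaEventState": "raise",
                "cccaEventMessageId": 2701295877})
cleared = table.on_event({"cccaEventComponentId": cid, "cccaEventState": "clear",
                          "cccaEventMessageId": 1627554051})
print(len(cleared))  # 1
```

The table is keyed by component ID rather than by message ID because the raise and clear in the example carry different cccaEventMessageId values.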

There is no one-to-one mapping of alarms by event message ID.


Note: SNMP notifications do not have a unique OID assigned to each alarm. Statically assigning an OID to a notification requires that the notification be explicitly documented (in Cisco customer-facing documents) and maintained following an established deprecation schedule. With so many Cisco devices in service, maintaining such a list is impossible. The event definition method in the CISCO-CONTACT-CENTER-APPS-MIB is consistent with the Cisco Unified Communications Manager (CISCO-CCM-MIB) and Cisco Unified Contact Center Express (CISCO-VOICE-APPS-MIB) product MIBs.

Single State Objects

A single state object has only a raise state. Because there is no corresponding clear event, the administrator must manually clear the object. Single state objects are typically used when a corresponding clear event cannot be tracked, for example, when the database is corrupt. Single state Unified ICM/Unified CCE SNMP notifications contain the singleStateRaise(9) value in the cccaEventState field.

The following example shows a value of Single-state Raise in the cccaEventState field to identify a single state object.

Table 4 Example: Single-State Raise Notification

Field | Value / Description
NOTIFICATION | 105023C
cccaEventMessageId | 3775201852 (0xE105023C)
DESCRIPTION | The Router has detected that it is no longer synchronized with its partner. One result of this is that the Router might be routing some calls incorrectly.
cccaEventState | Single-state Raise
SUBSTITUTION STRING | The Router has detected that it is no longer synchronized with its partner.
ACTION | Action: Stop the Router on both sides. After both sides are completely stopped, restart both Routers. Alternate Action: Restart the Router on one side. After doing this, the Routers might still route some calls incorrectly, but they will be in sync. Other actions: Collect all rtr, mds, ccag process logs from both Routers from the entire day. Collect all sync*.sod files (where * is some number) that exist in the icm\<instance>\ra directory of Router A and in the icm\<instance>\rb directory of Router B. Contact the .
cccaEventComponentId | { cccaEventOriginatingNode cccaEventOriginatingProcessName cccaEventOriginatingSide }
CorrelationId | { CLASS_RTR_SYNC_CHECK cccaEventOriginatingNode cccaEventOriginatingProcessName cccaEventOriginatingSide }
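
Because no clear notification follows a single state raise, a monitoring tool needs a manual path for closing these alarms. The following sketch illustrates one way to do that; the `SingleStateAlarms` class, its method names, and the component ID used in the usage lines are made up for illustration. Only the numeric state value 9 comes from this guide.

```python
SINGLE_STATE_RAISE = 9  # cccaEventState value for singleStateRaise


class SingleStateAlarms:
    """Holds single state raises until an administrator clears them."""

    def __init__(self):
        self._open = {}  # component ID -> most recent event text

    def on_event(self, component_id: str, state: int, text: str) -> None:
        # Only single state raises are recorded here; other states are
        # ignored because no automatic clear will ever match these alarms.
        if state == SINGLE_STATE_RAISE:
            self._open[component_id] = text

    def manual_clear(self, component_id: str) -> bool:
        """Clear an alarm by hand; return True if one was open."""
        return self._open.pop(component_id, None) is not None

    def open_alarms(self) -> dict:
        """Return a copy of the alarms still awaiting manual clearing."""
        return dict(self._open)


alarms = SingleStateAlarms()
# "example-component-id" is a placeholder, not a real cccaEventComponentId.
alarms.on_event("example-component-id", SINGLE_STATE_RAISE,
                "The Router has detected that it is no longer synchronized with its partner.")
print(len(alarms.open_alarms()))                    # 1
print(alarms.manual_clear("example-component-id"))  # True
```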

Organizing SNMP Notifications

Using the contents of the following Unified ICM/Unified CCE SNMP notification fields, an SNMP Monitoring tool can group Unified ICM/Unified CCE SNMP notifications in an organized, hierarchical manner.

cccaEventOriginatingNode = CC-RGR1A\acme
cccaEventOriginatingNodeType = router(1)
cccaEventOriginatingSide = sideA(1)



where:

Unified ICM/CCE Node Name = left side of cccaEventOriginatingNode

Instance Name = right side of cccaEventOriginatingNode

Component Name = cccaEventOriginatingNodeType + cccaEventOriginatingSide letter

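For example, the fields shown above yield the node name CC-RGR1A, the instance name acme, and the component name RouterA.
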

Within this node, raise and clear events with the same cccaEventComponentId can be grouped as a single object.
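
The grouping described above can be sketched as a nested dictionary keyed by node name, instance name, component name, and finally cccaEventComponentId. The sketch assumes field values in the `label(number)` form shown earlier and assumes that the side label ends in the side letter (as in `sideA`). The function names and the capitalization of the component name are illustrative choices only.

```python
def hierarchy_key(originating_node: str, node_type: str, side: str) -> tuple:
    """Derive (node name, instance name, component name) from three fields."""
    # Node name is the left side of the backslash, instance name the right.
    node_name, _, instance_name = originating_node.partition("\\")
    type_label = node_type.partition("(")[0]   # "router(1)" -> "router"
    side_letter = side.partition("(")[0][-1]   # "sideA(1)"  -> "A"
    component = type_label[:1].upper() + type_label[1:] + side_letter
    return node_name, instance_name, component


def group_events(events: list) -> dict:
    """Build node -> instance -> component -> component ID -> [events]."""
    tree = {}
    for event in events:
        node, instance, component = hierarchy_key(
            event["cccaEventOriginatingNode"],
            event["cccaEventOriginatingNodeType"],
            event["cccaEventOriginatingSide"],
        )
        (tree.setdefault(node, {})
             .setdefault(instance, {})
             .setdefault(component, {})
             .setdefault(event["cccaEventComponentId"], [])
             .append(event))
    return tree


print(hierarchy_key(r"CC-RGR1A\acme", "router(1)", "sideA(1)"))
# ('CC-RGR1A', 'acme', 'RouterA')
```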

CSFS Heartbeat Notification

Monitor the Customer Support Forwarding Service (CSFS) heartbeat notification specifically; it is a critical SNMP notification.

Table 5 CSFS Heartbeat Notification

Field | Value / Description
NOTIFICATION | 12A0003
cccaEventMessageId | 1630142467 (0x612A0003)
DESCRIPTION | Periodic message to indicate MDS is in service and that the event stream is active.
cccaEventState | (blank)
SUBSTITUTION STRING | HeartBeat Event for %1
ACTION | No action is required.
cccaEventComponentId | { cccaEventOriginatingNode %1 }
CorrelationId | n/a


Note: The CCCA-Notifications.txt file defines the decimal value of cccaEventMessageId for this event incorrectly as 19529731.

The heartbeat notification is sent periodically by the Logger CSFS process to indicate that a healthy connection exists between the Router and the Logger and that the Logger SNMP notification feed is active. The heartbeat interval defaults to 720 minutes (12 hours). The interval is this long to accommodate deployments that use a modem to communicate notifications.

You can modify the interval via the Windows Registry value: heartbeatIntervalMinutes, in:

HKLM\SOFTWARE\Cisco Systems, Inc.\ICM\<instance>\Logger<A or B>\CSFS\CurrentVersion

The interval can be as much as one minute longer than the configured interval, so the logic that reacts to these events should employ a certain "deadband" – in other words, allow for at least 60 seconds beyond the scheduled interval before assuming the worst.
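
A watchdog that applies this deadband might look like the sketch below. The function name, the parameter names, and the 120-second default deadband are assumptions chosen for illustration; the 720-minute default interval, the 60-second minimum allowance, and the heartbeat message ID come from this section.

```python
from datetime import datetime, timedelta

HEARTBEAT_MESSAGE_ID = 1630142467  # 0x612A0003, from Table 5


def heartbeat_overdue(last_seen: datetime,
                      now: datetime,
                      interval_minutes: int = 720,
                      deadband_seconds: int = 120) -> bool:
    """Return True if the next heartbeat is later than interval + deadband.

    The deadband should be at least 60 seconds, because the actual
    interval can run up to one minute longer than the configured value.
    """
    deadline = last_seen + timedelta(minutes=interval_minutes,
                                     seconds=deadband_seconds)
    return now > deadline


last = datetime(2024, 1, 1, 0, 0, 0)
print(heartbeat_overdue(last, last + timedelta(minutes=720, seconds=60)))  # False
print(heartbeat_overdue(last, last + timedelta(minutes=723)))              # True
```

If the registry value heartbeatIntervalMinutes is changed, pass the same number as `interval_minutes` so that the watchdog and the Logger agree on the schedule.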

Important: Monitoring this heartbeat notification provides an additional measure of safety. If the communication infrastructure that sends notifications were to fail, you might assume that the system is operating normally when, in fact, it is not. If this heartbeat event stops arriving at the management station, the communication infrastructure is impaired and immediate attention is necessary.