User Guide for Cisco Unified Operations Manager 2.2
Events Processed


Events Processed


This section covers the following topics:

Event Information

Supported Events

Obsolete Events

Suppressing or Unsuppressing Events

Understanding Event Flooding Control

Event Information

Operations Manager is now configured as a Cisco Unified Communications Manager remote syslog receiver for the Cisco CallManager (called Unified Communications Manager in later releases), Call Detail Records (CDR), Disaster Recovery Framework (DRF), and AONS Management Console (AMC) services.

Table E-1 lists all possible events you might see on a Monitoring Dashboard, along with the following:

Description: A summary of the event, including typical causes (if known).

Default Polling Interval: How often this data is monitored.

Default Threshold: The preset value at which Operations Manager begins reporting the condition.

Trigger: How Operations Manager learns of the event: from normal polling, RTMT, a syslog that was received, a threshold that was exceeded, a diagnostic test result, a trap that was received, or an event received from Windows Event Manager. For a list of thresholds and the events that they trigger, see Table 19-15.

Severity: The severity that Operations Manager assigns to the event: critical, warning, or informational.

Device Type: The devices, as classified in Operations Manager, on which the event can occur.

Clear Interval: The interval after which Operations Manager clears the event from the Alert Detail View. Some syslog-based events have a predetermined clear interval. When the clear interval expires, Operations Manager moves the event from Active to Cleared. (For more information, see Responding to Events Using the Alert Details Page, page 3-31.)

Event Code: The code used by Notifications to track changes to default Operations Manager event names made with the Notification event customization feature. (For more information, see Customizing Events, page 15-26.)

Recommended Action: One or more actions to take to resolve the problem that caused the event to be displayed.
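The fields described above can be pictured as one record per event. The following is a minimal sketch in Python, not the data model that Operations Manager actually uses. The names `EventRecord` and `Severity` and all field names are hypothetical. The sample values come from the Authentication Failed entry in Table E-1.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Severity(Enum):
    """The three severities that the guide says can be assigned to an event."""
    CRITICAL = "critical"
    WARNING = "warning"
    INFORMATIONAL = "informational"


@dataclass(frozen=True)
class EventRecord:
    """Hypothetical record mirroring the fields listed for Table E-1."""
    name: str
    description: str
    trigger: str
    severity: Severity
    device_type: str
    event_codes: Tuple[int, ...]          # some entries list more than one code
    polling_interval_seconds: Optional[int] = None   # None when the entry lists none
    threshold: Optional[str] = None
    clear_interval_minutes: Optional[int] = None
    recommended_action: str = ""


# Sample values copied from the Authentication Failed entry.
auth_failed = EventRecord(
    name="Authentication Failed",
    description="Authentication failure in a login attempt.",
    trigger="Polling",
    severity=Severity.WARNING,
    device_type="Communications Manager version 6.x or later",
    event_codes=(7006,),
    clear_interval_minutes=30,
    recommended_action="Check security logs for further details.",
)

print(auth_failed.name, auth_failed.severity.value, auth_failed.event_codes)
```

Fields that an entry does not list, such as the polling interval here, are left as `None` rather than guessed.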

Some events are no longer available for monitoring and have been made obsolete. See Obsolete Events for a list. If you use any of these obsolete events, switch to the equivalent events listed there.

Events listed in Table E-1 are displayed on:

The Alert Details page—Shows the majority of events generated. Event names correspond to those displayed in the Description column of the Alert Details page.

The IP Phone Outage Status display—Shows information about the IP phones in your network that have become disconnected from the switch, are no longer registered to a Cisco Unified Communications Manager, or have gone into SRST mode. The following events cause activity to be displayed on the IP Phone Outage Status display:

SRSTEntered

SRSTSuspected

The Service Quality Alerts display—Shows events generated as a result of traps received from Service Monitor. The following events cause activity to be displayed on the Service Quality Alerts display:

CriticalServiceQualityIssue

ServiceQualityIssue


Note The SensorDown event is also generated as a result of traps received from Service Monitor. However, SensorDown appears under Unidentified Trap.
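The mapping from event name to display described above can be summarized in a short lookup. This is an illustrative Python sketch only. The function name `display_for` and the set names are hypothetical. Treating the Alert Details page as the fallback for every other event is an assumption based on the statement that it shows the majority of events.

```python
# Event names taken from the lists above.
IP_PHONE_OUTAGE_EVENTS = {"SRSTEntered", "SRSTSuspected"}
SERVICE_QUALITY_EVENTS = {"CriticalServiceQualityIssue", "ServiceQualityIssue"}


def display_for(event_name: str) -> str:
    """Return the display on which an event is expected to cause activity."""
    if event_name in IP_PHONE_OUTAGE_EVENTS:
        return "IP Phone Outage Status"
    if event_name in SERVICE_QUALITY_EVENTS:
        return "Service Quality Alerts"
    # Assumption: anything else is shown on the Alert Details page.
    return "Alert Details"


for name in ("SRSTEntered", "ServiceQualityIssue", "CCMEDown"):
    print(name, "->", display_for(name))
```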


Supported Events

Table E-1 lists the events supported by Operations Manager.


Note When Operations Manager displays event information in event displays, Fault History reports, and notifications, it replaces a single quote with a space. (CSCsx81860)
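The substitution described in the note can be reproduced as follows when you compare displayed event text against the original device message. This is a sketch of the documented behavior, not Operations Manager code, and the function name `sanitize_event_text` is hypothetical.

```python
def sanitize_event_text(text: str) -> str:
    """Replace each single quote with a space, as the note describes."""
    return text.replace("'", " ")


print(sanitize_event_text("Device 'SEP001122334455' unregistered"))
```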


Table E-1 Events that Operations Manager Supports 

Event
Description, Cause, Severity and Event Code

Authentication Failed

Description: Occurs when authentication fails during a login attempt. This event is generated by monitoring the syslog messages received from Communications Manager.

Trigger: Polling.

Severity: Warning.

Device Type: Communications Manager version 6.x or later.

Clear Interval: Time-based auto-clear in Event Promulgation Module (EPM) after 30 minutes.

Event Code: 7006.

Recommended Action: Check for gateway port unavailability or an out-of-service issue. Check security logs for further details.
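The time-based auto-clear behavior can be sketched as a comparison between the event age and its clear interval. The Python below is a simplified illustration, not the EPM implementation. The function name `event_state` is hypothetical, and clearing the event exactly when the interval is reached is an assumption. The 30-minute default matches the clear interval listed for Authentication Failed.

```python
from datetime import datetime, timedelta


def event_state(raised_at: datetime, now: datetime,
                clear_interval: timedelta = timedelta(minutes=30)) -> str:
    """Return 'Cleared' once the clear interval has elapsed, otherwise 'Active'."""
    # Assumption: the event clears as soon as the interval is reached.
    return "Cleared" if now - raised_at >= clear_interval else "Active"


raised = datetime(2024, 1, 1, 12, 0)
print(event_state(raised, datetime(2024, 1, 1, 12, 29)))
print(event_state(raised, datetime(2024, 1, 1, 12, 30)))
```

Events with longer intervals, such as the 60-minute and 4-day auto-clears listed for other entries, would pass a different `clear_interval` value.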

AverageLatency_ThresholdExceeded

Description: The latency threshold was violated for a node-to-node UDP Jitter for VoIP test that the user has configured. This results in poor voice quality.

Trigger: Node-to-node test.

Severity: Warning.

Device Type: Router or Switch.

Event Code: 4004.

Recommended Action: Check the connectivity between the source and destination with which the test was configured. Check QoS settings that may impact latency in the intermediate routers. This may be a result of a heavy load or crash on a node in the path.

CCMEDown

Description: Indicates that the telephony configuration is disabled on the Cisco Communications Manager Express. In this case, no SCCP-based calls go through the Communications Manager Express.

Trigger: Polling.

Severity: Critical.

Device Type: Router.

Event Code: 2038.

Recommended Action: Check the telephony configuration on the Communications Manager Express.

For Cisco Unified Communications Manager, see CallManagerDown.

CCMEEphoneDeceased

Description: The state of an ephone registered to Cisco Unified Communications Manager Express changed to deceased.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Severity: Warning.

Device Type: Router or VoiceGateway.

Event Code: 2076.

Recommended Action: Check the connectivity between phone, access switch, and Communications Manager Express-hosted router.

CCMEEphoneLoginFailed

Description: Login through the web or TAPI to the CME failed.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Severity: Warning.

Device Type: Router or VoiceGateway.

Event Code: 2078.

Recommended Action: Check and validate why the login was rejected or failed.

CCMEEphoneRegistrationFailed

Description: An attempt by an ephone to register with Cisco Unified Communications Manager Express failed.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Severity: Warning.

Device Type: Router or VoiceGateway.

Event Code: 2077.

Recommended Action: For SCCP phones:

Use the show ephone command to display the status of SCCP phones that are not registered or are trying to register.

For SIP phones:

Use the show voice register all command to display configuration and registration information for SIP phones in Communications Manager Express.

Typically, an individual Cisco Unified IP Phone fails to successfully complete registration for any of the following reasons:

The configuration file for this IP phone is incorrect or empty.

The IP phone cannot download its configuration file.

Auto-register is disabled and an individual phone is not explicitly configured.

Auto-assign is enabled and there are more IP phones than there are available telephone or extension numbers.

The appropriate Cisco IP Phone firmware is not or cannot be installed on this IP phone.

CCMEEphoneUnregistrationsExceeded

Description: The number of ephones unregistered from Communications Manager Express exceeded the configured threshold.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Default Threshold: NA.

Severity: Critical.

Device Type: Router or VoiceGateway.

Event Code: 2075.

Recommended Action: You may have to check the connectivity between phone, switch, and Communications Manager Express.

If this is a newly installed and configured Communications Manager Express and all of the phones fail to register, the configuration is incorrect or unavailable. If all IP phones have lost their registration, the network interface is the most likely source of the failure.

All Cisco Unified IP phones of a particular type can fail to register for one of the following reasons:

An incorrect Cisco phone firmware filename is specified for a particular phone type.

The phones cannot download the correct Cisco phone firmware file from the TFTP server.

Auto-assign for a specified phone type is enabled and there are more phones of that type than there are available telephone or extension numbers.

Check if the unregistered threshold configured on the Communications Manager Express is large enough. If not, increase the threshold.

Check if any instances of the Communications Manager Express are redundant.

CCMEKeyEphoneRegistrationChange

Description: Registration status changed for a key IP ephone with respect to Cisco Unified Communications Manager Express.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Severity: Warning.

Device Type: Router or VoiceGateway.

Event Code: 2080.

Recommended Action: Check if the indicated key IP ephone is unregistered and investigate the root causes.

CCMELivefeedMOHFailed

Description: Music on hold (MOH) live feed failed on Cisco Unified Communications Manager Express.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Severity: Warning.

Device Type: Router or VoiceGateway.

Event Code: 2074.

Recommended Action: You may need to check if the live-feed source is shut down or broken for any reason. Check the MOH configurations on the Communications Manager Express. Check the port to which the MOH feed is connected.

CCMEMaximumConferencesExceeded

Description: Maximum number of simultaneous three-party conferences supported was exceeded on Cisco Unified Communications Manager Express.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Default Threshold: NA.

Severity: Warning.

Device Type: Router or VoiceGateway.

Event Code: 2073.

Recommended Action: The maximum number of simultaneous conferences is platform-specific to the type of Cisco Unified Communications Manager Express router. Try adjusting the configuration based on the support limit.

CCMENightServiceChange

Description: Night service status changed on an ephone registered to Cisco Unified Communications Manager Express.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Default Threshold: NA.

Severity: Critical.

Device Type: Router or VoiceGateway.

Event Code: 2079.

Recommended Action: You may need to check the event description for the appropriate reason. Run show running-config to verify the night-service parameters. Ensure that any changes are authorized changes.

CCMEStatusChange

Description: Cisco Unified Communications Manager Express enabled state has changed.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Severity: Warning.

Device Type: Router or VoiceGateway.

Event Code: 2072.

Recommended Action: Check if Communications Manager Express is enabled on the device and if the status change is expected.

CDR Agent Send File Failed

Supports Unified Communications Manager version 5.x or later in Syslog/RTMT.

Description: The CDR Agent cannot send CDR files from a Communications Manager node to the CDR Repository node within the Communications Manager cluster. This event is generated by monitoring the syslog messages received from Unified Communications Manager.

Trigger: Polling.

Severity: Critical.

Device Type: Communications Manager.

Clear Interval: Time-based auto-clear in EPM after 60 minutes.

Event Code: 7010.

Recommended Action: Do the following:

1. Check network link status.

2. Verify that the CDR Repository node (first node in the cluster) is running.

3. Verify that the CDR Repository Manager is activated on the first node.

4. Check the CDR configuration using Serviceability > Tools.

5. Check the CDR agent trace on the specific node where the error occurred.

6. Check the CDR Repository Manager trace.

7. Check to determine whether the Publisher is being upgraded. If the CDRAgentSendFileFailureContinues event is no longer present, the condition is corrected.

Cisco DRF Failure

Supported for CCM 5.x and later.

Description: The DRF backup or restore process has encountered errors. The event is generated by monitoring the syslog messages received from Unified Communications Manager.

Trigger: Polling.

Severity: Critical.

Device Type: Communications Manager.

Clear Interval: Time-based auto-clear in EPM after 4 days.

Event Code: 7007.

Recommended Action: Verify that /common/drf has the required permission and enough disk space for the DRF user. Check the application logs for further details.

CDR File Delivery Failed

Supports Unified Communications Manager version 5.x or later in Syslog/RTMT.

Description: The FTP delivery of CDR files to the outside billing server failed. This event is generated by monitoring the syslog messages received from Unified Communications Manager.

Trigger: Polling.

Severity: Warning.

Device Type: Communications Manager.

Clear Interval: Time-based auto-clear in EPM after 60 minutes.

Event Code: 7011.

Recommended Action: Perform the following checks:

1. Check network link status.

2. Verify that the billing server is running.

3. Verify that the SFTP Server on the billing server is running and accepting requests.

4. Verify that the CDR configuration is correct, under Serviceability > Tools.

5. Check the CDR Repository Manager trace.

CDR High Water Mark Exceeded

Supports Unified Communications Manager version 5.1.3 or later in Syslog/RTMT.

Description: The high water mark for CDR files has been reached, and some successfully delivered CDR files have been deleted. This event is generated by monitoring the syslog messages received from Unified Communications Manager.

Trigger: Polling.

Severity: Warning.

Device Type: Communications Manager.

Clear Interval: Time-based auto-clear after 30 minutes.

Event Code: 7014.

Recommended Action: Do the following:

1. Check for too many undelivered CDR files accumulated due to some condition.

2. Check network link status.

3. Verify that the billing server is operational.

4. Verify that the SFTP Server on the billing server is running and accepting requests.

5. Verify that the CDRM Configuration for billing servers is correct using Serviceability > Tools.

6. Check to determine if CDR files maximum disk allocation is too low using Serviceability > Tools.

7. Check the CDR Repository Manager trace in /var/log/active/cm/trace/cdrrep/log4j.

CDR Maximum Disk Space Exceeded

Supports Unified Communications Manager version 5.x or later in Syslog/RTMT.

Description: The CDR files disk usage exceeded the maximum allocation. Some undelivered files have been deleted. This event is generated by monitoring the syslog messages received from Unified Communications Manager.

Trigger: Polling.

Severity: Critical.

Device Type: Communications Manager.

Clear Interval: Time-based auto-clear in EPM after 60 minutes.

Event Code: 7008.

Recommended Action: Do the following:

1. Check for too many undelivered CDR files accumulated due to some condition.

2. Check network link status.

3. Verify that the billing server is operational.

4. Verify that the SFTP Server on the billing server is running and accepting requests.

5. Verify that the CDRM Configuration for billing servers is correct using Serviceability > Tools.

6. Check to determine if CDR files maximum disk allocation is too low using Serviceability > Tools.

7. Check the CDR Repository Manager trace in /var/log/active/cm/trace/cdrrep/log4j.

Code Red

Description: Indicates that Cisco Unified Communications Manager has remained in a Code Yellow state for an extended period and cannot recover. This event is generated by monitoring the syslog messages received from Unified Communications Manager.

Trigger: Syslog.

Severity: Critical.

Device Type: Cisco Unified Communications Manager or Cluster.

Clear Interval: Time-based auto-clear in Event Promulgation Module (EPM) after 4 days.

Event Code: 2048.

Recommended Action: When Cisco Unified Communications Manager enters a Code Red state, the Communications Manager service restarts, which also produces a memory dump that may be helpful for analyzing the failure. Generally, repeated call throttling events require assistance from your support team. Cisco Communications Manager SDI and SDL trace files record call-throttling events and can provide useful information. Your support team may request these trace files for closer examination.

Code Yellow

Description: This event is generated when Communications Manager has entered a Code Yellow state (call throttling) due to an unacceptably high delay in handling incoming calls. This event is generated by monitoring the syslog messages received from Unified Communications Manager.

Trigger: Syslog.

Severity: Critical.

Device Type: Cisco Unified Communications Manager or Cluster.

Event Code: 2049.

Recommended Action: While this event is active, check process CPU usage and memory usage. Check for call bursts and for an increased number of registered devices (phones, gateways, and so on). Continuously monitor to see if Communications Manager is out of the Code Yellow state. You can launch synthetic tests such as the Dial Tone Test to check for any impact on call processing.

To reduce the likelihood of a Code Yellow event, consider the possible causes of a system overload, such as heavy call activity, low CPU availability to Cisco Unified Communications Manager, routing loops, disk I/O limitations, disk fragmentation, and so on, and investigate those possibilities.

For more information, go to the following URL:

http://www.cisco.com/en/US/docs/voice_ip_comm/cucm/admin/5_1_3/ccmfeat/fsclthrt.html.

ComponentDown

Description: A component within Cisco Unified Contact Center (formerly referred to as IPCC) is down. Contact Center has several components: Router, Logger, CG, and Distributor. The impact depends on which component is down:

Router down—Call Center call routing will be impacted.

Logger down—Copying configuration to Administrative workstation will be impacted.

CG down—Computer Telephony Integration (CTI) Gateway down would impact CTI integration with agent desktop and contact center servers.

Distributor down—Contact Center administration through web view is impacted.

Trigger: Polling.

Severity: Critical.

Device Type: Media Server.

Event Code: 2039.

Recommended Action:

1. Check the component name and check the corresponding service status on the Contact Center device. Try starting the service if it is stopped.

2. If the service does not start, contact your Contact Center support team.

Core Dump File Found

Supports Unified Communications Manager version 5.1.3 or later in Syslog/RTMT.

Description: A core dump file has been found in the system, which indicates a service crash. The event is generated by monitoring the syslog messages received from Unified Communications Manager.

Trigger: RTMT.

Severity: Critical.

Device Type: Cisco Unified Communications Manager or Cluster.

Clear Interval: Time-based auto-clear in EPM after 4 days.

Event Code: 7009.

Recommended Action:

1. Use RTMT Trace and Log Central to collect the new core files and the last trace log files from the corresponding service. Run gdb to get the back trace of each core file for further debugging.

2. From the Communications Manager Service Control page, you can verify whether the service was restarted successfully. If not, start it manually.

CPALoginFailureThresholdExceeded

Description: The number of failed attempts to log in to the web interface of the Cisco Personal Assistant (CPA) exceeds the threshold value.

Trigger: Exceeded the CPA Login Failure threshold.

Severity: Critical.

Device Type: Media Server.

Event Code: 2086.

Recommended Action: Check the login access by trying to connect to Cisco Personal Assistant server.

CPATransferFailedThresholdExceeded

Description: Cisco Personal Assistant fails to transfer the call after the threshold number of attempts.

Trigger: Exceeded the CPA Transfer Failed threshold.

Severity: Critical.

Device Type: Media Server.

Event Code: 2087.

Recommended Action: Check to see if the call transfer service is up and running.

CPAVoicemailThresholdExceeded

Description: Attempts to log in to voicemail exceed the threshold value.

Trigger: Exceeded the CPA Voice Mail threshold.

Severity: Critical.

Device Type: Media Server.

Event Code: 2088.

Recommended Action: Check the login access by trying to connect to Cisco Personal Assistant server.

CPUPegging

Description: The percentage of CPU load on a server is over the configured threshold. This event is generated based on polling RTMT precanned counters.

Trigger: RTMT.

Severity: Critical.

Device Type: Communications Manager or Cluster.

Event Code: 1013, 2126.

Recommended Action: Check the Communications Manager Windows Task Manager or Real Time Monitoring Tool (RTMT) to verify high CPU utilization. The most common cause is one or more processes using excessive CPU. The event identifies the process that is using the most CPU. Once the process is identified, you may want to take action, which could include restarting the process.

It is helpful to check the trace setting for that process. Using the detailed trace level is known to consume excessive CPU. Also check for events such as Code Yellow, and launch Operations Manager synthetic tests such as Dial Tone Test to see if there is any impact on call processing. If so, you may want to take more drastic measures, such as stopping nonessential services.

For more details, go to the following links: http://www.cisco.com/en/US/products/sw/voicesw/ps556/products_tech_note09186a00808ef0f4.shtml and http://www.cisco.com/en/US/products/sw/voicesw/ps556/products_tech_note09186a00807f32e9.shtml.

CPUUtilizationExceeded

Description: CPU utilization of individual voice services (Unity/CPA) or the whole system exceeds the threshold value.

Trigger: Exceeded the CPU Utilization threshold.

Severity: Warning.

Device Type: Media Server.

Event Code: 2085.

Recommended Action: Identify the services and applications that are taking too much CPU and stop them.


Note Events are removed for Communications Manager only. You may need to manually clear these Communications Manager events after your upgrade is complete.


CriticalServiceQualityIssue

 

Description: Operations Manager has received a MOS violation trap from Service Monitor and MOS has fallen below the value set on the Event Settings page. See Configuring Service Quality Event Settings, page 20-7.

Trigger: Event settings. See Configuring Service Quality Event Settings, page 20-7.

Severity: Critical.

Device Type: Service Quality events pertain to the call destination, which might be a device (voice gateway) or a phone.

Event Code: 8002.


Note This event is shown on the Service Quality Alert Details display. (See Using the Service Quality Alerts Display, page 4-3.) This event can be generated only when you have a licensed copy of Service Monitor.


CTILinkDown

Description: Communications Manager performance counter CcmLinkActive indicates that the total number of active links from CTI Manager to all active Cisco Unified Communications Managers in the cluster is zero. This event indicates that CTI Manager has lost communication with all Communications Managers in the cluster.

Trigger: Polling.

Severity: Critical.

Device Type: Media Server.

Event Code: 2096.

Recommended Action: CTI Manager maintains links to all active Communications Managers in the cluster. Investigate to determine the following:

Is the CTI Manager service running?

Are the Communications Managers in the cluster running?

Does a network problem exist between the CTI Manager and Communications Managers in the cluster?

CUEApplicationStatusChange

Description: An application on Cisco Unity Express has come online or gone offline.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Severity: Warning.

Device Type: Router.

Event Code: 2063.

Recommended Action: Check the status of the reported application. Investigate the root cause if it is offline.

CUEBackupFailed

Description: Cisco Unity Express voicemail backup failed.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Severity: Critical.

Device Type: Router.

Event Code: 2068.

Recommended Action:

1. Verify IP connectivity between the Microsoft FTP Server and the Cisco Unity Express.

2. Verify that the Microsoft Windows user account has the appropriate read and write access to the Microsoft FTP Server site directory.

3. Verify that the FTP Publishing Service is started on the Microsoft FTP Server.

4. View the history.log file on the Microsoft FTP Server to determine why the FTP transfer failed.

For more details, go to the following link: http://www.cisco.com/en/US/docs/voice_ip_comm/unity_exp/design/CP_CIPExpress/cipce19.html#wp1015702.

CUECallAgentConnectionLost

Description: Connection to the Communications Manager is lost. Communications Manager is integrated with Unity Express through JTAPI for voicemail and auto-attendant functionality. If a connection to the Communications Manager is lost, playing a greeting, leaving a message, or interacting with the system through dual tone multifrequency (DTMF) tones may be impacted.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Severity: Critical.

Device Type: Router.

Event Code: 2066.

Recommended Action:

1. From the Unity Express module, run show ccn status ccm-manager.

2. Verify IP connectivity between the Communications Manager and the Unity Express.

3. If IP reachability is working, check CTI Route Point and JTAPI configuration.

CUENTPIssue

Description: The Cisco Unity Express clock is managed entirely by NTP. If NTP has an issue, many Unity Express features, such as voicemail envelope information and trace logging, are affected.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Severity: Warning.

Device Type: Router.

Event Code: 2069.

Recommended Action: Many of the Cisco Unity Express features depend on a reliable clock. You can verify the status of NTP and the location of the clock source with the show NTP status CLI command. Run trace NTP to debug NTP synchronization issues.

CUEResourceExhausted

Description: Indicates that the Unity Express has run out of a certain type of resource. For example, this notification is generated when all JTAPI or SIP ports are in use and new incoming calls cannot be made.

Default Threshold: N/A.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Severity: Critical.

Device Type: Router.

Event Code: 2067.

Recommended Action: Install additional Cisco Unity Express resources, if needed.

CUESecurityIssue

Description: A security violation occurred in accessing the Unity Express administration page. This can be due to a login or PIN security alert.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Severity: Warning.

Device Type: Router.

Event Code: 2064.

Recommended Action: Check the event description for the reason for the security problem.

CUEStorageIssue

Description: Cisco Unity Express has degradation issues with the Flash storage.

Default Threshold: N/A.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Severity: Warning.

Device Type: Router.

Event Code: 2065.

Recommended Action: Check the event description for the appropriate reason and take action.

DataPhysicalDiskDown

Description: Hard drive failure event detected on Compaq boxes.

Default Polling Interval: 4 minutes.

Default Threshold: N/A.

Trigger: Polling.

Severity: Critical.

Device Type: Media Server.

Event Code: 2060.

Recommended Action: Contact Cisco for hardware replacement.

Event Attribute compaq_DaPhyDrvStatus provides the status of the physical drive. The following values are valid for the physical drive status:

other (1)—The instrument agent does not recognize the drive. You may need to upgrade your instrument agent and/or driver software.

failed (3)—The drive is no longer operating and should be replaced.

predictiveFailure(4)—The drive has a predictive failure error and should be replaced.
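If you process the compaq_DaPhyDrvStatus attribute in a script, the values listed above can be decoded with a small enumeration. This Python sketch is illustrative only. The names `PhysicalDriveStatus`, `ADVICE`, and `describe_drive_status` are hypothetical, and only the three values documented above are modeled. Any other value is reported as unlisted rather than interpreted.

```python
from enum import IntEnum


class PhysicalDriveStatus(IntEnum):
    """Status values listed for the compaq_DaPhyDrvStatus event attribute."""
    OTHER = 1
    FAILED = 3
    PREDICTIVE_FAILURE = 4


# Short advice strings paraphrased from the list above.
ADVICE = {
    PhysicalDriveStatus.OTHER: "agent does not recognize the drive; consider upgrading agent or driver software",
    PhysicalDriveStatus.FAILED: "drive is no longer operating; replace it",
    PhysicalDriveStatus.PREDICTIVE_FAILURE: "drive has a predictive failure error; replace it",
}


def describe_drive_status(value: int) -> str:
    """Translate a numeric status into a readable line."""
    try:
        status = PhysicalDriveStatus(value)
    except ValueError:
        # Values not listed in this guide are not interpreted here.
        return f"unlisted status value {value}"
    return f"{status.name.lower()}: {ADVICE[status]}"


for raw in (1, 3, 4):
    print(raw, describe_drive_status(raw))
```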

DBReplicationFailure

Supports Unified Communications Manager version 5.1.3 or later in RTMT polling (it is not applicable for Windows-based Communications Managers). For Communications Manager version 5.0 to 5.1.2, the event name is IDS Replication Failure.

Description: There is a Communications Manager database replication failure. This event is generated based on polling RTMT precanned counters.

Default Polling Interval: 30 seconds.

Default Threshold: N/A.

Trigger: RTMT.

Severity: Critical.

Device Type: Media Server.

Clear Interval: Time-based auto-clear after 60 minutes.

Event Code: 2091.

Recommended Action: Monitor syslog for an IDSEngineCritical syslog message with a ClassID of 30-39, which indicates a replication problem. This message has more detailed information on the cause of the event.

D Channel Out of Service

Description: Indicates that the MGCP D Channel is out of service. This event is generated by monitoring the syslog messages received from Unified Communications Manager.

Trigger: Syslog.

Severity: Critical.

Device Type: Voice Gateway.

Clear Interval: Time-based auto-clear after 4 days.

Event Code: 7021.

Recommended Action: Check the status of the affected D channel on the affected gateway to verify that it is out of service. Investigate the root cause.

DPAPortCallManagerLinkDown

Description: This event indicates that the physical port connection to the Communications Manager is down. This is not applicable to DPA ports connected to the digital PBX system.

Trigger: Polling.

Severity: Critical.

Device Type: Voice Mail Gateway.

Event Code: 2013.

Recommended Action: Contact your support team.

DPAPortTelephonyLinkDown

Description: Indicates that the physical ports connected to Octel voicemail systems or digital PBX systems are down.

Trigger: Polling.

Severity: Critical.

Device Type: Voice Mail Gateway.

Event Code: 2014.

Recommended Action: Contact your support team.

Duplicate IP Address

Description: The same IP address is configured on multiple managed systems.

Trigger: Polling (often during rediscovery).

Severity: Critical.

Device Type: Host, Hub, Router, Optical Switch, or Switch.

Event Code: 1001.

Recommended Action: Check if any system is assigned a static IP that is present in the range of IP addresses assigned by the DHCP.

ExcessiveFragmentation

Description: System memory is highly fragmented.

Trigger: Exceeded Memory Fragmentation Threshold.

Severity: Critical.

Device Type: Host, Router, Switch, or Optical Switch.

Event Code: 1003.

ExpertAdvisorSystemDown

Description: One or more of the subsystems of Expert Advisor are down. Subsystems include Contact Manager, Media Platform Adapter, Business Rule Engine, and so on.

Trigger: Polling.

Severity: Critical.

Device Type: Media Server.

Event Code: 2112.

Recommended Action:

1. Check the subsystem name and check the corresponding service status on the Expert Advisor device. Try starting the service if it is stopped.

2. If the service does not start, contact your Expert Advisor support center.

FanDegraded

Description: This event indicates that an optional fan is not operating correctly. The event is based on polling or processing the SNMP trap cpqHeThermalSystemFanDegraded received from monitored Cisco Unified Communications Managers.

Default Threshold: N/A.

Trigger: Polling, or processed trap (see Processed SNMP Traps, page C-1).

Severity: Warning.

Device Type: Media Server or Voice Gateway.

Event Code: 2015.

Recommended Action: Check the status of the reported fan and monitor for recurrence.

System chassis temperature is high.

FanDown

Description: Indicates that a required fan is not operating correctly. The event is based on processing the SNMP trap cpqHeThermalSystemFanFailed received from monitored Cisco Unified Communications Managers.

Default Threshold: N/A.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Severity: Critical.

Device Type: Media Server or Voice Gateway.

Event Code: 2016.

Recommended Action: Check the status of the reported fan and contact Cisco for hardware replacement.

Flapping

Description: Port or interface is repeatedly alternating between Up and Down states over a short period of time. Operations Manager issues this event by monitoring the number of link-down events received within the link trap window for a particular network adapter (using the Link Trap Threshold and Link Trap Window parameters).

Trigger: Exceeded Link Trap Threshold for Link Trap Window; or processed trap (see Processed SNMP Traps, page C-1).

Severity: Critical.

Device Type: Host, Hub, Router, Optical Switch, or Switch.

Event Code: 1004.

Recommended Action: Reconfigure the Link Trap Threshold and Link Trap Window threshold parameters under Interface/Port Flapping Settings for the interface groups.
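
The windowed counting described for the Flapping event can be illustrated with a short Python sketch. This is not Operations Manager code: the class name, method name, and the example values (3 link-down events, 300-second window) are hypothetical and chosen only to show how a threshold and a window interact.

```python
from collections import deque


class FlapDetector:
    """Illustrative only: counts link-down events inside a sliding time window."""

    def __init__(self, link_trap_threshold, link_trap_window_s):
        self.threshold = link_trap_threshold    # assumed: number of link-downs tolerated
        self.window_s = link_trap_window_s      # assumed: window length in seconds
        self._downs = deque()                   # timestamps of recent link-downs

    def record_link_down(self, timestamp_s):
        """Record one link-down event; return True if the threshold is exceeded."""
        self._downs.append(timestamp_s)
        # Drop events that have aged out of the window.
        while self._downs and timestamp_s - self._downs[0] > self.window_s:
            self._downs.popleft()
        return len(self._downs) > self.threshold


detector = FlapDetector(link_trap_threshold=3, link_trap_window_s=300)
flags = [detector.record_link_down(t) for t in (0, 60, 120, 180)]
```

In this sketch, raising the threshold or shortening the window makes the detector less sensitive, which is the effect of reconfiguring the two parameters in the Recommended Action.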

Hardware Failure

Description: Indicates that a hardware failure has occurred in the Communications Manager. This event is generated by monitoring the syslog messages received from Unified Communications Manager.


Note By default this event is not enabled.


Default Polling Interval: N/A.

Default Threshold: N/A.

Recommended Action: Check the RTMT Syslog Viewer for further details.

HighAnalogPortUtilization

Description: Percentage utilization of an analog port has exceeded one of the following:

Cisco Unified Communications Manager Analog Port Utilization.

FXS Port Utilization Threshold.

FXO Port Utilization Threshold.

MGCP Gateway Analog Port Utilization.

FXS Port Utilization Threshold.

FXO Port Utilization Threshold.

H323 Gateway Analog Port Utilization.

FXS Port Utilization Threshold.

FXO Port Utilization Threshold.

EM Port Utilization Threshold.


Note You must enable polling for Voice Utilization Settings to monitor this event.


Default Polling Interval: 4 minutes.

Default Threshold: 90%.

Severity: Critical.

Device Type: Media Server or Voice Gateway.

Event Code: 4100.

Recommended Action: Use this event to assess whether you should install additional resources. Check event details and identify which resource has exceeded the threshold. Use the performance graph to monitor resource utilization in real time over the past 72 hours, which will help you determine if you need to add resources in the voice network infrastructure.

HighBackplaneUtilization

Description: Utilization of the backplane bandwidth exceeds the backplane utilization threshold.

Trigger: Exceeded Backplane Utilization Threshold.

Severity: Critical.

Device Type: Host, Router, Switch, or Optical Switch.

Event Code: 1005.

HighBroadcastRate

Description: Input packet broadcast percentage exceeds the broadcast threshold. The input packet broadcast percentage calculates the percentage of total capacity that was used to receive broadcast packets.

Trigger: Exceeded Broadcast Threshold.

Severity: Critical.

Device Type: Host, Router, Switch, or Optical Switch.

Event Code: 1006.

Recommended Action: Reconfigure the Broadcast Threshold parameter under Generic Interface/Port Performance Settings for the interface groups.

HighBufferMissRate

Description: Rate of buffer misses exceeds the Memory Buffer Miss Threshold.

Trigger: Exceeded Memory Buffer Miss Threshold.

Severity: Critical.

Device Type: Host, Router, Switch, or Optical Switch.

Event Code: 1007.

HighBufferUtilization

Description: Number of buffers used exceeds the Memory Buffer Utilization Threshold.

Trigger: Exceeded Memory Buffer Utilization Threshold.

Severity: Critical.

Device Type: Host, Router, Switch, or Optical Switch.

Event Code: 1008.

HighCollisionRate

Description: Rate of collisions exceeds the Collision Threshold.

Trigger: Exceeded Collision Threshold.

Severity: Critical.

Device Type: Host, Hub, Router, Switch, or Optical Switch.

Event Code: 1009.

Recommended Action: Reconfigure the Collision Threshold parameter under Generic Interface/Port Performance Settings for the interface groups.

HighDigitalPortUtilization

Description: Percentage utilization of a digital port has exceeded the threshold.

Trigger: Exceeded one of these thresholds:

Cisco Unified Communications Manager Digital Port Utilization:

BRI Channel Utilization Threshold.

T1 PRI Channel Utilization Threshold.

E1 PRI Channel Utilization Threshold.

T1 CAS Channel Utilization Threshold.

MGCP Gateway Digital Port Utilization:

BRI Channel Utilization Threshold.

T1 PRI Channel Utilization Threshold.

E1 PRI Channel Utilization Threshold.

T1 CAS Channel Utilization Threshold.

H323 Gateway Digital Port Utilization:

BRI Channel Utilization Threshold.

T1 PRI Channel Utilization Threshold.

E1 PRI Channel Utilization Threshold.

T1 CAS Channel Utilization Threshold.

E1 CAS Channel Utilization Threshold.

Note You must enable polling for Voice Utilization Settings to monitor this event.

Default Polling Interval: 4 minutes.

Default Threshold: 90%.

Severity: Critical.

Device Type: Media Server or Voice Gateway.

Event Code: 4101.

Recommended Action: Use this event to assess whether you should install additional resources. When this event is generated, check event details and identify which resource has exceeded the threshold. Use the performance graph to monitor resource utilization in real time over the past 72 hours to verify high utilization. Then determine if you need to add resources in the voice network infrastructure.

HighDiscardRate

Description: A HighDiscardRate event occurs when either of the following conditions is met:

The input packet queued rate is greater than the minimum packet rate, and the input packet discard percentage is greater than the discard threshold. The input packet queued rate is the rate of packets received without error. The input packet discard percentage is calculated by dividing the rate of input packets discarded by the rate of packets received.

The output packet queued rate is greater than the minimum packet rate, and the output packet discard percentage is greater than the discard threshold. The output packet queued rate is the rate of packets sent without error. The output packet discard percentage is calculated by dividing the rate of output packets discarded by the rate of packets sent.

Trigger: Exceeded Discard Threshold.

Severity: Critical.

Device Type: Host, Hub, Router, Switch, or Optical Switch.

Event Code: 1010.

Recommended Action: Reconfigure the Discard Threshold parameter under Generic Interface/Port Performance Settings for the interface groups.
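
The HighDiscardRate condition can be written out as a small Python check for one direction (input or output). The function and parameter names are hypothetical, and the sketch assumes the queued rate is used as the denominator of the discard percentage; it is an illustration of the arithmetic, not the product's implementation.

```python
def high_discard_rate(queued_rate, discard_rate,
                      minimum_packet_rate, discard_threshold_pct):
    """Return True when one direction meets the HighDiscardRate condition.

    queued_rate   -- packets per second handled without error (assumed denominator)
    discard_rate  -- packets per second discarded
    """
    # The queued rate must be above the minimum packet rate (and nonzero).
    if queued_rate <= 0 or queued_rate <= minimum_packet_rate:
        return False
    discard_pct = 100.0 * discard_rate / queued_rate
    return discard_pct > discard_threshold_pct


busy_port = high_discard_rate(1000, 60, minimum_packet_rate=100, discard_threshold_pct=5)
idle_port = high_discard_rate(50, 40, minimum_packet_rate=100, discard_threshold_pct=5)
```

The minimum packet rate guard keeps a nearly idle port from raising the event on a handful of discarded packets.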

HighErrorRate

Description: A HighErrorRate event occurs for input or output packets when both of the following thresholds are exceeded:

Error Threshold—Percentage of packets in error.

Error Traffic Threshold—Percentage of bandwidth in use.

Trigger: Exceeded Error Threshold and Equaled or Exceeded Error Traffic Threshold.

Severity: Critical.

Device Type: Host, Hub, Router, Switch, or Optical Switch.

Event Code: 1011.

Recommended Action: Reconfigure the Error Threshold and Error Traffic Threshold parameters under Generic Interface/Port Performance Settings for the interface groups.
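
As a hedged illustration of the HighErrorRate trigger, the following Python sketch combines the two thresholds. The function and parameter names are hypothetical; the only detail taken from the trigger text is that the Error Threshold must be exceeded while the Error Traffic Threshold must be equaled or exceeded.

```python
def high_error_rate(error_pct, utilization_pct,
                    error_threshold_pct, error_traffic_threshold_pct):
    """Return True when both HighErrorRate thresholds are met for one direction."""
    exceeds_error = error_pct > error_threshold_pct                  # strictly greater
    enough_traffic = utilization_pct >= error_traffic_threshold_pct  # equal also counts
    return exceeds_error and enough_traffic


raised = high_error_rate(12.0, 40.0, error_threshold_pct=10.0,
                         error_traffic_threshold_pct=40.0)
```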

HighPortUtilization

Description: Percentage of port utilization exceeds a threshold.

Note You must enable polling for Voice Utilization Settings to monitor this event.

Trigger: Exceeded one of these thresholds:

Voice Mail Gateway Port Utilization:

Voice Mail Port Utilization Threshold.

PBX Port Utilization Threshold.

Cisco Unity and Cisco Unity Connection Port Utilization:

Active InBound Ports Threshold.

Active OutBound Ports Threshold.

Severity: Critical.

Device Type: Media Server or Voice Mail Gateway.

Event Code: 4102.

Recommended Action: This event is generated when the percentage of Cisco Unity port utilization exceeds the Active InBound Ports Threshold (90%) or the Active OutBound Ports Threshold (90%). After receiving this event, check the Detailed Device View for the percentage of active inbound and outbound ports. Then determine if additional Unity ports need to be configured.

HighResourceUtilization

Description: A hardware resource threshold has been exceeded.

Trigger: Exceeded one of these thresholds:

Cisco Unified Communications Manager Resource Utilization:

MOH Multicast Resources Active Threshold.

MOH Unicast Resources Active Threshold.

MTP Resources Active Threshold.

Transcoder Resources Active Threshold.

Hardware Conference Resources Active Threshold.

Software Conference Resources Active Threshold.

Conference Streams Active Threshold.

MOH Streams Active Threshold.

MTP Streams Active Threshold.

Location Bandwidth Available Threshold.

H323 Gateway Resource Utilization:

DSP Utilization Threshold.

Gatekeeper Resource Utilization:

Total Bandwidth Utilization for Local Zone Threshold.

Interzone Bandwidth Utilization for Local Zone Threshold.

Cisco Unified Communications Manager Express Utilization:

Registered IP Phones Threshold.

Registered Key IP Phones Threshold.

Cisco Unity Express Utilization:

Capacity Utilization Threshold.

Session Utilization Threshold.

Orphaned Mailboxes Threshold.

Note You must enable polling for Voice Utilization Settings to monitor this event.

Default Polling Interval: 4 minutes.

Default Threshold: 90%.

Severity: Critical.

Device Type: Cisco Unified Communications Manager or cluster, gatekeeper, Media Server, Router, or Voice Gateway.

Event Code: 4103.

Recommended Action: Assess whether you should install additional resources. When this event is generated, click the event ID to view event details and identify which resource has exceeded the threshold. Use the performance graph or RTMT (for Communications Manager) to monitor resource utilization in real time over the past 72 hours to verify high utilization and determine if you need to install additional resources.

HighUtilization

Description: Current utilization exceeds the utilization threshold configured for this processor. The most common reason for this event is that one or more processes are using excessive CPU resources.

Trigger: Exceeded one of these thresholds:

Utilization Threshold.

Processor Utilization Threshold.

Severity: Critical.

Device Type: Host, Media Server, Router, Switch, Optical Switch, or Voice Gateway.

Event Code: 1013.

Recommended Action: Identify the processes using excessive CPU resources. Consider corrective action, such as restarting the identified process or processes.

HTTPInaccessible

Description: HTTP service cannot be used to communicate to all Communications Managers in the cluster. This might be due to one or both of the following:

The Web Service is down on all Communications Managers in the cluster.

The credentials (HTTP username, password) for at least one of the running Web Services were not found or are incorrect.

Default Polling Interval: 4 minutes.

Default Threshold: N/A.

Trigger: Polling.

Severity: Critical.

Device Type: Cisco Unified Communications Manager or Cluster.

Event Code: 2009.

Recommended Action: Verify that all Communications Managers are accessible via Web Service with the credentials provided in Operations Manager. Provide the correct username and password if the credentials are wrong. You might need to restart the web server if Web Service is down. Make sure that Communications Managers are patched to protect against viruses.

IAJitterDS_ThresholdExceeded

Description: The configured IAJitterDS threshold has been violated. This results in poor voice quality. This may be a result of any node in the path being heavily loaded or experiencing a failure.

Trigger: Node-to-node test.

Severity: Warning.

Device Type: Router or Switch.

Event Code: 4010.

Recommended Action: Check the connectivity between the destination and source (DS) on which the test is configured.

Check the QoS settings in intermediate routers, which may affect jitter.

IBMDiskTrapEvent

Description: An IBM RAID drive has failed on the device, or a hard disk has been removed from its slot.

This may degrade the performance of the disk drive and result in data loss in logical drives, depending on the RAID level.

Recommended Actions:

Remedial actions vary depending on the RAID level:

RAID recovery software may have to be used to recover/restore the data.

Replace the failed drive.

ICT Call Throttling

Description: Cisco Unified Communications Manager has detected a route loop over the H323 trunk. As a result, Communications Manager has temporarily stopped accepting calls for the indicated H323 device. This event is generated by monitoring the syslog messages received from Unified Communications Manager.

Trigger: Syslog.

Severity: Informational.

Device Type: Communications Manager or Cluster.

Recommended Action: Check the route pattern configured in the Communications Manager and remove the route loop.

IdeAtaDiskDown

Description: The Compaq IDE/ATA hard disk drive is down.

Trigger: Polling.

Severity: Critical.

Device Type: Media Server.

Event Code: 2062.

Recommended Action:

1. Check whether or not the hard disk can be seen by the hard disk controller. Usually on a hard disk failure, the disk will not be detectable by the controller (but this is not always the case).

2. If you can see the hard disk when you auto-detect, the problem is more likely to be software than hardware.

3. If the drive is detected in the BIOS setup but cannot be booted or accessed when booting from a floppy disk, the disk itself may be faulty.

IDS Replication Failure

Supports Unified Communications Manager version 5.0, 5.1.1, and 5.1.2 (not for Windows-based Communications Managers) in Syslog/RTMT.

Description: A subscriber in a Cisco Unified Communications Manager cluster experienced a failure while replicating the data to the publisher database. This event needs to be manually cleared to delete it.

This event is generated by monitoring the syslog messages received from Communications Manager.

Trigger: RTMT.

Severity: Critical.

Device Type: Media Server.

Clear Interval: Time-based auto-clear after 60 minutes.

Event Code: 2091.

Recommended Action: Monitor syslog for IDSEngineCritical syslog messages with a ClassID of 30-39, which indicate a replication problem. These messages contain more detailed information on the cause of the event.

InformAlarm

Description: Critical event generated from processed traps.

Trigger: SNMP trap (see Processed SNMP Traps, page C-1).

Severity: Informational.

Event Code: 1014.

InsufficientFreeHardDisk

Description: Available free hard disk space is insufficient. This may degrade the performance of the device.

Trigger: Exceeded Free Hard Disk Threshold.

Severity: Critical.

Device Type: Media Server.

Event Code: 2020.

Recommended Action:

1. Uninstall unnecessary applications.

2. Free up disk space by deleting temporary files.

3. Clean up unnecessary files.

See also InsufficientFreeMemory and InsufficientFreeVirtualMemory.


Note Events are removed for Unified Communications Manager only. You may need to manually clear these CCM events after your upgrade is complete.


InsufficientFreeMemory

Description: System is running out of memory resources or there has been a failure to allocate a buffer due to lack of memory.

Trigger: Exceeded Free Memory Threshold.

Severity: Critical.

Device Type: Host, Media Server, Router, Switch, or Optical Switch.

Event Code: 1015.

Recommended Action: On Cisco IOS devices, run show memory to check memory utilization. Sometimes high memory utilization is indicative of a memory leak. Identify which process is using excessive memory and take action (including restarting the process).

On the other devices, do the following:

1. Close any unnecessary applications.

2. Stop the services that are not being used or are not required.

See also InsufficientFreeHardDisk and InsufficientFreeVirtualMemory.


Note Events are removed for CCM only. You may need to manually clear these CCM events after your upgrade is complete.


InsufficientFreeVirtualMemory

Description: System is running out of virtual memory resources. This may degrade the performance of the device.

Trigger: Exceeded Free Virtual Memory Threshold.

Severity: Critical.

Device Type: Media Server.

Event Code: 2022.

Recommended Action:

1. Stop the services that are not being used or are not required.

2. Increase the virtual memory on the device.

See also InsufficientFreeHardDisk and InsufficientFreeMemory.


Note Events are removed for CCM only. You may need to manually clear these CCM events after your upgrade is complete.


IPCCDualStateNotification

Description: The Unified Contact Center sent a notification with a value of cccaEventState in the trap details.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Severity: Warning.

Device Type: Media Server.

Event Code: 2070.

Recommended Action: See the event details page, which provides information on any actions that need to be taken.

IPCCSingleStateNotification

Description: The Unified Contact Center sent a notification with a value of singleStateRaise for cccaEventState in the trap details.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Severity: Warning.

Device Type: Media Server.

Event Code: 2070.

Recommended Action: See the event details page, which provides information on any actions that need to be taken.

JitterDS_ThresholdExceeded

Description: There is a violation in the JitterDS threshold. This results in poor voice quality.

Trigger: Node-to-node test.

Severity: Warning.

Device Type: Router or Switch.

Event Code: 4007.

Recommended Action: Check the connectivity between the destination and source (DS) on which the test is configured.

Check the QoS settings that may impact latency in the intermediate routers. This may be the result of a heavy load or crash on a node in the path.

For more information, see Using Node-To-Node Tests, page 11-1.

JitterSD_ThresholdExceeded

Description: There is a violation in the JitterSD threshold. This results in poor voice quality.

Trigger: Node-to-node test.

Severity: Warning.

Device Type: Router or Switch.

Event Code: 4006.

Recommended Action: Check the connectivity between the source and destination (SD) on which the test is configured.

Check the QoS settings that may impact latency in the intermediate routers. This may be the result of a heavy load or crash on a node in the path.

For more information, see Using Node-To-Node Tests, page 11-1.

LocationBWOutOfResources

Description: A call through a Communications Manager location failed due to lack of bandwidth in the cluster.


Note Polling must be enabled for Voice Utilization Settings to monitor this event.


Trigger: Exceeded Location Out of Resources Threshold.

Severity: Critical.

Device Type: Media Server.

Event Code: 2094.

Recommended Action: This event indicates a location call admission control (CAC) issue and is useful for ensuring that Voice over IP trunk sizing is adequate. Check event details to identify the location that has the CAC issue. Check the Detailed Device View to find out the available bandwidth for the location and determine if additional bandwidth needs to be configured.

LogPartitionHighWatermarkExceeded

Supports Unified Communications Manager version 5.x or later in Syslog/RTMT.

Description: The percentage of used disk space in the log partition has exceeded the configured high-water mark. This event is generated based on polling RTMT precanned counters.

Trigger: Exceeded high-water mark threshold.

Severity: Informational.

Device Type: Communications Manager.

Event Code: 2132.

Recommended Action: Log partition usage can be monitored from the RTMT Disk Usage page. It shows up as Common Partition. Check for trace settings and core dump files. Note that core dump files are fairly large. Typically, a core dump file is 200 to 300 MB in size, but it can be as large as 1 to 2 GB.


Note Once the log partition disk usage goes above the high-water mark threshold, Cisco Log Partition Monitoring Tool (LPM) will start deleting files to put log partition disk usage under the low-water mark threshold. Since LPM may delete trace, log, or core dump files you may want to keep, it is very important to act when you receive a LogPartitionLowWaterMarkExceeded alert. You can use Trace & Log Central (TLC) to download files and delete them from the server.


LogPartitionLowWatermarkExceeded

Supports Unified Communications Manager version 5.x or later in Syslog/RTMT.

Description: Free disk space is low. The percentage of used disk space in the log partition has exceeded the configured low water mark. There are no files to be purged under such a situation. This event is generated based on polling RTMT precanned counters.

Default Polling Interval: 30 seconds.

Default Threshold: N/A.

Trigger: Exceeded free hard disk threshold.

Severity: Informational.

Device Type: Media Server.

Event Code: 2131.

Recommended Action: Log partition usage can be monitored from the RTMT Disk Usage page. It shows up as Common Partition. Check for trace settings and for core dump files. Note that core dump files are fairly large. Typically, a core dump file is 200 to 300 MB in size, but it can be as large as 1 to 2 GB.


Note Once the log partition disk usage goes above the high-water mark threshold, Cisco Log Partition Monitoring Tool (LPM) will start deleting files to put log partition disk usage under the low-water mark threshold. Since LPM may delete trace, log, or core dump files you may want to keep, it is very important to act when you receive a LogPartitionLowWaterMarkExceeded alert. You can use Trace & Log Central (TLC) to download files and delete them from the server.


See also LogPartitionHighWatermarkExceeded and LowInactivePartitionAvailableDiskSpace.
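
The relationship between the two log partition water marks can be sketched in Python as follows. The function name is hypothetical and the 90% and 95% values in the usage lines are illustrative assumptions, not documented defaults; the sketch only shows that the low-water mark acts as an early warning and the high-water mark is the level above which LPM starts deleting files.

```python
def log_partition_event(used_pct, low_water_mark_pct, high_water_mark_pct):
    """Return the name of the event that applies to the current usage, or None."""
    if used_pct > high_water_mark_pct:
        # Above the high-water mark: LPM starts purging files.
        return "LogPartitionHighWatermarkExceeded"
    if used_pct > low_water_mark_pct:
        # Early warning: download or delete files before LPM purges them.
        return "LogPartitionLowWatermarkExceeded"
    return None


warning = log_partition_event(92, low_water_mark_pct=90, high_water_mark_pct=95)
purging = log_partition_event(97, low_water_mark_pct=90, high_water_mark_pct=95)
```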

LostContactWithCluster

Description: MGCP voice gateway ports or interfaces are unregistered with the Cisco Unified Communications Manager cluster.

Trigger: Polling.

Severity: Critical.

Device Type: Voice Gateway (voice port, voice interface), Voice Mail Gateway (voice port), Digital Voice Gateway, gatekeeper.

Event Code: 2035.

Recommended Action: Go to the Communications Manager to verify the registration status of the reported gateways. Verify that IP connectivity exists between the cluster and the gateways.

LowActivePartitionAvailableDiskSpace

Supports Unified Communications Manager version 5.x or later.

Description: The percentage of available disk space on the active partition is lower than the configured value. This event is generated based on polling RTMT precanned counters.

Default Polling Interval: 30 seconds.

Default Threshold: N/A.

Trigger: Threshold.

Severity: Critical.

Device Type: Communications Manager.

Recommended Action: Some of the symptoms of low active disk space include:

Communications Manager Admin page does not operate correctly.

BAT does not operate correctly.

RTMT does not operate correctly.

Since there are no user-manageable files in the Active Partition, check the alert threshold. If the alert threshold is at the Cisco default, then contact your support team for guidance.

LowAvailableDiskSpace

Supports Unified Communications Manager version 4.x or later.

Description: The percentage of available disk space is lower than the configured value. This event is generated based on polling RTMT precanned counters.

Default Polling Interval: 30 seconds.

Default Threshold: N/A.

Trigger: Threshold.

Severity: Critical.

Device Type: Communications Manager.

Recommended Action: Check the available disk space percentage and delete unnecessary files.

LowAvailableInboxLicenses

Description: The number of available Unity inbox licenses is lower than the configured Unity Inbox License Threshold. Cisco Unity Subscriber Feature - Unity Inbox licenses allow you to enable subscribers for the add-on feature called Unity Inbox. Each subscriber enabled for this feature uses one of these licenses.

Trigger: Threshold.

Severity: Critical.

Device Type: Media Server.

Recommended Action: Check to see if you need additional Unity inbox licenses.

LowAvailableSubscriberLicenses

Description: The number of available Unity licenses is lower than the threshold. Cisco Unity Subscriber licenses allow you to add basic voicemail subscribers to the system. Each subscriber uses one license.

Trigger: Fell below Unity License Threshold.

Severity: Critical.

Device Type: Media Server.

Event Code: 2104.

Recommended Action: Check to see if additional subscriber licenses are needed.

LowAvailableVirtualMemory

Description: The percentage of available virtual memory is lower than the configured value, which indicates that available virtual memory is running low. This event is generated based on polling RTMT precanned counters.

Default Polling Interval: 30 seconds.

Default Threshold: N/A.

Trigger: Threshold.

Severity: Critical.

Device Type: Communications Manager.

Event Code: 2022.

Recommended Action: Check Cisco Unified Communications Manager Windows Task Manager or the RTMT tool to verify insufficient memory. This event may be due to a memory leak. It is important to identify which process is using excessive memory. Once the process is identified, if you suspect a memory leak (for example, if the memory usage for a process increases continually, or a process is using more memory than it should), you may want to contact your support team.

LowInactivePartitionAvailableDiskSpace

Supports Unified Communications Manager version 5.x or later.

Description: The percentage of available disk space of the inactive partition is lower than the configured value. This event is generated based on polling RTMT precanned counters.

Default Polling Interval: 30 seconds.

Default Threshold: N/A.

Trigger: Threshold.

Severity: Critical.

Device Type: Communications Manager.

Event Code: 2020.

Recommended Action: Since there are no user-manageable files in the Inactive Partition, check the alert threshold. If the alert threshold is at the Cisco default, then contact your support team for guidance.

LowSwapPartitionAvailableDiskSpace

Supports Unified Communications Manager version 5.x or later.

Description: The percentage of available disk space of the swap partition is lower than the configured value. This event indicates that available swap partition is running low.


Note The swap partition is part of virtual memory. Therefore, low available swap partition disk space also means low virtual memory. This event is generated based on polling RTMT precanned counters.


Default Polling Interval: 30 seconds.

Default Threshold: N/A.

Trigger: Threshold.

Severity: Critical.

Device Type: Communications Manager.

Event Code: 2121.

Recommended Action: Find out how much swap space and virtual memory are still available. Also find out which process is using the most memory. This event may be due to a memory leak. Once you determine that there is a memory leak and virtual memory is running low, you may want to restart the service after saving the necessary troubleshooting information. Please consult your support team for further information.

MajorAlarm

Description: Critical event generated from processed traps.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Severity: Informational.

Event Code: 1016.

Recommended Action: Contact your support team.

Media List Exhausted

Supports Unified Communications Manager version 5.1.3 or later in Syslog/RTMT.

Description: All available media resources defined in the media list are busy. This event is generated by monitoring the syslog messages received from Communications Manager.

Default Polling Interval: N/A.

Default Threshold: N/A.

Trigger: Polling.

Severity: Critical.

Device Type: Media Server.

Clear Interval: Time-based auto-clear after 60 minutes.

Event Code: 2056, 4103, 2052, 2053, 2057 and SNMP trap.

Recommended Action: Install additional resources to the indicated media resource list. This event indicates a network failure or device failure.

MeetingPlaceSwAlarm

Description: A MeetingPlace device has reported a software alarm.

Trigger: Trap-Based Event.

Severity: Warning.

Device Type: Media Server.

Event Code: 2113.

Recommended Action: Each notification includes an integer exception code and an error string which indicates the software module and server reporting the alarm. Read through the MeetingPlace alarms documentation or contact the MeetingPlace team to find out more on the issue.

MinorAlarm

Description: Critical event generated from processed traps.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Severity: Informational.

Event Code: 1017.

MosCQDS_ThresholdExceeded

Description: There is a violation in the configured MosCQDS threshold. This results in poor voice quality. This may be a result of any node in the path being heavily loaded or experiencing a failure.

Trigger: Node-to-node test.

Severity: Warning.

Device Type: Router or Switch.

Event Code: 4012.

Recommended Action: Check the connectivity between the destination and source (DS) on which the test is configured.

Check the QoS settings in intermediate routers, which may affect jitter, packet loss, or delay and therefore the MOS.

MosLQDS_ThresholdExceeded

Description: There is a violation in the configured MosLQDS threshold. This results in poor voice quality. This may be a result of any node in the path being heavily loaded or experiencing a failure.

Trigger: Node-to-node test.

Severity: Warning.

Device Type: Router or Switch.

Event Code: 4013.

Recommended Action: Check the connectivity between the destination and source (DS) on which the test is configured.

Check the QoS settings in intermediate routers, which may affect jitter, packet loss, or delay and therefore the MOS.

MWIOnTimeExceeded

Description: This threshold is related to the Message Waiting Indicator (MWI) synthetic test and provides information on the voicemail lamp of the phone.

MWI on time is the difference between the timestamp when a voicemail message is left and the timestamp when the MWI light goes on. If this time exceeds the MWIOnTime threshold, the MWIOnTimeExceeded event is raised.

Trigger: Exceeded MWI on time threshold.

Severity: Warning.

Device Type: Media Server.

Event Code: 2024.

Recommended Action: The network administrator should check the load on the Unity server.
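
The timing comparison behind MWIOnTimeExceeded can be illustrated with a minimal Python sketch. The function name, the timestamps, and the 10-second threshold are hypothetical values used only to show the calculation.

```python
from datetime import datetime, timedelta


def mwi_on_time_exceeded(message_left_at, lamp_on_at, threshold):
    """Return True when the lamp took longer than the threshold to turn on."""
    mwi_on_time = lamp_on_at - message_left_at   # elapsed time as a timedelta
    return mwi_on_time > threshold


left = datetime(2024, 1, 1, 9, 0, 0)      # voicemail left (illustrative)
lamp = datetime(2024, 1, 1, 9, 0, 25)     # MWI lamp turned on (illustrative)
exceeded = mwi_on_time_exceeded(left, lamp, threshold=timedelta(seconds=10))
```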

NicDown

Description: A Network Interface Controller on a Cisco Unified Contact Center system is down. This impacts TDM telephony services.

Trigger: Polling.

Severity: Critical.

Device Type: Media Server.

Event Code: 2040.

Recommended Action:

1. Check whether the NIC service is running. Try starting the service if it is stopped.

2. If the service does not start, contact the Cisco Unified Contact Center support team.

NodeToNodeTestFailed

Description: The configured IPSLA test has failed on the source device. The reason for failure is indicated based on the error code.

Trigger: Node-to-node tests.

Severity: Warning.

Device Type: Router or Switch.

Event Code: 4000.

Recommended Action: Take action based on the error code. If the error code indicates there is a timeout or a problem with reaching the destination, then repair the problem in the path or the destination.

Number Of Registered Phones Dropped

Supports Unified Communications Manager version 5.1.3 or later in Syslog/RTMT.

Description: The number of registered phones in the cluster dropped more than the configured percentage between consecutive polls. This event is generated by monitoring the syslog messages received from Unified Communications Manager.

Trigger: Threshold.

Severity: Warning.

Device Type: Phone.

Clear Interval: Time-based auto-clear after 60 minutes.

Event Code: 7016.

Recommended Action: Phone registration status must be monitored for sudden changes. If the registration status changes slightly and readjusts quickly over a short time frame, it could indicate a phone move, addition, or change. A sudden smaller drop in the phone registration counter could indicate a localized outage; for instance, an access switch or a WAN circuit outage or malfunction. A significant drop in registered phone level requires immediate attention from the administrator.
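The percentage-drop check described for this event can be illustrated with a short sketch. The function name and the sample counts are assumptions made for illustration. The actual computation performed by Unified Communications Manager is not documented here.

```python
def phones_dropped(previous_count, current_count, max_drop_percent):
    """Return True when the registered-phone count fell by more than
    `max_drop_percent` between two consecutive polls."""
    if previous_count <= 0:
        return False  # nothing was registered, so there is no drop to measure
    drop_percent = (previous_count - current_count) * 100.0 / previous_count
    return drop_percent > max_drop_percent


# Hypothetical polls: 1000 phones, then 850 phones, with a 10% limit.
print(phones_dropped(1000, 850, 10))
```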

Number Of Registered Gateways Decreased (see footnote 1)

Supports Unified Communications Manager version 5.1.3 or later in Syslog/RTMT.

Description: The number of registered gateways decreases between two consecutive RTMT polls.

Trigger: Polling.

Severity: Warning.

Device Type: Voice Gateway.

Clear Interval: Time-based auto-clear after 60 minutes.

Number Of Registered Gateways Increased (see footnote 1)

Supports Unified Communications Manager version 5.1.3 or later in Syslog/RTMT.

Description: The number of registered gateways increases between two consecutive RTMT polls.

Trigger: Polling.

Severity: Informational.

Device Type: Voice Gateway.

Clear Interval: Time-based auto-clear after 60 minutes.

Number Of Registered MediaDevices Decreased (see footnote 1)

Supports Unified Communications Manager version 5.1.3 or later in Syslog/RTMT.

Description: A registered media device count decreases between two consecutive RTMT polls.

Trigger: Polling.

Severity: Informational.

Device Type: Cluster.

Clear Interval: Time-based auto-clear after 60 minutes.

Event Code: 7018.

Number Of Registered MediaDevices Increased (see footnote 1)

Supports Unified Communications Manager version 5.1.3 or later in Syslog/RTMT.

Description: A registered media device count increases between two consecutive RTMT polls.

Trigger: Polling.

Severity: Informational.

Device Type: Cluster.

Clear Interval: Time-based auto-clear after 60 minutes.

Event Code: 7017.

OperationallyDown

Description: Interface card or network adapter operational state is not normal.

Trigger: Polling, or processed trap (see Processed SNMP Traps, page C-1).

Severity: Critical.

Device Type: Host, Hub, Router, Switch, or Optical Switch.

Event Code: 1018.

Recommended Action: Check the status of indicated interface, port, or card and investigate the root cause.


Note For interfaces, Operations Manager generates an OperationallyDown clear event only if the card is reinserted into the same slot, and if the module index is the same before and after the card is reinserted.


OutofRange

Description: Device temperature or voltage is outside the normal operating range. When an OutofRange event is generated, you will normally also see fan, power supply, or temperature events.

Trigger: Exceeded one of these thresholds:

Relative temperature threshold.

Relative voltage threshold.

Severity: Critical.

Device Type: Host, Router, Switch, or Optical Switch.

Event Code: 1019.

PacketLossDS_ThresholdExceeded

Description: The configured value for the PacketLossDS threshold was violated. This results in poor voice quality. This may be a result of any node in the path being heavily loaded or crashed.

Trigger: Node-to-node test.

Severity: Warning.

Device Type: Router or Switch.

Event Code: 4005.

Recommended Action: Check the connectivity between the destination and source (DS) test configuration.

Check the QoS settings on intermediate routers, which may affect packet loss. If the problem persists, revisit the queue sizes and the drop policy of the nodes in the path.

For more information, see Using Node-To-Node Tests, page 11-1.

PacketLossSD_ThresholdExceeded

Description: The configured value for the PacketLossSD threshold was violated. This results in poor voice quality. This may be a result of any node in the path being heavily loaded or crashed.

Trigger: Node-to-node test.

Severity: Warning.

Device Type: Router or Switch.

Event Code: 4005.

Recommended Action: Check the connectivity between the source and destination (SD) test configuration.

Check the QoS settings on intermediate routers, which may affect packet loss. If the problem persists, revisit the queue sizes and the drop policy of the nodes in the path.

For more information, see Using Node-To-Node Tests, page 11-1.

PhoneReachabilityTestFailed

Description: Operations Manager cannot reach an IP phone. The IP phone has not responded to three or more successive pings from Operations Manager or the IP SLA device.

Trigger: Polling.

Severity: Critical.

Device Type: IP Phone.

Event Code: 9002.

Recommended Action: Check the connectivity between the phone and the IPSLA device.
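The "three or more successive pings" condition can be sketched as a simple counter. The function name and the boolean representation of ping results are hypothetical. Operations Manager's real polling logic is not shown in this guide.

```python
def reachability_failed(ping_results, required_failures=3):
    """ping_results: booleans in time order, True meaning a reply was received.
    Return True once `required_failures` pings in a row went unanswered."""
    consecutive = 0
    for replied in ping_results:
        consecutive = 0 if replied else consecutive + 1
        if consecutive >= required_failures:
            return True
    return False


# One reply followed by three missed pings raises the condition.
print(reachability_failed([True, False, False, False]))
```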

PhoneUnregistered

Description: The number of unregistered phones in the selected phone-based notification group is less than the Unified Communications Manager/Unified Communications Manager Express-based event threshold for the same Unified CM Express.

Default Polling Interval: 4 minutes.

Default Threshold: 5.

Trigger: Polling (Notification).

Severity: Warning.

Device Type: IP Phone.

Event Code: N/A.

Recommended Action: Phone registration status must be monitored for sudden changes. If the registration status changes slightly and readjusts quickly over a short time frame, it may indicate a phone move, addition, or change. A sudden smaller drop in the phone registration counter may indicate a localized outage; for instance, an access switch or a WAN circuit outage or malfunction. A significant drop in registered phone level requires immediate attention from the administrator.

PhonesUnregisteredThresholdBased

Description: The number of unregistered phones in the selected phone-based notification group is more than the Unified Communications Manager-based event threshold.

Default Polling Interval: 4 minutes.

Default Threshold: 5.

Trigger: Polling (Notification).

Severity: Warning.

Device Type: Unified Communications Manager/Unified Communications Manager Express.

Event Code: N/A.

Recommended Action: Phone registration status must be monitored for sudden changes. If the registration status changes slightly and readjusts quickly over a short time frame, it may indicate a phone move, addition, or change. A sudden smaller drop in the phone registration counter may indicate a localized outage; for instance, an access switch or a WAN circuit outage or malfunction. A significant drop in registered phone level requires immediate attention from the administrator.

PimDown

Description: The Cisco Unified Contact Center Peripheral Interface Manager (PIM) module acts as a gateway to a peripheral device (Communications Manager/IVR/CTI Agents). This event indicates that the PIM is down on the Cisco Unified Contact Center device and that connectivity to these peripheral devices is lost.

Trigger: Polling.

Severity: Critical.

Device Type: Media Server.

Event Code: 2041.

Recommended Action:

1. Check if PIM service is running. Try starting the service if it is stopped.

2. If the service does not start, contact your Cisco Unified Contact Center support team.

3. Check network connectivity across the peripheral devices and Cisco Unified Contact Center.

PowerSupplyDegraded

Description: Power supply state is degraded.

Default Polling Interval: 4 minutes.

Default Threshold: N/A.

Trigger: Polling.

Severity: Warning.

Device Type: Media Server or Voice Gateway.

Event Code: 2026.

Recommended Action: Check the status of reported power supply and monitor for recurrence.

PowerSupplyDown

Description: Power supply state is down.

Default Polling Interval: 4 minutes.

Default Threshold: N/A.

Trigger: Trap and polling.

Severity: Critical.

Device Type: Media Server or Voice Gateway.

Event Code: 2027.

Recommended Action: Check the status of reported power supply and contact Cisco for hardware replacement if the primary power supply is down.

Quality Dropped Below Threshold

Description: The voice quality expectation defined by the MOS score has not been met. This results in poor voice quality based on the delay, latency, packet loss, and jitter in the network.

Trigger: Node-to-node test.

Severity: Warning.

Device Type: Router or Switch.

Event Code: 4009.

Recommended Action: Check the connectivity and QoS settings of routers configured as part of the test. Fix problems with the delay, latency, packet loss, or jitter parameters in the network that may result in poor voice quality.

For more information, see Using Node-To-Node Tests, page 11-1.

RegistrationResponseTime_ThresholdExceeded

Description: The registration response time threshold configured as part of the gatekeeper registration test has been violated.

The endpoints trying to register with the gatekeeper may experience a delay which results in some voice calls not being established properly.

Trigger: Node-to-node test.

Severity: Warning.

Device Type: Router or Switch.

Event Code: 4003.

Recommended Action: Check the network performance path to the gatekeeper from the endpoints. Check if the gatekeeper is heavily loaded. This may result in some delay in accepting the registration requests from the endpoints.

For more information, see Using Node-To-Node Tests, page 11-1.

RepeatedRestarts

Description: The system restarts repeatedly over a short period of time. Device Fault Manager issues this event by monitoring the number of system cold and warm starts received within the restart window.

Trigger: Exceeded Restart Trap Threshold for Restart Trap Window; or processed trap (see Processed SNMP Traps, page C-1).

Severity: Critical.

Device Type: Host, Hub, Router, Switch, or Optical Switch.

Event Code: 1020.

Recommended Action: Reconfigure the Restart Trap Threshold and Restart Trap Window parameters under Reachability Settings > Data Settings.
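The relationship between the Restart Trap Threshold and the Restart Trap Window can be pictured as a sliding-window counter. The class below is a hypothetical sketch, not Device Fault Manager code, and the threshold and window values are made up for the example.

```python
from collections import deque


class RestartMonitor:
    """Counts cold/warm starts seen inside a sliding time window."""

    def __init__(self, threshold, window_seconds):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._starts = deque()

    def record_restart(self, timestamp):
        """Record one restart; return True when the number of restarts
        inside the window exceeds the threshold."""
        self._starts.append(timestamp)
        # Discard restarts that are older than the window.
        while timestamp - self._starts[0] > self.window_seconds:
            self._starts.popleft()
        return len(self._starts) > self.threshold


monitor = RestartMonitor(threshold=2, window_seconds=600)
results = [monitor.record_restart(t) for t in (0, 100, 200, 2000)]
print(results)
```

In this sketch, the third restart within ten minutes exceeds a threshold of 2, while a restart long after the window has passed does not.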

RFactorDS_ThresholdExceeded

Description: The configured value for RFactorDS threshold was violated. This results in poor voice quality.

Trigger: Node-to-node test.

Severity: Warning.

Device Type: Router or Switch.

Event Code: 4011.

Recommended Action: Check the QoS settings and network issues that affect the voice quality between the source and destination.

RingBackResponseTime_ThresholdExceeded

Description: Ring-back response time exceeds the node-to-node test threshold.

Trigger: Node-to-node test.

Severity: Warning.

Device Type: Router or Switch.

Event Code: 4002.

Recommended Action: Check whether the gatekeeper has a CPU load issue. Alternatively, the delay may be network delay between the endpoint and the gatekeeper.

For more information, see Using Node-To-Node Tests, page 11-1.

RoundTripResponseTime_ThresholdExceeded

Description: Round-trip response time exceeds the node-to-node test threshold.

Trigger: Node-to-node test.

Severity: Warning.

Device Type: Router or Switch.

Event Code: 4001.

Recommended Action: Check whether the gatekeeper has a CPU load issue. Alternatively, the delay may be network delay between the endpoint and the gatekeeper.

For more information, see Using Node-To-Node Tests, page 11-1.

Route List Exhausted

Supports Unified Communications Manager version 5.1.3 or later in Syslog/RTMT.

Description: This indicates that all available channels defined in the route list are busy. This event is generated by monitoring the syslog messages received from Unified Communications Manager.

Default Polling Interval: N/A.

Default Threshold: N/A.

Trigger: Polling.

Severity: Critical.

Device Type: Voice Gateway.

Clear Interval: Time-based auto-clear after 60 minutes.

Event Code: 4104, 4106, and 4107.

Recommended Action: Check the RTMT Syslog Viewer for verification and further details. Assess whether additional resources should be added in the indicated route.
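Several events in this table, including this one, carry a time-based clear interval, after which Operations Manager moves the event from Active to Cleared. A minimal sketch of that rule follows, assuming a hypothetical `event_state` helper and example timestamps:

```python
from datetime import datetime, timedelta


def event_state(raised_at, now, clear_interval=timedelta(minutes=60)):
    """Return "Cleared" once the clear interval has expired, else "Active"."""
    return "Cleared" if now - raised_at >= clear_interval else "Active"


raised = datetime(2024, 1, 1, 12, 0)
print(event_state(raised, raised + timedelta(minutes=30)))
print(event_state(raised, raised + timedelta(minutes=60)))
```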

RTMTDataMissing

Description: Even though the Publisher is known to Operations Manager, a query to RTMT resulted in errors for all the nodes in the cluster. This event is generated based on polling RTMT precanned counters.

Default Polling Interval: 30 seconds.

Trigger: Polling.

Severity: Informational.

Device Type: Unified Communications Manager.

Recommended Action: Check if all the nodes in the cluster are running and whether a network problem exists.

RTPPacketLossDS_ThresholdExceeded

Description: The configured value for the RTPPacketLossDS was violated. This results in poor voice quality.

Trigger: Node-to-node test.

Severity: Warning.

Device Type: Router or Switch.

Event Code: 4014.

Recommended Action: Check the QoS settings and network issues that affect the voice quality between the destination and source.

SCSIControllerDown

Description: Indicates that the SCSI controller is down. The controller is the bridge between a hard disk drive's low-level interface and the host computer, and it is needed to read blocks of data.

Trigger: Polling.

Severity: Critical.

Device Type: Media Server.

Event Code: 2101.

Recommended Action: Check the event attribute compaq_ScsiCtrlStatus. If its value is other(1) or failed(3), proceed as described for that value:

other(1)

The controller is not operational due to reasons other than hardware failure. The controller may be intentionally disabled.

The controller configuration may be in conflict with the configuration of other hardware or software in the system. Reconfigure any conflicts.

The controller may have failed initialization. Corrective action is Operating System dependent. Check for diagnostic messages which may have occurred at system boot time.

failed(3)

The controller has failed and is no longer operating. Run Compaq Diagnostics on the computer system to help identify the problem.

SCSIDriveDown

Description: Compaq SCSI hard disk drive is down. SCSI controller may be unable to communicate with the device hard disk.

Trigger: Polling.

Severity: Critical.

Device Type: Media Server.

Event Code: 2061.

Recommended Action:

The physical drive can be in one of the following states. The event attribute compaq_ScsiPhyDrvStatus indicates the following states:

other(1)—The drive is in a state other than one of those listed below.

failed(3)—The physical drive has failed and can no longer return data. When convenient, you should bring down the server and run Compaq Diagnostics to help identify the problem. The drive may need to be replaced.

notConfigured(4)—The physical drive is not configured. You need to ensure that all of the drive's switches are properly set and that the Compaq Configuration Utility has been run.

badCable(5)—A physical drive is not responding. You should check the cables connected to it. You can bring down the server and run Compaq Diagnostics to help identify the problem.

missingWasOk(6)—A physical drive had a status of OK and is no longer present. The drive has been removed.

missingWasFailed(7)—A physical drive had a status of FAILED and is no longer present. The drive has been removed.

predictiveFailure(8)—The physical drive has exceeded a threshold value for one of its predictive indicators. When convenient, you should bring down the server and run Compaq Diagnostics to help identify the problem. The drive may need to be replaced.

missingWasPredictiveFailure(9)—A physical drive had a status of PREDICTIVE FAILURE and is no longer present. The drive has been removed.

offline(10)—The physical drive is offline and can no longer return data. No further status is available.

missingWasOffline(11)—A physical drive had a status of OFFLINE and is no longer present. The drive has been removed.
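When events are handled in a script, the compaq_ScsiPhyDrvStatus values listed above can be kept in a lookup table. The sketch below is illustrative only: the status numbers and names come from the list above, while the helper name and the shortened action summaries are assumptions.

```python
# Status values and names follow the list above; summaries are paraphrased.
DRIVE_STATUS = {
    1: ("other", "State not covered by the other values; investigate manually."),
    3: ("failed", "Run diagnostics when convenient; the drive may need replacing."),
    4: ("notConfigured", "Check drive switches and run the configuration utility."),
    5: ("badCable", "Check the cables connected to the drive."),
    6: ("missingWasOk", "Drive was OK and has been removed."),
    7: ("missingWasFailed", "Drive had failed and has been removed."),
    8: ("predictiveFailure", "Run diagnostics when convenient; the drive may need replacing."),
    9: ("missingWasPredictiveFailure", "Drive was in predictive failure and has been removed."),
    10: ("offline", "Drive is offline and cannot return data."),
    11: ("missingWasOffline", "Drive was offline and has been removed."),
}


def describe_drive_status(code):
    """Map a numeric status to a readable "name(code): summary" string."""
    name, action = DRIVE_STATUS.get(code, ("unknown", "Value not listed in this guide."))
    return f"{name}({code}): {action}"


print(describe_drive_status(5))
```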

SDL Link Out Of Service

Description: This event indicates that the local Cisco Unified Communications Manager has lost communication with the remote Communications Manager.

Default Polling Interval: N/A.

Default Threshold: N/A.

Trigger: Syslog.

Severity: Warning.

Device Type: Communications Manager.

Clear Interval: Time-based auto-clear in EPM after 4 days.

Event Code: 7002.

Recommended Action: Investigate why the remote Communications Manager is not running or whether a network problem exists.

Sensor Down

Description: A Cisco 1040 or NAM has stopped responding to keepalives from Service Monitor. This event appears on the Alert Details page and can be generated only when you have a licensed copy of Service Monitor.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Severity: Critical.

Device Type: Unidentified Trap (because Operations Manager does not monitor Cisco 1040 or NAM).

Event Code: 8004.

Recommended Action: This event appears on the Alert Details page and can be generated only when you have a licensed copy of Service Monitor.

ServiceDown

Description: One of the critical services (any of the services in the Detailed Device View) is not running. The problem may be due to someone manually stopping the service (not applicable for Communications Manager). If you intend to stop the service for a long period of time, disabling monitoring for the service is highly recommended and is required to avoid this event. Go to Service Level View > Detailed Device View, select the specific service, and change the managed state to False.

Default Polling Interval: 30 seconds.

Default Threshold: N/A.

Trigger: Polling.

Severity: Critical (dependent on service).

Device Type: Media Server.

Event Code: 2007.

Recommended Action: Identify which services are not running. You can start the service manually from the Administrator Service Control page.

Check to see if there are any core files. Download the core files, if any, as well as service trace files.


Note Events are removed for Communications Managers only. You may need to manually clear these Communications Manager events after your upgrade is complete.


ServiceQualityIssue

Description: Operations Manager has received a MOS violation trap from Service Monitor. This indicates that MOS has dropped below a threshold that is set in Service Monitor.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Severity: Critical.

Device Type: Service Quality events pertain to the call destination, which might be a device (voice gateway) or a phone.

Event Code: 8001.

Note This event is shown on the Service Quality Alert Details display. (See Using the Service Quality Alerts Display, page 4-3.) This event can be generated only when you have a licensed copy of Service Monitor.

ServerUnreachable

Description: Host is not reachable through RTMT polling. This event is generated based on polling RTMT precanned counters.

Default Polling Interval: 30 seconds.

Default Threshold: N/A.

Recommended Action: Investigate if the indicated host is running and whether a network problem exists.

SOAPNotReachable

Description: A device experienced Simple Object Access Protocol (SOAP) connectivity failure while polling. SOAP attributes will not be polled.

Default Polling Interval: 4 minutes.

Trigger: A Cisco Unified Communications Manager device experienced a SOAP communication failure with the management application. Unified Communications Manager may be overloaded, or the web service may be down. Rediscover the device in Cisco Unified Operations Manager.

Severity: Critical.

Device Type: Media Server.

Recommended Action: Restart IIS Admin Service and Cisco RIS Data Collector on Communications Manager.

Event Code: 2109.

SoftwareAlarm

Description: Event indicates alarm generated from Windows Event Log trap processing.

Default Polling Interval: N/A.

Default Threshold: N/A.

Trigger: Processed trap.

Severity: Critical.

Device Type: Media Server.

Event Code: 2141.

Recommended Action: N/A.

SRSTEntered

Description: An IP telephony router is functioning in Survivable Remote Site Telephony (SRST) mode, performing call management for phones in place of the central Cisco Unified Communications Manager. The event is generated when a WAN link is down, preventing IP phone TCP keepalive messages from reaching the Communications Manager.

Trigger: Polling (See also Table 18-1 on page 18-3).

Severity: Critical.

Device Type: Router, Switch, or Optical Switch.

Event Code: 9000.

Recommended Action: Check the connectivity between the phone and the Communications Manager to which it is registered.

Note This event triggers activity on the IP Phone Outage Status monitoring dashboard.

SRSTRouterFailure

Description: There is an SRST router failure. This trap comes with a notification reason (csrstSysNotifReason) that describes the failure. The SRST feature will probably not work on the branch site when this condition occurs.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Severity: Warning.

Device Type: Router or Voice Gateway.

Event Code: 2071.

Recommended Action:

1. Check the event attributes to find out the cause of the failure.

2. Telnet to the router console port address and type reload at the router prompt.

SRSTSuspected

Description: IP Phone Information Facility reports that all phones associated with the SRST router are unregistered, but the WAN link between phones and the central Cisco Unified Communications Manager is up.

Trigger: Polling.

Severity: Critical.

Device Type: Router, Switch, or Optical Switch.

Event Code: 9001.

Recommended Action: Check the connectivity between the phone and the SRST router.

Note This event triggers activity on the IP Phone Outage Status monitoring dashboard.

StateNotNormal

Description: A fan, power supply, temperature sensor, or voltage sensor is not acting normally. When an OutofRange event is generated, you will also see a fan, power supply, or temperature event.

Trigger: Polling.

Severity: Critical.

Device Type: Host, Hub, Router, Switch, or Optical Switch.

Event Code: 1021.

Recommended Action: Contact your support team.

SyntheticTestFailed

Description: Synthetic tests are CPU intensive. A high-CPU utilization threshold ensures that tests do not run when system CPU utilization is more than 80%. When the 80% high-CPU utilization threshold is reached, synthetic tests are stopped and a SyntheticTestFailed event is created. This signifies a failure to execute the synthetic test because of high CPU utilization, not a failure in its result. This event is also raised when the synthetic test fails for other reasons.

Trigger: Polling.

Severity: Critical.

Device Type: Media Server.

Event Code: 2011.

Recommended Action: The network administrator should ensure that the CPU utilization of the system is below 80%. Check the event properties for details of the failure.
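The two ways this event can arise, a skipped run because of high CPU utilization or a genuine test failure, can be separated in a small sketch. The function name, the tuple return value, and the reason strings are hypothetical. Only the 80% figure comes from the description above.

```python
HIGH_CPU_THRESHOLD_PERCENT = 80  # figure taken from the description above


def run_synthetic_test(cpu_percent, test_fn):
    """Run `test_fn` unless CPU utilization is too high.
    Returns (event_name_or_None, reason)."""
    if cpu_percent > HIGH_CPU_THRESHOLD_PERCENT:
        # The test is not executed at all; this is not a test result.
        return ("SyntheticTestFailed", "high CPU")
    try:
        passed = test_fn()
    except Exception as exc:  # any error inside the test counts as a failure
        return ("SyntheticTestFailed", f"test error: {exc}")
    if not passed:
        return ("SyntheticTestFailed", "test failed")
    return (None, "passed")


print(run_synthetic_test(95, lambda: True))
print(run_synthetic_test(50, lambda: True))
```

Checking the reason alongside the event name mirrors the recommended action of reading the event properties for details of the failure.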

SyntheticTestThresholdExceeded

Description: Raised when a synthetic test exceeds some threshold value. The following are thresholds based on the type of synthetic tests:

RegistrationTimeThreshold—Time limit for phone registration exceeded.

DialtoneTimeThreshold—Time limit for dial-tone test exceeded.

EndToEndCallSetupTimeThreshold—Time limit for setting up an end-to-end call exceeded.

Trigger: Polling.

Severity: Warning.

Device Type: Media Server.

Event Code: 9003.

Recommended Action: The network administrator should check the connectivity between the Communications Manager, the calling phone, and the called phone, depending on the type of synthetic test (MWI, End-to-end, phone registration, CER, or off-hook).


Note For the MWI test, check the Unity configuration. For the CER test, check the configuration of the PSAP server, OSAN server, and CER server.
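The per-test thresholds named in this entry can be checked together, as in the sketch below. The threshold names are taken from the description. The numeric limits, the measured values, and the helper name are invented for illustration.

```python
def exceeded_thresholds(measurements, thresholds):
    """Return the sorted names of thresholds whose measured time is too high."""
    return sorted(
        name
        for name, value in measurements.items()
        if name in thresholds and value > thresholds[name]
    )


# Hypothetical limits and measurements, in milliseconds.
limits = {
    "RegistrationTimeThreshold": 2000,
    "DialtoneTimeThreshold": 500,
    "EndToEndCallSetupTimeThreshold": 10000,
}
measured = {
    "RegistrationTimeThreshold": 1500,
    "DialtoneTimeThreshold": 900,
    "EndToEndCallSetupTimeThreshold": 12000,
}
print(exceeded_thresholds(measured, limits))
```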


SystemVersionMismatched

Supports Unified Communications Manager version 6.0.x in Syslog/RTMT.

Description: There is a mismatch in the system version among all servers in the cluster. This event is generated by monitoring the syslog messages received from Unified Communications Manager.

Trigger: Polling.

Severity: Informational.

Device Type: Unified Communications Manager.

Recommended Action: Ensure that all servers in the cluster are running the same system version.

TemperatureHigh

Description: Current temperature of temperature sensor exceeds the Relative Temperature threshold.

Default Polling Interval: 4 minutes.

Default Threshold: 10%.

Trigger: Exceeded Relative Temperature Threshold.

Severity: Critical.

Device Type: Media Server, Router, or Switch.

Event Code: 2029.

Recommended Action: Verify that environmental temperatures are set up optimally. Check other events, such as FanDown or FanDegraded, to verify that fans are operating normally. If fans are not operating normally, you should contact Cisco for hardware replacement.

See also OutofRange.

TemperatureSensorDegraded

Description: The server temperature is outside of the normal operating range. The event is based on polling or processing the cpqHeThermalTempDegraded SNMP traps received from monitored Cisco Unified Communications Managers.

Default Threshold: N/A.

Trigger: Polling, or processed trap (see Processed SNMP Traps, page C-1).

Severity: Warning.

Device Type: Media Server or Voice Gateway.

Event Code: 2030.

Recommended Action: Identify the reported temperature sensor location (ioboard/cpu) and verify status. Check other events, such as FanDown or FanDegraded, to verify that system fans are operating normally. Contact Cisco for hardware replacement, if needed.

TemperatureSensorDown

Description: Indicates that the server temperature is outside of the normal operating range and the system will be shut down. The event is based on processing the cpqHeThermalTempFailed SNMP trap received from monitored Cisco Unified Communications Managers.

Default Threshold: N/A.

Trigger: Processed trap (see Processed SNMP Traps, page C-1).

Severity: Critical.

Device Type: Media Server or Voice Gateway.

Event Code: 2031.

Recommended Action: Verify that environmental temperatures are set up correctly. Identify the reported temperature sensor location (ioboard/cpu) and verify status. Check other events, such as FanDown or FanDegraded, to verify that system fans are operating normally. Contact Cisco for hardware replacement, if needed.

Thread Counter Update Stopped

Supports Unified Communications Manager version 5.1.3 or later in Syslog/RTMT.

Description: Total number of processes and/or threads exceeded the maximum number of tasks. This situation could indicate that some process is leaking or has thread leaking. System Access must stop thread counter update to avoid CPU pegging, and only provide process counter information for up to the maximum number of processes. This event is generated by monitoring the syslog messages received from Unified Communications Manager.

Trigger: Exceeded defined threshold.

Severity: Informational.

Device Type: Communications Manager.

Clear Interval: Time-based auto-clear in EPM after 4 days.

Event Code: 7023.

Recommended Action: Check the alert detail for the process which has the highest number of threads and the process which has the most instances. If the process has an unusual number of threads or instances, save the trace for the service and try restarting the service. Make sure to download trace files associated with the service.

TotalTimeUsedThresholdExceeded

Description: Indicates that the Cisco Unity Express has reached maximum allocated voicemail capacity. This impacts the voicemail features for users serviced by this Unity Express.

Trigger: Exceeded Total Time Used Threshold.

Severity: Critical.

Device Type: Media Server.

Event Code: 2047.

Recommended Action:

1. Delete old voicemail messages to increase available capacity.

2. On the router, run show voicemail usage to get overall usage information.

3. On the router in enable mode, run show voicemail mailboxes to see mailboxes for all users. Delete invalid user voicemail accounts.

4. Set default message expiry to 30 days. Set the expiry time individually for each mailbox to a shorter duration. To check this expiry period, run show voicemail limit and check the Default Message Age value.

5. Check if a license for more voicemail capacity can be purchased. On the router in enable mode, run show software licenses to get details of licenses applied.

UMRCommunicationError

Description: This event is based on WMI. It indicates that the Cisco Unity Message Repository (UMR) cannot communicate with the Partner Mail Server to deliver messages. Messages will be held in the temporary store until the mail server is available.

Trigger: WMI event.

Severity: Critical.

Device Type: Media Server.

Event Code: 2106.

Recommended Action:

On-Box Exchange Troubleshooting:

Verify that the required services related to MS Exchange are up and running. Try to access the Exchange mail store using the system manager tool, and ensure that no red X is displayed there.

Review the Windows event log for any events related to Exchange Messaging application failure or Exchange-related service failure (System attendant, Information Store, MTA, and so on).

Check to see if the CuMdbStoreMonitor service is running on the Cisco Unity Server. Try to restart this service and see if Unity can communicate with the Exchange server.

Verify Active Directory (AD) and Global Catalog server functionality on the same system.

Verify that DNS is configured properly and working on the same system so that AD works properly; Exchange relies on AD.

If all of the previous checks are good, then reboot the Unity server, if permitted.

If the problem still persists, open a customer support case.

Off-Box Exchange Troubleshooting:

Perform the on-box troubleshooting, then complete the following steps:

From the Unity server try to ping the Exchange server by both IP address and hostname.

Check to see if Virus Scan is blocking mail delivery to the Partner Mail Server.

Ensure that no firewall changes have taken place to block any higher layer communications between the Cisco Unity and the Exchange server.

UnknownPublisher

Description: The publisher in the cluster is unknown to Operations Manager. This event is generated based on polling RTMT precanned counters.

Trigger: RTMT.

Severity: Critical.

Device Type: Unified Communications Manager.

Recommended Action: Check if the Publisher is managed by Operations Manager.

UnityFailOverOrRestart

Description: One of the following has occurred:

In standalone Cisco Unity configuration—The Cisco Unity system has restarted.

In Cisco Unity failover configuration—A failover between the primary and secondary Unity servers has occurred.

Note UnityFailOverOrRestart is automatically cleared after 30 minutes. Clearing of this event does not indicate that failback has occurred. When failback does occur from secondary to primary, you will see the UnityFailOverOrRestart event on the primary Unity server.

Trigger: WMI event.

Severity: Critical.

Device Type: Media Server.

Event Code: 2105.

Recommended Action: Check the Cisco Unity event view for any error messages related to failover or restart. When a failover occurs, the changes made to the data in the SQL Database (UnityDb) are replicated from the primary server to the secondary server. However, there may be instances when these changes are not replicated from the primary to the secondary server. For more details, go to the following URL: http://www.cisco.com/en/US/products/sw/voicesw/ps2237/products_tech_note09186a0080837de4.shtml.

Unresponsive

Description: Device does not respond to ICMP or SNMP requests. Probable causes are:

On a system: ICMP ping requests and SNMP queries to the device time out with no response.

On an SNMP Agent: Device ICMP ping requests are successful, but SNMP requests time out with no response.

Note A system might also be reported as unresponsive if the only link (for example, an interface) to the system goes down. Operations Manager performs root cause analysis for any unresponsive events. If Operations Manager receives a device unresponsive event, it will clear any interface unresponsive events from that device until the device is recognized as responsive.

Trigger: Polling.

Severity: Critical.

Device Type: Host, Hub, Router, Switch, Optical Switch, Media Server, Phone Access Switch, Voice Mail Gateway, or Voice Gateway.

Event Code: 1022.

Recommended Action: Check if the device is reachable from Operations Manager.

VoicePortOperationallyDown

Description: Voice port's operational state is not normal.

Trigger: Polling.

Severity: Critical.

Device Type: Voice Gateway.

Event Code: 2037.

Note This event only applies to switch ports, not voice gateway ports.

1 This event is not available by default. To activate this pair of events, perform the following steps: (a) Open the NMSROOT\conf\seg\sysLogConfig.xml file. (b) Uncomment the syslog entry by removing the lines marked as comments. (c) Restart the SEGServer process.


Obsolete Events

Table E-2 contains a list of events removed from the Monitoring Dashboard in Operations Manager. Also listed are equivalent events, if available, that can be used in place of the obsolete events.


Note Some obsolete events may not be removed from the Alerts and Events display immediately after upgrade. It may take up to an hour for these events to be removed. If any of these events are displayed after this time frame, you must manually clear them. Events that may need to be manually removed appear in the footnote in Table E-2.


Table E-2 Operations Manager Obsolete Events and Replacement Events 

Obsolete Event
Event Replacement Option

ActivePortThresholdExceeded

Consider using FXS/FXO PortsInService/Active counter from Cisco Communications Manager or Cisco MGCP Gateways objects.

ApplicationDown

None.

BackupActivated

None.

CallManagerDown

For Cisco Unified Communications Manager Express, see CCMEDown.

CCMCDRFilesBackupFailed

Supports Unified Communications Manager version 5.x or later in Syslog/RTMT.

No replacement event; event is replaced by Real Time Monitor and syslog receiver.

CCMHttpServiceInaccessible

None.

CCMLineLinkDown

Replaced by syslog event.

CiscoCCMAttendantConsoleHeartBeatExceeded

No replacement event; event is replaced by Real Time Monitor and syslog receiver.

CiscoMessagingInterfaceHeartBeatExceeded

No replacement event; event is replaced by Real Time Monitor and syslog receiver.

CiscoTftpHeartBeatExceeded

No replacement event; event is replaced by Real Time Monitor and syslog receiver.

CiscoTranscoderAvailResourceLow

See HighResourceUtilization.1

CodeYellowStateEntered

Code Yellow.

CodeRedStateEntered

Code Red.

ConnectionToDistributorFailed

None.

cpuUtilizationExceeded1

Removed for Unified Communications Manager only. Other devices are supported by CPUUtilizationExceeded event.

This event should be manually cleared after upgrade.

Consider using CPUPegging or CallProcessingNodeCPUPegging for Unified Communications Manager.

CTIDeviceNotRegistered

None.

ExceededMaximumUptime

Use CiscoWorks LAN Management Solution (LMS) for data network monitoring.

ExcessiveDAFaults

None. (HP MIB incorrectly reported faults on certain MCS platforms.)

ExcessiveTFTPRequestsAborted

None.

GatewayLostContactWithCluster

None.

HardwareConferenceOutOfResources

MediaListResourceExhausted syslog message replaces out-of-resource messages for all media types.

HeartBeatThresholdExceeded

No replacement event; event is replaced by RTMT (Real Time Monitor) and syslog receiver.

HighCapacityUtilization

None.

HighPriorityQueueFull

Code Yellow.

HighQueueDropRate

None.

HighRouteGroupUtilization

None.

HighRouteListUtilization

None.

InsufficientFreeHardDisk1

Replaced only for Unified Communications Manager. Other devices still support this.

This event should be manually cleared after upgrade.

Consider using LogPartitionHighWatermarkExceeded, LogPartitionLowWatermarkExceeded, and LowActivePartitionAvailableDiskSpace.

InsufficientFreeMemory1

Replaced only for Unified Communications Manager. Other devices still support this.

This event should be manually cleared after upgrade.

Consider using LowAvailableVirtualMemory.

InsufficientFreePhysicalMemory1

Consider using InsufficientFreeMemory.

InsufficientFreeVirtualMemory1

Replaced only for Unified Communications Manager. Other devices still support this.

This event should be manually cleared after upgrade.

InterfaceOperationallyDown

Consider using OperationallyDown.

InvalidResponse

None.

 

None.

MOHConnectionLost

None.

MOHOutOfResource

MediaListResourceExhausted syslog message replaces out-of-resource messages for all media types.

MTPOutOfResource

MediaListResourceExhausted syslog message replaces out-of-resource messages for all media types.

 

None.

OutboundBusyAttemptsThresholdExceeded

Consider using OutboundBusyAttempt from counter in Cisco MGCP FXS/FXO device object or Cisco MGCP Gateways object.

PortsOutOfServiceThresholdExceeded

Consider using Port Status under Cisco FXS/FXO Gateway devices counter.

Resumed

None.

RouteGroupExhausted

None.

RouteGroupFailed

Route List Exhausted.

RouteListFailed

Check Route List Exhausted (generated using syslog) for route list status.

ServicePartiallyRunning

None.

ServiceRestarted

Replaced by a Real-Time Monitoring Tool (RTMT) threshold.

ServiceTerminated

None.

SoftwareConferenceOutOfResources

MediaListResourceExhausted syslog message replaces out-of-resource messages for all media types.

Suspended

None.

SYSLOGNotificationsEvent

This generic event was based on Syslog SNMP Traps and is now replaced by the more specific syslog message-based events.

TooManyUnityPortsActive

See HighPortUtilization.2

TooManyInboundPortsActive

See HighPortUtilization.1

TooManyOutboundPortsActive

See HighPortUtilization.1

TranscoderOutOfResources

MediaListResourceExhausted syslog message replaces out-of-resource messages for all media types.

UnityPortHung

None.

1 If these events appear in the Alerts and Events display, you should manually remove them. These events are removed only for CCM.

2 To view the equivalent HighPortUtilization and HighResourceUtilization events, enable performance polling.



Suppressing or Unsuppressing Events

You can suppress events that you do not want to monitor, either for an individual device or globally. You cannot suppress an event for an individual component; however, you can suppress a particular event for all components of a device, or for all devices.

You can suppress or unsuppress events using the following methods:

Suppressing or Unsuppressing Events from the User Interface

Suppressing or Unsuppressing Events from the Command Line


Note Operations Manager has event flood control that reduces the number of events processed until after event flooding has stopped. When Operations Manager detects a flood of events, it activates event flood control. After the flooding stops, events are processed normally. For more details on how Operations Manager manages flooding, see Understanding Event Flooding Control.


Suppressing or Unsuppressing Events from the User Interface

To configure events by suppressing or unsuppressing them, perform these steps.


Step 1 If you have installed Operations Manager in a directory other than the default installation directory (C:\PROGRA~1\CSCOpx), then you must edit the \bin\suppressevent.bat file to set NMSROOT to the directory where you installed Operations Manager. For example, set NMSROOT=F:\CSCOpx.

Step 2 Open a command prompt and run NMSROOT\bin\suppressevent.bat.

A list of options appears.

Step 3 Type the letter corresponding to the task you wish to perform. To suppress an event, type A.

A new list of options appears. See Table E-3.

Table E-3 Event Suppression Script Details

Required Data
Description

Managed Device Name

Enter the managed device name as shown in the Detailed Device View or the Alerts and Events display. To suppress or unsuppress a particular event for all devices, type ALL.

Event Name

Enter the event name that you want to suppress or unsuppress. Be sure to enter uppercase and lowercase characters exactly as they appear in the event name. If event names have spaces in them, enter the event name inside double quotes. If the event name is misspelled or capitalization is incorrect, the event will not be suppressed/unsuppressed. See Table E-1 for the list of event names.

If you enter the event name correctly, future events are suppressed or unsuppressed for the device indicated.


Step 4 Enter the managed device name and event name you want to suppress. See Table E-3 for details.

Step 5 Restart the daemon manager from the command line for changes to take effect. Enter:

net stop crmdmgtd 
net start crmdmgtd 

Step 6 To unsuppress an event, type B.

A list of options appears. See Table E-3.

Step 7 Restart the daemon manager from the command line for changes to take effect. Enter:

net stop crmdmgtd 
net start crmdmgtd 

Step 8 To view the list of suppressed events, type C.

The list of all the events that are being suppressed by Operations Manager is displayed.

Step 9 To quit the script, type Q.

The command window closes.
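The daemon manager restart in Steps 5 and 7 can be scripted. The following is a minimal Python sketch, assuming it runs on the Operations Manager server with sufficient privileges. Only the `net stop crmdmgtd` and `net start crmdmgtd` commands come from the steps above; the function names and the injectable `run` parameter are illustrative assumptions.

```python
import subprocess
from typing import Callable, List, Sequence

# Service name taken from Steps 5 and 7 above.
DAEMON_MANAGER_SERVICE = "crmdmgtd"


def restart_commands(service: str = DAEMON_MANAGER_SERVICE) -> List[List[str]]:
    """Return the stop and start commands in the order they must run."""
    return [["net", "stop", service], ["net", "start", service]]


def _run(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code (hypothetical default runner)."""
    return subprocess.run(list(cmd), check=False).returncode


def restart_daemon_manager(run: Callable[[Sequence[str]], int] = _run) -> List[int]:
    """Run 'net stop' and then 'net start'; return both exit codes."""
    return [run(cmd) for cmd in restart_commands()]
```

The sketch always attempts the start command, even when the stop command reports a failure, so that a service that was already stopped is still started. Check both exit codes before assuming that the change has taken effect.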


Suppressing or Unsuppressing Events from the Command Line

You can suppress or unsuppress events from the command line as follows. Change the default directory based on where you installed Operations Manager.

C:\PROGRA~1\CSCOpx\bin\suppressevent.bat [suppress/unsuppress] DeviceName EventName

To list the current suppressed events, enter:

C:\PROGRA~1\CSCOpx\bin\suppressevent.bat List


Note Event names with spaces in them require double quotes around them.
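If you call the script from automation, it helps to build the argument list in one place so that event names with spaces are always quoted. The following Python sketch shows one way to do this, assuming the default installation path shown above. The function names are hypothetical; the script path and the `suppress`, `unsuppress`, and `List` arguments are the ones given in this section.

```python
import subprocess
from typing import List, Sequence

# Default installation path from the text above; change it if NMSROOT differs.
DEFAULT_SCRIPT = r"C:\PROGRA~1\CSCOpx\bin\suppressevent.bat"


def build_suppress_command(action: str, device_name: str, event_name: str,
                           script: str = DEFAULT_SCRIPT) -> List[str]:
    """Build the argument list for suppressing or unsuppressing one event."""
    if action not in ("suppress", "unsuppress"):
        raise ValueError("action must be 'suppress' or 'unsuppress'")
    if not device_name or not event_name:
        raise ValueError("device name and event name are required")
    # Event names are case sensitive, so they are passed through unchanged.
    return [script, action, device_name, event_name]


def build_list_command(script: str = DEFAULT_SCRIPT) -> List[str]:
    """Build the argument list that lists the currently suppressed events."""
    return [script, "List"]


def to_command_line(args: Sequence[str]) -> str:
    """Render the arguments as one Windows command line, quoting spaces."""
    return subprocess.list2cmdline(list(args))
```

For example, `to_command_line(build_suppress_command("suppress", "ALL", "Code Yellow"))` yields a command line in which the event name appears as `"Code Yellow"`, inside double quotes.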


Understanding Event Flooding Control

When Operations Manager detects a flood of active events, it can reduce the number of active events processed by enabling event flood control. After event flooding stops, the events that were temporarily not monitored are again processed by Operations Manager.

To understand more about event flood control, see the following topics:

Event Flooding Rules

Viewing Event Flood Logs

Event Flooding Rules

Operations Manager detects flooding using the following rules. If the conditions of any one of these rules are met, Operations Manager treats the situation as an event flood.

For a specific device and a specific component, the same event occurs X number of times in Y number of minutes. The X and Y values are preconfigured in the system and stored in the database. Each event name has its own values.

For example, if an OperationallyDown event occurs for device 10.1.20.2 for Interface I/0 16 times in 2 minutes, then it is considered an event flood.

For a specific device, the same type of event occurs X number of times in Y number of minutes. The X and Y values are preconfigured in the system and stored in the database. Each event name has its own values.

For example, if an OperationallyDown event occurs for device 10.1.20.2 800 times in 4 minutes, then it is considered an event flood.

For a specific device, different types of events occur X number of times in Y number of minutes. The X and Y values are preconfigured in the system and stored in the database. The global setting for all devices is 1,000 events in 4 minutes.

Once an event flood is detected for a given event, the subsequent events are not monitored until the device rule indicates that the event flood has stopped. Flood control is performed only for active events. The other event states (Cleared, UserCleared, and Acknowledged) are not considered flooding.
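The "X events in Y minutes" rules can be pictured as a sliding time window kept per key. The following Python sketch illustrates the idea only; it is not the product's implementation. The class name, the key layout, and the choice to treat reaching X as a flood are assumptions, and the 16-in-2-minutes figures are borrowed from the example above rather than from the values stored in the database.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Hashable


class FloodDetector:
    """Sliding-window counter: flags a flood at max_events within the window."""

    def __init__(self, max_events: int, window_minutes: float) -> None:
        self.max_events = max_events
        self.window_seconds = window_minutes * 60
        self._seen: Dict[Hashable, Deque[float]] = defaultdict(deque)

    def record(self, key: Hashable, timestamp: float) -> bool:
        """Record one active event; return True if the key is now flooding.

        Timestamps are in seconds and are assumed to arrive in order.
        """
        times = self._seen[key]
        times.append(timestamp)
        # Discard events that have fallen out of the time window.
        while times and timestamp - times[0] >= self.window_seconds:
            times.popleft()
        return len(times) >= self.max_events


# Rule 1 keys on (device, component, event); rule 2 on (device, event);
# rule 3 on the device alone. Only active events would be recorded.
detector = FloodDetector(max_events=16, window_minutes=2)
key = ("10.1.20.2", "Interface I/0", "OperationallyDown")
flags = [detector.record(key, second * 5.0) for second in range(16)]
```

In this run the sixteenth event, which arrives within two minutes of the first, is the one that returns True. Once older events age out of the window, the count drops and the key is no longer reported as flooding.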

When Operations Manager detects an event flood, it drops the affected events and logs them to the CSCOpx\log\CUOM\EPM\FloodDroppedEvents.log file. An example of an event that violates the event flood rule appears below:

2009|11:36:18.437|ERROR|Flood|EventBinder|EventFloodController|checkFlood|null|Minute:20762286 Received 28 in last 16 minutes. To suppress la-ccm-11.cisco.com^#!$la-ccm-11.cisco.comServerUnreachable
23 Jun 2009|11:36:18.437|ERROR|Flood|EventBinder|EPMPluginImpl|processEventFilteration|null|Flood has occured for device = la-ccm-11.cisco.com, component= la-ccm-11.cisco.com, eventName= ServerUnreachable, state= Active

An example of a dropped event appears below:

2009|12:16:03.984|ERROR|Event|EventBinder|EventBinder|processNormalizedEvent|null| Event Dropped for component: 172.25.109.221 ; Reason :-> Exceeded limit for eventStatus = Active
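The log entries above are pipe-delimited, so they can be summarized with a short script. The following Python sketch extracts the device, component, event name, and state from a "Flood has occured" entry. It assumes the nine-field layout seen in the examples; the field names `date`, `time`, and `level` are labels chosen for this sketch, not names defined by the product. The search string keeps the spelling used in the log.

```python
import re
from typing import Dict, Optional

# Matches "name = value" pairs separated by commas in the message text.
_PAIR = re.compile(r"(\w+)\s*=\s*([^,]+)")


def parse_flood_line(line: str) -> Optional[Dict[str, str]]:
    """Parse one flood entry; return None for any other kind of line."""
    fields = line.rstrip("\n").split("|", 8)
    # The spelling "occured" matches the log output shown above.
    if len(fields) != 9 or "Flood has occured" not in fields[8]:
        return None
    record = {"date": fields[0], "time": fields[1], "level": fields[2]}
    for name, value in _PAIR.findall(fields[8]):
        record[name] = value.strip()
    return record


sample = (
    "23 Jun 2009|11:36:18.437|ERROR|Flood|EventBinder|EPMPluginImpl|"
    "processEventFilteration|null|Flood has occured for device = "
    "la-ccm-11.cisco.com, component= la-ccm-11.cisco.com, "
    "eventName= ServerUnreachable, state= Active"
)
parsed = parse_flood_line(sample)
```

Lines that do not describe a detected flood, such as the dropped-event entry above, return None and can be counted separately.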

For more details about how Operations Manager handles events during high CPU utilization, see Event Processing for the Alerts and Events Display During High CPU Utilization, page 3-10.

Viewing Event Flood Logs

During event flooding, check the following files in CSCOpx\log\cuom\EPM for event details:

1. FloodDroppedEvents.log: An entry appears in this file when events violate the flood control rules.

2. EPMDroppedEvents.log: An entry appears in this file when the number of events exceeds the product limits.

3. ClearedDroppedEvents.log: An entry appears in this file when the number of events exceeds the product limits and the dropped events are Cleared events that cannot be processed.
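To see quickly which of these files contain entries, you can count their lines. The following Python sketch does this under the assumption that each entry occupies one line; the function name is hypothetical, and the log directory must be supplied by the caller because it depends on where Operations Manager is installed.

```python
from pathlib import Path
from typing import Dict, Optional

# File names taken from the list above.
FLOOD_LOGS = (
    "FloodDroppedEvents.log",
    "EPMDroppedEvents.log",
    "ClearedDroppedEvents.log",
)


def count_log_entries(log_dir: Path) -> Dict[str, Optional[int]]:
    """Return the number of non-empty lines per log, or None if it is missing."""
    counts: Dict[str, Optional[int]] = {}
    for name in FLOOD_LOGS:
        path = Path(log_dir) / name
        if not path.is_file():
            counts[name] = None
            continue
        # Tolerate unexpected bytes rather than failing on a partly written log.
        with path.open(encoding="utf-8", errors="replace") as handle:
            counts[name] = sum(1 for line in handle if line.strip())
    return counts
```

A result of None means that the file does not exist in the given directory, which is different from a file that exists but has no entries.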

For more details on notifications about EPM log updates, see Notifying Users About EPM Log Updates, page 15-25.