Voice Faults and Exceptions
A fault is an abnormal condition that occurs when a system or a system component violates a performance threshold or is not functioning properly. An exception is a group of related faults.
VHM groups related faults into a single exception. That is, it generates a single exception of a given type per device, regardless of the number of faults that exist.
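The grouping rule can be pictured with a short sketch. This is only an illustration of the "one exception of a given type per device" behavior described above, not VHM's implementation; the `FAULT_TO_EXCEPTION` mapping, the `group_faults` function, and the device names are hypothetical, although the fault and exception names come from the fault tables in this chapter.

```python
from collections import defaultdict

# Hypothetical lookup from fault name to exception type.
# The names are taken from the fault tables; the dictionary itself is an assumption.
FAULT_TO_EXCEPTION = {
    "High Processor Utilization": "Resource exception",
    "Insufficient Free Hard Disk Space": "Resource exception",
    "Unresponsive": "Operational exception",
    "Interface Operationally Down": "Operational exception",
}


def group_faults(faults):
    """Collapse (device, fault) pairs into one exception per device and type."""
    exceptions = defaultdict(list)
    for device, fault in faults:
        exception_type = FAULT_TO_EXCEPTION[fault]
        exceptions[(device, exception_type)].append(fault)
    return dict(exceptions)


# Hypothetical devices: three faults on mcs-1 and one on mcs-2.
faults = [
    ("mcs-1", "High Processor Utilization"),
    ("mcs-1", "Insufficient Free Hard Disk Space"),
    ("mcs-1", "Unresponsive"),
    ("mcs-2", "Unresponsive"),
]
grouped = group_faults(faults)
```

In this sketch, the two resource faults on `mcs-1` end up under a single resource exception, while its `Unresponsive` fault is reported under a separate operational exception.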
The following topics are discussed:
• Overview of Faults and Exceptions
• Media Server Faults
• ICS 7750 Faults
• Voice Gateway and Phone Access Switch Faults
• Voice Mail Gateway Faults
• Monitored Phone Faults
• Gatekeeper Fault
• Suspect Phone Fault
• Voice Cluster Faults
• General VHM Faults
• Pass-Through Traps
Overview of Faults and Exceptions
By polling SNMP MIBs and subscribing to voice-related events received by DFM, VHM obtains event information to analyze, and generates faults for voice-enabled devices.
Users can review a summary of faults on the Real-Time Dashboard (see the "Using the Real-Time Dashboard" section) and view the generated alarms on the Monitoring Console (see the "Using the Monitoring Console" section).
For additional overview information, see the following topics:
• Device Types that Generate Faults
• Contents of the Fault Tables
Device Types that Generate Faults
VHM generates faults for the voice device groups that comprise the following types of voice-enabled devices:
• Voice Clusters
• Media Servers
• ICS 7750
• Voice Gateways—VHM obtains some of the event information for voice gateways from DFM.
• Phone Access Switches—VHM obtains some of the event information for Phone Access Switches from DFM.
• Voice Mail Gateways
• Monitored Phones
Contents of the Fault Tables
Fault tables include the following types of information:
• Managed entity type where faults can be detected.
• Faults that are detected.
• User-configurable thresholds that are used to define the tolerance limits of each fault condition.
• Exceptions raised by VHM when a fault condition exceeds thresholds.
Media Server Faults
Table 3-1 lists the media server faults diagnosed by VHM.
Note: IBM environment attributes (temperature, fan, and power supply) are not supported.
Table 3-1 Media Server Faults
Managed Entity | Faults | Thresholds | Notification
Media Convergence Server (MCS) | Unresponsive; SNMP Agent Not Responding | | Operational exception
Media Convergence Server (MCS) | High Processor Utilization; Insufficient Free Hard Disk Space; Insufficient Free Physical Memory; Insufficient Free Virtual Memory | ProcessUtilizationThreshold; FreeHardDiskThreshold; FreePhysicalMemoryThreshold; FreeVirtualMemoryThreshold | Resource exception
Media Convergence Server (MCS) | Temperature High; Temperature Sensor Down; Temperature Sensor Degraded; Fan Down; Fan Degraded | TemperatureCelsiusThreshold | Temperature exception
Media Convergence Server (MCS) | Power Supply Down; Power Supply Degraded | | Power supply exception
Interface | Interface Operationally Down | | Operational exception
Application (Cisco CallManager, Workflow Application, Database Server, Conference Bridge, TFTP Server) | Transaction Failed | | Application monitor exception
Application (Cisco CallManager, Workflow Application, Database Server, Conference Bridge, TFTP Server) | Too Many Failed Synthetic Transactions | FailureThreshold | Application monitor exception
Application (Cisco CallManager, Workflow Application, Database Server, Conference Bridge, TFTP Server) | Application Down | | Application exception
Cisco CallManager | CallManager Down | | Application exception
Cisco CallManager (release 3.1 and 3.2 only) | Discovery Failed | |
Cisco CallManager (release 3.2 only) | TooManySuspectPhones | |
Remote Insight Board (RIB); applies to MCS-7830 models only | Battery Low; Battery Failed; Battery Disconnected | BatteryPercentChargedThreshold | Power supply exception
Media server faults are described in more detail, grouped by notification type:
• Operational Exceptions
• Resource Exceptions
• Temperature Exceptions
• Power Supply Exceptions
• Application Exceptions
• Application Monitor Exceptions
Operational Exceptions
VHM generates an operational exception for multiple occurrences of the following faults (for a complete list of media server faults, see Table 3-1):
• Unresponsive—The device is unreachable from the DFM server.
• Interface Operationally Down—The interface is nonoperational.
Resource Exceptions
VHM generates a resource exception for multiple occurrences of the following faults (for a complete list of media server faults, see Table 3-1):
• High Processor Utilization—The processor utilization exceeds the threshold. ProcessUtilizationThreshold defines the upper limit for CPU utilization and is expressed as a percentage of total CPU capacity.
• Insufficient Free Hard Disk Space—The free disk space is less than the low free disk space threshold (FreeHardDiskThreshold).
• Insufficient Free Physical Memory—The system is running out of memory resources; the free physical memory is less than the FreePhysicalMemoryThreshold value.
• Insufficient Free Virtual Memory—The system is running out of virtual memory resources; the free virtual memory is less than the FreeVirtualMemoryThreshold value.
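The resource checks above reduce to simple comparisons against the thresholds. The following is a minimal sketch under stated assumptions, not VHM code: the `resource_faults` function and the sample keys are hypothetical, and the free disk and memory values are assumed to be expressed in the same units as their thresholds (the units are not specified here). Only the threshold and fault names come from Table 3-1.

```python
def resource_faults(sample, thresholds):
    """Return the resource faults raised by one polled sample (illustrative only)."""
    faults = []
    # CPU utilization is an upper limit: the fault fires when the value exceeds it.
    if sample["cpu_percent"] > thresholds["ProcessUtilizationThreshold"]:
        faults.append("High Processor Utilization")
    # The free-resource thresholds are lower limits: faults fire when the value drops below them.
    if sample["free_disk"] < thresholds["FreeHardDiskThreshold"]:
        faults.append("Insufficient Free Hard Disk Space")
    if sample["free_physical_memory"] < thresholds["FreePhysicalMemoryThreshold"]:
        faults.append("Insufficient Free Physical Memory")
    if sample["free_virtual_memory"] < thresholds["FreeVirtualMemoryThreshold"]:
        faults.append("Insufficient Free Virtual Memory")
    return faults


# Hypothetical threshold values and sample, chosen only for the example.
thresholds = {
    "ProcessUtilizationThreshold": 90,
    "FreeHardDiskThreshold": 10,
    "FreePhysicalMemoryThreshold": 15,
    "FreeVirtualMemoryThreshold": 15,
}
sample = {
    "cpu_percent": 95,
    "free_disk": 40,
    "free_physical_memory": 12,
    "free_virtual_memory": 30,
}
detected = resource_faults(sample, thresholds)
```

With these example values, the sample raises High Processor Utilization and Insufficient Free Physical Memory, which VHM would report under one resource exception for the device.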
Temperature Exceptions
VHM generates a temperature exception for multiple occurrences of the following faults (for a complete list of media server faults, see Table 3-1):
• System Temperature Sensor Down/Degraded—The temperature sensor is reporting abnormal temperature measurements. Possible conditions are OK, Degraded, and Failed.
• Temperature High—The operating temperature is higher than the threshold.
• Fan Down/Degraded—The system fan condition is not normal. The possible conditions are OK, Degraded, and Failed.
Power Supply Exceptions
VHM generates a power supply exception for multiple occurrences of the following faults (for a complete list of media server faults, see Table 3-1):
• Battery Low/Failed—Remote Insight Board battery status is not normal. It is either Low or Failed.
• Battery Disconnected—Remote Insight Board battery is disconnected.
• Power Supply Down/Degraded—The power supply is not in a normal state. The possible states are OK, Degraded, and Failed.
Application Exceptions
VHM generates an application exception for multiple occurrences of the following faults (for a complete list of media server faults, see Table 3-1):
• CallManager Down—Cisco CallManager is not running.
• Application Down—Application is not running.
Application Monitor Exceptions
VHM generates an application monitor exception for multiple occurrences of the following fault (for a complete list of media server faults, see Table 3-1):
• Transaction Failed—Synthetic transactions on this application were unsuccessful.
ICS 7750 Faults
Table 3-2 lists the ICS 7750 faults diagnosed by VHM.
Table 3-2 ICS 7750 Faults
Managed Entity | Faults | Thresholds | Notification
ICS 7750 | Unresponsive | | Operational exception
ICS 7750 | Power Supply Down | | Power supply exception
ICS 7750 | Fan Down | | Temperature exception
SPEs | CallManager Down | | Application exception
SPEs | Insufficient Free Disk Space; Insufficient Free Virtual Memory | FreeHardDiskThreshold; FreeVirtualMemoryThreshold | Resource exception
SPEs | Unresponsive | | Operational exception
Multiservice Route Processor (MRP) | High Utilization; Insufficient Free Memory | ProcessUtilizationThreshold; FreePhysicalMemoryThreshold | Resource exception
Multiservice Route Processor (MRP) | Interface Operationally Down; Unresponsive | | Operational exception
System Switch Processor (SSP) | High Utilization; Insufficient Free Memory | ProcessUtilizationThreshold; FreePhysicalMemoryThreshold |
System Switch Processor (SSP) | Interface Operationally Down; Unresponsive | | Operational exception
ICS 7750 faults are described in more detail, grouped by notification type, in the following:
• Operational Exceptions
• Resource Exceptions
• Power Supply Exceptions
• Application Exceptions
• Temperature Exceptions
Operational Exceptions
VHM generates an operational exception for multiple occurrences of the following faults (for a complete list of ICS 7750 faults, see Table 3-2):
• Unresponsive—If one of the ICS 7750 entities (for example, the media server, trunk card, or BPS) is down, a fault is generated.
• Interface Operationally Down—The interface status is down.
Resource Exceptions
VHM generates a resource exception for multiple occurrences of the following faults (for a complete list of ICS 7750 faults, see Table 3-2):
• High Utilization—The processor utilization exceeds the processor utilization threshold. ProcessUtilizationThreshold defines the upper limit for CPU utilization and is expressed as a percentage of total CPU capacity.
• Insufficient Free Disk Space—The free disk space is less than the low free disk space threshold (FreeHardDiskThreshold).
• Insufficient Free Memory—The system is running out of memory resources; the free physical memory is less than FreePhysicalMemoryThreshold.
• Insufficient Free Virtual Memory—The system is running out of virtual memory resources; the free virtual memory is less than FreeVirtualMemoryThreshold.
Power Supply Exceptions
VHM generates a power supply exception for multiple occurrences of the following faults (for a complete list of ICS 7750 faults, see Table 3-2):
• Power Supply State Down/Degraded—The power supply is not in a normal state. The possible states are OK, Degraded, and Failed.
Application Exceptions
VHM generates an application exception for multiple occurrences of the following fault (for a complete list of ICS 7750 faults, see Table 3-2):
• CallManager Down—Cisco CallManager is not running.
Temperature Exceptions
VHM generates a temperature exception for multiple occurrences of the following fault (for a complete list of ICS 7750 faults, see Table 3-2):
• Fan Down/Degraded—The system fan condition is not normal. The possible conditions are OK, Degraded, and Failed.
Voice Gateway and Phone Access Switch Faults
Table 3-3 displays the Voice Gateway and Phone Access Switch faults diagnosed by DFM and further processed by VHM.
Table 3-3 Voice Gateway and Phone Access Switch Faults
Managed Entities | Faults | Thresholds | Notification
Digital Voice Gateway | Interface Operationally Down | | Operational exception
Digital Voice Gateway | Lost Contact with Cluster | | Connectivity exception
Voice Gateway, Phone Access Switch | Unresponsive; Unresponsive (SNMP agent); Voice Port Operationally Down; Interface Operationally Down; Voice Port Administratively Down; Interface Administratively Down; Phone Removed; Card Down | | Operational exception
Voice Gateway, Phone Access Switch | High Utilization (CPU); Insufficient Free Memory | ProcessUtilizationThreshold; FreePhysicalMemoryThreshold | Resource exception
Voice Gateway, Phone Access Switch | Temperature Sensor Degraded; Temperature Sensor Down; Fan Down; Fan Degraded | | Temperature exception
Voice Gateway, Phone Access Switch | Power Supply Degraded; Power Supply Down | | Power supply exception
Voice Gateway, Phone Access Switch | Port Lost Contact with Cluster; Gateway Lost Contact with Cluster; Voice Interface Lost Contact with Cluster; Voice Card Lost Contact with Cluster | | Connectivity exception
Voice Gateway and Phone Access Switch faults are described in more detail, grouped by notification type, in the following:
• Operational Exceptions
• Resource Exceptions
• Temperature Exceptions
• Power Supply Exceptions
• Connectivity Exceptions
Operational Exceptions
VHM generates an operational exception for multiple occurrences of the following faults (for a complete list of Voice Gateway and Phone Access Switch faults, see Table 3-3):
• Unresponsive—The device is not reachable. The ICMP pings sent by the VHM server timed out without a response.
• Unresponsive (SNMP agent)—The device is not responding to SNMP requests. ICMP pings succeed, but SNMP requests time out.
• Interface Operationally Down—A voice interface is down.
• Interface Administratively Down—A voice interface is down.
• Voice Port Operationally Down—A voice port is down.
• Voice Port Administratively Down—A voice port is down.
• Phone Removed—An IP phone lost its network connection to the switch. This fault occurs only during rediscovery of the switch (through either manual rediscovery or nightly inventory collection).
• Card Down—A voice card is down.
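The difference between the two Unresponsive faults above comes down to which probe fails. The sketch below illustrates that decision only; `classify_reachability` and its boolean inputs are hypothetical names, and the code does not send real ICMP or SNMP traffic.

```python
def classify_reachability(icmp_ok, snmp_ok):
    """Map probe results to a fault name, or None when the device responds to both."""
    if not icmp_ok:
        # Pings timed out, so the device itself is considered unreachable.
        return "Unresponsive"
    if not snmp_ok:
        # Pings succeed but SNMP requests time out, so only the agent is at fault.
        return "Unresponsive (SNMP agent)"
    return None


ping_failed = classify_reachability(icmp_ok=False, snmp_ok=False)
agent_failed = classify_reachability(icmp_ok=True, snmp_ok=False)
healthy = classify_reachability(icmp_ok=True, snmp_ok=True)
```

Checking ICMP first mirrors the descriptions above: an SNMP timeout is attributed to the agent only when the device still answers pings.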
Resource Exceptions
VHM generates a resource exception for multiple occurrences of the following faults (for a complete list of Voice Gateway and Phone Access Switch faults, see Table 3-3):
• High Utilization—The processor utilization exceeds the CPU utilization threshold. ProcessUtilizationThreshold defines the upper limit for CPU utilization and is expressed as a percentage of total CPU capacity.
• Insufficient Free Memory—The system is running out of memory resources; the free physical memory is less than the FreePhysicalMemoryThreshold value.
Temperature Exceptions
VHM generates a temperature exception for multiple occurrences of the following faults (for a complete list of Voice Gateway and Phone Access Switch faults, see Table 3-3):
• Temperature Sensor Degraded—The temperature sensor condition is Degraded.
• Temperature Sensor Down—The temperature sensor condition is Failed.
• Fan Degraded—The fan condition is Degraded.
• Fan Down—The fan condition is Failed.
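The temperature faults above follow a regular pattern: a Degraded condition yields a "Degraded" fault and a Failed condition yields a "Down" fault. A minimal sketch of that mapping follows; the `condition_fault` function is a hypothetical helper written for illustration, and treating any other condition as "no fault" is an assumption.

```python
def condition_fault(component, condition):
    """Translate a reported condition into a fault name, or None when no fault applies."""
    if condition == "Degraded":
        return f"{component} Degraded"
    if condition == "Failed":
        return f"{component} Down"
    # Assumption: any other condition (for example, OK) raises no fault.
    return None


fan_fault = condition_fault("Fan", "Failed")
sensor_fault = condition_fault("Temperature Sensor", "Degraded")
no_fault = condition_fault("Fan", "OK")
```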
Power Supply Exceptions
VHM generates a power supply exception for multiple occurrences of the following faults (for a complete list of Voice Gateway and Phone Access Switch faults, see Table 3-3):
• Power Supply Degraded—The power supply is not in a normal state. The state is Degraded.
• Power Supply Down—The power supply is not in a normal state. The state is Down.
Connectivity Exceptions
VHM generates a connectivity exception for multiple occurrences of the following faults (for a complete list of Voice Gateway and Phone Access Switch faults, see Table 3-3):
• Lost Contact with Cluster—A digital voice interface lost registration with a Cisco CallManager cluster.
• Port Lost Contact with Cluster—A voice port lost registration with a Cisco CallManager cluster.
• Gateway Lost Contact with Cluster—A voice gateway lost registration with a Cisco CallManager cluster.
• Voice Interface Lost Contact with Cluster—A voice interface lost registration with a Cisco CallManager cluster.
• Voice Card Lost Contact with Cluster—A voice card lost registration with a Cisco CallManager cluster.
Voice Mail Gateway Faults
Table 3-4 displays the Voice Mail Gateway faults diagnosed by DFM and further processed by VHM.
Table 3-4 Voice Mail Gateway Faults
Managed Entities | Faults | Thresholds | Notification
Voice Mail Gateways | Unresponsive; Interface Operationally Down; Interface Administratively Down | | Operational exception
Voice Mail Gateways | Port Lost Contact with Cluster | | Connectivity exception
Voice Mail Gateways | DPA Port CallManager Link Down | | DPA CallManager link exception
Voice Mail Gateways | DPA Port Telephony Link Down | | DPA telephony link exception
Voice Mail Gateways | High Utilization (CPU) | ProcessUtilizationThreshold | Resource exception
Voice Mail Gateway faults are described in more detail, grouped by notification type, in the following:
• Operational Exceptions
• Resource Exceptions
• Connectivity Exceptions
• Other Exceptions
Operational Exceptions
VHM generates an operational exception for multiple occurrences of the following faults (for a complete list of Voice Mail Gateway faults, see Table 3-4):
• Unresponsive—The device is not reachable.
• Interface Operationally Down—A voice interface is down.
• Interface Administratively Down—A voice interface is down.
Resource Exceptions
VHM generates a resource exception for multiple occurrences of the following faults (for a complete list of Voice Mail Gateway faults, see Table 3-4):
• High Utilization—The processor utilization exceeds the CPU utilization threshold. ProcessUtilizationThreshold defines the upper limit for CPU utilization and is expressed as a percentage of total CPU capacity.
Connectivity Exceptions
VHM generates a connectivity exception for multiple occurrences of the following fault (for a complete list of Voice Mail Gateway faults, see Table 3-4):
• Port Lost Contact with Cluster—The DPA port lost contact with the cluster.
Other Exceptions
VHM generates exceptions for multiple occurrences of the following faults (for a complete list of Voice Mail Gateway faults, see Table 3-4):
• DPA Port CallManager Link Down—There is no connectivity between the DPA port and the CallManager.
• DPA Port Telephony Link Down—There is no connectivity between the DPA port and the Octel voice mail.
Monitored Phone Faults
Table 3-5 displays the Monitored Phone faults diagnosed by DFM and further processed by VHM.
Table 3-5 Monitored Phone Faults
Managed Entities | Faults | Thresholds | Notification
Monitored Phones | Unresponsive | | Operational exception
Monitored Phones | Monitored Phone Lost Contact with Cluster | | Connectivity exception
Monitored Phones | Extension Number Removed | |
Monitored Phones | Phone Discovery Error | |
Monitored Phone faults are described in more detail, grouped by notification type, in the following:
• Operational Exceptions
• Connectivity Exceptions
Operational Exceptions
VHM generates an operational exception for multiple occurrences of the following faults (for a complete list of Monitored Phone faults, see Table 3-5):
• Unresponsive—The phone is not reachable. The ICMP pings sent by the VHM server timed out without a response.
Connectivity Exceptions
VHM generates a connectivity exception for multiple occurrences of the following fault (for a complete list of Monitored Phone faults, see Table 3-5):
• Monitored Phone Lost Contact with Cluster—The monitored phone lost contact with all Cisco CallManagers in the cluster.
Gatekeeper Fault
Table 3-6 displays the gatekeeper fault diagnosed by DFM and further processed by VHM.
Table 3-6 Gatekeeper Faults
Managed Entities | Faults | Thresholds | Notification
Gatekeeper | Gatekeeper Lost Contact with Cluster | | Connectivity exception
Connectivity Exceptions
VHM generates a connectivity exception for multiple occurrences of the following fault:
• Gatekeeper Lost Contact with Cluster—Gatekeeper lost registration with the Cisco CallManager cluster.
Suspect Phone Fault
Table 3-7 displays the suspect phone fault diagnosed by DFM and further processed by VHM.
Table 3-7 Suspect Phone Faults
Managed Entities | Faults | Thresholds | Notification
Suspect Phone | Suspect Phone Detected | | Operational exception
Operational Exceptions
VHM generates an operational exception for multiple occurrences of the following fault:
• Suspect Phone Detected—The phone cannot register with a Cisco CallManager.
Voice Cluster Faults
Table 3-8 displays voice cluster faults diagnosed by DFM and further processed by VHM.
Table 3-8 Voice Cluster Faults
Managed Entities | Faults | Thresholds | Notification
Voice Cluster | Too Many Inactive Phones | InactivePhoneThreshold | Operational exception
Voice Cluster | CCM HTTP Service Down | |
Operational Exceptions
VHM generates an operational exception for multiple occurrences of the following faults:
• Too Many Inactive Phones—The number of inactive phones exceeds the phone threshold. InactivePhoneThreshold is expressed as a percentage of the total phones connected to a Cisco CallManager cluster.
• CCM HTTP Service Down—VHM cannot use HTTP service to communicate with all Cisco CallManagers in the cluster.
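Because InactivePhoneThreshold is a percentage of the total phones in the cluster, the Too Many Inactive Phones check above can be worked through with a small calculation. The sketch below is illustrative only: the `too_many_inactive_phones` function, its handling of an empty cluster, and the sample numbers are assumptions.

```python
def too_many_inactive_phones(inactive, total, threshold_percent):
    """Return True when the inactive share of phones exceeds the threshold percentage."""
    if total == 0:
        # Assumption: a cluster with no phones never raises this fault.
        return False
    inactive_percent = inactive / total * 100
    return inactive_percent > threshold_percent


# 30 of 200 phones inactive is 15 percent, which exceeds a hypothetical 10 percent threshold.
raised = too_many_inactive_phones(inactive=30, total=200, threshold_percent=10)
# 5 of 200 phones inactive is 2.5 percent, which stays under the same threshold.
not_raised = too_many_inactive_phones(inactive=5, total=200, threshold_percent=10)
```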
General VHM Faults
The following are general faults that VHM displays:
• DFM Server Down—VHM lost contact with the DFM server.
• Synthetic Transaction Server Down—VHM lost contact with the Synthetic Transaction server.
• ESS Connectivity Lost—VHM cannot communicate with the ESS bus.
• VHM Domain Connectivity Lost—VHM lost contact with the domain.
• Discovery Error—Discovery did not complete.
Pass-Through Traps
Table 3-9 lists the pass-through traps that VHM processes for Cisco CallManager.
Table 3-9 Pass-Through Traps—Cisco CallManager
Pass-Through Trap | Description
CCMGateWayFailedException | A gateway has failed in its attempt to register or communicate with a Cisco CallManager.
CCMMediaResourceListExhaustedException | Cisco CallManager has run out of resources.
CCMCallManagerFailedException | Cisco CallManager detects a failure in one of its critical subsystems.
CCMGatewayLayer2ChangeException | The D-Channel/Layer 2 of an interface in a digital gateway that is registered with Cisco CallManager changes state.
Table 3-10 lists the pass-through traps that VHM processes for Media Servers (IBM systems).
Table 3-10 Pass-Through Traps—Media Servers (IBM systems)
Pass-Through Trap | Description
IBMFanEventException | A fan is down.
IBMVoltageEventException | The voltage is not correct.
IBMTemperatureEventException | The temperature is high.
Table 3-11 lists the pass-through traps that VHM processes for voice services.
Table 3-11 Pass-Through Traps—Voice Services
Pass-Through Trap | Description
VoiceServiceModuleStopException | An application module or subsystem has stopped.
VoiceServiceModuleStartException | An application module or subsystem has successfully started and has transitioned to in-service state.
VoiceServiceRunTimeFailureException | A run-time failure has occurred.
VoiceServiceProcessStartException | A process has just started.
VoiceServiceProcessStopException | A process has just stopped.