Cisco Unified Real-Time Monitoring Tool Administration Guide for Cisco Unified Contact Center Express and Cisco Unified IP IVR Release 9.0(1)
Alerts

RTMT alerts

The system generates alert messages to notify the administrator when a predefined condition is met, such as when an activated service goes from up to down. The system can send alerts as email or epage.

RTMT, which supports alert defining, setting, and viewing, contains preconfigured and user-defined alerts. Although you can perform configuration tasks for both types, you cannot delete preconfigured alerts (whereas you can add and delete user-defined alerts). The Alert menu comprises the following menu options:

  • Alert Central—This option comprises the history and current status of every alert in the system.

    Note


    You can also access Alert Central by clicking the Alert Central icon in the hierarchy tree in the system drawer.


  • Set Alert/Properties—This menu category allows you to set alerts and alert properties.
  • Remove Alert—This menu category allows you to remove an alert.
  • Enable Alert—With this menu category, you can enable alerts.
  • Disable Alert—You can disable an alert with this category.
  • Suspend cluster/node Alerts—This menu category allows you to temporarily suspend alerts on a particular server or on an entire cluster (if applicable).
  • Clear Alerts—This menu category allows you to reset an alert (change the color of an alert item to black) to signal that an alert has been handled. After an alert has been raised, its color will automatically change in RTMT and will stay that way until you manually clear the alert.

    Note


    The manual clear alert action does not update the System cleared timestamp column in Alert Central. This column is updated only if the alert condition is cleared automatically.


  • Clear All Alerts—This menu category allows you to clear all alerts.
  • Reset all Alerts to Default Config—This menu category allows you to reset all the alerts to the default configuration.
  • Alert Detail—This menu category provides detailed information on alert events.
  • Config Email Server—In this category, you can configure your email server to enable alerts.
  • Config Alert Action—This category allows you to set actions to take for specific alerts; you can configure the actions to send the alerts to desired email recipients.

In RTMT, you configure alert notification for perfmon counter value thresholds and set alert properties for the alert, such as the threshold, duration, and frequency. RTMT predefined alerts are configured for perfmon counter value thresholds as well as for event (alarm) notifications.
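As a rough illustration of how a threshold and a duration might combine, the following Python sketch raises an alert only after a counter has stayed above its threshold for a set time. The class `ThresholdAlert`, its fields, and the sample values are hypothetical and are not part of RTMT; the sketch models the concept only, not the product's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThresholdAlert:
    """Hypothetical model of a counter-based alert; not an RTMT API."""
    threshold: float                      # raise when the counter value exceeds this
    duration_s: float                     # how long the breach must persist (0 = immediately)
    breach_start: Optional[float] = None  # time at which the current breach began

    def observe(self, now_s: float, value: float) -> bool:
        """Return True if the alert condition is met at time now_s."""
        if value <= self.threshold:
            self.breach_start = None      # condition cleared, so reset the timer
            return False
        if self.breach_start is None:
            self.breach_start = now_s     # first sample over the threshold
        return now_s - self.breach_start >= self.duration_s


# Made-up samples: (time in seconds, counter value in percent)
alert = ThresholdAlert(threshold=90.0, duration_s=60.0)
samples = [(0, 95.0), (30, 97.0), (60, 96.0), (90, 50.0), (120, 99.0)]
fired = [alert.observe(t, v) for t, v in samples]
print(fired)  # [False, False, True, False, False]
```

In this sketch the alert fires only at the 60-second sample, because the value dropped below the threshold at 90 seconds and the timer restarted.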

You can locate Alert Central under the Tools hierarchy tree in the quick launch. Alert Central provides both the current status and the history of all the alerts in the system.

Preconfigured and custom alerts

RTMT displays both preconfigured alerts and custom alerts in Alert Central. RTMT organizes the alerts under the applicable tabs—System, Unified CCX, and Custom.

You can enable or disable preconfigured and custom alerts in Alert Central; however, you cannot delete preconfigured alerts.

System alerts


Note


For alert descriptions and default configurations, see System Alert Descriptions and Default Configurations.


The following list comprises the preconfigured system alerts.

  • AuthenticationFailed
  • CiscoDRFFailure
  • CoreDumpFileFound
  • CpuPegging
  • CriticalServiceDown
  • HardwareFailure
  • LogFileSearchStringFound
  • LogPartitionHighWaterMarkExceeded
  • LogPartitionLowWaterMarkExceeded
  • LowActivePartitionAvailableDiskSpace
  • LowAvailableVirtualMemory
  • LowInactivePartitionAvailableDiskSpace
  • LowSwapPartitionAvailableDiskSpace
  • ServerDown
  • SparePartitionHighWaterMarkExceeded
  • SparePartitionLowWaterMarkExceeded
  • SyslogSeverityMatchFound
  • SyslogStringMatchFound
  • SystemVersionMismatched
  • TotalProcessesAndThreadsExceededThreshold

Unified CCX alerts

The following list comprises the preconfigured Unified CCX alerts.


Note


For alert descriptions and default configurations, see Cisco Unified Contact Center Express alert descriptions and default configurations.


  • DB CRA % Space Used
  • DBReplicationStopped
  • HistoricalDataWrittenToFiles
  • Intelligence Center CUIC_DATABASE_UNAVAILABLE
  • Intelligence Center CUIC_DB_REPLICATION_FAILED
  • Intelligence Center CUIC_REPORT_EXECUTION_FAILED
  • Intelligence Center CUIC_UNRECOVERABLE_ERROR
  • PurgeInvoked
  • UnifiedCCXEngineMemoryUsageHigh

Alert fields

You can configure both preconfigured and user-defined alerts in RTMT. You can also disable both preconfigured and user-defined alerts in RTMT. You can add and delete user-defined alerts in the performance-monitoring window; however, you cannot delete preconfigured alerts.


Note


Severity levels for Syslog entries match the severity level for all RTMT alerts. If RTMT issues a critical alert, the corresponding Syslog entry also specifies critical.


The following table provides a list of fields that you may use to configure each alert; users can configure preconfigured fields, unless otherwise noted.

Table 1 Alert customization

| Field | Description | Comment |
| --- | --- | --- |
| Alert Name | High-level name of the monitoring item with which RTMT associates an alert | Descriptive name. For preconfigured alerts, you cannot change this field. For a list of preconfigured alerts, see Preconfigured and custom alerts. |
| Description | Description of the alert | You cannot edit this field for preconfigured alerts. For a list of preconfigured alerts, see Preconfigured and custom alerts. |
| Performance Counter(s) | Source of the performance counter | You cannot change this field. You can associate only one instance of the performance counter with an alert. |
| Threshold | Condition to raise alert (value is...) | Specify up <-> down; less than #, %, rate; or greater than #, %, rate. This field is applicable only for alerts based on performance counters. |
| Value Calculated As | Method used to check the threshold condition | Specify value to be evaluated as absolute, delta (present - previous), or % delta. This field is applicable only for alerts based on performance counters. |
| Duration | Condition to raise alert (how long the value threshold has to persist before raising the alert) | Options include the system sending the alert immediately or after the alert condition has persisted for a specified time. This field is applicable only for alerts based on performance counters. |
| Number of Events Threshold | Raise alert only when the number of events exceeds a configurable count within a configurable time interval (in minutes). | This field is applicable only for event-based alerts. |
| Node IDs | Cluster or list of servers to monitor | |
| Alert Action ID | ID of alert action to take (System always logs alerts no matter what the alert action.) | Alert action gets defined first (see Alert fields). If this field is blank, that indicates that email is disabled. |
| Enable Alerts | Enable or disable alerts. | Options include enabled or disabled. |
| Clear Alert | Resets alert (changes the color of an alert item to black) to signal that the alert has been resolved | After an alert has been raised, its color automatically changes and stays that way until you manually clear the alert. Use Clear All to clear all alerts. |
| Alert Details | Displays the detail of an alert (not configurable) | |
| Alert Generation Rate | How often to generate alert when alert condition persists | Specify every X minutes. (Raise alert once every X minutes if condition persists.) Specify every X minutes up to Y times. (Raise alert Y times every X minutes if condition persists.) |
| User Provide Text | Administrator to append text on top of predefined alert text | N/A |
| Severity | For viewing purposes (for example, show only Sev. 1 alerts) | Specify defaults that are provided for predefined (for example, Error, Warning, Information) alerts. |
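The three Value Calculated As options can be illustrated with a short Python sketch. The function name `evaluate` and the mode strings are hypothetical. The guide defines delta as present minus previous; it does not spell out the % delta formula, so the sketch assumes that % delta means the delta expressed as a percentage of the previous value.

```python
def evaluate(previous: float, present: float, mode: str) -> float:
    """Return the value that would be compared against the threshold.

    mode is one of "absolute", "delta", or "percent_delta" (hypothetical names).
    """
    if mode == "absolute":
        return present                      # use the current sample as-is
    if mode == "delta":
        return present - previous           # present - previous, as in the table
    if mode == "percent_delta":
        if previous == 0:
            raise ValueError("percent delta is undefined when previous is 0")
        # Assumption: delta relative to the previous sample, in percent
        return (present - previous) / previous * 100.0
    raise ValueError(f"unknown mode: {mode!r}")


# Made-up counter samples: previous poll 200, present poll 250
for mode in ("absolute", "delta", "percent_delta"):
    print(mode, evaluate(200.0, 250.0, mode))
```

With these sample values the three modes yield 250.0, 50.0, and 25.0 respectively, which shows why the same threshold number can behave very differently depending on the calculation method.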

Alert actions

In RTMT, you can configure alert actions for every alert that is generated and have the alert action sent to email recipients that you specify in the alert action list.

The following table provides a list of fields that you will use to configure alert actions. Users can configure all fields, unless otherwise marked.

Table 2 Alert action configuration

| Field | Description | Comment |
| --- | --- | --- |
| Alert Action ID | ID of alert action to take | Specify descriptive name. |
| Mail Recipients | List of email addresses. You can selectively enable/disable an individual email in the list. | N/A |

Trace download

Some preconfigured alerts allow you to initiate a trace download based on the occurrence of an event. You can automatically capture traces when a particular event occurs by checking the Enable Trace Download check box in Set Alert/Properties for the following alerts:

  • CriticalServiceDown: The CriticalServiceDown alert is generated when any service is down.

    Note

    The RTMT backend service checks service status every 30 seconds by default. If a service goes down and comes back up within that period, the CriticalServiceDown alert may not be generated.

    Note

    The CriticalServiceDown alert monitors only those services that are listed in RTMT Critical Services.

  • CoreDumpFileFound: The CoreDumpFileFound alert is generated when the RTMT backend service detects a new core dump file.

    Note

    You can configure both the CriticalServiceDown and CoreDumpFileFound alerts to download the corresponding trace files for troubleshooting purposes. This helps preserve trace files at the time of a crash.



Caution


Enabling Trace Download may affect services on the server. Configuring a high number of downloads will adversely impact the quality of services on the server.
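The note on the 30-second status check for CriticalServiceDown can be made concrete with a small Python sketch. It models a poller that samples service status at fixed instants and asks whether any sample falls inside an outage. The function `outage_detected` and its parameters are hypothetical, and the model assumes that polls occur exactly on a fixed schedule, which the guide does not state.

```python
import math


def outage_detected(down_at: float, up_at: float,
                    poll_interval: float = 30.0,
                    first_poll: float = 0.0) -> bool:
    """Return True if a poll instant falls inside the outage [down_at, up_at).

    Polls are assumed to occur at first_poll, first_poll + poll_interval, ...
    """
    # Index of the first poll at or after the moment the service went down
    k = max(math.ceil((down_at - first_poll) / poll_interval), 0)
    next_poll = first_poll + k * poll_interval
    return next_poll < up_at


# A 15-second outage between two polls is missed in this model
print(outage_detected(down_at=5.0, up_at=20.0))
# A 10-second outage that spans the poll at t=30 is seen
print(outage_detected(down_at=25.0, up_at=35.0))
```

In this model, an outage shorter than the poll interval is detected only if it happens to overlap a poll instant, which matches the behavior the note describes.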


Alert logs

The alert log stores each alert, and the alert is also held in memory. The memory is cleared at a constant interval, leaving the last 30 minutes of data in memory. When the service starts or restarts, the system loads the last 30 minutes of alert data into memory by reading the alert logs on the server or on all servers in the cluster (if applicable). The alert data in memory is sent to RTMT clients on request.

Upon startup, RTMT shows all logs from the last 30 minutes in the Alert Central log history. The alert log is updated periodically, and new logs are inserted into the log history window. After the number of logs reaches 100, RTMT removes the oldest 40 logs.
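The trimming rule for the log history window can be sketched in a few lines of Python. The function `add_log` and the constant names are hypothetical; the sketch assumes that trimming happens as soon as the history reaches 100 entries, which is one reading of the rule above.

```python
from typing import List

MAX_ENTRIES = 100   # from the text: trimming starts when the history reaches 100
DROP_OLDEST = 40    # from the text: the 40 oldest entries are removed


def add_log(history: List[str], entry: str) -> None:
    """Append a new log entry, then trim the oldest entries if the cap is hit."""
    history.append(entry)
    if len(history) >= MAX_ENTRIES:
        del history[:DROP_OLDEST]   # the oldest entries sit at the front of the list


history: List[str] = []
for i in range(100):
    add_log(history, f"alert-{i}")

print(len(history), history[0])  # 60 alert-40
```

Removing entries in a batch of 40, rather than one at a time, means the window is trimmed only once for every 40 new entries after the first trim.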

The following file name format for the alert log applies: AlertLog_MM_DD_YYYY_hh_mm.csv.

The alert log includes the following attributes:

  • Time Stamp—Time when RTMT logs the data
  • Alert Name—Descriptive name of the alert
  • Node—Server name for where RTMT raised the alert
  • Alert Message—Detailed description about the alert
  • Type—Type of the alert
  • Description—Description of the monitored object
  • Severity—Severity of the alert
  • PollValue—Value of the monitored object where the alert condition occurred
  • Action—Alert action taken
  • Group ID—Identifies the source of the alert

The first line of each log file is the header. The details of each alert are written on a single line, with fields separated by commas.
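Because the alert log is a CSV file with a header line and a timestamped file name, it can be read with standard tooling. The following Python sketch parses the AlertLog_MM_DD_YYYY_hh_mm.csv name and reads rows by header. The function names are hypothetical, the header strings are assumed to match the attribute names listed above, and the sample row is made up for illustration; actual log files may differ.

```python
import csv
import io
import re
from datetime import datetime
from typing import Dict, List

# File name format from the text: AlertLog_MM_DD_YYYY_hh_mm.csv
FILENAME_RE = re.compile(r"^AlertLog_(\d{2})_(\d{2})_(\d{4})_(\d{2})_(\d{2})\.csv$")


def parse_log_filename(name: str) -> datetime:
    """Extract the timestamp encoded in an alert log file name."""
    m = FILENAME_RE.match(name)
    if m is None:
        raise ValueError(f"not an alert log file name: {name!r}")
    month, day, year, hour, minute = map(int, m.groups())
    return datetime(year, month, day, hour, minute)


def read_alert_log(text: str) -> List[Dict[str, str]]:
    """Parse alert log text; the first line is the header, one alert per line."""
    return list(csv.DictReader(io.StringIO(text)))


# Assumed header and an invented sample row, for illustration only
sample = (
    "Time Stamp,Alert Name,Node,Alert Message,Type,Description,"
    "Severity,PollValue,Action,Group ID\n"
    "03/15/2013 09:30,CpuPegging,node1,CPU usage high,Performance,"
    "Processor,Critical,99,Default,System\n"
)
rows = read_alert_log(sample)
print(parse_log_filename("AlertLog_03_15_2013_09_30.csv"))
print(rows[0]["Alert Name"], rows[0]["Severity"])
```

Using `csv.DictReader` keeps the code independent of column order, since each value is looked up by its header name.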

Log Partition Monitoring

Log Partition Monitoring, which is installed automatically with the system, uses configurable thresholds to monitor the disk usage of the log partition on a server. The Cisco Log Partition Monitoring Tool service starts automatically after installation of the system.

Every 5 minutes, Log Partition Monitoring uses the following configured thresholds to monitor the disk usage of the log partition and the spare log partition on a server:

  • LogPartitionLowWaterMarkExceeded (% disk space): When the disk usage is above the percentage that you specify, LPM sends an alarm message to syslog and an alert to RTMT Alert Central. To save the log files and regain disk space, you can use the Trace and Log Central option in RTMT.
  • LogPartitionHighWaterMarkExceeded (% disk space): When the disk usage is above the percentage that you specify, LPM sends an alarm message to syslog and an alert to RTMT Alert Central.
  • SparePartitionLowWaterMarkExceeded (% disk space): When the disk usage is above the percentage that you specify, LPM sends an alarm message to syslog and an alert to RTMT Alert Central. To save the log files and regain disk space, you can use the Trace and Log Central option in RTMT.
  • SparePartitionHighWaterMarkExceeded (% disk space): When the disk usage is above the percentage that you specify, LPM sends an alarm message to syslog and an alert to RTMT Alert Central.

In addition, the Cisco Log Partition Monitoring Tool service checks the server every 5 seconds for newly created core dump files. If new core dump files exist, the service sends a CoreDumpFileFound alarm and an alert to Alert Central with information on each new core file.

To use Log Partition Monitoring, verify that the Cisco Log Partition Monitoring Tool service, a network service, is running in Cisco Unified Serviceability on the server or on each server in the cluster (if applicable). Stopping the service causes a loss of feature functionality.

When the Log Partition Monitoring service starts at system startup, the service checks the current disk space utilization. If the percentage of disk usage is above the low water mark but less than the high water mark, the service sends an alarm message to syslog and generates a corresponding alert in RTMT Alert Central.

To configure Log Partition Monitoring, set the alert properties for the LogPartitionLowWaterMarkExceeded and LogPartitionHighWaterMarkExceeded alerts in Alert Central. For more information, see Set alert properties.

To offload the log files and regain disk space on the server, collect the traces that you want to save by using the Real-Time Monitoring Tool.

If the percentage of disk usage is above the high water mark that you configured, the system sends an alarm message to syslog, generates a corresponding alert in RTMT Alert Central, and automatically purges log files until the value reaches the low water mark.


Note


Log Partition Monitoring automatically identifies the common partition that contains an active directory and an inactive directory. The active directory contains the log files for the currently installed version of the software (Unified CCX), and the inactive directory contains the log files for the previously installed version of the software. If necessary, the service deletes log files in the inactive directory first. The service then deletes log files in the active directory, starting with the oldest log file for every application, until the disk space percentage drops below the configured low water mark. The service does not send an email when Log Partition Monitoring purges the log files.
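The purge order described in this note can be sketched as a simple simulation in Python. The `LogFile` type, the `purge` function, and the sample sizes are hypothetical. The sketch makes three simplifying assumptions: partition usage consists only of the listed log files, inactive files are removed oldest first, and the per-application ordering mentioned in the note is ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LogFile:
    name: str
    size: int       # size in arbitrary units
    mtime: float    # modification time; a smaller value means an older file
    active: bool    # True if the file is in the active directory


def purge(files: List[LogFile], capacity: int, low_pct: float) -> List[str]:
    """Return names of files deleted until usage drops below low_pct percent."""
    # Inactive files sort first (False < True), then oldest first within each group
    remaining = sorted(files, key=lambda f: (f.active, f.mtime))
    used = sum(f.size for f in remaining)
    deleted: List[str] = []
    while remaining and used / capacity * 100 >= low_pct:
        victim = remaining.pop(0)
        used -= victim.size
        deleted.append(victim.name)
    return deleted


files = [
    LogFile("active_old.log", 20, mtime=1.0, active=True),
    LogFile("active_new.log", 20, mtime=2.0, active=True),
    LogFile("inactive.log", 30, mtime=3.0, active=False),
]
# Usage starts at 70 percent of a 100-unit partition
print(purge(files, capacity=100, low_pct=50.0))  # ['inactive.log']
print(purge(files, capacity=100, low_pct=30.0))  # ['inactive.log', 'active_old.log']
```

Note that the inactive file is deleted first even though it is the newest file, because the directory takes priority over file age in this ordering.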


After the system determines the disk usage and performs the necessary tasks (sending alarms, generating alerts, or purging logs), log partition monitoring continues at regular 5-minute intervals.