Troubleshooting Faults Using the Alarm Browser
This chapter explains how to monitor events that appear on the dashboard. Using these procedures, you can detect faults, view their details, and take the next steps to investigate and resolve the reported issues.
Depending on their type, root cause analysis (RCA) and service impact analysis (SIA) events are listed in the following views:
•Alarm Browser - All Events
•Alarm Browser - Root Cause Events
•Alarm Browser - Service Events
To add these portlets, see Adding a Portlet.
The workflows in this chapter describe how an operator can use RCA, SIA, or Other events in the Alarm Browser as a starting point for drilling down. Drilling down allows the operator to identify problems in the HCS system that need to be resolved. The following assumptions apply:
•Single workflow irrespective of the presence of RCA/SIA for certain failures.
•All event-driven workflows originate from notifications observed at the manager of managers in the service provider system and include a cross launch to Prime Central for HCS.
•The operator expects to see all events, with no events lost or filtered out.
At a high level, events are classified into three types. The following description of the various types of events helps you understand the events in the context of RCA and SIA workflows:
•Synthetic events are generated by Prime Central for HCS in response to raw events it receives. A synthetic event is a container event that represents a group of similar raw events. All contained raw events have the same EventTypeId, and the synthetic event's type is the EventTypeId of those raw events. For example, a synthetic event of type OM_CUCM_Processes is a container event that groups all CUCM Service Down events from a CUCM node. These Service Down events are all assigned an EventTypeId of OM_CUCM_Processes. Synthetic events are actionable, meaning you can use their data to decide on next steps. You can drill down further to see the symptomatic and raw events that led to the generation of a synthetic event. Synthetic RCA events can be viewed in Alarm Browser - All Events or in Alarm Browser - Root Cause Events.
•Service Events are generated when customer services, such as Voice or VoiceMail, are impacted. For example, an event is generated when the service state changes from Marginal to Down. These events are listed in Alarm Browser - All Events and can be filtered further using the Service Events filter. They are also available on the Service Availability portlet.
•Other Events are those that do not participate in RCA and SIA. This list contains all synthetic and raw events that do not participate in RCA and SIA. These events are displayed when you select Other Events from the Filter and View drop-down lists in the Alarm Browser - All Events view.
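The container relationship between synthetic and raw events described above can be sketched in a few lines of Python. This is only an illustration of the grouping idea, not Prime Central for HCS code: the RawEvent and SyntheticEvent classes, their field names, the node names, and the summaries are all hypothetical. Only the EventTypeId value OM_CUCM_Processes comes from the text.

```python
from dataclasses import dataclass, field


@dataclass
class RawEvent:
    node: str            # reporting device (hypothetical field name)
    event_type_id: str   # for example "OM_CUCM_Processes"
    summary: str


@dataclass
class SyntheticEvent:
    node: str
    event_type_id: str   # same EventTypeId as every contained raw event
    contained: list = field(default_factory=list)


def group_into_synthetic(raw_events):
    """Group raw events that share a node and EventTypeId into one container."""
    groups = {}
    for ev in raw_events:
        key = (ev.node, ev.event_type_id)
        if key not in groups:
            groups[key] = SyntheticEvent(ev.node, ev.event_type_id)
        groups[key].contained.append(ev)
    return list(groups.values())


# Hypothetical sample data: two Service Down events on one node, one on another.
raw = [
    RawEvent("cucm-node-1", "OM_CUCM_Processes", "Service Down: service A"),
    RawEvent("cucm-node-1", "OM_CUCM_Processes", "Service Down: service B"),
    RawEvent("cucm-node-2", "OM_CUCM_Processes", "Service Down: service A"),
]
synthetic = group_into_synthetic(raw)
```

With this sample data, the two events from cucm-node-1 end up inside a single container, which mirrors what Show Contained Events reveals when you drill down from a synthetic event.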
The following flowchart helps you understand the workflow of the events. Usually, an event falls under one of the following buckets:
•Participates only in RCA
•Participates only in SIA
•Participates in both RCA and SIA
•Others: does not participate in RCA or SIA.
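The four buckets above reduce to two yes/no questions about an event. A minimal Python sketch follows. The function name and the two boolean inputs are hypothetical, and how the product actually decides participation is not covered here.

```python
def classify_event(in_rca: bool, in_sia: bool) -> str:
    """Map RCA/SIA participation to one of the four buckets listed above."""
    if in_rca and in_sia:
        return "RCA and SIA"
    if in_rca:
        return "RCA only"
    if in_sia:
        return "SIA only"
    return "Others"


# Every combination of the two flags maps to exactly one bucket.
buckets = {
    (rca, sia): classify_event(rca, sia)
    for rca in (True, False)
    for sia in (True, False)
}
```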
When symptom and root-cause events arrive, a symptom event may be marked as a root-cause event until the actual root-cause event arrives. After the actual root-cause event arrives, the CauseType of the symptom event changes to symptom. The GUI reflects the change as soon as the updated CauseType arrives.
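The provisional marking described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the product's correlation logic: the Event class, the is_true_root_cause flag, the "rootcause" label, and the sample event names are all hypothetical. Only the CauseType field and the symptom value come from the text.

```python
from dataclasses import dataclass


@dataclass
class Event:
    name: str
    is_true_root_cause: bool       # hypothetical flag standing in for real correlation
    cause_type: str = "unknown"    # stands in for the CauseType field


def on_event_arrival(active, new_event):
    """Re-evaluate CauseType for all active events each time an event arrives."""
    active.append(new_event)
    root_present = any(e.is_true_root_cause for e in active)
    for e in active:
        if e.is_true_root_cause:
            e.cause_type = "rootcause"
        elif root_present:
            e.cause_type = "symptom"      # demoted once the real root cause exists
        else:
            e.cause_type = "rootcause"    # provisional, no real root cause yet
    return active


# Hypothetical sequence: the symptom arrives before its root cause.
active = []
symptom = Event("Phone unregistered", is_true_root_cause=False)
on_event_arrival(active, symptom)
provisional = symptom.cause_type          # provisional marking at this point

root = Event("Switch down", is_true_root_cause=True)
on_event_arrival(active, root)
final = symptom.cause_type                # updated after the root cause arrives
```

The point of the sketch is that an operator who looks at the list between the two arrivals sees a different CauseType from one who looks afterwards, so a root-cause marking on a very recent event may still change.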
As an operator, you may want to leverage the RCA/SIA features to understand more about the underlying devices and the faults. You may also want to view just the Raw Events, without any requirement to understand their RCA and SIA. The following sections explain the procedure you can adopt based on your requirements:
•Troubleshooting Faults Using RCA/SIA Events
•Troubleshooting Faults Using Other Events
•Viewing Events Based on Filter and View Settings
Troubleshooting Faults Using RCA/SIA Events
When an event participates in RCA or SIA or both, it is listed in the Alarm Browser - Root Cause Events or Alarm Browser - Service Events, respectively. You can filter based on the type and view the next steps. Events can be filtered for SIA only if the service impacted belongs to one specific customer.
Troubleshooting Faults Using RCA Events
The RCA events appear in the Alarm Browser - All Events portlet or in the Alarm Browser - Root Cause Events portlet that you added to your dashboard. The table contains synthetic events, which indicate the probable cause of failure for a range of raw and symptomatic events. The raw and symptomatic events are not listed; instead, a synthetic event, denoted by an EventTypeId, is generated and listed. However, the raw and symptomatic events remain available for further analysis as contained events.
Follow the steps below for troubleshooting using Synthetic RCA events:
Step 1 From the Alarm Browser - Root Cause Events or RootCauseEvents, right-click a synthetic event, and click Show Contained Events.
The raw events that triggered the synthetic event appear in the Contained Events list.
Step 2 Right-click a raw event, and click Event Details or DM Cross Launch.
If you clicked Event Details, you can further view the description and the next steps that you can perform.
If you clicked DM Cross Launch, the domain manager that reported the event launches.
Step 3 From the details that appear, you can take remedial steps to correct the fault.
Troubleshooting Faults Using SIA Events
When the state of a service changes, the SIA table lists the synthetic event that indicates the cause of the fault. The events are triggered when the state of a voice, voicemail, or presence service changes to up, marginal, or down.
Step 1 To troubleshoot using the Alarm Browser - Service Events portlet, select Assure > Alarm Browser - Service Events, or choose the ServiceEvents filter from the Filter drop-down list in the Alarm Browser portlet.
Step 2 From the View drop-down list, select ServiceEvents.
The list of current active service events appears.
Step 3 From the Service Impact Event List View, locate the impacted service.
Step 4 Launch the Service Availability view.
Step 5 Navigate to the bottom-most affected service and click the service.
The raw events that triggered the service event appear in the Service Details pane.
Step 6 Right-click the raw event to view further details.
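The trigger described at the start of this section, where a service event is raised whenever a service changes state, can be sketched as follows. The function name and the dictionary layout are hypothetical and do not reflect the product's internal event format. The state names and service names are taken from the text.

```python
VALID_STATES = ("Up", "Marginal", "Down")


def service_event_on_change(service, old_state, new_state):
    """Return a service event if the state changed, otherwise None."""
    for state in (old_state, new_state):
        if state not in VALID_STATES:
            raise ValueError(f"unknown service state: {state!r}")
    if old_state == new_state:
        return None  # no transition, so no service event
    return {"service": service, "from": old_state, "to": new_state}


# A Marginal-to-Down transition produces an event; an unchanged state does not.
event = service_event_on_change("Voice", "Marginal", "Down")
no_event = service_event_on_change("VoiceMail", "Up", "Up")
```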
Troubleshooting Faults Using Other Events
Some events do not participate in SIA or RCA. These raw and symptomatic events are listed in Alarm Browser - All Events and can be filtered further using the Other Events filter.
Step 1 To troubleshoot using OtherEvents, choose the OtherEvents filter from the Filter drop-down list in the Alarm Browser portlet.
Step 2 From the View drop-down list, select OtherEvents. A list of other events (the events that do not participate in RCA or SIA) appears.
It is important to monitor the events that appear in this view because not all events are processed for RCA and SIA.
Step 3 From the Alarm Browser - All Events or Other Events portlet, right-click the raw event and click Event Details to view the description or next steps.
Viewing Events Based on Filter and View Settings
Because there are various types of events, you might want to filter out those that are not of interest to you. Use the Filter Builder option available on the dashboard. For more information about Filter Builder, see Using the Filter Builder.
You can choose which fields are displayed and the order in which they appear. Use the View Builder option available on the dashboard to configure these settings. For more information, see Creating and Editing Views.
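Conceptually, a filter decides which events are shown and a view decides which fields appear and in what order. The Python sketch below illustrates that split only. It is not how Filter Builder or View Builder is implemented, and the function names, field names (Node, Severity), and sample values are hypothetical.

```python
def apply_filter(events, predicate):
    """Keep only the events for which the predicate is true."""
    return [e for e in events if predicate(e)]


def apply_view(events, fields):
    """Project each event onto the chosen fields, in the chosen order."""
    return [[e.get(name, "") for name in fields] for e in events]


# Hypothetical sample events.
events = [
    {"Node": "cucm-node-1", "Severity": "Critical", "EventTypeId": "OM_CUCM_Processes"},
    {"Node": "cucm-node-2", "Severity": "Minor", "EventTypeId": "OM_CUCM_Processes"},
]

critical = apply_filter(events, lambda e: e["Severity"] == "Critical")
rows = apply_view(critical, ["Severity", "Node"])
```

Keeping the two concerns separate means the same filter can be paired with different views, which matches the separate Filter and View drop-down lists in the Alarm Browser.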