How DFM Calculates Repeated Restarts and Flapping
Device Fault Manager (DFM) uses similar calculations to diagnose both repeated restarts and flapping. DFM considers a system to be restarting repeatedly when it performs too many cold or warm starts over a short period of time.
Table F-1 lists the elements, traps, and user-definable parameters that DFM uses to calculate repeated restarts.
Table F-1 Elements, Traps, and Parameters Used to Calculate Repeated Restarts
| Elements | SNMP Traps | Threshold Category | Parameter | Parameter Definition |
|---|---|---|---|---|
| All elements except ports and interfaces | Cold Start, Warm Start | Reachability Settings | Restart trap threshold | Minimum number of SNMP traps required in a user-defined period of time to trigger an event. |
| All elements except ports and interfaces | Cold Start, Warm Start | Reachability Settings | Restart trap window | User-defined period within which minimum number of traps must be received to trigger an event. |
DFM considers a network adapter to be flapping when it fluctuates between the Up and Down states too often over a short period of time.
Table F-2 lists the elements, traps, and user-definable parameters DFM uses to diagnose flapping.
Table F-2 Elements, Traps, and Parameters Used to Calculate Flapping
| Elements | SNMP Traps | Threshold Category | Parameter | Parameter Definition |
|---|---|---|---|---|
| Ports and Interfaces | Link Up, Link Down | Interface/port flapping settings | Link trap threshold | Minimum number of SNMP traps required in a user-defined period of time to trigger an event. |
| Ports and Interfaces | Link Up, Link Down | Interface/port flapping settings | Link trap window | User-defined period within which minimum number of traps must be received to trigger an event. |
Note: You can use the CiscoWorks Assistant Link Down/Device Down workflow to troubleshoot a link down problem, as described in User Guide for CiscoWorks Assistant 1.1.
After DFM generates a Repeated Restarts event or a Flapping event, DFM computes the stable time. This is the amount of time that must elapse without further traps before DFM declares the element stable again.
The stable time is at least as long as the time the element was at fault, and at least as long as the trap window. However, it can be no longer than one hour.
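The bounds on the stable time can be illustrated with a short Python sketch. The function name `stable_time` and the choice to return the smallest value that satisfies both lower bounds are assumptions made for illustration; the text states only the bounds, not the exact formula DFM applies.

```python
# Upper bound stated in the text: the stable time is never longer than one hour.
MAX_STABLE_TIME_S = 3600


def stable_time(fault_duration_s: float, trap_window_s: float) -> float:
    """Hypothetical helper: smallest stable time that meets the stated bounds.

    Assumption: the stable time equals the larger of the fault duration and
    the trap window, capped at one hour. DFM's real calculation may differ.
    """
    lower_bound = max(fault_duration_s, trap_window_s)
    return min(lower_bound, MAX_STABLE_TIME_S)


# Element at fault for 80 s with a 30 s trap window: 80 s of stable time.
print(stable_time(80, 30))
# Element at fault for 10 s: the 30 s trap window is the floor.
print(stable_time(10, 30))
# Element at fault for two hours: capped at one hour.
print(stable_time(7200, 30))
```

Under these assumptions, a short fault is governed by the trap window, a long fault by its own duration, and a very long fault by the one-hour cap.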
Figure F-1 illustrates how a system is diagnosed as performing repeated restarts, or how a network adapter is diagnosed as flapping.
Figure F-1 Diagnosing Repeated System Restarts or Flapping Network Adapters
In Figure F-1, the trap window (Restart trap window or Link trap window parameter) has a value of 30 seconds. The trap threshold (Restart trap threshold or Link trap threshold parameter) has a value of 2.
DFM performs the following actions:
1. As soon as DFM receives a Link Down Trap from a physical port or interface (or a Warm Start/Cold Start Trap from a system), DFM begins counting the traps.
2. When DFM receives two or more traps within 30 seconds, it considers the network adapter or system to be at fault and generates a Flapping event or a Repeated Restarts event, respectively.
   The minimum number of traps is set by the Link trap threshold or Restart trap threshold parameter, and the period is set by the Link trap window or Restart trap window parameter. In this example, DFM must receive at least two traps within the 30-second trap window before it considers an element at fault.
3. DFM continues to receive traps for 80 seconds after the initial trap, resulting in a stable time of 80 seconds.
   The stable time is the amount of time that DFM waits before it clears the Repeated Restarts event or Flapping event.
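The threshold-and-window check in the steps above can be sketched in Python as follows. This is an illustrative model only, not DFM's implementation: the class name `TrapWindowDetector`, the sliding-window approach, and the treatment of traps exactly one window apart as falling inside the window are all assumptions.

```python
from collections import deque


class TrapWindowDetector:
    """Hypothetical sliding-window counter for restart or link traps."""

    def __init__(self, threshold: int = 2, window_s: float = 30.0):
        self.threshold = threshold  # Restart/Link trap threshold
        self.window_s = window_s    # Restart/Link trap window, in seconds
        self._times = deque()       # timestamps of traps still inside the window

    def on_trap(self, t: float) -> bool:
        """Record a trap at time t (seconds); return True if the element is at fault."""
        self._times.append(t)
        # Drop traps that are older than the trap window relative to this trap.
        while t - self._times[0] > self.window_s:
            self._times.popleft()
        return len(self._times) >= self.threshold


detector = TrapWindowDetector(threshold=2, window_s=30.0)
print(detector.on_trap(0.0))   # first trap: counting starts, no event yet
print(detector.on_trap(10.0))  # second trap within 30 s: element is at fault
```

With the Figure F-1 values (threshold 2, window 30 seconds), a second trap that arrives more than 30 seconds after the first would not trigger an event in this sketch, because the first trap has already aged out of the window.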