User Guide for Device Fault Manager 3.1 (With LMS 3.1)
Appendix F: How DFM Calculates Repeated Restarts and Flapping

How DFM Calculates Repeated Restarts and Flapping


Device Fault Manager (DFM) uses similar calculations to diagnose both repeated restarts and flapping. DFM considers a system to be restarting repeatedly when it performs too many cold or warm starts over a short period of time.

Table F-1 lists the elements, traps, and user-definable parameters that DFM uses to calculate repeated restarts.

Table F-1 Elements, Traps, and Parameters Used to Calculate Repeated Restarts

Elements: All elements except ports and interfaces
SNMP Traps: Cold Start, Warm Start
Threshold Category: Reachability Settings

Parameter: Restart trap threshold
Parameter Definition: Minimum number of SNMP traps required in a user-defined period of time to trigger an event.

Parameter: Restart trap window
Parameter Definition: User-defined period within which minimum number of traps must be received to trigger an event.


DFM considers a network adapter to be flapping when it fluctuates between the Up and Down states too often over a short period of time.

Table F-2 lists the elements, traps, and user-definable parameters DFM uses to diagnose flapping.

Table F-2 Elements, Traps, and Parameters Used to Calculate Flapping

Elements: Ports and Interfaces
SNMP Traps: Link Up, Link Down
Threshold Category: Interface/port flapping settings

Parameter: Link trap threshold
Parameter Definition: Minimum number of SNMP traps required in a user-defined period of time to trigger an event.

Parameter: Link trap window
Parameter Definition: User-defined period within which minimum number of traps must be received to trigger an event.



Note: You can use the CiscoWorks Assistant Link Down/Device Down workflow to troubleshoot a link down problem, as described in User Guide for CiscoWorks Assistant 1.1.


After DFM generates a Repeated Restarts event or a Flapping event, DFM computes the stable time. This is the amount of time that must elapse without further traps before DFM declares the element stable again.

The stable time is at least as long as the time the element was at fault, and at least as long as the trap window. However, it can be no longer than one hour.
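
The bounds described above can be sketched as a small calculation. This is only an illustration, not DFM's actual implementation: the function name stable_time and its parameters are hypothetical, and the sketch assumes that DFM picks the smallest value that satisfies both lower bounds before applying the one-hour cap.

```python
MAX_STABLE_TIME_S = 3600  # one-hour upper limit stated in the text


def stable_time(fault_duration_s: float, trap_window_s: float) -> float:
    """Return a stable time in seconds (hypothetical helper).

    Assumption: the stable time is the larger of the fault duration and
    the trap window, capped at one hour. The guide states only the
    bounds, not the exact formula.
    """
    lower_bound = max(fault_duration_s, trap_window_s)
    return min(MAX_STABLE_TIME_S, lower_bound)


if __name__ == "__main__":
    # Fault lasted 80 s with a 30 s trap window.
    print(stable_time(80, 30))
    # Fault shorter than the trap window: the window is the lower bound.
    print(stable_time(10, 30))
    # Very long fault: the one-hour cap applies.
    print(stable_time(5000, 30))
```

Under these assumptions, an element that was at fault for 80 seconds with a 30-second trap window gets a stable time of 80 seconds, which matches the value shown for Figure F-1.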

Figure F-1 illustrates how a system is diagnosed as performing repeated restarts, or how a network adapter is diagnosed as flapping.

Figure F-1 Diagnosing Repeated System Restarts or Flapping Network Adapters

In Figure F-1, the trap window (Restart trap window or Link trap window parameter) has a value of 30 seconds. The trap threshold (Restart trap threshold or Link trap threshold parameter) has a value of 2.

DFM performs the following actions:

1. As soon as DFM receives a Link Down Trap from a physical port or interface (or a Warm Start/Cold Start Trap from a system), DFM begins counting the traps.

2. When DFM receives two or more traps within 30 seconds, it considers the network adapter or system to be at fault and DFM generates a Repeated Restarts event or a Flapping event.

The minimum number of traps is set by the Link trap threshold or Restart trap threshold parameter, and the time period is set by the Link trap window or Restart trap window parameter. In this example, DFM must receive at least two traps within the 30-second trap window before it considers an element at fault.

3. DFM continues to receive traps for 80 seconds after the initial trap, resulting in a stable time of 80 seconds. This value is longer than the 30-second trap window and shorter than the one-hour maximum.

The stable time is the amount of time that DFM waits before it clears the Repeated Restarts event or Flapping event.
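
The threshold check in steps 1 and 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not DFM's internal logic: the class TrapWindowDetector and its method on_trap are hypothetical names, timestamps are plain seconds, and the sketch assumes a sliding window that looks back from the most recent trap (the guide does not say whether the window slides or is anchored at the first trap).

```python
from collections import deque


class TrapWindowDetector:
    """Hypothetical detector: counts traps inside a trailing time window."""

    def __init__(self, threshold: int, window_s: float) -> None:
        self.threshold = threshold  # e.g. Link trap threshold
        self.window_s = window_s    # e.g. Link trap window, in seconds
        self._times = deque()       # timestamps of traps still in the window

    def on_trap(self, t: float) -> bool:
        """Record a trap at time t; return True if the element is at fault."""
        self._times.append(t)
        # Discard traps that are older than the window (assumed sliding).
        while t - self._times[0] > self.window_s:
            self._times.popleft()
        return len(self._times) >= self.threshold


if __name__ == "__main__":
    # Values from Figure F-1: threshold of 2 traps, window of 30 seconds.
    detector = TrapWindowDetector(threshold=2, window_s=30)
    print(detector.on_trap(0))    # first trap: counting starts
    print(detector.on_trap(20))   # second trap inside the window
    print(detector.on_trap(100))  # isolated trap long after the others
```

With the Figure F-1 values, the second trap at 20 seconds meets the threshold, while a trap that arrives 80 seconds later finds no other traps in its window. A deque keeps the cleanup of expired timestamps cheap, because old entries are always removed from the front.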