User Guide for Device Fault Manager 1.1 (With LMS 2.0)
Polling

Table Of Contents

Polling

ICMP Polling

SNMP Polling

Just-In-Time Polling Algorithm

Consolidated Requests

Fast ICMP Polling


Polling


These topics describe the polling that DFM uses to collect device data:

ICMP Polling

SNMP Polling

DFM collects data for its analysis using a combination of ICMP and SNMP polling. Fault and performance data is collected using SNMP, while device connectivity is monitored using ICMP. DFM implements SNMP and ICMP polling with cooperative adapters that run within a domain manager; together, they provide a fast and reliable method of obtaining data.

ICMP Polling

DFM uses a high-performance, asynchronous ICMP poller. The ICMP poller performs at a consistent rate that is independent of poll response times. DFM achieves this using two asynchronous threads: one thread sends polls and the other receives the responses. Because the send and receive threads operate asynchronously, slow response times or excessive timeouts do not affect the polling rate.
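
The separation of sending from receiving can be sketched as follows. This is a minimal illustration under stated assumptions, not DFM's implementation: the `fake_network` callable and the in-memory queue that stands in for the receive socket are hypothetical, and no real ICMP packets are sent.

```python
import queue
import threading


def run_poll_cycle(targets, fake_network):
    """Poll every target once, with sending and receiving on separate threads.

    fake_network(addr) is a hypothetical stand-in for the network: it
    returns True if addr would answer an echo request.
    """
    replies = queue.Queue()  # stands in for the receive socket
    responded = set()

    def sender():
        # Sends every poll in turn and never waits for an answer.
        for addr in targets:
            if fake_network(addr):
                replies.put(addr)
        replies.put(None)  # sentinel: all sends for this cycle are done

    def receiver():
        # Drains replies independently of the sender's progress.
        while True:
            addr = replies.get()
            if addr is None:
                break
            responded.add(addr)

    threads = [threading.Thread(target=sender), threading.Thread(target=receiver)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    unresponsive = [addr for addr in targets if addr not in responded]
    return sorted(responded), unresponsive
```

In a real poller, a target would be treated as unresponsive only after a timeout; the sketch ends the cycle with a sentinel instead so that it stays deterministic.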

Figure 9-1 shows the four possible states of an element as determined by its response to an ICMP poll.

Figure 9-1 The Four Possible States of an Element During a Polling Cycle

The four states are: up, notification pending, down, and clear pending.

An element stays in the up state until it fails to respond to an ICMP poll. When it fails to respond, the element moves to the notification pending state until DFM can determine whether it is up or down. If the minimum stabilization period expires or the maximum failure retry count is exceeded before a successful ICMP poll occurs, the element moves to the down state. DFM does not poll the element again until the next scheduled polling cycle.

An element stays in the down state until it responds to an ICMP poll. When the element responds, it moves to the clear pending state. If the maximum success retry count is exceeded or the minimum clear pending time expires, the element returns to the up state.
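
The four states and their transitions can be sketched as a small state machine. The sketch models only the retry counts, not the stabilization and clear pending timers, and the parameter names are hypothetical. Two transitions are assumptions that the text above implies but does not state: a successful poll in notification pending returns the element to up, and a failed poll in clear pending returns it to down.

```python
from enum import Enum


class State(Enum):
    UP = "up"
    NOTIFICATION_PENDING = "notification pending"
    DOWN = "down"
    CLEAR_PENDING = "clear pending"


class ElementStatus:
    """Tracks one element's state from a stream of ICMP poll results."""

    def __init__(self, max_failure_retries=3, max_success_retries=3):
        # Both limits are illustrative defaults, not DFM settings.
        self.max_failure_retries = max_failure_retries
        self.max_success_retries = max_success_retries
        self.state = State.UP
        self._retries = 0

    def on_poll(self, responded):
        if self.state is State.UP:
            if not responded:
                self.state = State.NOTIFICATION_PENDING
                self._retries = 0
        elif self.state is State.NOTIFICATION_PENDING:
            if responded:
                self.state = State.UP  # assumption: one success clears it
            else:
                self._retries += 1
                if self._retries > self.max_failure_retries:
                    self.state = State.DOWN
        elif self.state is State.DOWN:
            if responded:
                self.state = State.CLEAR_PENDING
                self._retries = 0
        else:  # State.CLEAR_PENDING
            if responded:
                self._retries += 1
                if self._retries > self.max_success_retries:
                    self.state = State.UP
            else:
                self.state = State.DOWN  # assumption: one failure resets it
        return self.state
```

The two pending states keep a single lost or stray packet from changing the reported state of an element.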

SNMP Polling

The DFM SNMP poller is a synchronous, multi-threaded SNMP polling engine. By default, the SNMP poller uses 10 synchronous polling threads.

In addition to SNMP V1, the SNMP poller supports SNMP V2C, which enables the analysis model to use high-capacity 64-bit counters in its analysis. This is critical for performance analysis of high-speed data links, where 32-bit counters may wrap between polls.
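
A short calculation shows why counter width matters. The sketch below assumes an octet counter on a fully utilized link; the link speeds are illustrative and the function names are hypothetical.

```python
def seconds_until_wrap(counter_bits, link_bits_per_second):
    """Shortest time in which an octet counter can wrap on a saturated link."""
    octets_per_second = link_bits_per_second / 8
    return (2 ** counter_bits) / octets_per_second


def counter_delta(previous, current, counter_bits):
    """Difference between two samples, correcting for at most one wrap."""
    return (current - previous) % (2 ** counter_bits)


if __name__ == "__main__":
    # A 32-bit octet counter on a 1 Gbit/s link can wrap in roughly 34 seconds.
    print(round(seconds_until_wrap(32, 1e9), 1))
    # A 64-bit counter on a 10 Gbit/s link takes centuries to wrap.
    print(seconds_until_wrap(64, 1e10) / (365 * 24 * 3600))
```

The modulo correction in `counter_delta` recovers the true difference only if the counter wrapped at most once between samples. When the polling interval is longer than the wrap time, a 32-bit counter can wrap more than once and the computed rate is silently wrong.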

Polling for devices with multiple IP addresses is supported because the SNMP poller supports multiple IP addresses for each SNMP agent. The SNMP poller automatically switches to an alternate IP address during failures, ensuring the integrity of DFM's analysis during outages.

Just-In-Time Polling Algorithm

The SNMP poller's MIB variable poll list is driven by a Just-In-Time polling algorithm. This ensures that only those MIB variables needed for analysis are polled. For example, if a port monitored for performance data is disabled, or goes down, the domain manager revokes the SNMP poller's request to monitor performance data for that port and the SNMP poller automatically removes the relevant MIB variables from the poll list. If the port is re-enabled, or comes back up, the variables are automatically put back onto the MIB poll list.
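
One way to picture a poll list that is driven by requests is the sketch below. It is an illustration under assumptions, not DFM's data structure: the `PollList` class and the request identifiers are hypothetical, and the IF-MIB variable names are used only as examples.

```python
from collections import defaultdict


class PollList:
    """Keeps a MIB variable on the list only while some request needs it."""

    def __init__(self):
        # variable name -> identifiers of the requests that need it
        self._needed_by = defaultdict(set)

    def request(self, request_id, variables):
        for variable in variables:
            self._needed_by[variable].add(request_id)

    def revoke(self, request_id):
        for variable in list(self._needed_by):
            self._needed_by[variable].discard(request_id)
            if not self._needed_by[variable]:
                del self._needed_by[variable]  # nobody needs it: stop polling

    def variables(self):
        return sorted(self._needed_by)


if __name__ == "__main__":
    polls = PollList()
    polls.request("port1-status", ["ifOperStatus.1"])
    polls.request("port1-perf", ["ifInOctets.1", "ifOutOctets.1"])
    polls.revoke("port1-perf")   # port 1 went down
    print(polls.variables())     # only the status variable remains
```

Tracking which requests need each variable means that a variable shared by two analyses stays on the list until both requests are revoked.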

Consolidated Requests

Issuing a single SNMP GET that requests 10 variables is more efficient than issuing 10 GET requests that each request a single variable. The SNMP poller consolidates as many attributes as possible into a single SNMP GET request. The consolidation is not restricted to variables from the same SNMP table. Polling consolidation continually adapts to changes in the MIB variable poll list.
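
Consolidation can be sketched as grouping the current poll list into as few requests as possible. The per-request limit below (`max_per_request`) is a hypothetical parameter; the limit DFM actually applies is not stated here.

```python
def consolidate(variables, max_per_request=10):
    """Group MIB variables into batches, one batch per SNMP GET request."""
    unique = list(dict.fromkeys(variables))  # drop duplicates, keep order
    return [
        unique[start:start + max_per_request]
        for start in range(0, len(unique), max_per_request)
    ]


if __name__ == "__main__":
    poll_list = [f"var{i}" for i in range(23)]
    requests = consolidate(poll_list)
    # 23 variables need 3 requests instead of 23.
    print([len(batch) for batch in requests])
```

Because the batches are computed from the poll list, rerunning the grouping whenever the list changes keeps the requests consolidated.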

If recoverable errors are encountered during a GET request, the SNMP poller suspends polling of the affected variable and continues to poll the other variables. For example, a MIB variable might become unavailable due to a configuration change; the poller stops requesting that variable while the rest of the request proceeds. This behavior enables the SNMP poller to operate efficiently during unexpected changes to a device's configuration.
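
The suspend-and-continue behavior can be sketched as a retry loop around a consolidated request. Everything here is an assumption made for illustration: `send_get` is a hypothetical callable that returns either the values or the position of the variable that caused the error, and the position is zero-based, whereas the error index in an SNMP response is one-based.

```python
def poll_with_suspension(variables, send_get):
    """Retry a consolidated GET, suspending each variable that fails.

    send_get(names) is a hypothetical transport function. It returns
    (values, None) on success or (None, index) when names[index] is
    the variable that caused a recoverable error.
    """
    active = list(variables)
    suspended = []
    while active:
        values, error_index = send_get(active)
        if error_index is None:
            return values, suspended
        # Drop only the offending variable and try the rest again.
        suspended.append(active.pop(error_index))
    return {}, suspended
```

One unavailable variable therefore costs one extra request, and the remaining variables in the request are still collected.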

Fast ICMP Polling

Synchronous polling has one drawback: an attempt to poll a device that is down reduces polling throughput. This is because the poller must wait for the initial poll and the subsequent retry polls to time out before polling the next SNMP agent. The problem is exacerbated by the large timeout and retry values that are often required to handle agents that are slow to respond.
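
The cost of waiting can be estimated with simple arithmetic. The sketch assumes a fixed timeout for every attempt with no backoff, and the timeout, retry, and agent counts are illustrative values, not DFM defaults.

```python
import math


def worst_case_wait(timeout_seconds, retries):
    """Time one thread spends on an agent that never answers."""
    return timeout_seconds * (1 + retries)  # initial poll plus each retry


def outage_delay_per_thread(down_agents, threads, timeout_seconds, retries):
    """Time each polling thread loses per cycle to unreachable agents."""
    agents_per_thread = math.ceil(down_agents / threads)
    return agents_per_thread * worst_case_wait(timeout_seconds, retries)


if __name__ == "__main__":
    # A 5-second timeout with 3 retries blocks a thread for 20 seconds.
    print(worst_case_wait(5, 3))
    # 50 unreachable agents spread over 10 threads cost each thread 100 seconds.
    print(outage_delay_per_thread(50, 10, 5, 3))
```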

DFM eliminates this problem by linking its SNMP and ICMP pollers, so that it avoids sending SNMP requests to agent addresses that are known to be unreachable. Because the DFM ICMP poller is asynchronous, it does not slow down, even during a total network outage.

IP addresses that are unresponsive to ICMP polls are added to a "do not poll" list. The SNMP poller checks this list before sending an SNMP request. If the SNMP agent address is on the "do not poll" list, the request is not sent. If the SNMP agent has multiple IP addresses, each address is checked against the list. If an alternate address does not appear in the list, the request is sent to that address. If all addresses for an agent are on the list, the agent is deemed unreachable, and all SNMP requests to that agent are temporarily suspended. As soon as an agent's IP address becomes responsive, the address is removed from the list, and SNMP polling resumes. The net effect is that DFM can support large SNMP timeout and retry values without suffering from polling slow-downs during network outages.
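
The address check described above can be sketched as follows. The class and method names are hypothetical, and the sketch assumes that the ICMP poller reports each result by calling `record_icmp_result`.

```python
class DoNotPollList:
    """Addresses that the ICMP poller currently finds unresponsive."""

    def __init__(self):
        self._blocked = set()

    def record_icmp_result(self, address, responded):
        if responded:
            self._blocked.discard(address)  # responsive again: resume polling
        else:
            self._blocked.add(address)

    def pick_address(self, agent_addresses):
        """Return the first address that is safe to poll, or None if the
        agent is unreachable on every address."""
        for address in agent_addresses:
            if address not in self._blocked:
                return address
        return None


if __name__ == "__main__":
    blocked = DoNotPollList()
    agent = ["10.0.0.1", "10.0.1.1"]
    blocked.record_icmp_result("10.0.0.1", responded=False)
    print(blocked.pick_address(agent))  # falls back to the alternate address
    blocked.record_icmp_result("10.0.1.1", responded=False)
    print(blocked.pick_address(agent))  # None: skip SNMP requests for now
```

Checking a set before each request costs far less than waiting for an SNMP timeout, which is why large timeout and retry values remain affordable.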