Software Restrictions—If a device (router or switch) intends to use linear flash memory as its OBFL storage media, Cisco IOS software must reserve a minimum of two physical sectors (or physical blocks) for the OBFL feature. Because an erase operation for a linear flash device is done on per-sector (or per-block) basis, one extra physical sector is needed. Otherwise, the minimum amount of space reserved for the OBFL feature on any device must be at least 8 KB.
Firmware Restrictions—If a line card or port adapter runs an operating system or firmware that is different from the Cisco IOS operating system, the line card or port adapter must provide device driver level support or an interprocess communications (IPC) layer that allows the OBFL file system to communicate to the line card or port adapter. This requirement is enforced to allow OBFL data to be recorded on a storage device attached to the line card or port adapter.
Hardware Restrictions—To support the OBFL feature, a device must have at least 8 KB of nonvolatile memory space reserved for OBFL data logging.
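The space requirements above can be sketched as a small calculation. This is a minimal illustration, not Cisco IOS code: the function name and the `sector_size` parameter are hypothetical, and combining the two-sector rule with the 8 KB minimum for linear flash is an assumed reading of the restrictions.

```python
import math
from typing import Optional

MIN_OBFL_BYTES = 8 * 1024  # 8 KB minimum stated in the restrictions


def min_obfl_reservation(sector_size: Optional[int] = None) -> int:
    """Return the minimum number of bytes to reserve for OBFL.

    sector_size is the physical sector (block) size of a linear flash
    device in bytes; pass None for storage that is not linear flash.
    """
    if sector_size is None:
        return MIN_OBFL_BYTES
    if sector_size <= 0:
        raise ValueError("sector_size must be positive")
    # Assumption: whole sectors are reserved, never fewer than two, and
    # the total must still cover the 8 KB minimum.
    sectors = max(2, math.ceil(MIN_OBFL_BYTES / sector_size))
    return sectors * sector_size
```

Under these assumptions, a linear flash device with 64 KB sectors reserves two sectors (128 KB), while a device with 1 KB sectors reserves eight sectors to reach 8 KB.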
The Onboard Failure Logging (OBFL) feature collects data such as operating temperatures, hardware uptime, interrupts, and other important events and messages from system hardware installed in a Cisco router or switch. The data is stored in nonvolatile memory and helps technical personnel diagnose hardware problems.
The OBFL feature records operating temperatures, hardware uptime, interrupts, and other important events and messages that can assist with diagnosing problems with hardware cards (or modules) installed in a Cisco router or switch. Data is logged to files stored in nonvolatile memory. When the onboard hardware is started up, a first record is made for each area monitored and becomes a base value for subsequent records. The OBFL feature provides a circular updating scheme for collecting continuous records and archiving older (historical) records, ensuring accurate data about the system. Data is recorded in one of two formats: continuous information that displays a snapshot of measurements and samples in a continuous file, and summary information that provides details about the data being collected. The data is displayed using the show logging onboard command. The message “No historical data to display” is seen when historical data is not available.
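The circular updating scheme can be pictured with a short sketch. The class below is hypothetical and is not the OBFL implementation: it assumes a fixed-size continuous file whose oldest record is folded into a historical summary (here, a simple count per value) when space runs out.

```python
from collections import Counter, deque


class CircularLog:
    """Fixed-capacity continuous log that archives evicted records."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.continuous = deque()    # newest records, oldest first
        self.historical = Counter()  # assumed summary: count per value

    def append(self, value) -> None:
        # When the continuous file is full, archive the oldest record
        # before storing the new one.
        if len(self.continuous) == self.capacity:
            oldest = self.continuous.popleft()
            self.historical[oldest] += 1
        self.continuous.append(value)

    def show_historical(self):
        if not self.historical:
            return "No historical data to display"
        return dict(self.historical)


log = CircularLog(capacity=3)
for reading in [40, 41, 41, 42, 43]:
    log.append(reading)
print(list(log.continuous))   # [41, 42, 43]
print(log.show_historical())  # {40: 1, 41: 1}
```

A bounded queue keeps the most recent samples available in full detail, while the summary preserves a compact view of everything that has been overwritten.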
Temperatures surrounding hardware modules can exceed recommended safe operating ranges and cause system problems such as packet drops. Higher than recommended operating temperatures can also accelerate component degradation and affect device reliability. Monitoring temperatures is important for maintaining environmental control and system reliability. Once a temperature sample is logged, the sample becomes the base value for the next record. From that point on, temperatures are recorded either when there are changes from the previous record or if the maximum storage time is exceeded. Temperatures are measured and recorded in degrees Celsius.
Number of sensors is the total number of temperature sensors that will be recorded. A column for each sensor is displayed with temperatures listed under the number of each sensor, as available.
Sampling frequency is the time between measurements.
Maximum time of storage is the longest period, in minutes, that can pass without a record being saved to storage media while the temperature remains unchanged. After this time, a temperature record is saved even if the temperature has not changed.
The Sensor column lists the name of the sensor.
The ID column lists an assigned identifier for the sensor.
Maximum Temperature 0C shows the highest recorded temperature per sensor.
Temp indicates a recorded temperature in degrees Celsius in the historical record. Columns following show the total time each sensor has recorded that temperature.
Sensor ID is an assigned number, so that temperatures for the same sensor can be stored together.
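The recording rule for temperatures (save on change, or when the maximum time of storage has elapsed) can be sketched as follows. The class and parameter names are hypothetical, and treating the time limit as "greater than or equal to" is an assumption.

```python
class TemperatureLogger:
    """Saves a record only on change or after the maximum storage time."""

    def __init__(self, max_storage_minutes: int) -> None:
        self.max_storage_minutes = max_storage_minutes
        self.records = []  # list of (minute, temperatures in degrees Celsius)

    def sample(self, minute: int, temps) -> bool:
        """Process one sample; return True if a record was saved."""
        temps = tuple(temps)
        if not self.records:
            # The first sample becomes the base value.
            self.records.append((minute, temps))
            return True
        last_minute, last_temps = self.records[-1]
        changed = temps != last_temps
        # Assumption: the limit is reached when the elapsed time is
        # greater than or equal to the maximum time of storage.
        expired = minute - last_minute >= self.max_storage_minutes
        if changed or expired:
            self.records.append((minute, temps))
            return True
        return False


logger = TemperatureLogger(max_storage_minutes=10)
print(logger.sample(0, [40, 38]))   # True: first record (base value)
print(logger.sample(1, [40, 38]))   # False: unchanged, limit not reached
print(logger.sample(2, [41, 38]))   # True: sensor 1 changed
print(logger.sample(12, [41, 38]))  # True: 10 minutes without a record
```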
The operational uptime tracking begins when the module is powered on, and information is retained for the life of the module.
The operational uptime application tracks the following events:
Date and time the customer first powered on a component.
Total uptime and downtime for the component in years, weeks, days, hours, and minutes.
Total number of component resets.
Total number of slot (module) changes.
Current reset timestamp, including the date and time.
Current slot (module) number of the component.
Current uptime in years, weeks, days, hours, and minutes.
Reset reason; see Table 7-1 to translate the numbers displayed.
Count is the number of resets that have occurred for each reset reason.
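Uptime is reported in years, weeks, days, hours, and minutes. The sketch below shows one way to break a minute count into those units; the function name is hypothetical and the 365-day year is an assumption, since the source does not say how a year is defined.

```python
MINUTES_PER_HOUR = 60
MINUTES_PER_DAY = 24 * MINUTES_PER_HOUR
MINUTES_PER_WEEK = 7 * MINUTES_PER_DAY
MINUTES_PER_YEAR = 365 * MINUTES_PER_DAY  # assumption: 365-day year


def split_uptime(total_minutes: int) -> dict:
    """Break a minute count into years, weeks, days, hours, and minutes."""
    if total_minutes < 0:
        raise ValueError("total_minutes must not be negative")
    years, rest = divmod(total_minutes, MINUTES_PER_YEAR)
    weeks, rest = divmod(rest, MINUTES_PER_WEEK)
    days, rest = divmod(rest, MINUTES_PER_DAY)
    hours, minutes = divmod(rest, MINUTES_PER_HOUR)
    return {
        "years": years,
        "weeks": weeks,
        "days": days,
        "hours": hours,
        "minutes": minutes,
    }


print(split_uptime(100_000))
# {'years': 0, 'weeks': 9, 'days': 6, 'hours': 10, 'minutes': 40}
```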
Table 7-1 Reset Reason Codes and Explanations
Reset Reason Code (in hex)
Line card hot plug in
Supervisor requests line card off or on
Supervisor requests hard reset on line card
Line card requests Supervisor off or on
Line card requests hard reset on Supervisor
Line card self reset using the internal system register
Momentary power interruption on the line card
Off or on after Supervisor non-maskable interrupt (NMI)
Hard reset after Supervisor NMI
Soft reset after Supervisor NMI
Off or on after line card asks Supervisor NMI
Hard reset after line card asks Supervisor NMI
Soft reset after line card asks Supervisor NMI
Off or on after line card self NMI
Hard reset after line card self NMI
Soft reset after line card self NMI
Off or on after spurious NMI
Hard reset after spurious NMI
Soft reset after spurious NMI
Off or on after watchdog NMI
Hard reset after watchdog NMI
Soft reset after watchdog NMI
Off or on after parity NMI
Hard reset after parity NMI
Soft reset after parity NMI
Off or on after system fatal interrupt
Hard reset after system fatal interrupt
Soft reset after system fatal interrupt
Off or on after application-specific integrated circuit (ASIC) interrupt
Hard reset after ASIC interrupt
Soft reset after ASIC interrupt
Off or on after unknown interrupt
Hard reset after unknown interrupt
Soft reset after unknown interrupt
Off or on after CPU exception
Hard reset after CPU exception
Soft reset after CPU exception
Reset data converted to generic data
Interrupts, such as ASIC interrupts and NMIs, are generated by system components that require attention from the CPU. They are generally related to hardware limit conditions or errors that need to be corrected.
The continuous format records each time a component is interrupted, and this record is stored and used as base information for subsequent records. Each time the list is saved, a timestamp is added. Time differences from the previous interrupt are counted, so that technical personnel can gain a complete record of the component’s operational history when an error occurs.
Name is a description of the component including its position in the device.
ID is an assigned field for data storage.
Offset is the register offset from a component register’s base address.
Bit is the interrupt bit number recorded from the component’s internal register.
The timestamp shows the date and time that an interrupt occurred down to the millisecond.
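Because interrupt timestamps are recorded to the millisecond, the time difference from the previous interrupt can be derived directly from consecutive records. The sketch below illustrates that arithmetic; the function name and sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def interrupt_deltas_ms(timestamps):
    """Return the milliseconds elapsed between consecutive interrupts."""
    ordered = sorted(timestamps)
    one_ms = timedelta(milliseconds=1)
    # Dividing one timedelta by another with // yields a whole number.
    return [(later - earlier) // one_ms
            for earlier, later in zip(ordered, ordered[1:])]


# Hypothetical interrupt timestamps for a single component.
stamps = [
    datetime(2024, 1, 1, 0, 0, 0, 0),
    datetime(2024, 1, 1, 0, 0, 0, 250_000),  # 250 ms later
    datetime(2024, 1, 1, 0, 0, 2, 0),        # 1750 ms after that
]
print(interrupt_deltas_ms(stamps))  # [250, 1750]
```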
The OBFL feature logs standard system messages. Instead of being displayed on a terminal, each message is written to a file so that it can be read at a later time. System messages range from level 1 (alerts) to level 7 (debug), and the levels to be logged can be specified in the hw-module logging onboard command.
A timestamp shows the date and time the message was logged.
Facility-Sev-Name is a coded naming scheme for a system message, as follows:
– The Facility code consists of two or more uppercase letters that indicate the hardware device (facility) to which the message refers.
– Sev is a single-digit code from 1 to 7 that reflects the severity of the message.
– Name is one or two code names separated by a hyphen that describe the part of the system from where the message is coming.
The error message follows the Facility-Sev-Name codes. For more information about system messages, see the Cisco IOS System and Error Messages guide.
Count indicates the number of instances of this message that are allowed in the history file. Once that number of instances has been recorded, the oldest instance is removed from the history file to make room for new ones.
The Persistence Flag gives a message priority over others that do not have the flag set.
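The Facility-Sev-Name scheme and the Count limit can be illustrated with a short parser and a bounded history. This is a sketch under stated assumptions, not the OBFL implementation: the regular expression, the optional leading percent sign, the class names, and the sample messages are all hypothetical, and the Persistence Flag is not modeled.

```python
import re
from collections import defaultdict, deque
from typing import NamedTuple, Optional

# Assumed layout: [%]FACILITY-SEV-NAME: text, with SEV from 1 to 7 and
# NAME made of one or two code names separated by a hyphen.
MESSAGE_RE = re.compile(
    r"^%?(?P<facility>[A-Z][A-Z0-9_]+)-(?P<sev>[1-7])-"
    r"(?P<name>[A-Z0-9_]+(?:-[A-Z0-9_]+)?): (?P<text>.*)$"
)


class SystemMessage(NamedTuple):
    facility: str
    severity: int
    name: str
    text: str


def parse_message(raw: str) -> Optional[SystemMessage]:
    """Split a raw message into its coded fields, or return None."""
    match = MESSAGE_RE.match(raw.strip())
    if match is None:
        return None
    return SystemMessage(match["facility"], int(match["sev"]),
                         match["name"], match["text"])


class MessageHistory:
    """Keeps at most `count` instances of each message, newest last."""

    def __init__(self, count: int) -> None:
        if count < 1:
            raise ValueError("count must be at least 1")
        # A bounded deque drops its oldest entry when a new one arrives.
        self._by_code = defaultdict(lambda: deque(maxlen=count))

    def record(self, timestamp: str, raw: str) -> bool:
        msg = parse_message(raw)
        if msg is None:
            return False
        key = (msg.facility, msg.severity, msg.name)
        self._by_code[key].append((timestamp, msg.text))
        return True

    def instances(self, facility: str, severity: int, name: str):
        return list(self._by_code.get((facility, severity, name), ()))


history = MessageHistory(count=2)
for stamp, text in [("t1", "first"), ("t2", "second"), ("t3", "third")]:
    history.record(stamp, f"%EXAMPLE-3-SAMPLE_ERR: {text}")
print(history.instances("EXAMPLE", 3, "SAMPLE_ERR"))
# [('t2', 'second'), ('t3', 'third')]
```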
Default Settings for OBFL
The OBFL feature is enabled by default. Because of the valuable information this feature offers technical personnel, it should not be disabled.
To enable OBFL, perform this task:
Command or Action
Enables privileged EXEC mode (enter your password if prompted).
Note By default, all system messages sent to a device are logged by the OBFL feature. To log only a specific message level (for example, only level 1 messages), use the message level keywords.
Ends global configuration mode.
Configuration Examples for OBFL
The most useful part of the OBFL feature is the information displayed by the show logging onboard module privileged EXEC command. This section provides the following examples of how to enable and display OBFL records.