Onboard Failure Logging


First Published: August 10, 2007
Last Updated: August 10, 2007

The Onboard Failure Logging (OBFL) feature collects data such as operating temperatures, hardware uptime, interrupts, and other important events and messages from system hardware installed in a Cisco router or switch. The data is stored in nonvolatile memory and helps technical personnel diagnose hardware problems.

Finding Feature Information in This Module

Your Cisco IOS software release may not support all of the features documented in this module. To reach links to specific feature documentation in this module and to see a list of the releases in which each feature is supported, use the "Feature Information for OBFL" section.

Finding Support Information for Platforms and Cisco IOS and Catalyst OS Software Images

Use Cisco Feature Navigator to find information about platform support and Cisco IOS and Catalyst OS software image support. To access Cisco Feature Navigator, go to http://www.cisco.com/go/cfn. An account on Cisco.com is not required.

Contents

Restrictions for OBFL

Information About OBFL

How to Enable OBFL

Configuration Examples for OBFL

Additional References

Command Reference

Feature Information for OBFL

Restrictions for OBFL

Software Restrictions

If a device (router or switch) uses linear flash memory as its OBFL storage media, Cisco IOS software must reserve a minimum of two physical sectors (or physical blocks) for the OBFL feature. Because an erase operation for a linear flash device is done on a per-sector (or per-block) basis, one extra physical sector is needed. Otherwise, the minimum amount of space reserved for the OBFL feature on any device must be at least 8 KB.

Firmware Restrictions

If a line card or port adapter runs an operating system or firmware that is different from the Cisco IOS operating system, the line card or port adapter must provide device driver level support or an interprocess communications (IPC) layer that allows the OBFL file system to communicate with the line card or port adapter. This requirement is enforced to allow OBFL data to be recorded on a storage device attached to the line card or port adapter.

Hardware Restrictions

To support the OBFL feature, a device must have at least 8 KB of nonvolatile memory space reserved for OBFL data logging.
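
The software and hardware restrictions above reduce to a small sizing check. The sketch below is one reading of them, not a Cisco formula: it assumes that the 8 KB floor applies to every device and that linear flash additionally needs two whole sectors. The function name and the 128 KB sector size in the usage lines are hypothetical.

```python
OBFL_MIN_BYTES = 8 * 1024  # 8 KB floor stated in the restrictions


def min_obfl_reservation(sector_bytes=None):
    """Return the smallest OBFL reservation, in bytes, under the stated restrictions.

    sector_bytes: physical sector (block) size for linear flash, or None for
    storage that is not erased per sector. Hypothetical helper for illustration.
    """
    if sector_bytes is None:
        return OBFL_MIN_BYTES
    if sector_bytes <= 0:
        raise ValueError("sector_bytes must be positive")
    # Linear flash: two sectors, because erasing works on whole sectors and
    # one extra sector is needed. Assumption: the 8 KB floor still applies.
    return max(2 * sector_bytes, OBFL_MIN_BYTES)


print(min_obfl_reservation())            # 8192
print(min_obfl_reservation(128 * 1024))  # 262144
```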

Information About OBFL

To use the OBFL feature, you should understand the following concept:

Data Collected

Data Collected

The OBFL feature records operating temperatures, hardware uptime, interrupts, and other important events and messages that can assist with diagnosing problems with hardware cards (or modules) installed in a Cisco router or switch. Data is logged to files stored in nonvolatile memory. When the onboard hardware is started up, a first record is made for each area monitored and becomes a base value for subsequent records. The OBFL feature provides a circular updating scheme for collecting continuous records and archiving older (historical) records, ensuring accurate data about the system. Data is recorded in one of two formats: continuous information that displays a snapshot of measurements and samples in a continuous file, and summary information that provides details about the data being collected. The data is displayed using the show logging onboard command. The message "No historical data to display" is seen when historical data is not available.

The following sections describe the type of data collected in more detail.

Temperature

Temperatures surrounding hardware modules can exceed recommended safe operating ranges and cause system problems such as packet drops. Higher than recommended operating temperatures can also accelerate component degradation and affect device reliability. Monitoring temperatures is important for maintaining environmental control and system reliability. Once a temperature sample is logged, the sample becomes the base value for the next record. From that point on, temperatures are recorded either when there are changes from the previous record or if the maximum storage time is exceeded. Temperatures are measured and recorded in degrees Celsius.
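
The recording rule described above (store the first sample, then store a new one only on a change or when the maximum storage time has passed) can be modeled in a few lines. This is an illustrative sketch, not Cisco's implementation: the class name is hypothetical, the 120-minute default is taken from the example output in this section, and treating "exceeded" as "greater than or equal to" is an assumption.

```python
from datetime import datetime, timedelta


class TemperatureLogger:
    """Illustrative model of the record-on-change rule; not Cisco's code."""

    def __init__(self, max_storage=timedelta(minutes=120)):
        self.max_storage = max_storage
        self.base = None       # last recorded sample (the base value)
        self.base_time = None  # when that sample was recorded
        self.records = []

    def sample(self, when, temps):
        """Consider one sample and store it only if the rule requires it."""
        first = self.base is None
        changed = not first and temps != self.base
        # Assumption: the limit counts as exceeded once it has fully elapsed.
        expired = not first and when - self.base_time >= self.max_storage
        if first or changed or expired:
            self.records.append((when, temps))
            self.base, self.base_time = temps, when
            return True
        return False


log = TemperatureLogger()
t0 = datetime(2007, 3, 6, 22, 32, 51)
log.sample(t0, (31, 26))                           # first record becomes the base
log.sample(t0 + timedelta(minutes=5), (31, 26))    # unchanged: skipped
log.sample(t0 + timedelta(minutes=10), (43, 28))   # changed: recorded
log.sample(t0 + timedelta(minutes=130), (43, 28))  # unchanged for 120 minutes: recorded
print(len(log.records))  # 3
```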

Temperature Example

--------------------------------------------------------------------------------
TEMPERATURE SUMMARY INFORMATION
--------------------------------------------------------------------------------
Number of sensors          : 12
Sampling frequency         : 5 minutes
Maximum time of storage    : 120 minutes
--------------------------------------------------------------------------------
Sensor                            |   ID  | Maximum Temperature 0C 
--------------------------------------------------------------------------------
MB-Out                              980201     43
MB-In                               980202     28
MB                                  980203     29
MB                                  980204     38
EARL-Out                            910201     0
EARL-In                             910202     0
SSA 1                               980301     38
SSA 2                               980302     36
JANUS 1                             980303     36
JANUS 2                             980304     35
GEMINI 1                            980305     0
GEMINI 2                            980306     0
---------------------------------------------------------------
Temp                         Sensor ID 
0C     1    2    3    4    5    6    7    8    9   10   11   12
---------------------------------------------------------------
No historical data to display
---------------------------------------------------------------
--------------------------------------------------------------------------------
TEMPERATURE CONTINUOUS INFORMATION
--------------------------------------------------------------------------------
Sensor                            |   ID  | 
--------------------------------------------------------------------------------
MB-Out                              980201 
MB-In                               980202 
MB                                  980203 
MB                                  980204 
EARL-Out                            910201 
EARL-In                             910202 
SSA 1                               980301 
SSA 2                               980302 
JANUS 1                             980303 
JANUS 2                             980304 
GEMINI 1                            980305 
GEMINI 2                            980306 

-------------------------------------------------------------------------------
       Time Stamp   |Sensor Temperature 0C 
MM/DD/YYYY HH:MM:SS |  1    2    3    4    5    6    7    8    9   10   11   12
-------------------------------------------------------------------------------
03/06/2007 22:32:51   31   26   27   27   NA   NA   33   32   30   29   NA   NA 
03/06/2007 22:37:51   43   28   29   38   NA   NA   38   36   36   35   NA   NA 
-------------------------------------------------------------------------------

To interpret this data:

Number of sensors is the total number of temperature sensors that will be recorded. A column for each sensor is displayed with temperatures listed under the number of each sensor, as available.

Sampling frequency is the time between measurements.

Maximum time of storage is the longest time, in minutes, that can pass without a record being saved to storage media while the temperature remains unchanged. After this time, a temperature record is saved even if the temperature has not changed.

The Sensor column lists the name of the sensor.

The ID column lists an assigned identifier for the sensor.

Maximum Temperature 0C shows the highest recorded temperature per sensor.

Temp indicates a recorded temperature in degrees Celsius in the historical record. Columns following show the total time each sensor has recorded that temperature.

Sensor ID is an assigned number, so that temperatures for the same sensor can be stored together.
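
When the output is captured to a file, the continuous rows are easy to post-process. The sketch below parses the two rows from the example and derives the highest reading per sensor; the function names are hypothetical and the row layout is inferred from the example only. For the sensors that report data, the result agrees with the Maximum Temperature column in the summary.

```python
import re

DATE = re.compile(r"\d{2}/\d{2}/\d{4}")


def parse_temperature_rows(lines):
    """Parse 'MM/DD/YYYY HH:MM:SS t1 t2 ...' rows; 'NA' becomes None."""
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3 or not DATE.fullmatch(fields[0]):
            continue  # skip rulers, column headers, and blank lines
        stamp = f"{fields[0]} {fields[1]}"
        temps = [None if f == "NA" else int(f) for f in fields[2:]]
        rows.append((stamp, temps))
    return rows


def max_per_sensor(rows):
    """Highest reading per sensor column; None for sensors that only report NA."""
    columns = zip(*(temps for _, temps in rows))
    return [max((t for t in col if t is not None), default=None) for col in columns]


sample = """\
MM/DD/YYYY HH:MM:SS |  1    2    3    4    5    6    7    8    9   10   11   12
03/06/2007 22:32:51   31   26   27   27   NA   NA   33   32   30   29   NA   NA
03/06/2007 22:37:51   43   28   29   38   NA   NA   38   36   36   35   NA   NA
"""
rows = parse_temperature_rows(sample.splitlines())
print(max_per_sensor(rows))
# [43, 28, 29, 38, None, None, 38, 36, 36, 35, None, None]
```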

Operational Uptime

The operational uptime tracking begins when the module is powered on, and information is retained for the life of the module.

Operational Uptime Example

--------------------------------------------------------------------------------
UPTIME SUMMARY INFORMATION
--------------------------------------------------------------------------------
First customer power on : 03/06/2007 22:32:51
Total uptime            :   0 years   0 weeks   2 days  18 hours  10 minutes
Total downtime          :   0 years   0 weeks   0 days   8 hours   7 minutes
Number of resets        : 130
Number of slot changes  : 16
Current reset reason    : 0xA1
Current reset timestamp : 03/07/2007 13:29:07
Current slot            : 2
Current uptime          :   0 years   0 weeks   1 days   7 hours   0 minutes
--------------------------------------------------------------------------------
Reset  |        |
Reason | Count  |
--------------------------------------------------------------------------------
0x5         64 
0x6         62 
0xA1         4 
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
UPTIME CONTINUOUS INFORMATION
--------------------------------------------------------------------------------
Time Stamp          | Reset  | Uptime 
MM/DD/YYYY HH:MM:SS | Reason | years weeks days hours  minutes 
--------------------------------------------------------------------------------
03/06/2007 22:32:51   0xA1      0     0     0     0     0 
--------------------------------------------------------------------------------

The operational uptime application tracks the following events:

Date and time the customer first powered on a component.

Total uptime and downtime for the component in years, weeks, days, hours, and minutes.

Total number of component resets.

Total number of slot (module) changes.

Current reset timestamp, including the date and time.

Current slot (module) number of the component.

Current uptime in years, weeks, days, hours, and minutes.

Reset reason; see Table 1 to translate the numbers displayed.

Count is the number of resets that have occurred for each reset reason.

Table 1 Reset Reason Codes and Explanations

Reset Reason Code (in hex)   Component/Explanation

0x01                         Chassis on
0x02                         Line card hot plug in
0x03                         Supervisor requests line card off or on
0x04                         Supervisor requests hard reset on line card
0x05                         Line card requests Supervisor off or on
0x06                         Line card requests hard reset on Supervisor
0x07                         Line card self reset using the internal system register
0x08
0x09
0x0A                         Momentary power interruption on the line card
0x0B
0x0C
0x0D
0x0E
0x0F
0x10
0x11                         Off or on after Supervisor non-maskable interrupts (NMI)
0x12                         Hard reset after Supervisor NMI
0x13                         Soft reset after Supervisor NMI
0x14
0x15                         Off or on after line card asks Supervisor NMI
0x16                         Hard reset after line card asks Supervisor NMI
0x17                         Soft reset after line card asks Supervisor NMI
0x18
0x19                         Off or on after line card self NMI
0x1A                         Hard reset after line card self NMI
0x1B                         Soft reset after line card self NMI
0x21                         Off or on after spurious NMI
0x22                         Hard reset after spurious NMI
0x23                         Soft reset after spurious NMI
0x24
0x25                         Off or on after watchdog NMI
0x26                         Hard reset after watchdog NMI
0x27                         Soft reset after watchdog NMI
0x28
0x29                         Off or on after parity NMI
0x2A                         Hard reset after parity NMI
0x2B                         Soft reset after parity NMI
0x31                         Off or on after system fatal interrupt
0x32                         Hard reset after system fatal interrupt
0x33                         Soft reset after system fatal interrupt
0x34
0x35                         Off or on after application-specific integrated circuit (ASIC) interrupt
0x36                         Hard reset after ASIC interrupt
0x37                         Soft reset after ASIC interrupt
0x38
0x39                         Off or on after unknown interrupt
0x3A                         Hard reset after unknown interrupt
0x3B                         Soft reset after unknown interrupt
0x41                         Off or on after CPU exception
0x42                         Hard reset after CPU exception
0x43                         Soft reset after CPU exception
0xA1                         Reset data converted to generic data
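
Translating the reset reasons in a report by hand is tedious, so a small lookup can help. The sketch below is an assumption-laden illustration: the dictionary holds only the three codes that appear in the Operational Uptime example, and the function name is hypothetical. Note that the report prints codes without zero padding (0x5 rather than 0x05), which is why the lookup converts the text to an integer first.

```python
# Subset of Table 1, keyed by integer code. Only the codes that appear in the
# example output are included; extend the mapping from the table as needed.
RESET_REASONS = {
    0x05: "Line card requests Supervisor off or on",
    0x06: "Line card requests hard reset on Supervisor",
    0xA1: "Reset data converted to generic data",
}


def describe_reset(code_text):
    """Translate a reset reason such as '0x5' or '0xA1' into its Table 1 text."""
    code = int(code_text, 16)  # accepts '0x5', '0x05', and '0xA1' alike
    return RESET_REASONS.get(code, f"unlisted reset reason 0x{code:02X}")


# Counts copied from the UPTIME SUMMARY INFORMATION example.
counts = [("0x5", 64), ("0x6", 62), ("0xA1", 4)]
for code_text, count in counts:
    print(f"{code_text:>5}  {count:>3}  {describe_reset(code_text)}")

# The per-reason counts add up to the "Number of resets" line in that example.
print(sum(count for _, count in counts))  # 130
```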


Interrupts

Interrupts, such as ASIC interrupts and NMIs, are generated by system components that require attention from the CPU. Interrupts are generally related to hardware limit conditions or errors that need to be corrected.

The continuous format records each time a component is interrupted, and this record is stored and used as base information for subsequent records. Each time the list is saved, a timestamp is added. Time differences from the previous interrupt are counted, so that technical personnel can gain a complete record of the component's operational history when an error occurs.

Interrupts Example

--------------------------------------------------------------------------------
INTERRUPT SUMMARY INFORMATION
--------------------------------------------------------------------------------
Name                                              |  ID | Offset | Bit |  Count
--------------------------------------------------------------------------------
No historical data to display
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
CONTINUOUS INTERRUPT INFORMATION
--------------------------------------------------------------------------------
MM/DD/YYYY HH:MM:SS mmm | Name                             |  ID | Offset | Bit
--------------------------------------------------------------------------------
03/06/2007 22:33:06 450   Port-ASIC #2                         9   0x00E7     6
--------------------------------------------------------------------------------

To interpret this data:

Name is a description of the component including its position in the device.

ID is an assigned field for data storage.

Offset is the register offset from a component register's base address.

Bit is the interrupt bit number recorded from the component's internal register.

The timestamp shows the date and time that an interrupt occurred down to the millisecond.
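
The fields described above map directly onto a structured record. The following sketch parses the row from the Interrupts example; the regular expression is inferred from that single row and may not fit output from other platforms, and the function name is hypothetical.

```python
import re
from datetime import datetime

# Pattern inferred from the example row only; other platforms may differ.
INTERRUPT_ROW = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{4}) (?P<time>\d{2}:\d{2}:\d{2}) (?P<ms>\d{1,3})\s+"
    r"(?P<name>.+?)\s+(?P<id>\d+)\s+(?P<offset>0x[0-9A-Fa-f]+)\s+(?P<bit>\d+)\s*$"
)


def parse_interrupt_row(line):
    """Return a dict for one CONTINUOUS INTERRUPT INFORMATION row, or None."""
    m = INTERRUPT_ROW.match(line)
    if m is None:
        return None
    when = datetime.strptime(f"{m['date']} {m['time']}", "%m/%d/%Y %H:%M:%S")
    return {
        # The "mmm" column holds milliseconds; datetime stores microseconds.
        "timestamp": when.replace(microsecond=int(m["ms"]) * 1000),
        "name": m["name"],
        "id": int(m["id"]),
        "offset": int(m["offset"], 16),
        "bit": int(m["bit"]),
    }


row = parse_interrupt_row(
    "03/06/2007 22:33:06 450   Port-ASIC #2                         9   0x00E7     6"
)
print(row["timestamp"].isoformat(timespec="milliseconds"), row["name"], row["bit"])
```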

Message Logging

The OBFL feature logs standard system messages. Instead of displaying the message to a terminal, the message is written to and stored in a file, so the message can be accessed and read at a later time. System messages range from level 1 alerts to level 7 debug messages, and these levels can be specified in the hw-module logging onboard command.

Error Message Log Example

--------------------------------------------------------------------------------
ERROR MESSAGE SUMMARY INFORMATION
--------------------------------------------------------------------------------
Facility-Sev-Name      | Count | Persistence Flag
MM/DD/YYYY HH:MM:SS
--------------------------------------------------------------------------------
No historical data to display
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
ERROR MESSAGE CONTINUOUS INFORMATION
--------------------------------------------------------------------------------
MM/DD/YYYY HH:MM:SS Facility-Sev-Name
--------------------------------------------------------------------------------
03/06/2007 22:33:35  %GOLD_OBFL-3-GOLD : Diagnostic OBFL: Diagnostic OBFL testing

To interpret this data:

A timestamp shows the date and time the message was logged.

Facility-Sev-Name is a coded naming scheme for a system message, as follows:

The Facility code consists of two or more uppercase letters that indicate the hardware device (facility) to which the message refers.

Sev is a single-digit code from 1 to 7 that reflects the severity of the message.

Name is one or two code names, separated by a hyphen, that describe the part of the system from which the message originates.

The error message follows the Facility-Sev-Name codes. For more information about system messages, see the Cisco IOS System and Error Messages guide.

Count indicates the number of instances of this message that are allowed in the history file. Once that number of instances has been recorded, the oldest instance is removed from the history file to make room for new ones.

The Persistence Flag gives a message priority over others that do not have the flag set.
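
The Facility-Sev-Name scheme and the Count behavior can both be sketched briefly. The pattern below is inferred from the examples in this module (for instance %GOLD_OBFL-3-GOLD) and is not an official grammar; the function name, the Count of 2, and the second and third timestamps are hypothetical values chosen for illustration.

```python
import re
from collections import deque

# Inferred pattern: facility, one severity digit (1-7), then one or two
# code names separated by a hyphen.
MESSAGE_CODE = re.compile(
    r"%(?P<facility>[A-Z][A-Z0-9_]+)-(?P<sev>[1-7])-(?P<name>[A-Z0-9_]+(?:-[A-Z0-9_]+)?)"
)


def parse_message_code(text):
    """Split a Facility-Sev-Name code into its three parts, or return None."""
    m = MESSAGE_CODE.search(text)
    if m is None:
        return None
    return m["facility"], int(m["sev"]), m["name"]


line = "03/06/2007 22:33:35  %GOLD_OBFL-3-GOLD : Diagnostic OBFL: Diagnostic OBFL testing"
print(parse_message_code(line))  # ('GOLD_OBFL', 3, 'GOLD')

# Count behaves like a bounded queue: once full, the oldest instance is dropped.
history = deque(maxlen=2)  # hypothetical Count of 2
for stamp in ("22:33:35", "22:40:02", "22:51:17"):
    history.append(stamp)
print(list(history))  # ['22:40:02', '22:51:17']
```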

How to Enable OBFL

This section contains the following procedure:

Enabling OBFL

Enabling OBFL

The OBFL feature is enabled by default. Because of the valuable information this feature offers technical personnel, it should not be disabled. If you find the feature has been disabled, use the following steps to reenable it.

SUMMARY STEPS

1. enable

2. configure terminal

3. hw-module switch switch-number module module-number logging onboard [message level {1-7}]

4. end

DETAILED STEPS

 
Command or Action
Purpose

Step 1 

enable

Example:

Router> enable

Enables privileged EXEC mode.

Enter your password if prompted.

Step 2 

configure terminal

Example:

Router# configure terminal

Enters global configuration mode.

Step 3 

hw-module switch switch-number module module-number logging onboard [message level {1-7}]

Example:

Router(config)# hw-module switch 2 module 1 logging onboard

Enables OBFL on the specified hardware module.

Note By default, all system messages sent to a device are logged by the OBFL feature. You can define a specific message level (only level 1 messages, as an example) to be logged using the message level keywords.

Step 4 

end

Example:

Router(config)# end

Ends global configuration mode.
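
When the same steps must be repeated for several modules, the command lines can be generated rather than typed. The helper below is a hypothetical sketch that only formats text from the syntax shown in the steps above; it does not connect to a device, and how the lines are delivered (console, script, or management tool) is left open.

```python
def obfl_enable_commands(switch_number, module_number, message_level=None):
    """Return the CLI lines from the steps above for one module.

    Hypothetical helper: it only formats text and does not talk to a device.
    """
    if message_level is not None and not 1 <= message_level <= 7:
        raise ValueError("message level must be between 1 and 7")
    command = f"hw-module switch {switch_number} module {module_number} logging onboard"
    if message_level is not None:
        command += f" message level {message_level}"
    return ["enable", "configure terminal", command, "end"]


for line in obfl_enable_commands(2, 1, message_level=3):
    print(line)
```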

Configuration Examples for OBFL

The most useful part of the OBFL feature is the information displayed by the show logging onboard module privileged EXEC command. This section provides the following examples of how to enable OBFL and display OBFL records:

Enabling OBFL Message Logging: Example

OBFL Message Log: Example

OBFL Component Uptime Report: Example

OBFL Report for a Specific Time: Example

Enabling OBFL Message Logging: Example

The following example shows how to configure OBFL message logging at level 3:

Router(config)# hw-module switch 2 module 1 logging onboard message level 3

OBFL Message Log: Example

The following example shows how to display the system messages that are being logged for module 2:

Router# show logging onboard module 2 message continuous

--------------------------------------------------------------------------------
ERROR MESSAGE CONTINUOUS INFORMATION
--------------------------------------------------------------------------------
MM/DD/YYYY HH:MM:SS Facility-Sev-Name
--------------------------------------------------------------------------------
03/06/2007 22:33:35 %SWITCH_IF-3-CAMERR : [chars], for VCI [dec] VPI [dec] in stdby data path check, status: [dec]
--------------------------------------------------------------------------------

OBFL Component Uptime Report: Example

The following example shows how to display a summary report for component uptimes for module 2:

Router# show logging onboard module 2 uptime

--------------------------------------------------------------------------------
UPTIME SUMMARY INFORMATION
--------------------------------------------------------------------------------
First customer power on : 03/06/2007 22:32:51
Total uptime            :   0 years   0 weeks   0 days   0 hours  35 minutes
Total downtime          :   0 years   0 weeks   0 days   0 hours   0 minutes
Number of resets        : 1
Number of slot changes  : 0
Current reset reason    : 0xA1
Current reset timestamp : 03/06/2007 22:31:34
Current slot            : 2
Current uptime          :   0 years   0 weeks   0 days   0 hours  35 minutes
--------------------------------------------------------------------------------
Reset  |        |
Reason | Count  |
--------------------------------------------------------------------------------
No historical data to display
--------------------------------------------------------------------------------

OBFL Report for a Specific Time: Example

The following example shows how to display continuous reports for all components during a specific time period:

Router# show logging onboard module 3 continuous start 15:01:57 1 Mar 2007 end 15:04:57 3 Mar 2007

PID: WS-X6748-GE-TX    , VID:    , SN: SAL09063B85

--------------------------------------------------------------------------------
UPTIME CONTINUOUS INFORMATION
--------------------------------------------------------------------------------
Time Stamp          | Reset  | Uptime 
MM/DD/YYYY HH:MM:SS | Reason | years weeks days hours  minutes 
--------------------------------------------------------------------------------
03/01/2007 15:01:57   0xA1      0     0     0    10     0 
03/03/2007 02:29:29   0xA1      0     0     0     5     0 
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
TEMPERATURE CONTINUOUS INFORMATION
--------------------------------------------------------------------------------
Sensor                            |   ID  | 
--------------------------------------------------------------------------------
MB-Out                              930201 
MB-In                               930202 
MB                                  930203 
MB                                  930204 
EARL-Out                            910201 
EARL-In                             910202 
SSA 1                               930301 
SSA 2                               930302 
JANUS 1                             930303 
JANUS 2                             930304 
GEMINI 1                            930305 
GEMINI 2                            930306 
-------------------------------------------------------------------------------
       Time Stamp   |Sensor Temperature 0C 
MM/DD/YYYY HH:MM:SS |  1    2    3    4    5    6    7    8    9   10   11   12
-------------------------------------------------------------------------------
03/01/2007 15:01:57   26   26   NA   NA   NA   NA    0    0    0    0    0    0 
03/01/2007 15:06:57   39   27   NA   NA   NA   NA   39   37   36   29   32   32 
03/01/2007 15:11:02   40   27   NA   NA   NA   NA   40   38   37   30   32   32 
03/01/2007 17:06:06   40   27   NA   NA   NA   NA   40   38   37   30   32   32 
03/01/2007 19:01:09   40   27   NA   NA   NA   NA   40   38   37   30   32   32 
03/03/2007 02:29:30   25   26   NA   NA   NA   NA    0    0    0    0    0    0 
03/03/2007 02:34:30   38   26   NA   NA   NA   NA   39   37   36   29   31   31 
03/03/2007 04:29:33   40   27   NA   NA   NA   NA   40   38   36   30   32   32 
03/03/2007 06:24:37   40   27   NA   NA   NA   NA   40   38   36   29   32   32 
03/03/2007 08:19:40   40   27   NA   NA   NA   NA   40   38   36   29   32   32 
03/03/2007 10:14:44   40   27   NA   NA   NA   NA   40   38   36   30   32   32 
03/03/2007 12:09:47   40   27   NA   NA   NA   NA   40   38   36   30   32   32 
03/03/2007 14:04:51   40   27   NA   NA   NA   NA   40   38   36   30   32   32 
-------------------------------------------------------------------------------
--------------------------------------------------------------------------------
CONTINUOUS INTERRUPT INFORMATION
--------------------------------------------------------------------------------
MM/DD/YYYY HH:MM:SS mmm | Name                             |  ID | Offset | Bit
--------------------------------------------------------------------------------
03/01/2007 15:01:59 350   Port-ASIC #0                         7   0x00E7     6
03/03/2007 02:29:34 650   Port-ASIC #0                         7   0x00E7     6
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
ERROR MESSAGE CONTINUOUS INFORMATION
--------------------------------------------------------------------------------
MM/DD/YYYY HH:MM:SS Facility-Sev-Name
--------------------------------------------------------------------------------
03/01/2007 15:02:15  %GOLD_OBFL-3-GOLD : Diagnostic OBFL: Diagnostic OBFL testing
03/03/2007 02:29:51  %GOLD_OBFL-3-GOLD : Diagnostic OBFL: Diagnostic OBFL testing
--------------------------------------------------------------------------------
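
The start and end arguments in the command above use an HH:MM:SS D Mon YYYY layout. The sketch below builds that command string from Python datetime values; the function names are hypothetical, the layout is inferred from the example command only, and the month names are listed explicitly so that the result does not depend on the system locale.

```python
from datetime import datetime

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def cli_time(when):
    """Format a datetime as 'HH:MM:SS D Mon YYYY', as in the example command."""
    return f"{when:%H:%M:%S} {when.day} {MONTHS[when.month - 1]} {when.year}"


def show_continuous_command(module, start, end):
    """Build the show command for a time window; formats text only."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    return (f"show logging onboard module {module} continuous "
            f"start {cli_time(start)} end {cli_time(end)}")


command = show_continuous_command(
    3, datetime(2007, 3, 1, 15, 1, 57), datetime(2007, 3, 3, 15, 4, 57)
)
print(command)
# show logging onboard module 3 continuous start 15:01:57 1 Mar 2007 end 15:04:57 3 Mar 2007
```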

Additional References

The following sections provide references related to the OBFL feature.

Related Documents

Related Topic
Document Title

Onboard Failure Logging for Cisco 12000 series routers running Cisco IOS XR Software v3.4

Onboard Failure Logging on Cisco IOS Software Release 3.4 feature module

Onboard Failure Logging for Cisco 12000 series routers running Cisco IOS Software Release 12.0(31)S

Onboard Failure Logging Software Release 12.0 feature module

Onboard Failure Logging for Catalyst 3750-E and 3560-E switches running Cisco IOS Software Release 12.2(35)SE2

"Using On-Board Failure Logging" section in the "Troubleshooting" chapter in the Catalyst 3750-E and 3560-E Switch Software Configuration Guide, 12.2(35)SE2

Catalyst 3750-E and 3560-E Switch Command Reference, 12.2(35)SE2

Onboard Failure Logging for the Cisco MDS 9000

"Using On-Board Failure Logging" section in the "Troubleshooting Tools and Methodology" chapter of the Cisco MDS 9000 Family Troubleshooting Guide


Standards

Standard
Title

None


MIBs

MIB
MIBs Link

None

To locate and download MIBs for selected platforms, Cisco IOS releases, and feature sets, use Cisco MIB Locator found at the following URL:

http://www.cisco.com/go/mibs


RFCs

RFC
Title

None


Technical Assistance

Description
Link

The Cisco Support website provides extensive online resources, including documentation and tools for troubleshooting and resolving technical issues with Cisco products and technologies.

To receive security and technical information about your products, you can subscribe to various services, such as the Product Alert Tool (accessed from Field Notices), the Cisco Technical Services Newsletter, and Really Simple Syndication (RSS) Feeds.

Access to most tools on the Cisco Support website requires a Cisco.com user ID and password.

http://www.cisco.com/techsupport


Command Reference

The following commands are introduced or modified in the feature or features documented in this module. For information about these commands, see the Cisco IOS Network Management Command Reference at http://www.cisco.com/en/US/docs/ios/netmgmt/command/reference/nm_book.html. For information about all Cisco IOS commands, go to the Command Lookup Tool at http://tools.cisco.com/Support/CLILookup or to the Cisco IOS Master Commands List.

clear logging onboard (Cat 6K)

copy logging onboard (Cat 6K)

hw-module logging onboard (Cat 6K)

show logging onboard (Cat 6K)

Feature Information for OBFL

Table 2 lists the release history for this feature.

Not all commands may be available in your Cisco IOS software release. For release information about a specific command, see the command reference documentation.

Use Cisco Feature Navigator to find information about platform support and software image support. Cisco Feature Navigator enables you to determine which Cisco IOS and Catalyst OS software images support a specific software release, feature set, or platform. To access Cisco Feature Navigator, go to http://www.cisco.com/go/cfn. An account on Cisco.com is not required.


Note Table 2 lists only the Cisco IOS software release that introduced support for a given feature in a given Cisco IOS software release train. Unless noted otherwise, subsequent releases of that Cisco IOS software release train also support that feature.


Table 2 Feature Information for OBFL

Feature Name
Releases
Feature Information

Onboard Failure Logging

12.2(33)SXH

The Onboard Failure Logging (OBFL) feature collects data such as operating temperatures, hardware uptime, interrupts, and other important events and messages from system hardware installed in a Cisco router or switch. The data is stored in nonvolatile memory and helps technical personnel diagnose hardware problems.

In Release 12.2(33)SXH, this feature was introduced on the Cisco Catalyst 6000 series switches.

The following commands were introduced for the Cisco Catalyst 6000 series switches by this feature: clear logging onboard (Cat 6K), copy logging onboard module (Cat 6K), hw-module logging onboard (Cat 6K), show logging onboard (Cat 6K).