Cisco 12012 Installation and Configuration Guide
Diagnostics

Running Diagnostics on the Cisco 12012



Field diagnostics are available for the Cisco 12012 GSR to help you isolate faulty hardware to the level of a field-replaceable unit (FRU) without disrupting the operation of the system. After you identify the faulty unit, you can replace it with a spare unit.

Field diagnostics are not designed to identify specific components within the router. They simply determine whether a particular card is operational or defective.

This chapter describes how to run diagnostics on the Cisco 12012 GSR in the following sections:

Diagnostic Test Overview

Using the diag Command

Diagnostic Testing Sequence

Loading and Running Diagnostics

Diagnostic Test Overview

There are more than a hundred diagnostic tests for line cards, including the following:

Processor tests

Memory tests

Component tests

Major data path tests

The field diagnostics software image is bundled with the Cisco IOS software and is downloaded from the route processor (RP) to the target card before testing.


Note   In Cisco IOS Release 12.0(21)S or 12.0(21)ST, or a later release of 12.0S or 12.0ST, the default download method changes from the MBus to the switch fabric. Obtaining test results takes about 1 minute over the switch fabric, compared to about 15 minutes over the MBus.


While diagnostics are running, the line card being tested is controlled by the diagnostic software. Diagnostics take the line card under test offline. The diagnostics affect just the line card being tested; the rest of the line cards remain online and continue to pass traffic normally.

Except for the tests on the clock and scheduler cards (CSCs) and the switch fabric cards (SFCs), which may temporarily drop throughput on those cards, the diagnostics do not affect system performance.

Diagnostic testing stops at the completion of all of the tests, when terminated by the user, or by default when an error is encountered. If multiple cards are specified for the test cycle, the diagnostics stop testing a card when it fails a test, but continue testing the remaining cards.


Note   You can use the diag slot coe command to force the continuation of tests, even after an error is encountered. This is not recommended for use on operational, business-critical routers.
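The stop rules described above can be modeled with a short Python sketch. This is an illustrative simulation only, not part of the diagnostics software: the function name run_test_cycle and its data layout are hypothetical, and the sketch processes cards one after another, whereas the router can test several cards concurrently.

```python
from typing import Dict, List, Tuple


def run_test_cycle(
    cards: Dict[int, List[bool]], continue_on_error: bool = False
) -> Dict[int, Tuple[int, bool]]:
    """Simulate the stop rules for a diagnostic test cycle.

    cards maps a slot number to a list of per-test outcomes (True = pass).
    Returns, per slot, (number of tests actually run, overall pass flag).
    """
    results = {}
    for slot, outcomes in cards.items():
        tests_run = 0
        passed = True
        for outcome in outcomes:
            tests_run += 1
            if not outcome:
                passed = False
                if not continue_on_error:
                    break  # default: stop testing this card at its first failure
        # A failure on one card never stops testing of the remaining cards.
        results[slot] = (tests_run, passed)
    return results


if __name__ == "__main__":
    cycle = {2: [True, True, True], 7: [True, False, True]}
    print(run_test_cycle(cycle))                          # default behavior
    print(run_test_cycle(cycle, continue_on_error=True))  # models the coe option
```

In the default case the card in slot 7 stops at its second test while the card in slot 2 runs to completion. With continue_on_error set, every test runs on both cards, but the card in slot 7 is still reported as failed.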


When testing is finished, a pass or fail message is displayed on the console and on the alphanumeric LED display of the card being tested.

Using the diag Command

The diagnostic test command, issued at the privileged EXEC mode prompt on the system console, takes the following form:

diag slot [halt] [previous] [mbus] [verbose] [wait] [coe]

where:

slot

Specifies which card cage slot to test. The diagnostic software determines the type of card in the slot and downloads the appropriate tests.

halt

(Optional) Stops the active diagnostic test.

previous

(Optional) Displays the most recent test results, stored in EEPROM, for the card in the slot specified by the slot parameter.

mbus (see footnote 1)

(Optional) Forces the route processor to download the diagnostics over the MBus.

verbose

(Optional) Turns on the status messaging capability of the diagnostics. The default is minimum messaging.

wait

(Optional) Stops the diagnostics from reloading the Cisco IOS image following the completion of diagnostic testing. The card must be ejected from the slot, reinstalled in the slot, and reconfigured manually.

coe (continue on error; see footnote 2)

(Optional) Continues testing even after a failed test.

Footnote 1: Using this option results in a 15-minute delay before test results are returned. This command option is available in Cisco IOS Release 12.0(21)S or 12.0(21)ST, or a later release of 12.0S or 12.0ST.

Footnote 2: Not recommended for use on operational, business-critical routers. This command option is available in Cisco IOS Release 12.0(21)S or 12.0(21)ST, or a later release of 12.0S or 12.0ST.
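For scripted use, the command syntax above can be assembled by a small helper. The following Python sketch is hypothetical (the name build_diag_command is not a Cisco tool). It only builds the command string, emits options in the order shown in the syntax line, and rejects keywords that are not listed there. Whether the router accepts options in another order is not covered by this chapter, so the sketch does not assume it.

```python
from typing import Iterable

# Option keywords in the order given by the command syntax in this section.
DIAG_OPTIONS = ("halt", "previous", "mbus", "verbose", "wait", "coe")


def build_diag_command(slot: int, options: Iterable[str] = ()) -> str:
    """Return a diag command line for the given slot and option keywords."""
    if slot < 0:
        raise ValueError("slot must be a non-negative integer")
    requested = set(options)
    unknown = requested - set(DIAG_OPTIONS)
    if unknown:
        raise ValueError(f"unknown diag option(s): {sorted(unknown)}")
    # Keep the documented order regardless of the order the caller used.
    ordered = [opt for opt in DIAG_OPTIONS if opt in requested]
    return " ".join(["diag", str(slot), *ordered])


if __name__ == "__main__":
    print(build_diag_command(2))
    print(build_diag_command(7, ["verbose"]))
```

The helper does not check the slot number against the chassis, because valid slot numbers depend on the installed hardware.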


To stop diagnostic testing at any time, enter the diag command with the halt option at the privileged EXEC mode prompt on the system console:

diag slot halt

Diagnostic Testing Sequence

When testing a card, the diagnostics perform the following operations in this sequence:

1 Halts the normal operation of the card.

The card is no longer available for network traffic.

2 Downloads a diagnostic image from the RP running Cisco IOS software to the line card before testing.

The Cisco IOS software image is removed from the line card DRAM and is replaced with the diagnostic software image for the duration of the tests.

3 Sends and receives messages across the MBus to and from the card being tested.

During the testing process, messages are passed from the line card under test to the RP. If the verbose option is turned on, interim messages listing the start and completion of each test are displayed at the console. If the verbose option is not specified (default), the console displays the minimum number of messages.

4 Displays pass or fail test results.

At the conclusion of the diagnostic tests, a pass or fail message is sent to the RP, which passes the message to the console and to the alphanumeric LED display on the line card being tested. The message is displayed on the alphanumeric display until the Cisco IOS image is booted following the completion of testing. The pass or fail message is also stored in Flash memory for later factory analysis.

5 Reloads the Cisco IOS software image.

If diagnostic testing was successful, and you do not specify the wait option, the Cisco IOS software image is loaded from the RP to the card under test, bringing it back online.
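The sequence above, including the condition under which the Cisco IOS image is reloaded, can be summarized in a brief Python sketch. The function name and phase labels are hypothetical and serve only to illustrate the order of operations described in this section.

```python
from typing import List


def diagnostic_sequence(tests_passed: bool, wait: bool = False) -> List[str]:
    """List the phases a card goes through, following the sequence above."""
    phases = [
        "halt normal operation",
        "download diagnostic image",
        "run tests and exchange messages with the RP",
        "display pass or fail result",
    ]
    # The Cisco IOS image is reloaded only after a successful run,
    # and only when the wait option was not specified.
    if tests_passed and not wait:
        phases.append("reload Cisco IOS image")
    return phases


if __name__ == "__main__":
    for step, phase in enumerate(diagnostic_sequence(tests_passed=True), start=1):
        print(step, phase)
```

When tests_passed is False or wait is True, the returned list ends at the pass or fail result, which corresponds to a card that stays offline until an operator intervenes.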

Loading and Running Diagnostics

This section describes how to load and run diagnostic tests on a card in the router and includes sample console messages. You must run diagnostic tests from the system console in privileged EXEC mode.

To load and run diagnostics on a card, follow these steps:


Step 1 From the EXEC prompt (Router>), enter enable to enter privileged EXEC mode:

Router> enable
Password:

Step 2 Enter the password assigned to the system.

The prompt changes to the privileged EXEC prompt:

Router#

Step 3 Determine the slot number of the card on which you want to run diagnostics.


Note   Although you can run diagnostics concurrently on up to three line cards, we recommend testing only one card at a time. Cards under test are taken offline and cannot pass traffic.


Step 4 Enter the diag command:

Router# diag slot 

The diagnostic tests are downloaded and run. Test status and administrative messages are returned to the system console. At the end of testing, a pass or fail message is displayed on the console. The number of messages displayed depends on whether you included the verbose option in the command.



Note   Field diagnostics run limited tests of the switch fabric when testing a line card. This provides a good method of troubleshooting switch fabric problems.


Diagnostic Examples

Several examples of diagnostic tests are given in the following sections:

Without verbose Option

With verbose Option

Failed Diagnostic

Without verbose Option

To see how the verbose option changes the messages from the diagnostics to the console, refer to the following examples.

In the first example, diagnostics are run on a line card installed in slot 2 in the card cage. The diagnostics are run without the verbose option set (minimum messaging).

The console displays a message sequence similar to the following, showing the progress of the diagnostic testing. In the following example message sequence, inserted comments describe the type of diagnostic activity indicated by the messages.

Router# diag 2
Running DIAG config check
Running Diags will halt ALL activity on the requested slot.
[confirm]
Router# <Return>
Launching a Field Diagnostic for slot 2
Downloading diagnostic tests to slot 2 (timeout set to 400 sec.)
Field Diag download COMPLETE for slot 2
FD 2> ****************************************************
FD 2> GSR Field Diagnostics V3.0
FD 2> Compiled by award on Tue Aug 3 15:58:13 PDT 2000
FD 2> view: award-bfr_112.FieldDiagRelease
FD 2> ****************************************************
FD 2> BFR_CARD_TYPE_OC48_1P_POS_TTM testing...
FD 2> running in slot 2 (73 tests)

Executing all diagnostic tests in slot 2

The messages in the lines shown above indicate that the diagnostics software checked the card type and status, determined that the card installed in slot 2 could run diagnostics, downloaded the diagnostic software image to the card, and gave it the command to run all diagnostic tests.

(total/indiv. timeout set to 600/220 sec.)

The message in the line shown above indicates the two timeout values set for diagnostics. The first timeout is set to 600 seconds, which is the maximum amount of time allowed for all diagnostic tests to run. The second timeout is set to 220 seconds, which is the maximum amount of time allowed for any one diagnostic test to run.

Field Diagnostic ****PASSED**** for slot 2

The message in the line shown above indicates that the diagnostic tests run on the card in slot 2 all passed.

Shutting down diags in slot 2

Board will reload

SLOT 2:%SYS-5-RESTART: System restarted --
Cisco Internetwork Operating System Software 
IOS (tm) GS Software (GSR-P-MZ), Released Version 12.0(n)GS
Copyright (c) 1986-2000 by cisco Systems, Inc.
Compiled Fri 17-Sep-00 17:58 by ...
Router#

The messages in the lines shown above indicate that the diagnostics software is automatically terminated and the line card is reloaded and restarted.
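If console output is captured to a file or a terminal log, the timeout values and the final result can be extracted with regular expressions. The following Python sketch is an assumption-laden illustration: the patterns are derived only from the sample messages in this chapter, the function name summarize_run is hypothetical, and message formats may differ in other software releases.

```python
import re
from typing import Any, Dict

# Patterns based solely on the sample console messages in this chapter.
TIMEOUT_RE = re.compile(r"total/indiv\. timeout set to (\d+)/(\d+) sec")
RESULT_RE = re.compile(
    r"Field Diagnostic:? \*{4}(PASSED|TEST FAILURE)\*{4} (?:for )?slot (\d+)"
)


def summarize_run(log: str) -> Dict[str, Any]:
    """Extract the timeouts and the final result from captured console text."""
    summary: Dict[str, Any] = {}
    timeouts = TIMEOUT_RE.search(log)
    if timeouts:
        summary["total_timeout_sec"] = int(timeouts.group(1))
        summary["test_timeout_sec"] = int(timeouts.group(2))
    result = RESULT_RE.search(log)
    if result:
        summary["passed"] = result.group(1) == "PASSED"
        summary["slot"] = int(result.group(2))
    return summary


SAMPLE = """\
Executing all diagnostic tests in slot 2
(total/indiv. timeout set to 600/220 sec.)
Field Diagnostic ****PASSED**** for slot 2
Shutting down diags in slot 2
"""

if __name__ == "__main__":
    print(summarize_run(SAMPLE))
```

Keys are present in the returned dictionary only when the corresponding message was found, so a missing "passed" key indicates that the capture ended before the result line.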

With verbose Option

Setting the verbose option changes the diagnostics message stream to the console. For example, running diagnostics on the line card in slot 2 with the verbose option set produces a message stream similar to the following (only a partial list of messages is shown).


Note   In Cisco IOS Release 12.0(21)S or 12.0(21)ST, or a later release of 12.0S or 12.0ST, this option displays the name of the test as testing progresses, and it displays "fatalError" when a failure is detected.


Router# diag 2 verbose
Running DIAG config check
Running Diags will halt ALL activity on the requested slot.
[confirm]
Router# <Return>
Launching a Field Diagnostic for slot 2
Downloading diagnostic tests to slot 2 (timeout set to 400 sec.)
Field Diag download COMPLETE for slot 2
FD 2> ****************************************************
FD 2> GSR Field Diagnostics V3.0
FD 2> Compiled by award on Tue Aug 3 15:58:13 PDT 2000
FD 2> view: award-bfr_112.FieldDiagRelease
FD 2> ****************************************************
FD 2> BFR_CARD_TYPE_OC48_1P_POS_TTM testing...
FD 2> running in slot 2 (73 tests)

Executing all diagnostic tests in slot 2
(total/indiv. timeout set to 600/220 sec.)
FD 2> Verbosity now (0x00000001) TESTSDISP

FDIAG_STAT_IN_PROGRESS: test #1 R5K Internal Cache
FDIAG_STAT_IN_PROGRESS: test #2 Burst Operations
FDIAG_STAT_IN_PROGRESS: test #3 Subblock Ordering
FDIAG_STAT_IN_PROGRESS: test #4 Dram Marching Pattern
FDIAG_STAT_IN_PROGRESS: test #5 Dram Datapins
FDIAG_STAT_IN_PROGRESS: test #6 Dram Busfloat
.
.
.
FDIAG_STAT_IN_PROGRESS: test #73 SDRAM Traffic
FDIAG_STAT_DONE

Field Diagnostic ****PASSED**** for slot 2

Field Diag eeprom values: run 0 fail mode 0 (PASS) slot 2
   last test failed was 0, error code 0
Shutting down diags in slot 2

Board will reload
SLOT 2:%SYS-5-RESTART: System restarted --
Cisco Internetwork Operating System Software 
IOS (tm) GS Software (GSR-P-MZ), Released Version 12.0(n)GS
Copyright (c) 1986-2000 by cisco Systems, Inc.
Compiled Fri 17-Sep-00 17:58 by ...
Router#

When you set the verbose option, most of the information returned by the diagnostic tests consists of status messages that indicate when each test starts. At the end of the diagnostic tests, a message indicates whether the card passed or failed the tests.

Failed Diagnostic

If a diagnostic test fails on a line card, testing halts at that test. The line card does not reload or come back online automatically. The following example shows a diagnostic message stream to the console for a line card located in slot 7. In the example, the card fails one of the diagnostic tests, which stops the diagnostic cycle at that test.

Router# diag 7 verbose
Running DIAG config check
Running Diags will halt ALL activity on the requested slot.
[confirm]
Router# <Return>
Launching a Field Diagnostic for slot 7
Downloading diagnostic tests to slot 7 (timeout set to 400 sec.)
Field Diag download COMPLETE for slot 7 
FD 7> ****************************************************
FD 7> GSR Field Diagnostics V3.0
FD 7> Compiled by award on Tue Aug 3 15:58:13 PDT 2000
FD 7> view: award-bfr_112.FieldDiagRelease
FD 7> ****************************************************
FD 7> BFR_CARD_TYPE_OC48_1P_POS testing...
FD 7> running in slot 7 (128 tests)

Executing all diagnostic tests in slot 7
(total/indiv. timeout set to 600/220 sec.)
FD 7> Verbosity now (0x00000001) TESTSDISP

FDIAG_STAT_IN_PROGRESS: test #1 R5K Internal Cache
FDIAG_STAT_IN_PROGRESS: test #2 Burst Operations
FDIAG_STAT_IN_PROGRESS: test #3 Subblock Ordering
.
.
.
FDIAG_STAT_IN_PROGRESS: test #21, error_code 5
Field Diagnostic: ****TEST FAILURE**** slot 7: last test run 21,
To Fabric SOP FIFO SRAM Memory, error 5
Field Diag eeprom values:run 0 fail mode 1 (TEST FAILURE) slot 7
   last test failed was 21, error code 5
Shutting down diags in slot 7
slot 7 done, will not reload automatically

Router#

Note   The DRAM is the only field-replaceable component on a line card; therefore, if a diagnostic test fails, you must replace the line card, which is the field-replaceable unit (FRU).
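To keep a record of failed cards, the "Field Diag eeprom values" message can be parsed into a structured record. The following Python sketch assumes the two-line format shown in the examples in this chapter. The class name EepromRecord, its field names, and the function parse_eeprom_values are hypothetical, and treating any status other than PASS as a failure is an assumption based on the two statuses that appear in the samples.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Pattern based on the two-line "Field Diag eeprom values" sample messages.
EEPROM_RE = re.compile(
    r"Field Diag eeprom values:\s*run (\d+) fail mode (\d+) \(([^)]+)\) slot (\d+)"
    r"\s+last test failed was (\d+), error code (\d+)"
)


@dataclass
class EepromRecord:
    run: int
    fail_mode: int
    status: str
    slot: int
    last_failed_test: int
    error_code: int

    @property
    def failed(self) -> bool:
        # Assumption: any status other than "PASS" indicates a failed card.
        return self.status != "PASS"


def parse_eeprom_values(log: str) -> Optional[EepromRecord]:
    """Return the first EEPROM record found in the log, or None."""
    match = EEPROM_RE.search(log)
    if match is None:
        return None
    run, mode, status, slot, test, code = match.groups()
    return EepromRecord(int(run), int(mode), status, int(slot), int(test), int(code))


SAMPLE = """\
Field Diag eeprom values:run 0 fail mode 1 (TEST FAILURE) slot 7
   last test failed was 21, error code 5
Shutting down diags in slot 7
"""

if __name__ == "__main__":
    print(parse_eeprom_values(SAMPLE))
```

A record whose failed property is true identifies a line card that, according to the note above, should be replaced as a unit.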