Cisco IOS XR Troubleshooting Guide for the Cisco CRS-1 Router
Collecting System Information


This chapter describes techniques that you can use to collect system information for troubleshooting routers using Cisco IOS XR software. It includes the following sections:

Capturing Logs

Using ping and traceroute

Using Debug Commands

Using Diagnostic Commands

Commands Used to Display Process and Thread Details

Capturing Logs

For information on collecting current system information, see the "Troubleshooting Techniques and Approaches" section in Chapter 1, "General Troubleshooting." Use the following commands to capture logs and collect system information:

show tech-support—Displays system information for Cisco Technical Support

show logging—Displays the contents of the logging buffers

show system verify—Displays system verification information

dumpplaneeeprom: Displays the serial number of the chassis. Run this command in ROM monitor (ROMMON) mode. The following example shows the output of the command:

rommon B2 > dumpplaneeeprom 
 
   
EEPORM data backplane
000000 ff 00 01 e0 ff ff ff ff ff ff ff ff ff ff ff ff ................ 
000010 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................ 
000020 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................ 
000030 ff ff ff ff ff ff 08 00 45 3a 2d 01 04 00 ff ff ........E:-..... 
000040 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................ 
000050 54 42 43 30 37 31 39 30 31 37 33 30 30 30 30 30 TBC0719017300000 
000060 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................ 
000070 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................ 
000080 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................ 
000090 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................ 
0000a0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................ 
0000b0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................ 
0000c0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................ 
0000d0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................ 
0000e0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................ 
0000f0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
 
   

The "TBC0719017300000" string at offset 000050 is the chassis serial number, which identifies the rack. This string should be present for every chassis. The number is burned in during manufacturing.
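When the dump is captured to a text file, the serial string can be picked out programmatically instead of by eye. The following Python sketch is an illustration only: the function name ascii_runs and the minimum-length threshold are assumptions made for this example, and the sample rows are copied from the output above.

```python
def ascii_runs(dump_text, min_len=8):
    """Return printable-ASCII strings of at least min_len bytes found in a hex dump."""
    data = bytearray()
    for line in dump_text.splitlines():
        fields = line.split()
        # A data row is an offset followed by 16 two-digit hex bytes.
        # Prompts, headings, and blank lines have fewer fields and are skipped.
        if len(fields) < 17:
            continue
        try:
            int(fields[0], 16)
            row = bytes(int(f, 16) for f in fields[1:17])
        except ValueError:
            continue
        data.extend(row)

    runs, current = [], bytearray()
    for byte in data:
        if 0x20 <= byte <= 0x7E:          # printable ASCII
            current.append(byte)
        else:
            if len(current) >= min_len:
                runs.append(current.decode("ascii"))
            current = bytearray()
    if len(current) >= min_len:
        runs.append(current.decode("ascii"))
    return runs


# Rows 000030 through 000060 of the example output above.
SAMPLE_DUMP = """\
EEPORM data backplane
000030 ff ff ff ff ff ff 08 00 45 3a 2d 01 04 00 ff ff ........E:-.....
000040 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
000050 54 42 43 30 37 31 39 30 31 37 33 30 30 30 30 30 TBC0719017300000
000060 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
"""

print(ascii_runs(SAMPLE_DUMP))
```

An empty result for a chassis would suggest that the string is missing from the backplane EEPROM.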

Using ping and traceroute

For details on the ping and traceroute commands, see the "Basic Troubleshooting Commands" section in Cisco IOS XR Getting Started Guide for the Cisco CRS-1 Router.

Using Debug Commands

For details on using debug commands, see Cisco IOS XR Using Debug Guide.

Using Diagnostic Commands

The system runs Cisco IOS XR diagnostic tests automatically to verify the control Ethernet and fabric data paths between nodes in the system. When the diagnostic tests pass, the integrity of the covered data paths is verified. If a diagnostic test fails, it indicates a bad data path, and the system alerts you to the problem.

The diagnostic tests generally test data paths between multiple nodes; therefore, you need to analyze error reports to narrow down the possible points of failure in a system. For example, if a diagnostic test fails on a fabric interface, the fabric cables in the path of failure are a primary suspect.

Each diagnostic test takes between 1 second and 1 minute to run.

This section contains the following additional topics related to diagnostics:

Online Diagnostics

Transient Condition when Standby RP Becomes Active

Offline Diagnostics—FDIAG RUNNING State

Additional Reference for Diagnostic Commands

Online Diagnostics

You can start and stop diagnostic tests on specific nodes while the node is online and processing traffic. It is important to run tests on both the active and standby RPs, because the standby RP can run more fabric diagnostic tests than the active RP.

Examples:

The following example shows a set of diagnostic tests on the active RP (0/RP0/CPU0).

RP/0/RP0/CPU0:router(admin)#diagnostic start location 0/RP0/CPU0 test non-disruptive 
 
   
Wed Sep  1 12:50:24.156 PDT
RP/0/RP0/CPU0:Sep  1 12:50:24.426 : online_diag_rp[351]: 
%DIAG-DIAG-6-TEST_SKIPPED_FROM_ACTIVE : RP 0/RP0/CPU0: ControlEthernetInactiveLinkTest 
cannot be executed from active node. 
RP/0/RP0/CPU0:Sep  1 12:50:24.426 : online_diag_rp[351]: 
%DIAG-DIAG-6-TEST_SKIPPED_FROM_ACTIVE : RP 0/RP0/CPU0: FabricDiagnosisTest cannot be 
executed from active node. 
RP/0/RP0/CPU0:Sep  1 12:50:24.428 : online_diag_rp[351]: 
%DIAG-DIAG-6-TEST_SKIPPED_FROM_ACTIVE : RP 0/RP0/CPU0: FabricMcastTest cannot be 
executed from active node. 
RP/0/RP0/CPU0:Sep  1 12:50:24.430 : online_diag_rp[351]: %DIAG-DIAG-6-TEST_RUNNING : 
RP 0/RP0/CPU0: Running ControlEthernetPingTest{ID=1} ... 
RP/0/RP0/CPU0:Sep  1 12:50:28.703 : online_diag_rp[351]: %DIAG-DIAG-6-TEST_OK : RP 
0/RP0/CPU0: ControlEthernetPingTest{ID=1} has completed successfully 
 
   
RP/0/RP0/CPU0:router(admin)#show diagnostic result location 0/RP0/CPU0 
 
   
Wed Sep  1 12:51:41.606 PDT
 
   
Current bootup diagnostic level for RP 0/RP0/CPU0: bypass
RP 0/RP0/CPU0: 
  Overall diagnostic result: MINOR ERROR
  Diagnostic level at card bootup: bypass
 
   
  Test results: (. = Pass, F = Fail, U = Untested)
 
   
  1  ) ControlEthernetPingTest ---------> .
  2  ) SelfPingOverFabric --------------> .
  3  ) FabricPingTest ------------------> .
  4  ) ControlEthernetInactiveLinkTest -> U
  5  ) RommonRevision ------------------> F
  6  ) FabricDiagnosisTest -------------> U
  7  ) FilesystemBasicDisk0 ------------> .
  8  ) FilesystemBasicDisk1 ------------> .
  9  ) FilesystemBasicHarddisk ---------> .
  10 ) ScratchRegisterTest: 
 
   
      Device  1  2  3  4
      ------------------
              .  .  .  . 
 
   
  11 ) FabricMcastTest -----------------> U
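The diagnostic log messages in the example above follow a regular pattern (severity, mnemonic, node, test name), so captured output can be summarized with a short script. The following Python sketch is one possible approach, not a Cisco tool: the function name parse_diag_messages and the regular expression are assumptions based only on the message layout shown in this example, and wrapped lines are rejoined before matching.

```python
import re

# Matches messages such as:
#   %DIAG-DIAG-6-TEST_OK : RP 0/RP0/CPU0: ControlEthernetPingTest{ID=1} has completed ...
# Pattern inferred from the sample output only; other message layouts are not covered.
DIAG_MSG = re.compile(r"%DIAG-DIAG-(\d)-(\w+) : RP (\S+): (?:Running )?(\w+)")


def parse_diag_messages(log_text):
    """Return (severity, mnemonic, node, test_name) tuples from captured log text."""
    # Captured output wraps long messages across lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [(int(sev), mnemonic, node, test)
            for sev, mnemonic, node, test in DIAG_MSG.findall(flat)]


# Two messages copied from the active RP example above.
SAMPLE_LOG = """\
RP/0/RP0/CPU0:Sep  1 12:50:24.426 : online_diag_rp[351]:
%DIAG-DIAG-6-TEST_SKIPPED_FROM_ACTIVE : RP 0/RP0/CPU0: FabricDiagnosisTest cannot be
executed from active node.
RP/0/RP0/CPU0:Sep  1 12:50:28.703 : online_diag_rp[351]: %DIAG-DIAG-6-TEST_OK : RP
0/RP0/CPU0: ControlEthernetPingTest{ID=1} has completed successfully
"""

for severity, mnemonic, node, test in parse_diag_messages(SAMPLE_LOG):
    print(severity, mnemonic, node, test)
```

Filtering the result on the mnemonic (for example, TEST_SKIPPED_FROM_ACTIVE) gives a quick list of the tests that still need to be run from the standby RP.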
 
   
 
   

The following example shows a set of diagnostic tests on the standby RP (0/RP1/CPU0).

RP/0/RP0/CPU0:router(admin)#diagnostic start location 0/RP1/CPU0 test non-disruptive 
 
   
Wed Sep  1 12:50:52.703 PDT
RP/0/RP1/CPU0:Sep  1 12:50:54.242 : online_diag_rp[351]: %DIAG-DIAG-6-TEST_RUNNING : 
RP 0/RP1/CPU0: Running ControlEthernetPingTest{ID=1} ... 
RP/0/RP1/CPU0:Sep  1 12:50:58.686 : online_diag_rp[351]: %DIAG-DIAG-6-TEST_OK : RP 
0/RP1/CPU0: ControlEthernetPingTest{ID=1} has completed successfully 
RP/0/RP1/CPU0:Sep  1 12:50:58.686 : online_diag_rp[351]: %DIAG-DIAG-6-TEST_RUNNING : 
RP 0/RP1/CPU0: Running SelfPingOverFabric{ID=2} ... 
RP/0/RP0/CPU0:Sep  1 12:50:58.809 : online_diag_rp[351]: %DIAG-DIAG-6-TEST_OK : RP 
0/RP0/CPU0: SelfPingOverFabric{ID=2} has completed successfully 
RP/0/RP0/CPU0:Sep  1 12:50:58.809 : online_diag_rp[351]: %DIAG-DIAG-6-TEST_RUNNING : 
RP 0/RP0/CPU0: Running FabricPingTest{ID=3} ... 
RP/0/RP0/CPU0:Sep  1 12:51:00.672 : online_diag_rp[351]: %DIAG-DIAG-6-TEST_OK : RP 
0/RP0/CPU0: FabricPingTest{ID=3} has completed successfully 
RP/0/RP0/CPU0:Sep  1 12:51:00.672 : online_diag_rp[351]: %DIAG-DIAG-6-TEST_RUNNING : 
RP 0/RP0/CPU0: Running FilesystemBasicDisk0{ID=7} ... 
RP/0/RP0/CPU0:Sep  1 12:51:00.697 : online_diag_rp[351]: %DIAG-DIAG-6-TEST_OK : RP 
0/RP0/CPU0: FilesystemBasicDisk0{ID=7} has completed successfully 
RP/0/RP0/CPU0:Sep  1 12:51:00.697 : online_diag_rp[351]: %DIAG-DIAG-6-TEST_RUNNING : 
RP 0/RP0/CPU0: Running FilesystemBasicDisk1{ID=8} ... 
RP/0/RP0/CPU0:Sep  1 12:51:00.749 : online_diag_rp[351]: %DIAG-DIAG-6-TEST_OK : RP 
0/RP0/CPU0: FilesystemBasicDisk1{ID=8} has completed successfully 
RP/0/RP0/CPU0:Sep  1 12:51:00.749 : online_diag_rp[351]: %DIAG-DIAG-6-TEST_RUNNING : 
RP 0/RP0/CPU0: Running FilesystemBasicHarddisk{ID=9} ... 
RP/0/RP0/CPU0:Sep  1 12:51:01.796 : online_diag_rp[351]: %DIAG-DIAG-6-TEST_OK : RP 
0/RP0/CPU0: FilesystemBasicHarddisk{ID=9} has completed successfully 
RP/0/RP0/CPU0:Sep  1 12:51:01.796 : online_diag_rp[351]: %DIAG-DIAG-6-TEST_RUNNING : 
RP 0/RP0/CPU0: Running ScratchRegisterTest{ID=10} ... 
RP/0/RP0/CPU0:Sep  1 12:51:02.799 : online_diag_rp[351]: %DIAG-DIAG-6-TEST_OK : RP 
0/RP0/CPU0: ScratchRegisterTest{ID=10} has completed successfully 
RP/0/RP0/CPU0:Sep  1 12:51:02.799 : online_diag_rp[351]: %DIAG-DIAG-6-TEST_RUNNING : 
RP 0/RP0/CPU0: Running RommonRevision{ID=5} ... 
RP/0/RP0/CPU0:Sep  1 12:51:02.800 : online_diag_rp[351]: %DIAG-DIAG-3-TEST_FAIL : RP 
0/RP0/CPU0: RommonRevision{ID=5} has failed. Error code = 0x1 (DIAG_FAILURE) 
 
   
 
   
RP/0/RP0/CPU0:router(admin)#show diagnostic result location 0/RP1/CPU0 
Wed Sep  1 13:03:28.617 PDT
 
   
Current bootup diagnostic level for RP 0/RP1/CPU0: bypass
RP 0/RP1/CPU0: 
  Overall diagnostic result: MINOR ERROR
  Diagnostic level at card bootup: bypass
 
   
  Test results: (. = Pass, F = Fail, U = Untested)
 
   
  1  ) ControlEthernetPingTest ---------> .
  2  ) SelfPingOverFabric --------------> .
  3  ) FabricPingTest ------------------> .
  4  ) ControlEthernetInactiveLinkTest -> .
  5  ) RommonRevision ------------------> F
  6  ) FabricDiagnosisTest -------------> .
  7  ) FilesystemBasicDisk0 ------------> .
  8  ) FilesystemBasicDisk1 ------------> .
  9  ) FilesystemBasicHarddisk ---------> .
  10 ) ScratchRegisterTest: 
 
   
      Device  1  2  3  4
      ------------------
              .  .  .  . 
 
   
  11 ) FabricMcastTest -----------------> .
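The result tables in the two examples above use one line per test with a ., F, or U marker, which makes them straightforward to compare across nodes. The following Python sketch shows one way to extract the failed and untested tests from captured show diagnostic result output. The names parse_results and STATUS are assumptions for this illustration, the per-device ScratchRegisterTest rows are deliberately ignored, and the sample is an excerpt of the active RP output.

```python
import re

# One result line looks like: "  5  ) RommonRevision ------------------> F"
RESULT_LINE = re.compile(r"^\s*\d+\s*\)\s*(\w+)\s*-+>\s*([.FU])\s*$")
STATUS = {".": "pass", "F": "fail", "U": "untested"}


def parse_results(output):
    """Map test name to 'pass', 'fail', or 'untested' for single-line results."""
    results = {}
    for line in output.splitlines():
        match = RESULT_LINE.match(line)
        if match:
            name, mark = match.groups()
            results[name] = STATUS[mark]
    return results


# Excerpt of the active RP (0/RP0/CPU0) result table above.
ACTIVE_RP = """\
  1  ) ControlEthernetPingTest ---------> .
  4  ) ControlEthernetInactiveLinkTest -> U
  5  ) RommonRevision ------------------> F
  6  ) FabricDiagnosisTest -------------> U
  10 ) ScratchRegisterTest:
  11 ) FabricMcastTest -----------------> U
"""

results = parse_results(ACTIVE_RP)
problems = {name: status for name, status in results.items() if status != "pass"}
print(problems)
```

Running the same function on the standby RP output and comparing the two dictionaries highlights the tests that were untested on the active RP but executed on the standby RP.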
 
   
 
   

Transient Condition when Standby RP Becomes Active

If online diagnostics are performed within five minutes of the standby RP becoming active, some test cases will be skipped. Wait at least five minutes after the standby RP is ready before performing the online diagnostic test. If your system is set to perform diagnostic checks automatically, it might skip some tests during this five-minute period. Therefore, you should perform these tests manually after the standby RP has been active for at least five minutes.

To run a specified on-demand diagnostic test or series of tests, use the diagnostic start location command.

Examples:

RP/0/RP0/CPU0:router(admin)# diagnostic start location 0/RP1/CPU0 test 1
 
   
RP/0/RP0/CPU0:router(admin)# diagnostic stop location 0/RP1/CPU0 
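If on-demand diagnostics are launched from a script, the five-minute guard described in this section can be applied before the diagnostic start command is issued. The following Python sketch is a minimal illustration: the names SETTLE_TIME and remaining_wait are hypothetical, and the switchover timestamp is assumed to be known to the caller (for example, taken from the system log).

```python
from datetime import datetime, timedelta

# Guard interval taken from the text above: wait five minutes after switchover.
SETTLE_TIME = timedelta(minutes=5)


def remaining_wait(became_active_at, now):
    """Return how much longer to wait before starting online diagnostics."""
    elapsed = now - became_active_at
    return max(SETTLE_TIME - elapsed, timedelta(0))


# Hypothetical timestamps for illustration only.
switchover = datetime(2010, 9, 1, 12, 50, 0)
print(remaining_wait(switchover, datetime(2010, 9, 1, 12, 52, 0)))  # 0:03:00
print(remaining_wait(switchover, datetime(2010, 9, 1, 12, 56, 0)))  # 0:00:00
```

A zero result means the window has passed and no test cases should be skipped because of the switchover.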
 
   

Offline Diagnostics—FDIAG RUNNING State

Load the offline diagnostics with the diagnostic load location node-id command in admin mode. The specified node remains in the "FDIAG RUNNING" state until you unload diagnostics with the diagnostic unload location node-id command, also in admin mode.


Note In the "FDIAG RUNNING" state, the specified node is offline and cannot process traffic.


While a node is in the "FDIAG RUNNING" state, tests run in response to either the optional autostart keyword of the diagnostic load location node-id command or the diagnostic start location node-id command. When an individual test completes, a message is printed and the results are updated. You can read the results using the show diagnostic result location node-id command. After a test completes, you can invoke a new diagnostic start location node-id command, because the card remains in the "FDIAG RUNNING" state until it is explicitly unloaded using the diagnostic unload location node-id command.

Additional Reference for Diagnostic Commands

For details on diagnostics commands and available tests, see Cisco IOS XR Diagnostics at the Configuration Guide site:
http://www.cisco.com/en/US/products/ps5763/products_installation_and_configuration_guides_list.html

Commands Used to Display Process and Thread Details

For details on processes and threads, see the "Understanding Processes and Threads" section in Cisco IOS XR Getting Started Guide for the Cisco CRS-1 Router.