Cisco 12000 Series Router SIP and SPA Hardware Installation 12.0(33)S
Troubleshooting the Installation


Release 12.0(33)S, OL-8831-01, Rev. G9

This chapter describes how to troubleshoot the installation of SIPs and SPAs on the Cisco 12000 series router. This chapter contains the following sections:

Using show Commands to Check Status

Advanced SIP Troubleshooting

SIP Diagnostics

Packing a SIP for Shipment

Packing a SPA for Shipment

Using show Commands to Check Status

Each Cisco 12000 Series Router SIP maintains information about its configuration, traffic, errors, and so on. You can display this information by using the following show commands.

Using the show version Command

Use the show version command to display the configuration of the router hardware (the number of each line card type installed), the Cisco IOS software release, the names and sources of configuration files, and the boot images.

Using the show gsr Command

Use the show gsr command to display information about the hardware modules installed in the Cisco 12000 Series Internet Router.

Using the show interfaces Command

The following commands display information about the router interfaces: show interfaces, show interfaces pos slot/subslot/port, show interfaces serial slot/subslot/port, and similar commands for other SPA interface types.

Using the show running-config Command

Use the show running-config command to display the currently running configuration in RAM.

Advanced SIP Troubleshooting

This section provides advanced troubleshooting information in the event of a SIP failure. It also provides pointers for identifying whether or not the failure is hardware related. This section does not include any software-related failures, except for those that are often mistaken for hardware failures.


Note This section assumes that you possess basic proficiency in the use of Cisco IOS software commands.


By reading this section and by following the troubleshooting steps, you should be able to determine the nature of the problems you are having with your SIP. The first step is to identify the cause of the SIP failure or console errors that you are seeing. To discover which card may be at fault, it is essential to collect the output from the following commands:

show context summary

show logging

show logging summary

show logging onboard

show diag

show context slot slot

Along with these show commands, you should also gather the following information:

Console Logs and Syslog Information—This information is crucial if multiple symptoms are occurring. If the router is configured to send logs to a Syslog server, you may see some information on what has occurred. For console logs, it is best to be directly connected to the router on the console port with logging enabled.

Additional Data—The show tech-support command is a compilation of many different commands, including show version, show running-config, and show stacks. This information is required when working on issues with the Cisco Technical Assistance Center (TAC).


Note It is important to collect the show tech-support data before doing a reload or power cycle. Failure to do so can cause all information about the problem to be lost.



Note Output from these commands will vary slightly depending on which SIP you are using, but the basic information will be the same.


Output Examples

The following are examples of system output that you may see if your Cisco 12000 series router SIP fails:

show context summary Output

show logging Output

show logging onboard Output

show diag slot Output

show context slot Output

show context summary Output

Router# show context summary
CRASH INFO SUMMARY
Slot 0 : 0 crashes
Slot 1 : 1 crashes
1 - crash at 10:36:20 UTC Wed Dec 19 2001
Slot 2 : 0 crashes
Slot 3 : 0 crashes
Slot 4 : 0 crashes
Slot 5 : 0 crashes
Slot 6 : 0 crashes
(remainder of output omitted)
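
The crash summary above can be scanned automatically once it has been captured as text. The following is a minimal sketch, assuming the output has been saved as a string; the function name crashed_slots and the sample text are illustrative and are not part of Cisco IOS software.

```python
import re

# Matches summary lines such as "Slot 1 : 1 crashes".
SLOT_RE = re.compile(r"^Slot\s+(\d+)\s*:\s*(\d+)\s+crash(?:es)?", re.IGNORECASE)


def crashed_slots(summary_text):
    """Return {slot: crash_count} for every slot reporting at least one crash."""
    result = {}
    for line in summary_text.splitlines():
        match = SLOT_RE.match(line.strip())
        if match is None:
            continue  # header, timestamp, or other non-slot line
        slot, count = int(match.group(1)), int(match.group(2))
        if count > 0:
            result[slot] = count
    return result


# Hypothetical captured text, shortened from the example output.
SAMPLE = """\
CRASH INFO SUMMARY
Slot 0 : 0 crashes
Slot 1 : 1 crashes
Slot 2 : 0 crashes
"""

print(crashed_slots(SAMPLE))  # {1: 1}
```

Any slot returned by this helper is a candidate for the show context slot slot command.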

show logging Output

Router# show logging
Syslog logging: enabled (2 messages dropped, 0 messages rate-limited, 0 flushes,
0 overruns)
Console logging: level debugging, 24112 messages logged
Monitor logging: level debugging, 0 messages logged
Buffer logging: level debugging, 24411 messages logged
Logging Exception size (4096 bytes)
Trap logging: level informational, 24452 message lines logged
5d16h: %LCINFO-3-CRASH: Line card in slot 1 crashed
5d16h: %GRP-4-RSTSLOT: Resetting the card in the slot: 1,Event: 38
5d16h: %IPCGRP-3-CMDOP: IPC command 3
5d16h: %CLNS-5-ADJCHANGE: ISIS: Adjacency to malachim2 (GigabitEthernet1/0) Up,
n8 (slot1/0): linecard is disabled
-Traceback= 602ABCA8 602AD8B8 602B350C 602B3998 6034312C 60342290 601A2BC4 601A2BB0
5d16h: %LINK-5-CHANGED: Interface GigabitEthernet1/0, changed state to
administratively down
5d16h: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0,
changed state to down
5d16h: %GRP-3-CARVE_INFO: Setting mtu above 8192 may reduce available buffers
on Slot: 1.
SLOT 1:00:00:09: %SYS-5-RESTART: System restarted --
(remainder of output omitted)

show logging onboard Output

The show logging onboard command can be used on a specific slot or on the router as a whole.

RouterA# show logging onboard slot 3
[using 329 of 32768 bytes]
Boot location #0: slot 3 in 'Test_2'
Location #0 runtime: 13 weeks 13h 00m (inexact)
Temperature after last boot in location #0: inlet 27 C, hotpoint 37 C
Boot location #1: slot 2 in 'RouterA'
Location #1 runtime: 5 weeks 07h 52m (inexact)
Temperature after last boot in location #1: inlet 27 C, hotpoint 37 C
<=== Crash at Aug 08 2004 11:10:37 ===>
<===End Crash ===>

Router# show logging onboard
MAIN: 800-2427-03 rev A0, S/N CAB0549LRMK
Cumulative runtime: 20h 00m (inexact)

Use the show logging onboard command to list information about a specific parameter. Type options are shown in Table 6-1.

Table 6-1 Type Options for show logging onboard Command

Type           Description
boot           Boot record
clear          Clear record
crash          Crash record
environment    Environmental error record
mem-errors     Memory error record
runtime        Runtime count


Example output of the show logging onboard command follows:

RouterA# show logging onboard slot 2 boot
Boot location #0: slot 8 in 'Test_1'
Location #0 runtime: 13 weeks 13h 00m (inexact)
Boot location #1: slot 2 in 'Test_2'
Temperature after last boot in location #0: inlet 30 C, hotpoint 39 C
Temperature after last boot in location #0: inlet 31 C, hotpoint 40 C

show diag slot Output

Router# show diag 1
SLOT 1 (RP/LC 1 ): 3 Port Gigabit Ethernet
MAIN: type 68, 800-6376-01 rev E0 dev 0
HW config: 0x00 SW key: 00-00-00
PCA: 73-4775-02 rev E0 ver 2
HW version 2.0 S/N CAB0450G8FX
MBUS: Embedded Agent
Test hist: 0x00 RMA#: 00-00-00 RMA hist: 0x00
DIAG: Test count: 0x00000001 Test results: 0x00000000
FRU: Linecard/Module: 3GE-GBIC-SC=
Route Memory: MEM-GRP/LC-64=
Packet Memory: MEM-LC1-PKT-256=
L3 Engine: 2 - Backbone OC48 (2.5 Gbps)
MBUS Agent Software version 01.46 (RAM) (ROM version is 02.10)
Using CAN Bus A
ROM Monitor version 10.06
Fabric Downloader version used 05.01 (ROM version is 05.01)
Primary clock is CSC 0 Board is analyzed
Board State is Line Card Enabled (IOS RUN )
Insertion time: 00:00:10 (5d16h ago)
DRAM size: 67108864 bytes
FrFab SDRAM size: 134217728 bytes, SDRAM pagesize: 8192 bytes
ToFab SDRAM size: 134217728 bytes, SDRAM pagesize: 8192 bytes
1 crash since restart
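
Two fields in the show diag output are usually the first ones checked after a failure: the board state and the crash counter. The following is a minimal sketch of extracting them from captured text. The helper names board_state and crash_count are hypothetical, and the patterns are based only on the example output in this section, so other cards or releases may format these lines differently.

```python
import re

# "Board State is Line Card Enabled (IOS RUN )" -> ("Line Card Enabled", "IOS RUN")
BOARD_STATE_RE = re.compile(r"Board State is (.+?)\s*\(\s*(.*?)\s*\)")
# "1 crash since restart" -> 1
CRASH_COUNT_RE = re.compile(r"(\d+) crash(?:es)? since restart")


def board_state(diag_text):
    """Return (state, led_label), or None if no Board State line is found."""
    match = BOARD_STATE_RE.search(diag_text)
    return (match.group(1), match.group(2)) if match else None


def crash_count(diag_text):
    """Return the reported crash count, or None if the line is absent."""
    match = CRASH_COUNT_RE.search(diag_text)
    return int(match.group(1)) if match else None


# Hypothetical captured text, shortened from the example output.
SAMPLE = """\
Board State is Line Card Enabled (IOS RUN )
Insertion time: 00:00:10 (5d16h ago)
1 crash since restart
"""

print(board_state(SAMPLE))  # ('Line Card Enabled', 'IOS RUN')
print(crash_count(SAMPLE))  # 1
```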

show context slot Output

Router# show context slot 2
CRASH INFO: Slot 2, Index 1, Crash at 12:24:22 MET Wed Nov 28 2001
VERSION:
GS Software (GLC1-LC-M), Version 12.0(18)S1, EARLY DEPLOYMENT RELEASE SOFTWARE (fc1)
TAC Support: http://www.cisco.com/tac
Compiled Fri 07-Sep-01 20:13 by nmasa
Card Type: 3 Port Gigabit Ethernet, S/N
System exception: SIG=23, code=0x24, context=0x4103FE84
System restarted by a Software forced crash
STACK TRACE:
-Traceback= 400BEB08 40599554 4004FB64 4005B814 400A1694 400A1680
CONTEXT:
$0 : 00000000, AT : 41040000, v0 : 00000032, v1 : 4103FC00
a0 : 4005B0A4, a1 : 41400A20, a2 : 00000000, a3 : 00000000
t0 : 41D75220, t1 : 8000D510, t2 : 00000001, t3 : FFFF00FF
t4 : 400C2670, t5 : 00040000, t6 : 00000000, t7 : 4150A398
s0 : 0000003C, s1 : 00000036, s2 : 4103C4D0, s3 : 41D7EC60
s4 : 00000000, s5 : 00000001, s6 : 41027040, s7 : 00000000
t8 : 41A767B8, t9 : 00000000, k0 : 415ACE20, k1 : 400C4260
GP : 40F0DD00, SP : 41D7EC48, s8 : 4102D120, ra : 40599554
EPC : 0x400BEB08, SREG : 0x3400BF03, Cause : 0x00000024
ErrorEPC : 0x400C6698, BadVaddr : 0xFFBFFFFB
-Process Traceback= No Extra Traceback
SLOT 2:00:00:09: %SYS-5-RESTART: System restarted --
(remainder of output omitted)

The type of failure that has occurred in the show context slot 2 example is identified by the SIG= value in the System exception line (SIG=23 in this example). The three most common types of SIP failures are:

Software Forced Crash (SIG=23)

Bus Error (SIG=10)

Cache Parity Exception (SIG=20)

In the example above, the SIP has failed and has caused a reload because of a software forced crash exception. Once you have determined the cause and collected the necessary output, you can check for any caveats in your Cisco IOS software release using the Bug Toolkit (available to registered Cisco.com users only).
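
The mapping from SIG= value to failure type can be expressed as a small lookup. The following is a minimal sketch, assuming the show context output has been captured as text; the name classify_crash is illustrative, and only the three signal values listed in this section are mapped.

```python
import re

# The three most common SIP failure types named in this section.
SIG_TYPES = {
    23: "Software Forced Crash",
    10: "Bus Error",
    20: "Cache Parity Exception",
}

SIG_RE = re.compile(r"SIG=(\d+)")


def classify_crash(context_text):
    """Return (sig, description), or None if no SIG= value is present."""
    match = SIG_RE.search(context_text)
    if match is None:
        return None
    sig = int(match.group(1))
    # Any other signal is reported without interpretation.
    return sig, SIG_TYPES.get(sig, "Other (not one of the three common types)")


line = "System exception: SIG=23, code=0x24, context=0x4103FE84"
print(classify_crash(line))  # (23, 'Software Forced Crash')
```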

Checking the Current Status of the SIP

Once you have determined whether the problems are caused by system errors in the log or by an actual crash, it is important to check the current status of the SIP to see if it has recovered from the failure. The status of individual SIPs can be identified by using the show led command.

show led Output

Router# show led
SLOT 1 : RUN IOS
SLOT 6 : DNLD FABL
SLOT 7 : RP ACTV
SLOT 10 : RUN IOS
SLOT 11 : RUN IOS
SLOT 13 : RUN IOS
SLOT 14 : RUN IOS


Note The LED label may appear reversed in the show led command output. For example, IOS RUN may be displayed as RUN IOS.


If the show led command on the SIP displays anything other than IOS RUN, or the RP is neither the active Master/Primary nor the Slave/Secondary, there is a problem and the SIP has not fully loaded correctly. Before replacing the SIP, try fixing the problem by following these steps:


Step 1 Reload the microcode using the global configuration microcode reload slot command.

Step 2 Reload the SIP using the hw-module slot reload command. This causes the SIP to reset and download the MBus and fabric downloader software modules before attempting to download the Cisco IOS software.

or

Reset the SIP manually. This may rule out any problems that are caused by a bad connection to the MBus or switching fabric.
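
The show led check described above can be automated when many slots are populated. The following is a minimal sketch, assuming the output has been captured as text. It accepts both IOS RUN and RUN IOS because the label may appear reversed, and it skips any slot whose label contains RP on the assumption that such slots hold route processors, which are assessed separately. The function name is hypothetical.

```python
import re

LED_RE = re.compile(r"^SLOT\s+(\d+)\s*:\s*(.+?)\s*$")
# Both word orders are accepted because the LED label may appear reversed.
HEALTHY = {"IOS RUN", "RUN IOS"}


def slots_needing_attention(led_text):
    """Return {slot: label} for line cards or SIPs not showing IOS RUN."""
    suspects = {}
    for line in led_text.splitlines():
        match = LED_RE.match(line.strip())
        if match is None:
            continue
        slot, words = int(match.group(1)), match.group(2).split()
        if "RP" in words:
            continue  # assumption: route processor slot, checked separately
        label = " ".join(words)
        if label not in HEALTHY:
            suspects[slot] = label
    return suspects


# Captured text matching the show led example in this section.
SAMPLE = """\
SLOT 1 : RUN IOS
SLOT 6 : DNLD FABL
SLOT 7 : RP ACTV
SLOT 10 : RUN IOS
"""

print(slots_needing_attention(SAMPLE))  # {6: 'DNLD FABL'}
```

A slot that appears in the result only briefly may still be loading, so it is reasonable to repeat the check before reloading the microcode or the SIP.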


Fabric Ping Failure

Fabric ping failures occur when either a SIP or the secondary RP fails to respond to a fabric ping request from the primary RP over the switch fabric. Such failures are a problem symptom that should be investigated. They are indicated by the following error messages:

%GRP-3-FABRIC_UNI: Unicast send timed out (1)
%GRP-3-COREDUMP: Core dump incident on slot 1, error: Fabric ping failure
%LCINFO-3-CRASH: Line card in slot 1 crashed

You can find more information about this issue on Cisco.com in the Troubleshooting Fabric Ping Timeouts and Failures on the Cisco 12000 Series Internet Router document.
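
System messages such as those above follow the %FACILITY-SEVERITY-MNEMONIC: text layout, which makes them straightforward to filter in a saved log. The following is a minimal sketch that splits a message into its parts and collects the slots named in fabric ping core dump messages. The function names are hypothetical, and the matching relies only on the message text shown in this section.

```python
import re

# %FACILITY-SEVERITY-MNEMONIC: message text
MSG_RE = re.compile(r"%([A-Z0-9_]+)-(\d)-([A-Z0-9_]+):\s*(.*)")
SLOT_RE = re.compile(r"slot (\d+)")


def parse_message(line):
    """Split one system message into its fields, or return None."""
    match = MSG_RE.search(line)
    if match is None:
        return None
    facility, severity, mnemonic, text = match.groups()
    return {"facility": facility, "severity": int(severity),
            "mnemonic": mnemonic, "text": text}


def fabric_ping_failure_slots(log_text):
    """Return the set of slots named in 'Fabric ping failure' core dumps."""
    slots = set()
    for line in log_text.splitlines():
        msg = parse_message(line)
        if msg is None or msg["mnemonic"] != "COREDUMP":
            continue
        if "Fabric ping failure" not in msg["text"]:
            continue
        slot = SLOT_RE.search(msg["text"])
        if slot:
            slots.add(int(slot.group(1)))
    return slots


SAMPLE = """\
%GRP-3-FABRIC_UNI: Unicast send timed out (1)
%GRP-3-COREDUMP: Core dump incident on slot 1, error: Fabric ping failure
%LCINFO-3-CRASH: Line card in slot 1 crashed
"""

print(fabric_ping_failure_slots(SAMPLE))  # {1}
```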

Error Messages

If you receive any error message related to a SIP, you can use the Error Message Decoder Tool (on Cisco.com) to find the meaning of this error message. Some errors point to a hardware issue, while others indicate a Cisco IOS software caveat or a hardware issue on another part of the router. This document does not cover all these messages.


Note Some messages related to Cisco Express Forwarding (CEF) and Inter-Process Communication (IPC) are explained on Cisco.com in the Troubleshooting CEF-Related Error Messages document.


FPGA Error Messages

If the SIP does not boot and you receive an error message indicating that there is a problem with the Field-Programmable Gate Array (FPGA) image, or if the show led command display remains frozen in the IOS STRT state, you need to upgrade the FPGA image using the update-fpga option in the diag command.


Note The diag command and the update-fpga option are documented in the Field Diagnostics for the Cisco 12000 Series Internet Router document.

When the Cisco IOS image boots, it verifies that a compatible FPGA image is running on the router. The major version number of the FPGA image must be the same as that expected by the Cisco IOS image; the minor version number on the FPGA image must be the same as or greater than the minor version number expected by the Cisco IOS image. For example, if the Cisco IOS image expects a minimum FPGA image of 03.02, the software will verify that the actual major version number of the FPGA image in the SIP bootflash is 03, and that the minor version number is 02 or above.
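
The compatibility rule can be written as a short check: the major numbers must match exactly, and the actual minor number must be equal to or greater than the expected one. The following is a minimal sketch of that rule only. It assumes versions are written as major.minor strings such as 03.02; the function names are illustrative and do not reflect how Cisco IOS software implements the check.

```python
def parse_fpga_version(version):
    """Split a 'major.minor' string such as '03.02' into two integers."""
    major, minor = version.split(".")
    return int(major), int(minor)


def fpga_is_compatible(actual, expected_minimum):
    """Apply the rule: same major number, minor number equal or greater."""
    actual_major, actual_minor = parse_fpga_version(actual)
    expected_major, expected_minor = parse_fpga_version(expected_minimum)
    return actual_major == expected_major and actual_minor >= expected_minor


print(fpga_is_compatible("03.02", "03.02"))  # True: exact match
print(fpga_is_compatible("03.05", "03.02"))  # True: newer minor version
print(fpga_is_compatible("03.01", "03.02"))  # False: minor version too old
print(fpga_is_compatible("04.02", "03.02"))  # False: major version differs
```

Note that a higher major number does not satisfy the rule; only the minor number is allowed to exceed the expected value.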


Example error messages indicating an FPGA problem appear as follows:

Error Message    No FPGA image available for slot0. Please run field diagnostics image 
on slot0 to upgrade the FPGA image.

Explanation    There is currently no valid FPGA image in the bootflash of the SIP. You must load a valid FPGA image to the SIP bootflash.

Error Message    FPGA image not appropriate or corrupted for slot0. Please run field 
diagnostics on slot0 to upgrade the FPGA image.

Explanation    The FPGA image currently loaded in the SIP bootflash is not compatible with the Cisco IOS software release currently running on the router or is corrupted. Upgrade the FPGA image to the correct version.


Note Do not confuse the SIP bootflash with the route processor (RP) bootflash. FPGA images are loaded only to the SIP bootflash.


SIP Diagnostics


Note Output from this procedure will vary slightly depending on which SIP you are using, but the basic information will be the same.


SIP field diagnostic software is designed to identify any faulty SIP within a Cisco 12000 series router. Before Cisco IOS Release 12.0(22)S, the field diagnostic software was embedded within the Cisco IOS software. Starting with Cisco IOS Release 12.0(22)S, this software is unbundled from the main image and must be downloaded from Cisco.com using the IOS Upgrade Planner.

Cisco initiated this change to accommodate users with 20-MB Flash memory cards. Field diagnostics are now stored and maintained as a separate image under the following name:

c12k-fdiagsbflc-mz-xxx-xx.s (where xxx-xx is the version number)

This image must be available on a separate Flash memory card, Flash disk, or TFTP boot server in order to load SIP field diagnostics. The latest version is always available on Cisco.com. RP and fabric tests remain embedded within the main Cisco IOS software image.

While the diagnostic test is running, the SIP does not function normally and cannot pass any traffic for the duration of the testing (5 to 20 minutes, depending on the complexity of the SIP). Without the verbose keyword, the command provides truncated output. When communicating with the Cisco TAC, the verbose mode is helpful in identifying specific problems. The output of the diagnostic test without the verbose keyword resembles the following example:

Router# diag 7 tftp://223.255.254.254/diagnostic/award/c12k.fdiagsbflc.mz.120-25.s
Running DIAG config check
Fabric Download for Field Diags chosen: If timeout occurs, try 'mbus' option.
Runnning Diags will halt ALL activity on the requested slot. [confirm]
Launching a Field Diagnostic for slot 7
Downloading diagnostic tests to slot 7 via fabric (timeout set to 300 sec.)
5d20h: %GRP-4-RSTSLOT: Resetting the card in the slot: 7,Event:
EV_ADMIN_FDIAGLoading diagnostic/award/c12k.fdiagsbflc.mz.120-25.s from 223.255.254.254
(via Ethernet0): !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
5d20h: Downloading diags from tftp file tftp://223.255.254.254/diagnostic/award/
c12k.fdiagsbflc.mz.120-25.s
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[OK - 13976524 bytes]
FD 7> *****************************************************
FD 7> GSR Field Diagnostics V6.05
FD 7> Compiled by award on Tue Jul 30 13:00:41 PDT 2002
FD 7> view: award.conn_isp.FieldDiagRelease
FD 7> *****************************************************
Executing all diagnostic tests in slot 7
(total/indiv. timeout set to 2000/600 sec.)
FD 7> BFR_CARD_TYPE_OC12_4P_POS testing...
FD 7> Available test types 2
FD 7> 1
FD 7> Completed f_diags_board_discovery() (0x1)
FD 7> Test list selection received: Test ID 1, Device 0
FD 7> running in slot 7 (30 tests from test list ID 1)
FD 7> Skipping MBUS_FDIAG command from slot 2
FD 7> Just into idle state
Field Diagnostic ****PASSED**** for slot 7
Shutting down diags in slot 7
Board will reload
(remainder of output omitted)

The SIP reloads automatically only after passing the test. If the SIP fails the test, it will not reload automatically. You can manually reload the SIP by using the hw-module slot slot reload command.
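
Because the SIP reloads on its own only after a pass, a captured diagnostic log can be checked for the PASSED line to decide whether a manual reload is needed. The following is a minimal sketch with hypothetical function names. The pattern is taken from the example output above; the wording of a failure result is not shown in this chapter, so the sketch treats any log without a PASSED line for the slot as requiring attention, which also includes incomplete captures.

```python
import re

# Matches "Field Diagnostic ****PASSED**** for slot 7".
PASSED_RE = re.compile(r"Field Diagnostic \*+PASSED\*+ for slot (\d+)")


def passed_slot(diag_output):
    """Return the slot number from the PASSED line, or None if absent."""
    match = PASSED_RE.search(diag_output)
    return int(match.group(1)) if match else None


def needs_manual_reload(diag_output, slot):
    """True when no PASSED line for this slot appears in the output."""
    return passed_slot(diag_output) != slot


# Hypothetical captured text, shortened from the example output.
SAMPLE = """\
FD 7> Just into idle state
Field Diagnostic ****PASSED**** for slot 7
Shutting down diags in slot 7
Board will reload
"""

print(passed_slot(SAMPLE))             # 7
print(needs_manual_reload(SAMPLE, 7))  # False
```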

Field diagnostic results are stored in an electrically erasable programmable read-only memory (EEPROM) on the SIP. It is possible to view the results of the last diagnostic test performed on the SIP by executing the diag slot previous command.

Some known caveats cause diagnostic tests to fail even though the SIP is not faulty. As a precaution, if the SIP fails the test and has already been replaced once, review the diagnostic output with the Cisco TAC.

Packing a SIP for Shipment

This section provides step-by-step instructions for packing a SIP for shipment. Before beginning this procedure, you should have the following original Cisco Systems packaging materials:

Clipboard insert

Smaller inner carton

Larger exterior carton

Two packing cushions


Caution Use Cisco Systems original packaging for the shipment of all SIPs. Failure to properly use Cisco Systems packaging can result in damage or loss of product.


Warning During this procedure, wear grounding wrist straps to avoid ESD damage to the card. Do not directly touch the backplane with your hand or any metal tool, or you could shock yourself.



Note These instructions assume that the SIP has been removed from the router according to the recommended procedures specified in this guide.


To pack a SIP for shipment, follow these steps:


Step 1 Insert the SIP into the clipboard insert by carefully aligning the edges of the SIP between the upper and lower edges of the clipboard insert.

Step 2 Slide the SIP all the way into the clipboard insert until it clicks into place. You might have to lift the clip assembly to ensure that it securely engages with the sheet-metal carrier.

Step 3 Place the clipboard insert containing the SIP into the smaller inner carton.

Step 4 Close the carton top, and tape the sides closed.

Step 5 Apply the packing cushions to the sealed smaller inner carton.

Step 6 Place the sealed smaller inner carton and packing cushions into the larger exterior carton, and seal the exterior carton with tape for shipment.


Packing a SPA for Shipment

This section provides step-by-step instructions for packing a SPA and the cable-management brackets for shipment. Before beginning this procedure, you should have the following original Cisco Systems packaging materials:

Thermoform container (transparent plastic-molded clamshell)

Carton


Caution The Cisco Systems original packaging is to be used for the shipment of all SPAs and cable-management brackets. Failure to properly use Cisco Systems packaging can result in damage or loss of product.


Warning During this procedure, wear grounding wrist straps to avoid ESD damage to the card. Do not directly touch the backplane with your hand or any metal tool, or you could shock yourself.



Note These instructions assume that the SPA and cable-management brackets have been removed from the router according to the recommended procedures specified in this guide.


To pack a SPA and the cable-management brackets for shipment, perform the following steps:


Step 1 Open the Thermoform container and place the SPA and each of the cable-management brackets into the appropriate cavities.


Caution Always handle the SPA by the carrier edges and handle; never touch the SPA components or connector pins.

Step 2 Close the Thermoform container. Be sure to lock the snaps securely.

Step 3 Check that the Thermoform container is fully closed. Apply tape or a label closure over the opening to ensure the container stays closed during shipping.

Step 4 Place the Thermoform container into the carton.

Step 5 Close the carton.

Step 6 Apply tape over the carton flap to ensure the carton stays closed during shipping.