Hardware Troubleshooting for the Cisco 12000 Series Internet Router

Document ID: 22281

Updated: Jan 15, 2008

Introduction

Valuable time and resources are often wasted replacing hardware that actually functions properly. This document helps troubleshoot common hardware issues with the Cisco 12000 Series Internet Router and provides pointers for identifying whether or not the fault is in the hardware.

Note: This document does not cover any software-related failures except for those that are often mistaken as hardware issues.

Note: Additionally, this document does not cover the hardware troubleshooting steps for the Cisco 12000 Series line cards (LCs). Hardware Troubleshooting for Cisco 12000 Series Internet Router Line Card Failures details the steps to follow to troubleshoot a hardware issue with a line card and/or identify an issue with a line card that could be misinterpreted as a hardware failure.

Prerequisites

Requirements

Readers of this document should be knowledgeable of the following:

If you feel that the problem may be related to a hardware fault, this document may help you identify the cause of the failure.

Components Used

The information in this document is based on the software and hardware versions below.

  • All Cisco 12000 Series Internet Routers, including the 12008, 12012, 12016, 12404, 12406, 12410, and the 12416.

  • All Cisco IOS® software versions that support the Cisco 12000 Series Internet Router.

The information presented in this document was created from devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If you are working in a live network, ensure that you understand the potential impact of any command before using it.

Hardware-Software Compatibility and Memory Requirements

Whenever you install a new line card, module, or Cisco IOS® software image, it is important to verify that the router has enough memory, and that the hardware and software are compatible with the features you wish to use.

Perform the following recommended steps to check for hardware-software compatibility and memory requirements:

  1. Use the Software Advisor (registered customers only) tool to choose software for your network device.

    Tips:

    • The Software Support for Hardware section helps you verify whether the modules and cards installed on the router are supported by the desired Cisco IOS software version.

    • The Software Support for Features section helps you determine the Cisco IOS software image needed by choosing the types of features you wish to implement.

  2. Use the Download Software Area to check the minimum amount of memory (RAM and Flash) required by the Cisco IOS software, and/or download the Cisco IOS software image. To determine the amount of memory (RAM and Flash) installed on your router, refer to the Memory Requirements section of How to Choose a Cisco IOS Software Release.

    Tips:

    • If you want to keep the same features as the version that is currently running on your router, but don't know which feature set you are using, enter the show version command on your Cisco device, and paste the output into the Output Interpreter Tool. You can use the Output Interpreter Tool to display potential issues and fixes. To use the Output Interpreter Tool, you must be a registered customer, be logged in, and have JavaScript enabled. It is important to check for feature support, especially if you plan to use recent software features.

    • If you need to upgrade the Cisco IOS software image to a new version or feature set, refer to How to Choose a Cisco IOS Software Release for more information.

  3. If you determine that a Cisco IOS software upgrade is required, follow the Software Installation and Upgrade Procedure for the Cisco 12000 Series Router.

    Tip: For information on how to recover a Cisco 12000 series router stuck in ROMmon (rommon # > prompt), see ROMmon Recovery Procedure for the Cisco 12000.

Conventions

For more information on document conventions, see the Cisco Technical Tips Conventions.

Cisco 12000 Components

The components that make up the Cisco 12000 Series Internet Router chassis include:

  • Chassis

  • Switch Fabric Cards (SFCs)

  • Clock Scheduler Cards (CSCs)

  • Maintenance BUS (MBUS)

  • Power supplies

  • Blowers - fan assembly

  • Alarm cards

The chassis itself has no electronic components, so it is very rarely the cause of hardware-related problems unless some of the backplane connectors are bent or broken. The power supplies, SFCs, CSCs, alarm card, and fan assembly all have electronic components, so they can be affected by hardware problems. In general, hardware problems with these components result in either error messages or a router that fails to function. For a detailed explanation of all these components and how they interact, see Cisco 12000 Series Internet Router Architecture.

Identifying the Issue

By reading the information below and following the troubleshooting steps, you can determine whether or not the problems you are having with your router are hardware-related.

Capturing Information

The first thing you need to do is identify the cause of the router crash or console errors that you are seeing. To see which part is possibly at fault, it is essential that the output from the following commands is collected:

  • show context summary

  • show logging

Along with these specific show commands, you should also gather the following information:

  • Console logs and/or Syslog information: These can be crucial in determining the originating issue if multiple symptoms are occurring. If the router is set up to send logs to a syslog server, you may see some information on what happened. For console logs, it is best to be directly connected to the router on the console port with logging enabled.

  • Show technical-support: The show technical-support command is a compilation of many different commands including show version, show running-config, and show stacks. When a router runs into problems, the Cisco Technical Assistance Center (TAC) engineer usually asks for this information. It is important to collect the show technical-support before doing a reload or power-cycle as these actions can cause all information about the problem to be lost.

Misleading Symptoms

There are a few issues that can be misinterpreted as hardware problems when, in fact, they are not. Common examples are a router that stops responding ("hangs") and a failure that follows a new hardware installation. It is very uncommon for any of these symptoms to be caused by a chassis component. The table below lists symptoms, explanations, and troubleshooting steps for these commonly misinterpreted issues:

Symptom: The Cisco 12000 hangs during normal operation
Explanation/Troubleshooting: This is usually caused by software problems, but can also be caused by hardware. See Troubleshooting Router Hangs for this issue.

Symptom: A new line card is not recognized
Explanation/Troubleshooting: Use the Software Advisor (registered customers only) tool to determine if the new card is supported in your current Cisco IOS software version. If the LC is supported, then configure service upgrade all, save the configuration with the copy run start command and power-cycle the router. Sometimes a reload is not sufficient, but a power-cycle fixes the problem. If the new card is not supported in your current Cisco IOS software version, verify that you have enough route memory installed on the line card before upgrading the Cisco IOS software version. For release 12.0(21)S, 256 MB of route memory is required, especially if Border Gateway Protocol (BGP) is configured with many peers and many routes.

Symptom: The CPU utilization is running very high
Explanation/Troubleshooting: While there are hardware problems that can cause this, it is much more likely that the router is either mis-configured or something on the network is causing the problem. See Troubleshooting High CPU Utilization on a Cisco Router to troubleshoot this issue.

Symptom: Memory allocation errors are seen on the Gigabit Route Processor (GRP)
Explanation/Troubleshooting: Memory allocation errors are almost never caused by hardware problems. Troubleshooting tips for memory allocation errors are located on the Troubleshooting Memory Problems page.

Symptom: An increasing number of input drops is seen in the output of the show interfaces command
Explanation/Troubleshooting: This is never due to a hardware issue with the router. See Troubleshooting Input Drops on the Cisco 12000 Series Internet Router to troubleshoot this problem.

Symptom: An increasing number of ignored messages is seen in the output of the show interfaces command
Explanation/Troubleshooting: One of the line cards is most likely overloaded. Follow the steps detailed in Troubleshooting Ignored Errors and No Memory Drops on the Cisco 12000 Series Internet Router.

Symptom: Forwarding Information Base (FIB) error messages are seen on the GRP
Explanation/Troubleshooting: Use the Cisco Error Message Decoder (registered customers only) Tool to find information about the meaning of this error message. Some of them point to a hardware issue on either the line card or a switch fabric card (SFC or CSC); others indicate a Cisco IOS software bug or a hardware issue on another part of the router. Some FIB and CEF-related messages are explained in Troubleshooting CEF-Related Error Messages.

Symptom: Inter Process-Communication (IPC)-related messages are seen on the GRP
Explanation/Troubleshooting: You can use the Cisco Error Message Decoder (registered customers only) Tool to find information about the meaning of this error message. Some of them point to a hardware issue on either the line card or a switch fabric card (SFC or CSC); others indicate a Cisco IOS software bug or a hardware issue on another part of the router. Some IPC-related messages are explained in Cisco 12000, 10000, 7600, and 7500 Series Routers: Troubleshooting IPC-3-NOBUFF Messages.

Symptom: The following error messages are seen on the GRP:
%GRP-3-FABRIC_UNI: Unicast send timed out (1)
%GRP-3-COREDUMP: Core dump incident on slot 1, 
error: Fabric ping failure
Explanation/Troubleshooting: Fabric ping failures occur when either a line card or the secondary GRP fails to respond to a fabric ping request from the primary GRP over the switch fabric. Such failures are a problem symptom that should be investigated. You can find more information about this issue at Troubleshooting Fabric Ping Timeouts and Failures on the Cisco 12000 Series Internet Router.

Symptom: The following error message is seen on the GRP:
%GRP-3-UCODEFAIL: Download failed to slot 5
Explanation/Troubleshooting: The image that was downloaded to the line card has been rejected by the line card. You can try to reload the microcode using the microcode reload configuration command. If the error message recurs, try to upgrade the MBUS Agent ROM, MBUS Agent RAM, and Fabric-downloader using the upgrade all slot command as explained in Upgrading Line Card Firmware on a Cisco 12000 Series Internet Router. You can also refer to the symptom "A new line card is not recognized" in this table.

Step-by-Step Troubleshooting

Troubleshooting the Switch Fabric (CSC and SFC)

The GRP and the line cards connect through a crossbar switch fabric, which provides a high-speed physical path for most inter-card communication. The traffic passed between the GRP and the line cards over the switch fabric includes the actual packets being routed and received, forwarding information, traffic statistics, and most management and control information. Thus, it is important for the GRP to ensure that this path is operating correctly.

Switch Fabric Symptoms

You should always suspect the switch fabric if you see fabric-related error messages such as the following in the logs:

%FABRIC-3-CRC: Switch card 18

or

%FABRIC-3-PARITYERR: To Fabric parity error was detected. Grant parity error Data = 0x2.
SLOT 1:%FABRIC-3-PARITYERR: To Fabric parity error was detected. 
Grant parity error Data = 0x1

The following messages may or may not be due to a hardware issue with the switch fabric:

05:21:11: %GRP-3-FABRIC_UNI: Unicast send timed out (2)
05:21:16: %GRP-3-FABRIC_UNI: Unicast send timed out (2)

Such failures are a problem symptom that should be investigated. More information about this issue is located at Troubleshooting Fabric Ping Timeouts and Failures on the Cisco 12000 Series Internet Router.
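As a rough illustration of how a saved log capture could be triaged against the messages listed above, the following Python sketch counts messages by their %FACILITY-SEVERITY-MNEMONIC tag. The function name and the two message sets are assumptions made for this example; they only cover the messages quoted in this section and are not an exhaustive list.

```python
import re
from collections import Counter

# Cisco IOS log messages carry a %FACILITY-SEVERITY-MNEMONIC tag.
MSG_TAG = re.compile(r"%([A-Z0-9_]+)-(\d)-([A-Z0-9_]+):")

# Messages this section says should always make you suspect the fabric.
ALWAYS_SUSPECT = {("FABRIC", "CRC"), ("FABRIC", "PARITYERR")}
# Messages this section says may or may not be caused by fabric hardware.
POSSIBLY_FABRIC = {("GRP", "FABRIC_UNI")}


def classify_fabric_messages(log_text):
    """Count tagged log lines as suspect_fabric, possibly_fabric, or other."""
    counts = Counter()
    for line in log_text.splitlines():
        match = MSG_TAG.search(line)
        if not match:
            continue  # continuation lines and untagged text are ignored
        facility, _severity, mnemonic = match.groups()
        key = (facility, mnemonic)
        if key in ALWAYS_SUSPECT:
            counts["suspect_fabric"] += 1
        elif key in POSSIBLY_FABRIC:
            counts["possibly_fabric"] += 1
        else:
            counts["other"] += 1
    return counts
```

A non-zero suspect_fabric count is only a prompt to continue with the fabric troubleshooting steps; it does not identify the failing part on its own.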

Switch Fabric Troubleshooting

If a switch fabric failure is suspected, follow the steps below:

  1. Collect the data.

    Remember that when you connect to the LC, you should do it over the MBUS using the attach command. The execute-on command depends on the IPC (Inter-Process Communication) which goes over the switch fabric. If you are having problems with IPC (fabric problems, software bug, and so on), the commands that run remotely through the switch fabric can time out. Normally, for commands that generate a fair amount of output, it is recommended to attach to the LC to execute the command. The attach <slot #> command always goes over the MBUS.

    • show controllers fia (on the GRP)

    • attach <slot #>, then show controllers fia, then type exit (repeat for each LC and the secondary GRP)

    • show controllers clock (on the GRP)

    • show log (look for Online Insertion and Removal (OIR) events to explain CSC master change; look for fabric-related errors)

    • show log summary (look for fabric-related errors)

    • show log slot <slot #>

  2. Analyze the data.

    Fabric problems can occur due to failures in any of the following components:

    • Control plane - GRP

    • Data plane

    • Tofab LC hardware

    • Backplane

    • CSCs/SFCs

    • Frfab LC hardware

    When troubleshooting fabric errors, start by looking for patterns with regard to which components are reporting errors. For example, combine the show controllers fia output from all the GRPs and LCs to see if there is a pattern.

    Note: For the remainder of this document, when we say LC, this refers to any LC or GRP.
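The data-collection step above can be sketched as a small Python helper that builds the command sequence for a given set of slots. This is only an illustration: the function name is hypothetical, and the helper produces a list of strings to type or paste. It does not connect to a router.

```python
def fabric_collection_commands(slots):
    """Build the step 1 command sequence for the given LC/secondary GRP slots.

    Slots are reached with 'attach', which goes over the MBUS, rather than
    'execute-on', which depends on IPC over the switch fabric.
    """
    commands = ["show controllers fia"]  # on the primary GRP
    for slot in slots:
        # The sample session in this document also types 'enable' after
        # attaching; add it here if your line card prompt requires it.
        commands += [f"attach {slot}", "show controllers fia", "exit"]
    commands += ["show controllers clock", "show log", "show log summary"]
    commands += [f"show log slot {slot}" for slot in slots]
    return commands
```

For example, fabric_collection_commands([2, 5]) lists the GRP command first, then an attach, show controllers fia, and exit triple for slots 2 and 5, followed by the clock and log commands.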

Increasing Number of CRCs

If you see crc16s in the output of the show controllers fia command, it is important to check whether this number is incrementing. It is very important to correlate the data from both the primary GRP and the other GRP/LCs. If one LC or one switch fabric card (CSC and/or SFC) has been OIRed, you can expect to see some fabric error messages and some crc16s. However, this number should not increase afterwards. If the number keeps incrementing, a hardware component is faulty and needs to be replaced.

In the output below, you can see the status for the primary GRP and the LC in slot 2:

Router#show controllers fia 
Fabric configuration: Full bandwidth, redundant fabric
Master Scheduler: Slot 17  Backup Scheduler: Slot 16
From Fabric FIA Errors
-----------------------
redund fifo parity 0    redund overflow 0      cell drops 0         
crc32 lkup parity  0    cell parity     0      crc32      0         
Switch cards present    0x001F    Slots  16 17 18 19 20
Switch cards monitored  0x001F    Slots  16 17 18 19 20
Slot:     16         17         18         19         20
Name:    csc0       csc1       sfc0       sfc1       sfc2
       --------   --------   --------   --------   --------
los    0          0          0          0          0          
state  Off        Off        Off        Off        Off       
crc16  0          0          0          1345       0
To Fabric FIA Errors
-----------------------
sca not pres 0          req error     0          uni FIFO overflow 0         
grant parity 0          multi req     0          uni FIFO undrflow 0         
cntrl parity 0          uni req       0          crc32 lkup parity 0         
multi FIFO   0          empty dst req 0          handshake error   0         
cell parity  0
Router#attach 2
Entering Console for 4 port ATM Over SONET OC-3c/STM-1 in Slot: 2
Type "exit" to end this session
Press RETURN to get started!
LC-Slot2>
LC-Slot2>enable
LC-Slot2#show controllers fia
From Fabric FIA Errors
-----------------------
redund FIFO parity 0          redund overflow 0          cell drops 0         
crc32 lkup parity  0          cell parity     0          crc32      0         
Switch cards present    0x001F    Slots  16 17 18 19 20 
Switch cards monitored  0x001F    Slots  16 17 18 19 20 
Slot:     16         17         18         19         20
Name:    csc0       csc1       sfc0       sfc1       sfc2
       --------   --------   --------   --------   --------
Los    0          0          0          0          0          
state  Off        Off        Off        Off        Off       
crc16  0          0          0          1345       0
To Fabric FIA Errors
-----------------------
sca not pres 0          req error     0          uni fifo overflow 0         
grant parity 0          multi req     0          uni fifo undrflow 0         
cntrl parity 0          uni req       0          crc32 lkup parity 0         
multi fifo   0          empty DST req 0          handshake error   0         
cell parity  0
LC-Slot2#exit
Disconnecting from slot 2.
Connection Duration: 00:00:21
Router#
... 
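One way to check whether the crc16 counters are incrementing is to capture the show controllers fia output from the same card twice and compare the two captures. The Python sketch below assumes the output layout shown above (a "Slot:" row followed later by a "crc16" row) and that each capture holds the output of a single card. Both function names are hypothetical.

```python
def parse_crc16(fia_output):
    """Return {fabric_slot: crc16_count} from one card's 'show controllers fia' text."""
    slots, counts = [], []
    for line in fia_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Slot:":
            slots = [int(field) for field in fields[1:]]
        elif fields[0].lower() == "crc16":
            counts = [int(field) for field in fields[1:]]
    return dict(zip(slots, counts))


def incrementing_slots(earlier, later):
    """Return the fabric slots whose crc16 count grew between two captures."""
    return sorted(slot for slot in later if later[slot] > earlier.get(slot, 0))
```

A counter that is non-zero but unchanged between captures matches the "expected after an OIR" case described above, so only the slots returned by incrementing_slots need further attention.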

Once you have analyzed the output of all the show commands, you can build a table similar to the following:

[Image: hwts_12000_22281a.gif]

This table indicates that more than one line card is reporting errors coming from SFC1. Therefore, the first step would be to change this SFC. The common failure patterns and recommended actions are as follows (one step at a time until the problem goes away):

Tip: Whenever a replacement is recommended, ALWAYS reseat the corresponding card first to be sure it is correctly seated (see below). If the CRCs are still incrementing after you reseat the card, then go ahead and replace the part.

  • Frfab errors on more than one LC from the same fabric card:

    1. Replace the fabric card in the slot corresponding to the errors

    2. Replace all fabric cards

    3. Replace the backplane

  • Frfab errors on one LC from more than one fabric card:

    1. Replace the LC

    2. If errors are incrementing, replace the current master CSC

    3. If errors are not incrementing and the current master is CSC0, replace CSC1
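The two failure patterns above can be expressed as a small correlation step. The following Python sketch takes, for each LC or GRP slot, the set of fabric card slots whose crc16 counter is incrementing on that card, and returns which pattern applies. The function name, the input shape, and the returned labels are assumptions made for this example.

```python
from collections import defaultdict


def first_suspect(errors_by_lc):
    """Suggest the first part to reseat or replace.

    errors_by_lc maps an LC/GRP slot to the set of fabric card slots whose
    crc16 counter is incrementing on that card.
    """
    reporters = defaultdict(set)
    for lc_slot, fabric_slots in errors_by_lc.items():
        for fabric_slot in fabric_slots:
            reporters[fabric_slot].add(lc_slot)

    # Pattern 1: more than one LC reports errors from the same fabric card.
    shared = sorted(slot for slot, lcs in reporters.items() if len(lcs) > 1)
    if shared:
        return ("fabric_card", shared)

    # Pattern 2: one LC reports errors from more than one fabric card.
    multi = sorted(lc for lc, slots in errors_by_lc.items() if len(slots) > 1)
    if multi:
        return ("line_card", multi)

    return ("undetermined", [])
```

With the example in this section, where the primary GRP and the LC in slot 2 both report errors from slot 19, first_suspect({0: {19}, 2: {19}}) points at the fabric card in slot 19. The GRP slot number 0 is illustrative only.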

Seating the Switch Fabric Cards

The switch fabric cards in the 12016 and 12416 are not easy to insert, and may require a little bit of force. If either of the CSCs is not seated properly, you may see the following error message:

%MBUS-0-NOCSC: Must have at least 1 CSC card in slot 16 or 17 
%MBUS-0-FABINIT: Failed to initialize switch fabric infrastructure

You may also get this error message if there are only enough CSCs and SFCs seated for quarter bandwidth configurations. In this case, none of the Engine 1 and higher engine-based LCs will boot.

One sure way to tell if the cards are seated properly is that, on the CSC/SFC, you should see four lights "on". If this is not the case, then the card is not seated correctly.

When dealing with problems related to the fabric and LCs not booting, it is important to verify that all necessary CSCs and SFCs are correctly seated and powered on. For instance, three SFCs and two CSCs are required on a 12016 to get a full bandwidth redundant system. Three SFCs and only one CSC are needed to get a full bandwidth non-redundant system.
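As a minimal sketch of the 12016 rule just described, the following Python code reads the card counts from show version output and names the resulting fabric configuration. It encodes only the two cases stated in this section; any other combination is reported without further interpretation. The function names are hypothetical, and the regular expressions assume the "N Clock Scheduler Card" and "N Switch Fabric Cards" lines that show version prints.

```python
import re


def count_fabric_cards(show_version_text):
    """Return (number_of_cscs, number_of_sfcs) found in 'show version' text."""
    csc = re.search(r"(\d+)\s+Clock Scheduler Card", show_version_text)
    sfc = re.search(r"(\d+)\s+Switch Fabric Card", show_version_text)
    return (int(csc.group(1)) if csc else 0, int(sfc.group(1)) if sfc else 0)


def fabric_configuration_12016(num_cscs, num_sfcs):
    """Name the 12016 fabric configuration for the two cases in this section."""
    if num_sfcs == 3 and num_cscs == 2:
        return "full bandwidth, redundant"
    if num_sfcs == 3 and num_cscs == 1:
        return "full bandwidth, non-redundant"
    return "not a full-bandwidth configuration"
```

A system that reports one Clock Scheduler Card and three Switch Fabric Cards is classified as full bandwidth, non-redundant, which is what show controllers fia should also report for that hardware.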

The output from the show version and show controllers fia commands tells you which hardware configuration is currently running in the box.

Router#show version 
Cisco Internetwork Operating System Software 
IOS (tm) GS Software (GSR-P-M), Experimental Version 12.0(20010505:112551)
Copyright (c) 1986-2001 by cisco Systems, Inc. 
Compiled Mon 14-May-01 19:25 by tmcclure 
Image text-base: 0x60010950, data-base: 0x61BE6000

ROM: System Bootstrap, Version 11.2(17)GS2, [htseng 180] 
EARLY DEPLOYMENT RELEASE SOFTWARE (fc1) 
BOOTFLASH: GS Software (GSR-BOOT-M), Version 12.0(15.6)S, 
EARLY DEPLOYMENT MAINTENANCE INTERIM SOFTWARE

Router uptime is 17 hours, 53 minutes 
System returned to ROM by reload at 23:59:40 MET Mon Jul 2 2001 
System restarted at 00:01:30 MET Tue Jul 3 2001 
System image file is "tftp://172.17.247.195/gsr-p-mz.15S2plus-FT-14-May-2001"

cisco 12016/GRP (R5000) processor (revision 0x01) with 262144K bytes of memory. 
R5000 CPU at 200Mhz, Implementation 35, Rev 2.1, 512KB L2 Cache 
Last reset from power-on

2 Route Processor Cards 
1 Clock Scheduler Card 
3 Switch Fabric Cards 
1 8-port OC3 POS controller (8 POs). 
1 OC12 POS controller (1 POs). 
1 OC48 POS E.D. controller (1 POs). 
7 OC48 POS controllers (7 POs). 
1 Ethernet/IEEE 802.3 interface(s) 
17 Packet over SONET network interface(s) 
507K bytes of non-volatile configuration memory.

20480K bytes of Flash PCMCIA card at slot 0 (Sector size 128K). 
8192K bytes of Flash internal SIMM (Sector size 256K).
...
Router#show controller fia 
Fabric configuration: Full bandwidth nonredundant
Master Scheduler: Slot 17
...

We recommend that you read Cisco 12000 Series Internet Router Architecture: Switch Fabric for more detailed information.

Grant Parity Errors and Request Errors

You might experience the following types of errors:

  • From the console logs or the output of the show log command:

    %FABRIC-3-PARITYERR: To Fabric parity error was detected. 
    Grant parity error Data = 0x2.
    SLOT 1:%FABRIC-3-PARITYERR: To Fabric parity error was detected. 
    Grant parity error Data = 0x1
    
  • From the output of the show controllers fia command:

    Router#show controllers fia
    Fabric configuration: Full bandwidth, redundant fabric
    Master Scheduler: Slot 17     Backup Scheduler: Slot 16
    
    !-- Here you can see which CSC is the master CSC.
    !-- By default, CSC1 in slot 17 is the master.
     
    
    From Fabric FIA Errors
    -----------------------
    redund FIFO parity 0   redund overflow 0   cell drops 76
    
    !-- You may see some cell drops as well
    
    
    crc32 lkup parity  0    cell parity 0   crc32 0
    Switch cards present    0x001F    Slots  16 17 18 19 20
    Switch cards monitored  0x001F    Slots  16 17 18 19 20
    Slot:     16         17         18         19         20
    Name:    csc0       csc1       sfc0       sfc1       sfc2
           --------   --------   --------   --------   --------
    Los    0          0          0          0          0
    state  Off        Off        Off        Off        Off
    crc16  876        257        876        876        876
    
    !-- You will see some crc16
    
    
    To Fabric FIA Errors
    -----------------------
    sca not pres 0          req error     1          uni fifo overflow 0
    grant parity 1          multi req     0          uni fifo undrflow 0
    
    !-- Grant parity and/or Request error counter not 0
    
    
    cntrl parity 0          uni req       0          crc32 lkup parity 0
    multi fifo   0          empty DST req 0          handshake error   0
    cell parity  0
    

The Fabric Interface ASIC (FIA) resides on both the Gigabit Route Processor (GRP) and the line cards (LC). It provides an interface between the GRP/LC and the switch fabric cards (CSC/SFC). The Scheduler Control ASIC (SCA), in contrast, resides on the CSC only. The SCA handles the transmission requests from the line cards and issues grants to access the fabric.

Hardware Request Errors

  • req error - The SCA detected a parity error on the req lines

  • grant parity - The FIA detected a parity error on the grant lines

The output of the show controllers fia command can be used to determine whether multiple line cards are reporting these errors, and if a CSC switchover has taken place. In order to get this output from a specific line card, type attach <slot #> and then execute the show controller fia command after the LC-Slot prompt appears.

Note: As explained above, the execute-on slot <slot #> show controllers fia command should not be used, since, in the event that the Cisco IOS software is unable to handle this error, this command will fail.

  • Grant errors on more than one LC

    1. Replace the CSC (see the note below to know which one should be swapped)

    2. Replace the backplane

  • Grant errors on one LC

    1. Replace the LC

    2. Replace the CSC (see the note below to know which one should be swapped)

    3. Replace the backplane

Note: If multiple line cards are reporting grant parity or request errors and the box is still functioning, then a CSC switchover has occurred. The failed CSC is the one that is currently the backup CSC (not the one listed as "Master Scheduler" in the show controller fia output). If "Halted" is next to the heading "From Fabric FIA Errors" or "To Fabric FIA Errors", or if the router is no longer forwarding traffic, then a CSC switchover has not occurred and the failing CSC is the one listed as "Master Scheduler". By default, the CSC in slot 17 is the primary and the CSC in slot 16 is the backup.
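The rule in this note can be sketched in Python as follows. The sketch assumes that multiple line cards have already been confirmed to report grant parity or request errors. The exact position and spelling of the "Halted" marker are not shown in this document, so the check below (a line containing both "FIA Errors" and "Halted") is an assumption, as are the function names.

```python
import re


def parse_schedulers(fia_output):
    """Return (master_slot, backup_slot); backup is None if none is listed."""
    master = re.search(r"Master Scheduler:\s*Slot\s+(\d+)", fia_output)
    backup = re.search(r"Backup Scheduler:\s*Slot\s+(\d+)", fia_output)
    return (
        int(master.group(1)) if master else None,
        int(backup.group(1)) if backup else None,
    )


def fia_halted(fia_output):
    """Assumed check: 'Halted' appears on a 'FIA Errors' heading line."""
    return any(
        "fia errors" in line.lower() and "halted" in line.lower()
        for line in fia_output.splitlines()
    )


def failing_csc(fia_output, still_forwarding):
    """Return the slot of the CSC to suspect, following the note above."""
    master, backup = parse_schedulers(fia_output)
    if fia_halted(fia_output) or not still_forwarding:
        return master  # no switchover occurred: suspect the current master
    return backup      # switchover occurred: the failed CSC is now the backup
```

If the router is still forwarding and no backup scheduler is listed, the function returns None, which signals that the note does not cover that situation.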

On routers running a Cisco IOS software release without the fix for software bug CSCdw10748 (registered customers only), grant parity errors may result in a system-level failure. With the fix for CSCdw10748, a router with redundant CSCs will not experience a system-level disruption if this hardware failure occurs. A failover to the backup CSC (if one is present) will be performed.

The fix to CSCdw10748 has been implemented in Cisco IOS software releases 12.0(17)ST4, 12.0(21)S, 12.0(21)ST, 12.0(19)ST02, 12.0(19)S02, 12.0(17)S04, 12.0(18)S04, and 12.0(16)S07.

Other Errors

There are other errors that are less frequent and can be seen in the output of the show controllers fia command:

From Fabric FIA Errors

  • First In First Out (FIFO) Errors: redundant data Overflow Error. This is caused if the back pressure is broken, that is, the From Fab exerts back pressure and the Scheduler Control ASIC (SCA) keeps giving more data to it. This could be a problem with the Clock Scheduler Card (CSC). Try reseating the card; if that doesn't work, try to swap it.

  • Serial Link Errors: This is caused by the From Fab FIA losing synchronization with one of the Switch Fabric Cards (SFCs) or Clock Scheduler Cards (CSCs) (this error is not generated for a pulled out card). The FIA has a built-in mechanism to wait before halting the FIA for a certain number of cell periods. There is a loss counter for each card. Depending on the information gathered from all the GRPs/LCs, you should be able to determine which part is faulty.

To Fabric FIA Errors

  • FIFO Errors

    • uni FIFO overflow - unicast FIFO overflow caused by a problem between the Buffer Management ASIC (BMA)/Cisco Cell Segmentation and Reassembly (CSAR) and the FIA.

    • uni FIFO underflow - unicast FIFO underflow caused by the SCA granting without actually getting a request from the FIA.

    For FIFO errors, it is difficult to determine whether it is the line card or the scheduler card (CSC) which is broken. If many cards show errors, the CSC should be suspected.

  • Fabric Error: sca not pres - The master SCA (Scheduler Control ASIC) is lost. The solution for this error is to do nothing and wait until the upper layers detect that there has been a problem. The reason for not automatically switching to the redundant CSC is that, at this level, you do not know whether or not the two SCAs are in sync. If a CSC card has been plugged in after the initial power on, the SCA chips are not going to be in sync.

    The following messages may also appear in the logs:

    %FIA-3-PARITYERR: To Fabric parity error was detected.
    %FIA-3-HALT: To Fabric Request parity error interrupt = 0x4

    The output of the show controllers fia command can be used to determine whether multiple line cards report these errors and if a CSC switchover has taken place. In order to get this output from a specific line card, type attach <slot #>, and execute the show controllers fia command after the LC-Slot prompt appears.

  • BMA/CSAR Handshake error: This should be accompanied by a parity error that indicates the cause of the problem.

  • Software Request Errors: There are other errors on the FIA that do not cause it to become halted or cause an interrupt. These are polled once every second and counted. On the To Fabric side, these errors are software request errors. The following errors are detected:

    • multi req - single destination in a multicast request. The FIA sends this cell to the destination. You should be aware of bug CSCdw05067 - show controller fia shows multi requests on ATM LCs with multicast. ATM Engine 0 (1xOC12 and 4xOC3) line cards may record a few "multi request" errors in the show controller fia command output of the affected line cards running distributed multicast traffic. This happens for each multicast packet that is distributed-switched to only a single destination line card. It is purely cosmetic, and there is no drop. The workaround is to disable distributed multicast switching.

    • uni req - multidestination in a unicast request. The FIA drops this cell.

    • empty DST req - empty destination request. The FIA drops this cell.
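To tie the To Fabric counters back to the explanations above, the following Python sketch extracts the non-zero counters from the "To Fabric FIA Errors" part of a show controllers fia capture and attaches the meaning given in this document. The counter names follow the sample output in this document; the function name and the summary wording are assumptions made for this example, and counters not described in this document are ignored.

```python
import re

# A counter is a label made of words followed by a decimal value.
COUNTER = re.compile(r"([A-Za-z][A-Za-z0-9 ]*?)\s+(\d+)(?=\s|$)")

# Short summaries of the explanations given in this document.
MEANING = {
    "sca not pres": "master SCA is lost",
    "req error": "SCA detected a parity error on the request lines",
    "grant parity": "FIA detected a parity error on the grant lines",
    "uni fifo overflow": "problem between the BMA/CSAR and the FIA",
    "uni fifo undrflow": "SCA granted without a request from the FIA",
    "handshake error": "BMA/CSAR handshake error; look for a parity error",
    "multi req": "single destination in a multicast request (cell is sent)",
    "uni req": "multidestination in a unicast request (cell is dropped)",
    "empty dst req": "empty destination request (cell is dropped)",
}


def to_fabric_findings(fia_output):
    """Return {counter: (value, meaning)} for non-zero documented counters."""
    findings = {}
    in_section = False
    for line in fia_output.splitlines():
        if "To Fabric FIA Errors" in line:
            in_section = True
            continue
        if not in_section:
            continue
        for raw_name, value in COUNTER.findall(line):
            name = " ".join(raw_name.lower().split())  # normalize case and spacing
            if name in MEANING and int(value) > 0:
                findings[name] = (int(value), MEANING[name])
    return findings
```

Pass the output of one card at a time. As with crc16, a single non-zero value matters less than whether it keeps incrementing and whether several cards report it.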

Troubleshooting the Maintenance BUS (MBUS)

On initial bootup, the primary GRP uses the MBUS to instruct the MBUS modules on the line cards and switch cards to power on their cards. A bootstrap image is then downloaded to the line cards across the MBUS. The MBUS is also used to gather revision numbers, environmental information, and general maintenance information. In addition, the GRPs exchange redundancy messages over the MBUS, which report the results of GRP arbitration.

The following messages (a non-exhaustive list) are harmless and expected under normal router conditions. If you see them, no action is required:

%MBUS-6-GRP_STATUS: GRP in Slot 0 Mode = MBUS Secondary

or

%MBUS-6-FIA_CONFIG: Switch Cards 0x1F (bit mask); Primary Clock CSC_1

Use the Error Message Decoder (registered customers only) Tool to determine whether or not a message is expected, and whether you need to take action.
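The %MBUS-6-FIA_CONFIG message reports the switch cards as a bit mask. The Python sketch below decodes that mask under an assumption that this document does not state explicitly: bit 0 corresponds to slot 16 (csc0) and each higher bit to the next slot, up to slot 20 (sfc2). This is consistent with the show controllers fia samples, where 0x001F is shown together with slots 16 through 20, but the bit order itself is not confirmed here, and the mapping may differ on other chassis. The function names are hypothetical.

```python
import re

# Assumed mapping: bit 0 -> slot 16 (csc0) ... bit 4 -> slot 20 (sfc2).
FIRST_SWITCH_SLOT = 16
SWITCH_CARD_NAMES = ["csc0", "csc1", "sfc0", "sfc1", "sfc2"]


def decode_switch_card_mask(mask):
    """Return [(slot, name)] for every bit set in the switch card mask."""
    return [
        (FIRST_SWITCH_SLOT + bit, name)
        for bit, name in enumerate(SWITCH_CARD_NAMES)
        if mask & (1 << bit)
    ]


def parse_fia_config(message):
    """Decode the mask in an %MBUS-6-FIA_CONFIG message, or return None."""
    match = re.search(r"%MBUS-6-FIA_CONFIG: Switch Cards (0x[0-9A-Fa-f]+)", message)
    if not match:
        return None
    return decode_switch_card_mask(int(match.group(1), 16))
```

Under this assumption, a mask of 0x1F means that both CSCs and all three SFCs are present, while a missing bit would point to a card that is absent or not seated.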

You may also see an "upgrade warning" message that looks like this:

%MBUS-0-DOWNREV: Fabric Downloader in slot 2; use 
"upgrade fabric-downloader" command to update the image

Make sure that the Fabric Downloader version of the line card is in sync with the one from the current Cisco IOS software release running on the primary GRP. You can configure service upgrade all, save the configuration, and reload the router to synchronize the MBUS agent RAM, the Fabric Downloader, and so on. Sometimes a reload is not enough, but a power-cycle always works. Make sure you have enough route memory on the line card to support your Cisco IOS software release.

You can find more information at Upgrading Line Card Firmware on a Cisco 12000 Series Router.

For more explanations about the purpose of the MBUS and some MBUS-related error messages, see Cisco 12000 Series Internet Router Architecture: Maintenance Bus, Power Supplies and Blowers, and Alarm Cards.

Troubleshooting the Power Supplies and Blowers

The Cisco 12000 Series router is available in either an AC or a DC configuration. All power supplies are load-sharing and hot-swappable.

Some software bugs cause low voltage to be reported when the voltage is actually normal. Be sure to run the latest Cisco IOS software release, which is available in the Download Software Area, so that you have the fixes for all known voltage-related software bugs.

Links for the different types of chassis are available at Cisco 12000 Series Internet Router Architecture: Maintenance Bus, Power Supplies and Blowers, and Alarm Cards.

Troubleshooting the Alarm Cards

There are different types of alarm cards depending on the type of 12000 chassis. On the Cisco 12008 and the 12016/12416, alarm cards power the LCs, so make sure that at least one alarm card is present. The 12008 needs an alarm card because the alarm card is integrated with the Clock Scheduler Card (CSC). The 12016 and 12416 have slots for two alarm cards (for redundancy). The two alarm cards do not have segmented service zones like the DC power supply on a 12016.

The Cisco 12404 supports a Consolidated Switch Fabric Card that includes the switch fabric, alarm, and clock and schedule functions on one board.

Links for the different types of chassis are available at Cisco 12000 Series Internet Router: Alarm Cards.

Troubleshooting the Line Cards

The Hardware Troubleshooting for Cisco 12000 Series Internet Router Line Card Failures document explains the steps to identify and troubleshoot the line card failures. Troubleshooting Line Card Crashes on the Cisco 12000 Series Internet Router provides troubleshooting information for line card crashes.

Troubleshooting Parity Error Messages

The Cisco 12000 Series Internet Router Parity Error Fault Tree document explains the steps to troubleshoot and isolate a failing part or component of the Cisco 12000 Series Internet Router after you encounter a variety of parity error messages.

Information to Collect if You Open a TAC Service Request

If you still need assistance after following the troubleshooting steps above and want to open a service request (registered customers only) with the Cisco TAC, be sure to include the following information for troubleshooting hardware problems on the Cisco 12000 Series Internet Router:
  • show log output or console captures showing the troubleshooting steps taken and the boot sequence during each step
  • Troubleshooting logs
  • Output from the show technical-support command
Please attach the collected data to your case in non-zipped, plain text format (.txt). You can attach information to your case by uploading it using the Service Request Tool (registered customers only). If you cannot access the Service Request Tool, you can send the information as an email attachment to attach@cisco.com with your case number in the subject line of your message.

Note: Please do not manually reload or power-cycle the router before collecting the above information unless required, as this can cause important information to be lost that is needed for determining the root cause of the problem.
