Hardware Troubleshooting for the 7500 Series Router

Document ID: 17852

Updated: Dec 13, 2007

Introduction

Valuable time and resources are often wasted replacing hardware that actually functions properly. This document helps you troubleshoot common hardware issues with the Cisco 7500 Series router chassis and provides pointers for determining whether a fault is actually in the hardware. It does not cover software-related failures, except for those that are often mistaken for hardware issues.

Before You Begin

Conventions

For more information on document conventions, see the Cisco Technical Tips Conventions.

Prerequisites

Readers of this document should take the following steps:

Components Used

The information in this document is based on the software and hardware versions below.

  • Cisco IOS® Software (all versions)

  • Cisco 7500 Series routers, including the 7505, 7507, 7513, and 7576 routers.

The information presented in this document was created from devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If you are working in a live network, ensure that you understand the potential impact of any command before using it.

Cisco 7500 Components

The components that make up the Cisco 7500 Series chassis include:

  • The chassis.

  • Power supplies.

  • The card cage.

  • The arbiter.

  • The chassis interface.

  • The fan assembly.

The chassis itself has no electronic components, so it is very rarely the cause of hardware-related problems unless some of the backplane connectors are bent or broken. The power supplies, card cage, arbiter, chassis interface, and fan assembly all have electronic components, and therefore can be affected by hardware problems. In general, hardware problems with these components result in either error messages or total failure of the router.

Note: The Cisco 7505 has a single arbiter, while the Cisco 7507, 7513, and 7576 have a dual arbiter for the dual CyBuses.

Note: The dual arbiter and chassis interface are printed circuit boards that attach to the rear of the backplane. The dual arbiter and chassis interface are replaced when the card cage and backplane assembly are replaced.

Identifying the Issue

Routers can restart and reload, fail to function, or produce error messages for many reasons. Several of these are potential hardware issues.

Most common chassis-related problems are reported using the following error message syntax:

%CI-n-YYYY: [text]

CI refers to the Chassis Interface (CI) card. The n is the severity of the error, and YYYY is the code that identifies the message. The text contains a detailed description of the event. The CI card contains a set of registers and EEPROM memory that hold vital chassis-related environmental information. Common log messages that indicate a potential issue are discussed in the following sections.
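The message syntax above can be split into its parts mechanically when you are scanning saved logs. The following is a minimal Python sketch, not a Cisco tool; the names CI_PATTERN, CIMessage, and parse_ci_message are illustrative assumptions, and the pattern assumes a single-digit severity and an uppercase code as in the samples in this document.

```python
import re
from typing import NamedTuple, Optional

# Pattern for the %CI-n-YYYY: [text] syntax described above.
# Assumption: severity is one digit and the code is uppercase letters/digits.
CI_PATTERN = re.compile(
    r"%CI-(?P<severity>\d)-(?P<code>[A-Z0-9_]+)\s*:\s*(?P<text>.*)"
)


class CIMessage(NamedTuple):
    severity: int  # the n field
    code: str      # the YYYY field, for example BLOWER
    text: str      # the free-form description


def parse_ci_message(line: str) -> Optional[CIMessage]:
    """Return the parsed CI message found in a log line, or None if absent."""
    match = CI_PATTERN.search(line)
    if match is None:
        return None
    return CIMessage(
        severity=int(match.group("severity")),
        code=match.group("code"),
        text=match.group("text").strip(),
    )


if __name__ == "__main__":
    print(parse_ci_message("%CI-3-BLOWER: main fan failure"))
```

Because the pattern is searched rather than anchored, a line that carries a timestamp or other prefix before the %CI portion is still recognized.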

Fan Failure

%CI-3-BLOWER: main fan failure 
%CI-3-BLOWER: #1 fan failure

The above messages are printed when one of the cooling fans has failed. Possible causes are:

  1. A dirty fan or filter—You can determine this by physically looking at the fan. If it is dirty, pull out the fans and clean them using an air compressor or vacuum cleaner. You can pull out the fans without powering down the router.

    Caution: Do not try to stop the fan from spinning by using your hand or an object; this could be dangerous for you and could damage the fan.

    Note: If the fault lies with a fan assembly and the router is essential to the production network, the router can run without one fan assembly for a few hours.

  2. A bad or mis-seated fan assembly—If the filter is clean, there is a good possibility that the fan has gone bad. Check whether the fan is turning; if not, try reseating the fan. This can be done without powering down the router.

  3. A failed CI card—If the above have been checked and the error message continues, the CI card could have failed. Open a case with the Technical Assistance Center (TAC) to have the card replaced.

Router Has Overheated

%CI-4-COND: Restarting with <n> recent soft power shutdowns

The n is the number of times the router has powered down the line cards because of a detected over-temperature condition. When the temperature rises above the board shutdown trip point, the cards are shut down, but the power supplies, fans, and CI continue to run. For more information, refer to CI Error Messages.

If the above message is not accompanied by a %CI-3-BLOWER message, collect the output of the following commands on the router, and then create a service request with the TAC:

  • show environment all

  • show environment last

  • show environment table

There are several possible conditions that can trigger this error message, and the above information will help to pinpoint the cause.
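The check described in this section (a %CI-4-COND message with no accompanying %CI-3-BLOWER message) can be applied to a saved log. The following Python sketch is only an illustration under stated assumptions: the function name overheat_followup and the returned dictionary layout are hypothetical, and the message text is matched as it appears in this document.

```python
import re

# Commands this section asks you to collect before contacting the TAC.
ENV_COMMANDS = [
    "show environment all",
    "show environment last",
    "show environment table",
]

# Matches the COND message as shown in this document.
COND_RE = re.compile(
    r"%CI-4-COND: Restarting with (\d+)\s+recent soft power shutdowns"
)


def overheat_followup(log_lines):
    """Summarize an over-temperature restart found in the given log lines.

    Returns None when no %CI-4-COND message is present. Otherwise returns
    the shutdown count, whether a fan failure was also logged, and the
    commands to collect (empty when a BLOWER message points at the fans).
    """
    # Join the lines so a message wrapped across two lines still matches.
    joined = " ".join(line.strip() for line in log_lines)
    match = COND_RE.search(joined)
    if match is None:
        return None
    fan_failure_logged = "%CI-3-BLOWER" in joined
    return {
        "shutdowns": int(match.group(1)),
        "fan_failure_logged": fan_failure_logged,
        "commands": [] if fan_failure_logged else list(ENV_COMMANDS),
    }


if __name__ == "__main__":
    sample = ["%CI-4-COND: Restarting with 2", "recent soft power shutdowns"]
    print(overheat_followup(sample))
```

When a fan failure is logged alongside the restart, the sketch returns no commands, because the Fan Failure section is the more likely starting point in that case.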

Power Supply Failure

%CI-4-ENVWARN: -12 Voltage measured at -13.48
%CI-4-ENVWARN: +24 Voltage measured at 26.70
%CI-2-ENVCRIT: +12 Voltage measured at 13.62
%CI-2-ENVCRIT: +5 Voltage measured at 5.78

The above messages indicate the voltage measured by the CI card. The following are the most likely triggers for these messages:

  • A faulty power supply—If you have a spare power supply available, install it in either of the power supply bays and verify that the error messages disappear. If, after replacing the power supply, the errors still occur, the CI card is the next thing to check (see below).

  • A faulty CI card—If the router has two power supplies, and the combined bus voltages are low, the cause cannot be a power supply problem. In the Cisco 7500 architecture, one power supply cannot drag down the output voltage of another; isolation diodes prevent this from happening. A faulty CI card is the most likely cause. Open a case with the TAC for further troubleshooting.
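To compare readings across the different rails, it can help to express each measurement as a percentage deviation from the nominal voltage named in the message. The following Python sketch does only that arithmetic; the function name voltage_deviation is an illustrative assumption, and the sketch deliberately encodes no warning or critical thresholds, because this document does not state them.

```python
import re

# Matches the ENVWARN and ENVCRIT voltage messages shown above:
# the nominal rail (for example +5) followed by the measured value.
ENV_RE = re.compile(
    r"%CI-\d-(ENVWARN|ENVCRIT): ([+-]?\d+(?:\.\d+)?) Voltage measured at "
    r"([+-]?\d+(?:\.\d+)?)"
)


def voltage_deviation(line):
    """Return (code, nominal, measured, percent deviation) or None.

    The deviation is (measured - nominal) / nominal * 100, so a positive
    result means the magnitude is above nominal, for either polarity.
    """
    match = ENV_RE.search(line)
    if match is None:
        return None
    nominal = float(match.group(2))
    measured = float(match.group(3))
    percent = (measured - nominal) / nominal * 100.0
    return match.group(1), nominal, measured, round(percent, 2)


if __name__ == "__main__":
    samples = [
        "%CI-4-ENVWARN: -12 Voltage measured at -13.48",
        "%CI-4-ENVWARN: +24 Voltage measured at 26.70",
        "%CI-2-ENVCRIT: +12 Voltage measured at 13.62",
        "%CI-2-ENVCRIT: +5 Voltage measured at 5.78",
    ]
    for sample in samples:
        print(voltage_deviation(sample))
```

For the +24 sample above, 26.70 against a nominal 24 is a deviation of about 11.25 percent. If every rail in a router with two power supplies shows a similar deviation, that is consistent with the CI card explanation given above rather than with a single faulty supply.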

%CI-3-PSFAIL: Power supply 2 failure
%CI-3-BLOWER: ps2 fan failure

The above messages appear when there is a power supply failure. In this example, the power supply in slot 2 has failed, and as a result the fan in that power supply has also failed.

When you see the above error messages, take the following steps:

  1. Ensure that the power supply that logs the above message is fully seated into the router. If not, insert it firmly. The router need not be powered down for this activity.

  2. If the power supply is fully seated but the messages still appear, it is possible that the power supply has failed. In this event, create a service request with the TAC.

  3. If replacing the power supply does not resolve the issue, the CI card needs to be replaced. The power supply connector pins are connected to the CI card connector pins, and the pins on the CI card might be bent. To address this problem, create a service request with the TAC.

Misleading Symptoms

Prior to this point, this document has dealt with hardware problems in the Cisco 7500 series. However, there are a few issues that can be misinterpreted as hardware problems, even though they are not. A common example is when the router simply stops responding, or "hangs." Another example is a failure following a new hardware installation. It is very uncommon for either of these symptoms to be caused by a chassis component. Please refer to the Misleading Symptoms section of Hardware Troubleshooting for the Cisco Route Switch Processor (RSP) for further information.

Capturing Information

The first step in troubleshooting a hardware problem is to capture as much information about it as possible. The following information is essential to determining the cause of an error:

  • Console logs and syslog information—These can be crucial in determining the originating issue if multiple errors occur together. If the router is set up to send logs to a syslog server or to a PC connected to the console port, you may be able to retrieve the logs generated when the failure occurred. For console logs, it is best to connect a PC, laptop, or terminal server directly to the console port of the router, with system message logging enabled.

  • show environment commands—The following commands provide information about the power supplies and about the temperatures of various components of the router (which directly reflect the status of the fan assembly):

    • show environment all

    • show environment table

    • show environment last

  • show technical-support—The show technical-support command is a compilation of many different commands, including show version, show running-config, and show stacks. It is important to collect the show technical-support output before doing a reload or power-cycle, as these actions can cause all information about the problem to be lost.

  • Crashinfo files—When an RSP crashes, it attempts to save a crashinfo file into its bootflash memory. Refer to Retrieving Information from the Crashinfo File for details about crashinfo files. As a rule, if a crashinfo file was created, then it exists in the bootflash of the RSP that has crashed.

    Note: If the router has dual RSPs and the standby RSP has crashed, the crashinfo file may be on the bootflash of the standby RSP.
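The show commands listed above can be gathered in one pass before any reload. The following Python sketch shows one way to organize that collection; it is an illustration only. The run_command parameter is a hypothetical callable that you would supply (for example, a wrapper around your own console or terminal session), and the sketch itself does not connect to a router.

```python
from typing import Callable, Dict

# Commands named in this section, to be captured before a reload or
# power-cycle.
CAPTURE_COMMANDS = [
    "show environment all",
    "show environment table",
    "show environment last",
    "show technical-support",
]


def capture_outputs(run_command: Callable[[str], str]) -> Dict[str, str]:
    """Run each capture command and return its output keyed by command.

    run_command is a hypothetical callable supplied by the caller; it takes
    a command string and returns that command's output as text.
    """
    outputs = {}
    for command in CAPTURE_COMMANDS:
        outputs[command] = run_command(command)
    return outputs


if __name__ == "__main__":
    # Stand-in runner for demonstration; replace it with a real session.
    def fake_runner(command: str) -> str:
        return "output of " + command

    for command, output in capture_outputs(fake_runner).items():
        print(command, "->", output)
```

Keeping the session handling outside the function means the same list can be reused whether the output comes from a console connection, a terminal server, or a saved transcript.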
