
Cisco 7500 Series Routers

Hardware Troubleshooting for the Cisco Route Switch Processor (RSP)

Document ID: 16100

Updated: Sep 25, 2006

Introduction

Valuable time and resources are often wasted to replace hardware that actually functions properly. This document helps troubleshoot common hardware issues with the Cisco 7500 Series Router and, more specifically, its Route Switch Processor (RSP) card. This document provides pointers for the identification of faulty hardware.

Note: This document does not cover any software-related failures except for those that are often mistaken as hardware issues.

Prerequisites

Requirements

Cisco recommends that you have knowledge of these topics:

Components Used

The information in this document is based on these software and hardware versions:

  • All Cisco IOS® software releases

  • These RSPs in any of the 7500 Series Routers that include the 7505, 7507, 7513, and 7576:

    • RSP1

    • RSP2

    • RSP4

    • RSP4+

    • RSP8

    • RSP16*

    *Supported on 7505, 7507, and 7513. RSP16 is not supported on the 7576.

The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, make sure that you understand the potential impact of any command.

Conventions

Refer to Cisco Technical Tips Conventions for more information on document conventions.

Cisco 7500 Router Series Family

[Figures: RSP1, RSP2, RSP4, RSP8, and RSP16 cards (rsp1.gif, rsp2.gif, rsp4.gif, rsp8.gif, rsp16.gif)]

Background

The Cisco 7500 series router has at least one RSP and between one and 11 interface processors (Legacy IP or Versatile Interface Processor - VIP).

The RSP handles the main functions of the router. It is responsible for routing protocol algorithms, packet switching in non-distributed environments, higher-level features, and so forth. The Interface Processors (IPs and VIPs) contain the network interfaces for the router. RSPs can only go into certain slots within the 7500 Series Router, as outlined in this table. Slot numbering starts at 0:

Router    Slot Number(s)
7505      4
7507      2 and 3
7513      6 and 7
7576      6 and 7

Note that for the 7507, 7513, and 7576, the lower and higher slot numbers are referred to as the Primary RSP slot and the Secondary RSP slot, respectively.
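
The slot rules above can be expressed as a small lookup. This is only an illustrative sketch: the names RSP_SLOTS and rsp_slot_role are hypothetical, and the label "single" for the one RSP slot of the 7505 is an assumption, because this document only names Primary and Secondary slots for the dual-RSP chassis.

```python
# Valid RSP slots per chassis, copied from the slot table in this document.
# Slot numbering starts at 0.
RSP_SLOTS = {
    "7505": (4,),
    "7507": (2, 3),
    "7513": (6, 7),
    "7576": (6, 7),
}


def rsp_slot_role(chassis: str, slot: int) -> str:
    """Return 'primary', 'secondary', 'single', or 'invalid' for an RSP slot."""
    slots = RSP_SLOTS.get(chassis)
    if slots is None:
        raise ValueError(f"unknown chassis: {chassis}")
    if slot not in slots:
        return "invalid"
    if len(slots) == 1:
        # Assumption: the 7505 has only one RSP slot, so no primary/secondary split.
        return "single"
    # The lower slot number is the Primary RSP slot, the higher one the Secondary.
    return "primary" if slot == min(slots) else "secondary"


if __name__ == "__main__":
    for chassis, slot in (("7507", 2), ("7513", 7), ("7505", 4), ("7507", 4)):
        print(chassis, slot, rsp_slot_role(chassis, slot))
```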

There are six different versions of the RSPs used for the Cisco 7500 Series Routers:

Route Switch Processor Type    Description
RSP1     Contains a MIPS R4600 CPU that runs at 100 MHz internally, 50 MHz external bus speed, and supports memory options from 16 MB to 128 MB
RSP2     Contains a MIPS R4600 CPU that runs at 100 MHz internally, 50 MHz external bus speed, and supports memory options from 32 MB to 128 MB
RSP4     Contains a MIPS R5000 CPU that runs at 200 MHz internally, 100 MHz external bus speed, and supports memory options from 32 MB to 256 MB
RSP4+    This RSP is identical to the RSP4 except that it has Error-Correcting Code (ECC) memory protection/correction and an updated version of ROMMON
RSP8     Contains a MIPS R7000 CPU that runs at 250 MHz internally, 100 MHz external bus speed, and supports memory options from 64 MB to 256 MB
RSP16    Contains a MIPS R7000 CPU that runs at 500 MHz internally, 100 MHz external bus speed, and supports memory options from 64 MB to 1 GB of Synchronous Dynamic RAM (SDRAM). The RSP16 supports Error-Correcting Code (ECC) in addition to 2 MB of Static RAM (SRAM) for Layer 3 cache.
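
As a rough illustration, the memory ranges in the table can be used to sanity-check an installed DRAM size before an upgrade. The names RSP_MEMORY_MB and memory_in_range are hypothetical. The RSP4+ range is an assumption based on the statement that it is identical to the RSP4 apart from ECC and ROMMON. The check only tests the range and does not confirm that a given size is an orderable memory option.

```python
# (minimum MB, maximum MB) of supported memory per RSP type, from the table above.
RSP_MEMORY_MB = {
    "RSP1": (16, 128),
    "RSP2": (32, 128),
    "RSP4": (32, 256),
    "RSP4+": (32, 256),   # assumption: same range as the RSP4
    "RSP8": (64, 256),
    "RSP16": (64, 1024),  # 1 GB of SDRAM
}


def memory_in_range(rsp_type: str, installed_mb: int) -> bool:
    """Return True if installed_mb lies within the documented range for rsp_type."""
    try:
        low, high = RSP_MEMORY_MB[rsp_type]
    except KeyError:
        raise ValueError(f"unknown RSP type: {rsp_type}") from None
    return low <= installed_mb <= high


if __name__ == "__main__":
    print(memory_in_range("RSP4", 256))   # within the 32 MB to 256 MB range
    print(memory_in_range("RSP2", 256))   # above the 128 MB maximum
```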

Hardware-Software Compatibility and Memory Requirements

Whenever you install a new RSP, VIP, port adapter, or Cisco IOS software image, it is important to verify that the router has enough memory, and that the hardware and software are compatible.

Perform these recommended steps in order to check for hardware-software compatibility and memory requirements:

  1. Use the Software Advisor (registered customers only) Tool in order to verify whether the modules and cards are supported by the desired Cisco IOS software version.

    Tip: Make sure you go to the Software Support for Hardware (registered customers only) section.

  2. Use the Cisco Download Software Area (registered customers only) in order to check the minimum amount of memory (RAM and Flash) required by the Cisco IOS software, and/or download the Cisco IOS software image. Refer to Memory Requirements in order to determine the amount of memory (RAM and Flash) installed.

    Tip: Complete the steps in the Software Installation and Upgrade Procedure for the Cisco 7500 Series Router if you determine that a Cisco IOS software upgrade is required.

Error Messages

The Error Message Decoder (registered customers only) Tool allows you to check the definition of an error message. Error messages appear on the console of Cisco products, usually in this form:

%XXX-n-YYYY : [text]

This is an example of an error message:

Router# %SYS-2-MALLOCFAIL: Memory allocation of [dec] bytes failed from [hex], pool [chars], alignment [dec]

Some error messages are informational only, while others indicate hardware or software failures and require action. The Error Message Decoder (registered customers only) Tool provides an explanation of the message, a recommended action, if needed, and if available, a link to a document that provides extensive troubleshooting information about that error message.
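
When you work through a large console log or syslog capture, it can help to split each message into its parts before you look it up. The sketch below assumes the %XXX-n-YYYY : [text] form shown above, where XXX is the facility, n is the severity (0 to 7), and YYYY is the mnemonic. The function name parse_ios_message and the regular expression are illustrative and are not part of any Cisco tool. The severity names are the standard Cisco IOS logging levels.

```python
import re
from typing import Optional

# Standard Cisco IOS logging severity levels (0 is the most severe).
SEVERITY_NAMES = {
    0: "emergencies", 1: "alerts", 2: "critical", 3: "errors",
    4: "warnings", 5: "notifications", 6: "informational", 7: "debugging",
}

# Matches %FACILITY-SEVERITY-MNEMONIC: text anywhere in a log line.
MESSAGE_RE = re.compile(r"%([A-Z0-9_]+)-([0-7])-([A-Z0-9_]+)\s*:\s*(.*)")


def parse_ios_message(line: str) -> Optional[dict]:
    """Split one log line into facility, severity, mnemonic, and text.

    Returns None when the line does not contain a message in the expected form.
    """
    match = MESSAGE_RE.search(line)
    if match is None:
        return None
    facility, severity, mnemonic, text = match.groups()
    return {
        "facility": facility,
        "severity": int(severity),
        "severity_name": SEVERITY_NAMES[int(severity)],
        "mnemonic": mnemonic,
        "text": text.strip(),
    }


if __name__ == "__main__":
    sample = ("Jun 30 10:27:40.836 UTC: %FIB-3-NORPXDRQELEMS: Exhausted XDR "
              "queuing elements while preparing message for slot 4")
    print(parse_ios_message(sample))
```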

Example Memory Issue

This show log output contains the low-memory error message %SYS-2-MALLOCFAIL, raised while the BGP Router process was running. Check the show processes memory and show memory summary output in order to verify the memory usage of the BGP process.

Router#show log
%SYS-2-MALLOCFAIL: Memory allocation of 32768 bytes failed from 0x403B4650, alignment 0 
Pool: Processor  Free: 406936  Cause: Memory fragmentation 
Alternate Pool: None  Free: 0  Cause: No Alternate pool 

-Process= "BGP Router", ipl= 0, pid= 158
-Traceback= 403B96D0 403BD8BC 403B4658 40DF73C0 402476FC 4064FA10 4061C840 406268A0 40626A4C 40816EC4 408102B0 40ED0820 408103C0 407D46A8
Jun 30 10:27:40.836 UTC: %FIB-3-NORPXDRQELEMS: Exhausted XDR queuing elements while preparing message for slot 4
-Process= "BGP Router", ipl= 0, pid= 158
-Traceback= 40DF74A0 402476FC 4064FA10 4061C840 406268A0 40626A4C 40816EC4 408102B0 40ED0820 408103C0 407D46A8
 %BGP-5-ADJCHANGE: neighbor 10.10.10.254 Down BGP Notification sent
 %BGP-3-NOTIFICATION: sent to neighbor 10.10.10.254 4/0 (hold time expired) 0 bytes 
 %BGP-5-ADJCHANGE: neighbor 10.10.10.99 Down BGP Notification sent
 %BGP-3-NOTIFICATION: sent to neighbor 10.10.10.99 4/0 (hold time expired) 0 bytes 
 %BGP-5-ADJCHANGE: neighbor 10.10.10.100 Down BGP Notification sent
 %BGP-3-NOTIFICATION: sent to neighbor 10.10.10.100 4/0 (hold time expired) 0 bytes 
 %BGP-5-ADJCHANGE: neighbor 10.10.10.254 Up 

Router#show processes memory
Processor Pool Total:  229224896 Used:  198433716 Free:   30791180
     Fast Pool Total:     131072 Used:     131024 Free:         48 


!--- Output suppressed.
 

Router#show memory summary
              Head     Total(b)        Used(b)    Free(b)   Lowest(b)  Largest(b)
Processor   42564E40   229224896    198457508    30767388       22200    196700
     Fast   42544E40      131072       131024          48          48        48

In the previous output, the largest available block in Processor memory is 196700 bytes, and total free memory is 30767388 bytes. The router needs more than 40 MB of free memory in order to accommodate BGP transient memory usage when neighbors come up or go down. In this scenario, consider a memory upgrade, or use BGP filters or a BGP reconfiguration in order to reduce the size of the routing table. This is an example of a low memory issue on a router.
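
The comparison described above (largest free block versus total free memory) can be automated for captured output. This is a minimal sketch that assumes the column layout of the show memory summary output in this example. The names PoolStats and parse_memory_summary are hypothetical. The 40 MB figure is the one quoted for this BGP scenario, not a general rule.

```python
from typing import Dict, NamedTuple


class PoolStats(NamedTuple):
    total: int
    used: int
    free: int
    lowest: int
    largest: int


def parse_memory_summary(output: str) -> Dict[str, PoolStats]:
    """Parse the pool rows of 'show memory summary' into PoolStats by pool name."""
    pools = {}
    for line in output.splitlines():
        fields = line.split()
        # A pool row has a name, a hex head address, and five decimal counters.
        if len(fields) == 7 and all(f.isdigit() for f in fields[2:]):
            pools[fields[0]] = PoolStats(*(int(f) for f in fields[2:]))
    return pools


SAMPLE = """\
              Head     Total(b)        Used(b)    Free(b)   Lowest(b)  Largest(b)
Processor   42564E40   229224896    198457508    30767388       22200    196700
     Fast   42544E40      131072       131024          48          48        48
"""

# Threshold taken from the BGP example in this document (assumed to mean 40 * 1024 * 1024 bytes).
FREE_THRESHOLD_BYTES = 40 * 1024 * 1024

pools = parse_memory_summary(SAMPLE)
proc = pools["Processor"]
# A largest block that is a tiny fraction of free memory points to fragmentation.
ratio = proc.largest / proc.free
low_on_memory = proc.free < FREE_THRESHOLD_BYTES

if __name__ == "__main__":
    print(f"free={proc.free} largest={proc.largest} largest/free={ratio:.4f}")
    print(f"below 40 MB threshold: {low_on_memory}")
```

A small largest-to-free ratio matches the "Cause: Memory fragmentation" line in the log, while free memory below the threshold supports the recommendation to add memory or reduce the routing table.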

Identify the Issue

An RSP can reboot or reload for various reasons. Several of these are due to potential hardware issues. This section explains how to capture the different types of output that help with troubleshooting, and lists symptoms that are often mistaken for the result of bad hardware. Troubleshooting tips for the symptoms are listed in the Troubleshooting Guidelines section.

How to Capture Information

The first step is to capture as much information about the problem as possible in order to determine what causes the issue. This information is essential in order to determine the cause of the problem:

  • Crashinfo file(s) - When a Route Switch Processor (RSP) crashes, it attempts to save a crashinfo file into its bootflash. Refer to Retrieving Information from the Crashinfo File for details about crashinfo files. Note that if the router has dual RSPs, the crashinfo file might be in the bootflash of the standby RSP, if that RSP was the primary RSP when it crashed. If the crashinfo file is created successfully, it is usually present in the bootflash of the RSP that crashed.

  • Console logs and/or Syslog information - These can be crucial in order to determine the original issue when multiple symptoms occur, which is usually the case with the Cisco 7500 Series Router. Effective troubleshooting is possible only if the console log or syslog is available. If the router is set up to send logs to a syslog server, check the server for the log. For console logs, make sure that you are directly connected to the console port of the router and that you use the correct terminal emulator settings (refer to Apply Correct Terminal Emulator Settings for Console Connections). Also make sure that logging is enabled.

  • show technical-support output - The show technical-support command is a compilation of many different commands that include show version, show running-config, and show stacks. When an RSP experiences issues, the Technical Support engineer usually asks for this information. It is important to collect the show technical-support output before you reload or power-cycle the router, as these actions can cause all information about the problem to be lost. This is because the context information saved on the stacks is cleared when the router is reloaded.

  • show environment commands - The show environment all command is used in order to view the power supply and temperature output of the router. In addition to the show environment all command, the show environment last and show environment table commands are also helpful.

If you have the output of a show command from your Cisco device, which includes show technical-support, you can use the Output Interpreter (registered customers only) in order to display potential issues and fixes. You must be a registered customer, be logged in, and have JavaScript enabled in order to use the Output Interpreter.

Misleading Symptoms

Some issues might be misinterpreted as hardware problems, when, in fact, they are not. Some of the more common issues are when the router stops responding or hangs, or when the router fails due to new hardware installation. This is a list of symptoms, explanations, and troubleshooting steps for these commonly misinterpreted issues:

Symptom: The RSP hangs during normal operation
Explanation: This is usually caused by software problems, but can also be caused by hardware. Refer to Troubleshooting Router Hangs.

Symptom: A new RSP, VIP, or port adapter is not recognized
Explanation: Use the Software Advisor (registered customers only) tool in order to determine if the new card is supported in your current Cisco IOS software version.

Symptom: The RSP1, RSP2 or RSP4 crashes or hangs on bootup
Explanation: This could be caused by the first file on the bootflash not being a valid Cisco IOS software image or RxBoot image. This is documented in Field Notice 14484: Router May Fail To Boot when Bootflash Contains Non-Bootable Code. This problem should not affect the RSP4+, the RSP8, and the RSP16.

Symptom: You get the error message RSP-3-RESTART: cbus complex
Explanation: This error message might be due to configuration changes, Online Insertion and Removal (OIR) of an interface processor, or other software or bad hardware issues. This error message is discussed in detail in What Causes a "%RSP-3-RESTART: cbus complex"?.

Symptom: RSP CPU utilization runs very high
Explanation: While some hardware problems can cause high CPU utilization, it is much more likely that the router is either misconfigured or something on the network causes the problem. This is discussed in detail in Troubleshooting High CPU Utilization on a Cisco Router.

Symptom: Memory allocation errors are seen on the RSP
Explanation: Memory allocation errors are almost never caused by hardware problems. Troubleshooting tips for memory allocations errors are located on the Troubleshooting Memory Problems page.

Symptom: RSP crashes
Explanation: Not all RSP crashes are caused by hardware. A majority of RSP crashes are actually caused by software. This is discussed in detail on the Troubleshooting Router Crashes page.

Troubleshooting Guidelines

These are some troubleshooting guidelines, which depend on the type of issue you encounter:

  • Parity Errors - Parity errors on a 7500 are most commonly triggered due to bad hardware. In order to troubleshoot parity errors, capture the output at the time of the crash. Once you have this information, refer to Processor Memory Parity Errors - RSP for detailed troubleshooting steps.

  • Bus Error at a Valid Address - Refer to Troubleshooting Bus Error Crashes for further information on bus errors. If the address of the bus error is a valid address, then the most likely cause of the problem is a hardware failure.

  • Continuous Rebooting - If the Cisco 7500 Series Router continuously reboots, even after a power-cycle of the router, complete these steps:

    1. Remove all the cards, except for the RSP. If the RSP sits in the Secondary RSP slot, move it to the Primary RSP slot, and power-cycle the router. If the router does not boot up with the RSP in the Primary slot, move that RSP to the Secondary slot, and reload the router.

    2. If the router still does not work properly, collect the console log or syslog of the boot sequence and create a service request with Cisco Technical Support.

Information to Collect if You Open a TAC Service Request

If you have identified a component that needs to be replaced, contact your Cisco partner or reseller to request a replacement for the hardware component that is causing the issue. If you have a support contract directly with Cisco, use the Cisco Technical Support Service Request Tool (registered customers only) to open a Cisco Technical Support service request for a hardware replacement. Make sure you attach the following information:
  • Console captures that show the error messages
  • Console captures that show the troubleshooting steps taken and the boot sequence during each step
  • The hardware component that failed and the serial number for the chassis
  • Troubleshooting logs
  • Output from the show technical-support command
