Cisco 2800 Series Router Hardware Troubleshooting

Document ID: 71444

Updated: Jun 20, 2008

Introduction

Valuable time and resources are often wasted replacing hardware that actually functions properly. This document helps you troubleshoot potential hardware issues with Cisco 2800 Series Routers and, based on the type of error that the router experiences, identify which component causes a hardware failure.

Note: This document does not cover any software-related failures except for those that are often mistaken as hardware issues.

Prerequisites

Requirements

Cisco recommends that you have knowledge of these topics:

Components Used

The information in this document is based on Cisco 2800 Series Routers.

Conventions

Refer to Cisco Technical Tips Conventions for more information on document conventions.

Hardware-Software Compatibility and Memory Requirements

Whenever you install a new card, module or Cisco IOS® software image, it is important to verify that the router has enough memory, and that the hardware and software are compatible with the features you wish to use.

Perform these recommended steps to check for hardware-software compatibility and memory requirements:

  1. Use the Software Advisor tool (registered customers only) to choose software for your network device.

    Tip: The Software Support for Hardware (registered customers only) section helps you verify whether the modules and cards installed on the router are supported by the desired Cisco IOS software version.

    Tip: The Software Support for Features (registered customers only) section helps you choose the types of features you wish to implement in order to determine the Cisco IOS software image that is needed.

  2. Use the Download Software Area (registered customers only) to check the minimum amount of memory (RAM and Flash) required by the Cisco IOS software, and to download the Cisco IOS software image. Refer to the Memory Requirements section of How to Choose a Cisco IOS Software Release in order to determine the amount of memory (RAM and Flash) installed on your router.

    Tip: If you want to keep the same features as the version that currently runs on your router, but you do not know which feature set you use, issue the show version command on your Cisco device and paste the output into the Output Interpreter tool (registered customers only), which displays potential issues and fixes. You must be logged in and have JavaScript enabled in order to use this tool.

    Tip: If you need to upgrade the Cisco IOS software image to a new version or feature set, you can refer to How to Choose a Cisco IOS Software Release for more information.

  3. If you determine that a Cisco IOS software upgrade is required, refer to Upgrading the System Image for the Cisco 2800 Series Router.

    Tip: If your 2800 router does not have a connection to the network or a valid Cisco IOS software image, you can issue the tftpdnld ROMmon command to recover the IOS image. Refer to How to Download a Software Image to a Cisco 2600/2800/3700/3800 via TFTP Using the tftpdnld ROMMON Command for more information.

Error Messages

The Error Message Decoder tool (registered customers only) allows you to check the meaning of an error message. Error messages appear on the console of Cisco products, usually in this form:

 %XXX-n-YYYY : [text]

This is an example error message:

Router# %SYS-2-MALLOCFAIL: Memory allocation of [dec] bytes failed from [hex], 
pool [chars], alignment [dec]

Some error messages are informational only, while others indicate hardware or software failures and require action. The Error Message Decoder Tool provides an explanation of the message, a recommended action (if needed), and if available, a link to a document that provides extensive troubleshooting information about that error message.
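
The message form shown above can be split into its parts before you look it up. This is a minimal Python sketch, not a Cisco tool. It assumes that XXX is the facility, n is a single severity digit from 0 to 7, and YYYY is the mnemonic, which is the usual reading of this form but is not spelled out in this document. The names MESSAGE_RE and parse_error_message are hypothetical.

```python
import re

# Pattern for the %XXX-n-YYYY : [text] form.
# Group names (facility, severity, mnemonic) are assumptions for illustration.
MESSAGE_RE = re.compile(
    r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>[0-7])-(?P<mnemonic>[A-Z0-9_]+)"
    r"\s*:\s*(?P<text>.*)",
    re.DOTALL,
)


def parse_error_message(line):
    """Return the parts of a console error message, or None if none is found."""
    match = MESSAGE_RE.search(line)
    if match is None:
        return None
    parts = match.groupdict()
    parts["severity"] = int(parts["severity"])
    # Collapse console line wraps so the text reads as one string.
    parts["text"] = " ".join(parts["text"].split())
    return parts
```

For the example message above, the sketch returns the facility SYS, the severity 2, and the mnemonic MALLOCFAIL. The text still needs to be looked up in the Error Message Decoder tool.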

Troubleshooting

These sections from Troubleshooting Cisco 2800 Series Routers are useful:

Also, refer to Password Recovery Procedure for troubleshooting information.

Modules and Cards

These documents can help you verify which module/card is supported for the Cisco 2800 Series Router:

T1 Controller VWIC2-2MFT-T1/E1 Issues

After you install the VWIC2-2MFT-T1/E1 card, the Cisco IOS software might not recognize it. Issue the card type {t1 | e1} command to configure the router so that it recognizes the card. Refer to Configuration Examples for Second-Generation 1- and 2-Port T1/E1 Multiflex Trunk Voice/WAN Interface Cards for more information.

NM-16ESW-PWR-1GIG Module PoE Issues

NM-16ESW-PWR-1GIG is an EtherSwitch network module with Power over Ethernet (PoE) capabilities. After you add this card, you might not be able to configure PoE. This is because you need to have a matching power supply installed on the router to support the PoE features. Refer to Cisco EtherSwitch Network Modules Data Sheet for more information on the EtherSwitch network modules and the power supplies.

Identify the Issue

In order to identify the issue, the first step is to capture as much information about the problem as possible. This information is essential to help you determine the cause of the problem:

  • Console logs—Refer to Applying Correct Terminal Emulator Settings for Console Connections for more information.

  • Syslog information—If the router is set up to send logs to a syslog server, you can obtain information on what occurred. Refer to the How to Configure Cisco Devices for Syslog section of Resource Manager Essentials and Syslog Analysis: How-To for more information.

  • show technical-support command output—The show technical-support command is a compilation of many different commands which includes the show version, show running-config, and show stacks commands. TAC engineers usually ask for this information to troubleshoot hardware issues. It is important to collect the show technical-support command information before you perform a reload or power-cycle as these actions can cause the loss of all information about the issue.

  • Capture the complete bootup sequence if the router experiences boot errors.

If you have the output of a show command from your Cisco device (including the show technical-support command), you can use the Output Interpreter tool (registered customers only) to display potential issues and fixes. You must be logged in and have JavaScript enabled in order to use this tool.

Troubleshoot Serial Interfaces

This is a list of references to use in order to troubleshoot serial interfaces:

Troubleshoot ISDN Interfaces

This is a list of references to use in order to troubleshoot ISDN interfaces:

Troubleshoot Router Hangs

A 2800 Series Router might experience a router hang. A hang is when the router boots to a certain point and then no longer accepts any commands or keystrokes. In other words, the console screen hangs after a certain point. Hangs are not necessarily hardware issues and most of the time, they are a software issue. Refer to Troubleshooting Router Hangs if your router experiences a router hang.

Router Reboot/Reload

When the router reboots, it returns to a normal state. A normal state means that the router is functional, passes traffic, and you are able to gain access to the router. Issue the show version command and look at the output in order to check why the router rebooted. This is an example:

Router#show version
Router uptime is 20 weeks, 5 days, 33 minutes
System returned to ROM by power-on
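
If you save the show version output to a file, the restart line can be pulled out and sorted into a rough category. This is a minimal Python sketch under these assumptions: the restart line starts with "System returned to ROM by" or "System restarted by", as in the examples in this document, and the category names are hypothetical labels, not Cisco terminology.

```python
import re

# Matches the restart line in saved show version output.
REASON_RE = re.compile(
    r"^System (?:returned to ROM|restarted) by (?P<reason>.+?)\s*$",
    re.MULTILINE,
)


def reload_reason(show_version_output):
    """Return the text after 'by' on the restart line, or None if absent."""
    match = REASON_RE.search(show_version_output)
    return match.group("reason") if match else None


def classify_reason(reason):
    """Map a restart reason to a coarse, illustrative category."""
    if reason is None:
        return "unknown"
    lowered = reason.lower()
    if lowered.startswith("power-on"):
        return "power cycle"
    if "watchdog timer expired" in lowered:
        return "watchdog timeout"
    if "bus error" in lowered:
        return "bus error"
    if "segv exception" in lowered:
        return "SegV exception"
    if "software forced crash" in lowered:
        return "software-forced crash"
    return "other"
```

A result of "power cycle" for the example above means that the restart line alone does not point to a crash.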

Router Crashes

A system crash refers to a situation where the system has detected an unrecoverable error and has restarted itself. A crash can be caused by software problems, hardware problems, or both. This section deals with hardware-caused crashes and crashes that are software-related, but might be mistaken as hardware problems.

Caution: If the router is reloaded after the crash, such as through a power-cycle or the reload command, important information about the crash is lost. You need to collect the show technical-support command and show log command outputs, as well as the crashinfo file (if possible) before you reload the router.

Refer to Troubleshooting Router Crashes for more information about this issue.

Bus Error Crashes

The system encounters a bus error when the processor tries to access a memory location that either does not exist (a software error) or does not respond properly (a hardware problem). A bus error can be identified through the output of the show version command provided by the router (if not power-cycled or manually reloaded).

These are two examples of bus error crashes:

Router uptime is 2 days, 21 hours, 30 minutes
System restarted by bus error at PC 0x30EE546, address 0xBB4C4
System image file is "flash:igs-j-l.111-24.bin", booted via flash 
.........

At the console prompt, this error message might also be seen during a bus error:

*** System received a Bus Error exception *** 
signal= 0xa, code= 0x8, context= 0x608c3a50
PC = 0x60368518, Cause = 0x20, Status Reg = 0x34008002

Refer to Troubleshooting Bus Error Crashes for more information about this issue.
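
When you record a bus error for later analysis, it can help to extract the program counter and the accessed address from the show version line. This is a minimal Python sketch that assumes the line has the "bus error at PC ..., address ..." form shown in the example above. The name parse_bus_error is hypothetical.

```python
import re

# Matches the PC and address fields of the bus error line in show version.
BUS_ERROR_RE = re.compile(
    r"bus error at PC (?P<pc>0x[0-9A-Fa-f]+), address (?P<address>0x[0-9A-Fa-f]+)"
)


def parse_bus_error(line):
    """Return the PC and address as integers, or None if this is not a bus error."""
    match = BUS_ERROR_RE.search(line)
    if match is None:
        return None
    return {
        "pc": int(match.group("pc"), 16),
        "address": int(match.group("address"), 16),
    }
```

The sketch only extracts the values. Whether the address is a location that does not exist or one that does not respond properly still has to be determined as described in Troubleshooting Bus Error Crashes.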

Continuous/Boot Loop

The router might experience a continuous loop that can be due to a hardware issue. A continuous loop never lets you gain access to the router. For example, you cannot log in to enable mode, and so on, and the router continues to give scrolling error messages until it is powered off. This section provides examples and troubleshooting steps to determine which piece of hardware causes the continuous loop.

Troubleshooting Flowchart

This is a troubleshooting flowchart for Bus Error Exception, SegV Exception, %ERR-1-GT64010, and Watchdog Timeout continuous loops:

[Figure: troubleshooting flowchart, hwts-2800-1.gif]

If the router does not experience the continuous loop after you complete these troubleshooting steps, then the problem might have been caused by a mis-seated network module. It is recommended that you monitor the router for 24 hours to make sure that the router continues to function without experiencing the issue again.

SegV Exception

If you do not power-cycle or manually reload the router, the show version command displays this output:

Router uptime is 2 days, 3 hours, 5 minutes 
System restarted by error - a SegV exception, PC 0x80245F7C 
System image file is "flash:c2600-js-mz.120-9.bin"

This output can also be present in the console logs:

*** System received a SegV exception *** 
signal= 0xb, code= 0x1200, context= 0x80d15094 
PC = 0x80678854, Vector = 0x1200, SP = 0x80fcf170

Refer to SegV Exceptions for more information about this issue.

%ERR-1-GT64010

This is an example of a %ERR-1-GT64010 error:

%ERR-1-GT64010: Fatal error, PCI Master read 
cause=0x0120E483, mask=0x0CD01F00, real_cause=0x00000400 
bus_err_high=0x00000000, bus_err_low=0x04080000, addr_decode_err=0x14000470

Corrupt Software Image

When booting, a router might detect that a Cisco IOS software image is corrupt. The router displays the "compressed image checksum is incorrect" message, attempts to reload, and reports the event as a software-forced crash:

Error : compressed image checksum is incorrect 0x54B2C70A
        Expected a checksum of 0x04B2C70A


*** System received a Software forced crash ***
signal= 0x17, code= 0x5, context= 0x0
PC = 0x800080d4, Cause = 0x20, Status Reg = 0x3041f003

This behavior can then repeat indefinitely, or the router might drop to the ROM monitor.

This can be caused by a Cisco IOS software image that was corrupted during the transfer to the router. In order to resolve this, load a new image onto the router. Search Cisco.com to find a ROMmon recovery method for your platform.

It can also be caused by faulty memory hardware or by a software bug.

Watchdog Timeouts

Cisco processors have timers that guard against certain types of system hangs. The CPU periodically resets a watchdog timer, which limits how long each process can run. If the timer is not reset in time, a trap occurs, and the router uses the trap to escape from the process that ran longer than it should.

There are two main types of watchdog timeouts. The first type is usually caused by a software problem and is reported in one or both of these ways:

  • The show version command output shows:

    "System returned to ROM by bus error at PC 0x602DADE0, address 0x480811"  
    - or - 
    "System returned to ROM by error - a Software forced crash, PC 0x60435894"
  • The console logs show:

    %SYS-2-WATCHDOG: Process aborted on watchdog timeout

The second type of watchdog timeout is usually due to a hardware problem and is reported in one or both of these ways:

  • The show version command output shows:

    Router uptime is 17 minutes
    System returned to ROM by watchdog timer expired
    System image file is "flash:c3640-is-mz.122-3.bin"
  • The console logs show:

    System returned to ROM by watchdog timer expired
    *** Watch Dog Timeout ***
    PC = 0x800001b4, SP = 0x61e19590

For the first type, both show version outputs indicate potential software issues and need further investigation based on their symptoms. Refer to Troubleshooting Bus Error Crashes or Understanding Software-forced Crashes, depending on which one appears in the show version command output. Refer to Troubleshooting Watchdog Timeouts for more information about watchdog timeout crashes.
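
The two types of watchdog timeout described above can be told apart from saved output with a few string checks. This is a minimal Python sketch that relies only on the strings quoted in this section. The function name and the returned labels are hypothetical, and a match is a starting point for investigation, not a diagnosis.

```python
def watchdog_type(show_version_output, console_log=""):
    """Return which watchdog type the saved output suggests, or None."""
    combined = show_version_output + "\n" + console_log
    # Second type: the restart reason itself names the watchdog timer.
    if ("watchdog timer expired" in combined
            or "*** Watch Dog Timeout ***" in combined):
        return "second type (usually hardware)"
    # First type: the console reports a process aborted on watchdog timeout.
    if "%SYS-2-WATCHDOG" in console_log:
        return "first type (usually software)"
    return None
```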

Router Does Not Boot

Information captured from the console of the router is essential to troubleshoot a router that does not boot. The console output should be logged in a file for later analysis or for Cisco Technical Support if a TAC case is opened. This section compares symptoms and recommended actions to take if you encounter boot problems.

No LEDs On After Powerup

Verify that the power cord is plugged in firmly and that the power supply is good. If that does not resolve the issue, replace the power cord. If the problem persists, replace the router.

LEDs On After Powerup, Nothing on the Console

Verify that the baud rate is set to 9600 bps. If that does not help, verify that the equipment used to connect to the console operates properly. Connect to a good router in order to check your console equipment. If the equipment tests successfully, but the problem remains, then replace the router.

Router Boots to ROMmon, No Error Messages on the Console

Set the configuration register to 0x2102 and reload the router:

rommon 1 > confreg 0x2102 
rommon 2 > reset

If the router remains in ROMmon, complete the procedure described in ROMmon Recovery for the Cisco 2600 Series Router and the VG200.
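
To see what a configuration register value such as 0x2102 selects, you can decode its bit fields. This is a minimal Python sketch that covers only four commonly documented fields (boot field, ignore NVRAM configuration, Break disabled, and boot from ROM if network boot fails). The bit meanings come from general Cisco IOS configuration register documentation, not from this document, so verify them for your platform. The name decode_confreg and the dictionary keys are hypothetical.

```python
def decode_confreg(value):
    """Decode a few well-known configuration register fields."""
    return {
        # Bits 0-3: boot field (0 keeps the router in the ROM monitor).
        "boot_field": value & 0x000F,
        # Bit 6 (0x0040): ignore the startup configuration in NVRAM.
        "ignore_nvram_config": bool(value & 0x0040),
        # Bit 8 (0x0100): Break key disabled after bootup.
        "break_disabled": bool(value & 0x0100),
        # Bit 13 (0x2000): boot default ROM software if network boot fails.
        "boot_rom_on_netboot_failure": bool(value & 0x2000),
    }
```

For 0x2102, the sketch reports a boot field of 2, with Break disabled and the NVRAM configuration not ignored. A boot field of 0, as in 0x2100, is one reason a router can stay in ROMmon without error messages.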

Router Boots to ROMmon, Error Messages on the Console

At bootup, one of these errors might be seen:

  • device does not contain a valid magic number

  • boot: cannot open "flash:"

  • boot: cannot determine first file name on device "flash:"

  • Error : uncompressed image checksum is incorrect [hex value]

These error messages mean that either the Flash is empty or the filesystem is corrupted.

Copy a valid image onto the Flash in order to resolve this. While you copy, you are prompted to erase the old contents of Flash (if any). Then, reload the router. Refer to ROMmon Recovery for the Cisco 2600 Series Router and the VG200 for instructions on how to copy a valid image into Flash.

You see this error message at bootup:

 %SCC-2-BAD_ID_HW: Failed Identification Test in 0/-1/-1 [1/0]

This error message means that there is a hardware module or interface that is not supported by the chassis or by a Cisco device.

  • Remove all the modules and interfaces and boot the router. Make sure the router boots without any issues.

  • Add modules one by one and reboot the router to isolate the module that affects the router boot.

  • Make sure the module is supported by the router hardware and the software version.

Router Stops Booting After It Receives Error Message

During bootup, the router might display the "pre- and post-compression image sizes disagree" error message, after which booting stops.

Possible causes include:

  • corrupted software image

  • faulty Flash memory

  • faulty DRAM

  • bad memory slot

Copy a new image into Flash to begin troubleshooting this issue. Refer to ROMmon Recovery for the Cisco 2600 Series Router and the VG200 for instructions on how to copy a valid image into Flash.

If the installation of a new image fails to resolve the problem, you can swap out the memory. If you replace the Flash and DRAM, and this fails to resolve the problem, there is a chance that the memory slot on the chassis is faulty. Then, you need to use the TAC Service Request Tool (registered customers only) to create a service request in order to resolve the hardware issue.

Router Is Dropping Packets

Packet loss caused by hardware problems is fairly easy to identify. This section uses the output of the show interfaces command to identify packet loss.

Cyclic Redundancy Check (CRC) and Frame Errors

If CRC errors or frame errors constantly increase on the interface, this usually indicates a hardware problem.

router#show interface ethernet 0/0 
Ethernet0/0 is up, line protocol is up 
... 
121 input errors, 102 CRC, 19 frame, 0 overrun, 0 ignored

An exception to this is when CRC and frame errors are found on channelized interfaces. These can also indicate clocking problems. The fault that causes the errors can be anywhere between two connected interfaces: on cables, intermediate devices, or on the interfaces themselves. The troubleshooting techniques differ slightly for different interface types.
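
In order to check whether CRC or frame errors constantly increase, compare the counters from two captures of the show interfaces output taken some time apart. This is a minimal Python sketch that assumes the counter line has the form shown in the example above. The function names are hypothetical.

```python
import re

# Matches the input error counter line of the show interfaces output.
COUNTERS_RE = re.compile(
    r"(?P<input_errors>\d+) input errors, (?P<crc>\d+) CRC, "
    r"(?P<frame>\d+) frame, (?P<overrun>\d+) overrun, (?P<ignored>\d+) ignored"
)


def parse_input_errors(show_interfaces_output):
    """Return the input error counters as integers, or None if not found."""
    match = COUNTERS_RE.search(show_interfaces_output)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


def counter_deltas(earlier, later):
    """Return how much each counter grew between two captures."""
    return {name: later[name] - earlier[name] for name in earlier}
```

A CRC or frame delta that stays positive across several captures matches the "constantly increase" condition. A counter that is high but no longer grows can be left over from an earlier event.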

Ethernet Interfaces

For Ethernet interfaces, troubleshooting differs between a shared environment (devices connected through a hub or with a coaxial cable) and a switched environment (devices connected to a switch).

In a switched environment, there are only five possible causes of the error:

  • cable

  • local interface (port)

  • remote interface (port)

  • speed

  • duplex mismatch

Consequently, the troubleshooting steps are simple. For example, if a router is connected to a switch, the troubleshooting steps are:

  1. Replace the cable.

  2. If this does not solve the problem, try another port on the switch.

  3. If the problem persists, replace the Ethernet interface.

In a shared environment, the source of the problem is a lot harder to find. Every piece of hardware that makes up the shared segment can be the cause. All components (cables, connectors, and so on) have to be tested one by one.

Ignored Packets

Packets are ignored if there are no free buffers to accept the new packet. This can occur if the router is overloaded with traffic, or if the interface is faulty.

router#show interfaces ethernet 0/0 
Ethernet0/0 is up, line protocol is up 
... 
21 input errors, 0 CRC, 0 frame, 0 overrun, 21 ignored

If ignores are present on all interfaces, then the router is probably overloaded with traffic, or it does not have sufficient free buffers in the pool that match the maximum transmission unit (MTU) on interfaces. In the latter case, an increment of the ignored counter is followed by an increment of the no buffer counter:

router#show interfaces serial 0/0 
... 
1567 packets input, 0 bytes, 22 no buffer 
22 input errors, 0 CRC, 0 frame, 0 overrun, 22 ignored, 0 abort

You might also see an increase in the buffer failures counter in the pool that matches the MTU size:

router#show buffers 
  ... 
   Big buffers, 1524 bytes (total 50, permanent 50): 
   50 in free list (5 min, 150 max allowed) 
   3066 hits, 189 misses, 0 trims, 24 created 
   12 failures (0 no memory)

The number of preconfigured permanent, free, and maximum allowed buffers might not be completely compatible for every environment. Refer to Buffer Tuning for all Cisco Routers for more information about this and how to avoid it.
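
The misses and failures counters can be extracted for each pool from saved show buffers output. This is a minimal Python sketch that assumes each pool follows the layout of the Big buffers example above: a header line, then a hits and misses line, then a failures line. The names POOL_RE and parse_buffer_pools are hypothetical.

```python
import re

# Matches one buffer pool in the layout of the example above.
POOL_RE = re.compile(
    r"(?P<name>\w+) buffers, (?P<size>\d+) bytes "
    r"\(total (?P<total>\d+), permanent (?P<permanent>\d+)\):"
    r".*?(?P<hits>\d+) hits, (?P<misses>\d+) misses"
    r".*?(?P<failures>\d+) failures \((?P<no_memory>\d+) no memory\)",
    re.DOTALL,
)

NUMERIC_FIELDS = ("size", "total", "permanent", "hits", "misses",
                  "failures", "no_memory")


def parse_buffer_pools(show_buffers_output):
    """Return one dictionary of counters per buffer pool found."""
    pools = []
    for match in POOL_RE.finditer(show_buffers_output):
        pool = {"name": match.group("name")}
        for field in NUMERIC_FIELDS:
            pool[field] = int(match.group(field))
        pools.append(pool)
    return pools
```

As with the interface counters, compare two captures. A failures count that grows in the pool that matches the interface MTU is the condition described above.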

If ignores only increase on one interface and are not followed by an increment of the no buffer counter, and the interface is not heavily loaded, then this interface can be faulty. In that case, capture the output of the show tech-support command and contact Cisco Technical Support. The load on the interface can be viewed in the output of the show interfaces command:

router#show interfaces serial 0/0 
... 
reliability 255/255, txload 100/255, rxload 122/255
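
The txload and rxload values are fractions of 255, so it can be easier to read them as percentages and then apply the reasoning above about ignores. This is a minimal Python sketch. The function names and result strings are hypothetical, and what counts as "heavily loaded" is left to the caller because this document does not give a threshold.

```python
import re

# Matches the reliability and load line of the show interfaces output.
LOAD_RE = re.compile(r"reliability (\d+)/255, txload (\d+)/255, rxload (\d+)/255")


def parse_load(show_interfaces_output):
    """Return reliability and load as percentages, or None if not found."""
    match = LOAD_RE.search(show_interfaces_output)
    if match is None:
        return None
    reliability, txload, rxload = (int(group) for group in match.groups())
    return {
        "reliability_pct": round(reliability * 100 / 255, 1),
        "txload_pct": round(txload * 100 / 255, 1),
        "rxload_pct": round(rxload * 100 / 255, 1),
    }


def triage_ignores(ignored_delta, no_buffer_delta, seen_on_all_interfaces,
                   heavily_loaded):
    """Apply the reasoning in this section to counter growth between captures."""
    if ignored_delta <= 0:
        return "no new ignores"
    if seen_on_all_interfaces:
        return "router overloaded or buffer pool too small"
    if no_buffer_delta == 0 and not heavily_loaded:
        return "interface possibly faulty"
    return "inconclusive"
```

For the example line above, txload 100/255 is about 39.2 percent and rxload 122/255 is about 47.8 percent.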

Input and Output Queue Drops

Input queue drops are never caused by hardware problems. Output queue drops can be caused by a hardware problem only if the output queue is constantly full and no packets are being sent out of the interface. Refer to Troubleshooting Input Queue Drops and Output Queue Drops for more information about these kinds of drops.

Router Loses Configuration Due to Faulty or Corrupt NVRAM

The router fails to load a previously saved configuration. One of these error messages is displayed:

System Bootstrap, Version 11.1(8)CA1, EARLY DEPLOYMENT RELEASE SOFTWARE (fc1)
Copyright (c) 1997 by cisco Systems, Inc.
Warning: monitor nvram area is corrupt ... using default values

Warning: NVRAM size is 0

environment checksum in NVRAM failed

Router#show startup-config
%Error opening nvram:/startup-config (Invalid Checksum)

These error messages usually indicate a hardware failure. Issue the test memory command in order to verify. This is an example of the command output:

Router#test memory
Test NVRAM card [y/n] ? y
Failed

The solution is to issue the write erase command and reload the router. If the issue persists, the hardware needs to be replaced.

When a hardware replacement is indicated after you troubleshoot, use one of these options:

  • If you have a hardware support contract directly with Cisco for this part, use the Service Order Submit Tool (registered customers only) to request a replacement part directly.

  • For warranty service, use the TAC Service Request Tool (registered customers only) in order to contact Cisco Technical Support online.

  • If your product is not covered by contract or warranty, contact your Cisco partner or reseller to request a replacement part for the hardware component that causes the issue.

Information to Collect If You Open a TAC Service Request

If you have identified a component that needs to be replaced, contact your Cisco partner or reseller to request a replacement for the hardware component that causes the issue. If you have a support contract directly with Cisco, use the TAC Service Request Tool (registered customers only) to open a TAC Service Request for a hardware replacement. Make sure you attach this information:

  • Console captures that show the error messages

  • Console captures that show the troubleshooting steps taken and the boot sequence during each step

  • The hardware component that failed and the serial number for the chassis

  • Troubleshooting logs

  • Output from the show technical-support command
