Often, you spend valuable time and resources when you replace hardware that actually functions properly. This document helps troubleshoot potential hardware issues with the Cisco 2500 series routers, and can help you identify which component has caused a hardware failure, depending on the type of error that the router experiences.
Note: This document does not cover any software-related failures except for those that are often mistaken as hardware issues.
Readers of this document should have knowledge of these topics:
The information in this document is based on these software and hardware versions:
Cisco 2500 Series Routers
The information presented in this document was created from devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If you work in a live network, ensure that you understand the potential impact of any command before you use it.
For more information on document conventions, refer to the Cisco Technical Tips Conventions.
This section provides the requisite background information about Cisco 2500 series routers.
Whenever you install a new card/module, or Cisco IOS® software image, it is important to verify that the router has enough memory, and that the hardware and software are compatible with the features you want to use.
To check for hardware-software compatibility and memory requirements, complete these steps:
Use the Download Software Area (registered customers only) to check the minimum amount of memory (RAM and Flash) required by the Cisco IOS software, and/or download the Cisco IOS software image. To determine the amount of memory (RAM and Flash) installed on your router, refer to the Memory Requirements section of How to Choose a Cisco IOS Software Release.
If you determine that a Cisco IOS software upgrade is required, follow the Software Installation and Upgrade Procedure for the Cisco 2500 Series Router.
Tip: For information on how to recover a Cisco 2500 Series Router that is stuck in ROMmon (rommon #> prompt), refer to ROMmon Recovery Procedure for Cisco 2500 Series Routers.
To see the variety of routers that this series includes, refer to Cisco 2500 Series Routers.
Error messages that the router displays take this general form:
%XXX-n-YYYY : [text]
Here is an example error message:
Router# %SYS-2-MALLOCFAIL: Memory allocation of [dec] bytes failed from [hex], pool [chars], alignment [dec]
Some error messages are informational only, while others indicate hardware or software failures, and require action. The Error Message Decoder Tool (registered customers only) provides an explanation of the message, a recommended action (if needed), and if available, a link to a document that provides extensive information to troubleshoot that error message.
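When you sift through captured console logs, the message format shown above can be split mechanically. The following is a minimal Python sketch, not a Cisco tool: the function name `parse_ios_message` and the returned field names are assumptions made for illustration, and the severity names follow the standard syslog levels 0 to 7.

```python
import re
from typing import Optional

# Standard syslog severity levels; lower numbers are more severe.
SEVERITY_NAMES = {
    0: "emergency", 1: "alert", 2: "critical", 3: "error",
    4: "warning", 5: "notification", 6: "informational", 7: "debugging",
}

# Matches %FACILITY-SEVERITY-MNEMONIC: text, for example %SYS-2-MALLOCFAIL: ...
MESSAGE_RE = re.compile(
    r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>[0-7])-(?P<mnemonic>[A-Z0-9_]+)"
    r"\s*:\s*(?P<text>.*)"
)


def parse_ios_message(line: str) -> Optional[dict]:
    """Return the parts of an IOS-style message, or None if the line has none."""
    match = MESSAGE_RE.search(line)
    if match is None:
        return None
    severity = int(match.group("severity"))
    return {
        "facility": match.group("facility"),
        "severity": severity,
        "severity_name": SEVERITY_NAMES[severity],
        "mnemonic": match.group("mnemonic"),
        "text": match.group("text").strip(),
    }


if __name__ == "__main__":
    sample = "00:00:07: %SYS-2-MALLOCFAIL: Memory allocation of 4000 bytes failed"
    print(parse_ios_message(sample))
```

Filtering a log for low severity numbers (0 through 2) is one way to surface the messages most likely to require action before you look them up in the Error Message Decoder Tool.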
In order to determine the cause, the first step is to capture as much information about the problem as possible. The information listed here is essential to determine the cause of the problem:
Console logs (for more information, refer to Applying Correct Terminal Emulator Settings for Console Connections).
Syslog information. If the router is set up to send logs to a syslog server, you may be able to obtain information on what happened. For details, refer to the How to Configure Cisco Devices for Syslog section of the Resource Manager Essentials and Syslog Analysis: How-To document.
show technical-support command output. The show technical-support command is a compilation of many different show commands, such as show version, show running-config, and show stacks. It is important to collect the show technical-support information before you reload or power-cycle the router, as these actions can cause all information about the problem to be lost.
Complete bootup sequence, if the router experiences boot errors or fails to boot.
Use these references to troubleshoot serial interfaces:
Use these references to troubleshoot ISDN interfaces:
If a 2500 router does not have enough memory, this can result in boot errors such as:
SYSTEM INIT: INSUFFICIENT MEMORY TO BOOT THE IMAGE!
or other issues, such as %SYS-2-MALLOCFAIL: Memory Allocation Failure errors. Here is an example of this type of message:
00:00:07: %SYS-2-MALLOCFAIL: Memory allocation of 4000 bytes failed from 0x32D05C0, alignment 0
Pool: Processor Free: 0 Cause: Not enough free memory
Alternate Pool: I/O Free: 0 Cause: Not enough free memory
See Upgrading Boot Image with Flash Memory Cards for Cisco 2500 Series Routers for more information.
Cisco 2500 Series Routers sometimes hang, which means that the router does not forward traffic, does not respond on the console, or both, either while it boots or during normal operation. This type of issue is usually software-related. If your router experiences a hang, see Troubleshooting Router Hangs for more information.
When the router reboots, it returns to a normal state. A "normal state" means that the router is functional, and passes traffic, and that you are able to gain access to the router. To check why the router rebooted, issue the show version command, and look at the output (see example below):
Router# show version
Router uptime is 20 weeks, 5 days, 33 minutes
System returned to ROM by power-on
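If you collect show version output from several routers, the reload reason can be pulled out with a small script. This Python sketch is only an illustration: the function name `restart_reason` is hypothetical, and it assumes the reason always appears on a line that begins with "System returned to ROM by" or "System restarted by", as in the examples in this document.

```python
import re
from typing import Optional

# The reason is whatever follows "by" on the matching line.
REASON_RE = re.compile(
    r"^System (?:returned to ROM|restarted) by (?P<reason>.+?)\s*$",
    re.MULTILINE,
)


def restart_reason(show_version: str) -> Optional[str]:
    """Return the reload reason from show version output, or None if absent."""
    match = REASON_RE.search(show_version)
    return match.group("reason") if match else None


if __name__ == "__main__":
    captured = (
        "Router uptime is 20 weeks, 5 days, 33 minutes\n"
        "System returned to ROM by power-on\n"
    )
    print(restart_reason(captured))
```

The returned string still has to be interpreted by a person. A reason of power-on, for instance, only tells you that the crash information, if there was any, has already been lost.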
A "system crash" refers to a situation where the system has detected an unrecoverable error, and has restarted itself. A crash can be caused by software problems, hardware problems, or both. This section deals with hardware-caused crashes, and crashes that are software-related, but may be mistaken for hardware problems.
Important: If you manually reload the router after the crash (for example, through a power-cycle or the reload command), important information about the crash will be lost. Try to collect show technical-support and show log output before you manually reload the router.
See Troubleshooting Router Crashes for more information on this issue.
The system encounters a bus error when the processor tries to access a memory location that either does not exist (a software error), or does not respond properly (a hardware problem). To identify a bus error, look at the output of the show version command provided by the router (if not power-cycled or manually reloaded).
Here is an example of a bus error crash as reported in the show version output:
Router uptime is 2 days, 21 hours, 30 minutes
System restarted by bus error at PC 0x30EE546, address 0xBB4C4
System image file is "flash:igs-j-l.111-24.bin", booted via flash
.........
On the console, you might see this error message during a bus error:
*** System received a Bus Error exception ***
signal= 0xa, code= 0x8, context= 0x608c3a50
PC = 0x60368518, Cause = 0x20, Status Reg = 0x34008002
For more information on this issue, see Troubleshooting Bus Error Crashes.
The "system restarted by unknown reload" cause is generated by an unrecognizable string received through the console. Most likely, an analogue modem connected to the console port has sent an unsolicited data stream to the router which causes the router, in rare instances, to reload. These are usually single-event occurrences, but there are a few common causes:
The configuration register is break-enabled.
The standard configuration register value is 0x2102. Bit 8 (mask 0x0100, the 1 in 0x2102) disables the break when it is set to 1. Set bit 8 to 0 (that is, 0x2002) to enable the break. When the break is enabled, unrecognized strings may reach the console and crash the router, for example if you reboot a PC that is connected to the console port of the router. See Software Configuration Register Bit Meanings for detailed information about the different configuration register values.
A modem attached to the console port could generate noise that is not in a recognizable format.
If this happens, unplug the modem from the router, and move the connection to the auxiliary port.
The router is reloaded, and the break key sequence is entered constantly.
You can check the value of the configuration register at the end of a show version command:
Router#show version
Cisco Internetwork Operating System Software
IOS (tm) 2500 Software (C2500-DS40-L), Version 11.2(9), RELEASE SOFTWARE (fc1)
Copyright (c) 1986-1997 by cisco Systems, Inc.
Compiled Tue 23-Sep-97 08:20 by ckralik
Image text-base: 0x03038474, data-base: 0x00001000
ROM: System Bootstrap, Version 11.0(5), SOFTWARE
BOOTFLASH: 3000 Bootstrap Software (IGS-BOOT-R), Version 11.0(5), RELEASE SOFTWARE (fc1)
doduo uptime is 18 hours, 37 minutes
System restarted by unknown reload cause - ptr to non-ascii bytes 0x4
System image file is "flash:c2500-ds40-l.112-9", booted via flash
cisco 2524 (68030) processor (revision B) with 8192K/2048K bytes of memory.
Processor board ID 02781822, with hardware revision 00000000
Bridging software.
X.25 software, Version 2.0, NET2, BFE and GOSIP compliant.
1 Ethernet/IEEE 802.3 interface(s)
2 Serial network interface(s)
5-in-1 module for Serial Interface 0
5-in-1 module for Serial Interface 1
32K bytes of non-volatile configuration memory.
8192K bytes of processor board System flash (Read ONLY)
Configuration register is 0x2
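The configuration register is a 16-bit value, and the break setting discussed above is bit 8 (mask 0x0100). As a rough illustration, the Python sketch below decodes that bit together with two other commonly referenced fields: the boot field (bits 0 to 3) and the ignore-NVRAM bit (bit 6, mask 0x0040). The function names are hypothetical, and only these three fields are decoded. Consult Software Configuration Register Bit Meanings for the full definition.

```python
import re
from typing import Optional


def decode_config_register(value: int) -> dict:
    """Decode a few well-known fields of the configuration register."""
    return {
        "boot_field": value & 0x000F,            # bits 0-3
        "ignore_nvram": bool(value & 0x0040),    # bit 6: bypass startup config
        "break_disabled": bool(value & 0x0100),  # bit 8: 1 disables the break
    }


def parse_config_register(show_version: str) -> Optional[int]:
    """Extract the register value from show version output, if present."""
    match = re.search(r"Configuration register is (0x[0-9A-Fa-f]+)", show_version)
    return int(match.group(1), 16) if match else None


if __name__ == "__main__":
    for register in (0x2102, 0x2002):
        print(hex(register), decode_config_register(register))
```

With 0x2102 the sketch reports the break as disabled, and with 0x2002 it reports the break as enabled, which matches the two values described in this section.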
Cisco processors have timers that guard against certain types of system hangs. The CPU periodically resets a watchdog timer, which limits how long any single process can run. If a process runs longer than it should and the timer is not reset in time, a trap occurs and the watchdog timer is used to break out of that process.
There are two main types of watchdog timeouts. The first type is usually caused by a software problem and is reported in one or both of these ways:
The show version command output shows:
System returned to ROM by bus error at PC 0x602DADE0, address 0x480811
System returned to ROM by error - a Software forced crash, PC 0x60435894
The console logs show:
%SYS-2-WATCHDOG: Process aborted on watchdog timeout
The second type of watchdog timeout is usually due to a hardware problem, and is reported in one or both of these two ways:
The show version command output shows:
Router uptime is 17 minutes
System returned to ROM by watchdog timer expired
System image file is "flash:c3640-is-mz.122-3.bin"
The console logs show:
System returned to ROM by watchdog timer expired
*** Watch Dog Timeout ***
PC = 0x800001b4, SP = 0x61e19590
The bus error and software-forced crash messages shown for the first type are potential software issues, and need further investigation based on their symptoms. See Troubleshooting Bus Error Crashes or Troubleshooting Software-forced Crashes, depending on which one appears in the show version output. For more information on Watchdog Timeout crashes, see Troubleshooting Watchdog Timeouts.
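The two watchdog signatures described in this section can be told apart by simple string matching on captured logs. The Python sketch below is a hedged illustration of that triage, not a diagnostic tool: the function name `classify_watchdog` and its return labels are made up for this example, and the "software" and "hardware" labels only reflect the usual cause given in this document.

```python
def classify_watchdog(log_text: str) -> str:
    """Label captured output by the watchdog signature it contains."""
    text = log_text.lower()
    # First type: a process was aborted; usually a software problem.
    if "%sys-2-watchdog" in text:
        return "software"
    # Second type: the watchdog timer itself expired; usually hardware.
    if "watchdog timer expired" in text or "watch dog timeout" in text:
        return "hardware"
    return "unknown"


if __name__ == "__main__":
    print(classify_watchdog("%SYS-2-WATCHDOG: Process aborted on watchdog timeout"))
    print(classify_watchdog("System returned to ROM by watchdog timer expired"))
```

The process-level message is checked first because its text also contains the word "watchdog". The order keeps the two cases from being confused.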
In order to troubleshoot a router that does not boot, capture information from the console of the router.
The console output should be logged in a file for later analysis or for the Cisco Technical Assistance Center (TAC) if a TAC service request is opened.
This table lists symptoms and recommended actions to take if you encounter boot problems:
|No LEDs are on after you power on the router||Check whether the power cord is plugged in firmly and power supply is good. If that does not resolve the issue, replace the power cord. If the problem persists, replace the router.|
|LEDs are on after you power on the router, but there is nothing on the console||Verify that the baud rate is set to 9600 bps. If that does not help, verify whether the equipment you use to connect to the console is operating properly. To do so, connect to a known good router to check your console equipment. If the equipment is successfully tested, but the problem remains, replace the router.|
|Router stuck in ROMmon (rommon # > prompt); no error messages on the console||
Set the configuration register to 0x2102, and then reset the router:
rommon 1 > confreg 0x2102
rommon 2 > reset
For information on how to recover a Cisco 2500 Series Router stuck in ROMmon (rommon # > prompt), refer to ROMmon Recovery for the Cisco 2500 Series Routers.|
|Router boots into ROMmon, with these messages on the console:
device does not contain a valid magic number
boot: cannot open "flash:"
boot: cannot determine first file name on device "flash:"
||The Flash is empty or the filesystem is corrupted. Copy a valid image to the Flash. During the copy, you are prompted to erase the old Flash contents (if any exist). Then reload the router. See Software Installation and Upgrade Procedure for instructions on how to copy a valid image onto the Flash.|
|Reload due to a parity error||At the first occurrence, simply monitor the router. At the second occurrence, replace the corresponding hardware as described in the Parity Error Troubleshooting document.|
|Reload due to a software-forced crash||This is almost always a software problem. Upgrade to the latest Cisco IOS software release in your release train.|
|Reload due to a SegV error||SegV errors are always software-related problems. Upgrade to the latest Cisco IOS software release in your release train, or use the Output Interpreter (registered customers only) tool to display potential issues and fixes. You can also see SegV Exceptions for more information on this issue.|
|What Causes a Router To Be Restarted By abort or trace trap?||
If you do not power-cycle or manually reload the router, the show version output displays:
Router uptime is 1 minute
System restarted by abort at PC 0x802737BC
System image file is "flash:c2500-i-mz.120-4.T"
or
Router uptime is 2 minutes
System restarted by trace trap at PC 0x3171310
System image file is "flash:c2500-jos56i-l.120-9.bin"|
|Why Does My Router Lose Its Configuration During Reboot?||In most cases, this is the result of an improperly set configuration register. The configuration register is usually changed during password recovery (typically to 0x2142) so that the router bypasses the startup configuration upon reboot. Often, the configuration register is not returned to its normal setting (0x2102) afterward.|
Packet loss caused by hardware problems is fairly easy to identify. This section uses the output of the show interfaces command to identify packet loss.
If CRC errors or frame errors are constantly on the rise on the interface, this usually indicates a hardware problem.
router#show interfaces ethernet 0
Ethernet0/0 is up, line protocol is up
...
121 input errors, 102 CRC, 19 frame, 0 overrun, 0 ignored
An exception to this is when CRC and frame errors are found on channelized interfaces; these can also indicate clocking problems. The fault that causes the errors can be anywhere between two connected interfaces - on cables, intermediate devices, or on interfaces themselves. Troubleshooting techniques differ slightly for different interface types.
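To judge whether CRC or frame errors are "constantly on the rise", you need at least two samples of the show interfaces output taken some time apart. The Python sketch below compares two such captures. It is an illustration only: the function names are hypothetical, and it assumes the counters appear in the "N input errors, N CRC, N frame" layout shown in the example above and that the counters were not cleared between the two samples.

```python
import re

# Counter line layout as it appears in the show interfaces example.
COUNTERS_RE = re.compile(
    r"(?P<input_errors>\d+) input errors, (?P<crc>\d+) CRC, (?P<frame>\d+) frame"
)


def error_counters(show_interfaces: str) -> dict:
    """Extract the input error, CRC, and frame counters as integers."""
    match = COUNTERS_RE.search(show_interfaces)
    if match is None:
        raise ValueError("no input error counters found")
    return {name: int(value) for name, value in match.groupdict().items()}


def rising_counters(earlier: str, later: str) -> dict:
    """Return the counters that increased between two captures, with deltas."""
    before, after = error_counters(earlier), error_counters(later)
    return {
        name: after[name] - before[name]
        for name in before
        if after[name] > before[name]
    }


if __name__ == "__main__":
    first = "121 input errors, 102 CRC, 19 frame, 0 overrun, 0 ignored"
    second = "150 input errors, 130 CRC, 19 frame, 0 overrun, 0 ignored"
    print(rising_counters(first, second))
```

A counter that is nonzero but no longer increasing usually points to a past event rather than an ongoing fault, which is why the comparison looks at the difference and not the absolute value.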
For Ethernet interfaces, troubleshooting differs between a shared environment (devices connected through a hub or with a coaxial cable) and a switched environment (devices connected to a switch).
In a switched environment, there are only five components that could cause the error:
local interface (port)
remote interface (port)
Consequently, the steps to troubleshoot are simple. For example, if a router is connected to a switch, the steps to troubleshoot are:
Replace the cable.
If this does not solve the problem, try another port on the switch.
If the problem persists, replace the Ethernet interface.
In a shared environment, the source of the problem is a lot harder to find. Every piece of hardware that makes up the shared segment can be the cause. All components (cables, connectors, and so on) have to be tested one by one.
Packets are ignored if there are no free buffers to accept the new packet. This can happen if the router is overloaded with traffic, but can also happen if the interface is faulty.
router# show interfaces ethernet 0
Ethernet0/0 is up, line protocol is up
...
21 input errors, 0 CRC, 0 frame, 0 overrun, 21 ignored
If "ignores" are present on all interfaces, the router is probably overloaded with traffic, or does not have sufficient free buffers in the pool that match the maximum transmission unit (MTU) on interfaces. In the latter case, an increment of the ignored counter is followed by an increment of the no buffer counter:
router# show interfaces serial 0
...
1567 packets input, 0 bytes, 22 no buffer
22 input errors, 0 CRC, 0 frame, 0 overrun, 22 ignored, 0 abort
You may also see an increase in the buffer failures counter in the pool that matches the MTU size:
router# show buffers
...
Big buffers, 1524 bytes (total 50, permanent 50):
50 in free list (5 min, 150 max allowed)
3066 hits, 189 misses, 0 trims, 24 created
12 failures (0 no memory)
The preconfigured numbers of permanent, free, and maximum allowed buffers may not suit every environment. You can read more about this and how to avoid it in Buffer Tuning.
If "ignores" only increase on one interface and are not followed by an increment of the "no buffer" counter, and the interface is not heavily loaded, then this interface could be faulty. In that case, capture the output of the show technical-support command and contact the TAC. The load on the interface can be viewed in the output of the show interfaces command:
router#show interfaces serial 0
...
reliability 255/255, txload 100/255, rxload 122/255
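The reasoning about ignored packets in this section can be summarized as a small decision procedure. The Python sketch below encodes it under stated assumptions: the `InterfaceSample` type, the function name, and the result strings are all hypothetical, the deltas are the increases in each counter between two show interfaces captures, and the `heavy_load` threshold of 128 (out of 255) is an arbitrary placeholder for "heavily loaded" that you would adjust for your network.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InterfaceSample:
    name: str
    ignored_delta: int    # increase in "ignored" between two captures
    no_buffer_delta: int  # increase in "no buffer" between two captures
    rxload: int           # rxload numerator from show interfaces, 0-255


def diagnose_ignores(samples: List[InterfaceSample], heavy_load: int = 128) -> str:
    """Apply the ignored-packet reasoning from this section to the samples."""
    affected = [s for s in samples if s.ignored_delta > 0]
    if not affected:
        return "no ignores"
    # Ignores on every interface point to overload or buffer shortage.
    if len(samples) > 1 and len(affected) == len(samples):
        return "router overloaded or buffers need tuning"
    # Ignores on one lightly loaded interface without "no buffer" increments.
    if len(affected) == 1:
        sample = affected[0]
        if sample.no_buffer_delta == 0 and sample.rxload < heavy_load:
            return f"{sample.name} may be faulty"
    return "inconclusive"


if __name__ == "__main__":
    samples = [
        InterfaceSample("Serial0", ignored_delta=21, no_buffer_delta=0, rxload=20),
        InterfaceSample("Ethernet0", ignored_delta=0, no_buffer_delta=0, rxload=10),
    ]
    print(diagnose_ignores(samples))
```

An "inconclusive" result means that the counters do not match either pattern cleanly. In that case, the show buffers output and the TAC remain the next steps.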
Input queue drops are never caused by hardware problems. Output queue drops may be caused by a hardware problem, only if the output queue is constantly full, and no packets are being sent out of the interface. You can read more about these kinds of drops in Troubleshooting Input Queue Drops and Output Queue Drops.
If you have identified a component that needs to be replaced, contact your Cisco partner or reseller to request a replacement for the hardware component that is causing the issue. If you have a support contract directly with Cisco, use the TAC Service Request Tool (registered customers only) to open a TAC service request, and ask for a hardware replacement. Ensure that you attach the following information: