Cisco 3800 Series Router Hardware Troubleshooting

Document ID: 71450

Updated: May 28, 2007

Introduction

Valuable time and resources are often wasted in the replacement of hardware that actually functions properly. This document helps you troubleshoot potential hardware issues with Cisco 3800 Series Routers and identify which component causes a hardware failure, based on the type of error that the router experiences.

Note: This document does not cover any software-related failures except for those that are often mistaken as hardware issues.

Prerequisites

Requirements

Cisco recommends that you have knowledge of these topics:

Components Used

The information in this document is based on Cisco 3800 Series Routers.

Conventions

Refer to Cisco Technical Tips Conventions for more information on document conventions.

Hardware-Software Compatibility and Memory Requirements

Whenever you install a new card, module, or Cisco IOS® software image, it is important to verify that the router has enough memory, and that the hardware and software are compatible with the features you wish to use.

Perform these recommended steps to check for hardware-software compatibility and memory requirements:

  1. Use the Software Advisor tool (registered customers only) to choose software for your network device.

  2. Use the Download Software Area (registered customers only) to check the minimum amount of memory (RAM and Flash) required by the Cisco IOS software, and to download the Cisco IOS software image. Refer to the Memory Requirements section of How to Choose a Cisco IOS Software Release in order to determine the amount of memory (RAM and Flash) installed on your router.

    Tips:

    • If you want to keep the same features as the version that currently runs on your router, but you do not know which feature set you use, issue the show version command on your router, and paste its output into the Output Interpreter tool (registered customers only) to find out. It is important to check for feature support, especially if you plan to use recent software features.

    • If you need to upgrade the Cisco IOS software image to a new version or feature set, refer to How to Choose a Cisco IOS Software Release for more information.

  3. If you determine that a Cisco IOS software upgrade is required, complete the steps outlined in the Software Upgrade Procedure for the Cisco 3600 Series Router.

    Note: The Cisco IOS software upgrade procedure for the 3600 Series Router also applies to the 3800 Series Router. The Cisco IOS software file names might vary, based on the Cisco IOS software version, feature set, and platform.

Error Messages

The Error Message Decoder tool (registered customers only) allows you to check the meaning of an error message. Error messages appear on the console of Cisco products, usually in this form:

%XXX-n-YYYY : [text]

This is an example of an error message:

Router# %SYS-2-MALLOCFAIL: Memory allocation of [dec] bytes failed from [hex], 
pool [chars], alignment [DEC]

Some error messages are informational only, while others indicate hardware or software failures and require action. The Error Message Decoder tool provides an explanation of the message, a recommended action (if needed), and if available, a link to a document that provides extensive troubleshooting information about that error message.
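
As an illustration of the %XXX-n-YYYY form, this Python sketch splits a console message into its facility, severity level, and mnemonic. It is not a Cisco tool and does not replace the Error Message Decoder. The function name parse_ios_message and the regular expression are assumptions made for this example, and the severity names are the standard syslog level names for 0 through 7.

```python
import re
from typing import NamedTuple, Optional

# Standard syslog severity names for levels 0-7 (0 is the most severe).
SEVERITY_NAMES = [
    "emergency", "alert", "critical", "error",
    "warning", "notification", "informational", "debugging",
]


class IosMessage(NamedTuple):
    facility: str
    severity: int
    severity_name: str
    mnemonic: str
    text: str


# Matches %FACILITY-SEVERITY-MNEMONIC: text anywhere in a console line.
_MESSAGE_RE = re.compile(r"%([A-Z0-9_]+)-([0-7])-([A-Z0-9_]+)\s*:\s*(.*)")


def parse_ios_message(line: str) -> Optional[IosMessage]:
    """Return the parsed message, or None if the line holds no message."""
    match = _MESSAGE_RE.search(line)
    if match is None:
        return None
    facility, severity, mnemonic, text = match.groups()
    level = int(severity)
    return IosMessage(facility, level, SEVERITY_NAMES[level], mnemonic, text.strip())


if __name__ == "__main__":
    sample = "Router# %SYS-2-MALLOCFAIL: Memory allocation of [dec] bytes failed from [hex],"
    print(parse_ios_message(sample))
```

A lower severity number indicates a more serious condition, so a script like this can be used to sort a console capture and review the most severe messages first.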

Troubleshoot Cisco 3800 Series Routers

Your Cisco 3800 Series Integrated Services Router goes through extensive tests and burn-in before it leaves the factory. If you encounter problems, refer to Troubleshooting Cisco 3800 Series Routers to help isolate the problem or eliminate the router as the source of the problem.

This document contains these sections:

Also, refer to Password Recovery Procedure.

Boot Sequence

When a 3800 Series Router is powered on or rebooted, these events occur:

  • The ROM Monitor (in Boot ROM) initializes itself.

  • The ROM Monitor checks the boot field (the lowest four bits) in the configuration register.

    • If the last digit of the boot field is 0, for example 0x100, the system does not boot a Cisco IOS software image and waits for user intervention at the ROM Monitor prompt. From the ROM Monitor mode, you can issue the boot or b command in order to manually boot the system.

    • If the last digit of the boot field is 2 through F, for example 0x102 through 0x10F, the router boots the first valid image specified in the configuration file or specified by the BOOT environment variable. It goes through each boot system command in sequential order until it boots a valid image.

If the router cannot find a valid image, these events occur:

  • If all boot commands in the system configuration file fail, the system attempts to boot the first valid file in Flash memory.

  • If a fully functional system image is not found, the router does not function and stays in ROM Monitor while it waits to be reconfigured through a direct console port connection.

If the router finds a valid image, these events occur:

  • The main Cisco IOS software image is uncompressed into DRAM and loads from there.

  • Cisco IOS software builds the required data structures, such as interface description blocks (IDBs), carves interface buffers out of DRAM, loads the startup configuration, and begins normal operation.

If the router is stuck in ROM Monitor mode, refer to the recovery procedures described in ROMmon Recovery for the Cisco 3800 Series Router.
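
The boot field check described in this section can be sketched in Python. This is an illustration only: the function names boot_field and boot_behavior and the returned labels are invented for this example, and a boot field value of 1 is reported as not covered because this section does not describe it.

```python
def boot_field(confreg: int) -> int:
    """Return the boot field: the lowest four bits of the configuration register."""
    return confreg & 0xF


def boot_behavior(confreg: int) -> str:
    """Map a configuration register value to the behavior described in this section."""
    field = boot_field(confreg)
    if field == 0x0:
        # The system does not boot an image and waits at the ROM Monitor prompt.
        return "rommon"
    if 0x2 <= field <= 0xF:
        # The router tries each boot system command (or the BOOT variable) in order.
        return "boot-system"
    # A boot field of 1 is not described in this section, so no behavior is assumed.
    return "not-covered"


if __name__ == "__main__":
    for value in (0x100, 0x102, 0x10F, 0x2102):
        print(hex(value), boot_field(value), boot_behavior(value))
```

For the common value 0x2102, the boot field is 2, so the router follows the boot system commands in the configuration.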

Modules and Cards

The Cisco 3845 has four slots, and the Cisco 3825 has two slots. Each network module slot accepts a variety of network module interface cards that support a variety of LAN, WAN, and Voice technologies.

NM-1T3/E3 Installation Issues (DS3 Card)

By default, the T3 controller does not appear in the output of the show running-config or show ip interface brief commands. Issue the show version command in order to confirm that the router detects the card.

Router-3845#show version
Cisco Internetwork Operating System Software
IOS (tm) 3800 Software (C3845-IK9S-M), Version 12.3(12b), RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2005 by cisco Systems, Inc.
Compiled Thu 31-Mar-05 18:07 by jfeldhou
Image text-base: 0x60008AF4, data-base: 0x61E20000

ROM: System Bootstrap, Version 12.2(8r)T2, RELEASE SOFTWARE (fc1)
ROM: 3800 Software (C3845-IK9S-M), Version 12.3(12b), RELEASE SOFTWARE (fc2)

D-R4745-9A uptime is 18 minutes
System returned to ROM by reload
System image file is "flash:c3845-ik9s-mz.123-12b.bin"


This product contains cryptographic features and is subject to United
States and local country laws governing import, export, transfer and
use. Delivery of Cisco cryptographic products does not imply
third-party authority to import, export, distribute or use encryption.
Importers, exporters, distributors and users are responsible for
compliance with U.S. and local country laws. By using this product you
agree to comply with applicable laws and regulations. If you are unable
to comply with U.S. and local laws, return this product immediately.

A summary of U.S. laws governing Cisco cryptographic products may be found at:
http://www.cisco.com/wwl/export/crypto/tool/stqrg.html

If you require further assistance please contact us by sending email to
export@cisco.com.

cisco 3845 (R7000) processor (revision 0.0) with 249856K/12288K bytes of memory.
Processor board ID
R7000 CPU at 350MHz, Implementation 39, Rev 3.3, 256KB L2, 2048KB L3 Cache
Bridging software.
X.25 software, Version 3.0.0.
SuperLAT software (copyright 1990 by Meridian Technology Corp).
2 FastEthernet/IEEE 802.3 interface(s)
1 Subrate T3/E3 ports(s)
DRAM configuration is 64 bits wide with parity disabled.
151K bytes of non-volatile configuration memory.
62592K bytes of ATA System CompactFlash (Read/Write)

Configuration register is 0x2102
Router-3845#show ip interface brief
Interface                  IP-Address      OK? Method Status                Protocol
FastEthernet0/0            10.10.50.25     YES NVRAM  up                    up
FastEthernet0/1            unassigned      YES NVRAM  administratively down down

You need to configure the router in order to recognize the card. This is a configuration example. Refer to the hardware installation guide, Configure the Card Type and Controller for T3, for more configuration information.

Router-3845#configure terminal
Router-3845(config)#card type t3 1
Router-3845(config)#end
*Mar  1 00:24:20.031: %LINK-3-UPDOWN: Interface Serial1/0, changed state to down
*Mar  1 00:24:21.031: %LINEPROTO-5-UPDOWN: Line protocol on Interface Serial1/0, changed state to down
Router-3845#show ip interface brief
Interface                  IP-Address      OK? Method Status                Protocol
FastEthernet0/0            10.10.50.25     YES NVRAM  up                    up
FastEthernet0/1            unassigned      YES NVRAM  administratively down down
Serial1/0                  unassigned      YES unset  down                  down

Note: Some of the modules might not be hot swappable. After you install the card into the router, you might not be able to see the module in the show version command output. You need to reload the router in order to recognize the newly installed module.
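
The two show ip interface brief captures in this section can be compared with a short script in order to confirm which interface appears after the card type is configured. This Python sketch is illustrative only: the function names interface_names and new_interfaces are invented for this example, and the row test (a YES or NO value in the third column) is an assumption based on the sample output shown here.

```python
from typing import List


def interface_names(show_ip_interface_brief: str) -> List[str]:
    """Return interface names from show ip interface brief output."""
    names = []
    for line in show_ip_interface_brief.splitlines():
        fields = line.split()
        # Data rows hold name, address, OK?, method, status, and protocol.
        # The header row and blank or wrapped lines fail this test.
        if len(fields) >= 6 and fields[2] in ("YES", "NO"):
            names.append(fields[0])
    return names


def new_interfaces(before: str, after: str) -> List[str]:
    """Return interfaces present in the later capture but not the earlier one."""
    seen = set(interface_names(before))
    return [name for name in interface_names(after) if name not in seen]


if __name__ == "__main__":
    before = """Interface        IP-Address   OK? Method Status                Protocol
FastEthernet0/0  10.10.50.25  YES NVRAM  up                    up
FastEthernet0/1  unassigned   YES NVRAM  administratively down down
"""
    after = before + "Serial1/0        unassigned   YES unset  down                  down\n"
    print(new_interfaces(before, after))
```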

Identify the Issue

This section explains how to determine the cause of the potential hardware issues.

In order to identify the issue, the first step is to capture as much information about the problem as possible. This information is essential to determine the cause of the problem:

  • Console logs—Refer to Applying Correct Terminal Emulator Settings for Console Connections for more information.

  • Syslog information—If the router is set up to send logs to a syslog server, you can obtain information on what occurred. Refer to Resource Manager Essentials and Syslog Analysis: How-To for more information.

  • show technical-support command output—The show technical-support command is a compilation of many different commands which includes the show version, show running-config, and show stacks commands. TAC engineers usually ask for this information to troubleshoot hardware issues. It is important to collect the show technical-support command information before you perform a reload or power-cycle as these actions can cause the loss of all information about the problem.

  • The complete bootup sequence, captured from the console, if the router experiences boot errors.

If you have the output of a show command from your Cisco device, which includes the show technical-support command, you can use the Output Interpreter tool (registered customers only) to display potential issues and fixes. You must be logged in and have JavaScript enabled in order to use this tool.

Router Reboot/Reload

When the router reboots, it returns to a normal state. A normal state means that the router is functional, passes traffic, and you are able to gain access to the router. Issue the show version command and look at the output in order to check why the router rebooted. This is an example:

Router#show version
Router uptime is 20 weeks, 5 days, 33 minutes
System returned to ROM by power-on
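
If you collect show version output from several routers, the restart reason can be extracted with a short script. This Python sketch is illustrative only: the function name restart_reason is invented for this example, and the pattern covers only the two phrasings that appear in this document (System returned to ROM by and System restarted by).

```python
import re
from typing import Optional

# Covers the two phrasings used in the show version samples in this document.
_REASON_RE = re.compile(
    r"^[ \t]*System (?:returned to ROM|restarted) by (.+?)[ \t]*$",
    re.MULTILINE,
)


def restart_reason(show_version_output: str) -> Optional[str]:
    """Return the text after 'by' on the restart line, or None if absent."""
    match = _REASON_RE.search(show_version_output)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = (
        "Router uptime is 20 weeks, 5 days, 33 minutes\n"
        "System returned to ROM by power-on\n"
    )
    print(restart_reason(sample))
```

A reason of power-on or reload points to a power event or a manual restart, while the error reasons covered in the following sections call for further investigation.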

Router Stuck in ROMmon (rommon # > prompt)

Refer to ROMmon Recovery for the Cisco 3600/3700/3800 Series Routers for information on how to recover a Cisco 3800 Series Router stuck in ROMmon (rommon # > prompt).

Router Crashes

A system crash refers to a situation where the system has detected an unrecoverable error and has restarted itself. A crash can be caused by software problems, hardware problems, or both. This section deals with hardware-caused crashes and crashes that are software-related, but might be mistaken for hardware problems.

Caution: If the router is reloaded after the crash, such as through a power-cycle or the reload command, important information about the crash is lost. You need to collect the show technical-support and show log command outputs, as well as the crashinfo file (if possible) before you reload the router.

Refer to Troubleshooting Router Crashes for more information about this issue.

Bus Error Crashes

The system encounters a bus error when the processor tries to access a memory location that either does not exist (a software error) or does not respond properly (a hardware problem). A bus error can be identified through the output of the show version command provided by the router (if not power-cycled or manually reloaded).

This is an example of a bus error crash as reported in the show version command output:

Router uptime is 2 days, 21 hours, 30 minutes
System restarted by bus error at PC 0x30EE546, address 0xBB4C4
System image file is "flash:igs-j-l.111-24.bin", booted via flash 
.........

At the console prompt, this error message might also be seen during a bus error:

*** System received a Bus Error exception *** 
signal= 0xa, code= 0x8, context= 0x608c3a50
PC = 0x60368518, Cause = 0x20, Status Reg = 0x34008002

Refer to Troubleshooting Bus Error Crashes for more information about this issue.
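
The program counter (PC) and the address reported on the bus error line are the values that any further analysis starts from. This Python sketch, offered as an illustration only, extracts both values from show version output. The function name parse_bus_error is invented for this example, and the pattern assumes the line format shown in the sample above.

```python
import re
from typing import Optional, Tuple

# Matches the format "bus error at PC 0x..., address 0x..." shown in this document.
_BUS_ERROR_RE = re.compile(
    r"bus error at PC (0x[0-9A-Fa-f]+), address (0x[0-9A-Fa-f]+)"
)


def parse_bus_error(show_version_output: str) -> Optional[Tuple[int, int]]:
    """Return (pc, address) as integers, or None if no bus error is reported."""
    match = _BUS_ERROR_RE.search(show_version_output)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


if __name__ == "__main__":
    line = "System restarted by bus error at PC 0x30EE546, address 0xBB4C4"
    result = parse_bus_error(line)
    if result is not None:
        pc, address = result
        print(f"PC={pc:#x} address={address:#x}")
```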

Continuous/Boot Loop

The router might get stuck in a continuous loop that can be due to a hardware issue. A continuous loop never lets you gain access to the router. The router continues to scroll error messages until it is powered off. This section provides examples of the error messages seen, and the necessary troubleshooting steps to determine the faulty hardware.

Troubleshooting Flowchart

This is a troubleshooting flowchart for Bus Error Exception, %ERR-1-GT64010, Watchdog Timeout, and OIRINT continuous loops:

[Figure: troubleshooting flowchart (hwts-3800-1.gif)]

Note: If the router does not experience the continuous loop after you complete these troubleshooting steps, then it might have been caused by a mis-seated network module. It is recommended that you monitor the router for 24 hours to make sure that the router continues to function without experiencing this issue again.

Bus Error Exception

This is an example of a bus error exception message:

*** System received a Bus Error exception *** 
signal= 0xa, code= 0xc, context= 0x61c67fc0 
PC = 0x6043904c, Cause = 0x2420, Status Reg = 0x34018002

Refer to Troubleshooting Bus Error Crashes for more information about this issue.

SegV Exception

If you do not power-cycle or manually reload the router, the show version command displays this output:

Router uptime is 2 days, 3 hours, 5 minutes 
System restarted by error - a SegV exception, PC 0x80245F7C 
System image file is "flash:c2600-js-mz.120-9.bin"

This output might also be present in the console logs:

*** System received a SegV exception *** 
signal= 0xb, code= 0x1200, context= 0x80d15094 
PC = 0x80678854, Vector = 0x1200, SP = 0x80fcf170

Refer to SegV Exceptions for more information about this issue.

TLB (Load/Fetch) Exception

The TLB (Load/Fetch) Exception error appears similar to this sample:

*** TLB (Load/Fetch) Exception ***
Access address = 0x1478
PC = 0x1478, Cause = 0x8008, Status Reg = 0x30410002

This error typically repeats indefinitely until interrupted by a user-issued break sequence or by power-cycling the router (after which the error might resume).

Use the procedure outlined in ROMmon Recovery for the Cisco 3600/3700/3800 Series Routers to reload the Cisco IOS software image into Flash.

Use the troubleshooting flowchart from this document to troubleshoot the hardware.

If the problem persists, turn the router off and reseat the DRAM, then power-up the router. If the problem continues to manifest itself, replace the DRAM and power-up the router again.

%ERR-1-GT64010

This is an example of the %ERR-1-GT64010 error message:

%ERR-1-GT64010: Fatal error, PCI Master read
cause=0x0120E483, mask=0x0CD01F00, real_cause=0x00000400 
bus_err_high=0x00000000, bus_err_low=0x04080000, addr_decode_err=0x14000470

Watchdog Timeouts

Cisco processors have timers that guard against certain types of system hangs. The CPU periodically resets a watchdog timer, which limits how long each process can run. If a process runs longer than it should and the timer is not reset, a trap occurs and the watchdog timer is used to escape from that process.

There are two main types of watchdog timeouts. The first type is usually caused by a software problem and is reported in one or both of these ways:

  • The show version command output shows:

    "System returned to ROM by bus error at PC 0x602DADE0, address 0x480811"  
    - or - 
    "System returned to ROM by error - a Software forced crash, PC 0x60435894"

  • The console logs show:

    %SYS-2-WATCHDOG: Process aborted on watchdog timeout

The second type of watchdog timeout is usually due to a hardware problem and is reported in one or both of these ways:

  • The show version command output shows:

    Router uptime is 17 minutes 
    System returned to ROM by watchdog timer expired 
    System image file is "flash:c3640-is-mz.122-3.bin"

  • The console logs show:

    System returned to ROM by watchdog timer expired 
    *** Watch Dog Timeout *** 
    PC = 0x800001b4, SP = 0x61e19590

Both types are potential issues and need further investigation based on their symptoms. For the first type, refer to Troubleshooting Bus Error Crashes or Understanding Software-forced Crashes, based on which message appears in the show version command output. Refer to Troubleshooting Watchdog Timeouts for more information on watchdog timeout crashes.
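
The distinction between the two types of watchdog timeout can be sketched in Python. This is an illustration of the rules in this section, not a diagnostic tool: the function name classify_watchdog and the returned labels are invented for this example, and the check relies only on the marker strings quoted above.

```python
def classify_watchdog(show_version_output: str = "", console_log: str = "") -> str:
    """Classify a watchdog timeout from the marker strings quoted in this section."""
    combined = show_version_output + "\n" + console_log
    # Second type: usually due to a hardware problem.
    if "watchdog timer expired" in combined or "*** Watch Dog Timeout ***" in combined:
        return "hardware-suspected"
    # First type: usually caused by a software problem.
    if "%SYS-2-WATCHDOG" in combined:
        return "software-suspected"
    return "no-watchdog-indication"


if __name__ == "__main__":
    print(classify_watchdog(console_log="%SYS-2-WATCHDOG: Process aborted on watchdog timeout"))
    print(classify_watchdog("System returned to ROM by watchdog timer expired"))
```

The hardware markers are checked first because a console capture that spans several restarts can contain both kinds of message. That ordering is a design choice for this sketch.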

Router Does Not Boot

Information captured from the console of the router is essential to troubleshoot a router that does not boot. The console output should be logged in a file for later analysis or for Cisco Technical Support if a TAC case is opened.

This table lists symptoms and recommended actions to take if you encounter boot problems:

Symptom: No LEDs are on after powering on the router.
Recommended action: Check whether the power cord is plugged in firmly and the power supply is good. If that does not resolve the issue, replace the power cord. If the problem persists, replace the router.

Symptom: LEDs are on after powering on the router, but there is nothing on the console.
Recommended action: Verify that the baud rate is set to 9600 bps. Refer to Applying Correct Terminal Emulator Settings for Console Connections for information on how to use the PC Hyper Terminal to configure and monitor a router. If that does not help, verify that the equipment used to connect to the console operates properly. Connect to a good router in order to check your console equipment. If the equipment tests successfully, but the problem remains, replace the router.

Symptom: Router boots in ROMmon; no error messages on the console.
Recommended action: Set the configuration register to 0x2102 and reload the router:

rommon 1 > confreg 0x2102 
rommon 2 > reset

If the router remains in ROMmon, complete the procedure described in ROMmon Recovery for the Cisco 3600/3700/3800 Series Routers.

Symptom: Router boots into ROMmon with these messages on the console:
  • device does not contain a valid magic number
  • boot: cannot open "flash:"
  • boot: cannot determine first file name on device "flash:"
Recommended action: The Flash is empty or the filesystem is corrupted. Copy a valid image on the Flash. While you copy, you are prompted to erase the old Flash (if one exists). Then, reload the router. Refer to Software Upgrade Procedure for instructions on how to copy a valid image onto the Flash.

Symptom: During bootup, the router displays the error message pre and post compression image sizes disagree after which booting ceases. Possible causes include:
  • corrupted software image
  • faulty Flash memory
  • faulty DRAM
  • bad memory slot
Recommended action: Copy a new image into Flash to begin troubleshooting this issue. Refer to ROMmon Recovery for the Cisco 3600/3700/3800 Series Routers for instructions on how to copy a valid image into Flash. If the installation of a new image fails to resolve the problem, you can swap out the memory. If you replace the Flash and DRAM, and this fails to resolve the problem, there is a chance that the memory slot on the chassis is faulty. Then, you need to use the TAC Service Request Tool (registered customers only) to create a service request in order to resolve the hardware issue.

Router Is Dropping Packets

Packet loss caused by hardware problems is fairly easy to identify. This section uses the output of the show interfaces command to identify packet loss.

Cyclic Redundancy Check (CRC) and Frame Errors

If CRC errors or frame errors constantly increase on the interface, this usually indicates a hardware problem.

router#show interface ethernet 0/0 
   Ethernet0/0 is up, line protocol is up 
   ... 
   121 input errors, 102 CRC, 19 frame, 0 overrun, 0 ignored

An exception to this is when CRC and frame errors are found on channelized interfaces. These can indicate clocking problems as well. The fault that causes the errors can be anywhere between two connected interfaces: on cables, intermediate devices, or on the interfaces themselves. The troubleshooting techniques differ slightly for different interface types.
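
Whether the counters constantly increase can only be judged from two or more captures taken some time apart. This Python sketch compares the input error line from two show interface captures. It is illustrative only: the function names error_counters and counter_increase are invented for this example, and the second capture in the usage example is hypothetical data, not output from a real router.

```python
import re
from typing import Dict

# Matches the input error line format shown in the sample output above.
_ERRORS_RE = re.compile(r"(\d+) input errors, (\d+) CRC, (\d+) frame")


def error_counters(show_interface_output: str) -> Dict[str, int]:
    """Extract the input error, CRC, and frame counters from one capture."""
    match = _ERRORS_RE.search(show_interface_output)
    if match is None:
        raise ValueError("no input error counters found")
    total, crc, frame = (int(value) for value in match.groups())
    return {"input_errors": total, "crc": crc, "frame": frame}


def counter_increase(earlier: str, later: str) -> Dict[str, int]:
    """Return how much each counter grew between two captures."""
    before, after = error_counters(earlier), error_counters(later)
    return {name: after[name] - before[name] for name in before}


if __name__ == "__main__":
    first = "121 input errors, 102 CRC, 19 frame, 0 overrun, 0 ignored"
    # Hypothetical second capture, taken later on the same interface.
    second = "150 input errors, 125 CRC, 25 frame, 0 overrun, 0 ignored"
    print(counter_increase(first, second))
```

If the counters were cleared between the two captures, the differences come out negative, so such a result needs to be discarded rather than interpreted.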

Ethernet Interfaces

For Ethernet interfaces, troubleshooting differs between a shared environment (devices connected through a hub or with a coaxial cable) and a switched environment (devices connected to a switch).

In a switched environment, five components can cause the error:

  • cable

  • local interface (port)

  • remote interface (port)

  • speed mismatch

  • duplex mismatch

Consequently, the troubleshooting steps are simple. For example, if a router is connected to a switch, the troubleshooting steps are:

  1. Replace the cable (make sure you use a straight-through cable).

  2. If this does not solve the problem, try another port on the switch.

  3. If the problem persists, replace the Ethernet interface.

In a shared environment, the source of the problem is a lot harder to find. Every piece of hardware that makes up the shared segment can be the cause. All components (cables, connectors, and so on) have to be tested one by one.

Ignored Packets

router#show interfaces ethernet 0/0 
   Ethernet0/0 is up, line protocol is up 
   ... 
   21 input errors, 0 CRC, 0 frame, 0 overrun, 21 ignored

Packets are ignored if there are no free buffers to accept the new packet. This can occur if the router is overloaded with traffic, but can also occur if the interface is faulty. If ignores are present on all interfaces, then the router is probably overloaded with traffic, or it does not have sufficient free buffers in the pool that match the maximum transmission unit (MTU) on interfaces. In the latter case, an increment of the ignored counter is followed by an increment of the no buffer counter:

router#show interfaces serial 0/0 
   ... 
   1567 packets input, 0 bytes, 22 no buffer 
   22 input errors, 0 CRC, 0 frame, 0 overrun, 22 ignored, 0 abort

You might also see an increase in the buffer failures counter in the pool that matches the MTU size:

router#show buffers 
   ... 
   Big buffers, 1524 bytes (total 50, permanent 50): 
   50 in free list (5 min, 150 max allowed) 
   3066 hits, 189 misses, 0 trims, 24 created 
   12 failures (0 no memory)

The number of preconfigured permanent, free, and maximum allowed buffers might not be completely compatible for every environment. Refer to Buffer Tuning for all Cisco Routers for more information about this and how to avoid it.
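
In order to find the pools that report failures, the show buffers output can be parsed with a short script. This Python sketch is illustrative only: the function names buffer_pools and pools_with_failures are invented for this example, and the pattern assumes the four-line pool layout shown in the sample above.

```python
import re
from typing import Dict, List, Union

# One match per pool: name and size, then hits and misses, then failures.
_POOL_RE = re.compile(
    r"(?P<name>\w+) buffers, (?P<size>\d+) bytes.*?"
    r"(?P<hits>\d+) hits, (?P<misses>\d+) misses.*?"
    r"(?P<failures>\d+) failures \((?P<no_memory>\d+) no memory\)",
    re.DOTALL,
)


def buffer_pools(show_buffers_output: str) -> List[Dict[str, Union[str, int]]]:
    """Return one dictionary of counters per buffer pool found in the output."""
    pools = []
    for match in _POOL_RE.finditer(show_buffers_output):
        pool: Dict[str, Union[str, int]] = {"name": match.group("name")}
        for key in ("size", "hits", "misses", "failures", "no_memory"):
            pool[key] = int(match.group(key))
        pools.append(pool)
    return pools


def pools_with_failures(show_buffers_output: str) -> List[str]:
    """Return the names of pools whose failures counter is greater than zero."""
    return [str(p["name"]) for p in buffer_pools(show_buffers_output) if p["failures"] > 0]


if __name__ == "__main__":
    sample = """Big buffers, 1524 bytes (total 50, permanent 50):
   50 in free list (5 min, 150 max allowed)
   3066 hits, 189 misses, 0 trims, 24 created
   12 failures (0 no memory)"""
    print(pools_with_failures(sample))
```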

If ignores only increase on one interface and are not followed by an increment of the no buffer counter, and the interface is not heavily loaded, then this interface might be faulty. In that case, capture the output of the show tech-support command and contact Cisco Technical Support. The load on the interface can be viewed in the output of the show interfaces command:

router#show interfaces serial 0/0 
   ... 
   reliability 255/255, txload 100/255, rxload 122/255
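
The load values are fractions of 255, which are easier to judge as percentages. This Python sketch converts them and then applies the rule stated above for a possibly faulty interface. It is illustrative only: the function names are invented for this example, the caller must supply the load level that counts as heavily loaded because this document does not quantify it, and the caller must also confirm that ignores increase on one interface only.

```python
import re
from typing import Dict

# Matches the reliability and load line shown in the sample output above.
_LOAD_RE = re.compile(r"reliability (\d+)/255, txload (\d+)/255, rxload (\d+)/255")


def load_percentages(show_interfaces_output: str) -> Dict[str, float]:
    """Convert the x/255 reliability and load values to percentages."""
    match = _LOAD_RE.search(show_interfaces_output)
    if match is None:
        raise ValueError("no reliability/load line found")
    reliability, txload, rxload = (int(value) for value in match.groups())
    return {
        "reliability": round(reliability * 100 / 255, 1),
        "txload": round(txload * 100 / 255, 1),
        "rxload": round(rxload * 100 / 255, 1),
    }


def interface_may_be_faulty(
    ignored_increase: int,
    no_buffer_increase: int,
    rxload_percent: float,
    heavy_load_percent: float,
) -> bool:
    """Apply the rule from this section to counters from a single interface.

    heavy_load_percent is a caller-chosen threshold (an assumption of this sketch).
    """
    return (
        ignored_increase > 0
        and no_buffer_increase == 0
        and rxload_percent < heavy_load_percent
    )


if __name__ == "__main__":
    print(load_percentages("reliability 255/255, txload 100/255, rxload 122/255"))
```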

Input and Output Queue Drops

Input queue drops are never caused by hardware problems. Output queue drops can be caused by a hardware problem only if the output queue is constantly full and no packets are being sent out of the interface. Refer to Troubleshooting Input Queue Drops and Output Queue Drops for more information about these kinds of drops.

Troubleshoot Ethernet Interfaces

Refer to Troubleshooting Ethernet for the procedures to troubleshoot common Ethernet media problems.

Troubleshoot Serial Interfaces

This is a list of references to use in order to troubleshoot serial interfaces:

Troubleshoot ISDN Interfaces

These are references to use in order to troubleshoot ISDN interfaces:

Troubleshoot Router Hangs

A 3800 Series Router might experience a hang, which is when the router boots to a certain point and then no longer accepts any commands or keystrokes. In other words, the console screen stops responding after a certain point. Hangs are not necessarily hardware issues; most of the time, they are caused by software. Refer to Troubleshooting Router Hangs if your router experiences a hang.

Inline Power Issues

The new Cisco EtherSwitch service modules (NME-16ES-1G-P, NME-X-23ES-1G-P, NME-XD-24ES-1S-P, and NME-XD-48ES-2S-P only) can provide both Cisco pre-standard and IEEE 802.3af Power over Ethernet (PoE) support when inserted in Cisco 2800 Series or 3800 Series Integrated Services Routers (requires an upgrade to an AC-IP power supply). 802.3af is the IEEE standard for delivering power to Ethernet ports.

After you add 802.3af EtherSwitch modules, you might not be able to configure PoE. This is because the inline power supply is required to provide PoE capabilities in these routers. The external power supply option cannot be used with the Cisco 2800 or 3800 Series. The internal router power supply should be swapped out for a new power supply with PoE capabilities if PoE is required. Examples of PoE enabled power supplies include PWR-2811-AC-IP=, PWR-2821-51-AC-IP=, PWR-3825-AC-IP=, and PWR-3845-AC-IP=. Refer to Cisco EtherSwitch Network Modules for more information and requirements.

Information to Collect If You Open a TAC Case

If you still need assistance after you complete these troubleshooting steps and want to open a case (registered customers only) with Cisco Technical Support, make sure to include this information:
  • Console captures that show the error messages
  • Console captures that show the troubleshooting steps taken and the boot sequence during each step
  • The hardware component that failed and the serial number for the chassis
  • Troubleshooting logs
  • Output from the show technical-support command
Attach the collected data to your case in non-zipped, plain text format (.txt). You can use the TAC Service Request Tool (registered customers only) in order to upload and attach information to your case. If you cannot access the TAC Service Request Tool, send the information in an email attachment to attach@cisco.com with your case number in the subject line of your message.

Note: Do not manually reload or power-cycle the router before you collect this information unless required for troubleshooting reasons. This can cause the loss of important information that is needed to determine the root cause of the problem.
