Hardware Troubleshooting for the Cisco 3600 Series Router

Document ID: 17963

Updated: Jul 07, 2005

Introduction

Valuable time and resources are often wasted replacing hardware that actually functions properly. This document helps troubleshoot potential hardware issues with Cisco 3600 series routers, and can help you identify which component may be causing a hardware failure, depending on the type of error that the router is experiencing.

Note: This document does not cover any software-related failures except for those that are often mistaken as hardware issues.

Prerequisites

Requirements

Readers of this document should be knowledgeable of the following:

Components Used

The information in this document is based on the following hardware:

  • Cisco 3620, 3640, and 3660 Series Routers

Conventions

For more information on document conventions, see the Cisco Technical Tips Conventions.

Hardware-Software Compatibility and Memory Requirements

Whenever you install a new card, module, or Cisco IOS® software image, it is important to verify that the router has enough memory, and that the hardware and software are compatible with the features you wish to use.

Perform the following recommended steps to check for hardware-software compatibility and memory requirements:

  1. Use the Software Advisor tool (registered customers only) to choose software for your network device.

    Tips:

  2. Use the Download Software Area (registered customers only) to check the minimum amount of memory (RAM and Flash) required by the Cisco IOS software, and/or download the Cisco IOS software image. To determine the amount of memory (RAM and Flash) installed on your router, see How to Choose a Cisco IOS Software Release - Memory Requirements.

    Tips:

    • If you want to keep the same features as the version that is currently running on your router, but don't know which feature set you are using, enter the show version command on your router and paste its output into the Output Interpreter tool (registered customers only) to find out. It is important to check for feature support, especially if you plan to use recent software features.

    • The 2600/3600/3700 Memory Calculator tool (registered customers only) can also help you determine the minimum amount of memory required based on the type of router, module, Cisco IOS software version, and feature set.

    • If you need to upgrade the Cisco IOS software image to a new version or feature set, see How to Choose a Cisco IOS Software Release for more information.

  3. If you determine that a Cisco IOS software upgrade is required, follow the steps outlined in Software Installation and Upgrade Procedure for the Cisco 3600 series router.

    Tips:

    • If your 3600 router does not have a connection to the network or does not have a valid Cisco IOS software image, you may need to use the console port of the router to perform an xmodem software upgrade using ROMmon. This procedure does not require the use of a Trivial File Transfer Protocol (TFTP) server.

      For information on how to recover a Cisco 3600 series router stuck in ROMmon (rommon # > prompt), see ROMmon Recovery for the Cisco 3600 Series Router.

Error Messages

The Error Message Decoder tool (registered customers only) allows you to check the meaning of an error message. Error messages appear on the console of Cisco products, usually in the following form:

%XXX-n-YYYY : [text]

Here is an example of an error message:

Router# %SYS-2-MALLOCFAIL: Memory allocation of [dec] bytes failed from [hex], 
pool [chars], alignment [DEC]  

Some error messages are informational only, while others indicate hardware or software failures and require action. The Error Message Decoder tool (registered customers only) provides an explanation of the message, a recommended action (if needed), and if available, a link to a document that provides extensive troubleshooting information about that error message.
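
As a rough illustration of the %XXX-n-YYYY form, the following Python sketch splits a console message into its parts. The field names (facility, severity, mnemonic) and the function name parse_ios_message are labels chosen for this example and are not part of the Error Message Decoder tool. Lines that do not follow the form return None.

```python
import re
from typing import NamedTuple, Optional

# Matches the "%XXX-n-YYYY: text" form described above.
_MSG_RE = re.compile(r"%([A-Z0-9_]+)-(\d)-([A-Z0-9_]+)\s*:\s*(.*)", re.DOTALL)


class IosMessage(NamedTuple):
    facility: str  # the XXX part
    severity: int  # the n part
    mnemonic: str  # the YYYY part
    text: str      # the free-form description


def parse_ios_message(line: str) -> Optional[IosMessage]:
    """Return the parsed fields, or None if the line is not in %XXX-n-YYYY form."""
    match = _MSG_RE.search(line)
    if match is None:
        return None
    facility, severity, mnemonic, text = match.groups()
    return IosMessage(facility, int(severity), mnemonic, text.strip())


if __name__ == "__main__":
    sample = "Router# %SYS-2-MALLOCFAIL: Memory allocation of [dec] bytes failed from [hex]"
    print(parse_ios_message(sample))
```

A parser like this is only a convenience for sorting captured logs before looking each message up; it does not interpret what a message means.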

Modules and Cards

The Cisco 3660 has six network module slots, the Cisco 3640 has four slots, and the Cisco 3620 has two slots. Each network module slot accepts a variety of network module interface cards, supporting a variety of LAN, WAN, and Voice technologies.

Note: It is very important to check the 2600/3600/3700 Memory Calculator tool (registered customers only) to make sure your router has enough memory for the module/card you are trying to install.

Troubleshooting DSP on the NM-HDV Module

For information on troubleshooting the basic functionality of the Digital Signal Processor (DSP) from a hardware and software perspective, see Troubleshooting the DSP on NM-HDV for Cisco 2600/3600/VG200 Series Routers.

Troubleshooting Modem Modules

Internal Analog Modems

3600 Series Routers support the following internal analog modem components:

Internal Digital Modems

The 3600 Series Router supports the following internal digital modem components:

Digital Modem Network Modules: NM-6DM, NM-12DM, NM-16DM, NM-18DM, NM-24DM, and NM-30DM.

Refer to the following documents for more information:

Identifying the Issue

This section explains what to do to determine the cause of the potential hardware issue(s).

In order to determine the cause, the first step is to capture as much information about the problem as possible. The following information is essential for determining the cause of the problem:

  • Console logs - For more information, see Applying Correct Terminal Emulator Settings for Console Connections.

  • Syslog information - If the router is set up to send logs to a syslog server, you may be able to obtain information on what happened. For details, see How to Configure Cisco Devices for Syslog.

  • show technical-support command output - The show technical-support command is a compilation of many different commands including show version, show running-config, and show stacks. TAC engineers usually ask for this information to troubleshoot hardware issues. It is important to collect the show technical-support information before doing a reload or power-cycle as these actions can cause all information about the problem to be lost.

  • The complete bootup sequence if the router experiences boot errors.

If you have the output of a show command from your Cisco device (including show technical-support), you can use Output Interpreter to display potential issues and fixes. In order to use Output Interpreter, you must be a registered customer, be logged in, and have JavaScript enabled.

Router Reboot/Reload

A reboot or reload means that the router restarts and then returns to a normal state. A "normal state" means that the router is functional, passing traffic, and that you are able to gain access to the router. To check why the router rebooted, issue the show version command and look at the output (see the example below).

Router# show version
Router uptime is 20 weeks, 5 days, 33 minutes
System returned to ROM by power-on
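
If show version output has been saved to a text file, the restart reason can be pulled out with a few lines of Python. This is a minimal sketch that assumes the reason always appears on a line beginning with "System returned to ROM by" or "System restarted by", the two phrasings used in the samples in this document. The function name restart_reason is hypothetical.

```python
import re
from typing import Optional

# Both phrasings appear in the show version samples in this document.
_REASON_RE = re.compile(
    r"^System (?:returned to ROM|restarted) by (.+?)\s*$", re.MULTILINE
)


def restart_reason(show_version_output: str) -> Optional[str]:
    """Return the text after 'by' on the restart line, or None if no such line exists."""
    match = _REASON_RE.search(show_version_output)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = (
        "Router uptime is 20 weeks, 5 days, 33 minutes\n"
        "System returned to ROM by power-on\n"
    )
    print(restart_reason(sample))
```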

Router Stuck in ROMmon (rommon # > prompt)

For information on how to recover a Cisco 3600 series router stuck in ROMmon (rommon # > prompt), see ROMmon Recovery for the Cisco 3600 Series Router.

Router Crashes

A "system crash" refers to a situation where the system has detected an unrecoverable error and has restarted itself. A crash can be caused by software problems, hardware problems, or both. This section deals with hardware-caused crashes and with crashes that are software-related but may be mistaken for hardware problems.

Caution: If the router is reloaded after the crash (for example, through a power-cycle or the reload command), important information about the crash will be lost. Try to collect the show technical-support and show log output, as well as the crashinfo file (if possible), before reloading the router.

See Troubleshooting Router Crashes for more information regarding this issue.

Bus Error Crashes

The system encounters a bus error when the processor tries to access a memory location that either does not exist (a software error) or does not respond properly (a hardware problem). A bus error can be identified by looking at the output of the show version command provided by the router (if not power-cycled or manually reloaded).

Here are two examples of bus error crashes:

Router uptime is 2 days, 21 hours, 30 minutes
System restarted by bus error at PC 0x30EE546, address 0xBB4C4
System image file is "flash:igs-j-l.111-24.bin", booted via flash 
.........

At the console prompt, the following error message might also be seen during a bus error:

*** System received a Bus Error exception *** 
signal= 0xa, code= 0x8, context= 0x608c3a50
PC = 0x60368518, Cause = 0x20, Status Reg = 0x34008002

For more information regarding this issue, see Troubleshooting Bus Error Crashes.

Continuous/Boot Loop

The router may get stuck in a continuous loop that may be due to a hardware issue. A continuous loop never lets you gain access to the router. The router continues to scroll error messages until it is powered off. Below are examples of the error messages seen, and the necessary troubleshooting steps to determine the faulty hardware.

Identifying a Continuous/Boot Loop Due to a Wrong Iomem Size

The following symptoms might be observed on the console during the boot sequence:

Not enough memory in the system for IO memory
IO memory available 4110105 required 5242880 _> 2600

and/or

SYSTEM INIT: INSUFFICIENT MEMORY TO BOOT THE IMAGE!

and/or

Not enough memory in the system to run this image 
Required pmem/iomem: 39435385/524288
*** System received a Software forced crash *** 
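
To see how far short the router is, the two numbers in the "IO memory available ... required ..." line can be compared directly. The sketch below is only an illustration: the function name iomem_shortfall is made up, and the values are reported in whatever unit the router printed (presumably bytes).

```python
import re
from typing import Optional, Tuple

_IOMEM_RE = re.compile(r"IO memory available (\d+) required (\d+)")


def iomem_shortfall(console_text: str) -> Optional[Tuple[int, int, int]]:
    """Return (available, required, shortfall), or None if the line is absent."""
    match = _IOMEM_RE.search(console_text)
    if match is None:
        return None
    available = int(match.group(1))
    required = int(match.group(2))
    # A shortfall of 0 means the available I/O memory already covers the requirement.
    return available, required, max(0, required - available)


if __name__ == "__main__":
    log = (
        "Not enough memory in the system for IO memory\n"
        "IO memory available 4110105 required 5242880\n"
    )
    print(iomem_shortfall(log))
```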

Recovery Procedure

Step A

Use the 2600/3600/3700 Memory Calculator (registered customers only) to review the Input/Output (I/O) and processor memory requirements for your hardware configuration.

  • If your router doesn't have enough DRAM memory, go to Step D.

  • If your router has enough DRAM memory, go to Step B.

Step B

  1. Turn off the router.

  2. Remove all the Network Modules (NMs) and WAN Interface Cards (WICs) from the router.

  3. Turn on the router.

  4. If the router does not come up after removing the NM and WIC, go to Step C.

    If it comes up fine, change the iomem configuration percentage as calculated by the 2600/3600/3700 Memory Calculator (registered customers only):

    Router#configure terminal
    Router(config)#memory-size iomem 10
    
    !--- The command above adjusts the percentage of DRAM to use for I/O memory.
    
    Router(config)#exit
    Router#copy running-config startup-config
    Destination filename [startup-config]?
    Building 
    [OK]
    Router# 
    
  5. Turn off the router and reseat both the NM and WIC.

  6. Turn on the router. It should boot up fine with the NM and WIC cards.
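
The effect of a memory-size iomem percentage can be estimated with simple arithmetic before changing the configuration. The sketch below assumes the percentage is applied to the total DRAM and rounded down; the router's own rounding may differ, so treat the result as an estimate and rely on the Memory Calculator for the actual value. The function names and the 64 MB DRAM size in the demo are hypothetical; the required values come from the "Required pmem/iomem" sample earlier in this section.

```python
from typing import Tuple


def split_dram(dram_bytes: int, iomem_percent: int) -> Tuple[int, int]:
    """Estimate (io_bytes, processor_bytes) for a given iomem percentage."""
    if not 0 < iomem_percent < 100:
        raise ValueError("iomem_percent must be between 1 and 99")
    io_bytes = dram_bytes * iomem_percent // 100  # assumption: rounded down
    return io_bytes, dram_bytes - io_bytes


def meets_requirement(
    dram_bytes: int, iomem_percent: int, required_pmem: int, required_iomem: int
) -> bool:
    """True if both the processor and I/O memory estimates cover the image's needs."""
    io_bytes, processor_bytes = split_dram(dram_bytes, iomem_percent)
    return io_bytes >= required_iomem and processor_bytes >= required_pmem


if __name__ == "__main__":
    dram = 64 * 1024 * 1024  # hypothetical router with 64 MB of DRAM
    print(split_dram(dram, 10))
    print(meets_requirement(dram, 10, required_pmem=39435385, required_iomem=524288))
```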

Step C

  1. Attach a terminal or PC with terminal emulation to the console port of the router. Use the following terminal settings:

    9600 baud rate
    No parity
    8 data bits
    1 stop bit
    No flow control
    

    See Applying Correct Terminal Emulator Settings for Console Connections for information on how to use the PC Hyper Terminal to configure and monitor a router.

    The required console cable specifications are described in Cabling Guide for RJ-45 Console and AUX Ports.

    Using the power switch, turn off the router and then turn it back on.

  2. Press Break on the terminal keyboard within 60 seconds of the power-up to put the router into ROMmon (diagnostic test mode).

    Tip: If the break sequence doesn't work, see Possible Key Combinations for Break Sequence During Password Recovery for other key combinations.

  3. Type confreg 0x2142 at the rommon 1> prompt to boot from Flash without loading the configuration.

  4. Type reset at the rommon 2> prompt. The router reboots, but ignores its saved configuration. Thus, it ignores the iomem command and uses the default value.

  5. Type no after each setup question, or press Ctrl-C to skip the initial setup procedure.

  6. Type enable at the Router> prompt. This puts you in enable mode where you will see the Router# prompt.

  7. Caution: Do not type configure terminal at this point.

    Type copy startup-config running-config to copy the configuration stored in non-volatile RAM (NVRAM) into memory.

  8. Type show running-config. The show running-config command shows the configuration of the router. In this configuration, you can see the "memory-size iomem" line.

  9. Type configure terminal and remove the memory-size iomem command, or change it to the correct one (calculated with the 2600/3600/3700 Memory Calculator (registered customers only)).

    The prompt is now Router(config)#.

  10. Type config-register 0x2102.

  11. Press Ctrl-z or type end to leave the configuration mode.

    The prompt is now Router#.

  12. Type copy running-config startup-config to commit the changes.

  13. Turn off the router, reseat the NM and WIC, and turn on the router again. It should boot up fine.
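
The two configuration register values used in Step C differ by a single bit. The sketch below decodes them, relying on the standard meaning of bit 6 (0x0040, ignore the NVRAM configuration at boot) and of the low four bits (the boot field). The function name describe_confreg and the dictionary keys are made up for this illustration.

```python
from typing import Dict, Union

IGNORE_NVRAM_BIT = 0x0040  # bit 6: ignore the saved configuration at boot
BOOT_FIELD_MASK = 0x000F   # bits 0-3: boot field


def describe_confreg(value: int) -> Dict[str, Union[bool, int]]:
    """Decode the two parts of the register that matter for this procedure."""
    return {
        "ignore_startup_config": bool(value & IGNORE_NVRAM_BIT),
        "boot_field": value & BOOT_FIELD_MASK,
    }


if __name__ == "__main__":
    for register in (0x2142, 0x2102):
        print(hex(register), describe_confreg(register))
```

This is why the router ignores the memory-size iomem line after confreg 0x2142, and why step 10 sets the register back to 0x2102 so that the corrected configuration is loaded on the next boot.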

Step D

If your router does not have enough DRAM memory, you can:

  • upgrade the DRAM memory of your router, or

  • load a Cisco IOS software image in Flash which requires less I/O and processor memory

See the Hardware-Software Compatibility and Memory Requirements section if you wish to load a new Cisco IOS software image.

Troubleshooting Flowchart

Below is a troubleshooting flowchart for Bus Error Exception, %ERR-1-GT64010, Watchdog Timeout, and OIRINT continuous loops:

[Flowchart: hwts_3600_17963.gif]

** If the router no longer experiences the continuous loop after you follow the troubleshooting steps above, the loop may have been caused by a mis-seated network module. Monitor the router for 24 hours to be sure that it continues to function without the issue recurring.

Bus Error Exception

Here is an example of a bus error exception message:

*** System received a Bus Error exception *** 
signal= 0xa, code= 0xc, context= 0x61c67fc0 
PC = 0x6043904c, Cause = 0x2420, Status Reg = 0x34018002

See Troubleshooting Bus Error Exceptions for more information regarding this issue.
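
When several exception dumps have to be compared, it can help to turn the "name = value" pairs into a dictionary. The following is a minimal sketch that assumes every field is printed as a name, an equals sign, and a hexadecimal value, as in the samples in this document. The function name parse_exception_dump is hypothetical, and the code does not interpret the register values.

```python
import re
from typing import Dict

# A field name (letters and spaces), "=", then a hexadecimal value.
_FIELD_RE = re.compile(r"([A-Za-z][A-Za-z ]*?)\s*=\s*(0x[0-9A-Fa-f]+)")


def parse_exception_dump(text: str) -> Dict[str, int]:
    """Map each field name in the dump to its value as an integer."""
    return {name.strip(): int(value, 16) for name, value in _FIELD_RE.findall(text)}


if __name__ == "__main__":
    dump = (
        "*** System received a Bus Error exception ***\n"
        "signal= 0xa, code= 0xc, context= 0x61c67fc0\n"
        "PC = 0x6043904c, Cause = 0x2420, Status Reg = 0x34018002\n"
    )
    for field, value in parse_exception_dump(dump).items():
        print(field, hex(value))
```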

SegV Exception

If you don't power-cycle or manually reload the router, the show version output displays the following:

Router uptime is 2 days, 3 hours, 5 minutes 
System restarted by error - a SegV exception, PC 0x80245F7C 
System image file is "flash:c2600-js-mz.120-9.bin" 

The following lines may also be present in the console logs:

 *** System received a SegV exception *** 
signal= 0xb, code= 0x1200, context= 0x80d15094 
PC = 0x80678854, Vector = 0x1200, SP = 0x80fcf170 

See SegV Exceptions for more information regarding this issue.

TLB (Load/Fetch) Exception

The TLB (Load/Fetch) Exception error will appear similar to this sample:

*** TLB (Load/Fetch) Exception ***
Access address = 0x1478
PC = 0x1478, Cause = 0x8008, Status Reg = 0x30410002

This error typically repeats indefinitely until interrupted by a user-issued break sequence or by power-cycling the router (after which the error may resume).

Begin troubleshooting by re-loading the Cisco IOS software image into Flash using the procedure outlined in ROMmon Recovery for the Cisco 3600 Series Router.

Troubleshoot the hardware using the flowchart above.

If the problem persists, turn the router off, reseat the DRAM, and then power up the router. If the problem continues, try replacing the DRAM and power up the router again.

%ERR-1-GT64010

Here is an example of the %ERR-1-GT64010 error message:

%ERR-1-GT64010: Fatal error, PCI Master read 
cause=0x0120E483, mask=0x0CD01F00, real_cause=0x00000400 
bus_err_high=0x00000000, bus_err_low=0x04080000, addr_decode_err=0x14000470

Watchdog Timeouts

Cisco processors have timers that guard against certain types of system hangs. The CPU periodically resets a watchdog timer, which limits how long any one process can run. If a process runs longer than it should and the timer is not reset in time, a trap occurs and the watchdog timer is used to escape from that process.

There are two main types of watchdog timeouts. The first type is usually caused by a software problem and is reported in one or both of these ways:

  • The show version command output shows:

    "System returned to ROM by bus error at PC 0x602DADE0, address 0x480811"  - or - 
    "System returned to ROM by error - a Software forced crash, PC 0x60435894" 
  • The console logs show:

    %SYS-2-WATCHDOG: Process aborted on watchdog timeout

The second type of watchdog timeout is usually due to a hardware problem and is reported in one or both of these two ways:

  • The show version command output shows:

    Router uptime is 17 minutes 
    System returned to ROM by watchdog timer expired 
    System image file is "flash:c3640-is-mz.122-3.bin"
  • The console logs show:

    System returned to ROM by watchdog timer expired 
    *** Watch Dog Timeout *** 
    PC = 0x800001b4, SP = 0x61e19590

Both of these are potential issues and need further investigation based on their symptoms. See Troubleshooting Bus Error Crashes or Troubleshooting Software-forced Crashes depending on which one appears in the show version output. For more information on Watchdog Timeout crashes, see Troubleshooting Watchdog Timeouts.
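
The mapping from the show version restart line to the follow-up document can be written down as a small lookup. The sketch below covers only the restart reasons quoted in this document and uses plain substring matching, so it is an illustration of the decision described above, not a complete classifier. The function name suggest_document is made up.

```python
from typing import Optional

# (substring to look for, document named in this tech note), checked in order.
_FOLLOW_UP = [
    ("watchdog timer expired", "Troubleshooting Watchdog Timeouts"),
    ("bus error", "Troubleshooting Bus Error Crashes"),
    ("software forced crash", "Troubleshooting Software-forced Crashes"),
    ("segv exception", "SegV Exceptions"),
]


def suggest_document(restart_line: str) -> Optional[str]:
    """Return the follow-up document for a restart line, or None if none applies."""
    lowered = restart_line.lower()
    for needle, document in _FOLLOW_UP:
        if needle in lowered:
            return document
    return None


if __name__ == "__main__":
    line = "System returned to ROM by error - a Software forced crash, PC 0x60435894"
    print(suggest_document(line))
```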

%OIRINT

%OIRINT: OIR Event has occurred oir_ctrl 50 oir_stat 4F4C

If this error message appears in the router log, it may indicate a hardware issue with a power supply connector, a mis-seated network module, a bad network module, or a bad chassis slot. See Troubleshooting OIR Events on 3600 Series Routers for details.

Router Does Not Boot

Capturing information from the console of the router is essential for troubleshooting a router that does not boot. The console output should be logged in a file for later analysis or for the Cisco Technical Assistance Center (TAC) if a TAC case is opened.

The following table lists symptoms and recommended actions to take if you are encountering boot problems:

Symptom: No LEDs are on after powering on the router.
Recommended action: Check whether the power cord is plugged in firmly and the power supply is good. If that does not resolve the issue, replace the power cord. If the problem persists, replace the router.

Symptom: LEDs are on after powering on the router, but there is nothing on the console.
Recommended action: Verify that the baud rate is set to 9600 bps. See Applying Correct Terminal Emulator Settings for Console Connections for information on how to use the PC Hyper Terminal to configure and monitor a router. If that doesn't help, verify that the equipment used for connecting to the console is operating properly. You can do this by connecting to a known good router to check your console equipment. If the equipment is successfully tested, but the problem remains, replace the router.

Symptom: Router boots in ROMmon; no error messages on the console.
Recommended action: Set the configuration register to 0x2102 and reload the router:

    rommon 1 > confreg 0x2102
    rommon 2 > reset

If the router remains in ROMmon, follow the procedure described in ROMmon Recovery for the 3600 series router.

Symptom: Memory problems can cause the router to have boot problems.
Recommended action: For more information, see the Troubleshooting Memory Problems section in this document.

Symptom: Router boots into ROMmon, with the following messages on the console:

  • device does not contain a valid magic number

  • boot: cannot open "flash:"

  • boot: cannot determine first file name on device "flash:"

Recommended action: The Flash is empty or the filesystem is corrupted. Copy a valid image onto the Flash; while copying, you will be prompted to erase the old Flash (if one exists). Then reload the router. See Software Installation and Upgrade Procedure for instructions on how to copy a valid image onto the Flash.

Symptom: During bootup, the router may display the error message "pre and post compression image sizes disagree" after which booting ceases. Possible causes include:

  • corrupted software image

  • faulty Flash memory

  • faulty DRAM

  • bad memory slot

Recommended action: Begin troubleshooting this issue by copying a new image into Flash. See ROMmon Recovery for the Cisco 3600 Series Router for instructions on how to copy a valid image into Flash. If installing a new image fails to resolve the problem, you can try swapping out the memory. If replacing the Flash and DRAM fails to resolve the problem, there is a chance that the memory slot on the chassis is faulty; you will need to open a TAC case (registered customers only) to resolve the hardware issue.

Router Is Dropping Packets

Packet loss caused by hardware problems is fairly easy to identify. The following section uses the output of the show interfaces command to identify packet loss.

Cyclic Redundancy Check (CRC) and Frame Errors

If CRC errors or frame errors are constantly increasing on the interface, this usually indicates a hardware problem.

router#show interface ethernet 0/0 
   Ethernet0/0 is up, line protocol is up 
   ... 
   121 input errors, 102 CRC, 19 frame, 0 overrun, 0 ignored 

An exception to this is when CRC and frame errors are found on channelized interfaces; they can indicate clocking problems as well. The fault that is causing the errors can be anywhere between two connected interfaces - on cables, intermediate devices, or on interfaces themselves. Troubleshooting techniques differ slightly for different interface types.
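
Whether the counters are "constantly increasing" can only be judged by comparing two captures taken some time apart. The sketch below extracts the counters from the input errors line and reports which ones grew. The function names are hypothetical, and the second snapshot in the demo is invented for illustration.

```python
import re
from typing import Dict, List, Sequence

_COUNTER_RE = re.compile(
    r"(\d+) (input errors|CRC|frame|overrun|ignored|abort|no buffer)"
)


def input_error_counters(show_interfaces_output: str) -> Dict[str, int]:
    """Map each counter name found in the output to its current value."""
    return {
        name: int(count)
        for count, name in _COUNTER_RE.findall(show_interfaces_output)
    }


def increasing(
    before: Dict[str, int],
    after: Dict[str, int],
    names: Sequence[str] = ("CRC", "frame"),
) -> List[str]:
    """Return the counters in names whose value grew between the two snapshots."""
    return [name for name in names if after.get(name, 0) > before.get(name, 0)]


if __name__ == "__main__":
    first = input_error_counters(
        "121 input errors, 102 CRC, 19 frame, 0 overrun, 0 ignored"
    )
    # Hypothetical second capture taken later on the same interface.
    second = input_error_counters(
        "130 input errors, 111 CRC, 19 frame, 0 overrun, 0 ignored"
    )
    print(increasing(first, second))
```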

Ethernet Interfaces

For Ethernet interfaces, troubleshooting differs between a shared environment (devices connected through a hub or with a coaxial cable) and a switched environment (devices connected to a switch).

In a switched environment, five components could cause the error:

  • cable

  • local interface (port)

  • remote interface (port)

  • speed

  • duplex mismatch

Consequently, the troubleshooting steps are simple. For example, if a router is connected to a switch, the troubleshooting steps would be:

  1. Replace the cable (make sure you use a straight through cable).

  2. If this does not solve the problem, try another port on the switch.

  3. If the problem persists, replace the Ethernet interface.

In a shared environment, the source of the problem is a lot harder to find. Every piece of hardware that makes up the shared segment can be the cause. All components (cables, connectors, and so on) have to be tested one by one.

Ignored Packets

router#show interfaces ethernet 0/0 
   Ethernet0/0 is up, line protocol is up 
   ... 
   21 input errors, 0 CRC, 0 frame, 0 overrun, 21 ignored

Packets are ignored if there are no free buffers to accept the new packet. This can happen if the router is overloaded with traffic, but can also happen if the interface is faulty. If "ignores" are present on all interfaces, then the router is probably overloaded with traffic, or doesn't have sufficient free buffers in the pool that match the maximum transmission unit (MTU) on interfaces. In the latter case, an increment of the ignored counter is followed by an increment of the no buffer counter:

router#show interfaces serial 0/0 
   ... 
   1567 packets input, 0 bytes, 22 no buffer 
   22 input errors, 0 CRC, 0 frame, 0 overrun, 22 ignored, 0 abort

You may also see an increase in the buffer failures counter in the pool that matches the MTU size:

router#show buffers 
   ... 
   Big buffers, 1524 bytes (total 50, permanent 50): 
   50 in free list (5 min, 150 max allowed) 
   3066 hits, 189 misses, 0 trims, 24 created 
   12 failures (0 no memory)

The preconfigured numbers of permanent, free, and maximum allowed buffers may not suit every environment. You can read more about this and how to avoid it in Buffer Tuning.

If "ignores" are only increasing on one interface and are not followed by an increment of the no buffer counter, and the interface is not heavily loaded, then this interface could be faulty. In that case, capture the output of the show tech-support command and contact the TAC. The load on the interface can be viewed in the output of the show interfaces command:

router#show interfaces serial 0/0 
... 
   reliability 255/255, txload 100/255, rxload 122/255 
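
The reasoning above (ignores rising, no buffer counter not rising, interface not heavily loaded) can be expressed as a simple check. This is a sketch under stated assumptions: the function names are hypothetical, and the 0.5 load threshold is an arbitrary placeholder, since this document does not define what counts as "heavily loaded".

```python
import re
from typing import Dict

_LOAD_RE = re.compile(r"(txload|rxload) (\d+)/(\d+)")


def load_fractions(show_interfaces_output: str) -> Dict[str, float]:
    """Convert values such as 'rxload 122/255' into fractions between 0 and 1."""
    return {
        name: int(numerator) / int(denominator)
        for name, numerator, denominator in _LOAD_RE.findall(show_interfaces_output)
    }


def interface_suspect(
    ignored_delta: int,
    no_buffer_delta: int,
    rx_load: float,
    busy_threshold: float = 0.5,  # assumption: not a value from this document
) -> bool:
    """True if ignores grew without matching no-buffer growth on a lightly loaded port."""
    return ignored_delta > 0 and no_buffer_delta == 0 and rx_load < busy_threshold


if __name__ == "__main__":
    loads = load_fractions("reliability 255/255, txload 100/255, rxload 122/255")
    print(loads)
    print(interface_suspect(ignored_delta=5, no_buffer_delta=0, rx_load=loads["rxload"]))
```

A True result is only a hint that the interface deserves a closer look; the show tech-support output is still what the TAC needs to confirm a fault.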

Input and Output Queue Drops

Input queue drops are never caused by hardware problems. Output queue drops may be caused by a hardware problem only if the output queue is constantly full and no packets are being sent out of the interface. You can read more about these kinds of drops in Troubleshooting Input Queue Drops and Output Queue Drops.

Troubleshooting Memory Problems

For information on memory specifications and requirements, see the Cisco 3600 Series Memory Options and Configuration Guide.

Troubleshooting Ethernet Interfaces

Troubleshooting Ethernet Interfaces provides troubleshooting procedures for common Ethernet media problems.

Troubleshooting Serial Interfaces

Here is a list of references to use for troubleshooting serial interfaces:

Troubleshooting ISDN Interfaces

Here are some references to use for troubleshooting ISDN interfaces:

Troubleshooting Router Hangs

A 3600 series router may experience a router hang. A hang occurs when the router boots to a certain point and then no longer accepts any commands or keystrokes; in other words, the console screen stops responding after a certain point. Hangs are not necessarily hardware issues; most of the time, they are caused by software. If your router is experiencing a router hang, see Troubleshooting Router Hangs.

Information to Collect if You Open a TAC Case

If you still need assistance after following the troubleshooting steps above and want to open a case (registered customers only) with the Cisco TAC, be sure to include the following information:

  • Console captures showing the error messages

  • Console captures showing the troubleshooting steps taken and the boot sequence during each step

  • The hardware component that failed and the serial number for the chassis

  • Troubleshooting logs

  • Output from the show technical-support command

Please attach the collected data to your case in non-zipped, plain text format (.txt). You can attach information to your case by uploading it using the Case Query tool (registered customers only). If you cannot access the Case Query tool, you can send the information in an email attachment to attach@cisco.com with your case number in the subject line of your message.

Note: Do not manually reload or power-cycle the router before collecting the above information unless troubleshooting requires it, because doing so can destroy information that is needed to determine the root cause of the problem.
