Cisco 3700 Series Multiservice Access Routers

Cisco 3700 Series Router Hardware Troubleshooting

Document ID: 71657

Updated: Mar 04, 2008

Introduction

This document helps you to troubleshoot potential hardware issues with Cisco 3700 Series Routers. This document also provides information to help you identify which component causes a hardware failure.

Prerequisites

Requirements

Cisco recommends that you have knowledge of these topics:

Components Used

The information in this document is based on Cisco 3700 Series Routers.

The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, make sure that you understand the potential impact of any command.

Conventions

Refer to Cisco Technical Tips Conventions for more information on document conventions.

Background Information

These are the common router issues:

  • Router does not boot.

  • Router crashes.

  • Router keeps rebooting.

  • Router is stuck in ROMmon mode.

  • Router is unable to recognize the newly inserted card.

The possible causes of these router issues, and their solutions, are:

  • Installation issue: Reseat the module in order to fix most of these issues. Read the Hardware Installation Guide before you install the hardware. Some modules need configuration after you install them on the router.

  • Configuration issue: Some issues are mistaken for hardware issues. Configure the router properly in order to solve those issues.

  • Software issue: In this case, the router needs an upgrade of Cisco IOS® software.

  • Hardware issue: In this case, the corresponding hardware needs to be replaced.

Troubleshooting

These troubleshooting sections from the Cisco 3700 hardware installation guide are useful.

Troubleshoot Router Boot Issues

General Troubleshooting Steps

Information captured from the console of the router is essential in order to troubleshoot a router that does not boot. The console output should be logged in a file for later analysis or for Cisco Technical Support if a TAC case is opened.

This table lists symptoms and recommended actions to take if you encounter boot problems:

Symptom: No LEDs are on after you power on the router.
Recommended Action: Check whether the power cord is plugged in firmly and the power supply is good. If that does not resolve the issue, replace the power cord. If the problem persists, call Cisco Technical Support for further troubleshooting.

Symptom: LEDs are on after you power on the router, but there is nothing on the console.
Recommended Action: Verify that the baud rate is set to 9600 bps. Refer to Applying Correct Terminal Emulator Settings for Console Connections for information on how to use the PC Hyper Terminal in order to configure and monitor a router. If that does not help, verify that the equipment used in order to connect to the console operates properly. Connect to a known good router in order to check your console equipment. If the equipment is successfully tested, but the problem remains, call Cisco Technical Support for further troubleshooting.

Symptom: Router boots in ROMmon; no error messages on the console.
Recommended Action: Set the configuration register to 0x2102 and reload the router:
rommon 1 >confreg 0x2102
rommon 2 >reset
Complete the steps described in ROMmon Recovery for the Cisco 3600 and 3700 Series Routers if the router remains in ROMmon.

Symptom: Router boots into ROMmon, with these messages on the console:
  • device does not contain a valid magic number
  • boot: cannot open "flash:"
  • boot: cannot determine first file name on device "flash:"
Recommended Action: The Flash is empty or the filesystem is corrupted. Copy a valid image onto the Flash. While you copy, you are prompted to erase the old Flash, if one exists. Then reload the router. Refer to Software Upgrade Procedure for instructions on how to copy a valid image onto the Flash.

Symptom: During bootup, the router displays the error message "pre and post compression image sizes disagree", after which booting ceases.
Recommended Action: Possible causes include:
  • corrupted software image
  • faulty Flash memory
  • faulty DRAM
  • bad memory slot
Copy a new image into Flash in order to begin to troubleshoot this issue. Refer to ROMmon Recovery for the Cisco 3600 and 3700 Series Routers for instructions on how to copy a valid image into Flash. If the installation of a new image fails to resolve the problem, try to swap out the memory. If the replacement of the Flash and DRAM fails to resolve the problem, the memory slot on the chassis might be faulty. Use the TAC Service Request Tool (registered customers only) in order to resolve the hardware issue.
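The configuration register value 0x2102 used in the table can be read bit by bit. This Python sketch decodes a few fields and assumes the commonly documented Cisco register layout: boot field in bits 0-3, bit 6 to ignore the startup configuration, bit 8 to disable Break, and bits 5, 11, and 12 for the console speed (all clear means the default of 9600 bps). The function name decode_confreg and the returned keys are illustrative and are not part of any Cisco tool.

```python
def decode_confreg(value: int) -> dict:
    """Decode selected fields of a 16-bit configuration register value."""
    boot_field = value & 0x000F  # bits 0-3
    if boot_field == 0:
        boot_mode = "stay in ROMmon"
    elif boot_field == 1:
        boot_mode = "platform-dependent (boot helper or first image in Flash)"
    else:
        boot_mode = "use boot system commands, else a default image"
    return {
        "boot_field": boot_field,
        "boot_mode": boot_mode,
        "ignore_startup_config": bool(value & 0x0040),  # bit 6
        "break_disabled": bool(value & 0x0100),         # bit 8
        # Bits 5, 11, and 12 select the console speed; all clear is 9600 bps.
        "console_speed_is_default_9600": (value & 0x1820) == 0,
    }


if __name__ == "__main__":
    for reg in (0x2102, 0x2142, 0x0):
        print(hex(reg), decode_confreg(reg))
```

For 0x2102 the sketch reports boot field 2 and the default console speed, which matches the 9600 bps terminal setting recommended in the table. A boot field of 0 means the router stays in ROMmon on every reload.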

Router Boots with Error Messages after Cisco IOS Upgrade

After you upgrade the Cisco IOS software and copy the startup-config file back to the router, the router boots with this error message:

NV: Invalid Pointer value([Hex]) in private configuration structure

This is an NVRAM error message. Format the NVRAM, and then copy the startup-config file back to the NVRAM.

Refer to Cisco bug ID CSCin98933 (registered customers only) for more information.

Router Stuck in ROMmon

When the router reloads, it might enter ROMmon mode. Refer to ROMmon Recovery for the Cisco 3600 and 3700 Series Routers for information on how to recover a Cisco 3700 Series Router that is stuck in ROMmon (rommon # > prompt). After you recover the Cisco IOS software with that procedure, the router might enter ROMmon mode again on the next reload, even when the configuration register is set to 0x2102. In that case, the router stops in ROMmon mode with the error message loadprog: bad file magic number: 0x0 boot: cannot load "flash:".

In this case these action items should be taken:

Router Boots and Hangs

The router boots and hangs with this message:

%ERR-1-GT64120 (PCI-1): Fatal error, PCI Master abort
 GT=0xB4000000, cause=0x03000400, mask=0x00D01F00, real_cause=0x00000400
 bus_err_high=0x00000000, bus_err_low=0x00000000, addr_decode_err=0x00000470

This error message indicates a bus error. A bus error can be caused by a problem with a hardware module or by a problem with the router chassis.

Complete these steps in order to isolate the issue:

  1. Power down the router.

  2. Remove all network modules.

  3. Power up the router.

  4. If the router goes back into a boot loop with the error message, the chassis might be defective.

  5. If the router boots normally, power down the router, install one network module, and power back up.

  6. Continue to install one network module at a time, and power up the router after each one, until the network module or chassis slot that causes the issue is isolated.

  7. Replace the part determined to be defective.

  8. If the router does not go back into the boot loop mode after you reinstall all modules, the issue might be a mis-seated network module.
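The isolation steps above can be summarized as a small decision procedure. This Python sketch is only an illustration of that logic: the callback boots_cleanly is hypothetical and stands for the manual work of powering the router up with a given set of modules and watching the console, and the module names in the usage example are placeholders.

```python
from typing import Callable, List


def isolate_faulty_module(
    modules: List[str],
    boots_cleanly: Callable[[List[str]], bool],
) -> str:
    """Walk through the isolation steps and report what to suspect."""
    # Steps 1-4: boot with every network module removed.
    if not boots_cleanly([]):
        return "chassis suspected"
    # Steps 5-6: reinstall one module at a time and boot after each one.
    installed: List[str] = []
    for module in modules:
        installed.append(module)
        if not boots_cleanly(list(installed)):
            return f"{module} (or its chassis slot) suspected"
    # Step 8: all modules reinstalled and the fault did not return.
    return "no fault reproduced; a mis-seated module is possible"


if __name__ == "__main__":
    # Simulated router in which the placeholder module "NM-B" triggers the error.
    simulated = lambda installed: "NM-B" not in installed
    print(isolate_faulty_module(["NM-A", "NM-B", "NM-C"], simulated))
```

The sketch cannot tell a defective module from a defective slot. To separate the two, move the suspect module to a different slot and repeat the test.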

Troubleshoot Continuous/Boot Loop

The router might get stuck in a continuous loop that can be due to a hardware issue. A continuous loop never lets you gain access to the router. Until it is powered off, the router continues to scroll error messages. This section provides examples of the error messages seen, and the necessary troubleshooting steps to determine the faulty hardware.

A bus error exception is one cause of a continuous boot loop. Bus error exceptions can occur in these instances:

  • Loaded Cisco IOS software does not support installed hardware

  • Software failure

  • Mis-seated hardware

  • Hardware failure

Troubleshooting Flowchart

This is a troubleshooting flowchart for Bus Error Exception, %ERR-1-GT64010, Watchdog Timeout, and OIRINT continuous loops:

[Flowchart image: troubleshoot_3700router.gif]

Note: If the router does not experience the continuous loop after you complete these troubleshooting steps, then a mis-seated network module might have caused it. It is recommended that you monitor the router for 24 hours in order to make sure that the router continues to function and does not experience this issue again.

Bus Error Exception

This is an example of a bus error exception message:

*** System received a Bus Error exception *** 
signal= 0xa, code= 0xc, context= 0x61c67fc0 
PC = 0x6043904c, Cause = 0x2420, Status Reg = 0x34018002

Refer to Troubleshooting Bus Error Crashes for more information on this issue.
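When the same exception repeats across reboots, it can help to compare the register values from each occurrence. This Python sketch extracts the name = hex-value pairs from an exception message such as the one shown above. The function name parse_exception_fields is illustrative, and the regular expression is an assumption based only on the message layout in this document.

```python
import re

# A field name made of letters and spaces, an equals sign, then a hex value.
FIELD_RE = re.compile(r"([A-Za-z][A-Za-z ]*?)\s*=\s*(0x[0-9A-Fa-f]+)")


def parse_exception_fields(text: str) -> dict:
    """Return the exception fields as a dict of name -> integer value."""
    return {name.strip(): int(value, 16) for name, value in FIELD_RE.findall(text)}


if __name__ == "__main__":
    sample = (
        "*** System received a Bus Error exception ***\n"
        "signal= 0xa, code= 0xc, context= 0x61c67fc0\n"
        "PC = 0x6043904c, Cause = 0x2420, Status Reg = 0x34018002\n"
    )
    for name, value in parse_exception_fields(sample).items():
        print(f"{name}: {value:#x}")
```

The parsed values are only a convenience for comparing captures. The interpretation of the program counter and cause registers is covered in Troubleshooting Bus Error Crashes.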

SegV Exception

If you do not power-cycle or manually reload the router, this is what the show version output displays:

Router uptime is 2 days, 3 hours, 5 minutes 
System restarted by error - a SegV exception, PC 0x80245F7C 
System image file is "flash:c2600-js-mz.120-9.bin" 

These lines might also be present in the console logs:

 *** System received a SegV exception *** 
signal= 0xb, code= 0x1200, context= 0x80d15094 
PC = 0x80678854, Vector = 0x1200, SP = 0x80fcf170 

Refer to SegV Exceptions for more information on this issue.

TLB (Load/Fetch) Exception

The TLB (Load/Fetch) Exception error appears similar to this sample:

*** TLB (Load/Fetch) Exception ***
Access address = 0x1478
PC = 0x1478, Cause = 0x8008, Status Reg = 0x30410002

This error typically repeats indefinitely until interrupted by a user-issued break sequence or by power-cycling the router, after which the error might resume.

In order to begin to troubleshoot, reload the Cisco IOS software image into Flash with the procedure outlined in ROMmon Recovery for the Cisco 3600 and 3700 Series Routers.

Use the flowchart in this document in order to troubleshoot the hardware.

If the problem persists, turn the router off, reseat the DRAM, and then power up the router. If the problem continues, replace the DRAM and power up the router again.

%ERR-1-GT64010

This is an example of the %ERR-1-GT64010 error message:

%ERR-1-GT64010: Fatal error, PCI Master read 
cause=0x0120E483, mask=0x0CD01F00, real_cause=0x00000400 
bus_err_high=0x00000000, bus_err_low=0x04080000, addr_decode_err=0x14000470

Watchdog Timeouts

Cisco processors have timers that guard against certain types of system hangs. The CPU periodically resets a watchdog timer, which limits how long each process can run. If a process runs longer than it should and the timer is not reset, a trap occurs, and the watchdog timer is used in order to escape from that process.
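The reset-or-trap behavior can be pictured with a small model. This Python sketch is conceptual only and does not represent how Cisco IOS implements its watchdog. The class name, the two-second timeout, and the explicit time arguments are all assumptions chosen to keep the example deterministic.

```python
class Watchdog:
    """Toy watchdog: it expires if it is not reset within `timeout` seconds."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.last_reset = 0.0

    def reset(self, now: float) -> None:
        # A healthy system calls this periodically.
        self.last_reset = now

    def expired(self, now: float) -> bool:
        # True once more than `timeout` seconds have passed since the last reset.
        return now - self.last_reset > self.timeout


if __name__ == "__main__":
    wd = Watchdog(timeout=2.0)
    wd.reset(now=0.0)
    print(wd.expired(now=1.5))  # still within the timeout
    wd.reset(now=1.5)
    print(wd.expired(now=4.0))  # 2.5 s without a reset, so the trap would occur
```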

There are two main types of watchdog timeouts. The first type is usually caused by a software problem and is reported in one or both of these ways:

  • The show version command output shows:

    "System returned to ROM by bus error at PC 0x602DADE0, address 0x480811"  - or - 
    "System returned to ROM by error - a Software forced crash, PC 0x60435894" 
  • The console logs show:

    %SYS-2-WATCHDOG: Process aborted on watchdog timeout

The second type of watchdog timeout is usually due to a hardware problem and is reported in one or both of these two ways:

  • The show version command output shows:

    Router uptime is 17 minutes 
    System returned to ROM by watchdog timer expired 
    System image file is "flash:c3640-is-mz.122-3.bin"
  • The console logs show:

    System returned to ROM by watchdog timer expired 
    *** Watch Dog Timeout *** 
    PC = 0x800001b4, SP = 0x61e19590

Both types need further investigation based on their symptoms. Refer to Troubleshooting Bus Error Crashes or Understanding Software-forced Crashes, depending on which message appears in the show version output. Refer to Troubleshooting Watchdog Timeouts for more information on watchdog timeout crashes.
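The restart reason in the show version output decides which reference applies. This Python sketch sorts the restart lines quoted in this document into broad categories. The function name classify_restart and the category labels are illustrative, and the matching covers only the message forms shown here plus the ordinary reload and power-on cases.

```python
import re

RESTART_RE = re.compile(r"System (?:returned to ROM|restarted) by (.+)")


def classify_restart(show_version_output: str) -> str:
    """Classify the restart reason found in show version output."""
    match = RESTART_RE.search(show_version_output)
    if match is None:
        return "unknown"
    reason = match.group(1).lower()
    if "watchdog timer expired" in reason:
        return "watchdog timeout (hardware suspected)"
    if "bus error" in reason:
        return "bus error"
    if "software forced crash" in reason:
        return "software-forced crash"
    if "segv" in reason:
        return "SegV exception"
    if reason.startswith("reload") or "power-on" in reason:
        return "normal restart"
    return "other"


if __name__ == "__main__":
    print(classify_restart("System returned to ROM by watchdog timer expired"))
    print(classify_restart("System restarted by error - a SegV exception, PC 0x80245F7C"))
```

The order of the checks matters because a watchdog event that is caused by software is reported as a bus error or a software-forced crash, not as "watchdog timer expired".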

Troubleshoot Router Crashes

A system crash refers to a situation where the system has detected an unrecoverable error and has restarted itself. Software problems, hardware problems, or both can cause a crash. This section deals with hardware-caused crashes and crashes that are software-related, but might be mistaken for hardware problems.

Caution: If the router is reloaded after the crash, for example, through a power-cycle or the reload command, important information about the crash is lost. Try to collect show technical-support and show log output, as well as the crashinfo file, if possible, before you reload the router.

Refer to Troubleshooting Router Crashes for more information on this issue.

Bus Error Crashes

The system encounters a bus error when the processor tries to access a memory location that either does not exist (a software error) or does not respond properly (a hardware problem). You can identify a bus error in the output of the show version command, provided that the router has not been power-cycled or manually reloaded.

This is an example of a bus error crash in the show version output:

Router uptime is 2 days, 21 hours, 30 minutes
System restarted by bus error at PC 0x30EE546, address 0xBB4C4
System image file is "flash:igs-j-l.111-24.bin", booted via flash 
......... 

At the console prompt, this error message might also be seen during a bus error:

*** System received a Bus Error exception *** 
signal= 0xa, code= 0x8, context= 0x608c3a50
PC = 0x60368518, Cause = 0x20, Status Reg = 0x34008002

Copy the show stacks output from the router and paste it into the Output Interpreter (registered customers only). The Output Interpreter lists any Cisco bug IDs that match the show stacks output. Refer to Troubleshooting Bus Error Crashes for more information on bus error crashes.

Troubleshoot Router Hang Issues

A 3700 Series Router might experience a router hang. A hang occurs when the router boots to a certain point and then no longer accepts any commands or keystrokes. In other words, the console screen hangs after a certain point. Hangs are not necessarily hardware issues; most of the time, they are software issues. Refer to Troubleshooting Router Hangs if your router experiences a router hang.

Troubleshoot Module Problems

The Cisco 3745 has four slots, and the Cisco 3725 has two slots. Each network module slot accepts a variety of network module interface cards, which support a variety of LAN, WAN, and Voice technologies.

NM-1T3/E3 Installation Issues

By default, the T3 controller does not appear in the show running-config output. The card appears in the show version output, but it does not appear in the show running-config or show ip interface brief output.

Router-3745#show version
Cisco Internetwork Operating System Software
IOS (tm) 3700 Software (C3745-IK9S-M), Version 12.3(12b), RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2005 by cisco Systems, Inc.
Compiled Thu 31-Mar-05 18:07 by jfeldhou
Image text-base: 0x60008AF4, data-base: 0x61E20000

ROM: System Bootstrap, Version 12.2(8r)T2, RELEASE SOFTWARE (fc1)
ROM: 3700 Software (C3745-IK9S-M), Version 12.3(12b), RELEASE SOFTWARE (fc2)

D-R4745-9A uptime is 18 minutes
System returned to ROM by reload
System image file is "flash:c3745-ik9s-mz.123-12b.bin"


This product contains cryptographic features and is subject to United
States and local country laws governing import, export, transfer and
use. Delivery of Cisco cryptographic products does not imply
third-party authority to import, export, distribute or use encryption.
Importers, exporters, distributors and users are responsible for
compliance with U.S. and local country laws. By using this product you
agree to comply with applicable laws and regulations. If you are unable
to comply with U.S. and local laws, return this product immediately.

A summary of U.S. laws governing Cisco cryptographic products may be found at:
http://www.cisco.com/wwl/export/crypto/tool/stqrg.html

If you require further assistance please contact us by sending email to
export@cisco.com.

cisco 3745 (R7000) processor (revision 0.0) with 249856K/12288K bytes of memory.
Processor board ID
R7000 CPU at 350MHz, Implementation 39, Rev 3.3, 256KB L2, 2048KB L3 Cache
Bridging software.
X.25 software, Version 3.0.0.
SuperLAT software (copyright 1990 by Meridian Technology Corp).
2 FastEthernet/IEEE 802.3 interface(s)
1 Subrate T3/E3 ports(s)
DRAM configuration is 64 bits wide with parity disabled.
151K bytes of non-volatile configuration memory.
62592K bytes of ATA System CompactFlash (Read/Write)

Configuration register is 0x2102
Router-3745#show ip interface brief
Interface                  IP-Address      OK? Method Status                Protocol
FastEthernet0/0            10.10.50.25     YES NVRAM  up                    up
FastEthernet0/1            unassigned      YES NVRAM  administratively down down

You need to configure the card type before the router recognizes the card. This is a configuration example. Refer to Configure the Card Type and Controller for T3 in the hardware installation guide for detailed configuration information.

Router-3745#card type t3 1
Router-3745#
*Mar  1 00:24:20.031: %LINK-3-UPDOWN: Interface Serial1/0, changed state to down
*Mar  1 00:24:21.031: %LINEPROTO-5-UPDOWN: Line protocol on Interface Serial1/0, changed state to down
Router-3745#show ip interface brief
Interface                  IP-Address      OK? Method Status                Protocol
FastEthernet0/0            10.10.50.25     YES NVRAM  up                    up
FastEthernet0/1            unassigned      YES NVRAM  administratively down down
Serial1/0                  unassigned      YES unset  down                  down
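The symptom in this section, a T3/E3 port listed in show version with no matching Serial interface, can be checked against saved command output. This Python sketch is a rough heuristic under stated assumptions: the function name is hypothetical, the patterns are taken only from the sample output in this document, and the check is inconclusive on a router that has other serial interfaces installed.

```python
import re


def t3_card_needs_card_type(show_version: str, show_ip_int_brief: str) -> bool:
    """Return True if a T3/E3 port is present but no Serial interface exists."""
    has_t3_hardware = re.search(
        r"^\s*\d+\s+Subrate T3/E3 port", show_version, re.MULTILINE
    ) is not None
    has_serial_interface = re.search(
        r"^Serial\d+/\d+", show_ip_int_brief, re.MULTILINE
    ) is not None
    return has_t3_hardware and not has_serial_interface


if __name__ == "__main__":
    version = "2 FastEthernet/IEEE 802.3 interface(s)\n1 Subrate T3/E3 ports(s)\n"
    before = "FastEthernet0/0  10.10.50.25  YES NVRAM  up  up\n"
    after = before + "Serial1/0  unassigned  YES unset  down  down\n"
    print(t3_card_needs_card_type(version, before))
    print(t3_card_needs_card_type(version, after))
```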

Router Drops Packets

Packet loss caused by hardware problems is fairly easy to identify. This section uses the output of the show interfaces command in order to identify packet loss.

Cyclic Redundancy Check (CRC) and Frame Errors

If CRC errors or frame errors constantly increase on the interface, this usually indicates a hardware problem.

router#show interface ethernet 0/0 
   Ethernet0/0 is up, line protocol is up 
   ... 
   121 input errors, 102 CRC, 19 frame, 0 overrun, 0 ignored 

An exception is CRC and frame errors on channelized interfaces, which can also indicate clocking problems. The fault that causes the errors can be anywhere between the two connected interfaces: on cables, on intermediate devices, or on the interfaces themselves. Troubleshooting techniques differ slightly for different interface types.
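Whether the counters "constantly increase" is easiest to judge from two captures of the show interfaces output taken some time apart. This Python sketch extracts the input error, CRC, and frame counters and reports the difference. The function names are illustrative, and the second capture in the usage example contains made-up values.

```python
import re

ERROR_LINE = re.compile(r"(\d+) input errors, (\d+) CRC, (\d+) frame")


def input_error_counters(show_interface_output: str) -> dict:
    """Extract the input error, CRC, and frame counters from one capture."""
    match = ERROR_LINE.search(show_interface_output)
    if match is None:
        raise ValueError("no 'input errors' line found")
    total, crc, frame = (int(group) for group in match.groups())
    return {"input_errors": total, "crc": crc, "frame": frame}


def counter_deltas(earlier: str, later: str) -> dict:
    """Return how much each counter grew between two captures."""
    before = input_error_counters(earlier)
    after = input_error_counters(later)
    return {name: after[name] - before[name] for name in before}


if __name__ == "__main__":
    first = "121 input errors, 102 CRC, 19 frame, 0 overrun, 0 ignored"
    second = "150 input errors, 125 CRC, 25 frame, 0 overrun, 0 ignored"  # hypothetical
    print(counter_deltas(first, second))
```

A negative difference usually means that the interface counters were cleared between the two captures, so that pair of captures should be discarded.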

Ethernet Interfaces

For Ethernet interfaces, troubleshooting differs between a shared environment (devices connected through a hub or with a coaxial cable) and a switched environment (devices connected to a switch).

In a switched environment, these five components can cause the error:

  • cable

  • local interface (port)

  • remote interface (port)

  • speed

  • duplex mismatch

Consequently, the troubleshooting steps are simple. For example, if a router is connected to a switch, the troubleshooting steps are:

  1. Replace the cable (make sure that you use a straight-through cable).

  2. If this does not solve the problem, try another port on the switch.

  3. If the problem persists, replace the Ethernet interface.

In a shared environment, the source of the problem is a lot harder to find. Every piece of hardware that makes up the shared segment can be the cause. All components, cables, connectors, and so forth, have to be tested one by one.

Ignored Packets

router#show interfaces ethernet 0/0 
   Ethernet0/0 is up, line protocol is up 
   ... 
   21 input errors, 0 CRC, 0 frame, 0 overrun, 21 ignored

Packets are ignored if there are no free buffers to accept the new packet. This can happen if the router is overloaded with traffic, but it can also happen if the interface is faulty. If ignores are present on all interfaces, then the router is probably overloaded with traffic, or does not have sufficient free buffers in the pool that matches the maximum transmission unit (MTU) on the interfaces. In the latter case, an increment of the ignored counter is followed by an increment of the no buffer counter:

router#show interfaces serial 0/0 
   ... 
   1567 packets input, 0 bytes, 22 no buffer 
   22 input errors, 0 CRC, 0 frame, 0 overrun, 22 ignored, 0 abort

You might also see an increase in the buffer failures counter in the pool that matches the MTU size:

router#show buffers 
   ... 
   Big buffers, 1524 bytes (total 50, permanent 50): 
   50 in free list (5 min, 150 max allowed) 
   3066 hits, 189 misses, 0 trims, 24 created 
   12 failures (0 no memory)

The preconfigured numbers of permanent, free, and maximum allowed buffers might not suit every environment. Refer to Buffer Tuning for all Cisco Routers for more information on this issue and how to avoid it.
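The hits, misses, and failures counters of a pool can be read from a saved show buffers capture. This Python sketch parses one pool block and computes the share of buffer requests that missed. The function name and the miss_ratio figure are illustrative conveniences, and no threshold for an acceptable ratio is implied.

```python
import re


def parse_buffer_pool(pool_text: str) -> dict:
    """Parse hits, misses, and failures from one show buffers pool block."""
    usage = re.search(r"(\d+) hits, (\d+) misses", pool_text)
    fails = re.search(r"(\d+) failures \((\d+) no memory\)", pool_text)
    if usage is None or fails is None:
        raise ValueError("pool counters not found")
    hits, misses = int(usage.group(1)), int(usage.group(2))
    requests = hits + misses
    return {
        "hits": hits,
        "misses": misses,
        "failures": int(fails.group(1)),
        "no_memory": int(fails.group(2)),
        # Share of buffer requests that could not be served from the free list.
        "miss_ratio": misses / requests if requests else 0.0,
    }


if __name__ == "__main__":
    sample = (
        "Big buffers, 1524 bytes (total 50, permanent 50):\n"
        "50 in free list (5 min, 150 max allowed)\n"
        "3066 hits, 189 misses, 0 trims, 24 created\n"
        "12 failures (0 no memory)\n"
    )
    print(parse_buffer_pool(sample))
```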

If ignores only increase on one interface and are not followed by an increment of the no buffer counter, and the interface is not heavily loaded, then this interface could be faulty. In that case, capture the output of the show tech-support command and contact Cisco Technical Support. The load on the interface can be viewed in the output of the show interfaces command:

router#show interfaces serial 0/0 
... 
   reliability 255/255, txload 100/255, rxload 122/255 
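The load values are fractions of 255, so txload 100/255 is roughly 39 percent. This Python sketch converts the load line to percentages and applies the reasoning from this section about ignored packets. The function names, the input dictionary layout, and the 70 percent "heavily loaded" threshold are assumptions made for illustration and are not Cisco guidance.

```python
import re

LOAD_LINE = re.compile(r"reliability (\d+)/255, txload (\d+)/255, rxload (\d+)/255")


def load_percentages(show_interface_output: str) -> dict:
    """Convert the x/255 reliability and load values to percentages."""
    match = LOAD_LINE.search(show_interface_output)
    if match is None:
        raise ValueError("no reliability/txload/rxload line found")
    names = ("reliability", "txload", "rxload")
    return {n: round(int(v) * 100 / 255, 1) for n, v in zip(names, match.groups())}


def diagnose_ignores(deltas: dict, heavy_load_pct: float = 70.0) -> str:
    """deltas maps interface -> {"ignored", "no_buffer", "rxload_pct"} increments."""
    with_ignores = [name for name, d in deltas.items() if d["ignored"] > 0]
    if not with_ignores:
        return "no ignored packets"
    if len(deltas) > 1 and len(with_ignores) == len(deltas):
        return "overload or buffer shortage likely"
    if len(with_ignores) == 1:
        name = with_ignores[0]
        d = deltas[name]
        if d["no_buffer"] == 0 and d["rxload_pct"] < heavy_load_pct:
            return f"{name} could be faulty"
    return "inconclusive"


if __name__ == "__main__":
    print(load_percentages("reliability 255/255, txload 100/255, rxload 122/255"))
```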

Input and Output Queue Drops

Hardware problems never cause input queue drops. Output queue drops might be caused by a hardware problem only if the output queue is constantly full and no packets are sent out of the interface. Refer to Troubleshooting Input Queue Drops and Output Queue Drops in order to read more about these kinds of drops.

Troubleshoot Interface Issues

Troubleshoot Ethernet Interfaces

Troubleshooting Ethernet provides troubleshooting procedures for common Ethernet media problems.

Troubleshoot Serial Interfaces

This is a list of references to use in order to troubleshoot serial interfaces:

Troubleshoot ISDN Interfaces

These are some references to use in order to troubleshoot ISDN interfaces:

Information to Collect if You Open a TAC Case

If you still need assistance after you complete the troubleshooting steps and want to use the TAC Service Request Tool (registered customers only) in order to open a case with Cisco Technical Support, be sure to include this information:
  • Console captures that show the error messages
  • Console captures that show the troubleshooting steps taken and the boot sequence during each step
  • The hardware component that failed and the serial number for the chassis
  • Troubleshooting logs
  • Output from the show technical-support command
Please attach the collected data to your case in non-zipped, plain text format (.txt). Use TAC Service Request Tool (registered customers only) in order to upload and attach information to your case. If you cannot access the Case Query tool, you can send the information in an email attachment to attach@cisco.com with your case number in the subject line of your message.

Note: Do not manually reload or power-cycle the router before you collect this information, unless it is required for troubleshooting. A reload can destroy information that is needed in order to determine the root cause of the problem.
