The information in this document is based on these software and hardware versions:
All Cisco IOS® software versions
All Cisco routers
Note: This document does not apply to Cisco Catalyst switches or MGX platforms.
The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, make sure that you understand the potential impact of any command.
The system encounters a bus error when the processor tries to access a memory location that either does not exist (a software error) or does not respond properly (a hardware problem). A bus error can be identified from the output of the show version command, provided that the router has not been power-cycled or manually reloaded since the crash.
Router uptime is 2 days, 21 hours, 30 minutes
System restarted by bus error at PC 0x30EE546, address 0xBB4C4
System image file is "flash:igs-j-l.111-24.bin", booted via flash
At the console prompt, this error message can also be seen during a bus error:
*** System received a Bus Error exception ***
signal= 0xa, code= 0x8, context= 0x608c3a50
PC = 0x60368518, Cause = 0x20, Status Reg = 0x34008002
After this, the router reloads. In some cases, however, the router goes into a loop of crashes and reloads and manual intervention is required to break out of this loop.
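When several crashes need to be compared, it can help to pull the hexadecimal fields out of the console message. The following is a minimal sketch that assumes the message has the "name = 0x..." layout shown in the sample above. The helper name parse_exception_fields is hypothetical and is not part of any Cisco tool.

```python
import re

# Matches "name= 0xVALUE" or "name = 0xVALUE" pairs as they appear in the
# sample console output. Field names may contain spaces ("Status Reg").
FIELD_RE = re.compile(r"([A-Za-z][A-Za-z ]*?)\s*=\s*(0x[0-9A-Fa-f]+)")


def parse_exception_fields(text):
    """Return {field name: integer value} for every 'name = 0x...' pair."""
    return {name.strip(): int(value, 16) for name, value in FIELD_RE.findall(text)}


sample = """signal= 0xa, code= 0x8, context= 0x608c3a50
PC = 0x60368518, Cause = 0x20, Status Reg = 0x34008002"""

fields = parse_exception_fields(sample)
print(fields["signal"])   # 10 (0xa in decimal)
print(hex(fields["PC"]))  # 0x60368518
```

The signal value 0xa is 10 in decimal, which makes it easier to compare with messages that report the signal number in decimal.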
Another related issue is a Versatile Interface Processor (VIP) crash. If this problem occurs, error messages similar to these are logged:
%VIP2 R5K-1-MSG: slot0 System reloaded by a Bus Error exception
%VIP2 R5K-1-MSG: slot0 caller=0x600BC974
%VIP2 R5K-1-MSG: slot0 System exception: sig=10, code=0x408,
Finally, another bus error crash type is a line card crash on a Cisco 12000 Series Internet Router. If this problem occurs, error messages similar to these are logged in the show context output:
CRASH INFO: Slot 1, Index 1, Crash at 11:27:15 utc Wed May 16 2001
GS Software (GLC1-LC-M), Version 12.0(16.5)S, EARLY DEPLOYMENT MAINTENANCE
TAC Support: http://www.cisco.com/pcgi-bin/ibld/view.pl?i=support
Compiled Thu 29-Mar-01 17:12 by ninahung
Card Type: 3 Port Gigabit Ethernet, S/N
System exception: SIG=10, code=0x2008, context=0x40D8DF44
System restarted by a Bus Error exception
-Traceback= 40165800 4038D0FC 4025C7BC 4026287C 4029581C 402EECF8 400C0144
$0 : 00000000, AT : 00000000, v0 : 00000044, v1 : 0FE00020
a0 : 00000000, a1 : 0FE00000, a2 : 00000000, a3 : 39EC6AAB
t0 : 00000030, t1 : 34008D01, t2 : 34008100, t3 : FFFF00FF
t4 : 400C01E8, t5 : 00000001, t6 : 00000001, t7 : 00000001
s0 : 40DCDD20, s1 : 0FE00000, s2 : 00000000, s3 : 000005DC
s4 : 00000000, s5 : 0FE00020, s6 : 00000004, s7 : 414CF120
t8 : 41680768, t9 : 00000000, k0 : 00000000, k1 : FFFF8DFD
gp : 40CB9780, sp : 4105BFE8, s8 : 41652BA0, ra : 4038D0FC
EPC : 0x40165800, SREG : 0x34008D03, Cause : 0x00002008
ErrorEPC : 0xBFC22B94
-Process Traceback= No Extra Traceback
The first thing to do is to find out which memory location (also known as the "address" or "address operand") the router tried to access when the bus error occurred. With this information, you have an indication as to whether the fault lies with the Cisco IOS Software or the router hardware. In the example, "System restarted by bus error at PC 0x30EE546, address 0xBB4C4", the memory location that the router tried to access is 0xBB4C4. Do not confuse this with the program counter (PC) value above.
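To keep the PC and the address operand apart, the restart reason line can be split mechanically. This sketch assumes the line contains the phrase "bus error at PC 0x..., address 0x..." as in the examples in this document. The function name parse_bus_error is hypothetical.

```python
import re

# Captures the PC and the address operand from a restart reason line.
RESTART_RE = re.compile(
    r"bus error at PC (0x[0-9A-Fa-f]+), address (0x[0-9A-Fa-f]+)"
)


def parse_bus_error(line):
    """Return (pc, address) as integers, or None if the line does not match."""
    match = RESTART_RE.search(line)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


pc, address = parse_bus_error(
    "System restarted by bus error at PC 0x30EE546, address 0xBB4C4"
)
print(f"PC=0x{pc:08X} address=0x{address:08X}")
# PC=0x030EE546 address=0x000BB4C4
```

The second value, the address, is the one to compare against the show region output.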
The second thing to do is determine the type of processor in the router. Memory address locations differ depending on the type of processor. There are two main types of processors in Cisco routers: 68000 processors and Reduced Instruction Set Computing (RISC) processors.
68000 Processors
This is part of a show version output that indicates that the router has a 68000 processor:
cisco 2500 (68030) processor (revision D) with 8192K/2048K bytes of memory.
Router platforms that have 68000 processors include:
Cisco 1000 Series Routers
Cisco 1600 Series Routers
Cisco 2500 Series Routers
Cisco 4000 Series Routers
Route Processor (RP) Modules on Cisco 7000 (RP) Series Routers
Reduced Instruction Set Computing (RISC) Processors
This is part of a show version output that indicates that the router has a RISC processor:
cisco 3640 (R4700) processor (revision 0x00) with 49152K/16384K bytes of memory.
The R in (R4700) indicates a RISC processor.
Router platforms that have RISC processors include:
Cisco 3600 Series Routers
Cisco 4500 Series Routers
Cisco 4700 Series Routers
Route Switch Processor (RSP) Modules on Cisco 7500 Series and Cisco 7000 (RSP7000) Series Routers
Network Processor Engine (NPE) Modules on Cisco 7200 Series Routers
Multilayer Switch Feature Card (MSFC) on the Cisco 7600 Series Routers or Catalyst 6000 Switch
Performance Routing Engine (PRE) Modules on Cisco 10000 Series Internet Routers
Gigabit Route Processor (GRP) Modules on Cisco 12000 Series Internet Routers
Once you have determined the address and the processor type, you can start with more detailed troubleshooting.
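The processor check described above can be sketched as a small classifier. It assumes the processor model appears in parentheses immediately before the word "processor", as in the two show version samples, and it treats a leading R as RISC and a leading 68 as a 68000-family part. The function name processor_family is hypothetical, and other show version layouts might not match.

```python
import re

# The model is the parenthesized token right before "processor",
# for example "(68030) processor" or "(R4700) processor".
PROC_RE = re.compile(r"\((R?\d+)\) processor")


def processor_family(show_version_line):
    """Classify a show version line as '68000', 'RISC', or 'unknown'."""
    match = PROC_RE.search(show_version_line)
    if match is None:
        return "unknown"
    model = match.group(1)
    if model.startswith("R"):
        return "RISC"
    if model.startswith("68"):
        return "68000"
    return "unknown"


print(processor_family(
    "cisco 2500 (68030) processor (revision D) with 8192K/2048K bytes of memory."
))  # 68000
print(processor_family(
    "cisco 3640 (R4700) processor (revision 0x00) with 49152K/16384K bytes of memory."
))  # RISC
```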
With the address accessed by the router when the bus error occurred, use the show region command to determine the memory location the address corresponds to. If the address reported by the bus error does not fall within the ranges displayed in the show region output, this means that the router tried to access an address that is not valid. This indicates that it is a Cisco IOS Software problem. Use the Output Interpreter Tool (registered customers only) to decode the output of the show stacks command and identify the Cisco IOS Software bug that causes the bus error.
On the other hand, if the address falls within one of the ranges in the show region output, it means that the router accessed a valid memory address, but the hardware corresponding to that address does not respond properly. This indicates a hardware problem.
Here is an example of the show region output:
Start End Size(b) Class Media Name
0x00000000 0x007FFFFF 8388608 Local R/W main
0x00001000 0x0001922F 98864 IData R/W main:data
0x00019230 0x000666B3 316548 IBss R/W main:bss
0x000666B4 0x007FEFFF 7965004 Local R/W main:heap
0x007FF000 0x007FFFFF 4096 Local R/W main:flhlog
0x00800000 0x009FFFFF 2097152 Iomem R/W iomem
0x03000000 0x037FFFFF 8388608 Flash R/O flash
0x0304033C 0x037A7D3F 7764484 IText R/O flash:text
Note: In some earlier Cisco IOS Software versions, this command is not available. Starting with Cisco IOS Software Release 12.0(9), the show region output is included in the show tech-support output.
Addresses are displayed in hexadecimal format. The addresses that fall within the "Start" and "End" ranges are valid memory addresses.
Main corresponds to main memory or dynamic RAM (DRAM).
iomem corresponds to input/output (I/O) memory, which is implemented differently on different platforms. For example, it is DRAM on the Cisco 2500 and shared RAM (SRAM) on the Cisco 4000.
Still using the previous example, System restarted by bus error at PC 0x30EE546, address 0xBB4C4, this bus error crash comes from a Cisco 2500 router that has the show region output shown above. The address 0xBB4C4 is equivalent to 0x000BB4C4. In the show region output, this address falls within the range of "main", or more specifically, "main:heap" (0x000666B4-0x007FEFFF). As mentioned earlier, "main" corresponds to the main memory or the DRAM, so the DRAM chips need to be checked.
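The range comparison in this example can be automated. The sketch below parses the Cisco 2500 show region sample and returns the smallest region that contains a given address. It assumes the six-column layout shown in this document. The function names parse_regions and find_region are hypothetical.

```python
SHOW_REGION = """\
Start End Size(b) Class Media Name
0x00000000 0x007FFFFF 8388608 Local R/W main
0x00001000 0x0001922F 98864 IData R/W main:data
0x00019230 0x000666B3 316548 IBss R/W main:bss
0x000666B4 0x007FEFFF 7965004 Local R/W main:heap
0x007FF000 0x007FFFFF 4096 Local R/W main:flhlog
0x00800000 0x009FFFFF 2097152 Iomem R/W iomem
0x03000000 0x037FFFFF 8388608 Flash R/O flash
0x0304033C 0x037A7D3F 7764484 IText R/O flash:text
"""


def parse_regions(show_region_text):
    """Return a list of (start, end, name) tuples; the header line is skipped."""
    regions = []
    for line in show_region_text.splitlines():
        parts = line.split()
        if len(parts) >= 6 and parts[0].startswith("0x"):
            regions.append((int(parts[0], 16), int(parts[1], 16), parts[-1]))
    return regions


def find_region(regions, address):
    """Return the name of the smallest region containing address, or None."""
    matches = [r for r in regions if r[0] <= address <= r[1]]
    if not matches:
        return None
    return min(matches, key=lambda r: r[1] - r[0])[2]


regions = parse_regions(SHOW_REGION)
print(find_region(regions, 0xBB4C4))     # main:heap
print(find_region(regions, 0x02000000))  # None
```

Choosing the smallest matching range reports the sub-region (main:heap) instead of the enclosing region (main).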
If this is a new router, or if the router has been moved from one location to another, the memory chips often become loose. It's a good idea to reseat or firmly push the memory chips into the slot. Most of the time, this is sufficient for solving this type of crash.
For bus error crashes with addresses that do not fall within the show region address ranges, use the Output Interpreter Tool to decode the output of the show stacks command and identify the Cisco IOS Software bug that is causing the bus error. If you are uncertain which bug ID may match or which Cisco IOS software version contains the fix for the problem, upgrading your Cisco IOS software to the latest version in your release train is one option that often resolves the issue since this usually contains the fix for a large number of bugs.
On RISC processors, Cisco IOS Software uses virtual addresses through the use of the Translation Lookaside Buffer (TLB) that translates virtual addresses into physical addresses. The address reported by bus errors on RISC processors is therefore the virtual address as opposed to the physical address used by the 68000 processors.
The output of the show region command must be used to check the address reported by the bus error. To illustrate this, let's take the following example:
System was restarted by bus error at PC 0x60104864, address 0xC
Using the show region command output below, you can verify that 0xC is not a valid virtual address, and you can conclude that the bus error was caused by a software problem. Use the Output Interpreter Tool (registered customers only) to decode the output of the show stacks or show technical-support (from enable mode) command and identify the Cisco IOS Software bug that is causing the bus error.
Another advantage of the show region command is that it reflects the amount of memory installed on the router, which determines the memory mapping. For example, if you have 64 MB of DRAM (64 x 1024 x 1024 = 67108864 bytes = 0x4000000 bytes), the DRAM range is 0x60000000 - 0x63FFFFFF. This is confirmed with the show region command:
Router#show version | i of memory
cisco RSP2 (R4700) processor with 65536K/2072K bytes of memory.
Start End Size(b) Class Media Name
0x40000000 0x40001FFF 8192 Iomem REG qa
0x40002000 0x401FFFFF 2088960 Iomem R/W memd
0x48000000 0x48001FFF 8192 Iomem REG QA:writethru
0x50002000 0x501FFFFF 2088960 Iomem R/W memd:(memd_bitswap)
0x58002000 0x581FFFFF 2088960 Iomem R/W memd:(memd_uncached)
0x60000000 0x63FFFFFF 67108864 Local R/W main
0x60010908 0x60C80B11 13042186 IText R/O main:text
0x60C82000 0x60F5AF1F 2985760 IData R/W main:data
0x60F5AF20 0x610E35FF 1607392 IBss R/W main:BSS
0x610E3600 0x611035FF 131072 Local R/W main:fastheap
0x61103600 0x63FFFFFF 49269248 Local R/W main:heap
0x80000000 0x83FFFFFF 67108864 Local R/W main:(main_k0)
0x88000000 0x88001FFF 8192 Iomem REG QA_k0
0x88002000 0x881FFFFF 2088960 Iomem R/W memd:(memd_k0)
0xA0000000 0xA3FFFFFF 67108864 Local R/W main:(main_k1)
0xA8000000 0xA8001FFF 8192 Iomem REG QA_k1
0xA8002000 0xA81FFFFF 2088960 Iomem R/W memd:(memd_k1)
If you have a bus error at 0x65FFFFFF, the show region output takes the amount of memory into account and tells you that it's an illegal address (software bug).
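The arithmetic behind this example can be checked with a few lines of code. This sketch assumes that DRAM ("main") starts at virtual address 0x60000000, as in the RSP2 sample above. Other platforms may use a different base, so the show region output remains the authoritative source. The function name dram_range is hypothetical.

```python
DRAM_BASE = 0x60000000  # start of "main" in the RSP2 sample (assumption elsewhere)


def dram_range(megabytes, base=DRAM_BASE):
    """Return the (first, last) virtual address for the given DRAM size."""
    size = megabytes * 1024 * 1024
    return base, base + size - 1


start, end = dram_range(64)
print(hex(start), hex(end))        # 0x60000000 0x63ffffff
print(start <= 0x65FFFFFF <= end)  # False: outside the 64 MB range
```

Under the same assumption, dram_range(128) ends at 0x67FFFFFF, so the same address would fall inside DRAM on a router with 128 MB. This is why the installed memory has to be taken into account.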
Use the show region command to verify whether the address indicated by the bus error is within the address ranges used by the router.
If the address falls within a virtual address range, replace the hardware corresponding to this range.
If the address does not fall within a virtual address range, use the Output Interpreter Tool (registered customers only) to decode the output of the show stacks or the show technical-support (from enable mode) command and identify the Cisco IOS software bug that is causing the bus error.
Give serious consideration to installing the most recent maintenance release of the Cisco IOS Software train that you are currently running.
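The decision rule summarized above can be written as a short function. This is a sketch only: the region list is a hand-copied subset of the RSP2 show region sample, and the function name triage and its result strings are hypothetical.

```python
# (start, end, name) tuples copied from a subset of the RSP2 sample output.
REGIONS = [
    (0x40000000, 0x40001FFF, "qa"),
    (0x40002000, 0x401FFFFF, "memd"),
    (0x60000000, 0x63FFFFFF, "main"),
    (0x80000000, 0x83FFFFFF, "main:(main_k0)"),
    (0xA0000000, 0xA3FFFFFF, "main:(main_k1)"),
]


def triage(address, regions=REGIONS):
    """Apply the rule: inside a region -> hardware, outside -> software."""
    for start, end, name in regions:
        if start <= address <= end:
            return f"hardware suspect: {name}"
    return "software suspect: address outside every region"


print(triage(0xC))         # software suspect: address outside every region
print(triage(0x65FFFFFF))  # software suspect: address outside every region
print(triage(0x61103600))  # hardware suspect: main
```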
A special type of bus error crash is when the crash is caused by a corrupted program counter (PC). The PC value is the location of the instruction that the processor was executing when the bus error occurred. When a bus error caused by a corrupted PC occurs, the following message appears on the console:
%ALIGN-1-FATAL: Corrupted program counter
pc=0x0, ra=0x601860BC, sp=0x60924540, at=0x60224854
In this case, the PC has jumped to the address 0x0 (probably because of a null pointer), but this is not where the instruction is located. This is a software problem so there is no need to check with the show region command.
On other RISC platforms (Cisco 3600, 4500, and so forth), you get a SegV exception when jumping to an illegal PC, not a bus error.
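One way to see why pc=0x0 is suspicious is to compare the registers with the range where code is located. The sketch below borrows the main:text range from the RSP2 show region sample in this document. That pairing is an assumption for illustration only, because the corrupted PC message may come from a different router. The function names parse_registers and in_text are hypothetical.

```python
import re

# main:text range taken from the RSP2 sample (assumed for illustration).
TEXT_START, TEXT_END = 0x60010908, 0x60C80B11


def parse_registers(line):
    """Return {register: value} for every 'name=0x...' pair in the line."""
    return {
        name: int(value, 16)
        for name, value in re.findall(r"(\w+)=(0x[0-9A-Fa-f]+)", line)
    }


def in_text(address):
    """True if the address lies inside the assumed code (text) range."""
    return TEXT_START <= address <= TEXT_END


regs = parse_registers("pc=0x0, ra=0x601860BC, sp=0x60924540, at=0x60224854")
print(in_text(regs["pc"]))  # False: no instruction can be located at 0x0
print(in_text(regs["ra"]))  # True: the return address still points into code
```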
Another type of bus error crash that occurs from time to time is when the PC value is equal to the address value. For instance:
System returned to ROM by bus error at PC 0x606B34F0, address 0x606B34F0
Notice the k1 register value is 0x14 (hexadecimal) which is equal to 20 in decimal. This points to a Cache Parity Exception. In this particular case, the parity error is not handled properly and is being masked by a bus error. The router has crashed due to a software bus error in the function handling a Cache Parity Exception.
You should also consider upgrading the Cisco IOS software release to a version which has a fix for CSCdv68388 - "Change cache error exception handler to resume not crash" which has been fixed since Cisco IOS Software Release 12.2(10).
Verify that all network cards are supported by the Cisco IOS software. The Software Advisor (registered customers only) gives you the minimum versions of Cisco IOS software needed for hardware. Verify, also, that the bootflash image supports the hardware installed if you have a router that supports a boot image such as the Cisco 7200 or Cisco 7500 series router.
On 2600 and 3600 routers, the router's I/O memory is configurable as a percentage of the main memory. If the I/O memory settings are inappropriate for the installed network modules or WAN interface cards (WICs), the 2600/3600 platform may have trouble booting and may crash with bus errors.
If a software configuration change has recently been made, and the router is in a booting loop, a software bug may be causing this issue.
If the router is not able to boot up, you can bypass the configuration to identify whether that is causing the issue. Follow these steps:
Break into ROMMON by sending the break sequence to the router during the first 60 seconds of boot up.
From ROM Monitor, use the confreg command to change the configuration register to a setting, such as 0x2142, to ignore the router's configuration:
rommon 1 > confreg 0x2142
You must reset or power cycle for new config to take effect
rommon 2 > reset
If the router boots without any errors, there is a configuration issue causing the problem. Verify that your configuration is supported in the Cisco IOS software and by the hardware. If it is supported, use the Bug Toolkit (registered customers only) to identify any software bugs that you may be experiencing. Give serious consideration to installing the most recent maintenance release of the Cisco IOS software train that you are currently running.
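The effect of the 0x2142 setting comes from a single bit. On Cisco IOS routers, bit 6 (0x0040) of the configuration register tells the router to ignore the startup configuration in NVRAM. The sketch below tests and clears that bit. The function name ignores_startup_config is hypothetical, and 0x2102 is shown only because it is the commonly used default value. Verify the correct value for your platform before you change it back.

```python
IGNORE_CONFIG_BIT = 0x0040  # bit 6: ignore the startup configuration in NVRAM


def ignores_startup_config(confreg):
    """True if the configuration register value bypasses the saved config."""
    return bool(confreg & IGNORE_CONFIG_BIT)


print(ignores_startup_config(0x2142))    # True
print(ignores_startup_config(0x2102))    # False
print(hex(0x2142 & ~IGNORE_CONFIG_BIT))  # 0x2102
```

While bit 6 remains set, the router ignores its saved configuration on every boot.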
If you are experiencing a bus error exception booting loop, it may be caused by mis-seated hardware. For lower-end platforms such as the 3600 or 4000 router, reseat the network modules/network processors.
For higher-end platforms such as the 7200 or 7500 routers, reseat the processor, VIP, port adapters, or line card that is reloading due to a bus error exception.
The information contained in the bus error does not help to isolate the hardware. Therefore, it is important to remove and reinsert cards to find the problem hardware. Here are some recommended steps to isolate the problem:
If the router does not experience the continuous loop after following the troubleshooting steps above, then the problem may have been caused by a mis-seated network module. It is recommended that you monitor the router for 24 hours to be sure that the router continues to function without experiencing the issue again.
If you still need assistance after following the troubleshooting steps above and want to open a case with Cisco Technical Support, be sure to include the following information for troubleshooting a bus error or bus error exception:
Troubleshooting performed before opening the case
show technical-support output (if possible, in enable mode)
show log output or console captures, if available
crashinfo file (if present and not already included in the show technical-support output)
show region output (if not already included in the show technical-support output)
Attach the collected data to your case in non-zipped, plain text format (.txt). You can attach information to your case by uploading it using the Case Query Tool (registered customers only). If you cannot access the Case Query Tool, you can attach the relevant information to your case by sending it to firstname.lastname@example.org with your case number in the subject line of your message.
Note: Do not manually reload or power-cycle the router before collecting the above information unless this is required to troubleshoot a bus error exception. A reload or power cycle can erase information that is needed to determine the root cause of the problem.