This document describes how to analyze commonly seen hardware failure symptoms on the Cisco ASR 903 Series Aggregation Services Router (ASR 903) and how to troubleshoot them.
Cisco recommends that you have basic knowledge of these topics:
Cisco IOS-XE software
ASR 903 CLI
The information in this document was created from devices in a specific lab environment where failure symptoms were observed. All of the devices used in this document started with a cleared (default) configuration. If your network is live, make sure that you understand the potential impact of any command.
The Cisco ASR 903 Router is a fully-featured aggregation platform designed for the cost-effective delivery of converged mobile and business services. With shallow depth, low power consumption, and an extended temperature range, this compact 3 Rack-Unit (RU) router provides high service scale, full redundancy, and flexible hardware configuration. The Cisco ASR 903 Router is positioned as a pre-aggregation router in IP Radio Access Network (RAN) networks or an aggregation router in Carrier Ethernet networks.
The platform comprises the following major Field Replaceable Units (FRUs), as depicted in the figure below:
Interface modules (IM)
Two Route Switch Processor (RSP) unit slots, which support the RSP1A-55, RSP1B-55, RSP2A-64, and RSP2A-128
Redundant DC power units
During normal operation, any of the Field Replaceable Units (FRUs) can exhibit failure symptoms. Often this leads to replacement of a hardware component that has not actually failed. By following the troubleshooting techniques below, you can often recover these modules from their failure state and thereby reduce network downtime.
Failure reported by DC Power Supply (A900-PWR550-D)
Measure the input DC voltage at the DC Power Supply Unit (PSU) connector with a multimeter to verify the power source. The reading should be in the range of 24 V to 60 V.
If the input voltage reading is OK, check the status of the LEDs on the panel (‘Input OK’ and ‘Output Fail’). If both LEDs are off, replace the DC PSU.
If the ‘Input OK’ LED is green but the ‘Output Fail’ LED is amber or red, first remove the input power connector and then pull the DC PSU completely out of the chassis. Wait 15 seconds, reinsert the DC PSU, and reconnect the input power connector. Repeat this procedure for both DC PSUs if the system has two.
If the ‘Input OK’ LED is green and the ‘Output Fail’ LED is not lit at all, replace the DC PSU.
Note: The router can operate with a single power supply. The secondary power supply unit must remain physically inserted even if it is not powered on.
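The DC PSU checks above amount to a small decision table. The following Python sketch encodes it for illustration only. The function name, the LED value strings ("green", "amber", "red", "off"), and the returned action labels are assumptions made for this example and are not part of any Cisco tool.

```python
# Hypothetical helper: maps multimeter and LED observations to the
# action described in the DC PSU procedure above.
MIN_INPUT_V = 24.0  # lower bound of the expected input range
MAX_INPUT_V = 60.0  # upper bound of the expected input range


def dc_psu_action(input_voltage, input_ok_led, output_fail_led):
    """Return the recommended action for one DC PSU.

    LED arguments are lowercase strings: "green", "amber", "red" or "off".
    """
    # Step 1: the power source itself must be within range.
    if not (MIN_INPUT_V <= input_voltage <= MAX_INPUT_V):
        return "check-power-source"
    # Step 2: both LEDs off with good input voltage.
    if input_ok_led == "off" and output_fail_led == "off":
        return "replace"
    # Step 3: input good, output fault indicated: reseat after 15 seconds.
    if input_ok_led == "green" and output_fail_led in ("amber", "red"):
        return "reseat"
    # Step 4: input good, Output Fail LED not lit at all.
    if input_ok_led == "green" and output_fail_led == "off":
        return "replace"
    # Any other combination is not covered by the procedure.
    return "not-covered"


if __name__ == "__main__":
    print(dc_psu_action(48.0, "green", "red"))
```

Combinations that the procedure does not mention return "not-covered" rather than a guess, so they can be escalated instead of acted on.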
Failure reported by Fan Tray
The Cisco ASR 903 Router uses a modular fan tray that is separate from the power supply. The fan tray contains twelve fans and provides sufficient capacity to maintain operation even in the event of a fan failure. There are two types of fan tray modules (A903-FAN and A903-FAN-E), chosen according to the environment in which the router is used. The latter (A903-FAN-E) comes with an 8 mm fan dust filter, which prevents dust from entering the unit and avoids possible damage to the components.
Scenario 1: Individual Fan module in the Tray has failed
Use the command "show platform" or "show facility-alarm status" to determine the status of the fans in the tray. In the event of a fan failure, the fan tray state is displayed as "fail" along with the individual units that have failed.
show platform | in FAN|State
Chassis type: ASR-903
Slot      Type          State               Insert time (ago)
P2        A903-FAN-E    f2, f4, f6, fail
sh facility-alarm status
System Totals  Critical: 1  Major: 3  Minor: 0
Source      Severity    Description [Index]
Fan Tray    CRITICAL    Multiple Fan Failures
Fan Tray    MAJOR       Fan 2 Failure
Fan Tray    MAJOR       Fan 4 Failure
Fan Tray    MAJOR       Fan 6 Failure
These outputs show that the fan modules in slots f2, f4, and f6 have failed and need to be replaced.
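When many routers have to be checked, the failed fan numbers can be extracted from saved "show facility-alarm status" output with a short script. The Python sketch below is illustrative only: the function name and regular expression are assumptions based solely on the alarm lines shown above, and other software releases may format these lines differently.

```python
import re

# Matches per-fan alarm lines such as "Fan Tray MAJOR Fan 2 Failure".
# The pattern is inferred from the sample output in this document.
FAN_ALARM = re.compile(r"Fan Tray\s+(?:CRITICAL|MAJOR|MINOR)\s+Fan (\d+) Failure")

SAMPLE = """\
Fan Tray    CRITICAL    Multiple Fan Failures
Fan Tray    MAJOR       Fan 2 Failure
Fan Tray    MAJOR       Fan 4 Failure
Fan Tray    MAJOR       Fan 6 Failure
"""


def failed_fans(alarm_output):
    """Return the sorted list of fan numbers reported as failed."""
    return sorted({int(m.group(1)) for m in FAN_ALARM.finditer(alarm_output)})


if __name__ == "__main__":
    print(failed_fans(SAMPLE))
```

The summary line "Multiple Fan Failures" carries no fan number, so it is deliberately not matched.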
Scenario 2: Fan Tray reported as "Unknown"
In some cases, the Fan Tray may be reported as "Unknown" in the "show platform" output and the Network Management System (NMS) station may generate an alarm as well.
sh platform | in P2
Chassis type: ASR-903
Slot      Type       State    Insert Time (ago)
P2        Unknown    N/A      never
Perform the following steps which may help recover the module:
Perform a physical reseat of the FAN module. Allow at least 2 minutes for the system to reinitialize after the fan tray has been removed or replaced. If you are using the model “A903-FAN-E” with dust filter, try cleaning the filter to make sure it is not clogging the FAN modules.
Power cycle the router and verify whether the fan tray is detected.
If the FAN tray is still reporting “unknown”, a replacement may be required to resolve the issue.
Note: There is a known cosmetic defect which is documented in CSCuu75796 where the FAN tray will be reported as unknown. To avoid erroneous failure messages, allow at least 2 minutes for the system to reinitialize after the fan tray has been removed or replaced.
Failure reported by RSP
Scenario 1: RSP is reported as Unknown
show platform | in R1
Chassis type: ASR-903
Slot      Type             State      Insert Time (ago)
R1        A903-RSP1B-55    unknown    1d01h
Execute the command “hw-module slot R1 reload” and verify if the processor is initializing.
If the standby RSP toggles between the “booting” and “unknown” states without transitioning to the “init,standby” state, the most likely cause is a missing IOS-XE image in the local bootflash.
Use a USB flash drive with a valid IOS-XE image to boot the RSP. If the module remains in the "unknown" state, perform a physical reseat of the module.
If all the above steps fail, collect console logs from the RSP module and open a service request with TAC.
Scenario 2: Standby RSP toggles between "booting" and "init,standby" state
A common reason for the standby RSP module to exhibit this behavior is a configuration sync failure between the active and standby RSP. The following commands should be executed to verify this:
If the RSP module remains in a boot loop, check the device logs for link errors such as those shown below. If such errors are present and a physical reseat does not clear them, the RSP module may need to be replaced.
%IOSXE-3-PLATFORM: R0/0: kernel: pciehp 0000:02:07.0:pcie24: Link Training Error occurs
%IOSXE-3-PLATFORM: R0/0: kernel: pciehp 0000:02:07.0:pcie24: Failed to check link status
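The two standby RSP scenarios above are distinguished by the sequence of states that the standby slot passes through. The Python sketch below shows one way to classify a series of state values sampled from repeated "show platform" output. The function name, the diagnosis labels, and the idea of sampling states into a list are assumptions made for this illustration, not a Cisco-provided method.

```python
def diagnose_standby_rsp(states):
    """Classify a chronological list of standby RSP states.

    `states` holds values such as "unknown", "booting" or "init,standby",
    sampled over time by the operator (hypothetical input format).
    """
    seen = set(states)
    # Scenario 1, first step: the slot never leaves "unknown".
    if seen == {"unknown"}:
        return "reload-slot"
    # Scenario 1: toggles between booting and unknown, never reaching
    # "init,standby"; a missing IOS-XE image in bootflash is likely.
    if {"booting", "unknown"} <= seen and "init,standby" not in seen:
        return "missing-image-suspected"
    # Scenario 2: the RSP reaches "init,standby" and then falls back to
    # "booting"; a configuration sync failure is a common cause.
    for earlier, later in zip(states, states[1:]):
        if earlier == "init,standby" and later == "booting":
            return "config-sync-failure-suspected"
    return "no-known-pattern"


if __name__ == "__main__":
    print(diagnose_standby_rsp(["booting", "unknown", "booting", "unknown"]))
```

A single transition from "booting" to "init,standby" is normal start-up behavior, so it is reported as "no-known-pattern" rather than as a fault.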
Interface Module (IM) fails to initialize
Whenever a module is installed, the IM transitions through specific states (out of service -> inserted -> booting -> OK). If an Interface Module (IM) in any of the six available slots fails to progress past the booting state, perform the following steps:
Chassis type: ASR-903
Slot      Type          State               Insert Time (ago)
0/4       A900-IMA8S    inserted/unknown    00:27:02 (physical)
Reload the affected module with the command "hw-module subslot <slot/subslot> reload". Verify whether the module has recovered.
ASR903#hw-module subslot 0/1 reload
Proceed with reload of module? [confirm]
%IOSXE_OIR-6-SOFT_RELOADSPA: SPA(A900-IMA1X) reloaded on subslot 0/1
Physically reseat the module in the same slot. If the module stays "unknown", try inserting it in another slot to rule out a faulty slot on the chassis.
Observe the logs and watch for any kernel/link errors as indicated below:
%IOSXE-3-PLATFORM: R0/0: kernel:pciehp 0000:02:07.0:pcie24: Link Training Error occurs
%IOSXE-3-PLATFORM: R0/0: kernel:pciehp 0000:02:07.0:pcie24: Failed to check link status
A "link training" error indicates a communication failure on the Peripheral Component Interconnect Express (PCIe) bus for a particular slot. The PCIe hot-plug module is hosted on the RSP. Perform an RSP switchover so that the modules register with the PCIe bus of the standby RSP. If the module recovers after the switchover, the previously active RSP module needs to be replaced.
ASR903#redundancy force-switchover
Proceed with switchover to standby RP? [confirm]
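Before forcing a switchover, it can help to confirm which PCIe port is reporting the link errors. The Python sketch below scans saved log text for the two kernel messages quoted in this document. The function name and regular expression are assumptions derived only from those two sample lines, so other message variants will not be matched.

```python
import re

# Pattern inferred from the sample messages in this document; the space
# after "kernel:" is optional because both forms appear.
PCIE_LINK_ERROR = re.compile(
    r"%IOSXE-3-PLATFORM: (\S+): kernel:\s*pciehp (\S+):(pcie\d+): "
    r"(Link Training Error occurs|Failed to check link status)"
)

SAMPLE_LOG = """\
%IOSXE-3-PLATFORM: R0/0: kernel:pciehp 0000:02:07.0:pcie24: Link Training Error occurs
%IOSXE-3-PLATFORM: R0/0: kernel: pciehp 0000:02:07.0:pcie24: Failed to check link status
"""


def pcie_link_errors(log_text):
    """Return (reporting slot, PCI address, port, message) for each match."""
    return [m.groups() for m in PCIE_LINK_ERROR.finditer(log_text)]


if __name__ == "__main__":
    for slot, address, port, message in pcie_link_errors(SAMPLE_LOG):
        print(slot, address, port, message)
```

Repeated hits on the same port after a module reload and reseat point toward the RSP switchover test described above.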
Note: For further assistance, open a service request with the Cisco Technical Assistance Center (TAC) and include details of the troubleshooting already done as well as the ‘show tech-support’ output from the router.