Troubleshoot Cisco ASR 1000 Series Aggregation Services Routers Crashes

Document ID: 109723

Updated: Mar 17, 2009

Introduction

This document provides information on how to troubleshoot crashes on the Cisco® ASR 1000 Series Aggregation Services Routers.

Prerequisites

Requirements

There are no specific requirements for this document.

Components Used

The information in this document is based on these software and hardware versions:

  • All Cisco ASR 1000 Series Aggregation Services Routers, including the 1002, 1004, and 1006.

  • All Cisco IOS XE Software versions that support the Cisco ASR 1000 Series Aggregation Services Routers.

The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, make sure that you understand the potential impact of any command.

Conventions

Refer to Cisco Technical Tips Conventions for more information on document conventions.

Cisco ASR 1000 Series Aggregation Services Routers Crashes

Types of Crashes

The Cisco ASR 1000 Series Aggregation Services Routers introduce Cisco IOS XE Software as their software architecture. Based on Cisco IOS Software, Cisco IOS XE Software is a modular operating system that runs on a Linux kernel on each Route Processor (RP), Embedded Services Processor (ESP), and SPA Interface Processor (SIP). Because the IOS daemon (IOSD) and the other Cisco IOS XE processes run on the Linux kernel, several types of crashes can occur on these routers. Table 1 lists them.

Table 1 – Types of Crashes

Types of Crashes | Module | Description
IOSD Crash | RP | Cisco IOS Software runs as IOSD on a Linux kernel on the RP.
SPA Driver Crash | SIP | Limited Cisco IOS Software runs on the SIP to control the SPAs.
Cisco IOS XE Process Crash | RP, ESP, SIP | Several Cisco IOS XE processes run on a Linux kernel. For example, the chassis manager, the forwarding manager, and the interface manager run on the RP.
Cisco Quantum Flow Processor (QFP) Microcode Crash | ESP | The microcode runs on the QFP. The QFP is the packet-forwarding ASIC on the ESP.
Linux Kernel Crash | RP, ESP, SIP | A Linux kernel runs on the RP, ESP, and SIP.

Get Information About the Crash

If a module reloads unexpectedly, make sure that the console output, the crashinfo file directory, and the core dump file directory are available for troubleshooting. The first step in determining the cause is to capture as much information about the problem as possible. This information is necessary to determine the cause of the problem:

  • Console logs — For more information, see Applying Correct Terminal Emulator Settings for Console Connections.

  • Syslog information — If you have set the router up to send logs to a syslog server, you are able to obtain information about what happened. For details, see How to Configure Cisco Devices for Syslog.

  • show platform — The show platform command displays the status for RPs, ESPs, SPAs, and the power supplies.

  • show tech-support — The show tech-support command is a compilation of many different commands that include show version and show running-config. When a router runs into problems, the Cisco Technical Assistance Center (TAC) engineer usually asks for this information to troubleshoot the hardware issue. You must collect the show tech-support before you do a reload or power-cycle because these actions can cause a loss of information about the problem.

    Note: The show tech-support command does not include the show platform or show logging commands.

  • Boot Sequence Information — The complete bootup sequence if the router experiences boot errors.

  • Crashinfo file (if available) — See the Crashinfo File section.

  • Core Dump file (if available) — See the Core Dump File section.

  • Tracelog file (if available) — On the Cisco ASR 1000 Series Aggregation Services Routers, the trace logs of Cisco IOS XE processes are generated under harddisk:tracelogs (ASR 1006 or ASR 1004) or bootflash:tracelogs (ASR 1002) on the active RP. When a Cisco IOS XE process crashes, the Cisco TAC engineer usually asks you to collect this information in order to troubleshoot the issue.

Crashinfo File

When the IOSD or SPA driver crashes, a crashinfo file is generated under the location shown in Table 2.

Table 2 – Crashinfo File Location

Models | Types of Crashes | Crashinfo File Location
ASR 1002 | IOSD Crash, SPA Driver Crash | bootflash: on the RP
ASR 1004, ASR 1006 | IOSD Crash | bootflash: on the RP
ASR 1004, ASR 1006 | SPA Driver Crash | harddisk: on the RP

Table 3 displays the crashinfo file names.

Table 3 – Crashinfo File Name

Types of Crashes | Crashinfo File Name | Example
IOSD Crash | crashinfo_RP_SlotNumber_00_Date-Time-Zone | crashinfo_RP_00_00_20080807-063430-UTC
SPA Driver Crash | crashinfo_SIP_SlotNumber_00_Date-Time-Zone | crashinfo_SIP_00_00_20080828-084907-UTC
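
The naming pattern in Table 3 can be split into its fields with a short script. This is a minimal sketch, not a Cisco tool: the function name parse_crashinfo_name and the returned keys are hypothetical, and the Date and Time fields are assumed to be YYYYMMDD and hhmmss, as the examples in Table 3 suggest.

```python
import re
from datetime import datetime

# Pattern taken from Table 3: crashinfo_<RP|SIP>_<SlotNumber>_00_<Date>-<Time>-<Zone>
# Assumption: Date is YYYYMMDD and Time is hhmmss, as in the documented examples.
CRASHINFO_RE = re.compile(
    r"^crashinfo_(?P<module>RP|SIP)_(?P<slot>\d+)_\d+_"
    r"(?P<date>\d{8})-(?P<time>\d{6})-(?P<zone>[A-Za-z]+)$"
)


def parse_crashinfo_name(name):
    """Return the fields of a crashinfo file name, or None if it does not match."""
    # Drop a device prefix such as "bootflash:" if one is present.
    base = name.rsplit(":", 1)[-1]
    m = CRASHINFO_RE.match(base)
    if m is None:
        return None
    return {
        # Per Table 3, RP crashinfo files come from IOSD and SIP files from the SPA driver.
        "crash_type": "IOSD Crash" if m["module"] == "RP" else "SPA Driver Crash",
        "module": m["module"],
        "slot": int(m["slot"]),
        "timestamp": datetime.strptime(m["date"] + m["time"], "%Y%m%d%H%M%S"),
        "zone": m["zone"],
    }


if __name__ == "__main__":
    print(parse_crashinfo_name("crashinfo_RP_00_00_20080807-063430-UTC"))
```

The timestamp is kept as a naive datetime, with the zone label returned separately, because the file name carries only a zone abbreviation.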

Core Dump File

When a process crashes, you can find a core dump file under the location shown in Table 4. A core dump is a full copy of the memory image of the process. It is recommended that you save the core dump files until troubleshooting is done. This is because a core dump includes much more information about a crash problem than a crashinfo file, and it is needed for deep investigation. In the case of the Cisco ASR 1002 Router, since it does not have a harddisk: device, a core dump file is generated under bootflash:core/.

Table 4 – Core Dump File Location

Models | Core Dump File Location
ASR 1002 | bootflash:core/ on the RP
ASR 1004, ASR 1006 | harddisk:core/ on the RP

Core dumps of ESP and SIP processes are generated under the same location as core dumps of RP processes. In the case of the Cisco ASR 1006 Router, you must also check the same location on the standby RP, because after a switchover the standby RP is the one that was active when the problem occurred.
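
The locations given in Table 2, Table 4, and the tracelog description can be gathered into one lookup, which is convenient when you build a list of directories to inspect. This is only a sketch: the dictionary layout and the function name dir_commands are hypothetical, and the paths are copied from this document rather than read from a router.

```python
# File locations per model, copied from Table 2, Table 4, and the tracelog bullet.
# The structure and key names are illustrative assumptions, not a Cisco data model.
LOCATIONS = {
    "ASR 1002": {
        "iosd_crashinfo": "bootflash:",
        "spa_crashinfo": "bootflash:",
        "core": "bootflash:core/",
        "tracelogs": "bootflash:tracelogs",
    },
    "ASR 1004": {
        "iosd_crashinfo": "bootflash:",
        "spa_crashinfo": "harddisk:",
        "core": "harddisk:core/",
        "tracelogs": "harddisk:tracelogs",
    },
    "ASR 1006": {
        "iosd_crashinfo": "bootflash:",
        "spa_crashinfo": "harddisk:",
        "core": "harddisk:core/",
        "tracelogs": "harddisk:tracelogs",
    },
}


def dir_commands(model):
    """Return the 'dir' commands to run on the RP for the given model, without duplicates."""
    paths = LOCATIONS[model].values()
    # dict.fromkeys keeps the first occurrence of each path and preserves order.
    return ["dir " + path for path in dict.fromkeys(paths)]


if __name__ == "__main__":
    for model in LOCATIONS:
        print(model, dir_commands(model))
```

On an ASR 1006, run the resulting commands on both RPs, for the reason given above.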

Table 5 – Core Dump File Name

Types of Crashes | Core Dump File Name | Example
IOSD Crash | hostname_RP_SlotNumber_ppc_linux_iosd-_ProcessID.core.gz | Router_RP_0_ppc_linux_iosd-_17407.core.gz
SPA Driver Crash | hostname_SIP_SlotNumber_mcpcc-lc-ms_ProcessID.core.gz | Router_SIP_1_mcpcc-lc-ms_6098.core.gz
IOS XE Process Crash | hostname_FRU_SlotNumber_ProcessName_ProcessID.core.gz | Router_RP_0_fman_rp_28778.core.gz, Router_ESP_1_cpp_cp_svr_4497.core.gz
Cisco QFP Crash | hostname_ESP_SlotNumber_cpp-mcplo-ucode_ID.core.gz | Router_ESP_0_cpp-mcplo-ucode_042308082102.core.gz
Linux Kernel Crash | hostname_FRU_SlotNumber_kernel.core | Router_ESP_0_kernel.core
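
Because each crash type in Table 5 has a distinctive file name, the type can be inferred from the name alone. The sketch below is one possible way to do that; the function name classify_core_file and the regular expressions are assumptions derived from the table, and the kernel rule also accepts a numeric suffix after "kernel" because the example in the Linux Kernel Crash section has one.

```python
import re

# Rules derived from Table 5, checked in order. The specific process names come first
# so that the generic "IOS XE Process Crash" rule only catches what is left over.
RULES = [
    ("IOSD Crash", re.compile(r"_RP_\d+_ppc_linux_iosd-_\d+\.core\.gz$")),
    ("SPA Driver Crash", re.compile(r"_SIP_\d+_mcpcc-lc-ms_\d+\.core\.gz$")),
    ("Cisco QFP Crash", re.compile(r"_ESP_\d+_cpp-mcplo-ucode_\d+\.core\.gz$")),
    # Assumption: a kernel core may carry a numeric suffix, e.g. _kernel_20081218161940.core
    ("Linux Kernel Crash", re.compile(r"_(RP|ESP|SIP)_\d+_kernel(_\d+)?\.core$")),
    ("IOS XE Process Crash", re.compile(r"_(RP|ESP|SIP)_\d+_[A-Za-z0-9_-]+_\d+\.core\.gz$")),
]


def classify_core_file(name):
    """Return the crash type suggested by a core dump file name, or None if unknown."""
    for crash_type, pattern in RULES:
        if pattern.search(name):
            return crash_type
    return None


if __name__ == "__main__":
    for example in (
        "Router_RP_0_ppc_linux_iosd-_17407.core.gz",
        "Router_ESP_1_cpp_cp_svr_4497.core.gz",
        "Router_ESP_0_kernel.core",
    ):
        print(example, "->", classify_core_file(example))
```

A hostname that itself contains a string such as "_RP_0_" could confuse these rules, so treat the result as a hint and confirm it against the console output.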

IOSD Crash

The IOS Daemon (IOSD) runs as its own Linux process (ppc_linux_iosd-) on the RP. In dual IOS mode (Cisco ASR 1002 Router and Cisco ASR 1004 Router only), two IOSDs run on the RP.

In order to identify an IOSD crash, look for exception output like the example below on the console. What happens next depends on the platform:

  • Cisco ASR 1002 Router or Cisco ASR 1004 Router without dual IOS mode: the router reloads.

  • Cisco ASR 1002 Router or Cisco ASR 1004 Router with dual IOS mode: the IOSD switches over on the RP.

  • Cisco ASR 1006 Router: the RP switches over and the new standby RP reloads.

Exception to IOS Thread:
Frame pointer 2C111978, PC = 1029ED60

ASR1000-EXT-SIGNAL: U_SIGSEGV(11), Process = Exec
-Traceback= 1#106b90f504fce8544ce4979667ec2d5d  
   :10000000+29ED60 :10000000+29ECB4 :10000000+2A1A9C 
:10000000+2A1DAC :10000000+492438 :10000000+1C22DC0 
   :10000000+4BBBE0 

Fastpath Thread backtrace: 
-Traceback= 1#106b90f504fce8544ce4979667ec2d5d  
   c:BC16000+C2AF0 c:BC16000+C2AD0 
iosd_unix:BD73000+111DC pthread:BA1B000+5DA0 

Auxiliary Thread backtrace: 
-Traceback= 1#106b90f504fce8544ce4979667ec2d5d  
   pthread:BA1B000+95E4 pthread:BA1B000+95C8 
c:BC16000+D7294 iosd_unix:BD73000+1A83C 
   pthread:BA1B000+5DA0 

PC  = 0x1029ED60  LR  = 0x1029ECB4  MSR = 0x0002D000
CTR = 0x0BD83C2C  XER = 0x20000000
R0  = 0x00000000  R1  = 0x2C111978  R2  = 0x2C057890  R3  = 0x00000034
R4  = 0x000000B4  R5  = 0x0000003C  R6  = 0x2C111700  R7  = 0x00000000
R8  = 0x12B04780  R9  = 0x00000000  R10 = 0x2C05048C  R11 = 0x00000050
R12 = 0x22442082  R13 = 0x13B189AC  R14 = 0x00000000  R15 = 0x00000000
R16 = 0x00000000  R17 = 0x00000001  R18 = 0x00000000  R19 = 0x00000000
R20 = 0x00000000  R21 = 0x00000000  R22 = 0x00000000  R23 = 0x00000001
R24 = 0x00000001  R25 = 0x34409AD4  R26 = 0x00000000  R27 = 0x2CE88448
R28 = 0x00000001  R29 = 0x00000000  R30 = 0x3467A0FC  R31 = 0x2C1119B8

Writing crashinfo to bootflash:crashinfo_RP_00_00_20080904-092940-UTC
Buffered messages: (last 4096 bytes only)
...
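
When you work from a saved console capture, the key facts in this output (the signal, the IOS process, and the crashinfo path) can be pulled out with a few regular expressions. The following is a hedged sketch: summarize_iosd_crash is a hypothetical helper, and it assumes the capture contains the same line formats as the example above.

```python
import re

# Line formats are taken from the example console output in this section.
SIGNAL_RE = re.compile(
    r"ASR1000-EXT-SIGNAL:\s*(?P<signal>\w+)\((?P<number>\d+)\),\s*Process\s*=\s*(?P<process>.+)"
)
CRASHINFO_RE = re.compile(r"Writing crashinfo to (?P<path>\S+)")


def summarize_iosd_crash(console_text):
    """Return a summary dict if the text shows an IOSD exception, otherwise None."""
    if "Exception to IOS Thread" not in console_text:
        return None
    summary = {"signal": None, "signal_number": None, "process": None, "crashinfo": None}
    sig = SIGNAL_RE.search(console_text)
    if sig:
        summary["signal"] = sig["signal"]
        summary["signal_number"] = int(sig["number"])
        summary["process"] = sig["process"].strip()
    info = CRASHINFO_RE.search(console_text)
    if info:
        summary["crashinfo"] = info["path"]
    return summary


if __name__ == "__main__":
    capture = (
        "Exception to IOS Thread:\n"
        "Frame pointer 2C111978, PC = 1029ED60\n\n"
        "ASR1000-EXT-SIGNAL: U_SIGSEGV(11), Process = Exec\n"
        "Writing crashinfo to bootflash:crashinfo_RP_00_00_20080904-092940-UTC\n"
    )
    print(summarize_iosd_crash(capture))
```

The traceback and register values are deliberately left untouched, since decoding them requires the matching software image and is normally done by the Cisco TAC.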

When the IOSD crashes, the crashinfo file and core dump file are generated on the RP.

Router#dir bootflash:
Directory of bootflash:

bootflash:crashinfo_RP_00_00_20080904-092940-UTC


Router#dir harddisk:core
Directory of harddisk:core/

3620877  -rw-    10632280   Sep 4 2008 09:31:00 +00:00  
   Router_RP_0_ppc_linux_iosd-_17407.core.gz
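
Captured dir output like the listing above can be turned into structured entries, which helps when a capture contains many files. This is a minimal sketch under the assumption that each entry has the fields shown in this document (index, permissions, size, date, UTC offset, name); parse_dir_output is a hypothetical name, and the pattern tolerates a file name wrapped onto the next line, as in the examples here.

```python
import re

# One entry: index, permissions, size, date/time, UTC offset, file name.
# \s+ also matches a line break, so a name wrapped onto the next line is still found.
ENTRY_RE = re.compile(
    r"(?P<index>\d+)\s+(?P<perms>[-a-z]{4})\s+(?P<size>\d+)\s+"
    r"(?P<date>[A-Z][a-z]{2}\s+\d{1,2}\s+\d{4}\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<offset>[+-]\d{2}:\d{2})\s+(?P<name>\S+)"
)


def parse_dir_output(text):
    """Return a list of dicts, one per file entry found in captured 'dir' output."""
    entries = []
    for m in ENTRY_RE.finditer(text):
        entries.append({
            "index": int(m["index"]),
            "perms": m["perms"],
            "size": int(m["size"]),
            # Collapse repeated spaces but keep the date as text.
            "date": " ".join(m["date"].split()),
            "offset": m["offset"],
            "name": m["name"],
        })
    return entries


if __name__ == "__main__":
    listing = (
        "Directory of harddisk:core/\n\n"
        "3620877  -rw-    10632280   Sep 4 2008 09:31:00 +00:00  \n"
        "   Router_RP_0_ppc_linux_iosd-_17407.core.gz\n"
    )
    for entry in parse_dir_output(listing):
        print(entry["name"], entry["size"])
```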

SPA Driver Crash

The SPA driver provides limited IOS functions for SPA control. It runs on the SIP as the mcpcc-lc-ms process, which is one of the Cisco IOS XE processes. You can identify a SPA driver crash if you find that the mcpcc-lc-ms process has been held down. After the SPA driver crashes, the SPA reloads.

Aug 28 08:52:12.418: %PMAN-3-PROCHOLDDOWN: SIP0: 
   pman.sh:  The process mcpcc-lc-ms has been helddown (rc 142)
Aug 28 08:52:12.425: %ASR1000_OIR-6-REMSPA: 
   SPA removed from subslot 0/0, interfaces disabled
Aug 28 08:52:12.427: %SPA_OIR-6-OFFLINECARD: 
   SPA (SPA-1X10GE-L-V2) offline in subslot 0/0
Aug 28 08:52:13.131: %ASR1000_OIR-6-INSSPA: 
   SPA inserted in subslot 0/0
Aug 28 08:52:19.060: %LINK-3-UPDOWN: SIP0/0: 
   Interface EOBC0/1, changed state to up
Aug 28 08:52:20.064: %SPA_OIR-6-ONLINECARD: 
   SPA (SPA-1X10GE-L-V2) online in subslot 0/0
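
The hold-down message above can be located in a saved log with a regular expression. The sketch below assumes the message format shown in this document; find_held_down_processes is a hypothetical helper name, and the spa_driver flag simply checks for the mcpcc-lc-ms process name mentioned above.

```python
import re

# Format from the example: %PMAN-3-PROCHOLDDOWN: <module>: pman.sh:  The process <name>
# has been helddown (rc <code>). \s* also spans the line wrap seen in the capture.
HOLDDOWN_RE = re.compile(
    r"%PMAN-3-PROCHOLDDOWN:\s*(?P<module>\w+):\s*pman\.sh:\s*"
    r"The process (?P<process>\S+) has been helddown \(rc (?P<rc>\d+)\)"
)


def find_held_down_processes(log_text):
    """Return one dict per PROCHOLDDOWN message found in the log text."""
    results = []
    for m in HOLDDOWN_RE.finditer(log_text):
        results.append({
            "module": m["module"],
            "process": m["process"],
            "rc": int(m["rc"]),
            # mcpcc-lc-ms is the SPA driver process named in this section.
            "spa_driver": m["process"] == "mcpcc-lc-ms",
        })
    return results


if __name__ == "__main__":
    log = (
        "Aug 28 08:52:12.418: %PMAN-3-PROCHOLDDOWN: SIP0: \n"
        "   pman.sh:  The process mcpcc-lc-ms has been helddown (rc 142)\n"
    )
    print(find_held_down_processes(log))
```

The same function also reports hold-downs of other Cisco IOS XE processes, in which case spa_driver is False.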

When the SPA driver crashes, the crashinfo file and core dump file are generated on the RP.

Router#dir harddisk:
Directory of harddisk:/

   14  -rw-      224579  Aug 28 2008 08:52:06 +00:00  
   crashinfo_SIP_00_00_20080828-085206-UTC

Router#dir harddisk:core
Directory of harddisk:/core/

4653060  -rw-     1389762  Aug 28 2008 08:52:12 +00:00  
   Router_SIP_0_mcpcc-lc-ms_6985.core.gz

Cisco IOS XE Process Crash

The Cisco IOS XE processes run on a Linux kernel on RP, ESP, and SIP. Table 6 lists their main processes. If a crash occurs, the module reloads.

Table 6 – Main Cisco IOS XE Processes

Title | Process Name | Module
Chassis Manager | cmand | RP
Chassis Manager | cman_fp | ESP
Chassis Manager | cmcc | SIP
Environmental Monitoring | emd | RP, ESP, SIP
Forwarding Manager | fman_rp | RP
Forwarding Manager | fman_fp_image | ESP
Host Manager | hman | RP, ESP, SIP
Interface Manager | imand | RP
Interface Manager | imccd | SIP
Logging Manager | plogd | RP, ESP, SIP
Pluggable Service | psd | RP
QFP Client Control Process | cpp_cp_svr | ESP
QFP Driver Process | cpp_driver | ESP
QFP HA Server | cpp_ha_top_level_server | ESP
QFP Client Service Process | cpp_sp_server | ESP
Shell Manager | smand | RP

If the cpp_cp_svr process crashes on an ESP of the Cisco ASR 1006 Router, messages like these can appear on the console.

Jan 24 23:37:06.644 JST: %PMAN-3-PROCHOLDDOWN: 
   F0: pman.sh:  The process cpp_cp_svr has been helddown (rc 134)
Jan 24 23:37:06.727 JST: %PMAN-0-PROCFAILCRIT: F0: pvp.sh:  
   A critical processcpp_cp_svr has failed (rc 134)
Jan 24 23:37:11.539 JST: %ASR1000_OIR-6-OFFLINECARD: 
   Card (fp) offline in slot F0

You can find the core dump file on harddisk:core/.

Router#dir harddisk:core
Directory of harddisk:/core/

1032194  -rw-    38255956  Jan 24 2009 23:37:06 +09:00  
   Router_ESP_0_cpp_cp_svr_4714.core.gz

The tracelog of the process can include useful outputs.

Router#dir harddisk:tracelogs/cpp_cp*
Directory of harddisk:tracelogs/

4456753  -rwx       24868  Jan 24 2009 23:37:15 +09:00  
   cpp_cp_F0-0.log.4714.20090124233714

Cisco Quantum Flow Processor Microcode Crash

Cisco designed the Cisco Quantum Flow Processor (QFP) as both a hardware and a software architecture. The first generation resides on two pieces of silicon; later generations can be single-chip solutions that adhere to the same software architecture. The term "Cisco QuantumFlow Processor" alone refers to the overall hardware and software architecture of the network processor.

When the QFP microcode (ucode) crashes, the ESP reloads. In order to identify a QFP ucode crash, look for this output on the console, or look for a cpp-mcplo-ucode core dump file:

Dec 17 05:50:26.417 JST: %IOSXE-3-PLATFORM: F0: 
   cpp_cdm: CPP crashed, core file /tmp/corelink/
   Router_ESP_0_cpp-mcplo-ucode_121708055026.core.gz
Dec 17 05:50:28.206 JST: %ASR1000_OIR-6-OFFLINECARD: 
   Card (fp) offline in slot F0

You can find the core dump file under harddisk:core/.

Router#dir harddisk:core
Directory of harddisk:core/

3719171  -rw-     1572864  Dec 17 2008 05:50:31 +09:00 
   Router_ESP_0_cpp-mcplo-ucode_121708055026.core.gz
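
In this example, the ID in the core file name (121708055026) lines up with the console timestamp Dec 17 05:50:26, which suggests that the ID encodes the crash time as MMDDYYhhmmss. That reading is an inference from this one example, not a documented format. Under that assumption, a hypothetical helper such as ucode_core_time can recover the time from the file name:

```python
import re
from datetime import datetime

# Name pattern from Table 5: hostname_ESP_SlotNumber_cpp-mcplo-ucode_ID.core.gz
UCODE_RE = re.compile(r"_ESP_(?P<slot>\d+)_cpp-mcplo-ucode_(?P<id>\d{12})\.core\.gz$")


def ucode_core_time(name):
    """Return (ESP slot, crash time) for a QFP ucode core file name, or None.

    Assumption: the 12-digit ID is MMDDYYhhmmss in the router's local time,
    and the two-digit year belongs to the 2000s.
    """
    m = UCODE_RE.search(name)
    if m is None:
        return None
    digits = m["id"]
    month, day, year = int(digits[0:2]), int(digits[2:4]), 2000 + int(digits[4:6])
    hour, minute, second = int(digits[6:8]), int(digits[8:10]), int(digits[10:12])
    try:
        return int(m["slot"]), datetime(year, month, day, hour, minute, second)
    except ValueError:
        # The ID did not form a valid date, so the assumed format does not apply.
        return None


if __name__ == "__main__":
    print(ucode_core_time("Router_ESP_0_cpp-mcplo-ucode_121708055026.core.gz"))
```

If the decoded time does not agree with the console log on your router, disregard it and rely on the file's timestamp in the dir output instead.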

Linux Kernel Crash

On the Cisco ASR 1000 Series, a Linux kernel runs on the RP, ESP, and SIP. When a Linux kernel crashes, the module reloads without any crash output on the console. After the module boots up again, you can identify the Linux kernel crash if you find a core dump file of the Linux kernel. The kernel core file can be larger than 100 MB.

Router#dir harddisk:core
Directory of harddisk:/core/

393230  ----   137389415  Dec 19 2008 01:19:40 +09:00  
   Router_RP_0_kernel_20081218161940.core

Information to Collect if You Open a TAC Service Request

If you still need assistance after you follow the steps above and want to open a service request with the Cisco TAC, be sure to include this information to troubleshoot a router crash:

  • Troubleshooting performed before you opened the service request

  • The show platform output (if possible, in enable mode)

  • The show logging output or console captures, if available

  • The show tech-support output (if possible, in enable mode)

  • The crashinfo file (if present)

  • The core dump file (if present)

Attach the collected data to your service request in non-zipped, plain text format (.txt). You can attach information to your service request if you upload it with the TAC Service Request tool (registered customers only). If you cannot access the Service Request tool, you can attach the relevant information to your service request if you send it to attach@cisco.com with your case number in the subject line of your message.

Note: Unless it is required in order to troubleshoot the router crash, do not manually reload or power-cycle the router before you collect this information. A reload or power-cycle can destroy information that is needed to determine the root cause of the problem.
