Cisco ASR 1000 Series Aggregation Services Routers

Field Notice: FN - 63764 - Some ASR1000 Products Might Fail to Boot Up After a Power Cycle - Fix on Failure

Revised June 6, 2014
Revised April 9, 2014
March 3, 2014


NOTICE:

THIS FIELD NOTICE IS PROVIDED ON AN "AS IS" BASIS AND DOES NOT IMPLY ANY KIND OF GUARANTEE OR WARRANTY, INCLUDING THE WARRANTY OF MERCHANTABILITY. YOUR USE OF THE INFORMATION ON THE FIELD NOTICE OR MATERIALS LINKED FROM THE FIELD NOTICE IS AT YOUR OWN RISK. CISCO RESERVES THE RIGHT TO CHANGE OR UPDATE THIS FIELD NOTICE AT ANY TIME.

Revision History

Revision   Date           Comment
1.2        06-JUNE-2014   Workaround/Solution Updated
1.1        09-APR-2014    Workaround/Solution Updated
1.0        03-MAR-2014    Initial Public Release

Products Affected

ASR1001(=)
ASR1000-ESP40(=) 
ASR1002-F(=) 
ASR1000-ESP10(=) 
ASR1000-ESP20(=)
ASR1000-ESP5(=) 

Problem Description

Some Aggregation Services Router (ASR) 1000 products (see the Products Affected section) might fail to boot up after a power cycle operation.

Background

Cisco has been working with some customers on an issue related to memory components manufactured by a single supplier between 2005 and 2010. These memory components are widely used across the industry and are included in a number of Cisco products. 

Although the majority of Cisco products using these components are experiencing field failure rates below expected levels, some components may fail earlier than anticipated. A handful of our customers have recently experienced a higher number of failures, leading us to change our approach to managing this issue. 

While other vendors have chosen to address this issue in different ways, Cisco believes its approach is the best course of action for its customers. Despite the cost, we are demonstrating that we always make customer satisfaction a top priority. Customers can learn more about this topic at the Memory Component Issue web page.

A degraded component will not affect the ongoing operation of a device, but will be exposed by a subsequent power cycle event. This event will result in a hard failure of the device, which cannot be recovered by a reboot or additional power cycle. For these reasons, additional caution is recommended for operational activities requiring the simultaneous power cycling of multiple devices. This issue has been observed most commonly on devices that have been in service for 24 months or more.

Problem Symptoms

If affected ASR 1000 hardware has been in operation for approximately 24 months or more, it might fail to boot up because of a memory failure that is exposed by a power cycle event. The failure can be triggered by one or more of these actions:

  • Power cycle the chassis
  • Online Insertion Removal/Replacement (OIR)

Note: This issue does not affect modules while in operation. The memory failure in base ASR 1000 platforms affects only the hardware cryptographic capability of the system. The board failure might occur after one or more of the actions listed is executed.

These error messages are observed on the affected card when this failure occurs:

CDT: %IMGR-0-FIPS_FMFP_N2_SEVERE_ERR_FAIL: F[x]: fman_fp_image: Cryptographic coprocessor severe failure: RSA operation error

CDT: %ASR1000_OIR-6-OFFLINECARD: Card (fp) offline in slot F[x].

May 27 23:39:16 CDT: %ASR1000_RP_ALARM-6-INFO: ASSERT MAJOR module F[x] Unknown state

CDT: %ASR1000_RP_ALARM-6-INFO: ASSERT CRITICAL module R[x] No Working ESP

CDT: %CPPHA-3-FAULT: F0: cpp_ha: CPP:0 desc:CPP Client process failed: FMAN-FP det:HA class:CLIENT_SW sev:FATAL id:1 cppstate:RUNNING res:UNKNOWN flags:0x0 cdmflags:0x0

CDT: %IOSXE-6-PLATFORM: F[x]: cpp_ha: Shutting down CPP MDM while client(s) still connected

CDT: %PMAN-3-PROCHOLDDOWN: F[x]: pman.sh: The process fman_fp_image has been helddown (rc 134)

CDT: %PMAN-3-PROCHOLDDOWN: F[x]: pman.sh: The process cpp_ha_top_level_server has been helddown (rc 69)

CDT: %PMAN-0-PROCFAILCRIT: F[x]: pvp.sh: A critical process fman_fp_image has failed (rc 134)
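
As an illustration only, a saved log could be searched for the messages above with a short Python sketch. The helper name `count_symptoms` and the choice to match on the syslog mnemonic alone are assumptions and are not part of this field notice. Some of these mnemonics (for example %ASR1000_RP_ALARM-6-INFO) are also logged for unrelated events, so a match is a hint and not a diagnosis.

```python
from collections import Counter

# Syslog mnemonics quoted in the Problem Symptoms section of this notice.
SYMPTOM_MNEMONICS = (
    "%IMGR-0-FIPS_FMFP_N2_SEVERE_ERR_FAIL",
    "%ASR1000_OIR-6-OFFLINECARD",
    "%ASR1000_RP_ALARM-6-INFO",
    "%CPPHA-3-FAULT",
    "%IOSXE-6-PLATFORM",
    "%PMAN-3-PROCHOLDDOWN",
    "%PMAN-0-PROCFAILCRIT",
)


def count_symptoms(log_text):
    """Count log lines that contain each mnemonic listed in the notice."""
    counts = Counter()
    for line in log_text.splitlines():
        for mnemonic in SYMPTOM_MNEMONICS:
            if mnemonic in line:
                counts[mnemonic] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical file name; replace with the path to a saved log.
    with open("router.log", encoding="utf-8", errors="replace") as handle:
        for mnemonic, hits in count_symptoms(handle.read()).items():
            print(mnemonic, hits)
```

The %IMGR-0-FIPS_FMFP_N2_SEVERE_ERR_FAIL message names the cryptographic coprocessor directly, so it is probably the most specific of the listed messages to look for first.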

Workaround/Solution

There are two workarounds that are available for this problem. Refer to the table in order to obtain the necessary action for your configuration:

ASR 1000 Configuration                            Action
FIPS is not enabled and IPSec is not configured   Download a SW Patch
FIPS is enabled or IPSec is configured            Fix on Failure Replacement Guidelines

Download a SW Patch 

If the ASR 1000 does not have FIPS mode enabled and it is not configured for IPSec, then it is possible to fix the issue with a software (SW) patch. Complete these steps in order to verify these configurations: 

  1. Enter this command in order to verify if FIPS mode is enabled:
    ASR1000#show romvar | inc FIPS
    FIPS_MODE = 1
    If FIPS_MODE = 1 is displayed, then FIPS mode is enabled and this workaround is not an option. If another number (or no number) is displayed, then FIPS mode is not enabled.

  2. Complete these steps in order to verify if IPSec is configured:

    1. Determine if a crypto map is applied to a physical interface:
      interface Serial0  
      ip address 20.20.20.20 255.255.255.0
      no ip mroute-cache
      no fair-queue
      crypto map armadillo
    2. Determine if Tunnel Protection is applied to the tunnel interface:
      interface Tunnel0  
      bandwidth 1000
      ip address 10.0.0.1 255.255.255.0
      <snip>
      tunnel protection ipsec profile vpnprof
      <snip>
      OR
      interface virtual-template 1 type tunnel  
      bandwidth 1000
      ip address 10.0.0.1 255.255.255.0
      <snip>
      tunnel protection ipsec profile vpnprof
      <snip>
      If any of these configuration snippets matches your router configuration, then IPSec is configured and this workaround is not an option.
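
The checks above can be sketched in Python for cases where the command output has been saved as text. This is a minimal sketch under stated assumptions and not a Cisco tool. The function names are hypothetical. The configuration text is assumed to be in `show running-config` layout, with `interface` lines at column 0 and their sub-commands indented.

```python
import re


def fips_enabled(show_romvar_output):
    """True only if a line reads FIPS_MODE = 1, per step 1 above."""
    match = re.search(
        r"^[ \t]*FIPS_MODE[ \t]*=[ \t]*(\d+)[ \t\r]*$",
        show_romvar_output,
        re.MULTILINE,
    )
    return match is not None and match.group(1) == "1"


def ipsec_configured(running_config):
    """True if any interface stanza applies a crypto map or tunnel protection."""
    in_interface = False
    for raw in running_config.splitlines():
        if not raw.strip():
            continue
        if not raw[0].isspace():
            # A line at column 0 starts a new top-level stanza.
            in_interface = raw.lower().startswith("interface ")
            continue
        command = raw.strip().lower()
        if in_interface and command.startswith(
            ("crypto map ", "tunnel protection ipsec ")
        ):
            return True
    return False


def recommended_action(fips, ipsec):
    """Map the two checks to the action table in Workaround/Solution."""
    if fips or ipsec:
        return "Fix on Failure Replacement Guidelines"
    return "Download a SW Patch"
```

For example, `recommended_action(fips_enabled(romvar_text), ipsec_configured(config_text))` returns the action from the table for a given router, where `romvar_text` and `config_text` are hypothetical variables that hold the saved output.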

If FIPS mode is not enabled and IPSec is not configured, then it is possible to resolve this problem with a software patch. Use this information in order to download the minimum version software patch: 

For ESP10, ESP20, or ESP40 (CSCuc82634): 

  • 15.2(4)S3/XE3.7.3S 
  • 15.3(1)S2/XE3.8.2S 
  • 15.3(2)S/XE3.9.0S 
  • 15.3(3)S/XE3.10.0S 

For ESP5, ASR1002-F, or ASR1001 (CSCui03023):

  • 15.2(04)S05/XE3.7.5S 
  • 15.3(03)S01/XE3.10.1S 
  • 15.4(01)S/XE3.11.0S 
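
For scripted lookups, the two lists above can be held in a small Python table. This is only a sketch: the name `patch_info` is hypothetical, and the mapping from "ESP10", "ESP20", and so on to full PIDs assumes the names in the Products Affected section. It does not compare release numbers, because ordering across Cisco IOS-XE release trains is outside what this notice defines.

```python
# Minimum fixed releases as listed in this field notice, keyed by bug ID.
MINIMUM_RELEASES = {
    "CSCuc82634": {
        "pids": ("ASR1000-ESP10", "ASR1000-ESP20", "ASR1000-ESP40"),
        "releases": (
            "15.2(4)S3/XE3.7.3S",
            "15.3(1)S2/XE3.8.2S",
            "15.3(2)S/XE3.9.0S",
            "15.3(3)S/XE3.10.0S",
        ),
    },
    "CSCui03023": {
        "pids": ("ASR1000-ESP5", "ASR1002-F", "ASR1001"),
        "releases": (
            "15.2(04)S05/XE3.7.5S",
            "15.3(03)S01/XE3.10.1S",
            "15.4(01)S/XE3.11.0S",
        ),
    },
}


def patch_info(pid):
    """Return (bug_id, releases) for an affected PID, or None if not listed."""
    # Treat spare PIDs (trailing "=") the same as the base PID.
    base = pid.strip().upper().rstrip("=")
    for bug_id, entry in MINIMUM_RELEASES.items():
        if base in entry["pids"]:
            return bug_id, entry["releases"]
    return None
```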

With a CDC user ID, you can locate the software patch on the Cisco Download Software page. Depending on which product is affected, select the product type and search for the Cisco IOS®-XE version that is listed in this document. Download the software and apply the patch to your router.

Fix on Failure Replacement Guidelines

Request an RMA for the affected product through normal service support channels.

PID             Action           Replacement PID   Quantity
ASR1000-ESP5    Replace card     ASR1000-ESP5=     1
ASR1000-ESP10   Replace card     ASR1000-ESP10=    1
ASR1000-ESP20   Replace card     ASR1000-ESP20=    1
ASR1000-ESP40   Replace card     ASR1000-ESP40=    1
ASR1001         Replace system   ASR1001=          1
ASR1002-F       Replace system   ASR1002-X=        1

How To Identify Hardware Levels

Enter the show inventory command in order to obtain the Product ID (PID). If the CLI is not available, physically inspect the device in order to locate the PID.
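
If the output of show inventory has been saved as text, the PIDs can be pulled out and compared with the Products Affected list. The Python sketch below is illustrative only. The function name `affected_pids` is hypothetical, and the sketch assumes the usual `PID: <value> , VID: ...` line layout of the command output.

```python
import re

# PIDs from the Products Affected section, without the spare "=" suffix.
AFFECTED_PIDS = {
    "ASR1001",
    "ASR1002-F",
    "ASR1000-ESP5",
    "ASR1000-ESP10",
    "ASR1000-ESP20",
    "ASR1000-ESP40",
}


def affected_pids(show_inventory_output):
    """Return the sorted affected PIDs found in saved show inventory output."""
    found = re.findall(r"\bPID:[ \t]*([^\s,]+)", show_inventory_output)
    return sorted({pid.rstrip("=") for pid in found} & AFFECTED_PIDS)
```

An empty list means that none of the listed PIDs was found in the text. It does not replace a physical inspection if the output is incomplete.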

For More Information

If you require further assistance, or if you have any further questions regarding this field notice, please contact the Cisco Systems Technical Assistance Center (TAC).

Receive Email Notification For New Field Notices

Cisco Notification Service—Set up a profile to receive email updates about reliability, safety, network security, and end-of-sale issues for the Cisco products you specify.