Cisco 7600 Series Routers

Field Notice: FN - 63553 - C7600 Might Fail to Boot Up after a Software Upgrade or Power Cycle – Fix on Failure

Revised: March 27, 2014
First published: March 3, 2014


NOTICE:

THIS FIELD NOTICE IS PROVIDED ON AN "AS IS" BASIS AND DOES NOT IMPLY ANY KIND OF GUARANTEE OR WARRANTY, INCLUDING THE WARRANTY OF MERCHANTABILITY. YOUR USE OF THE INFORMATION ON THE FIELD NOTICE OR MATERIALS LINKED FROM THE FIELD NOTICE IS AT YOUR OWN RISK. CISCO RESERVES THE RIGHT TO CHANGE OR UPDATE THIS FIELD NOTICE AT ANY TIME.

Revision History

Revision  Date         Comment
1.2       27-MAR-2014  Fixed Products Affected Section
1.1       06-MAR-2014  Fixed Workaround/Solution Section
1.0       03-MAR-2014  Initial Public Release

Products Affected

WS-IPSEC-3(=)
WS-X6582-2PA(=) 
76-ES+T-20G(=)
76-ES+T-2TG(=)
76-ES+T-40G(=)
76-ES+T-4TG(=)
76-ES+XC-20G3C(=)
76-ES+XC-20G3CXL(=)
76-ES+XC-40G3C(=) 
76-ES+XC-40G3CXL(=) 
76-ES+XT-2TG3C(=)
76-ES+XT-2TG3CXL(=)
76-ES+XT-4TG3C(=)
76-ES+XT-4TG3CXL(=)
7600-ES+20G3CXL(=)
7600-ES+2TG3C(=)
7600-ES+40G3C(=)
7600-ES+40G3CXL(=)
7600-ES+4TG3C(=) 
7600-ES+4TG3CXL(=)
7600-ES20-10G3C(=)
7600-ES20-10G3CXL(=)
7600-ES20-GE3C(=)
7600-ES20-GE3CXL(=)
7600-SIP-200(=)
7600-SIP-400(=)
7600-SIP-600(=)
RSP720-3C-10GE(=)
RSP720-3C-GE(=)
RSP720-3CXL-10GE(=)
RSP720-3CXL-GE(=)
7600-ES+20G3C(=) 
7600-ES+2TG3CXL(=)
7600-SSC-400(=)

Problem Description

A C7600 board might fail to boot up after a software upgrade or any other user action that requires the board to be power cycled.

Background

Cisco has been working with some customers on an issue related to memory components manufactured by a single supplier between 2005 and 2010. These memory components are widely used across the industry and are included in a number of Cisco products.

Although the majority of Cisco products using these components are experiencing field failure rates below expected levels, some components may fail earlier than anticipated. A handful of our customers have recently experienced a higher number of failures, leading us to change our approach to managing this issue.

While other vendors have chosen to address this issue in different ways, Cisco believes its approach is the best course of action for its customers. Despite the cost, we are demonstrating that we always make customer satisfaction a top priority. Customers can learn more about this topic at our Memory Component Issue web page.

A degraded component will not affect the ongoing operation of a device, but will be exposed by a subsequent power cycle event. This event will result in a hard failure of the device, which cannot be recovered by a reboot or additional power cycle. For these reasons, additional caution is recommended for operational activities requiring the simultaneous power cycling of multiple devices. This issue has been observed most commonly on devices that have been in service for 24 months or more.

Problem Symptoms

If the suspected hardware has been in operation for approximately 24 months or more, it might fail to boot up because of a memory failure during a power cycle event. A power cycle event can result from one or more of these actions:

  • A software upgrade
  • A reload of the entire product
  • A reload after installation
  • Online Insertion and Removal/Replacement (OIR)

Note: This issue does not affect boards while the boards are in operation. The board failure might occur after one or more of the actions listed are executed.

The card fails to boot up. One of these messages might be observed in the syslog:

*May 16 02:59:54.575: %PM_SCP-SP-1-LCP_FW_ERR: System resetting module 1 to recover from error: Linecard received system exception

*May 16 02:59:54.575: %OIR-SP-3-PWRCYCLE: Card in module 1, is being power-cycled Off (Module Reset due to exception or user request)

Alternatively, the card might crash repeatedly with this error reported in the syslog:

%EARL-DFC<n>-2-PATCH_INVOCATION_LIMIT: 10 Recovery patch invocations in the last 30 secs have been attempted. Max limit reached

The offline diagnostic test item ST4 identifies the suspect memory chip.
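The syslog symptoms above can be searched for in a saved log file. The following is a minimal sketch, not a Cisco-provided tool. It assumes that `<n>` in the EARL message stands for a numeric module identifier, and the function name `find_symptoms` and the symptom labels are illustrative only.

```python
import re

# Message mnemonics quoted in this field notice.
# Assumption: "<n>" in the EARL message is a numeric module identifier.
SYMPTOM_PATTERNS = {
    "linecard_exception": re.compile(r"%PM_SCP-SP-1-LCP_FW_ERR:"),
    "power_cycled": re.compile(r"%OIR-SP-3-PWRCYCLE:"),
    "patch_limit": re.compile(r"%EARL-DFC\d+-2-PATCH_INVOCATION_LIMIT:"),
}


def find_symptoms(log_lines):
    """Return (line_number, symptom_label) pairs for lines that match."""
    hits = []
    for number, line in enumerate(log_lines, start=1):
        for label, pattern in SYMPTOM_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, label))
    return hits


if __name__ == "__main__":
    # The first line is taken from this notice; "DFC3" in the second is a
    # hypothetical substitution for "<n>".
    sample = [
        "*May 16 02:59:54.575: %PM_SCP-SP-1-LCP_FW_ERR: System resetting "
        "module 1 to recover from error: Linecard received system exception",
        "%EARL-DFC3-2-PATCH_INVOCATION_LIMIT: 10 Recovery patch invocations "
        "in the last 30 secs have been attempted. Max limit reached",
    ]
    for number, label in find_symptoms(sample):
        print(number, label)
```

A match only indicates that a card reported one of these messages. It does not by itself confirm the memory component issue.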

Workaround/Solution

Fix on Failure Replacement Guidelines: On failure, request a replacement memory module or a spare card through the RMA process, depending on the card.

On some C7600 cards, the memory module can be replaced at the customer site. These cards are listed in the table below. For all other cards in the Products Affected list, request a spare PID through the RMA process on failure.

PID              Action                            Replacement PID    Quantity
7600-SIP-200(=)  Replace memory module on failure  MEM-SIP-200-512M=  1
                                                   MEM-SIP-200-1G=    1
WS-X6582-2PA(=)  Replace memory module on failure  MEM-CC-WAN-512M=   1
                                                   MEM-CC-WAN-256M=   1

See Memory Installation and Removal for instructions to replace the memory modules on these products.
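The replacement guidelines above can be expressed as a small lookup. This is a sketch under stated assumptions: the function name `rma_request` and the return labels are hypothetical, and the caller is assumed to have already confirmed that the failed PID is in the Products Affected list. Both replacement PIDs from the table are returned as listed, and choosing between them is left to the RMA process.

```python
# Replacement memory PIDs copied from the table in this notice.
MEMORY_REPLACEMENTS = {
    "7600-SIP-200": ["MEM-SIP-200-512M=", "MEM-SIP-200-1G="],
    "WS-X6582-2PA": ["MEM-CC-WAN-512M=", "MEM-CC-WAN-256M="],
}


def rma_request(failed_pid):
    """Return (request_type, pids) for a failed card already known to be affected."""
    # Treat the spare form of a PID (trailing "=") the same as the base form.
    pid = failed_pid.strip().rstrip("=")
    if pid in MEMORY_REPLACEMENTS:
        return ("memory_module", MEMORY_REPLACEMENTS[pid])
    return ("spare_card", [pid])


if __name__ == "__main__":
    print(rma_request("7600-SIP-200"))
    print(rma_request("RSP720-3C-GE"))
```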

How To Identify Hardware Levels

Enter the show inventory command in order to obtain the Product ID (PID). If the CLI is not available, physically inspect the hardware for the PID.
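Once the show inventory output has been captured to text, the PIDs can be compared against the Products Affected list. The sketch below is illustrative and makes two assumptions: that each inventory entry carries a `PID:` field followed by the product ID, and that the `(=)` suffix in the list means both the base PID and its spare form ending in `=` are affected. The function name `affected_pids` is hypothetical.

```python
import re

# PIDs from the Products Affected list in this notice, "(=)" suffix removed.
AFFECTED_PIDS = {
    "WS-IPSEC-3", "WS-X6582-2PA",
    "76-ES+T-20G", "76-ES+T-2TG", "76-ES+T-40G", "76-ES+T-4TG",
    "76-ES+XC-20G3C", "76-ES+XC-20G3CXL", "76-ES+XC-40G3C", "76-ES+XC-40G3CXL",
    "76-ES+XT-2TG3C", "76-ES+XT-2TG3CXL", "76-ES+XT-4TG3C", "76-ES+XT-4TG3CXL",
    "7600-ES+20G3C", "7600-ES+20G3CXL", "7600-ES+2TG3C", "7600-ES+2TG3CXL",
    "7600-ES+40G3C", "7600-ES+40G3CXL", "7600-ES+4TG3C", "7600-ES+4TG3CXL",
    "7600-ES20-10G3C", "7600-ES20-10G3CXL", "7600-ES20-GE3C", "7600-ES20-GE3CXL",
    "7600-SIP-200", "7600-SIP-400", "7600-SIP-600", "7600-SSC-400",
    "RSP720-3C-10GE", "RSP720-3C-GE", "RSP720-3CXL-10GE", "RSP720-3CXL-GE",
}

# Assumption: the PID appears as "PID: <value>" and ends at whitespace or a comma.
PID_FIELD = re.compile(r"PID:\s*([^\s,]+)")


def affected_pids(inventory_text):
    """Return affected PIDs found in the text, in first-seen order, without repeats."""
    found = []
    for match in PID_FIELD.finditer(inventory_text):
        pid = match.group(1).rstrip("=").upper()
        if pid in AFFECTED_PIDS and pid not in found:
            found.append(pid)
    return found


if __name__ == "__main__":
    # Illustrative text only, not captured from a real device.
    sample = (
        'NAME: "module 2", DESCR: "example card"\n'
        "PID: 7600-SIP-200      , VID: V01, SN: XXXXXXXXXXX\n"
    )
    print(affected_pids(sample))
```

A PID match shows only that the card model is in the affected list. The notice describes failures as most common on hardware that has been in service for 24 months or more.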

For More Information

If you require further assistance, or if you have any further questions regarding this field notice, please contact the Cisco Systems Technical Assistance Center (TAC).
