Cisco 10700 Series Routers

Field Notice: Cisco 10720 Uplink Card Resets


Revised March 27, 2002

January 4, 2002


Products Affected

Product        Top Assembly            Printed Circuit Assembly   Comments
               Part Number    Rev.     Part Number    Rev.
10720-IR-LC    800-09554-02   A0       73-5346-03     A0          Except with deviation D040729 or higher
10720-LR1-LC   800-09018-02   A0       73-5347-03     A0          Except with deviation D040729 or higher
10720-SR-LC    800-09017-02   A0       73-5345-03     A0          Except with deviation D040729 or higher

Problem Description

A design problem on the uplink module causes the 10720 router to crash when packets having a specific data pattern are transmitted.

Background

A design problem on the uplink module exposes the 10720 router to a crash upon receipt of a specific type of data pattern. This type of traffic is unlikely in a production network but may occur during lab testing.

Problem Symptoms

The messages below may appear on affected systems, accompanied by a system crash and the creation of a crashinfo file on flash. Some of the values may differ depending on the exact crash location.

2d00h: %Camr-3-MISTRAL_IO_ERROR: MISTRAL_IO_BUS_INT_MASK_LO: 28, Error Address = 0x16000BF8, IO status = 0x440
-Traceback= 50222120 501D3690
2d00h: %Camr-3-MISTRAL_TIMEOUT_ERROR: MISTRAL_SYSAD_TIMEOUT_DPATH_INT_MASK_HI: 39, sysad_dpath_cmd_log = 0x10020904, sysad_dpath_addr_log = 0x10020914,sysad_dpath_parity_log = 0x10020924,sysad_dpath_data_log = 0x10020934
-Traceback= 502221CC 501D3690
2d00h: %Camr-3-INTPROC: Process Traceback= 50041824 50045B10 50192574 501B232C 501B2318
-Traceback= 50221E90 5022233C 501D3690
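
When reviewing saved console logs or crashinfo text for this symptom, a small script can flag the two MISTRAL messages shown above. The following is a minimal Python sketch, not a Cisco tool. The function name find_crash_signatures and the choice of which mnemonics to treat as signatures are assumptions made for illustration; only the message tags themselves come from the sample output.

```python
import re

# Mnemonics taken from the sample messages in this field notice.
# Treating exactly these two as the crash signature is an assumption.
CRASH_SIGNATURES = ("MISTRAL_IO_ERROR", "MISTRAL_TIMEOUT_ERROR")

# Matches IOS-style "%FACILITY-SEVERITY-MNEMONIC:" message tags.
TAG_RE = re.compile(
    r"%(?P<facility>[A-Za-z0-9_]+)-(?P<severity>\d)-(?P<mnemonic>[A-Z0-9_]+):"
)


def find_crash_signatures(log_text):
    """Return a sorted list of the signature mnemonics present in log_text."""
    found = set()
    for match in TAG_RE.finditer(log_text):
        if (match.group("facility") == "Camr"
                and match.group("mnemonic") in CRASH_SIGNATURES):
            found.add(match.group("mnemonic"))
    return sorted(found)


# Works whether or not the terminal wrapped the long message lines.
sample = """2d00h: %Camr-3-MISTRAL_IO_ERROR: MISTRAL_IO_BUS_INT_MASK_LO: 28,
Error Address
= 0x16000BF8, IO status = 0x440
2d00h: %Camr-3-MISTRAL_TIMEOUT_ERROR: MISTRAL_SYSAD_TIMEOUT_DPATH_INT_MASK_HI:
39, sysad_dpath_cmd_log = 0x10020904,
2d00h: %Camr-3-INTPROC: Process Traceback= 50041824 50045B10
"""
print(find_crash_signatures(sample))
```

A match only indicates that the logged messages resemble the ones in this notice; the hardware checks described under How To Identify Hardware Levels are still needed to confirm that a card is affected.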

Depending on the configuration register setting, the Cisco 10720 either returns to service on its own after the crash or remains in ROMMON until a user intervenes. A value of 0x2102 (the default) allows the Cisco 10720 to reboot without user intervention. A value of 0x0 causes the Cisco 10720 to stay in ROMMON once it has crashed. To view the current value of the configuration register, use the show version command. In the example shown below, the configuration register is 0x0, so this Cisco 10720 would stay in ROMMON after a crash.

lab1#show version
Cisco Internetwork Operating System Software 
IOS (tm) 10700 Software (C10700-P-M), Experimental Version 12.0(20011101:144431)
 [tpaiemen-yb_isp 196]
Copyright (c) 1986-2001 by cisco Systems, Inc.
Compiled Fri 14-Dec-01 16:20 by tpaiemen
Image text-base: 0x50010960, data-base: 0x50672000
 
ROM: System Bootstrap, Version 12.0(20010529:144545) [yuwang-rommon1 149],
 DEVELOPMENT SOFTWARE

lab1 uptime is 2 days, 3 hours, 19 minutes
System returned to ROM by power-on
Running default software

cisco C10720 (R5000) processor (revision 0xFF) with 256000K/6144K bytes of memory.
R527x CPU at 200Mhz, Implementation 40, Rev 10.0
Last reset from power-on
Toaster processor tmc0 is running.
Toaster processor tmc1 is running.
1 one-port OC48 SONET based SRP controller.
1 24 Port 100 Mbps Fast Ethernet TX controller.
24 FastEthernet/IEEE 802.3 interface(s)
1 SRP network interface(s)
509K bytes of non-volatile configuration memory.

16384K bytes of Flash internal SIMM (Sector size 512KB).
49152K bytes of Flash internal SIMM (Sector size 512KB).
Configuration register is 0x0
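
The register check can also be done on captured show version output. The following Python sketch is illustrative only. It assumes the usual IOS convention that the low four bits of the configuration register form the boot field and that a boot field of 0 leaves the router at the ROM monitor, which is consistent with the 0x0 and 0x2102 values described above. The function names are hypothetical.

```python
import re


def parse_config_register(show_version_text):
    """Return the current configuration register as an int, or None if absent."""
    match = re.search(r"Configuration register is (0x[0-9A-Fa-f]+)",
                      show_version_text)
    return int(match.group(1), 16) if match else None


def stays_in_rommon(confreg):
    """True when the boot field (low four bits) is 0.

    Assumption: boot field 0 means the router stops at the ROM monitor.
    """
    return (confreg & 0x000F) == 0


output = "509K bytes of non-volatile configuration memory.\nConfiguration register is 0x0\n"
value = parse_config_register(output)
print(hex(value), stays_in_rommon(value))
print(stays_in_rommon(0x2102))
```

Only the current value is parsed; a pending value such as "(will be 0x2102 at next reload)" is ignored because it does not take effect until the router restarts.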

Workaround/Solution

Replace the defective hardware.

As of Dec 7, 2001 the uplink cards shown above in the Products Affected section that were manufactured under the deviation D040729 are guaranteed to be free of this problem. Refer to How to Identify Hardware Levels below for instructions on how to view the deviation on an in-service product. Products shipped from Logistics may still exhibit this problem at this time.

As of Jan 7, 2002, the line cards shown in Products Affected above that were manufactured under the Engineering Change Order (ECO) E047436 are also guaranteed to be free of this problem. Refer to How to Identify Hardware Levels below for instructions on how to view the version on an in-service product. Products shipped from Logistics may still exhibit this problem at this time.

Customers who wish to proactively replace one or more of their products which are affected by the problem described in this field notice with units which do not exhibit this problem should call the TAC and request a return material authorization (RMA) coded ARFN (Administrative Request - Field Notice), referencing this field notice. This forces fulfillment from manufacturing instead of Logistics.

Customers who wish to request an RMA for an affected product due to this or some other failure but who want to ensure they receive a replacement which does not exhibit this problem should also request their RMA be coded ARFN, referencing this field notice. Because this forces fulfillment from San Jose, California manufacturing instead of Logistics, it may not be possible to meet 2-hour, 4-hour, or next business day service level agreements (SLAs).

The ECO initiates a global purge of Service Logistics inventory, which takes between 6 and 12 months. Once the purge has been completed, customers can safely RMA product using Service Logistics inventory and will no longer need to code RMAs ARFN. This field notice will be updated when the purge has been completed.

To prevent the 10720 from crashing into and staying in ROMMON, change the configuration register and reset the router as shown in the example below.

lab1#configure terminal
Enter configuration commands, one per line. End with CNTL/Z.
lab1(config)#config-register 0x2102
lab1(config)#end
lab1#show version
Cisco Internetwork Operating System Software 
IOS (tm) 10700 Software (C10700-P-M), Experimental Version 12.0(20011101:144431)
 [tpaiemen-yb_isp 196]
Copyright (c) 1986-2001 by cisco Systems, Inc.
Compiled Fri 14-Dec-01 16:20 by tpaiemen
Image text-base: 0x50010960, data-base: 0x50672000

ROM: System Bootstrap, Version 12.0(20010529:144545) [yuwang-rommon1 149],
 DEVELOPMENT SOFTWARE

lab1 uptime is 2 days, 3 hours, 20 minutes
System returned to ROM by power-on
Running default software

cisco C10720 (R5000) processor (revision 0xFF) with 256000K/6144K bytes of memory.
R527x CPU at 200Mhz, Implementation 40, Rev 10.0
Last reset from power-on
Toaster processor tmc0 is running.
Toaster processor tmc1 is running.
1 one-port OC48 SONET based SRP controller.
1 24 Port 100 Mbps Fast Ethernet TX controller.
24 FastEthernet/IEEE 802.3 interface(s)
1 SRP network interface(s)
509K bytes of non-volatile configuration memory.

16384K bytes of Flash internal SIMM (Sector size 512KB).
49152K bytes of Flash internal SIMM (Sector size 512KB).
Configuration register is 0x0 (will be 0x2102 at next reload)
lab1# reset

How To Identify Hardware Levels

Two conditions must be met for an uplink card to be potentially affected by this problem.

  • The uplink card must have an 800-level part number within the affected range listed above.

  • The uplink card must not have been built to deviation D040729.

These conditions can be checked either on-line using the CLI or by physical inspection.

Using the Command Line Interface (CLI)

Checking the 800-level Part Number

Use the show diags command to view the uplink card 800-level part number. In the example shown below, the uplink card has an 800-level part number of 800-09017-02 and a revision of A0, which falls within the affected range listed in the Products Affected section.

Router#show diags 1 
SLOT 1: 1 one-port OC48 SONET based SRP controller. 
TX FPGA ver.: 0x0012 
RX FPGA ver.: 0x0012 
RAC A ver...: 0x0004 
RAC B ver...: 0x0004 
Framer A ver: 0x0006 
Framer B ver: 0x0006 
PCA (73) Item Num: 73-05345-03 
PCA (73) Item Num - Rev: A0 
Fab (28) Ver: 3 
Unit (800) Item Num: 800-09017-02 
Unit (800) Item Num - Rev: A0
Serial Number: CAT0532000T 
Optical Hardware Configuration: Short Reach (SR)
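
For sites with many uplink cards, the part number check can be applied to captured show diags output. The Python sketch below is a hedged illustration: the AFFECTED_UNITS set is copied from the Products Affected table, while the function names and the assumption that the output always contains the two "Unit (800)" lines in the format shown above are not from Cisco.

```python
import re

# (800-level part number, revision) pairs from the Products Affected table.
AFFECTED_UNITS = {
    ("800-09554-02", "A0"),  # 10720-IR-LC
    ("800-09018-02", "A0"),  # 10720-LR1-LC
    ("800-09017-02", "A0"),  # 10720-SR-LC
}


def parse_unit(show_diags_text):
    """Return (part_number, revision) from show diags output, or None."""
    part = re.search(r"Unit \(800\) Item Num:\s*(\S+)", show_diags_text)
    rev = re.search(r"Unit \(800\) Item Num - Rev:\s*(\S+)", show_diags_text)
    if not part or not rev:
        return None
    return part.group(1), rev.group(1)


def in_affected_range(show_diags_text):
    """True when the card's 800-level part number and revision are listed."""
    return parse_unit(show_diags_text) in AFFECTED_UNITS


diags = """PCA (73) Item Num - Rev: A0
Fab (28) Ver: 3
Unit (800) Item Num: 800-09017-02
Unit (800) Item Num - Rev: A0
Serial Number: CAT0532000T
"""
print(parse_unit(diags), in_affected_range(diags))
```

A card in the affected range is only potentially affected; the deviation number must also be checked before concluding that it needs replacement.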

Checking the Deviation Number

Use the show hardware uplink idprom command to view the uplink card deviation number. The deviation number is displayed in hexadecimal; in the following example it is the three bytes 00 9F 19 in the second row of the output. A hex value of 00 9F 19 equates to a deviation number of 040729, so the uplink card in this example was built to deviation D040729 and is not affected by the problem described in this field notice.

Router#show hardware uplink idprom 
Uplink - IDPROM 
00 04 00 00 00 49 00 14 E1 01 50 02 00 00 00 00
03 20 00 23 39 01 50 00 9F 19 00 00 00 00 00 00
43 41 42 30 35 30 37 48 33 54 4C 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
FD 01 10 DF AB 12 34 CD FF FF FF FE 00 00 00 00
50 8A 74 D8 50 21 95 A4 50 F2 06 1C 50 F2 04 E0
80 00 00 1A 00 00 00 01 00 00 00 00 00 00 00 00
50 8E BB 84 50 C0 02 B8 00 00 00 04 00 49 00 14

Physical Inspection

The uplink card 800-level part number can be found on a sticker which is located on the PCB. Refer to the picture below.

Note: The 800-level part number sticker may not always be in the location shown below.

fn17012_goii5s.gif

The uplink card deviation number can be found on a sticker which is located on the PCB. Refer to the picture below.

Note: The deviation number sticker may not always be in the location shown below.

fn17012_gokcvf.gif

For More Information

If you require further assistance, or if you have any further questions regarding this field notice, please contact the Cisco Systems Technical Assistance Center (TAC).
