Cisco 12000 Series Routers

Field Notice: QOC48/OC192 SBEs/MBEs


September 23, 2003



Products Affected

| Product | Top Assembly Part Number | Top Assembly Rev. | Printed Circuit Assembly Part Number | Printed Circuit Assembly Rev. | Comments |
| --- | --- | --- | --- | --- | --- |
| 4OC48/POS-LR-FC(=) | 800-08878-02 | A0 | 73-4203-04 | A0 | Earlier Top Assembly Numbers (and revisions) are also affected. |
| 4OC48/POS-LR-SC(=) | 800-07928-02 | A0 | 73-4203-04 | A0 | Earlier Top Assembly Numbers (and revisions) are also affected. |
| 4OC48/POS-SR-FC(=) | 800-08876-02 | A0 | 73-4203-04 | A0 | Earlier Top Assembly Numbers (and revisions) are also affected. |
| 4OC48/POS-SR-SC(=) | 800-05517-02 | A0 | 73-4203-04 | A0 | Earlier Top Assembly Numbers (and revisions) are also affected. |
| OC192/POS-IR-FC(=) | 800-14692-02 | A0 | 73-4202-03 | A0 | Earlier Top Assembly Numbers (and revisions) are also affected. |
| OC192/POS-IR-SC(=) | 800-07900-02 | A0 | 73-4202-03 | A0 | Earlier Top Assembly Numbers (and revisions) are also affected. |
| OC192/POS-SR-FC(=) | 800-14691-02 | A0 | 73-4202-03 | A0 | Earlier Top Assembly Numbers (and revisions) are also affected. |
| OC192/POS-SR-SC(=) | 800-05515-02 | A0 | 73-4202-03 | A0 | Earlier Top Assembly Numbers (and revisions) are also affected. |
| OC192/POS-SR2-SC(=) | 800-17536-01/02 | C0 | 73-7085-02 | C0 | Earlier Top Assembly Numbers (and revisions) are also affected. |
| 4OC48E/POS-LR-SC= | 800-18674-02 | A | | | Edge 4 Port OC-48c/STM-16c SONET/SDH LR with SC |
| 4OC48E/POS-SR-SC= | 800-05517-04 | A | | | Edge 4 Port OC-48c/STM-16c SONET/SDH SR with SC |
| OC192E/POS-IR-SC= | 800-18670-01/02 | B0 | | | |
| OC192E/POS-SR-SC= | 800-18668-01/02 | B0 | | | |

Problem Description

Multiple "SDRAM_SBE: Error" (single-bit error, SBE) or "SDRAM_MBE: Error" (multi-bit error, MBE) messages are logged for a Cisco 12000 4OC48 or OC192 line card.

This problem affects less than one percent of the total install base for Cisco 12000 4OC48 or OC192 line cards.

Background

A single-bit error (SBE) occurs when one bit of a data word is incorrect when read from memory. A multi-bit error (MBE) occurs when more than one bit is incorrect. Both OC192 and 4OC48 line cards have Error Checking and Correcting (ECC) circuitry to detect and handle error conditions for all on-board memories.

IOS corrects SBEs automatically without disrupting traffic. IOS detects MBEs, and depending on the IOS revision running, an MBE either forces the line card to reset or may crash it. If an MBE triggers a line card crash, the route processor reloads the line card and returns it to normal operation.

Both SBEs and MBEs are rare. MBEs are rarer still, because they require multiple bits in the same word to be incorrect. Less than one percent of the total field population of the 4OC48 and OC192 line cards listed above is affected by this problem.
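To make the correct-versus-detect distinction concrete, the following is a minimal Python sketch of a single-error-correcting, double-error-detecting (SECDED) code. It uses an extended Hamming code over 4 data bits purely for illustration. This notice does not state which ECC code the line cards use, and real SDRAM ECC protects much wider words. The function names encode and decode are hypothetical.

```python
from typing import Optional, Tuple


def encode(nibble: int) -> int:
    """Encode 4 data bits as an 8-bit extended Hamming (SECDED) codeword."""
    bits = [0] * 8  # bit 0: overall parity; bits 1, 2, 4: Hamming parity
    # Data bits live at positions 3, 5, 6, 7.
    for pos, i in zip((3, 5, 6, 7), range(4)):
        bits[pos] = (nibble >> i) & 1
    bits[1] = bits[3] ^ bits[5] ^ bits[7]
    bits[2] = bits[3] ^ bits[6] ^ bits[7]
    bits[4] = bits[5] ^ bits[6] ^ bits[7]
    bits[0] = sum(bits[1:]) % 2  # make the parity of all 8 bits even
    return sum(bit << pos for pos, bit in enumerate(bits))


def decode(word: int) -> Tuple[str, Optional[int]]:
    """Return (status, data); data is None when the error is uncorrectable."""
    bits = [(word >> pos) & 1 for pos in range(8)]
    syndrome = 0
    for pos in range(1, 8):
        if bits[pos]:
            syndrome ^= pos  # XOR of the positions of all set bits
    parity_odd = sum(bits) % 2 == 1
    if syndrome == 0 and not parity_odd:
        status = "ok"
    elif parity_odd:
        # One bit flipped: the syndrome is its position (0 = overall parity bit).
        bits[syndrome] ^= 1
        status = "sbe_corrected"
    else:
        # Nonzero syndrome with even overall parity: two bits flipped.
        return "mbe_detected", None
    data = bits[3] | (bits[5] << 1) | (bits[6] << 2) | (bits[7] << 3)
    return status, data


codeword = encode(0b1011)
print(decode(codeword))               # ('ok', 11)
print(decode(codeword ^ 0b00100000))  # ('sbe_corrected', 11)
print(decode(codeword ^ 0b00100100))  # ('mbe_detected', None)
```

A code of this kind guarantees correction of one flipped bit and detection of two. Three or more flipped bits in the same word can be misread as a correctable error, which is one reason repeated errors on the same card deserve attention.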

Example of an SBE failure:

SLOT 6:Jul 19 07:37:34: %TX192-3-SDRAM_SBE: Error=0x2 - DIMM1 Syndrome=0x7600

Example of an MBE failure:

SLOT 5:Jul 25 16:58:51: %MCC192-3-SDRAM_MBE: Error=0x808 - DIMM0 Syndrome=0x18000000
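When many such messages accumulate in a log, it can help to extract the slot, DIMM, and syndrome and to count events per slot. The sketch below does this in Python. The regular expression is inferred only from the two sample messages above, so other IOS releases may format these lines differently. The names ECC_LOG_RE, parse_ecc_event, and count_events are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Pattern inferred from the two sample messages; treat it as an assumption.
ECC_LOG_RE = re.compile(
    r"SLOT (?P<slot>\d+):(?P<timestamp>\w{3} +\d+ \d{2}:\d{2}:\d{2}): "
    r"%(?P<facility>\w+)-(?P<severity>\d+)-SDRAM_(?P<kind>SBE|MBE): "
    r"Error=(?P<error>0x[0-9A-Fa-f]+) - DIMM(?P<dimm>\d+) "
    r"Syndrome=(?P<syndrome>0x[0-9A-Fa-f]+)"
)


def parse_ecc_event(line: str) -> Optional[dict]:
    """Return the fields of one SBE/MBE message, or None if the line differs."""
    m = ECC_LOG_RE.search(line)
    if m is None:
        return None
    return {
        "slot": int(m.group("slot")),
        "timestamp": m.group("timestamp"),
        "facility": m.group("facility"),
        "kind": m.group("kind"),
        "error": int(m.group("error"), 16),
        "dimm": int(m.group("dimm")),
        "syndrome": int(m.group("syndrome"), 16),
    }


def count_events(lines: Iterable[str]) -> Counter:
    """Count events keyed by (slot, kind), skipping unrelated lines."""
    counts: Counter = Counter()
    for line in lines:
        event = parse_ecc_event(line)
        if event is not None:
            counts[(event["slot"], event["kind"])] += 1
    return counts


sample_log = [
    "SLOT 6:Jul 19 07:37:34: %TX192-3-SDRAM_SBE: Error=0x2 - DIMM1 Syndrome=0x7600",
    "SLOT 5:Jul 25 16:58:51: %MCC192-3-SDRAM_MBE: Error=0x808 - DIMM0 Syndrome=0x18000000",
]
for (slot, kind), n in sorted(count_events(sample_log).items()):
    print(f"slot {slot}: {n} {kind} event(s)")
```

A slot whose count keeps growing is a candidate for the reseat described under Workaround/Solution.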

Problem Symptoms

Field diagnostics on 4OC48 or OC192 cards indicate multiple SDRAM_SBE: Error or SDRAM_MBE: Error messages.

Workaround/Solution

This problem affects less than one percent of the total install base for Cisco 12000 4OC48 or OC192 line cards. If you see multiple SBE or MBE errors, reseat the line card.

A card reseat is guaranteed to fix transient soft errors on all IOS versions. The newest Cisco 12000 IOS versions also incorporate a rewrite of the synchronous dynamic random access memory (SDRAM) location, which removes the potential for this error to be caused by software.

If a card reseat does not fix the problem, you can request replacement cards through return material authorization (RMA) using the advance replace failure notice (ARFN) code.

How To Identify Hardware Levels

Using the Command Line Interface (CLI)

Use the show diag slot command to view the line card 800-level part number.

RTR12410-2#show diag 6
SLOT 6  (RP/LC 6 ): 4 Port Packet Over SONET OC-48c/STM-16 Single Mode/SR SC cor
MAIN: type 67,  800-5517-01 rev dev 0
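The 800-level number from this output can be checked against the Products Affected table with a short script. The Python sketch below makes several assumptions that the notice does not confirm: that show diag prints the number without zero padding (800-5517-01 here versus 800-05517-02 in the table), that "earlier Top Assembly Numbers" means a lower version suffix on the same base number, and that the revision column can be ignored. The names parse_tan, is_listed, AT_OR_BELOW, and EXACT are hypothetical.

```python
import re
from typing import Optional, Tuple

# Rows whose comment says earlier top assembly numbers are also affected:
# base number -> highest listed version suffix.
AT_OR_BELOW = {
    8878: 2, 7928: 2, 8876: 2, 5517: 2,
    14692: 2, 7900: 2, 14691: 2, 5515: 2, 17536: 2,
}
# Edge (E) rows carry no such comment, so only the listed versions are matched.
EXACT = {(18674, 2), (5517, 4), (18670, 1), (18670, 2), (18668, 1), (18668, 2)}

TAN_RE = re.compile(r"\b800-(\d+)-(\d+)\b")


def parse_tan(show_diag_output: str) -> Optional[Tuple[int, int]]:
    """Return (base, version) from the MAIN: line, or None if absent."""
    for line in show_diag_output.splitlines():
        if line.lstrip().startswith("MAIN:"):
            m = TAN_RE.search(line)
            if m:
                return int(m.group(1)), int(m.group(2))
    return None


def is_listed(tan: Tuple[int, int]) -> bool:
    """True if the top assembly number falls under a row of the table."""
    base, version = tan
    if (base, version) in EXACT:
        return True
    limit = AT_OR_BELOW.get(base)
    return limit is not None and version <= limit


sample = """RTR12410-2#show diag 6
SLOT 6  (RP/LC 6 ): 4 Port Packet Over SONET OC-48c/STM-16 Single Mode/SR SC cor
MAIN: type 67,  800-5517-01 rev dev 0
"""
tan = parse_tan(sample)
print(tan, is_listed(tan))  # (5517, 1) True
```

Comparing the numbers as integers sidesteps the zero-padding difference. A match only means the card is one of the listed products, not that it has the problem.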

Physical Inspection

The line card serial number, deviation number, top assembly number (TAN), and printed circuit board (PCB) assembly number can all be found on stickers located on the line card PCB.

How to Identify SBE or MBE Errors

You can check line card memory for MBE or SBE errors by using field diagnostics, as in the following example:

FDIAG_STAT_IN_PROGRESS(5): test #12 TX SDRAM Marching Pattern
FD 5> RIM:
FD 5> TX Registers
FD 5> INT_CAUSE_REG = 0x00000680
FD 5> Unexpected L3FE Interrupt occured.
FD 5> ERROR: TX BMA Asic Interrupt Occured
FD 5> *** 0-INT: External Interrupt ***
FDIAG_STAT_DONE_FAIL(5) test_num 12, error_code 1
Field Diagnostic: ****TEST FAILURE**** slot 5: last test run 12,
TX SDRAM Marching Pattern, error 1
Field Diag eeprom values: run 5 fail mode 1 (TEST FAILURE) slot 5 last test failed was 12, error code 1
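If field diagnostic output is captured to a file, failed tests can be pulled out with a short script. The Python sketch below relies only on the FDIAG_STAT_IN_PROGRESS and FDIAG_STAT_DONE_FAIL lines shown above. The patterns are inferred from this one example and may not match other IOS releases, and the name find_failures is hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Both patterns are inferred from the sample output; treat them as assumptions.
IN_PROGRESS_RE = re.compile(r"FDIAG_STAT_IN_PROGRESS\((\d+)\): test #(\d+) (.+)")
DONE_FAIL_RE = re.compile(
    r"FDIAG_STAT_DONE_FAIL\((\d+)\) test_num (\d+), error_code (\d+)"
)


def find_failures(output: str) -> List[dict]:
    """Return one record per failed test, with the test name when known."""
    test_names: Dict[Tuple[int, int], str] = {}
    failures: List[dict] = []
    for line in output.splitlines():
        m = IN_PROGRESS_RE.search(line)
        if m:
            # Remember the test name announced for this (slot, test number).
            test_names[(int(m.group(1)), int(m.group(2)))] = m.group(3).strip()
            continue
        m = DONE_FAIL_RE.search(line)
        if m:
            slot, test_num, error_code = (int(g) for g in m.groups())
            failures.append({
                "slot": slot,
                "test_num": test_num,
                "test_name": test_names.get((slot, test_num), "unknown"),
                "error_code": error_code,
            })
    return failures


sample_output = """FDIAG_STAT_IN_PROGRESS(5): test #12 TX SDRAM Marching Pattern
FD 5> INT_CAUSE_REG = 0x00000680
FDIAG_STAT_DONE_FAIL(5) test_num 12, error_code 1
"""
for failure in find_failures(sample_output):
    print(failure)
```

A failure in an SDRAM test, as in this example, points at line card memory and is worth correlating with any SDRAM_SBE or SDRAM_MBE messages logged for the same slot.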

For More Information

If you require further assistance, or if you have any further questions regarding this field notice, please contact the Cisco Systems Technical Assistance Center (TAC) by one of the following methods: