Cisco 12000 line cards may reset after single event upset (SEU) failures. This field notice highlights some of those failures, why they occur, and what workarounds are available.
Unlike hard errors, soft errors are spontaneous, non-recurring (transient), and non-reproducible. The error is called "soft" because:
The device functions normally after data is restored.
The transient error is present in data stored in memory devices on line cards.
The error is caused by system noise or by ionizing radiation.
SEU failures are often caused by the following:
Alpha particles emitted by radioactive packaging and wafer processing materials on synchronous random-access memory (SRAM) and dynamic random-access memory (DRAM) products.
Thermal neutrons from cosmic radiation with energy less than 15 eV.
Terrestrial high-energy cosmic particles: neutrons, protons, pions, and muons.
The chance for single event upset (SEU) failures in memory devices increases as densities rise and core voltages drop.
IOS performs error recovery, which is the ability to detect soft errors and ensure that they do not adversely affect product performance. The methods used by IOS on the Cisco 12000 include:
ECC (Error Correction Code)
Replacement from backup data sources.
Hitless switchover to redundant line cards.
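To make the ECC item above concrete, the following is a minimal sketch of single-bit error correction using a textbook Hamming(7,4) code. It is an illustration only: the function names are hypothetical, and it does not represent the ECC scheme actually used in Cisco 12000 line card memory, which this notice does not describe.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits (int 0..15) into a 7-bit Hamming codeword.

    The codeword is a list of bits for positions 1..7, laid out as
    p1 p2 d0 p4 d1 d2 d3 (parity bits sit at positions 1, 2, and 4).
    """
    d = [(nibble >> i) & 1 for i in range(4)]  # d[0] is the least significant bit
    p1 = d[0] ^ d[1] ^ d[3]  # covers positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]  # covers positions 2, 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]  # covers positions 4, 5, 6, 7
    return [p1, p2, d[0], p4, d[1], d[2], d[3]]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return (nibble, error_position).

    error_position is 0 when no error was detected, otherwise the
    1-based position of the bit that was corrected.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 | (s2 << 1) | (s4 << 2)
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the faulty bit back
    nibble = c[2] | (c[4] << 1) | (c[5] << 2) | (c[6] << 3)
    return nibble, syndrome


# Simulate a soft error: store a value, flip one bit, then read it back.
stored = hamming74_encode(0b1011)
stored[4] ^= 1  # single event upset at position 5
value, position = hamming74_decode(stored)
```

In this sketch the read returns the original value (0b1011) and reports position 5 as corrected, which mirrors the "soft" behavior described earlier: the device works normally once the data is restored.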
Cards show memory parity errors or application-specific integrated circuit (ASIC) errors, which may result in a card reload with a two- to three-minute recovery. Data passes normally after the card reloads.
Cisco IOS® Software Release 12.0(25)S and later releases include several SEU error recovery improvements for the Cisco 12000 series.
IOS releases 12.0(21)S6, 12.0(22)S4, 12.0(23)S2, 12.0(21)S1 and later include SEU failure fixes for Cisco 12000 Engine 3 based line cards. These improvements reduce the chance of a card reload due to SEU failures, reduce the reload time if a reload occurs, and provide clearer text messages for the failure types.
For customers using Engine 3, 4, or 4+ based line cards, these IOS improvements have significantly reduced error recovery time to under three seconds.
Note: Customers should not replace hardware after a single SEU failure. The line card should be monitored for further instances. If additional failures occur, contact Cisco Technical Support.
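The monitoring guidance in the note can be sketched as a simple tally of SEU-related resets per slot. This is a hypothetical illustration: the input format (a list of slot numbers, one entry per observed reset) and the threshold of two events are assumptions drawn from the wording of the note, not a Cisco tool or a documented policy.

```python
from collections import Counter


def slots_needing_escalation(seu_resets, threshold=2):
    """Return the slots whose SEU reset count has reached the threshold.

    seu_resets: iterable of slot numbers, one entry per observed reset
                (hypothetical format; collect these however you track events).
    threshold:  number of resets at which a slot stops being "monitor only".
                The default of 2 reflects "additional failures" in the note.
    """
    counts = Counter(seu_resets)
    return sorted(slot for slot, n in counts.items() if n >= threshold)


# Slot 3 reset once (keep monitoring); slot 5 reset twice (contact support).
observed = [3, 5, 5]
escalate = slots_needing_escalation(observed)
```

A slot that appears only once stays out of the result, which matches the advice not to replace hardware after a single event.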
To follow the bug ID link below and see detailed bug information, you must be a registered user and you must be logged in.
| DDTS | Description |
|---|---|
| CSCea34650 (registered customers only) | ISE: Parity Error recovery time |
| CSCea35822 (registered customers only) | ISE: Line card crash during parity error injection |
| CSCea35881 (registered customers only) | ISE: No consistancy between Tx and Rx SRAM64 fault injection |
| CSCeb13025 (registered customers only) | ISE: Alpha error results in card crash |
| CSCea57600 (registered customers only) | ISE: Line card cpu hit high utilization with low pps packet punt to cpu |
Cisco IOS Software Releases earlier than 12.0(25)S for the 12000 series are more likely to have SEU error recovery problems.
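As a rough aid for auditing an inventory against the 12.0(25)S baseline, the sketch below parses release strings of the form used in this notice and compares them to that baseline. The function names and the regular expression are assumptions for illustration; the parser handles only the simple `major.minor(maintenance)TRAINrebuild` pattern and is not a general Cisco IOS version parser.

```python
import re

# Matches strings such as "12.0(25)S" or "12.0(21)S6" (assumed pattern).
_IOS_RELEASE = re.compile(r"^(\d+)\.(\d+)\((\d+)\)([A-Za-z]*)(\d*)$")


def parse_ios_release(text):
    """Split a release string into (major, minor, maintenance, train, rebuild)."""
    match = _IOS_RELEASE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized release string: {text!r}")
    major, minor, maint, train, rebuild = match.groups()
    return int(major), int(minor), int(maint), train.upper(), int(rebuild or 0)


def meets_baseline(text, baseline="12.0(25)S"):
    """Return True if the release is on the baseline's train and not older."""
    major, minor, maint, train, _ = parse_ios_release(text)
    b_major, b_minor, b_maint, b_train, _ = parse_ios_release(baseline)
    if train != b_train:
        return False  # different trains are not compared in this sketch
    return (major, minor, maint) >= (b_major, b_minor, b_maint)


# Example inventory check (release strings taken from this notice).
inventory = ["12.0(25)S", "12.0(21)S6", "12.0(23)S2"]
below_baseline = [r for r in inventory if not meets_baseline(r)]
```

Note that this check tests only the 12.0(25)S baseline. The rebuilds listed earlier that carry Engine 3 fixes, such as 12.0(21)S6, are reported as below the baseline and need to be judged separately.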
To download Cisco IOS software to upgrade your Cisco 12000, go to the Cisco Software Center on Cisco.com.
For additional information about single event upset (SEU) failures, see the following Cisco online documents:
If you require further assistance, or if you have any further questions regarding this field notice, please contact the Cisco Systems Technical Assistance Center (TAC) by one of the following methods:
Product Alert Tool - Set up a profile to receive email updates about reliability, safety, network security, and end-of-sale issues for the Cisco products you specify.