Field Notice: FN 63651 - UCS-B M3-Series Blade Servers Might Get Memory Errors Due to Voltage Regulator Setting – Firmware Update Required

June 21, 2013


NOTICE:

THIS FIELD NOTICE IS PROVIDED ON AN "AS IS" BASIS AND DOES NOT IMPLY ANY KIND OF GUARANTEE OR WARRANTY, INCLUDING THE WARRANTY OF MERCHANTABILITY. YOUR USE OF THE INFORMATION ON THE FIELD NOTICE OR MATERIALS LINKED FROM THE FIELD NOTICE IS AT YOUR OWN RISK. CISCO RESERVES THE RIGHT TO CHANGE OR UPDATE THIS FIELD NOTICE AT ANY TIME.

Revision History

Revision   Date          Comment
1.0        21-JUN-2013   Initial Public Release

Products Affected

CBI-B200-M3-D 
UCSB-B200-M3
UCSB-B200-M3-CH 
UCSB-B200-M3-D 
UCSB-B200-M3-U 
UCSB-B200-M3=
UCSB-B22-M3
UCSB-B22-M3-CH 
UCSB-B22-M3-D 
UCSB-B22-M3-U 
UCSB-B22-M3=
UCSB-B420-M3 
UCSB-B420-M3-CH 
UCSB-B420-M3-D
UCSB-B420-M3-U 
UCSB-B420-M3= 
UCSB-DBUN-B200-301
UCSB-DBUN-B200-302
UCSB-DBUN-B200-303
UCSB-DBUN-B200-304 
UCSB-DBUN-B200-305
UCSB-DBUN-B22-361
UCSB-DBUN-B22-362 
UCSB-DBUN-B22-363 
UCSB-DBUN-B22-364 
UCSB-DBUN-B420-341
UCSB-DBUN-B420-342 
UCSB-DBUN-B420-343
UCSB-DBUN-B420-344 
UCS-EZ-B200M3
UCS-EZ-B200M3-256 
UCS-EZ-B200M3-384
UCS-EZ-ENSC-B200 
UCS-EZ-ENSP-B200M3
UCS-EZ-ENTS-B200M3 
UCS-EZ-ENTS-B22 
UCS-EZ-ENTS-B22M3
UCS-EZ-ENTS-B2M3
UCS-EZ-ENTV-B200 
UCS-EZ-ENTV-B200M3 
UCS-EZ-ENTV-B2M3
UCS-EZ-ENTV-B420M3 
UCS-EZ-ENVP-B200M3 
UCS-EZ-PERF-B200
UCS-EZ-PERF-B200M3 
UCS-EZ-PERF-B2M3
UCS-EZ-PR-FIO-B200 
UCS-EZ-VDI-B200PK
UCS-SP-ENSP-B200M3 
UCS-SP-ENTS-B200M3 
UCS-SP-ENTS-B22 
UCS-SP-ENTS-B22M3 
UCS-SP-ENTV-B200M3 
UCS-SP-ENVP-B200M3 
UCS-SP-PERF-B200 
UCS-SP-PERF-B200M3 
UCS-SP4-ENTS-B2M3 
UCS-SP4-ENTV-B2M3
UCS-SP4-PERF-B2M3
UCS-SP5-ENSC-B200 
UCS-SP5-ENTV-B200

Problem Description

A voltage regulator on affected UCS B-Series Blade Servers can cause an excessive voltage drop on the DDR3 power rails under light load. This can cause correctable or uncorrectable memory errors. A firmware update is recommended to improve the performance of the voltage regulator.

Background

A firmware setting for a voltage regulator on affected UCS B-Series Blade Servers can cause signal corruption which can lead to memory errors and failure to boot in some cases. This issue has been observed in the Cisco manufacturing facility as a failure to boot Linux OS. The problem is fixed in an updated firmware image for affected UCS Blade Servers. These Cisco bug IDs are related to this issue:

  • CSCug93076 - B200M3-DDR voltage regulator may have excessive noise under light load
  • CSCug93221 - B420M3-DDR voltage regulator may have excessive noise under light load
  • CSCug98662 - B22M3-DDR voltage regulator may have excessive noise under light load

Problem Symptoms

Memory errors caused by the excessive voltage drop could show up in these ways:

  • Increased correctable memory error rates might cause DIMMs to be mapped out at boot time or be flagged as Degraded or Inoperable.
  • Uncorrectable memory errors might cause DIMMs to be mapped out at boot time, or might cause the system to crash while the operating system boots or runs. A system crash is accompanied by a catastrophic error (CATERR) in the System Event Log (SEL).

Example:

Keyword to search for in the SEL: "Predictive Failure asserted | Asserted", often followed immediately by "Predictive Failure deasserted | Asserted".

91 | 05/31/2013 10:39:36 | CIMC | Processor CATERR_N #0x74 | Predictive Failure asserted | Asserted
92 | 05/31/2013 10:39:36 | CIMC | Platform alert LED_BLADE_STATUS #0x99 | LED color is amber | Asserted
93 | 05/31/2013 10:39:37 | CIMC | Processor CATERR_N #0x74 | Predictive Failure deasserted | Asserted
94 | 05/31/2013 10:39:37 | CIMC | Platform alert LED_BLADE_STATUS #0x99 | LED color is green | Asserted
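
The search described above can be automated once the SEL has been exported as text. The following is a minimal sketch, assuming the export uses the same six pipe-separated fields as the example; the function names and the `SAMPLE_SEL` constant are illustrative and are not part of any Cisco tool.

```python
# Sketch: find CATERR "Predictive Failure" assert/deassert pairs in SEL text.
# Assumption: each SEL entry is one line with six "|"-separated fields,
# exactly as in the example in this field notice.

SAMPLE_SEL = """\
91 | 05/31/2013 10:39:36 | CIMC | Processor CATERR_N #0x74 | Predictive Failure asserted | Asserted
92 | 05/31/2013 10:39:36 | CIMC | Platform alert LED_BLADE_STATUS #0x99 | LED color is amber | Asserted
93 | 05/31/2013 10:39:37 | CIMC | Processor CATERR_N #0x74 | Predictive Failure deasserted | Asserted
94 | 05/31/2013 10:39:37 | CIMC | Platform alert LED_BLADE_STATUS #0x99 | LED color is green | Asserted
"""


def parse_sel_line(line):
    """Split one SEL line into a dict, or return None if it does not fit."""
    fields = [field.strip() for field in line.split("|")]
    if len(fields) != 6:
        return None
    keys = ("id", "timestamp", "source", "sensor", "event", "state")
    return dict(zip(keys, fields))


def find_caterr_events(sel_text):
    """Return (asserted, deasserted_or_None) pairs for CATERR entries."""
    pairs = []
    pending = None  # last CATERR assertion not yet matched by a deassertion
    for line in sel_text.splitlines():
        record = parse_sel_line(line)
        if record is None or "CATERR" not in record["sensor"]:
            continue
        if record["event"] == "Predictive Failure asserted":
            if pending is not None:
                pairs.append((pending, None))
            pending = record
        elif record["event"] == "Predictive Failure deasserted" and pending:
            pairs.append((pending, record))
            pending = None
    if pending is not None:
        pairs.append((pending, None))
    return pairs


if __name__ == "__main__":
    for asserted, deasserted in find_caterr_events(SAMPLE_SEL):
        cleared = deasserted["timestamp"] if deasserted else "not cleared"
        print(f"CATERR at {asserted['timestamp']} (SEL {asserted['id']}), cleared: {cleared}")
```

A CATERR entry on its own does not prove that this voltage regulator issue is the cause. The sketch only locates entries that match the pattern shown in the example.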

Instructions on how to retrieve SEL information can be found here:

Workaround/Solution

The issue is resolved in these Cisco UCS B-Series software releases:

  • UCS B-Series Software Release 2.0(5c) or later
  • UCS B-Series Software Release 2.1(1f) or later

Note: A host power off and a Cisco Integrated Management Controller (CIMC) reboot are required for this fix to take effect. This is NOT a normal upgrade sequence. Please carefully read and follow the steps documented in the release note.
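
When auditing many blades, the release check can be scripted. The following is a minimal sketch, assuming release strings of the form `2.1(1f)`. The helper names are hypothetical, and the handling of release trains other than 2.0 and 2.1 is an assumption, not a statement from this field notice.

```python
import re

# Fixed releases named in this field notice, keyed by (major, minor) train.
FIXED_RELEASES = {(2, 0): (5, "c"), (2, 1): (1, "f")}

# Assumed format: major.minor(maintenance + optional lowercase patch letters)
_RELEASE = re.compile(r"^(\d+)\.(\d+)\((\d+)([a-z]*)\)$")


def parse_release(text):
    """Parse a string such as '2.1(1f)' into ((2, 1), (1, 'f'))."""
    match = _RELEASE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized release string: {text!r}")
    major, minor, maintenance, patch = match.groups()
    return (int(major), int(minor)), (int(maintenance), patch)


def has_regulator_fix(release):
    """Return True if the release is at or after a fixed release."""
    train, build = parse_release(release)
    if train in FIXED_RELEASES:
        # Within a train, compare the maintenance number, then patch letter.
        return build >= FIXED_RELEASES[train]
    # Assumption: trains newer than 2.1 include the fix; older ones do not.
    return train > max(FIXED_RELEASES)


if __name__ == "__main__":
    for release in ("2.0(4a)", "2.0(5c)", "2.1(1a)", "2.1(1f)"):
        print(release, has_regulator_fix(release))
```

Running a fixed release is not sufficient by itself: the host power off and CIMC reboot described in the note above are still required before the fix takes effect.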

UCS software can be downloaded from Cisco.com from this location: Download Software

UCS software upgrade instructions can be found at this location: UCS Install and Upgrade Guides

How To Identify Hardware Levels

Some UCS B-Series M3 Blade Servers have shipped from Cisco with updated voltage regulator firmware and an older software version than those noted in the Workaround/Solution section. These Blade Servers can be identified by a deviation sticker affixed near the front of the unit. These deviations apply to the issue described in this Field Notice:

  • B200 M3: D135723
  • B22 M3: D136090
  • B420 M3: D135856
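
If sticker numbers are recorded during a physical inventory, they can be checked against the list above. The following is a small sketch in which the function name and the model keys are illustrative choices; only the deviation numbers come from this field notice.

```python
# Deviation numbers listed in this field notice, keyed by blade model.
DEVIATIONS = {
    "B200 M3": "D135723",
    "B22 M3": "D136090",
    "B420 M3": "D135856",
}


def sticker_indicates_fix(model, sticker):
    """Return True if the recorded sticker matches the model's deviation."""
    expected = DEVIATIONS.get(model)
    return expected is not None and sticker.strip().upper() == expected


if __name__ == "__main__":
    print(sticker_indicates_fix("B200 M3", "d135723"))
    print(sticker_indicates_fix("B22 M3", "D135723"))
```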

Deviation Label examples:

A UCS B-Series M3 Blade Server with one of the versions noted in the Workaround/Solution section has the required fix for the issue described in this Field Notice.

In order to view the version installed on the UCS Blade Server in UCS Manager, navigate to this location for each blade:
UCSM > Chassis > Servers > Server > Installed Firmware

Example:

For More Information

If you require further assistance, or if you have any further questions regarding this field notice, please contact the Cisco Systems Technical Assistance Center (TAC) by one of the following methods:
