
Cisco UCS C-Series Rack Servers

Field Notice: FN - 63442 - LSI RAID Controller Chip Potential Premature Failure - Hardware Replacement Required

September 21, 2011


NOTICE:

THIS FIELD NOTICE IS PROVIDED ON AN "AS IS" BASIS AND DOES NOT IMPLY ANY KIND OF GUARANTEE OR WARRANTY, INCLUDING THE WARRANTY OF MERCHANTABILITY. YOUR USE OF THE INFORMATION ON THE FIELD NOTICE OR MATERIALS LINKED FROM THE FIELD NOTICE IS AT YOUR OWN RISK. CISCO RESERVES THE RIGHT TO CHANGE OR UPDATE THIS FIELD NOTICE AT ANY TIME.

Revision History

Revision  Date          Comment
1.0       21-SEPT-2011  Initial Public Release

Products Affected

Products Affected   Comments
MDE-RAID01-CTRL     Some affected units installed as option to MDE-1100-K9 Media Delivery Engine 1100
DMS-PCIE-RAID       Some affected units installed as option to SNS-SVR-C200WG-K9 or SNS-SVR-C200WG-K9= DMS Show and Share Server
N20-B6625-1
N20-B6625-1-UPG
N20-B6625-1=
N20-B6625-1D
R2X0-ML002          Some affected units installed as option under these PIDs: N1K-C1010, R200-1120402W, R200-BUN-1, R200-BUN-2, R200-BUN-3, R200-BUN-4, R210-2121605W, R210-BUN-2, R210-STAND-CNFGW, UCS-SP2-C210V, UCS-SP-C210E
R2X0-ML002=
UCS-B200M2-VCS1

Problem Description

A quality problem exists in the LSI 1064e RAID controller installed in certain Cisco products, which can cause the controller to fail prematurely.

Affected units were shipped from Cisco Systems between June 2, 2011, and July 8, 2011, and are identifiable by serial number. Cisco will replace these units free of charge.

Background

Cisco is replacing LSI RAID controllers based on a recommendation from LSI. Affected units are either permanently installed on server blades or are in mezzanine form factor in rack servers.

Certain lots shipped to Cisco are affected by a quality issue. During the manufacturing process, a problem with a plasma cleaner caused a percentage of products to develop an intermetallic crack (IMC) at the interface between the BGA ball and package pad. The IMC is not isolated to a specific pin and can result in an open circuit on any pin.

Failure rate projection is up to 3% in the first 6 months of use. There is a 3% chance of data corruption if a failure occurs due to the described issue.
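
The two percentages above can be combined into a rough exposure estimate. The following is a minimal sketch, not a Cisco tool: the function name and the fleet size are hypothetical, and because the 3% failure figure is stated as "up to", the results are upper-bound estimates.

```python
def expected_impact(fleet_size, p_failure=0.03, p_corruption_given_failure=0.03):
    """Estimate upper-bound impact for a fleet of affected controllers.

    p_failure: projected failure rate in the first 6 months (up to 3%).
    p_corruption_given_failure: chance of data corruption if a failure occurs (3%).
    Returns (expected failures, expected data-corruption events).
    """
    # Joint probability that a unit fails AND the failure corrupts data.
    p_corruption = p_failure * p_corruption_given_failure
    return fleet_size * p_failure, fleet_size * p_corruption


if __name__ == "__main__":
    # Hypothetical fleet of 1,000 affected units.
    failures, corruptions = expected_impact(1000)
    print(f"expected failures: {failures:.1f}")
    print(f"expected corruption events: {corruptions:.2f}")
```

Per affected unit, the combined chance of a failure that also corrupts data is therefore at most 0.03 x 0.03 = 0.0009, or 0.09%.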

Problem Symptoms

In most cases, a 1064e controller from a defective lot stops functioning and fails prematurely.

Workaround/Solution

Replace the affected hardware. Please follow instructions in the How to Identify Hardware Levels section to determine if you have an affected RAID Controller mezzanine card or B200 M2 blade server. Use the form below to submit an order for your replacement. The serial number of the affected RAID controller mezzanine card or B200 M2 blade server is required on the order form.

How To Identify Hardware Levels

Affected blade servers were shipped from Cisco Systems between June 9, 2011, and July 8, 2011.

Affected mezzanine cards were shipped from Cisco Systems between June 2, 2011, and June 16, 2011.
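
As a first-pass screen, the two ship windows above can be encoded as a simple date check. This is a minimal sketch with hypothetical names. It assumes both end dates are inclusive, which the notice does not state explicitly. A ship date inside the window only marks a unit as a candidate; the serial number validation described below remains the deciding check.

```python
from datetime import date

# Ship windows taken from the notice; treating both ends as inclusive is an assumption.
SHIP_WINDOWS = {
    "blade": (date(2011, 6, 9), date(2011, 7, 8)),
    "mezzanine": (date(2011, 6, 2), date(2011, 6, 16)),
}


def in_affected_ship_window(unit_type, ship_date):
    """Return True if ship_date falls inside the affected window for unit_type."""
    start, end = SHIP_WINDOWS[unit_type]
    return start <= ship_date <= end


if __name__ == "__main__":
    print(in_affected_ship_window("blade", date(2011, 6, 20)))
    print(in_affected_ship_window("mezzanine", date(2011, 6, 20)))
```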

The procedures below describe how to identify specific affected units.

Blade Server Identification

  1. Confirm that your blade server(s) is identified by one of the following product IDs: N20-B6625-1, N20-B6625-1=, N20-B6625-1D, N20-B6625-1-UPG, or UCS-B200M2-VCS1
  2. Log the serial number(s) of the potentially affected blade server(s) for validation. The serial number and product ID can be retrieved using one of the following methods:
    • Method 1: Physically inspect the blade server. Product information is displayed on a sticker on the bottom of the sheet metal blade housing, on a pullout tab on the face of the unit, or on a sticker directly on the face of the unit.
    • Method 2: Log in to UCS Manager, and navigate to Equipment > chassis number > blade number. The product ID and serial number are displayed (as shown in this image):
  3. Use the Cisco Blade Server Serial Number Validation Tool to determine whether the blade server serial number(s) is affected.
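
For sites with many blades, steps 1 and 2 above can be scripted against an inventory list before the serial numbers are entered into the validation tool. The sketch below is illustrative only: the inventory format (pairs of product ID and serial number) and the sample serial numbers are hypothetical, and the script does not replace the validation tool.

```python
# Blade server product IDs listed in step 1 of the notice.
AFFECTED_BLADE_PIDS = {
    "N20-B6625-1",
    "N20-B6625-1=",
    "N20-B6625-1D",
    "N20-B6625-1-UPG",
    "UCS-B200M2-VCS1",
}


def blade_serials_to_validate(inventory):
    """Return serial numbers of blades whose product ID is on the affected list.

    inventory: iterable of (product_id, serial_number) pairs (hypothetical format).
    """
    return [
        serial.strip()
        for pid, serial in inventory
        if pid.strip() in AFFECTED_BLADE_PIDS
    ]


if __name__ == "__main__":
    # Made-up inventory entries for illustration.
    sample = [
        ("N20-B6625-1", "SERIAL0001"),
        ("OTHER-PID", "SERIAL0002"),
        ("UCS-B200M2-VCS1", "SERIAL0003"),
    ]
    for serial in blade_serials_to_validate(sample):
        print(serial)
```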

LSI 1064e RAID Controller Mezzanine Cards Identification

  1. LSI 1064e RAID controllers sold as spares by Cisco have the product ID and serial number labeled on the box. If you have product ID R2X0-ML002=, check the box and note the serial number for validation with the tool below. The following image illustrates product ID and serial number locations on the box label:
  2. LSI 1064e RAID controller mezzanine cards installed in Cisco rack servers do not have an electronically identifiable serial number, so they must be physically inspected to determine the serial number. This operation requires removal of the chassis cover. To reduce this disruptive activity as much as possible, you can check the chassis serial number to validate whether the server was originally shipped with an affected LSI 1064e RAID controller. The mezzanine card must still be physically inspected, but checking the chassis serial number first avoids opening a server unnecessarily. The first step in checking a rack server is to confirm that it has one of these affected product IDs: N1K-C1010, R200-1120402W, R200-BUN-1, R200-BUN-2, R200-BUN-3, R200-BUN-4, R210-2121605W, R210-BUN-2, R210-STAND-CNFGW, UCS-SP2-C210V, or UCS-SP-C210E.
  3. For affected rack server(s), note the chassis serial number(s) for validation. The serial number and product ID can be retrieved using one of the following methods:
    • Method 1: Physically inspect the rack server. Product information is displayed on a sticker on the bottom of the unit. The serial number is also on a sticker placed on the left front mounting ear.
    • Method 2: Log in to the Cisco Integrated Management Controller (CIMC), and note serial number information displayed on the summary page (as shown in this image):
    • Method 3: For systems integrated with UCS Manager (UCSM) environments, log in to UCS Manager, and navigate to Equipment > Rack-Mounts > rack server name. The product ID and serial number are displayed (as shown in this image):
    • Method 4: Server chassis product IDs and serial numbers can be retrieved via the command-line interface using these commands: scope chassis and show detail. For example:
           [servername]$ ssh server_address
           password: <password>
           servername# scope chassis
           servername /chassis # show detail
           Chassis:
           Power: on
           Serial Number: QCI140205ZZ
           Product Name: UCS C210 M2
           PID : R210-2121605W
           UUID: F2A5E738-D8FE-DE11-76AE-8843E138ADA4
           Locator LED: off
           Description:
           Power Restore Policy: power-off
           Power Delay Type: fixed
           Power Delay Value(sec): 0
  4. Use the Cisco Rack Server Serial Number Validation Tool to check the rack server serial numbers to see if they were shipped with an affected LSI 1064e RAID controller.
  5. Rack servers that are possibly affected will be identified as affected by the validation tool. For rack servers that are possibly affected, the cover must be removed, and the LSI 1064e RAID controller serial number must be retrieved in order to make final validation of whether it is affected. Instructions for access to the mezzanine card can be found in the Installing a Mezzanine Card section of these Cisco documents:

    These images show the location of the LSI 1064e mezzanine card within the rack server chassis:

    The HDD cable must be temporarily removed in order to view the serial number. These images show the serial number location on the LSI 1064e RAID controller mezzanine card:

  6. After collection of potentially affected LSI 1064e RAID controller mezzanine card serial numbers, confirm whether the units are affected by entering the serial number(s) into the LSI 1064e Mezzanine Card Serial Number Validation Tool. Affected units can be ordered using the form below.
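
When the chassis details are collected from the command-line interface (Method 4 above), the saved output can be screened with a short script. The sketch below assumes the output has been captured as text in the "Field: value" layout of the example above. The function names are hypothetical. It only picks out chassis whose product ID is on the affected list, so the validation tools and the physical inspection described above are still required.

```python
# Rack server product IDs listed in the notice.
AFFECTED_RACK_PIDS = {
    "N1K-C1010", "R200-1120402W", "R200-BUN-1", "R200-BUN-2", "R200-BUN-3",
    "R200-BUN-4", "R210-2121605W", "R210-BUN-2", "R210-STAND-CNFGW",
    "UCS-SP2-C210V", "UCS-SP-C210E",
}


def parse_show_detail(text):
    """Parse 'Field: value' lines from captured 'show detail' output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon, such as shell prompts
            fields[key.strip()] = value.strip()
    return fields


def chassis_serial_if_candidate(text):
    """Return the chassis serial number if the PID is on the affected list, else None."""
    fields = parse_show_detail(text)
    if fields.get("PID") in AFFECTED_RACK_PIDS:
        return fields.get("Serial Number")
    return None


if __name__ == "__main__":
    # Excerpt of the example output shown in the notice.
    captured = """\
Chassis:
Power: on
Serial Number: QCI140205ZZ
Product Name: UCS C210 M2
PID : R210-2121605W
"""
    print(chassis_serial_if_candidate(captured))
```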

Upgrade Program

R2X0-ML002=, N20-B6625-1=
If you want to check the status of a previously booked upgrade, refer to the Status Tool (note that you must have a sales order number in addition to a CCO user ID and password to access this site): http://tools.cisco.com/qtc/status/tool/action/LoadOrderQueryScreen

For questions about an upgrade order already submitted, send an email with your order number in the subject line to: upgrades-support@cisco.com

You will receive a response within 24 hours, Monday through Friday, excluding US holidays.

Note: Fields marked with an asterisk (*) are required fields.

Requestor Information
  *Name
  *E-mail Address
  TAC SR Number

Customer Shipping Information
  *Company
  *Address
  Address line 2
  *City
  *State/Province
  *ZIP/Postal Code
  *Country

Product (*Quantity and *Serial# required for each; see note 2)
  R2X0-ML002=
  N20-B6625-1=

Customer Contact Information
  *First Name
  *Last Name
  *Phone (see note 1), Ext.
  Fax (see note 1), Ext.
  *E-Mail
    Please use the following format: user@domain.com
  *Upgrade Order Reference Number
    Please provide a number that you can use when inquiring about order status

Notes
1 For phone and fax, include 011 and the country code outside North America.

2 The serial number input field for each Product ID can hold up to 4,000 characters, including commas and white space. For longer lists of serial numbers, please submit additional requests.

3 For customers in Japan only: please enter the building and floor in the address field. Also enter the contact person's name, telephone number, and e-mail address in the appropriate fields.
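
Note 2 limits each serial number field to 4,000 characters, including commas and white space. For long lists, the serial numbers can be split into groups that each fit one request. This is a minimal sketch with hypothetical names. The separator of a comma followed by a space is an assumption, since the notice says only that commas and white space count toward the limit.

```python
def chunk_serials(serials, limit=4000, sep=", "):
    """Group serial numbers into separator-joined strings of at most `limit` characters.

    Each returned string is intended for one order request.
    """
    chunks = []
    current = ""
    for serial in serials:
        if len(serial) > limit:
            raise ValueError("a single serial number exceeds the field limit")
        candidate = serial if not current else current + sep + serial
        if len(candidate) > limit:
            # The current group is full: close it and start a new one.
            chunks.append(current)
            current = serial
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    # Made-up serial numbers for illustration.
    serials = [f"SN{i:08d}" for i in range(1000)]
    for n, chunk in enumerate(chunk_serials(serials), start=1):
        print(f"request {n}: {len(chunk)} characters")
```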

For More Information

If you require further assistance, or if you have any further questions regarding this field notice, please contact the Cisco Systems Technical Assistance Center (TAC) by one of the following methods:

Receive Email Notification For New Field Notices

Cisco Notification Service—Set up a profile to receive email updates about reliability, safety, network security, and end-of-sale issues for the Cisco products you specify.