Field Notice: FN - 63737 - UCS C-Series Rack Servers - 1 TB SATA HDD Crash - HDD Replacement Required

March 5, 2014


NOTICE:

THIS FIELD NOTICE IS PROVIDED ON AN "AS IS" BASIS AND DOES NOT IMPLY ANY KIND OF GUARANTEE OR WARRANTY, INCLUDING THE WARRANTY OF MERCHANTABILITY. YOUR USE OF THE INFORMATION ON THE FIELD NOTICE OR MATERIALS LINKED FROM THE FIELD NOTICE IS AT YOUR OWN RISK. CISCO RESERVES THE RIGHT TO CHANGE OR UPDATE THIS FIELD NOTICE AT ANY TIME.

Revision History

Revision  Date         Comment
1.0       05-MAR-2014  Initial Public Release

Products Affected

A03-D1TBSATA

Problem Description

The 1 TB Hard Disk Drives (HDDs) in a Unified Computing System (UCS) C-Series Rack Server might fail. Customers might first observe data errors; eventually the HDD stops functioning.

Background

1 TB HDDs from one vendor (bounded by a serial number range) that are installed in Cisco UCS C-Series Rack Servers could fail at a higher-than-expected rate. The cause is a quality issue with the internal crash stop, which shortens the operational lifespan of the HDD.

The A03-D1TBSATA HDD is hot swappable, so Cisco recommends that you hot swap the drives in order to eliminate any issues due to this HDD failure. Complete these steps in order to hot swap a drive:

  1. Stop all I/Os to the drive.
  2. Slightly pull the drive from the system connector.
  3. Let the drive sit in the system for 30 seconds while the spindle motor spins down and the heads park.
  4. Remove the drive from the system and install the replacement.

If you are not able to hot swap HDDs, the entire server must be powered down. Power down the server only when the replacement drive is ready to install (for example, with its mounting brackets in place).

Tests suggest that UCS C-Series Rack Servers that run in normal operation between 35° C and 50° C will not exhibit the issue.

In summary:

  • Any drive affected by the crash stop issue (identified by serial number) that is powered down runs the risk that it will not come back up. Do not power down drives unless you have to.
  • If you have to power down a drive, it is less likely to exhibit an unlatch DNR the warmer it is and the shorter the time it stays powered off. This approach has been very successful at temperatures of 35° C or higher.

Problem Symptoms

Cisco was informed by its supplier, Seagate, that Seagate's supplier of the crash stops used in the 1 TB disk drives had a marginal lot of crash stops that were too "sticky". This causes the actuator to fail on occasion.

Workaround/Solution

The A03-D1TBSATA HDD is hot swappable, so Cisco recommends that you hot swap the drives in order to eliminate any issues due to this HDD failure. Complete these steps in order to hot swap a drive:

  1. Stop all I/Os to the drive.
  2. Slightly pull the drive from the system connector.
  3. Let the drive sit in the system for 30 seconds while the spindle motor spins down and the heads park.
  4. Remove the drive from the system and install the replacement.

If you are unable to hot swap HDDs, the entire server must be powered down. Power down the server only when the replacement drive is ready to install (for example, with its mounting brackets in place).

See the How to Identify Hardware Levels section in order to determine if you have an affected HDD.

How To Identify Hardware Levels

Affected hard disk drives are identified by serial number, and can be used in Cisco UCS B-Series Blade Servers and UCS C-Series Rack Servers. If you have the affected drive installed on a Cisco UCS server, you can retrieve the serial number from your drive and use the FN 63737 Serial Number Validation Tool link in order to confirm whether the unit is on the list of affected units.

Check the HDD Serial Number on UCS B-Series Blade Servers

Note: This method also applies to UCS C-Series Rack Servers that use integrated UCS Manager (UCSM) management.

  1. Log in to the UCSM.
  2. Choose Chassis > Server > Inventory > Storage.
  3. Note the serial number for each hard disk drive that matches a Product ID (PID) noted in the Products Affected section of this Field Notice. If the displayed serial number is longer than 8 alphanumeric characters, note only the first 8 characters.

  4. If desired, the CLI can also be used to capture hard disk drive serial numbers. See this example:

    show local-disk detail expand

  5. After you collect the serial numbers from potentially affected hard drives, use the FN 63737 Serial Number Validation Tool in order to determine if the HDDs are affected.
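The trimming rule in step 3 (keep only the first 8 characters of a longer serial number) can be sketched in Python before the list is submitted to the validation tool. This is a minimal sketch, not a Cisco utility: the function names `normalize_serial` and `collect_serials` are hypothetical, and the longer serial number shown in the usage note is invented for illustration.

```python
def normalize_serial(raw, length=8):
    """Strip surrounding whitespace and keep only the first `length` characters."""
    return raw.strip()[:length]


def collect_serials(raw_serials):
    """Normalize each serial number and drop blanks and duplicates, keeping order."""
    seen = set()
    result = []
    for raw in raw_serials:
        serial = normalize_serial(raw)
        if serial and serial not in seen:
            seen.add(serial)
            result.append(serial)
    return result
```

For example, `collect_serials([" 9XG2K0YP ", "9XG2K0YP0000"])` reduces both entries to the single 8-character value `9XG2K0YP`.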

Check the HDD Serial Number on UCS C-Series Rack Servers

  1. Log in to the Cisco Integrated Management Controller (CIMC).
  2. Choose Inventory > Storage > Physical Drive Info.
  3. For C-Series CIMC users, the manufacturer model number is displayed in the Product ID field instead of the orderable Product ID. The model number of the affected drive is ST91000640NS.

  4. If desired, the CLI can also be used to capture hard disk drive information. See this CLI example:
    test-system /chassis/storageadapter # scope physical-drive 12
    test-system /chassis/storageadapter/physical-drive # show detail
    Physical Drive Number 12:
    Controller: SAS
    Health: Good
    Status: Unconfigured Good
    Manufacturer: ATA
    Model: ST91000640NS
    Predictive Failure Count: 0
    Drive Firmware: CC02
    Coerced Size: 952720 MB
    Type: HDD
    test-system /chassis/storageadapter/physical-drive # show inquiry-data
    Physical Drive Number 12:
    Controller: SAS
    Info Valid: Yes
    Info Invalid Cause:
    Vendor: ATA
    Product ID: ST91000640NS
    Drive Firmware: CC02
    Drive Serial Number: 9XG2K0YP
  5. Note the serial numbers of any HDDs that have a Model/Product ID of ST91000640NS.
  6. After you collect the serial numbers from potentially affected hard drives, use the FN 63737 Serial Number Validation Tool in order to determine if the HDDs are affected.
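When many drives must be checked, the captured `show inquiry-data` output can be filtered with a short script. The sketch below assumes the output has been saved as text in the "Key: value" layout shown in step 4; the function names `parse_inquiry_data` and `affected_candidates` are hypothetical and are not part of the CIMC CLI. It only lists drives with the affected model number; the FN 63737 Serial Number Validation Tool remains the authority on whether a given serial number is affected.

```python
AFFECTED_MODEL = "ST91000640NS"  # model number given in this Field Notice

# Sample text copied from the CLI example in step 4.
SAMPLE_OUTPUT = """\
Physical Drive Number 12:
Controller: SAS
Info Valid: Yes
Info Invalid Cause:
Vendor: ATA
Product ID: ST91000640NS
Drive Firmware: CC02
Drive Serial Number: 9XG2K0YP
"""


def parse_inquiry_data(text):
    """Return one dict of 'Key: value' pairs per 'Physical Drive Number N:' block."""
    drives = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Physical Drive Number"):
            number = line[len("Physical Drive Number"):].strip().rstrip(":")
            current = {"Physical Drive Number": number}
            drives.append(current)
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return drives


def affected_candidates(text):
    """Return serial numbers of drives whose Product ID is the affected model."""
    return [
        drive["Drive Serial Number"]
        for drive in parse_inquiry_data(text)
        if drive.get("Product ID") == AFFECTED_MODEL
        and drive.get("Drive Serial Number")
    ]
```

Calling `affected_candidates(SAMPLE_OUTPUT)` yields the serial number of drive 12, which can then be entered in the validation tool.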

Upgrade Program

You will receive an email within a couple of days with an Order Number. If you need the status of your order, follow these directions:

If you were given a Sales Order Number for the shipment of your replacement parts, please refer to the SO Status Tool (Please note: you must have a CCO User ID and Password to access this site): http://tools.cisco.com/qtc/status/tool/action/LoadOrderQueryScreen

If you were given an RMA Number for the shipment of your replacement parts, please refer to the "Service Order QuickSearch" Tool at the following location (Please note: you must have a CCO User ID and Password to access this site): http://tools.cisco.com/support/serviceordertool/home.svo

If you have not received an email with an Order Number after 3 business days, please send an email with your Request Number, Customer Name and PID in the subject line to:
umpire-escalations@cisco.com

Note: Fields marked with an asterisk (*) are required fields.

Requestor Information

  • *Name
  • *E-mail Address
  • TAC SR Number

Customer Shipping Information

  • *Company
  • *Address
  • Address line 2
  • *City
  • *State/Province
  • *ZIP/Postal Code
  • *Country

Product

  • Product: A03-D1TBSATA=
  • *Quantity
  • *Serial# (see note 2)

Customer Contact Information

  • *First Name
  • *Last Name
  • *Phone and Ext. (see note 1)
  • Fax and Ext. (see note 1)
  • *E-Mail (use the format user@domain.com)
  • *Upgrade Order Reference Number (provide a number that you can use when you inquire about order status)

Notes

1 For phone and fax, include 011 and the country code outside North America.

2 The serial number input field for each Product ID can hold up to 4,000 characters, including commas and white space. For longer lists of serial numbers, please submit additional requests.

3 For customers in Japan only: enter the building and the floor in the address field. Also enter the contact person's name, telephone number, and e-mail address in the appropriate fields.
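For sites with many affected drives, the 4,000-character limit in note 2 means a long serial number list has to be split across several requests. The Python sketch below shows one way to do that split. The function name `batch_serials` and the comma-plus-space separator are assumptions for illustration, not requirements stated by the form.

```python
def batch_serials(serials, limit=4000, sep=", "):
    """Group serial numbers into joined strings no longer than `limit` characters.

    The separator counts toward the limit, as note 2 states for commas and
    white space. A single entry longer than `limit` is passed through as its
    own batch, which cannot happen with 8-character serial numbers.
    """
    batches = []
    current = ""
    for serial in serials:
        candidate = serial if not current else current + sep + serial
        if len(candidate) > limit and current:
            batches.append(current)  # current batch is full; start a new one
            current = serial
        else:
            current = candidate
    if current:
        batches.append(current)
    return batches
```

Each returned string fits in one serial number input field, so each one corresponds to one request.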

For More Information

If you require further assistance, or if you have any further questions regarding this field notice, please contact the Cisco Systems Technical Assistance Center (TAC).

Receive Email Notification For New Field Notices

Cisco Notification Service—Set up a profile to receive email updates about reliability, safety, network security, and end-of-sale issues for the Cisco products you specify.