THIS FIELD NOTICE IS PROVIDED ON AN "AS IS" BASIS AND DOES NOT IMPLY ANY KIND OF GUARANTEE OR WARRANTY, INCLUDING THE WARRANTY OF MERCHANTABILITY. YOUR USE OF THE INFORMATION ON THE FIELD NOTICE OR MATERIALS LINKED FROM THE FIELD NOTICE IS AT YOUR OWN RISK. CISCO RESERVES THE RIGHT TO CHANGE OR UPDATE THIS FIELD NOTICE AT ANY TIME.
Revision | Publish Date | Comments |
---|---|---|
1.0 | 12-May-15 | Initial Release |
10.0 | 13-Oct-17 | Migration to new field notice system |
10.1 | 21-May-18 | Fixed Broken Image Links |
10.2 | 08-Apr-20 | Removed the How to Identify Affected Products Section |
Affected Product ID | Comments |
---|---|
UCSW-WT-M128SSD | |
UCSW-WT-M256SSD | |
UCSW-WT-M512SSD | |
Defect ID | Headline |
---|---|
CSCut95465 | systems with Micron C400 FW 070H have b/t 3 to 22 drives fall off RAID |
An issue with Solid State Drives (SSDs) on Whiptail Invicta systems causes the drives to become inaccessible after 12 or more hours of idle time. The issue is only seen on legacy Whiptail systems and is not found on current Invicta systems shipped by Cisco.
Micron's SSD Product Engineering team recently identified a possible failure condition for C400 drives. When a C400 drive has been idle for 12.25 hours, it starts a periodic self-test media scan, which is used to verify data integrity. The firmware writes the results of the self-test into a DRAM buffer. When the buffer is full, the firmware commits the self-test log to the NAND.
The failure is timing dependent. If a host data request interrupts the periodic self-test after the DRAM buffer allocated to the self-test log has filled, but before the buffer-full check runs, the core responsible for host commands takes over the sequencer. When the core responsible for internal drive operations then checks the buffer and finds it full, it checks whether the sequencer is available to commit the log to NAND. At that point the sequencer is still busy servicing the host command. Rather than wait for the sequencer to become available again, the drive's firmware halts drive operations.
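The ordering of events described above can be modeled in a few lines. The following Python sketch is purely illustrative: the class `C400Model`, its fields, and its method names are hypothetical and do not represent Micron's firmware. It only reproduces the sequence stated in this notice, under the assumption that the sequencer has a single owner at a time.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class C400Model:
    """Hypothetical toy model of the event ordering described in the notice."""
    log_buffer_full: bool = False
    sequencer_owner: Optional[str] = None  # "host" or None (free)
    halted: bool = False

    def host_request(self) -> None:
        # The host-command core takes over the sequencer.
        if self.halted:
            raise RuntimeError("drive is unresponsive until it is power-cycled")
        self.sequencer_owner = "host"

    def host_request_done(self) -> None:
        # The host command completes and releases the sequencer.
        if self.sequencer_owner == "host":
            self.sequencer_owner = None

    def internal_buffer_check(self) -> None:
        # The internal-operations core checks the self-test log buffer.
        if not self.log_buffer_full:
            return
        if self.sequencer_owner is None:
            # Sequencer is free: commit the self-test log to NAND.
            self.log_buffer_full = False
        else:
            # Sequencer is busy: per the notice, the firmware halts
            # drive operations instead of waiting.
            self.halted = True


def halts_after(interrupt_before_check: bool) -> bool:
    """Return True if the modeled drive halts for the given event ordering."""
    drive = C400Model(log_buffer_full=True)
    if interrupt_before_check:
        drive.host_request()
    drive.internal_buffer_check()
    return drive.halted
```

In this model, `halts_after(True)` reproduces the failure case, while `halts_after(False)` lets the log commit and the drive keeps running. Once `halted` is set, any further `host_request()` raises, which mirrors the drive being unresponsive until it is power-cycled.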
When the firmware failure occurs, the drive is unresponsive to the host until the drive is power-cycled. This causes one or more drives to drop out of the RAID, which results in a loss of availability of the Invicta storage.
The drives become inaccessible after 12 or more hours of no activity. The issue can affect multiple drives at once. The issue is not seen on systems with normal usage patterns and has only been observed on non-production or lab systems with long idle periods.
Recovery of the drives requires a power reset of the Invicta system. In order to avoid this issue, prevent the system from sitting idle, either through normal use or by implementing a cron job that touches files on a regular basis. Customers who have systems with long idle times should contact Cisco TAC for assistance in setting up the cron job workaround.
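As a rough illustration of the cron job idea, the Python sketch below rewrites a small file and forces the write to storage. It is not the Cisco TAC procedure. The script name, the file path, and the one-hour interval are all assumptions chosen only to stay well below the 12.25-hour idle threshold. Whether a single small write reaches every drive in the RAID depends on the storage layout, which this notice does not describe.

```python
import os
import sys
import time
from pathlib import Path

# Idle time after which the C400 self-test starts, per this notice.
IDLE_THRESHOLD_S = 12.25 * 3600
# Assumed keep-alive interval (hypothetical; not specified by the notice).
KEEPALIVE_INTERVAL_S = 3600


def touch_keepalive(path: Path) -> None:
    """Rewrite a small timestamp file and flush it to the storage device."""
    with open(path, "w") as f:
        f.write(f"{time.time():.0f}\n")
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the write to the device


def interval_is_safe(interval_s: float,
                     threshold_s: float = IDLE_THRESHOLD_S) -> bool:
    """True if the keep-alive interval is shorter than the idle threshold."""
    return interval_s < threshold_s


if __name__ == "__main__":
    # Hypothetical crontab entry, run hourly (paths are placeholders):
    #   0 * * * * /usr/bin/python3 /path/to/keepalive.py /path/on/invicta/.keepalive
    if len(sys.argv) > 1:
        touch_keepalive(Path(sys.argv[1]))
```

The target file must reside on the affected Invicta storage for the write to count as host activity on those drives. A file on a local boot disk would not help.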
If you require further assistance, or if you have any further questions regarding this field notice, please contact the Cisco Systems Technical Assistance Center (TAC) by one of the following methods:
Cisco Notification Service—Set up a profile to receive email updates about reliability, safety, network security, and end-of-sale issues for the Cisco products you specify.