Cisco UCS C-Series Rack Servers

UCS C-Series Rack Servers CLI Commands for Troubleshooting HDD Issues

Document ID: 115025

Updated: Dec 07, 2012

Contributed by Andreas Nikas, Cisco TAC Engineer.

Introduction

This document describes several command-line interface (CLI) commands, as well as other techniques, that help you troubleshoot hard disk drive (HDD) issues. The best way to troubleshoot HDD issues is to use the LEDs, GUI, BIOS, LSI Option ROM / MegaRAID GUI, and logs. When these options are not available, you can use the CLI instead.

Prerequisites

Requirements

There are no specific requirements for this document.

Components Used

This document is not restricted to specific software and hardware versions.

The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, make sure that you understand the potential impact of any command.

Conventions

Refer to Cisco Technical Tips Conventions for more information on document conventions.

CLI Commands

Show the Product Name

Note: Some of the commands listed in this document require an LSI MegaRAID controller. The LSI 1064/1068E controllers do not support all of them.

Enter the show pci-adapter command in order to view the product name. This example shows an LSI 1064e adapter.

ucs-c2xx-m1 /chassis #show pci-adapter 
Slot Vendor ID  Device ID  SubVendor ID  SubDevice ID  Product Name  
---- ---------  ---------  ------------  ------------  ------------------------ 
M    0x1000     0x0056     0x152d        0x896d        Cisco LSI 1064E Mezzan...

Show the HDD Status

Enter the show hdd command in order to view the status of the HDDs.

ucs-c2xx-m1 /chassis #show hdd
Name                    Status               
--------------------    -------------------- 
HDD_01_STATUS           present              
HDD_02_STATUS           absent               
HDD_03_STATUS           absent               
HDD_04_STATUS           absent

Show the Virtual and Physical Drive Status

Enter the scope storageadapter command in order to select the storage adapter, and then enter the show virtual-drive command in order to view the status of the virtual drives. This command is useful because you do not need to shut down the server and enter the BIOS to view the information.

ucs-c210-m2/chassis #scope storageadapter SLOT-5

ucs-c210-m2/chassis/storageadapter #show virtual-drive
Virtual Drive   Status              Name                   Size       RAID Level 
--------------  ------------------  ---------------------- ---------  ---------- 
0               Optimal                                    139236 MB  RAID 1
1               Degraded                                   974652 MB  RAID 5

Enter the show physical-drive command in order to view the status of the physical drives.

ucs-c210-m2 /chassis/storageadapter #show physical-drive

                                                  Predictive
Slot                                              Failure    Drive    Coerced
Number Controller Status Manufacturer Model       Count      Firmware Size      Type  
------ ---------- ------ ------------ ----------- ---------- -------- --------- ---- 
0      SLOT-5
1      SLOT-5     online SEAGATE      ST9146852SS 0          0005     139236 MB HDD 
2      SLOT-5     online SEAGATE      ST9146852SS 0          0005     139236 MB HDD   
3      SLOT-5     online SEAGATE      ST9146852SS 0          0005     139236 MB HDD   
4      SLOT-5     online SEAGATE      ST9146852SS 0          0005     139236 MB HDD   
5      SLOT-5     online SEAGATE      ST9146852SS 0          0005     139236 MB HDD   
6      SLOT-5     online SEAGATE      ST9146852SS 0          0005     139236 MB HDD   
7      SLOT-5     online SEAGATE      ST9146852SS 0          0005     139236 MB HDD   
9      SLOT-5     online SEAGATE      ST9146852SS 0          0005     139236 MB HDD   
10     SLOT-5     online SEAGATE      ST9146852SS 0          0005     139236 MB HDD
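
When this output is captured to a text file, a short script can pick out the slots that deserve a closer look. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that the columns are separated by whitespace and that the manufacturer and model fields contain no spaces, as in the sample above. A slot that is listed without any details (such as slot 0 here) is also reported, because the output does not say why it is empty.

```python
SAMPLE = """\
Number Controller Status Manufacturer Model       Count      Firmware Size      Type
------ ---------- ------ ------------ ----------- ---------- -------- --------- ----
0      SLOT-5
1      SLOT-5     online SEAGATE      ST9146852SS 0          0005     139236 MB HDD
2      SLOT-5     online SEAGATE      ST9146852SS 0          0005     139236 MB HDD
"""


def parse_physical_drives(text):
    """Return one dict per drive row of captured 'show physical-drive' output."""
    drives = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # header, separator, or blank line
        drive = {"slot": int(fields[0]), "status": None, "predictive_failures": None}
        if len(fields) >= 6:
            # Assumed column order: slot, controller, status, manufacturer,
            # model, predictive failure count, ...
            drive["status"] = fields[2]
            drive["predictive_failures"] = int(fields[5])
        drives.append(drive)
    return drives


def slots_needing_attention(drives):
    """Slots that are not 'online' or that report predictive failures."""
    return [
        d["slot"]
        for d in drives
        if d["status"] != "online" or (d["predictive_failures"] or 0) > 0
    ]


if __name__ == "__main__":
    print(slots_needing_attention(parse_physical_drives(SAMPLE)))
```

The script only filters the text. It does not query the controller, so the result is only as current as the captured output.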

Show the Number of Correctable and Uncorrectable Errors

Enter the show error-counters command in order to view the number of correctable and uncorrectable memory errors that the storage adapter reports.

ucs-c210-m2 /chassis/storageadapter #show error-counters 

PCI Slot SLOT-5:

    Memory Correctable Errors: 0

    Memory Uncorrectable Errors: 0

Show the RAID Controller Configuration

Enter the show hw-config command in order to view the RAID controller configuration.

ucs-c210-m2 /chassis/storageadapter #show hw-config 

PCI Slot SLOT-5:

    SAS Address 0: 500e004aaaaaaa3f

    SAS Address 1: 0000000000000000

    SAS Address 2: 0000000000000000

    SAS Address 3: 0000000000000000

    SAS Address 4: 0000000000000000

    SAS Address 5: 0000000000000000

    SAS Address 6: 0000000000000000

    SAS Address 7: 0000000000000000

    BBU Present: true

    NVRAM Present: true

    Serial Debugger Present: true

    Memory Present: true

    Flash Present: true

    Memory Size: 512 MB

    Cache Memory Size: 394 MB

    Number of Backend Ports: 8

Show the Number of HDDs

Enter the show physical-drive-count command in order to view the number of HDDs.

ucs-c210-m2 /chassis/storageadapter #show physical-drive-count 

PCI Slot SLOT-5:

    Physical Drive Count: 9

    Critical Physical Drive Count: 0

    Failed Physical Drive Count: 0

Technical Support File

If you do not have access to the CLI, you can view the technical support file (/tmp/tech_support) in order to obtain information about the status of the HDDs. Here is an excerpt from the technical support file that shows the HDDs from the Intelligent Platform Management Interface (IPMI) sensors:

Querying All IPMI Sensors:
Sensor Name | Reading | Unit      | Status  | LNR | LC  | LNC | UNC | UC | UNR       

HDD0_INFO   | 0x0     | discrete  | 0x2181  | na  | na  | na  | na  | na | na        
HDD1_INFO   | 0x0     | discrete  | 0x2181  | na  | na  | na  | na  | na | na        
HDD2_INFO   | 0x0     | discrete  | 0x2181  | na  | na  | na  | na  | na | na        
HDD3_INFO   | 0x0     | discrete  | 0x2181  | na  | na  | na  | na  | na | na        
HDD4_INFO   | 0x0     | discrete  | 0x2181  | na  | na  | na  | na  | na | na        
HDD5_INFO   | 0x0     | discrete  | 0x2181  | na  | na  | na  | na  | na | na        
HDD6_INFO   | na      | discrete  | na      | na  | na  | na  | na  | na | na        
HDD7_INFO   | na      | discrete  | na      | na  | na  | na  | na  | na | na
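
The sensor rows are pipe-separated, so they are straightforward to extract from a saved copy of the technical support file. The following Python sketch is a hedged illustration: the function name is hypothetical, and it assumes the row layout matches the excerpt above (sensor name, reading, unit, status, followed by the threshold columns).

```python
SAMPLE = """\
Sensor Name | Reading | Unit      | Status  | LNR | LC  | LNC | UNC | UC | UNR
HDD0_INFO   | 0x0     | discrete  | 0x2181  | na  | na  | na  | na  | na | na
HDD6_INFO   | na      | discrete  | na      | na  | na  | na  | na  | na | na
"""


def parse_hdd_sensors(text):
    """Return (name, status) pairs for the HDD sensor rows.

    The status is an int, or None when the sensor reports 'na'.
    """
    rows = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 4 or not fields[0].startswith("HDD"):
            continue  # header row or an unrelated sensor
        name, status = fields[0], fields[3]
        rows.append((name, None if status == "na" else int(status, 16)))
    return rows


if __name__ == "__main__":
    for name, status in parse_hdd_sensors(SAMPLE):
        print(name, "na" if status is None else hex(status))
```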

Here is an excerpt from the technical support file that shows a breakdown of the HDD status:

Bit[15:10] - Unused
Bit[9:8]   - Fault
Bit[7:4]   - LED Color
Bit[3:0]   - LED State
Fault:
0x100 - On Line
0x200 - Degraded
LED Color:
0x10 - GREEN
0x20 - AMBER
0x40 - BLUE
0x80 - RED
LED State:
0x01 - OFF
0x02 - ON
0x04 - FAST BLINK
0x08 - SLOW BLINK

Here is an excerpt from the technical support file that decodes the status code 0x2181 reported by the sensors above. The remaining bit, 0x2000, falls within the unused Bit[15:10] range:

0x2181 

Fault:
0x100 --- HDD is On Line

LED Color:
0x80 --- RED

LED State:
0x01 --- OFF
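
The same decoding can be done with bit masks. The following Python sketch is a minimal illustration that uses only the bit layout and values listed in the breakdown above. The function and constant names are hypothetical, and any value that the breakdown does not list is reported as unknown rather than guessed.

```python
# Masks derived from the bit ranges in the breakdown.
FAULT_MASK = 0x300  # Bit[9:8]
COLOR_MASK = 0x0F0  # Bit[7:4]
STATE_MASK = 0x00F  # Bit[3:0]

FAULT = {0x100: "On Line", 0x200: "Degraded"}
LED_COLOR = {0x10: "GREEN", 0x20: "AMBER", 0x40: "BLUE", 0x80: "RED"}
LED_STATE = {0x01: "OFF", 0x02: "ON", 0x04: "FAST BLINK", 0x08: "SLOW BLINK"}


def _lookup(table, value):
    """Map a masked value to its name, or flag it as unknown."""
    return table.get(value, "unknown (0x%x)" % value)


def decode_hdd_status(status):
    """Split an HDD sensor status code into fault, LED color, and LED state.

    Bit[15:10] is unused, so those bits are masked off and ignored.
    """
    return {
        "fault": _lookup(FAULT, status & FAULT_MASK),
        "led_color": _lookup(LED_COLOR, status & COLOR_MASK),
        "led_state": _lookup(LED_STATE, status & STATE_MASK),
    }


if __name__ == "__main__":
    print(decode_hdd_status(0x2181))
```

For 0x2181, the three masked values are 0x100, 0x80, and 0x01, which matches the excerpt: On Line, RED, OFF.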

Battery Backup Unit

Some server deployments support an optional battery backup unit (BBU). The intelligent BBU protects disk write cache data on the RAID controller for up to 72 hours during a power loss.

This example shows how to use the MegaCli utility in order to check the status of the BBU:

bash$ sudo /opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -a0 -NoLog
 Password:
 
 . . .

  Battery Replacement required            : Yes
 
 . . .
 
 Relative State of Charge: 99 %
 Absolute State of charge: 76 %
 
 . . .
 
 Date of Manufacture: 11/08, 2008
 Design Capacity: 700 mAh
 Design Voltage: 3700 mV
 Specification Info: 33
 Serial Number: 243
 Pack Stat Configuration: 0x6cb0
 Manufacture Name: LSI113000G
 Device Name: 2970700
 Device Chemistry: LION
 Battery FRU: N/A

This example shows how to use the CLI in order to check the status of the BBU:

ucs-c200-m2 /chassis/storageadapter #show bbu detail
Controller SLOT-7:
     Battery Type: iBBU
     Battery Present: true
     Voltage: 4.023 V
     Current: 0.000 A
     Charge: 100%
     Charging State: fully charged
     Temperature: 34 degrees C
     Voltage Low: false
     Temperature High: false
     Learn Cycle Requested: false
     Learn Cycle Active: false
     Learn Cycle Failed: false
     Learn Cycle Timeout: false
     I2C Errors Detected: false
     Battery Replacement Required: true
     Remaining Capacity Low: true
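
If you save the show bbu detail output to a file, the true/false fields can be scanned automatically. The following Python sketch is illustrative only: the function names are hypothetical, and treating a value of true in the listed fields as a warning is an assumption based on the field names, not documented behavior.

```python
SAMPLE = """\
Controller SLOT-7:
     Battery Type: iBBU
     Battery Present: true
     Charge: 100%
     Voltage Low: false
     Learn Cycle Failed: false
     Battery Replacement Required: true
     Remaining Capacity Low: true
"""

# Fields assumed to indicate a problem when their value is "true".
WARNING_FIELDS = [
    "Voltage Low",
    "Temperature High",
    "Learn Cycle Failed",
    "Learn Cycle Timeout",
    "I2C Errors Detected",
    "Battery Replacement Required",
    "Remaining Capacity Low",
]


def parse_bbu_detail(text):
    """Parse 'Name: value' lines into a dict, skipping lines without a value."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def bbu_warnings(fields):
    """Return the names of the warning fields that are set to 'true'."""
    return [name for name in WARNING_FIELDS if fields.get(name) == "true"]


if __name__ == "__main__":
    print(bbu_warnings(parse_bbu_detail(SAMPLE)))
```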
