This article describes how to troubleshoot disk, RAID, and hardware issues.
You can check the disk health from the Central Manager or from the command line. On a Central Manager, choose the device to check, and then choose Monitor > Disks to get a report on the disk status. For more details, see the Disks Report section in the Cisco Wide Area Application Services Configuration Guide.
From the command line, you can use the show disks details command as follows:
WAE674# show disks details

RAID Physical disk information:
  disk00: Online       J8WM2DTC   286102 MB
  disk01: Rebuilding   J8WMPV9C   286102 MB   <-------replaced disk is rebuilding
  disk02: Online       J8WMYG6C   286102 MB

RAID Logical drive information:
  Drive 1:  RAID-5  Critical   <-------RAID logical drive is rebuilding
            Enabled (read-cache)  Enabled (write-back)

Mounted file systems:
MOUNT POINT        TYPE         DEVICE             SIZE       INUSE    FREE       USE%
/sw                internal     /dev/sda1          991MB      892MB    99MB       90%
/swstore           internal     /dev/sda2          991MB      733MB    258MB      73%
/state             internal     /dev/sda3          7935MB     176MB    7759MB     2%
/local/local1      SYSFS        /dev/sda6          22318MB    139MB    22179MB    0%
.../local1/spool   PRINTSPOOL   /dev/data1/spool   991MB      32MB     959MB      3%
/obj1              CONTENT      /dev/data1/obj     248221MB   130MB    248091MB   0%
/dre1              CONTENT      /dev/data1/dre     248221MB   130MB    248091MB   0%
/ackq1             internal     /dev/data1/ackq    991MB      32MB     959MB      3%
/plz1              internal     /dev/data1/plz     2975MB     64MB     2911MB     2%

Disk encryption feature is disabled.
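If you capture the output of show disks details to a text file or a monitoring script, the physical disk status lines can be checked automatically. The following Python sketch is illustrative only and is not part of WAAS. It assumes each disk line has the shape shown in the sample output above (disk name, one-word status, serial number, size in MB); the function names are hypothetical, and other platforms or WAAS versions may print different fields.

```python
import re

# Assumed line shape, taken from the sample output in this article:
#   diskNN: <Status> <Serial> <Size> MB
DISK_LINE = re.compile(r"(disk\d+):\s+(\S+)\s+(\S+)\s+(\d+)\s*MB")


def parse_physical_disks(output):
    """Return {disk_name: status} for each physical disk line found."""
    return {m.group(1): m.group(2) for m in DISK_LINE.finditer(output)}


def disks_needing_attention(output):
    """Return the sorted names of disks whose status is not Online."""
    disks = parse_physical_disks(output)
    return sorted(name for name, status in disks.items() if status != "Online")


SAMPLE = """\
RAID Physical disk information:
  disk00: Online       J8WM2DTC   286102 MB
  disk01: Rebuilding   J8WMPV9C   286102 MB
  disk02: Online       J8WMYG6C   286102 MB
"""

if __name__ == "__main__":
    print(disks_needing_attention(SAMPLE))
```

A disk that is reported as Rebuilding is not necessarily faulty; the sketch only lists disks that are not Online so that you can look at them more closely.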
It is also useful to check the Predictive Failure Analysis (PFA) flag for RAID-5 disks by using the show disks tech-support command. You will find the PFA flag at the end of the output. If the PFA flag is set to Yes, it indicates a predicted drive failure and you should replace the disk. A critical alarm is also raised on the WAE.
Disk failures are automatically detected by the system. Failed disks are automatically removed from service.
You can also shut down a disk for scheduled replacement by using the following commands:
For a RAID-5 system:
WAE674# disk disk-name disk01 replace
Controllers found: 1

Command completed successfully.
After you replace a disk on a RAID-5 system, the system rebuilds the logical RAID drive automatically.
For a RAID-1 system:
WAE7326# config
WAE7326(config)# disk disk-name disk01 shutdown
Device maybe busy while going offline ... please wait!
mdadm: set /dev/sdb1 faulty in /dev/md0
mdadm: set /dev/sdb2 faulty in /dev/md1
. . .
After you replace the disk on a RAID-1 system, use the following command to reenable the disk:
WAE7326# config
WAE7326(config)# no disk disk-name disk01 shutdown
On a RAID-5 system, a RAID rebuild occurs when a hard disk is replaced, and a RAID synchronization occurs when WAAS is installed onto a system by CD or when you run the disk recreate-raid EXEC command. During a RAID rebuilding or synchronization process, which is managed by the RAID firmware, the hard disk LEDs blink constantly as the drives are set up with the RAID configuration. The RAID array rebuilding or synchronization process can take up to 6 hours to complete on a WAE-7371 with six 300-GB hard disks. Unfortunately, there is no indication of the time remaining.
CAUTION: Do not power cycle or remove a disk from the system when any of the drive LEDs are blinking because the disk can be damaged.
If you do remove a disk during the RAID build process, reinsert the disk and wait up to 6 hours for the RAID build process to complete.
There are slight differences in the RAID rebuild and synchronization, as follows:
Ensure that your WAE-7341/7371/674 appliance has the recommended RAID controller firmware, 5.2-0 (15418). You can check the RAID controller firmware with the show disks tech-support command as follows:
wae# show disks tech-support
Controllers found: 1
----------------------------------------------------------------------
Controller information
----------------------------------------------------------------------
   Controller Status                  : Okay
   Channel description                : SAS/SATA
   Controller Model                   : IBM ServeRAID 8k
   Controller Serial Number           : 40453F0
   Physical Slot                      : 0
   Installed memory                   : 256 MB
   Copyback                           : Disabled
   Data scrubbing                     : Disabled
   Defunct disk drive count           : 0
   Logical drives/Offline/Critical    : 1/0/0
   ---------------------------------------------------
   Controller Version Information
   ---------------------------------------------------
   BIOS                               : 5.2-0 (15418)
   Firmware                           : 5.2-0 (15418)   <-----Firmware version
   Driver                             : 1.1-5 (2449)
   Boot Flash                         : 5.1-0 (15418)
   ---------------------------------------------------
. . .
If your RAID controller firmware needs to be updated, obtain the recommended version from the Cisco software download website (registered customers only) and upgrade the firmware as described in the documentation that accompanies the firmware.
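The firmware check described above can be scripted against captured show disks tech-support output. The following Python sketch is illustrative only. It assumes the Firmware line has the shape shown in the sample output in this article, the function names are hypothetical, and it tests only for an exact match with the recommended version rather than trying to decide whether a different version is older or newer.

```python
import re

# Recommended RAID controller firmware, as stated in this article.
RECOMMENDED_FIRMWARE = "5.2-0 (15418)"

# Assumed line shape, from the sample output: "Firmware : 5.2-0 (15418)"
FIRMWARE_LINE = re.compile(
    r"^\s*Firmware\s*:\s*(\d+\.\d+-\d+)\s+\((\d+)\)", re.MULTILINE
)


def controller_firmware(output):
    """Return the firmware string, such as '5.2-0 (15418)', or None."""
    match = FIRMWARE_LINE.search(output)
    if match is None:
        return None
    return "{} ({})".format(match.group(1), match.group(2))


def firmware_is_recommended(output):
    """Return True only if the reported firmware equals the recommended one."""
    return controller_firmware(output) == RECOMMENDED_FIRMWARE


SAMPLE = """\
   BIOS                               : 5.2-0 (15418)
   Firmware                           : 5.2-0 (15418)   <-----Firmware version
   Driver                             : 1.1-5 (2449)
   Boot Flash                         : 5.1-0 (15418)
"""

if __name__ == "__main__":
    print(controller_firmware(SAMPLE), firmware_is_recommended(SAMPLE))
```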
WAE-7341/7371/674 appliances are designed to boot from the internal compact flash storage device, not the hard disk. If the WAE BIOS is inadvertently changed to boot from the hard disk, the WAE will fail to boot.
If you encounter this situation, change the BIOS back to boot from the compact flash to allow proper booting. For details on how to change the startup sequence, see the chapter Using the Configuration/Setup Utility Program in the Cisco Wide Area Application Engine 7341, 7371, and 674 Hardware Installation Guide. You can choose the option Load Default Settings to restore the correct default settings, which include booting from the internal compact flash storage device.
Sometimes after multiple power cycles during the device boot, the serial port becomes disabled.
If you encounter this situation, reenable the serial port. For details, see the chapter Using the Configuration/Setup Utility Program in the Cisco Wide Area Application Engine 7341, 7371, and 674 Hardware Installation Guide. You can choose the option Load Default Settings to restore the correct default settings, which include enabling the serial port.
To monitor the boot process on Cisco WAE and WAVE appliances, connect to the serial console port on the appliance as directed in the Hardware Installation Guide.
Cisco WAE and WAVE appliances have video connectors that should not be used in normal operation. The video output is for troubleshooting purposes only during the BIOS boot and stops displaying output as soon as the serial port becomes active.
If you are monitoring the video output, it may appear that the device has stopped booting when the output stops, but it is normal for the video output to stop while the device continues booting.
If you are running WAAS version 4.0.11 or an earlier release on a WAE-612 device and a disk fails, the replacement procedure varies, depending on the failure symptoms and the WAAS version that is in use. See the following sections, depending on the failure symptoms:
If you are running WAAS version 4.0.13 or a later release, see the Performing Disk Maintenance for RAID-1 Systems section in the Cisco Wide Area Application Services Configuration Guide for the hot-swap disk replacement procedure.
NOTE: On a WAE-612 that is running any WAAS version from 4.0.13 through 4.0.19, which supports the hot-swap replacement of drives, a problem may occur when you replace a drive while the unit is running. Occasionally, after a drive hot-swap procedure, the WAE-612 may stop operating and require a reboot. To avoid this problem, upgrade your WAAS software to version 4.0.19 or a later release.
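The version thresholds above determine which WAE-612 disk replacement procedure applies. The following Python sketch is illustrative only. It assumes the WAAS version is available as a dotted numeric string such as 4.0.11 and compares only the first three components; the function names and return labels are hypothetical. Versions between 4.0.11 and 4.0.13 are reported as unspecified because this article does not cover them.

```python
def parse_version(text):
    """Convert a dotted version string such as '4.0.13' to a tuple of ints.

    Only the first three components are kept for comparison.
    """
    return tuple(int(part) for part in text.strip().split("."))[:3]


def wae612_replacement_path(version):
    """Return which replacement procedure this article points to.

    'symptom-based' : WAAS 4.0.11 or earlier (procedure depends on symptoms)
    'hot-swap'      : WAAS 4.0.13 or later (see the Configuration Guide)
    'unspecified'   : any version the article does not mention
    """
    parsed = parse_version(version)
    if parsed <= (4, 0, 11):
        return "symptom-based"
    if parsed >= (4, 0, 13):
        return "hot-swap"
    return "unspecified"


if __name__ == "__main__":
    for candidate in ("4.0.5", "4.0.11", "4.0.13", "4.1.1"):
        print(candidate, wae612_replacement_path(candidate))
```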
If only the disk in slot 01 (the right slot) fails and disk00 is good, use the following procedures to replace the disk, depending on the WAAS version on the device.
WAAS version 4.0.5 and earlier releases
WAAS versions 4.0.7 through 4.0.11
If disk00 fails and disk01 shows a status of Problematic with an asterisk (*) next to the status (the asterisk means that the disk is marked bad), disk01 has been misclassified as bad and its partition table has been removed. In this situation, all data will be lost after the disk replacement.
Use the following procedures to replace the disk, depending on the WAAS version on the device.
WAAS version 4.0.5 and earlier releases
You should see RAID rebuilds from disk00 to disk01.
WAAS versions 4.0.7 through 4.0.11
If disk00 fails and there is no asterisk (*) next to the status of disk01 (an asterisk means that the disk is marked bad), the partition table of disk01 is intact. The status of disk01 may show as Problematic or as something else. In this situation, data will not be lost after disk replacement.
Use the following procedures to replace the disk, depending on the WAAS version on the device.
WAAS version 4.0.5 and earlier releases
You should see RAID rebuilds from disk00 to disk01.
WAAS versions 4.0.7 through 4.0.11
You should see RAID rebuilds from disk00 to disk01.
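The three WAE-612 failure scenarios described above, for WAAS version 4.0.11 and earlier releases, can be summarized as a small decision function. The following Python sketch is illustrative only. The class, function, and parameter names are hypothetical, and the inputs are assumed to come from your own reading of the disk status output. Where this article does not say whether data is lost, the sketch returns None rather than guessing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Diagnosis:
    scenario: str
    data_lost: Optional[bool]  # None when the article does not say


def diagnose_wae612(disk00_failed, disk01_failed, disk01_marked_bad):
    """Map observed disk states to the scenarios described in this article.

    disk01_marked_bad is True when an asterisk (*) appears next to the
    status of disk01.
    """
    if disk00_failed and disk01_marked_bad:
        # disk01 is misclassified as bad; its partition table was removed.
        return Diagnosis("disk00 failed, disk01 marked bad", True)
    if disk00_failed:
        # No asterisk: the partition table of disk01 is intact.
        return Diagnosis("disk00 failed, disk01 intact", False)
    if disk01_failed:
        return Diagnosis("only disk01 failed", None)
    return Diagnosis("no failure scenario from this article", None)


if __name__ == "__main__":
    print(diagnose_wae612(True, False, True))
    print(diagnose_wae612(True, False, False))
    print(diagnose_wae612(False, True, False))
```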