Updated: September 28, 2018
This article extends the document “Nexus 7000 Supervisor 2/2E Compact Flash Failure Recovery”, which addresses all possible failure scenarios. It may be useful when the flash recovery tool fails to run. Console access to the device is recommended while performing the changes. It is also strongly recommended not to make any changes in the Linux shell other than those described in this document, because they may affect switch operation. Cisco TAC supervision is advisable.
As explained in that document, each N7K Supervisor 2/2E is equipped with two eUSB flash devices in a RAID1 configuration, one primary and one mirror. Together they provide non-volatile repositories for boot images, startup configuration, and persistent application data. When the RAID fails on a supervisor in the chassis, the flash recovery tool is run to repair it. If the tool fails to run, the usual remedy is to reload or fail over the supervisor. In certain scenarios, however, the problem can be fixed without a reload or failover.
Cisco recommends that you have knowledge of Cisco Nexus OS, storage or flash disk recovery methods, and Linux-level debugging.
Nexus 7000 series switches
A RAID failure is observed on a supervisor, and the switch reports the RAID failure state with error code 0xe1. When the flash recovery tool is run to recover the flash on the affected supervisor, it fails with the following error:
ERROR: Cannot perform recovery. /dev/sdb has incorrect partition info.
ERROR: Disk /dev/sdb needs to be manually inspected for errors.
INFO: No recovery was attempted on module 5. All flashes left intact.
INFO: A detailed copy of the this log was saved as volatile:flash_repair_log_mod5.tgz.
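The device to inspect is the one named in the first ERROR line. As a minimal sketch, assuming the error text has been saved or pasted into a shell variable, the device path can be pulled out with sed. The function name extract_failed_device is hypothetical and is not part of the recovery tool.

```shell
# Hypothetical helper: print the device path named in the recovery tool's
# "has incorrect partition info" error line. Prints nothing if no such
# line is found in the text passed as the first argument.
extract_failed_device() {
    printf '%s\n' "$1" |
        sed -n 's|.*ERROR: Cannot perform recovery\. \(/dev/[a-z0-9]*\) has incorrect partition info.*|\1|p'
}

log='ERROR: Cannot perform recovery. /dev/sdb has incorrect partition info.'
extract_failed_device "$log"
```

For the error shown in this article the sketch prints /dev/sdb, which is the device examined in the steps that follow.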
Load the debug plugin on the switch to log in to the Linux shell.
Take great care when running commands here.
At the Linux prompt, look for the device node named in the error message. In this case it is /dev/sdb; in other cases it could be a different device.
Linux(debug)# ls -l /dev/sd?
brw-r----- 1 root root 8, 0 Aug 28 2015 sda
brw-rw-r-- 1 root disk 8, 32 Dec 18 2013 sdc
brw-rw-r-- 1 root disk 8, 48 Dec 18 2013 sdd
brw-rw-r-- 1 root disk 8, 64 Dec 18 2013 sde
brw-rw-r-- 1 root disk 8, 80 Dec 18 2013 sdf
brw-rw-r-- 1 root disk 8, 96 Dec 18 2013 sdg
brw-rw-r-- 1 root disk 8, 112 Dec 18 2013 sdh
brw-rw-r-- 1 root disk 8, 128 Dec 18 2013 sdi
brw-rw-r-- 1 root disk 8, 144 Dec 18 2013 sdj
brw-rw-r-- 1 root disk 8, 160 Dec 18 2013 sdk
brw-rw-r-- 1 root disk 8, 176 Dec 18 2013 sdl
brw-rw-r-- 1 root disk 8, 192 Dec 18 2013 sdm
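A gap in a listing of this length is easy to overlook. The following is a minimal sketch, not part of the documented procedure, that reports which sdX names are absent between sda and the last name supplied. The function name missing_sd_nodes is hypothetical, and the sketch assumes the names are passed in alphabetical order and do not go beyond sdp.

```shell
# Hypothetical helper: given the sdX names that are present (in alphabetical
# order), print every name missing between sda and the last name given.
missing_sd_nodes() {
    present=" $* "
    last=""
    for name in "$@"; do
        last=$name
    done
    for letter in a b c d e f g h i j k l m n o p; do
        case "$present" in
            *" sd$letter "*) ;;            # node is present, nothing to report
            *) echo "sd$letter" ;;         # node is missing
        esac
        if [ "sd$letter" = "$last" ]; then
            break                          # stop at the highest name supplied
        fi
    done
}

# Names taken from the listing above.
missing_sd_nodes sda sdc sdd sde sdf sdg sdh sdi sdj sdk sdl sdm
```

With the names from the listing above, the sketch prints sdb. On a live system the arguments could be supplied with something like $(cd /dev && ls sd?), assuming those utilities are available in the debug shell.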
The /dev/sdb node is missing from the listing, which is why the recovery tool reports an error. Create the missing node manually, with the same permissions as the other block devices.
Linux(debug)# mknod -m 664 /dev/sdb b 8 16
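The arguments "b 8 16" create a block device with major number 8 and minor number 16. In the listing, the minor numbers increase in steps of 16 per disk (sda is 0, sdc is 32, sdd is 48), so sdb is 16. If a different node is missing, the minor number changes accordingly. The sketch below derives it from the device name; the function name sd_minor is hypothetical, and the sketch assumes major number 8 and a name in the range sda to sdp, as in the listing.

```shell
# Hypothetical helper: print the minor number for a whole-disk node
# sda..sdp under major 8, where each disk occupies a block of 16 minors.
# Accepts either "sdb" or "/dev/sdb". Returns 1 for names outside sda..sdp.
sd_minor() {
    name=${1##*/}          # strip a leading /dev/ if present
    letter=${name#sd}      # keep only the drive letter
    index=0
    for candidate in a b c d e f g h i j k l m n o p; do
        if [ "$candidate" = "$letter" ]; then
            echo $((index * 16))
            return 0
        fi
        index=$((index + 1))
    done
    return 1
}

sd_minor /dev/sdb
```

For /dev/sdb the sketch prints 16, which matches the value used in the mknod command above. Compare the result against the neighbouring entries in the listing before creating any node.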
The sdb node now appears under /dev:
Linux(debug)# ls -l /dev/sd?
brw-r----- 1 root root 8, 0 Aug 28 2015 sda
brw-rw-r-- 1 root root 8, 16 May 26 07:31 sdb
brw-rw-r-- 1 root disk 8, 32 Dec 18 2013 sdc
brw-rw-r-- 1 root disk 8, 48 Dec 18 2013 sdd
brw-rw-r-- 1 root disk 8, 64 Dec 18 2013 sde
brw-rw-r-- 1 root disk 8, 80 Dec 18 2013 sdf
brw-rw-r-- 1 root disk 8, 96 Dec 18 2013 sdg
brw-rw-r-- 1 root disk 8, 112 Dec 18 2013 sdh
brw-rw-r-- 1 root disk 8, 128 Dec 18 2013 sdi
brw-rw-r-- 1 root disk 8, 144 Dec 18 2013 sdj
brw-rw-r-- 1 root disk 8, 160 Dec 18 2013 sdk
brw-rw-r-- 1 root disk 8, 176 Dec 18 2013 sdl
brw-rw-r-- 1 root disk 8, 192 Dec 18 2013 sdm
Exit the Linux shell and run the flash recovery tool again.
This time the tool ran without any error messages, and the RAID failure on the primary flash was recovered (status 0xf0). This was confirmed with the following command:
"slot x show system internal raid | i i cmos|block | head line 5"
After this fix, the recovery tool should run without the partition error and should recover the affected supervisor from the RAID failure state. If the recovery tool still fails to run, the cause may be something else, or the partition may actually be corrupted, in which case a reload or failover may be required.
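The two status codes seen in this article are 0xe1 before the fix and 0xf0 after it. The sketch below maps a status code to a short description. Only 0xe1 and 0xf0 appear in this article; the 0xd2 and 0xc3 entries are an assumption based on the companion recovery document and should be verified against it. The function name raid_status_meaning is hypothetical.

```shell
# Hypothetical helper: describe a RAID status code.
# 0xf0 and 0xe1 are the values seen in this article; 0xd2 and 0xc3 are
# assumed from the companion recovery document and should be verified there.
raid_status_meaning() {
    case "$(printf '%s' "$1" | tr 'A-F' 'a-f')" in
        0xf0) echo "no failures reported" ;;
        0xe1) echo "primary flash failed" ;;
        0xd2) echo "alternate (mirror) flash failed" ;;
        0xc3) echo "both primary and alternate flash failed" ;;
        *)    echo "unknown status code" ;;
    esac
}

raid_status_meaning 0xe1
raid_status_meaning 0xf0
```

Any value not in this table is reported as unknown rather than guessed at.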