Introduction
This document describes methods for copying software to ACI switches, the CLI commands used to prepare a switch for bootup, and the most frequently asked questions (FAQ) about replacing a switch in ACI.
Problem
A switch can fail to load its software because of a specific issue. This document presents a few ways to work around the problem and fix the issue.
It also explores a few CLI commands that can be used to confirm the boot process of the switch.
Solution
Methods to copy the software
Several methods can be used to copy software to a switch.
USB
A USB drive can be used to copy software to a switch. Format the USB drive with the FAT32 file system before use.
In general, all USB drives are supported. If a USB drive does not work, check the data sheet of the platform for any specific recommendations about the use of USB drives.
Each switch has two USB slots. Use the "dir" command to check the slot number. To boot the code from the USB drive, use the command:
boot usb#:aci-image.bin ; where # is the slot of the USB
This command works at the loader prompt as well as at the switch prompt.
To copy the software into bootflash, use:
copy usb#:aci-image.bin bootflash:
In this example, the USB drive is connected to USB slot 1 and is detected with the ACI image (14.2.4i code).
loader > dir
usb1::
System Volume Information
aci-image.bin
bootflash::
CpuUsage.Log
lxc
disk_log.txt
nxos.7.0.3.I7.3.bin
auto-s
libmon.logs
.stats_pref.txt
bios_bootup_scratch_not_cleared
loader > boot usb1:aci-image.bin
Security Lock
Booting usb1:aci-image.bin
Trying diskboot
Filesystem type is fat, partition type 0xc
Image valid
Image Signature verification was Successful.
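When several devices are listed, the slot that holds the image can be picked out of the dir output with a short script. The Python sketch below is only an illustration: it assumes the loader output has the shape shown above (a device header that ends in "::", followed by one file name per line), and find_image_devices and boot_command are hypothetical helper names, not part of any Cisco tool.

```python
def find_image_devices(dir_output: str, image_name: str) -> list[str]:
    """Return the devices (for example "usb1") whose listing contains image_name.

    Assumes each device header ends with "::" and each file is on its own line,
    as in the loader "dir" output shown above.
    """
    devices: list[str] = []
    current = None
    for raw in dir_output.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith("::"):
            # Start of a new device listing, e.g. "usb1::" or "bootflash::".
            current = line[:-2]
        elif current is not None and line == image_name and current not in devices:
            devices.append(current)
    return devices


def boot_command(device: str, image_name: str) -> str:
    """Build the boot command in the form used in this document."""
    return f"boot {device}:{image_name}"


# Shortened copy of the loader output from the example above.
SAMPLE_DIR_OUTPUT = """\
usb1::
System Volume Information
aci-image.bin
bootflash::
CpuUsage.Log
nxos.7.0.3.I7.3.bin
"""

if __name__ == "__main__":
    for device in find_image_devices(SAMPLE_DIR_OUTPUT, "aci-image.bin"):
        print(boot_command(device, "aci-image.bin"))
```

The script only reads text captured from the console; it does not talk to the switch, so the resulting command still has to be entered at the loader prompt by hand.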
SCP from APIC to switch
With the SCP server feature enabled on the switch, SCP can be used to copy software from the APIC to the switch. Configure the management 0 interface with an IP address and set up a default gateway for the management virtual routing and forwarding (VRF) instance. Make sure that ping works from the management VRF to the APIC.
Configuration steps:
On Switch
switch# configure terminal
switch(config)# interface mgmt 0
switch(config-if)# ip address ipv4-address{ [/length] | [subnet-mask]}
switch(config-if)# no shutdown
switch(config-if)# exit
switch(config)# vrf context management
switch(config-vrf)# ip route 0.0.0.0/0 default-gw-ip
switch(config-vrf)# exit
switch(config)# feature scp-server
switch(config)# exit
switch# copy running-config startup-config
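One thing worth checking before the values are typed into the switch is that the default gateway lies on the mgmt0 subnet. The Python sketch below uses the standard ipaddress module for that check; the function name and the sample addresses (taken from the documentation address ranges) are assumptions made for this example.

```python
import ipaddress


def gateway_on_mgmt_subnet(mgmt_address: str, gateway: str) -> bool:
    """Return True if gateway is inside the mgmt0 subnet and is not the mgmt0 address.

    mgmt_address accepts both forms used by the CLI: "192.0.2.10/24"
    or "192.0.2.10/255.255.255.0".
    """
    interface = ipaddress.ip_interface(mgmt_address)
    gw = ipaddress.ip_address(gateway)
    return gw in interface.network and gw != interface.ip


if __name__ == "__main__":
    # Hypothetical values; replace them with the addresses planned for the switch.
    print(gateway_on_mgmt_subnet("192.0.2.10/24", "192.0.2.1"))
```

A False result points to a typo in the address, the prefix length, or the gateway before any time is spent on ping tests.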
On APIC
admin@apic:~>scp /firmware/fwrepos/fwrepo/<aci-image.bin> admin@<node-mgmt-ip>:<aci-image.bin>
<node-mgmt-ip> is the management IP given on the switch.
Using external SCP/FTP/TFTP server
This method is similar to the SCP from APIC method, but the software is copied from an external SCP/FTP/TFTP server instead of the APIC. The configuration steps remain the same, except that the SCP server feature does not need to be enabled on the switch. Make sure that ping works from the management VRF to the external server.
switch# configure terminal
switch(config)# interface mgmt 0
switch(config-if)# ip address ipv4-address{ [/length] | [subnet-mask] }
switch(config-if)# no shutdown
switch(config-if)# exit
switch(config)# vrf context management
switch(config-vrf)# ip route 0.0.0.0/0 default-gw-ip
switch(config-vrf)# end
Then copy the image from the external server to the switch.
switch# copy tftp://tftpuser@<IP_TFTP>/path/to/aci-image.bin bootflash: vrf management
(This example assumes that a TFTP server is used and that IP_TFTP is the IP address configured on the TFTP server.)
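When the same copy has to be repeated on several switches, the command line can be assembled by a small helper. The Python sketch below is a hedged example: the source shows only the TFTP form, so reusing the same URL shape for SCP and FTP is an assumption that should be checked against the command reference of the platform. The function name, user name, and server address are hypothetical.

```python
SUPPORTED_PROTOCOLS = ("scp", "ftp", "tftp")


def build_copy_command(protocol: str, user: str, server_ip: str, remote_path: str) -> str:
    """Build a copy command in the shape shown above for the TFTP example.

    Assumption: SCP and FTP accept the same URL shape as the TFTP example.
    """
    if protocol not in SUPPORTED_PROTOCOLS:
        raise ValueError(f"unsupported protocol: {protocol}")
    path = remote_path.lstrip("/")  # avoid a double slash after the server address
    return f"copy {protocol}://{user}@{server_ip}/{path} bootflash: vrf management"


if __name__ == "__main__":
    # Hypothetical server address and user name.
    print(build_copy_command("tftp", "tftpuser", "192.0.2.50", "/path/to/aci-image.bin"))
```

The helper only formats text; the printed command is then pasted at the switch prompt.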
EOBC method
This method boots a supervisor from the primary supervisor over the EOBC channel. For ACI, the full recovery procedure is:
(i) Use the eobc command from the loader on the secondary supervisor to boot this supervisor over the Ethernet Out-of-Band Channel (EOBC) from the primary.
(ii) Log in to the secondary supervisor as admin; it is now in standby mode.
(iii) Copy the image from the primary supervisor to the standby supervisor, from /bootflash-remote/ to /bootflash/: cp /bootflash-remote/<image> /bootflash/
(iv) Run prepare-mfg.sh <image> to set up the supervisor and set the bootvars.
(v) Reload the standby supervisor from the primary to ensure that it comes up fresh from the image that was installed on its bootflash: reload module <module_number>
Use this method only when no other option is available, because it is very time consuming.
loader > ?
? Print the command list
boot Boot image
dir List file contents on a device
eobc Booting image from active sup via EOBC channel
help Print the command list or the specific command usage
ip Setting IP address or gateway address
reboot Reboot the system
set Set network configuration
show Show loader configuration
loader>
loader > eobc
Finding driver for NIC vendor 8086 Device 1523
Found the device 8086:1523 at ioaddr e060, membase f0160000 at 1:0
Probing...igb: e1000_set_media_typeMedia type is serdes 005400c0
igb: e1000_set_media_typeMedia type is serdes 005400c0
igb: INTEL MAC. Link already up reset (ctrl 0x081c1a41)
Ethernet addr: 00:00:00:1C:00:00
igb: INTEL link status is 0x80280683
Link is up
Link speed = 1000 Mbps, Full Duplex
Useful CLI commands during ACI switch recovery
Use this procedure when replacing a leaf switch or spine switch:
Step 1.) Power on the new switch/SUP and connect a console.
Step 2.) Make sure that it runs the same ACI code as the fabric. If not, use any one of the methods described above to copy the software to the new switch/SUP.
Once the software is copied, use these steps:
Verify the MD5 checksum
switch(config)# show file bootflash:aci-image-name md5sum
switch(config)# no boot nxos
switch(config)# copy running-config startup-config
switch(config)# boot aci bootflash:aci-image-name
switch(config)# reload
Step 3.) From the new switch console, run the command "setup-clean-config.sh".
Then reload (run the command reload) to clean up any configuration that already exists on the switch.
Step 4.) Use these commands to verify the boot statements:
cat /mnt/cfg/0/boot/grub/menu.lst.local
cat /mnt/cfg/1/boot/grub/menu.lst.local
Step 5.) If the switch does not show the correct boot statements, use these commands to clear the old boot statements and set a new boot statement:
clear-bootvars.sh
setup-bootvars.sh <aci-image.bin>
Step 6.) Proceed with commissioning the switch into the fabric.
For more details, refer to this link:
https://www.cisco.com/c/en/us/support/docs/cloud-systems-management/application-policy-infrastructure-controller-apic/213617-aci-leaf-or-spine-replacement-procedure.html
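The MD5 check in Step 2 is only useful if there is a trusted value to compare against. One option is to compute the checksum of the image on the workstation or server it was copied from and compare it with the value that the switch reports. The Python sketch below does this with the standard hashlib module; the function names are hypothetical, and the way the switch value is obtained (copied from the console output) is an assumption of this example.

```python
import hashlib
import sys


def file_md5(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the MD5 hex digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksums_match(local_path: str, switch_md5: str) -> bool:
    """Compare the local digest with the value copied from the switch console."""
    return file_md5(local_path) == switch_md5.strip().lower()


if __name__ == "__main__":
    # Usage: python md5_check.py <local-image-file> <md5-from-switch>
    if len(sys.argv) == 3:
        print("match" if checksums_match(sys.argv[1], sys.argv[2]) else "MISMATCH")
```

A mismatch suggests that the image was corrupted during the copy and should be copied again before the boot statement is changed.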
FAQs related to recovery of the ACI switch
Which method must be used to copy the software in the switch?
A. Four methods are discussed in this document to accomplish this task.
If the data center has no restrictions on the use of an external laptop, a USB drive, or an external server such as TFTP/FTP/SCP, the USB drive method is the recommended choice.
This is because it is fast and efficient and saves time and effort.
If a USB drive cannot be used in the data center, use the SCP from APIC method or the external SCP/FTP/TFTP server method, depending on the restrictions in the data center.
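The selection logic in this answer can be written down as a short function. The Python sketch below is one possible reading of it: USB first, EOBC only as a last resort, as stated earlier in this document. The order between SCP from APIC and the external server, the parameter names, and the function name are assumptions made for this example.

```python
from typing import Optional


def choose_copy_method(
    usb_allowed: bool,
    apic_scp_possible: bool,
    external_server_allowed: bool,
    primary_sup_available: bool = False,
) -> Optional[str]:
    """Pick a copy method following the preference order described in this FAQ."""
    if usb_allowed:
        return "USB"
    if apic_scp_possible:  # assumption: tried before an external server
        return "SCP from APIC"
    if external_server_allowed:
        return "External SCP/FTP/TFTP server"
    if primary_sup_available:  # EOBC needs a primary supervisor to boot from
        return "EOBC"
    return None


if __name__ == "__main__":
    print(choose_copy_method(usb_allowed=False, apic_scp_possible=True,
                             external_server_allowed=True))
```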
Which software must be installed on the RMA'ed leaf switch or spine SUP?
A. Make sure that the new switch/SUP runs the same software version that is used in the ACI fabric; otherwise, the leaf switch or spine SUP remains stuck in the discovery process.
Can we upgrade/replace the spine switch without reload?
A. If there is only one SUP in a spine, it cannot be upgraded or replaced without a reload. There can be production impact.
To replace or upgrade the standby SUP (in a spine switch with dual SUPs), use this procedure:
(i) Plug in the NX-OS supervisor in the standby slot and enter a break sequence (Ctrl-C or Ctrl-]) during the initial boot sequence to access the loader> prompt.
(ii) Plug the USB drive containing the ACI image into the standby supervisor USB slot.
(iii) Boot the ACI image.
How do you replace both SUPs in the spine switch?
A.
Step 1.) Insert both the SUPs in the spine switch.
Step 2.) Connect to the console of each SUP and check the code that runs on it.
Step 3.) If it is NX-OS, copy the intended ACI code to each SUP.
From the active SUP connection only, you can use these commands:
copy usb1:aci-image.bin bootflash://sup-local
copy usb1:aci-image.bin bootflash://sup-remote
Step 4.) Change the boot statements and verify the boot statements.
Step 5.) Reload the entire chassis with the command: reload
Another command power cycles the switch (hard reboot): "/usr/sbin/chassis-power-cycle.sh"
Step 6.) Verify that the spine switch runs the intended code. You can then proceed with commissioning the switch into the fabric.
What should you do if the standby SUP remains stuck in the "inserted" state?
A.
Copy a fresh copy of the software to a USB drive and boot the SUP from USB.
Copy the software to the SUP and verify the boot statements.
Run the command: prepare-mfg.sh aci-image.bin
Verify in the GUI that the standby SUP now appears there.
How does redundancy work in a spine switch with dual supervisors?
A. The ACI spine switch supports warm (stateless) standby, where the state is not synchronized between the active and the standby supervisor modules. For an online insertion and removal (OIR) or reload of the active supervisor module, the standby supervisor module becomes active, but all modules in the switch are reset because the switchover is stateless.
In the output of the show system redundancy status command, warm standby indicates stateless mode.
To test this redundancy, you can either execute the "system switchover" command from the CLI or reload the active SUP from the GUI.