About Online Diagnostics
With online diagnostics, you can test and verify the hardware functionality of the device while the device is connected to a live network.
The online diagnostics contain tests that check different hardware components and verify the data path and control signals. Disruptive online diagnostic tests (such as the disruptive loopback test) and nondisruptive online diagnostic tests (such as the ASIC register check) run during bootup, line module online insertion and removal (OIR), and system reset. The nondisruptive online diagnostic tests run as part of the background health monitoring, and you can run these tests on demand.
Online diagnostics fall into three categories: bootup diagnostics, runtime (health monitoring) diagnostics, and on-demand diagnostics. Bootup diagnostics run during bootup, health monitoring diagnostics run in the background, and on-demand diagnostics run once or at user-designated intervals while the device is connected to a live network.
Bootup Diagnostics
Bootup diagnostics run during bootup and detect faulty hardware before Cisco NX-OS brings a module online. For example, if you insert a faulty module in the device, bootup diagnostics test the module and take it offline before the device uses the module to forward traffic.
Bootup diagnostics also check the connectivity between the supervisor and module hardware and the data and control paths for all the ASICs. The following table describes the bootup diagnostic tests for a module and a supervisor.
Diagnostic | Description |
---|---|
OBFL | Verifies the integrity of the onboard failure logging (OBFL) flash. |
BootupPortLoopback | Runs only during module bootup. Tests the packet path from the supervisor CPU to each physical front panel port on the ASIC. |
USB | Nondisruptive test. Checks the USB controller initialization on a module. |
ManagementPortLoopback | Disruptive test, not an on-demand test. Tests loopback on the management port of a module. |
EOBCPortLoopback | Disruptive test, not an on-demand test. Tests loopback on the Ethernet out-of-band channel (EOBC) port. |
Bootup diagnostics log failures to onboard failure logging (OBFL) and syslog and trigger a diagnostic LED indication (on, off, pass, or fail).
You can configure the device to either bypass the bootup diagnostics or run the complete set of bootup diagnostics.
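The gating behavior described in this section can be pictured with a short Python sketch. It is only an illustrative model, not Cisco NX-OS code: the function name `bring_module_online`, the level strings, and the test callables are hypothetical, and real failure handling on the device may be more nuanced than a single pass/fail decision.

```python
from typing import Callable, Dict, List, Tuple


def bring_module_online(
    tests: Dict[str, Callable[[], bool]],
    level: str = "complete",
) -> Tuple[str, List[str]]:
    """Model of bootup gating: a module goes online only if its tests pass.

    `level` mirrors the two choices in the text: run the complete set of
    bootup diagnostics, or bypass them. The names are illustrative only.
    """
    if level not in ("complete", "bypass"):
        raise ValueError(f"unknown bootup level: {level!r}")

    # Bypass: no bootup tests run, so nothing can keep the module offline.
    if level == "bypass":
        return "online", []

    # Complete: run every test and collect the names of those that failed.
    failed = [name for name, test in tests.items() if not test()]

    # A faulty module is kept offline before it can forward traffic.
    return ("offline" if failed else "online"), failed


if __name__ == "__main__":
    demo_tests = {
        "OBFL": lambda: True,
        "BootupPortLoopback": lambda: False,  # simulated hardware fault
    }
    print(bring_module_online(demo_tests))
    print(bring_module_online(demo_tests, level="bypass"))
```

The sketch makes the trade-off visible: bypassing the bootup diagnostics brings the module online sooner, but a faulty module is no longer caught before it starts forwarding traffic.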
Runtime or Health Monitoring Diagnostics
Runtime diagnostics are also called health monitoring (HM) diagnostics. These diagnostics provide information about the health of a live device. They detect runtime hardware errors, memory errors, the degradation of hardware modules over time, software faults, and resource exhaustion.
Health monitoring diagnostics are nondisruptive and run in the background to ensure the health of a device that is processing live network traffic. You can enable or disable health monitoring tests or change their runtime interval.
The following table describes the health monitoring diagnostics and test IDs for a module and a supervisor.
Diagnostic | Default Interval | Default Setting | Description |
---|---|---|---|
Module | | | |
ACT2 | 30 minutes | active | Verifies the integrity of the security device on the module. |
ASICRegisterCheck | 1 minute | active | Checks read/write access to scratch registers for the ASICs on a module. |
PrimaryBootROM | 24 hours¹ | active | Verifies the integrity of the primary boot device on a module. |
SecondaryBootROM | 24 hours¹ | active | Verifies the integrity of the secondary boot device on a module. |
PortLoopback | On demand (releases before Cisco NX-OS 7.0(3)I1(2)); 30 minutes (starting with Cisco NX-OS Release 7.0(3)I1(2)) | active | Runs diagnostics on a per-port basis on all administratively down ports. |
RewriteEngineLoopback | 1 minute | active | Verifies the integrity of the nondisruptive loopback for all ports up to the Rewrite Engine ASIC device. |
AsicMemory | Bootup only | inactive | Checks whether the ASIC memory is consistent, using the Mbist bit in the ASIC. |
FpgaRegTest | 30 seconds | active | Tests the FPGA status by reading from and writing to the FPGA. |
Supervisor | | | |
NVRAM | 5 minutes | active | Verifies the sanity of the NVRAM blocks on a supervisor. |
RealTimeClock | 5 minutes | active | Verifies that the real-time clock on the supervisor is ticking. |
PrimaryBootROM | 30 minutes | active | Verifies the integrity of the primary boot device on the supervisor. |
SecondaryBootROM | 30 minutes | active | Verifies the integrity of the secondary boot device on the supervisor. |
BootFlash | 30 minutes | active | Verifies access to the bootflash devices. |
USB | 30 minutes | active | Verifies access to the USB devices. |
SystemMgmtBus | 30 seconds | active | Verifies the availability of the system management bus. |
Mce | 30 minutes | active | Uses the mcd_daemon to report any machine check errors reported by the kernel. |
Pcie | Bootup only | inactive | Reads the PCIe status registers and checks for any errors on the PCIe device. |
Console | Bootup only | inactive | Runs a port loopback test on the management port at bootup to check its consistency. |
FpgaRegTest | 30 seconds | active | Tests the FPGA status by reading from and writing to the FPGA. |
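To get a feel for how often each background test fires, the default intervals can be turned into runs per day. The Python sketch below copies the module-level defaults from the table (using the 30-minute PortLoopback default that applies starting with Cisco NX-OS Release 7.0(3)I1(2)); the dictionary and function names are hypothetical and exist only for this illustration.

```python
from datetime import timedelta
from typing import Dict, List

# Default intervals for the periodic module tests, copied from the table.
# AsicMemory is omitted because it runs only at bootup.
MODULE_HM_INTERVALS: Dict[str, timedelta] = {
    "ACT2": timedelta(minutes=30),
    "ASICRegisterCheck": timedelta(minutes=1),
    "PrimaryBootROM": timedelta(hours=24),
    "SecondaryBootROM": timedelta(hours=24),
    "PortLoopback": timedelta(minutes=30),
    "RewriteEngineLoopback": timedelta(minutes=1),
    "FpgaRegTest": timedelta(seconds=30),
}


def runs_per_day(interval: timedelta) -> int:
    """Number of whole test runs that fit into 24 hours."""
    return timedelta(days=1) // interval


def most_frequent_first(intervals: Dict[str, timedelta]) -> List[str]:
    """Test names ordered from shortest to longest interval."""
    return sorted(intervals, key=lambda name: intervals[name])


if __name__ == "__main__":
    for name in most_frequent_first(MODULE_HM_INTERVALS):
        print(f"{name}: {runs_per_day(MODULE_HM_INTERVALS[name])} runs/day")
```

A listing like this can help when deciding which intervals to lengthen: the tests with sub-minute or one-minute defaults account for nearly all background runs.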
On-Demand Diagnostics
On-demand tests help localize faults and are usually needed in one of the following situations:
- To respond to an event that has occurred, such as isolating a fault.
- In anticipation of an event that may occur, such as a resource exceeding its utilization limit.
You can run all the health monitoring tests on demand. You can schedule on-demand diagnostics to run immediately.
You can also modify the default interval for a health monitoring test.
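The relationship between background runs, on-demand runs, and the configurable interval can be sketched as a toy scheduler in Python. This is a conceptual model only: the `HmTest` class and its fields are hypothetical, and it assumes (for illustration) that an on-demand run does not reset the background schedule.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HmTest:
    name: str
    interval_s: int                 # background interval in seconds
    active: bool = True             # whether background monitoring is enabled
    runs: List[str] = field(default_factory=list)  # log of run types
    next_due_s: int = 0             # time of the next scheduled run

    def tick(self, now_s: int) -> None:
        """Background hook: run only if the test is active and due."""
        if self.active and now_s >= self.next_due_s:
            self.runs.append("scheduled")
            self.next_due_s = now_s + self.interval_s

    def run_on_demand(self) -> None:
        """Immediate run; assumed here to leave the schedule untouched."""
        self.runs.append("on-demand")


if __name__ == "__main__":
    nvram = HmTest("NVRAM", interval_s=300)   # 5-minute default from the table
    for now in range(0, 601, 60):             # simulate 10 minutes of ticks
        nvram.tick(now)
    nvram.run_on_demand()                     # fault isolation, run right away
    nvram.interval_s = 600                    # lengthen the default interval
    print(nvram.runs)
```

In this model, changing `interval_s` affects only runs scheduled after the change, and disabling the test (`active = False`) stops background runs without removing the ability to trigger the test manually.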
High Availability
A key part of high availability is detecting hardware failures and taking corrective action while the device runs in a live network. Online diagnostics in high availability detect hardware failures and provide feedback to high availability software components to make switchover decisions.
Cisco NX-OS supports stateless restarts for online diagnostics. After a reboot or supervisor switchover, Cisco NX-OS applies the running configuration.
Virtualization Support
Online diagnostics are virtual routing and forwarding (VRF) aware. You can configure online diagnostics to use a particular VRF to reach the online diagnostics SMTP server.