Table Of Contents
Configuring Online Diagnostics
Information About Online Diagnostics
Online Diagnostic Overview
Bootup Diagnostics
Runtime or Health Monitoring Diagnostics
On-Demand Diagnostics
High Availability
Licensing Requirements for Online Diagnostics
Guidelines and Limitations
Default Settings
Configuring Online Diagnostics
Setting the Bootup Diagnostic Level
Activating a Diagnostic Test
Setting a Diagnostic Test as Inactive
Starting or Stopping an On-Demand Diagnostic Test
Clearing Diagnostic Results
Simulating Diagnostic Results
Verifying the Online Diagnostics Configuration
Configuration Examples for Online Diagnostics
Additional References
Related Documents
Feature History for Online Diagnostics
Configuring Online Diagnostics
Beginning with Cisco MDS NX-OS Release 6.2, the Cisco MDS 9000 Family supports the generic online diagnostics (GOLD) feature. This chapter describes how to configure the GOLD feature on a Cisco MDS 9000 Family switch.
This chapter includes the following sections:
• Information About Online Diagnostics
• Licensing Requirements for Online Diagnostics
• Guidelines and Limitations
• Default Settings
• Configuring Online Diagnostics
• Verifying the Online Diagnostics Configuration
• Configuration Examples for Online Diagnostics
• Additional References
• Feature History for Online Diagnostics
Note
For complete syntax and usage information for the commands in this chapter, see the Cisco MDS 9000 Family Command Reference.
Information About Online Diagnostics
Online diagnostics help you verify that hardware and internal data paths are operating as designed so that you can rapidly isolate faults.
This section includes the following topics:
• Online Diagnostic Overview
• Bootup Diagnostics
• Runtime or Health Monitoring Diagnostics
• On-Demand Diagnostics
• High Availability
Online Diagnostic Overview
With online diagnostics, you can test and verify the hardware functionality of the device while the device is connected to a live network.
The online diagnostics contain tests that check different hardware components and verify the data path and control signals. Disruptive online diagnostic tests (such as the disruptive loopback test) and nondisruptive online diagnostic tests (such as the ASIC register check) run during bootup, line module online insertion and removal (OIR), and system reset. The nondisruptive online diagnostic tests run as part of the background health monitoring, and you can run these tests on demand.
Online diagnostics fall into three categories: bootup diagnostics, runtime (health monitoring) diagnostics, and on-demand diagnostics. Bootup diagnostics run during bootup, health monitoring tests run in the background, and on-demand diagnostics run once or at user-designated intervals while the device is connected to a live network.
Bootup Diagnostics
Bootup diagnostics run during bootup and detect faulty hardware before a Cisco MDS 9000 Family switch brings a module online. For example, if you insert a faulty module in the device, bootup diagnostics test the module and take it offline before the device uses the module to forward traffic.
Bootup diagnostics also check the connectivity between the supervisor and module hardware and the data and control paths for all the ASICs. Table 9-1 describes the bootup diagnostic tests for a module and a supervisor.
Table 9-1 Bootup Diagnostics
Diagnostic | Description
Module
EOBCPortLoopback | Disruptive test, not an on-demand test. Ethernet out of band.
OBFL | Verifies the integrity of the onboard failure logging (OBFL) flash.
BootupPortLoopback | Disruptive test, not an on-demand test. A PortLoopback test that runs only during module bootup.
Supervisor
USB | Nondisruptive test. Checks the USB controller initialization on a module.
ManagementPortLoopback | Disruptive test, not an on-demand test. Tests loopback on the management port of a module.
EOBCPortLoopback | Disruptive test, not an on-demand test. Ethernet out of band.
OBFL | Verifies the integrity of the onboard failure logging (OBFL) flash.
Bootup diagnostics log failures to onboard failure logging (OBFL) and syslog and trigger a diagnostic LED indication (on, off, pass, or fail).
You can configure a Cisco MDS 9000 Family switch to either bypass the bootup diagnostics or run the complete set of bootup diagnostics. See the "Setting the Bootup Diagnostic Level" section.
Runtime or Health Monitoring Diagnostics
Runtime diagnostics are also called health monitoring (HM) diagnostics. These diagnostics provide information about the health of a live device. They detect runtime hardware errors, memory errors, the degradation of hardware modules over time, software faults, and resource exhaustion.
Health monitoring diagnostics are nondisruptive and run in the background to ensure the health of a device that is processing live network traffic. You can enable or disable health monitoring tests or change their runtime interval. Table 9-2 describes the health monitoring diagnostics and test IDs for a module and a supervisor.
Table 9-2 Health Monitoring Nondisruptive Diagnostics
Diagnostic | Default Interval | Default Setting | Description
Module
ASICRegisterCheck | 1 minute | active | Checks read/write access to scratch registers for the ASICs on a module.
PrimaryBootROM | 30 minutes | active | Verifies the integrity of the primary boot device on a module.
SecondaryBootROM | 30 minutes | active | Verifies the integrity of the secondary boot device on a module.
PortLoopback | 15 minutes | active | Verifies connectivity through every port on every module in the system.
RewriteEngineLoopback | 1 minute | active | Verifies the integrity of the nondisruptive loopback for all ports up to the Rewrite Engine ASIC device.
SnakeLoopback | 20 minutes | active | Performs a nondisruptive loopback on all ports, even those ports that are not in the shut state. The ports are formed into a snake during module bootup, and the supervisor checks the snake connectivity periodically.
Supervisor
ASICRegisterCheck | 20 seconds | active | Checks read/write access to scratch registers for the ASICs on the supervisor.
NVRAM | 5 minutes | active | Verifies the sanity of the NVRAM blocks on a supervisor.
RealTimeClock | 5 minutes | active | Verifies that the real-time clock on the supervisor is ticking.
PrimaryBootROM | 30 minutes | active | Verifies the integrity of the primary boot device on the supervisor.
SecondaryBootROM | 30 minutes | active | Verifies the integrity of the secondary boot device on the supervisor.
ExternalCompactFlash | 30 minutes | active | Verifies access to the external compact flash devices.
PwrMgmtBus | 30 seconds | active | Verifies the standby power management control bus.
SpineControlBus | 30 seconds | active | Verifies the availability of the standby spine module control bus.
SystemMgmtBus | 30 seconds | active | Verifies the availability of the standby system management bus.
StatusBus | 30 seconds | active | Verifies the status transmitted by the status bus for the supervisor, modules, and fabric cards.
StandbyFabricLoopback | 30 seconds | active | Verifies the connectivity of the standby supervisor to the crossbars on the spine card.
On-Demand Diagnostics
On-demand tests help localize faults and are usually needed in one of the following situations:
• To respond to an event that has occurred, such as isolating a fault.
• In anticipation of an event that may occur, such as a resource exceeding its utilization limit.
You can run all the health monitoring tests on demand.
You can schedule on-demand diagnostics to run immediately. See the "Starting or Stopping an On-Demand Diagnostic Test" section for more information.
You can also modify the default interval for a health monitoring test. See the "Activating a Diagnostic Test" section for more information.
High Availability
A key part of high availability is detecting hardware failures and taking corrective action while the device runs in a live network. Online diagnostics in high availability detect hardware failures and provide feedback to high availability software components to make switchover decisions.
Cisco MDS 9000 Family switches support stateless restarts for online diagnostics. After a reboot or supervisor switchover, a Cisco MDS 9000 Family switch applies the running configuration.
Licensing Requirements for Online Diagnostics
Product | License Requirement
Cisco NX-OS | Online diagnostics require no license. Any feature not included in a license package is bundled with the Cisco NX-OS system images and is provided at no extra charge to you. For a complete explanation of the Cisco NX-OS licensing scheme, see the Cisco MDS 9000 Family NX-OS Licensing Guide.
Guidelines and Limitations
• You cannot run disruptive online diagnostic tests on demand.
Default Settings
Table 9-3 lists the default settings for online diagnostic parameters.
Table 9-3 Default Online Diagnostic Parameters
Parameters | Default
Bootup diagnostics level | complete
Nondisruptive tests | active
Configuring Online Diagnostics
This section includes the following topics:
• Setting the Bootup Diagnostic Level
• Activating a Diagnostic Test
• Setting a Diagnostic Test as Inactive
• Starting or Stopping an On-Demand Diagnostic Test
• Clearing Diagnostic Results
• Simulating Diagnostic Results
Setting the Bootup Diagnostic Level
To configure the bootup diagnostics to run the complete set of tests, or to bypass all bootup diagnostic tests for a faster module bootup time, perform these tasks:
Note
We recommend that you set the bootup online diagnostics level to complete.
Step 1
Command: config terminal
Purpose: Places you in global configuration mode.
Example:
switch# config terminal
Enter configuration commands, one per line. End with CNTL/Z.
switch(config)#
Step 2
Command: diagnostic bootup level {complete | bypass}
Purpose: Configures the bootup diagnostic level to trigger diagnostics when the device boots:
• complete: Performs all bootup diagnostics. The default is complete.
• bypass: Does not perform any bootup diagnostics.
Example:
switch(config)# diagnostic bootup level complete
Step 3
Command: show diagnostic bootup level
Purpose: (Optional) Displays the bootup diagnostic level (bypass or complete) that is currently in place on the device.
Example:
switch(config)# show diagnostic bootup level
Step 4
Command: copy running-config startup-config
Purpose: (Optional) Copies the running configuration to the startup configuration.
Example:
switch(config)# copy running-config startup-config
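As a minimal sketch of the procedure above, the following session bypasses bootup diagnostics, confirms the setting, and then restores the recommended complete level before saving. It uses only the commands listed in the steps; the switch prompt is illustrative, and the reason for bypassing (for example, a faster module bootup during a maintenance window) is an assumption.

```text
switch# config terminal
! Skip all bootup diagnostic tests on subsequent boots
switch(config)# diagnostic bootup level bypass
switch(config)# show diagnostic bootup level
! Restore the recommended (and default) level
switch(config)# diagnostic bootup level complete
switch(config)# show diagnostic bootup level
switch(config)# copy running-config startup-config
```

Because the level is set back to complete before the configuration is copied, the saved startup configuration keeps the full set of bootup diagnostics.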
Activating a Diagnostic Test
To set a diagnostic test as active and optionally modify the interval, perform these tasks:
Step 1
Command: config terminal
Purpose: Places you in global configuration mode.
Example:
switch# config terminal
Enter configuration commands, one per line. End with CNTL/Z.
switch(config)#
Step 2
Command: diagnostic monitor interval module slot test [test-id | name | all] hour hour min minutes second seconds
Purpose: (Optional) Configures the interval at which the specified test is run. If no interval is set, the test runs at the interval set previously, or the default interval.
The argument ranges are as follows:
• slot: The range is from 1 to 10.
• test-id: The range is from 1 to 14.
• name: Can be any case-sensitive alphanumeric string up to 32 characters.
• hour: The range is from 0 to 23 hours.
• minutes: The range is from 0 to 59 minutes.
• seconds: The range is from 0 to 59 seconds.
Example:
switch(config)# diagnostic monitor interval module 6 test 3 hour 1 min 0 sec 0
Step 3
Command: diagnostic monitor module slot test [test-id | name | all]
Purpose: Activates the specified test.
The argument ranges are as follows:
• slot: The range is from 1 to 10.
• test-id: The range is from 1 to 14.
• name: Can be any case-sensitive alphanumeric string up to 32 characters.
Example:
switch(config)# diagnostic monitor module 6 test 3
Step 4
Command: show diagnostic content module {slot | all}
Purpose: (Optional) Displays information about the diagnostics and their attributes.
Example:
switch(config)# show diagnostic content module 6
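The steps above can be combined into one session. This sketch assumes module 6 and refers to the test by name rather than by ID. The name PortLoopback is taken from Table 9-2; whether a given module lists the test under exactly that name is an assumption, so the session displays the module's test content first.

```text
switch# config terminal
! List the tests, IDs, and current intervals available on module 6
switch(config)# show diagnostic content module 6
! Run the test every 30 minutes instead of at its default interval
switch(config)# diagnostic monitor interval module 6 test PortLoopback hour 0 min 30 second 0
! Activate the test
switch(config)# diagnostic monitor module 6 test PortLoopback
! Confirm the new interval and active state
switch(config)# show diagnostic content module 6
```

Referring to a test by name avoids depending on test ID numbering, which this chapter does not list for each test.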
Setting a Diagnostic Test as Inactive
Note
Inactive tests keep their current configuration but do not run at the scheduled interval.
To set a diagnostic test as inactive, perform this task:
Command: no diagnostic monitor module slot test [test-id | name | all]
Purpose: Inactivates the specified test.
The argument ranges are as follows:
• slot: The range is from 1 to 10.
• test-id: The range is from 1 to 14.
• name: Can be any case-sensitive alphanumeric string up to 32 characters.
Example:
switch(config)# no diagnostic monitor module 6 test 3
Starting or Stopping an On-Demand Diagnostic Test
You can start or stop an on-demand diagnostic test, optionally modify the number of iterations to repeat this test, and determine the action to take if the test fails.
Note
We recommend that you only manually start a disruptive diagnostic test during a scheduled network maintenance time.
To start or stop an on-demand diagnostic test, perform these tasks:
Step 1
Command: diagnostic ondemand iteration number
Purpose: (Optional) Configures the number of times that the on-demand test runs. The range is from 1 to 999. The default is 1.
Example:
switch# diagnostic ondemand iteration 5
Step 2
Command: diagnostic ondemand action-on-failure {continue failure-count num-fails | stop}
Purpose: (Optional) Configures the action to take if the on-demand test fails. The num-fails range is from 1 to 999. The default is 1.
Example:
switch# diagnostic ondemand action-on-failure stop
Step 3
Command: diagnostic start module slot test [test-id | name | all | non-disruptive] [port port-number | all]
Purpose: Starts one or more diagnostic tests on a module. The module slot range is from 1 to 10. The test-id range is from 1 to 14. The test name can be any case-sensitive alphanumeric string up to 32 characters. The port range is from 1 to 48.
Example:
switch# diagnostic start module 6 test all
Step 4
Command: diagnostic stop module slot test [test-id | name | all]
Purpose: Stops one or more diagnostic tests on a module. The module slot range is from 1 to 10. The test-id range is from 1 to 14. The test name can be any case-sensitive alphanumeric string up to 32 characters.
Example:
switch# diagnostic stop module 6 test all
Step 5
Command: show diagnostic status module slot
Purpose: (Optional) Verifies that the diagnostic has been scheduled.
Example:
switch# show diagnostic status module 6
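Put together, an on-demand run might look like the following sketch. Module 6 is a placeholder, and the iteration and failure counts are arbitrary values within the documented range of 1 to 999. The non-disruptive keyword comes from the diagnostic start syntax above. The show diagnostic ondemand setting and show diagnostic result commands are taken from the "Verifying the Online Diagnostics Configuration" section.

```text
! Run each on-demand test three times
switch# diagnostic ondemand iteration 3
! Keep going after a failure, with a failure count of 2
switch# diagnostic ondemand action-on-failure continue failure-count 2
switch# show diagnostic ondemand setting
! Start the tests on module 6 using the non-disruptive keyword
switch# diagnostic start module 6 test non-disruptive
switch# show diagnostic status module 6
switch# show diagnostic result module 6 test all detail
```

Checking the on-demand settings before starting the tests makes it easier to interpret the results afterward, because the iteration count and failure action apply to the whole run.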
Clearing Diagnostic Results
To clear the diagnostic test results, use the following command:
Command: diagnostic clear result module [slot | all] test {test-id | all}
Purpose: Clears the test result for the specified test.
The argument ranges are as follows:
• slot: The range is from 1 to 10.
• test-id: The range is from 1 to 14.
Example:
switch# diagnostic clear result module 2 test all
Simulating Diagnostic Results
To simulate a diagnostic test result, use the following command:
Command: diagnostic test simulation module slot test test-id {fail | random-fail | success} [port number | all]
Purpose: Simulates a test result. The test-id range is from 1 to 14. The port range is from 1 to 48.
Example:
switch# diagnostic test simulation module 2 test 2 fail
To clear the simulated diagnostic test result, use the following command:
Command: diagnostic test simulation module slot test test-id clear
Purpose: Clears the simulated test result. The test-id range is from 1 to 14.
Example:
switch# diagnostic test simulation module 2 test 2 clear
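One possible use of simulation is to check how a failure is reported before you rely on that reporting in production. The sketch below assumes module 2 and test ID 2 (placeholders, as in the examples above). It forces a failure, inspects the state with commands from the "Verifying the Online Diagnostics Configuration" section, and then removes the simulated result. When the simulated failure becomes visible in the results and events is not stated in this chapter, so the show commands may need to be repeated after the test next runs.

```text
! Force test 2 on module 2 to report a failure
switch# diagnostic test simulation module 2 test 2 fail
switch# show diagnostic simulation module 2
switch# show diagnostic result module 2 test all
switch# show diagnostic events error
! Remove the simulated result so the test reports real outcomes again
switch# diagnostic test simulation module 2 test 2 clear
switch# show diagnostic simulation module 2
```

Clearing the simulation at the end matters: otherwise the test would keep reporting the simulated outcome rather than the real hardware state.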
Verifying the Online Diagnostics Configuration
To display online diagnostics configuration information, use one of these commands:
Command | Purpose
show diagnostic bootup level | Displays information about bootup diagnostics.
show diagnostic content module {slot | all} | Displays information about diagnostic test content for a module.
show diagnostic description module slot test [test-name | all] | Displays the diagnostic description.
show diagnostic events [error | info] | Displays diagnostic events by error and information event type.
show diagnostic ondemand setting | Displays information about on-demand diagnostics.
show diagnostic result module slot [test [test-name | all]] [detail] | Displays information about the results of a diagnostic.
show diagnostic simulation module slot | Displays information about a simulated diagnostic.
show diagnostic status module slot | Displays the test status for all tests on a module.
show hardware capacity [eobc | fabric-utilization | forwarding | interface | module | power] | Displays information about the hardware capabilities and current hardware utilization by the system.
show module | Displays module information including the online diagnostic test status.
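As an illustration of how these commands can be combined, the sketch below shows one possible order for a routine check, moving from the overall module view down to detailed results. Module 6 is a placeholder, and the ordering is a suggestion rather than a documented procedure.

```text
! Overall module state, including online diagnostic status
switch# show module
! Confirm that bootup diagnostics are not bypassed
switch# show diagnostic bootup level
! See which tests exist on each module and their attributes
switch# show diagnostic content module all
! Detailed results for one module
switch# show diagnostic result module 6 detail
! Any diagnostic events of the error type
switch# show diagnostic events error
```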
Configuration Examples for Online Diagnostics
This example shows how to start all on-demand tests on a module:
diagnostic start module 6 test all
This example shows how to activate a test and set the test interval on a module:
configure terminal
diagnostic monitor module 6 test 2
diagnostic monitor interval module 6 test 2 hour 3 min 30 sec 0
Additional References
For additional information related to implementing online diagnostics, see the following sections:
• Related Documents
• Feature History for Online Diagnostics
Related Documents
Related Topic | Document Title
Online diagnostics CLI commands | Cisco MDS 9000 Family Command Reference
Feature History for Online Diagnostics
Table 9-4 lists the release history for this feature.
Table 9-4 Feature History for Online Diagnostics
Feature Name | Releases | Feature Information
Generic Online Diagnostics (GOLD) | 6.2 | This feature was introduced.