Contents
- Configuring GOLD Health Monitoring for the Cisco ASR 903 Router
- Restrictions for the GOLD feature
- Information About GOLD
- Limitations of Existing Logging Mechanism
- Understanding the Importance of GOLD Functionality
- Understanding the GOLD Feature
- Configuring Online Diagnostics
- Configuring the Bootup Diagnostics Level
- Configuring On-Demand Diagnostics
- Scheduling Diagnostics
- Configuring Health-Monitoring Diagnostics
- Displaying Online Diagnostic Tests and Test Results
- Supported GOLD Tests on the Cisco ASR 903 Router
- How to Manage Diagnostic Tests
- Configuration Examples for GOLD Feature
Configuring GOLD Health Monitoring for the Cisco ASR 903 Router
Generic Online Diagnostics (GOLD) is a health monitoring feature implemented on the Cisco ASR 903 Router. GOLD provides online diagnostic capabilities that run at bootup, periodically in the background, or on demand from the CLI.
Note
This feature is not applicable to the Cisco ASR 900 RSP3 Module.
Information About GOLD
The following sections provide details of the GOLD feature.
Limitations of Existing Logging Mechanism
To provide high availability for a router without downtime, it is imperative to analyze the stability of the system. System messages are the primary method of discovering the cause of a system failure. However, certain system failures do not send notifications. The cause of these failures is difficult to determine, because the existing logging mechanism neither reports them nor keeps a log of them.
Understanding the Importance of GOLD Functionality
Because certain system failures neither send a notification nor leave a log entry, these limitations must be addressed. The GOLD feature is designed specifically to detect errors by polling the system modules that have no notification mechanism. GOLD is implemented on the Cisco ASR 903 Router to actively poll for system errors. Online diagnostics is one of the requirements for high availability (HA). HA is a set of quality standards that seeks to limit the impact of equipment failures on the network. A key part of HA is detecting system failures and taking corrective action while the system is running in a live network.
Understanding the GOLD Feature
The GOLD feature is primarily used to poll for system errors in components that do not send a notification upon failure. Although the infrastructure can be used to poll for both hardware and system errors, its main scope is polling the status and error registers on physical hardware devices. The Cisco ASR 903 Router uses a distributed GOLD implementation. In this model, the core Cisco IOS GOLD subsystem is linked on both the route switch processor (RSP) and the interface modules.
Diagnostic tests can be registered either as local tests, which run on the RSP, or as proxy tests, which run on the line cards. When a proxy test is requested on the RSP, a command is sent to the line card using Inter-Process Communication (IPC), instructing it to run the test locally. The results are then returned to the RSP using IPC. Tests are specified by card type on a per-slot or per-subslot basis. Diagnostic tests can be run at bootup, periodically (triggered by a timer), or on demand from the CLI. The GOLD feature is managed through a range of commands that are mainly used to run on-demand diagnostic tests, schedule tests at particular intervals, monitor system health periodically, and view the diagnostic test results.
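The local and proxy registration model described above can be illustrated with a small Python sketch. It is not Cisco code: the class names (RouteProcessor, LineCard), the test names, and the dictionary used as a stand-in for an IPC message are all hypothetical and serve only to show how a request for a proxy test is forwarded to the card and how the result comes back.

```python
from typing import Callable, Dict, Tuple

TestFn = Callable[[], bool]  # a diagnostic test returns True on pass


class LineCard:
    """Hypothetical stand-in for a card running its own diagnostics subsystem."""

    def __init__(self, tests: Dict[str, TestFn]) -> None:
        self.tests = tests

    def handle_request(self, request: Dict[str, str]) -> Dict[str, object]:
        # Simulates receiving an IPC message, running the test locally,
        # and replying with the result.
        name = request["test"]
        return {"test": name, "passed": self.tests[name]()}


class RouteProcessor:
    """Hypothetical stand-in for the processor that owns the test registry."""

    def __init__(self) -> None:
        # (slot, test name) -> ("local", function) or ("proxy", card)
        self.registry: Dict[Tuple[str, str], Tuple[str, object]] = {}

    def register_local(self, slot: str, name: str, fn: TestFn) -> None:
        self.registry[(slot, name)] = ("local", fn)

    def register_proxy(self, slot: str, name: str, card: LineCard) -> None:
        self.registry[(slot, name)] = ("proxy", card)

    def run(self, slot: str, name: str) -> bool:
        kind, target = self.registry[(slot, name)]
        if kind == "local":
            return target()  # runs directly on the processor
        reply = target.handle_request({"test": name})  # forwarded to the card
        return bool(reply["passed"])


rsp = RouteProcessor()
rsp.register_local("R0", "LocalRegisterCheck", lambda: True)
card = LineCard({"PortRegisterCheck": lambda: False})
rsp.register_proxy("0/1", "PortRegisterCheck", card)
```

Calling rsp.run("0/1", "PortRegisterCheck") goes through the card and returns its result, while rsp.run("R0", "LocalRegisterCheck") never leaves the processor.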
Configuring Online Diagnostics
The following sections describe how to configure the various types of diagnostics and how to view test reports.
Configuring the Bootup Diagnostics Level
You can configure the bootup diagnostics level as minimal or complete, or you can bypass the bootup diagnostics entirely. Enter the complete keyword to run all bootup diagnostic tests and the minimal keyword to run minimal tests such as loopback. Enter the no form of the command to bypass all diagnostic tests. The default bootup diagnostics level is minimal.
Note
None of the currently implemented tests on the Cisco ASR 903 Router are bootup tests.
Configuring On-Demand Diagnostics
You can run the on-demand diagnostic tests from the CLI. You can set the execution action to either stop or continue the test when a failure is detected, or use the failure count setting to stop the test after a specific number of failures occur. You can configure a test to run multiple times using the iteration setting.
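How the iteration, action-on-failure, and failure count settings interact can be sketched in Python. This is only a model of the behavior described above, not the router's implementation; the function name run_on_demand and its parameters are hypothetical.

```python
from typing import Callable, Optional, Tuple


def run_on_demand(
    test: Callable[[], bool],
    iterations: int = 1,
    action_on_failure: str = "continue",  # "continue" or "stop"
    failure_limit: Optional[int] = None,  # stop after this many failures
) -> Tuple[int, int]:
    """Run `test` up to `iterations` times and return (runs, failures)."""
    if action_on_failure not in ("continue", "stop"):
        raise ValueError("action_on_failure must be 'continue' or 'stop'")
    runs = failures = 0
    for _ in range(iterations):
        runs += 1
        if test():
            continue  # the test passed, go to the next iteration
        failures += 1
        if action_on_failure == "stop":
            break  # stop at the first failure
        if failure_limit is not None and failures >= failure_limit:
            break  # stop once the failure count is reached
    return runs, failures
```

With a test that always fails, five iterations and the stop action end after one run, while the continue action with a failure limit of three ends after three runs.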
Scheduling Diagnostics
You can schedule online diagnostics to run at a designated time of day or on a daily, weekly, or monthly basis. You can schedule tests to run only once or to repeat at an interval. Use the no form of this command to remove the scheduling.
To schedule online diagnostics, follow these steps:
Step 1
Command or Action: enable
Example:
Router> enable
Purpose: Enables privileged EXEC mode. Enter your password if prompted.
Step 2
Command or Action: configure terminal
Example:
Router# configure terminal
Purpose: Enters global configuration mode.
Step 3
Command or Action:
diagnostic schedule {slot slot-no} test {test-id | test-id-range | all | complete | minimal | non-disruptive} {daily hh:mm | on mm dd year hh:mm | weekly day-of-week hh:mm}
and
diagnostic schedule {subslot slot/sub-slot} test {test-id | test-id-range | all | complete | minimal | non-disruptive | per-port {daily hh:mm | on mm dd year hh:mm | weekly day-of-week hh:mm | port {{num | port#range | all}{daily hh:mm | on mm dd year hh:mm | weekly day-of-week hh:mm}}}}
Example:
This example shows how to schedule the diagnostic testing on a specific date and time for a specific slot:
Router(config)# diagnostic schedule slot 1 test 1 on september 2 2009 12:00
This example shows how to schedule the diagnostic testing to occur daily at a certain time for a specific slot:
Router(config)# diagnostic schedule slot 1 test complete daily 08:00
Purpose: Schedules on-demand diagnostic tests for a specific date and time, how many times to run (iterations), and what action to take when errors are found.
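To make the daily and weekly keywords concrete, the following Python sketch computes when a scheduled test would next fire. It is an illustration only: the helper names next_daily and next_weekly are hypothetical, and the rule that a time already passed today rolls over to the next day or week is an assumption, not documented router behavior.

```python
from datetime import datetime, timedelta

DAYS = ["monday", "tuesday", "wednesday", "thursday",
        "friday", "saturday", "sunday"]


def next_daily(now: datetime, hh: int, mm: int) -> datetime:
    """Next occurrence of hh:mm strictly after `now`."""
    candidate = now.replace(hour=hh, minute=mm, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)  # today's slot has already passed
    return candidate


def next_weekly(now: datetime, day_of_week: str, hh: int, mm: int) -> datetime:
    """Next occurrence of hh:mm on the named weekday strictly after `now`."""
    target = DAYS.index(day_of_week.lower())
    candidate = now.replace(hour=hh, minute=mm, second=0, microsecond=0)
    candidate += timedelta(days=(target - now.weekday()) % 7)
    if candidate <= now:
        candidate += timedelta(days=7)  # this week's slot has already passed
    return candidate


# 2 September 2009 was a Wednesday.
now = datetime(2009, 9, 2, 9, 0)
daily_run = next_daily(now, 8, 0)              # 08:00 has passed: next day
weekly_run = next_weekly(now, "friday", 8, 0)  # the coming Friday
```

A schedule given with the on keyword needs no such calculation, since it names a single date and time.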
Configuring Health-Monitoring Diagnostics
You can configure health-monitoring diagnostic testing while the system is connected to a live network. You can configure the execution interval for each health-monitoring test, whether to generate a system message upon test failure, and whether to enable or disable an individual test. Use the no form of this command to disable testing.
Note
Before enabling the diagnostic monitor test, you must first set the interval at which the diagnostic test runs. An error message is displayed if the interval is not configured before monitoring is enabled.
Displaying Online Diagnostic Tests and Test Results
You can display the online diagnostic tests that are configured and check the results of the tests using the show commands.
Supported GOLD Tests on the Cisco ASR 903 Router
This section discusses the GOLD test cases that have been implemented on the Cisco ASR 903 Router. The Cisco ASR 903 Router supports the following categories of GOLD tests:
The following tests are currently supported:
Error Counter Monitoring Test: The error counter monitoring test is defined as a health monitoring test. It detects errors on ASICs attached to the active RSP. If errors exceed a certain threshold, the router displays a syslog message containing details including ASIC, register identifier, ASIC ID, ASIC instance, and counter values. The interval for polling for errors is fixed at 5 seconds.
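The threshold check can be pictured with a short Python sketch. Everything in it is an assumption made for illustration: the CounterSample fields mirror the details listed above, the message text is invented and is not the router's syslog format, and the comparison of a single polled value against the threshold is a guess at how "exceed a certain threshold" is evaluated.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class CounterSample:
    """One error counter read during a polling cycle (hypothetical layout)."""
    asic: str
    asic_id: int
    instance: int
    register: str
    value: int


def check_counters(samples: Iterable[CounterSample], threshold: int) -> List[str]:
    """Return one message per counter whose value exceeds the threshold."""
    return [
        f"error counter exceeded: asic={s.asic} id={s.asic_id} "
        f"instance={s.instance} register={s.register} value={s.value}"
        for s in samples
        if s.value > threshold
    ]


poll = [
    CounterSample("asicA", 1, 0, "reg_crc", 12),
    CounterSample("asicA", 1, 1, "reg_crc", 73),
]
messages = check_counters(poll, threshold=50)
```

In this model a monitoring loop would call check_counters once per polling interval and log whatever messages it returns.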
For an example of an error counter monitoring test configuration, see Configuration Examples for GOLD Feature.
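The sample output in that section lists the test with the attribute string ***N**F*A. A minimal Python sketch for decoding such a string follows, assuming each character position corresponds to one row of the attribute legend printed in the output, in the order shown there, with * meaning not applicable. The function name decode_attributes is hypothetical.

```python
from typing import Dict, List

# One entry per attribute position, in the order of the legend in the output.
LEGEND: List[Dict[str, str]] = [
    {"M": "Minimal bootup level test", "C": "Complete bootup level test"},
    {"B": "Basic ondemand test"},
    {"P": "Per port test", "V": "Per device test"},
    {"D": "Disruptive test", "N": "Non-disruptive test"},
    {"S": "Only applicable to standby unit"},
    {"X": "Not a health monitoring test"},
    {"F": "Fixed monitoring interval test"},
    {"E": "Always enabled monitoring test"},
    {"A": "Monitoring is active", "I": "Monitoring is inactive"},
]


def decode_attributes(attrs: str) -> List[str]:
    """Translate an attribute string into the legend descriptions it sets."""
    if len(attrs) != len(LEGEND):
        raise ValueError(f"expected {len(LEGEND)} characters, got {len(attrs)}")
    meanings = []
    for position, (flag, options) in enumerate(zip(attrs, LEGEND), start=1):
        if flag == "*":
            continue  # not applicable at this position
        if flag not in options:
            raise ValueError(f"unexpected flag {flag!r} at position {position}")
        meanings.append(options[flag])
    return meanings


decoded = decode_attributes("***N**F*A")
```

Under that assumption the string describes a non-disruptive test with a fixed monitoring interval whose monitoring is currently active, which agrees with the description of the error counter monitoring test above.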
How to Manage Diagnostic Tests
This section describes how to manage the diagnostic tests. The following GOLD commands are used to manage the on-demand and periodic diagnostic tests:
Configuration Examples for GOLD Feature
The following example shows a sample output of the test configuration, test attributes, and the supported coverage test levels for each test and for each slot:
Router#show diagnostic description slot R0 test ?

  Diagnostics test suite attributes:
    M/C/* - Minimal bootup level test / Complete bootup level test / NA
      B/* - Basic ondemand test / NA
    P/V/* - Per port test / Per device test / NA
    D/N/* - Disruptive test / Non-disruptive test / NA
      S/* - Only applicable to standby unit / NA
      X/* - Not a health monitoring test / NA
      F/* - Fixed monitoring interval test / NA
      E/* - Always enabled monitoring test / NA
      A/I - Monitoring is active / Monitoring is inactive

                                                       Test Interval   Thre-
  ID   Test Name                          Attributes   day hh:mm:ss.ms shold
  ==== ================================== ============ =============== =====
    1) TestErrorCounterMonitor ---------> ***N**F*A    000 00:00:05.00 50

Copyright © 2017, Cisco Systems, Inc. All rights reserved.