Diagnostic Tools in Diagnostics for the UCS C-Series Servers Utility

This chapter includes the following sections:

Diagnostic Tools Functions


Note

Information in this chapter is applicable only to Diagnostics for the UCS C-Series Servers Utility.


Diagnostic tools allow you to:

  • Run tests on various server components to identify hardware issues, and view an analysis of the test results in a tabular format.

  • Run all the tests using the Quick Tasks functionality without browsing through available tests.

  • Run tests serially, as running some tests in parallel may interfere with other tests.

  • Configure a test by entering argument values other than the defaults.

  • Select tests you want to run using the Test Suite functionality.

  • Probe the current state of the server and view hardware issues.
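
The serial execution and tabular reporting described in the list above can be pictured with a short sketch. This is not the utility's implementation; the test functions, names, and table layout below are hypothetical stand-ins chosen only to illustrate running tests one after another and collecting the results in rows.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for diagnostic tests; each returns True on pass.
def memory_check() -> bool:
    return True

def network_check() -> bool:
    return False  # simulate a failing test

def run_serially(tests: Dict[str, Callable[[], bool]]) -> List[Tuple[str, str]]:
    """Run each test one after another and collect (name, result) rows."""
    rows = []
    for name, test in tests.items():
        try:
            result = "Passed" if test() else "Failed"
        except Exception:
            # A test that crashes is reported as failed instead of stopping the run.
            result = "Failed"
        rows.append((name, result))
    return rows

def format_table(rows: List[Tuple[str, str]]) -> str:
    """Render the rows as a simple fixed-width text table."""
    width = max([len("Test")] + [len(name) for name, _ in rows])
    lines = [f"{'Test'.ljust(width)}  Result"]
    lines += [f"{name.ljust(width)}  {result}" for name, result in rows]
    return "\n".join(lines)

rows = run_serially({"Memory": memory_check, "Network": network_check})
print(format_table(rows))
```

Because each test finishes before the next one starts, no test can disturb another test's measurements, which is the reason given above for running tests serially.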

The following table describes when you should use a specific diagnostic functionality.

Table 1. Using Diagnostics

Diagnostic Component

Function

F7 Option

Use this option to run a specific set of tests when the server is booting up.

The components that are tested are memory, processor, cache, Smart disk, UPI, memory pattern, and RAID adapter.

Quick Tests

Use this when you want to quickly check the status of a subsystem within a stipulated period. The components that can be tested under the quick test are processor, cache, memory, disk, video, network, UPI, CIMC, RAID, and chipset.

Comprehensive Tests

Use this when you want to test a subsystem in detail. These tests are designed to stress the subsystems and report the error. The tests that can be run are processor, memory, UPI, disk, and NUMA.

Quick Tasks

Provides consolidated access to both the quick and comprehensive tests, so you can run either type from one place.

Test Suite

All the tests available under the quick and comprehensive tests are available here. The test suite lets you select as many tests as you like (using check boxes) and run them together.

Tests Log Summary

Use the test log summary to view the log, error log, and analysis of all the tests you have run. You can use four filters to sort the logs.

Tests Summary

This table in the left navigation pane shows the results of the tests you have run, grouped as passed tests, tests in queue, or failed tests.

Using Diagnostic Tools

Using the F7 Diagnostics Option

UCS-SDU provides you with an option to run a few pre-defined diagnostic tests on the server when it is booting. You can initiate these diagnostic tests by using the F7 option. This F7 option boots the SDU image mounted on the FlexMMC and automatically runs a set of pre-defined diagnostic tests.

If the SDU image is not mounted on the FlexMMC, you must map the SDU image using virtual media. If the image is neither mounted on the FlexMMC nor mapped using virtual media, these diagnostic tests cannot be completed. After the tests are completed, the SDU interface appears and displays the test results. The interface displays a progress report indicating which diagnostic tests have passed, which have failed, and which are queued for completion.
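
The image lookup described above amounts to a simple fallback rule, sketched here for illustration. The function name, parameters, and return strings are hypothetical and are not part of the utility; only the order (FlexMMC first, then virtual media, otherwise no diagnostics) comes from the text.

```python
from typing import Optional

def resolve_sdu_source(flexmmc_mounted: bool, vmedia_mapped: bool) -> Optional[str]:
    """Return where the SDU image would be booted from, or None if unavailable."""
    if flexmmc_mounted:
        return "flexmmc"          # preferred: image mounted on the FlexMMC
    if vmedia_mapped:
        return "virtual-media"    # fallback: image mapped using virtual media
    return None                   # no image available: the F7 tests cannot run

for flexmmc, vmedia in [(True, False), (False, True), (False, False)]:
    source = resolve_sdu_source(flexmmc, vmedia)
    print(flexmmc, vmedia, source or "diagnostics cannot run")
```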


Note

You can use this option only when the server is booting.


Quick Tests

Run these tests to check quickly for hardware issues. They usually take 20 to 30 minutes to run and test limited functionality for a few subsystems. The comprehensive tests provide more exhaustive diagnostics.

To run the quick test, follow these steps:

Procedure


Step 1

Click Diagnostic Tools from the left navigation pane.

Step 2

Click Tests.

Step 3

Click the Quick Tests collapsible button to view the types of quick tests available for you to run.

Step 4

Click a subsystem (such as memory, video, or network).

Step 5

On the content pane, click Run all quick tests.

The test is run and the status is displayed in the Tests Status area.

The following table describes the subsystems covered under Quick Tests.

Table 2. Quick Tests

Test

Description

Cache Validation Test

Runs CPU cache-specific tests to exercise the CPU caches and checks for correctable/uncorrectable cache errors.

Chipset Test

Runs a test to check the chipset for any errors logged in the chipset RAS registers.

Enclosure Test

Runs a test to check the enclosure.

Video Memory Stress

Runs stress tests on the Video Memory.

CIMC Test

Runs CIMC self-test through the IPMI interface and also checks for SEL fullness.

CPU Stress

CPU Stress Test

CPU Stream

CPU Stress Test using stream benchmark.

CPU Cache

CPU Stress Test that is run in parallel on all processors.

CPU Register

CPU Register access test

Memory Noise Test

Write or verify random data and its complement with large address variation.

Memory Butterfly Test

Each loop write, then verify address and address complement in next address (64-bit data).

Network Test

Tests the available network interfaces by running the internal loopback test, register test, Electrically Erasable Programmable Read-Only Memory (EEPROM) test, and interrupt test.

UPI Test

Tests the Ultra Path Interconnect (UPI) fabric.

Note 

Applicable to Intel only.
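
To make the Memory Butterfly Test row in the table above more concrete, the following sketch shows one possible reading of "write, then verify address and address complement in next address" using 64-bit words. It runs against a plain Python list standing in for memory; the function names and the pairing of adjacent words are assumptions, not the utility's actual algorithm.

```python
MASK64 = (1 << 64) - 1  # keep values within 64 bits

def butterfly_write(memory: list) -> None:
    """Word i holds its own address; word i + 1 holds the complement of that address."""
    for addr in range(0, len(memory) - 1, 2):
        memory[addr] = addr & MASK64
        memory[addr + 1] = (~addr) & MASK64

def butterfly_verify(memory: list) -> list:
    """Return the addresses whose contents do not match the expected pattern."""
    bad = []
    for addr in range(0, len(memory) - 1, 2):
        if memory[addr] != (addr & MASK64):
            bad.append(addr)
        if memory[addr + 1] != ((~addr) & MASK64):
            bad.append(addr + 1)
    return bad

memory = [0] * 8            # simulated memory of eight 64-bit words
butterfly_write(memory)
clean = butterfly_verify(memory)

memory[3] ^= 1              # simulate a single flipped bit in word 3
faulty = butterfly_verify(memory)
print(clean, faulty)
```

Storing a value next to its complement means adjacent words differ in every bit, so a flipped bit or a shorted data line in either word shows up as a mismatch during verification.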


Comprehensive Tests

Comprehensive tests can run for hours and are usually run when quick tests cannot diagnose the issue with your server. They are designed to test multiple hardware components and find issues that may be caused by multiple components on your server.

The individual tests can be customized and run to test some user-defined conditions. You can also select a group of tests to be run.

To run the comprehensive test, follow these steps:

Procedure


Step 1

Click Diagnostic Tools from the left navigation pane.

Step 2

Click Tests.

Step 3

Click the Comprehensive Tests collapsible button to view the types of comprehensive tests available for you to run.

Step 4

Click a subsystem (such as processor, memory, or network).

Step 5

On the content pane, click Run all comprehensive tests.

The test is run and the status is displayed in the Tests Status area.

The following table describes the subsystems covered under comprehensive tests.

Table 3. Comprehensive Tests

Test

Description

Processor Stress Test

Imposes maximum stress on CPU and memory on the system. You can set the time (in minutes) that you want this test to run for.

Video Memory Stress

Runs stress tests on the video memory.

UPI Stress Test

Runs a test to stress the UPI interconnect by generating traffic between the NUMA nodes.

Note 

Applicable to Intel only.

CIMC Test

Runs CIMC self-test through the IPMI interface and also checks for SEL fullness.

CPU Stress

CPU Stress Test

CPU Stream

CPU Stress Test using stream benchmark.

CPU Cache

CPU Stress Test that is run in parallel on all processors.

CPU Register

CPU Register access test

Memory Noise

Write or verify random data and its complement with large address variation.

Memory Random

Sequentially write random data to memory, verify, write complement, verify, increment seed for next loop.

Memory March

Each loop write 0, read 0/write 1 (up direction), then read 1, write 0/read 0 (down direction).

Memory Walk

Each loop walk ones followed by walk zeroes (64-bit data).

Memory Address

Using 64-bit addressing write address in address for each loop.

Memory Pattern

Write sequence 0x00 to 0xFF, which is a prime 257-byte sequence that ensures the low address starts with a different byte each loop.

Memory Butterfly Test

Each loop write, then verify address and address complement in next address (64-bit data).

NUMA Test

Runs a test to stress the NUMA memory access patterns and check for errors.
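
The Memory March and Memory Walk rows in the table above describe classic memory test patterns. The sketch below illustrates them against a simulated memory; it follows the step order given in the table, but the function names, the single-bit cells, and the StuckAtZero fault model are assumptions made for the example and do not come from the utility.

```python
def march_test(memory: list) -> list:
    """Return (address, expected, actual) tuples for every mismatch found."""
    errors = []
    n = len(memory)

    def expect(addr: int, value: int) -> None:
        if memory[addr] != value:
            errors.append((addr, value, memory[addr]))

    for addr in range(n):               # write 0 to every cell
        memory[addr] = 0
    for addr in range(n):               # up direction: read 0, write 1
        expect(addr, 0)
        memory[addr] = 1
    for addr in reversed(range(n)):     # down direction: read 1, write 0, read 0
        expect(addr, 1)
        memory[addr] = 0
        expect(addr, 0)
    return errors

def walking_patterns(width: int = 64) -> list:
    """Walking ones (a single 1 bit) followed by walking zeroes (a single 0 bit)."""
    mask = (1 << width) - 1
    ones = [1 << bit for bit in range(width)]
    zeroes = [(~pattern) & mask for pattern in ones]
    return ones + zeroes

class StuckAtZero(list):
    """Simulated memory in which one cell always stores 0 (a stuck-at-0 fault)."""
    def __init__(self, size: int, bad_addr: int):
        super().__init__([0] * size)
        self.bad_addr = bad_addr

    def __setitem__(self, addr, value):
        super().__setitem__(addr, 0 if addr == self.bad_addr else value)

healthy_errors = march_test([0] * 8)
faulty_errors = march_test(StuckAtZero(8, bad_addr=5))
print(healthy_errors, faulty_errors)
```

In this simulation the march pass reports the stuck cell at address 5 because the cell fails to read back the 1 written in the up direction.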


Quick Tasks

Quick Tasks allow you to get started with diagnostic tools immediately. You can run all the tests (quick and comprehensive) from here and send the details to Cisco, which can use the logs to troubleshoot and provide information about problems with your system. To use this feature, follow these steps:

Procedure


Step 1

Click Diagnostic Tools from the left navigation pane.

Step 2

Click Quick Tasks.

Step 3

Select either Run Quick Tests or Run Comprehensive Tests from the toolbar.

The status appears in the Tests Status pane. You can also view detailed test results in the Tests Log Summary.


Tests Suite

The Test Suite allows you to run quick tests and comprehensive tests in a batch. It lists the available tests, along with the type and description of each test. You can select any number of tests you want to run from the list and view the result in the Tests Status column.

To run the test suite, follow these steps:

Procedure


Step 1

Click Tests Suite from the left navigation pane.

Step 2

Select the tests you want to run by clicking the required check boxes.

Step 3

Click Run Tests Suite to run the tests you added to the test suite.

The status appears in the Tests Status pane along with the name, suite ID, result, start time, and end time. You can also open the Tests Log Summary to check the execution status of the tests in the test suite.


Tests Log Summary

Use the Tests Log Summary functionality to examine the test logs for troubleshooting. To view the Tests Log Summary, follow these steps:

Procedure


Step 1

Click Diagnostic Tools on the left navigation pane.

Step 2

Click Tests Log Summary on the left navigation pane.

Step 3

Select a filter from the filter drop-down and click Go. The status, result, start time, and end time of the test are displayed.

Step 4

Click a specific log entry (for example, click Memory Test) for more details.

The Log, Error Log (if the test failed), and the analysis of the specific test appear in the content pane.
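
The filtering in the steps above can be pictured as selecting rows from a list of log entries. The sketch below is illustrative only: the LogEntry fields mirror the columns named in Step 3, but the filter shown (by result) is an assumption, since this chapter does not name the four filters, and the sample entries are made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class LogEntry:
    name: str
    status: str
    result: str
    start_time: str
    end_time: str

def filter_logs(entries: List[LogEntry], result: Optional[str] = None) -> List[LogEntry]:
    """Return entries matching the given result, or all entries if no filter is set."""
    if result is None:
        return list(entries)
    return [entry for entry in entries if entry.result == result]

# Made-up sample data, for illustration only.
entries = [
    LogEntry("Memory Test", "Completed", "Passed", "10:00", "10:12"),
    LogEntry("Network Test", "Completed", "Failed", "10:12", "10:15"),
]

failed = filter_logs(entries, result="Failed")
print([entry.name for entry in failed])
```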