At blade start-up, POST diagnostics test the CPUs, DIMMs, HDDs, and adapter cards and send any failure notifications to UCS Manager. You can view these notifications in the System Error Log or in the output of the show tech-support command. If errors are found, an amber diagnostic LED also lights next to the failed component. At run time, the blade BIOS, component drivers, and OS all monitor for hardware faults and light the amber diagnostic LED for a component if an uncorrectable error occurs or if correctable errors (such as host ECC errors) exceed the allowed threshold.
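The run-time rule for lighting a component's amber diagnostic LED can be sketched as a simple predicate. This is only an illustration of the logic described above, not code from the BIOS or UCS Manager: the `ComponentErrors` type, the `should_light_amber_led` function, and the threshold value of 10 are all hypothetical, and the real allowed threshold is platform-specific.

```python
from dataclasses import dataclass

# Hypothetical value for illustration only; the actual allowed threshold
# is defined by the platform and is not stated in this guide.
CORRECTABLE_ERROR_THRESHOLD = 10


@dataclass
class ComponentErrors:
    """Error counters for one monitored component (hypothetical type)."""
    uncorrectable: int = 0
    correctable: int = 0


def should_light_amber_led(
    errors: ComponentErrors,
    threshold: int = CORRECTABLE_ERROR_THRESHOLD,
) -> bool:
    """Return True if the component's amber diagnostic LED should light.

    The LED lights on any uncorrectable error, or when the number of
    correctable errors (for example, host ECC errors) is over the threshold.
    """
    return errors.uncorrectable > 0 or errors.correctable > threshold


if __name__ == "__main__":
    dimm = ComponentErrors(correctable=12)
    print(should_light_amber_led(dimm))
```

Under these assumptions, a single uncorrectable error always lights the LED, while correctable errors light it only once their count passes the threshold.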
LED states are saved, so if you remove the blade from the chassis, the LED values persist for up to 10 minutes. Pressing the LED diagnostics button on the motherboard lights the LEDs that currently show a component fault for up to 30 seconds, which makes the failed component easier to identify. LED fault values are reset when the blade is reinserted into the chassis and booted, and the diagnostic process starts over.
If DIMM insertion errors are detected, blade discovery may fail, and the errors are reported in the server POST information, which you can view in the UCS Manager GUI or CLI. The rules for populating DIMMs in a UCS blade server depend on the blade server model. Refer to the documentation for your specific blade server for those rules.
HDD status LEDs are on the front face of the HDD. Faults on the CPU, DIMMs, or adapter cards also cause the server health LED to light solid amber for minor error conditions or to blink amber for critical error conditions.
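As a quick reference, the server health LED behavior described above can be written as a small lookup. This is a minimal sketch: the `Severity` enum and `health_led_state` function are hypothetical names, and returning `None` when there is no fault is an assumption, since this section does not describe the LED's no-fault state.

```python
from enum import Enum
from typing import Optional


class Severity(Enum):
    """Fault severity levels named in this section (hypothetical type)."""
    MINOR = "minor"
    CRITICAL = "critical"


def health_led_state(severity: Optional[Severity]) -> Optional[str]:
    """Map a CPU, DIMM, or adapter card fault severity to the health LED state.

    Returns None when no fault is present; the no-fault LED state is not
    covered here, so it is deliberately left unspecified.
    """
    if severity is Severity.CRITICAL:
        return "blinking amber"
    if severity is Severity.MINOR:
        return "solid amber"
    return None


if __name__ == "__main__":
    print(health_led_state(Severity.MINOR))
    print(health_led_state(Severity.CRITICAL))
```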