Tip | To monitor an individual component in a chassis, expand the node for that component.
Step 1 | In the Navigation pane, click Equipment.
Step 2 | Expand .
Step 3 | Click the chassis that you want to monitor.
Step 4 | Click one of the following tabs to view the status of the chassis:
Step 1 | In the Navigation pane, click Equipment.
Step 2 | Expand .
Step 3 | Click the server that you want to monitor.
Step 4 | In the Work pane, click one of the following tabs to view the status of the server:
Step 5 | In the Navigation pane, expand .
Step 6 | In the Work pane, right-click one or more of the following components of the adapter to open the navigator and view the status of the component:
Step 1 | In the Navigation pane, click Equipment.
Step 2 | Expand .
Step 3 | Click the server that you want to monitor.
Step 4 | In the Work pane, click one of the following tabs to view the status of the server:
Step 5 | In the Navigation pane, expand .
Step 6 | In the Work pane, right-click one or more of the following components of the adapter to open the navigator and view the status of the component:
Step 1 | In the Navigation pane, click Equipment.
Step 2 | On the Equipment tab, expand .
Step 3 | Click the module that you want to monitor.
Step 4 | Click one of the following tabs to view the status of the module:
Monitoring Management Interfaces
This policy defines how the mgmt0 Ethernet interface on the fabric interconnect should be monitored. If Cisco UCS detects a management interface failure, a failure report is generated. If the configured number of failure reports is reached, the system assumes that the management interface is unavailable and generates a fault. By default, the management interfaces monitoring policy is enabled.
If the affected management interface belongs to the fabric interconnect that is the managing instance, Cisco UCS confirms that the subordinate fabric interconnect is up and that no failure reports are currently logged against it, and then changes the managing instance for the endpoints.
If the affected fabric interconnect is currently the primary in a high availability setup, a failover of the management plane is triggered. The data plane is not affected by this failover.
You can set the following properties related to monitoring the management interface:
Type of mechanism used to monitor the management interface.
Interval at which the management interface's status is monitored.
Maximum number of monitoring attempts that can fail before the system assumes that the management interface is unavailable and generates a fault message.
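The relationship between failed monitoring attempts and the resulting fault can be illustrated with a small model. This is only a sketch, not how Cisco UCS Manager is implemented: the class, field, and method names are hypothetical, and the behavior of clearing the count after a successful attempt is an assumption that the text above does not state.

```python
from dataclasses import dataclass


@dataclass
class MgmtInterfaceMonitor:
    """Toy model of the monitoring policy; every name here is hypothetical."""

    max_fail_reports: int  # failed attempts tolerated before a fault is raised
    fail_count: int = 0
    fault_raised: bool = False

    def record_poll(self, interface_ok: bool) -> bool:
        """Record one monitoring attempt and return whether a fault is active."""
        if interface_ok:
            # Assumption: a successful attempt clears the count and the fault.
            self.fail_count = 0
            self.fault_raised = False
        else:
            self.fail_count += 1
            if self.fail_count >= self.max_fail_reports:
                self.fault_raised = True
        return self.fault_raised


monitor = MgmtInterfaceMonitor(max_fail_reports=3)
# Three failed attempts in a row raise the fault; the next success clears it.
print([monitor.record_poll(ok) for ok in (False, False, False, True)])
```

With a maximum of three failure reports, the fault in this model becomes active on the third consecutive failed attempt.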
Step 1 | In the Navigation pane, click Admin.
Step 2 | In the Admin tab, expand .
Step 3 | Click Management Interfaces.
Step 4 | In the Work pane, click the Management Interfaces Monitoring Policy tab.
Step 5 | Complete the following fields:
Step 6 | If you chose Mii Status for the monitoring mechanism, complete the following fields in the Media Independent Interface Monitoring area:
Step 7 | If you chose Ping Arp Targets for the monitoring mechanism, complete the fields on the appropriate tab in the ARP Target Monitoring area. If you are using IPv4 addresses, complete the following fields in the IPv4 subtab:
If you are using IPv6 addresses, complete the following fields in the IPv6 subtab:
Type 0.0.0.0 for an IPv4 address to remove the ARP target or :: for an IPv6 address to remove the N-disc target.
Step 8 | If you chose Ping Gateway for the monitoring mechanism, complete the following fields in the Gateway Ping Monitoring area:
Step 9 | Click Save Changes.
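The values 0.0.0.0 and :: that remove a target in the procedure above are the IPv4 and IPv6 unspecified addresses. If you prepare target values in a script before entering them, a check such as the following can flag entries that would clear a target rather than set one. This is a minimal sketch that uses Python's standard ipaddress module; the function name is hypothetical and nothing here calls Cisco UCS Manager.

```python
import ipaddress


def is_removal_address(value: str) -> bool:
    """Return True for 0.0.0.0 or ::, the values that remove a target."""
    # ip_address() raises ValueError if the string is not a valid address.
    return ipaddress.ip_address(value).is_unspecified


for candidate in ("0.0.0.0", "::", "192.0.2.10"):
    print(candidate, is_removal_address(candidate))
```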
Health Monitoring
You can monitor Cisco UCS Fabric Interconnect system statistics and faults that allow you to manage overall system health, such as:
Low kernel memory: This is the memory segment that the Linux kernel addresses directly. Cisco UCS Manager raises a major fault on a Fabric Interconnect when kernel memory falls below 100 MB. Two statistics, KernelMemFree and KernelMemTotal, raise alarms when low memory thresholds are met. Both statistics are added to the threshold policy for system statistics, where you can define your own thresholds.
Low memory faults are supported on the following UCS Fabric Interconnects:
To view Fabric Interconnect low memory statistics and correctable memory statistics:
Cisco UCS Manager raises a major severity fault on a Fabric Interconnect when free kernel memory falls below 100 MB.
The Cisco Integrated Management Controller (CIMC) reports the following memory usage events for blade and rack-mount servers:
When memory falls below 1 MB, CIMC has fatal memory usage. Reset is imminent.
When memory falls below 5 MB, CIMC has extremely high memory usage.
When memory falls below 10 MB, CIMC has high memory usage.
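The three thresholds above form a simple ladder, which the following sketch expresses as a function. It assumes that the input is the remaining CIMC memory in megabytes. The function name and the "normal" label for values of 10 MB or more are hypothetical and do not come from CIMC.

```python
def classify_cimc_memory(free_mb: float) -> str:
    """Map remaining CIMC memory (MB) to the usage level described above."""
    if free_mb < 1:
        return "fatal"  # reset is imminent
    if free_mb < 5:
        return "extremely high"
    if free_mb < 10:
        return "high"
    return "normal"  # hypothetical label for anything at or above 10 MB


for value in (0.5, 3, 7, 64):
    print(value, classify_cimc_memory(value))
```

The checks run from the lowest threshold upward so that each value is assigned only the most severe level it qualifies for.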
If CIMC reports two health events, one with major severity, the other with minor severity, the system raises a major severity fault and displays details under the Health tab Management Services subtab.
To view CIMC memory usage events:
Not every health event translates to a fault; only the highest severity health event does. Faults appear under tab.
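The rule that only the highest severity health event becomes a fault can be sketched as a small selection function. This is an illustration under stated assumptions: the text above mentions only minor and major severities, so the wider ranking used here is assumed, and the function and variable names are hypothetical.

```python
from typing import Iterable, Optional

# Assumed ranking; the text above mentions only "minor" and "major".
SEVERITY_RANK = {"info": 0, "warning": 1, "minor": 2, "major": 3, "critical": 4}


def fault_severity(health_events: Iterable[str]) -> Optional[str]:
    """Return the severity of the single fault raised for a set of health events."""
    events = list(health_events)
    if not events:
        return None  # no health events, so no fault
    return max(events, key=SEVERITY_RANK.__getitem__)


# One major and one minor health event produce a single major fault.
print(fault_severity(["minor", "major"]))
```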
The Cisco Chassis Management Controller (CMC) reports memory usage events for I/O modules and chassis.
The system raises a fault on the aggregation of reported health status.
To view CMC memory usage events:
Cisco UCS Manager reports the following statistics for Cisco Fabric Extenders (FEXs) under System Stats:
Cisco 2200 Series and 2300 Series FEX support statistics monitoring.
Note | FEX stats are not supported on the Cisco UCS Mini platform.
All FEX stats are added to the threshold policy as FexSystemStats, where you can define your own thresholds.
Local storage monitoring in Cisco UCS provides status information on local storage that is physically attached to a blade or rack server. This includes RAID controllers, physical drives and drive groups, virtual drives, RAID controller batteries (BBU), Transportable Flash Modules (TFM) and super-capacitors, FlexFlash controllers, and SD cards.
Cisco UCS Manager communicates directly with the LSI MegaRAID controllers and FlexFlash controllers using an out-of-band (OOB) interface, which enables real-time updates. Some of the information that is displayed includes:
RAID controller status and rebuild rate.
The drive state, power state, link speed, operability and firmware version of physical drives.
The drive state, operability, strip size, access policies, drive cache, and health of virtual drives.
The operability of a BBU, whether it is a supercap or battery, and information about the TFM.
LSI storage controllers use a Transportable Flash Module (TFM) powered by a super-capacitor to provide RAID cache protection.
Information on SD cards and FlexFlash controllers, including RAID health and RAID state, card health, and operability.
Information on operations that are running on the storage component, such as rebuild, initialization, and relearning.
Note | After a CIMC reboot or build upgrades, the status, start time, and end times of operations running on the storage component might not be displayed correctly.
Detailed fault information for all local storage components.
Note | All faults are displayed on the Faults tab.
The type of monitoring supported depends upon the Cisco UCS server.
Through Cisco UCS Manager, you can monitor local storage components for the following servers:
Cisco UCS B200 M3 blade server
Cisco UCS B420 M3 blade server
Cisco UCS B22 M3 blade server
Cisco UCS B200 M4 blade server
Cisco UCS B260 M4 blade server
Cisco UCS B460 M4 blade server
Cisco UCS C460 M2 rack server
Cisco UCS C420 M3 rack server
Cisco UCS C260 M2 rack server
Cisco UCS C240 M3 rack server
Cisco UCS C220 M3 rack server
Cisco UCS C24 M3 rack server
Cisco UCS C22 M3 rack server
Cisco UCS C220 M4 rack server
Cisco UCS C240 M4 rack server
Cisco UCS C460 M4 rack server
Note | Not all servers support all local storage components. For Cisco UCS rack servers, the onboard SATA RAID 0/1 controller integrated on the motherboard is not supported.
Only legacy disk drive monitoring is supported through Cisco UCS Manager for the following servers:
Note | In order for Cisco UCS Manager to monitor the disk drives, the 1064E storage controller must have a firmware level contained in a UCS bundle with a package version of 2.0(1) or higher.
These prerequisites must be met for local storage monitoring or legacy disk drive monitoring to provide useful status information:
Note | The following information is applicable only for B200 M1/M2 and B250 M1/M2 blade servers.
The legacy disk drive monitoring for Cisco UCS provides Cisco UCS Manager with blade-resident disk drive status for supported blade servers in a Cisco UCS domain. Disk drive monitoring provides a unidirectional fault signal from the LSI firmware to Cisco UCS Manager to provide status information.
The following server and firmware components gather, send, and aggregate information about the disk drive status in a server:
Physical presence sensor—Determines whether the disk drive is inserted in the server drive bay.
Physical fault sensor—Determines the operability status reported by the LSI storage controller firmware for the disk drive.
IPMI disk drive fault and presence sensors—Sends the sensor results to Cisco UCS Manager.
Disk drive fault LED control and associated IPMI sensors—Controls disk drive fault LED states (on/off) and relays the states to Cisco UCS Manager.
Flash life wear level monitoring enables you to monitor the life span of solid state drives. You can view both the percentage of the flash life remaining, and the flash life status. Wear level monitoring is supported on the Fusion IO mezzanine card with the following Cisco UCS blade servers:
Cisco UCS B22 M3 blade server
Cisco UCS B200 M3 blade server
Cisco UCS B420 M3 blade server
Cisco UCS B200 M4 blade server
Cisco UCS B260 M4 blade server
Cisco UCS B460 M4 blade server
Note | Wear level monitoring requires the following:
Step 1 | In the Navigation pane, click Equipment.
Step 2 | Expand .
Step 3 | Click the server for which you want to view the status of your local storage components.
Step 4 | In the Work pane, click the Inventory tab.
Step 5 | Click the Storage subtab to view the status of your RAID controllers and any FlexFlash controllers.
Step 6 | Click the down arrows to expand the Local Disk Configuration Policy, Actual Disk Configurations, Disks, and Firmware bars and view additional status information.
The Check Consistency operation is not supported for RAID 0 volumes. You must change the local disk configuration policy to run Check Consistency. For more information, see Changing a Local Disk Configuration Policy.
With Cisco UCS Manager, you can view the properties for certain graphics cards and controllers. Graphics cards are supported on the following servers:
Note | Certain NVIDIA Graphics Processing Units (GPU) do not support Error Correcting Code (ECC) and vGPU together. Cisco recommends that you refer to the release notes published by NVIDIA for the respective GPU to determine whether it supports ECC and vGPU together.
LSI storage controllers use a Transportable Flash Module (TFM) powered by a supercapacitor to provide RAID cache protection. With Cisco UCS Manager, you can monitor these components to determine the status of the battery backup unit (BBU). The BBU operability status can be one of the following:
Operable—The BBU is functioning successfully.
Inoperable—The TFM or BBU is missing, or the BBU has failed and needs to be replaced.
Degraded—The BBU is predicted to fail.
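If you record BBU operability in your own tooling, the three statuses above map naturally to an enumeration. The sketch below is illustrative only: the type and function names are hypothetical, and the rule that both Inoperable and Degraded call for attention is inferred from the status descriptions above, not from a Cisco UCS Manager API.

```python
from enum import Enum


class BbuOperability(Enum):
    """The three BBU operability statuses listed above."""

    OPERABLE = "Operable"
    INOPERABLE = "Inoperable"
    DEGRADED = "Degraded"


def needs_attention(status: BbuOperability) -> bool:
    """Return True when the BBU is missing, has failed, or is predicted to fail."""
    return status in (BbuOperability.INOPERABLE, BbuOperability.DEGRADED)


for status in BbuOperability:
    print(status.value, needs_attention(status))
```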
TFM and supercap functionality is supported beginning with Cisco UCS Manager Release 2.1(2).
The CIMC sensors for TFM and supercap on the Cisco UCS B420 M3 blade server are not polled by Cisco UCS Manager.
If the TFM and supercap are not installed on the Cisco UCS B420 M3 blade server, or are installed and then removed from the blade server, no faults are generated.
If the TFM is not installed on the Cisco UCS B420 M3 blade server, but the supercap is installed, Cisco UCS Manager reports the entire BBU system as absent. You should physically check whether both the TFM and supercap are present on the blade server.
The following Cisco UCS servers support TFM and supercap:
This procedure applies only to Cisco UCS servers that support RAID configuration and TFM. If the BBU has failed or is predicted to fail, you should replace the unit as soon as possible.
Note | This applies only to Cisco UCS servers that support RAID configuration and TFM.
Trusted Platform Module (TPM) is included on all Cisco UCS M3 blade and rack-mount servers. Operating systems can use TPM to enable encryption. For example, Microsoft's BitLocker Drive Encryption uses the TPM on Cisco UCS servers to store encryption keys.
Cisco UCS Manager enables monitoring of TPM, including whether TPM is present, enabled, or activated.