I/O Module-Related Faults

fltEquipmentIOCardRemoved

Fault Code

F0376

Description

[sensor_name]: PCI Slot [id] riser or card missing: reseat or replace pci card [id]

Explanation

This fault indicates that an I/O card has been removed from the chassis, or that the card or the slot is faulty.

Recommended Action

If you see this fault, take the following actions:

  1. Reseat or reinsert the I/O card.

    Before reinserting this server component, see the server-specific Installation and Service Guide for prerequisites, safety recommendations, and warnings.

  2. If the issue continues, create a tech-support file and contact Cisco TAC.

Fault Details

Severity: critical

Cause: equipment-removed

mibFaultCode: 376

mibFaultName: fltEquipmentIOCardRemoved

moClass: equipment:IOCard

Type: equipment
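
In the entry above, the mibFaultCode value (376) is the fault code (F0376) without the F prefix and leading zero. The following is a minimal sketch of that conversion for scripts that correlate SNMP data with fault codes. The function names are hypothetical, and the four-digit zero padding is an assumption based on the codes listed in this chapter.

```python
def mib_code_from_fault_code(fault_code: str) -> int:
    """Strip the leading 'F' and return the numeric part, e.g. 'F0376' -> 376."""
    if not fault_code.startswith("F") or not fault_code[1:].isdigit():
        raise ValueError(f"unexpected fault code format: {fault_code!r}")
    return int(fault_code[1:])


def fault_code_from_mib_code(mib_code: int, width: int = 4) -> str:
    """Rebuild the fault code, zero-padded to `width` digits (assumed to be 4)."""
    return f"F{mib_code:0{width}d}"


if __name__ == "__main__":
    print(mib_code_from_fault_code("F0376"))
    print(fault_code_from_mib_code(376))
```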

fltEquipmentIOCardThermalProblem

Fault Code

F0379

Description

[sensor_name]: Adaptor Unit [Id] is inoperable due to high temperature : Check Cooling

Explanation

This fault occurs when there is a thermal problem on an I/O card.

The possible contributing factors are as follows:

  • Temperature extremes can cause Cisco UCS equipment to operate at reduced efficiency and cause various problems, including early degradation, failure of chips, and failure of equipment. In addition, extreme temperature fluctuations can cause CPUs to become loose in their sockets.

  • Cisco UCS equipment must operate in an environment that provides an inlet air temperature between 50°F (10°C) and 95°F (35°C).
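
The inlet temperature limits above can be checked with a short script. This is a minimal sketch that uses only the limits stated in this section; the function and constant names are hypothetical, and it assumes that the readings come from your own monitoring source.

```python
# Inlet air temperature limits from the text above (10 C to 35 C).
INLET_MIN_C = 10.0  # 50 F
INLET_MAX_C = 35.0  # 95 F


def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a Fahrenheit reading to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def inlet_in_range(temp_c: float) -> bool:
    """Return True when the inlet air temperature is within the stated limits."""
    return INLET_MIN_C <= temp_c <= INLET_MAX_C


if __name__ == "__main__":
    # Hypothetical sample readings in Fahrenheit.
    for reading_f in (45.0, 72.0, 95.0, 100.0):
        reading_c = fahrenheit_to_celsius(reading_f)
        status = "ok" if inlet_in_range(reading_c) else "out of range"
        print(f"{reading_f:.1f} F = {reading_c:.1f} C: {status}")
```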

Recommended Action

If you see this fault, take the following actions:

  1. Review the product specifications to determine the temperature operating range of the I/O card.

  2. Review the Cisco UCS Site Preparation Guide to ensure that the servers have adequate airflow, including front and back clearance.

  3. Verify that the airflow to the server is not obstructed.

  4. Verify that the site cooling system is operating properly.

  5. Clean the installation site at regular intervals to avoid a buildup of dust and debris, which can cause a system to overheat.

  6. Replace faulty I/O cards.

    Before replacing this component, see the server-specific Installation and Service Guide for prerequisites, safety recommendations and warnings.

  7. If the problem persists, create a tech-support file and contact Cisco TAC.

Fault Details

Severity: major

Cause: thermal-problem

mibFaultCode: 379

mibFaultName: fltEquipmentIOCardThermalProblem

moClass: equipment:IOCard

Type: environmental

fltEquipmentIOCardThermalThresholdCritical

Fault Code

F0730

Description

Adaptor Unit [id] Temperature is critical : Check Cooling

Explanation

This fault indicates that the temperature of an I/O card has exceeded a critical threshold value.

The possible contributing factors are as follows:

  • Temperature extremes can cause Cisco UCS equipment to operate at reduced efficiency and cause various problems, including early degradation, failure of chips, and failure of equipment. In addition, extreme temperature fluctuations can cause CPUs to become loose in their sockets.

  • Cisco UCS equipment must operate in an environment that provides an inlet air temperature between 50°F (10°C) and 95°F (35°C).

  • If sensors on a CPU reach 179.6°F (82°C), the system takes that CPU offline.

Recommended Action

If you see this fault, take the following actions:

  1. Review the product specifications to determine the temperature operating range of the I/O card.

  2. Verify that the site cooling system is operating properly.

  3. Clean the installation site at regular intervals to avoid a buildup of dust and debris, which can cause a system to overheat.

  4. If the problem persists, create a tech-support file and contact Cisco TAC.

Fault Details

Severity: major

Cause: thermal-problem

mibFaultCode: 730

mibFaultName: fltEquipmentIOCardThermalThresholdCritical

moClass: equipment:IOCard

Type: environmental

fltEquipmentIOCardThermalThresholdNonCritical

Fault Code

F0729

Description

Adaptor Unit [Id] Temperature is non critical : Check Cooling

Explanation

This fault indicates that the temperature of an I/O card has exceeded a non-critical threshold value, but is still below the critical threshold.

The possible contributing factors are as follows:

  • Temperature extremes can cause Cisco UCS equipment to operate at reduced efficiency and cause a variety of problems, including early degradation, failure of chips, and failure of equipment. In addition, extreme temperature fluctuations can cause CPUs to become loose in their sockets.

  • Cisco UCS equipment should operate in an environment that provides an inlet air temperature between 50°F (10°C) and 95°F (35°C).

  • If sensors on a CPU reach 179.6°F (82°C), the system takes that CPU offline.

Recommended Action

If you see this fault, take the following actions:

  1. Review the product specifications to determine the temperature operating range of the I/O card.

  2. Verify that the airflow to the server is not obstructed.

  3. Verify that the site cooling system is operating properly.

  4. Clean the installation site at regular intervals to avoid a buildup of dust and debris, which can cause a system to overheat.

  5. If the problem persists, create a tech-support file and contact Cisco TAC.

Fault Details

Severity: minor

Cause: thermal-problem

mibFaultCode: 729

mibFaultName: fltEquipmentIOCardThermalThresholdNonCritical

moClass: equipment:IOCard

Type: environmental

fltEquipmentIOCardThermalThresholdNonRecoverable

Fault Code

F0731

Description

Adaptor Unit [id] Temperature is non recoverable : Check Cooling

Explanation

This fault indicates that the temperature of an I/O card is outside the operating range and has reached a non-recoverable level.

The possible contributing factors are as follows:

  • Temperature extremes can cause Cisco UCS equipment to operate at reduced efficiency and cause various problems, including early degradation, failure of chips, and failure of equipment. In addition, extreme temperature fluctuations can cause CPUs to become loose in their sockets.

  • Cisco UCS equipment must operate in an environment that provides an inlet air temperature between 50°F (10°C) and 95°F (35°C).

  • If sensors on a CPU reach 179.6°F (82°C), the system takes that CPU offline.

Recommended Action

If you see this fault, take the following actions:

  1. Review the product specifications to determine the temperature operating range of the I/O card.

  2. Verify that the airflow to the server is not obstructed.

  3. Verify that the site cooling system is operating properly.

  4. Clean the installation site at regular intervals to avoid a buildup of dust and debris, which can cause a system to overheat.

  5. If the problem persists, create a tech-support file and contact Cisco TAC.

Fault Details

Severity: critical

Cause: thermal-problem

mibFaultCode: 731

mibFaultName: fltEquipmentIOCardThermalThresholdNonRecoverable

moClass: equipment:IOCard

Type: environmental
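
The three thermal threshold faults above differ mainly in severity. The following sketch collects the codes, names, and severities from their Fault Details and picks the most severe fault from a list of active codes. The numeric ranking and the function name are assumptions for illustration only and are not part of the fault definitions.

```python
# Fault code -> (mibFaultName, severity), copied from the Fault Details above.
THRESHOLD_FAULTS = {
    "F0729": ("fltEquipmentIOCardThermalThresholdNonCritical", "minor"),
    "F0730": ("fltEquipmentIOCardThermalThresholdCritical", "major"),
    "F0731": ("fltEquipmentIOCardThermalThresholdNonRecoverable", "critical"),
}

# Assumed ordering, used only for sorting in this sketch.
SEVERITY_RANK = {"minor": 1, "major": 2, "critical": 3}


def most_severe(fault_codes):
    """Return the most severe known threshold fault code, or None if none match."""
    known = [code for code in fault_codes if code in THRESHOLD_FAULTS]
    if not known:
        return None
    return max(known, key=lambda code: SEVERITY_RANK[THRESHOLD_FAULTS[code][1]])


if __name__ == "__main__":
    active = ["F0729", "F0730"]  # hypothetical list of active fault codes
    worst = most_severe(active)
    name, severity = THRESHOLD_FAULTS[worst]
    print(f"{worst} {name}: {severity}")
```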

fltEquipmentSystemIOControllerRemoved

Fault Code

F1744

Description

SIOC1_PRES: IO Module 1 missing: Please reseat or replace IO Module 1

Explanation

This fault indicates that one of the I/O modules is missing.

Recommended Action

If you see this fault, take the following actions:

  1. Reseat or replace the I/O module.

    Before replacing this component, see the server-specific Installation and Service Guide for prerequisites, safety recommendations and warnings.

  2. If the problem persists, create a tech-support file and contact Cisco TAC.

Fault Details

Severity: warning

Cause: equipment-missing

mibFaultCode: 1744

mibFaultName: fltEquipmentSystemIOControllerRemoved

moClass: equipment:IOCard

Type: equipment
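
Each Fault Details block in this chapter is a list of "Key: value" lines, so it can be read into a dictionary for use in scripts. The following is a minimal sketch; the function name is hypothetical, and the sample text is copied from the fltEquipmentIOCardThermalProblem entry.

```python
def parse_fault_details(text: str) -> dict:
    """Parse 'Key: value' lines into a dict, skipping blank or malformed lines."""
    details = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as
        # 'equipment:IOCard' keep their own colon.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        details[key.strip()] = value.strip()
    return details


if __name__ == "__main__":
    sample = """
Severity: major

Cause: thermal-problem

mibFaultCode: 379

mibFaultName: fltEquipmentIOCardThermalProblem

moClass: equipment:IOCard

Type: environmental
"""
    for key, value in parse_fault_details(sample).items():
        print(f"{key} = {value}")
```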