Field Notice: FN - 63542 - PCIe Cards Installed in UCS C220 and C240 M3 May Overheat when System is Idle - Software Update Required

Revised March 28, 2013
August 10, 2012


NOTICE:

THIS FIELD NOTICE IS PROVIDED ON AN "AS IS" BASIS AND DOES NOT IMPLY ANY KIND OF GUARANTEE OR WARRANTY, INCLUDING THE WARRANTY OF MERCHANTABILITY. YOUR USE OF THE INFORMATION ON THE FIELD NOTICE OR MATERIALS LINKED FROM THE FIELD NOTICE IS AT YOUR OWN RISK. CISCO RESERVES THE RIGHT TO CHANGE OR UPDATE THIS FIELD NOTICE AT ANY TIME.

Revision History

Revision  Date         Comment
2.0       28-MAR-2013  Added UCS C220 and references to additional issues fixed in this revision
1.0       10-AUG-2012  Initial Public Release

Products Affected

Products Affected   Comments
UCSC-10PK-C220M3L   Bundle Product ID which contains affected chassis
UCSC-10PK-C220M3S   Bundle Product ID which contains affected chassis
UCSC-C220-M3L
UCSC-C220-M3L-CH
UCSC-C220-M3L=
UCSC-C220-M3S
UCSC-C220-M3S-CH
UCSC-C220-M3S=
UCSC-C240-M3L
UCSC-C240-M3L=
UCSC-C240-M3S
UCSC-C240-M3S2
UCSC-C240-M3S2-CH
UCSC-C240-M3S2=
UCSC-C240-M3S=
UCSC-DBUN-C220-107  Bundle Product ID which contains affected chassis
UCSC-DBUN-C220-108  Bundle Product ID which contains affected chassis
UCSC-DBUN-C220-109  Bundle Product ID which contains affected chassis
UCSC-DBUN-C220-110  Bundle Product ID which contains affected chassis
UCSC-DBUN-C220-111  Bundle Product ID which contains affected chassis
UCSC-DBUN-C220-113  Bundle Product ID which contains affected chassis
UCSC-DBUN-C220-114  Bundle Product ID which contains affected chassis
UCSC-DBUN-C220-115  Bundle Product ID which contains affected chassis
UCSC-DBUN-C220-116  Bundle Product ID which contains affected chassis
UCSC-DBUN-C220-351  Bundle Product ID which contains affected chassis
UCSC-DBUN-C220-352  Bundle Product ID which contains affected chassis
UCSC-DBUN-C220-353  Bundle Product ID which contains affected chassis
UCSC-DBUN-C240-111  Bundle Product ID which contains affected chassis
UCSC-DBUN-C240-112  Bundle Product ID which contains affected chassis
UCSC-DBUN-C240-113  Bundle Product ID which contains affected chassis
UCSC-DBUN-C240-114  Bundle Product ID which contains affected chassis
UCSC-DBUN-C240-331  Bundle Product ID which contains affected chassis
UCSC-DBUN-C240-332  Bundle Product ID which contains affected chassis
UCSC-EZ-C240-108    Bundle Product ID which contains affected chassis
UCSC-EZ-C240-109    Bundle Product ID which contains affected chassis
UCSC-EZ-C240-110    Bundle Product ID which contains affected chassis
UCSC-M3L-C240-CH
UCSC-M3S-C240-CH
UCUCS-EZ-C220M3S    Bundle Product ID which contains affected chassis
UCS-EZ-C220-2650    Bundle Product ID which contains affected chassis
UCS-EZ-C220-2680    Bundle Product ID which contains affected chassis
UCS-EZ-C220P        Bundle Product ID which contains affected chassis
UCS-EZ-CONV-C220E   Bundle Product ID which contains affected chassis
UCS-EZ-CONV-C220V   Bundle Product ID which contains affected chassis
UCS-EZ-MSSHP-C220   Bundle Product ID which contains affected chassis
UCS-SP4-C220M3V     Bundle Product ID which contains affected chassis
UCS-SP5-C220E       Bundle Product ID which contains affected chassis
UCS-SP5-C220V       Bundle Product ID which contains affected chassis
UCS-SP5-MGD-C220V   Bundle Product ID which contains affected chassis
UCS-SP6-C220E       Bundle Product ID which contains affected chassis
UCS-SP6-C220P       Bundle Product ID which contains affected chassis
UCS-SP6-C220V       Bundle Product ID which contains affected chassis
UCS-SP4-C240M3V     Bundle Product ID which contains affected chassis

Problem Description

PCIe cards installed in UCS C220 M3 and UCS C240 M3 systems may overheat when the system is idle. The issue has been observed only with LSI 9266 RAID Controllers installed in UCS C220 M3 or C240 M3 chassis. This issue is caused by a cooling fan algorithm which reduces fan speed below the minimum required when the system is idle.

Background

When the system is idle, fan speed is reduced in order to lower power consumption. Some PCIe cards, such as the LSI 9266, draw high power even when idle and therefore produce more heat than the idle fan speed setting can dissipate. The software fix changes the idle fan speed setting to compensate for heat from higher-powered PCIe cards.

Problem Symptoms

At the time of this Field Notice publication, symptoms noted for the issue have been observed only with the combination of UCS C220 or UCS C240 M3 chassis and LSI 9266 RAID Controller. The following indications have been observed:

  1. The server may hang after several hours without any activity or during a reboot.
  2. The /var/log/messages file contains multiple entries similar to the following:

    kernel: scsi: killing requests for dead queue
    kernel: scsi: killing requests for dead queue

  3. Running the MegaCLI command with the parameters shown below reports high chip temperatures, as in this example:

    ./MegaCli64 -FwTermLog -Dsply -a0 | tail
    07/18/12 20:51:41: Max Temp is 151 Deg C on Channel 2l 4 is 150
    07/18/12 20:51:41: Measured chip temperature at Channel 0 is 146
    07/18/12 20:51:41: Measured chip temperature at Channel 1 is 149
    07/18/12 20:51:41: Measured chip temperature at Channel 2 is 151
    07/18/12 20:51:41: Measured chip temperature at Channel 3 is 146
    07/18/12 20:51:41: Measured chip temperature at Channel 4 is 151
    07/18/12 20:51:46: Max Temp is 151 Deg C on Channel 2
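
The two log symptoms above can be checked with a short script. The following is a minimal sketch in Python, not a Cisco or LSI tool: it assumes the FwTermLog output and the contents of /var/log/messages have already been captured as text. The function names and the 100 degree C alert limit are hypothetical choices for illustration; this Field Notice does not state a temperature limit.

```python
import re

# Matches FwTermLog lines such as:
#   07/18/12 20:51:41: Measured chip temperature at Channel 0 is 146
CHANNEL_TEMP = re.compile(r"Measured chip temperature at Channel (\d+) is (\d+)")

# Hypothetical alert limit in degrees C; NOT a value from the Field Notice.
ASSUMED_LIMIT_C = 100


def channel_temps(log_text):
    """Return {channel: highest temperature seen} from FwTermLog text."""
    temps = {}
    for match in CHANNEL_TEMP.finditer(log_text):
        channel, temp = int(match.group(1)), int(match.group(2))
        temps[channel] = max(temp, temps.get(channel, temp))
    return temps


def hot_channels(log_text, limit_c=ASSUMED_LIMIT_C):
    """Return the sorted channels whose highest reading exceeds limit_c."""
    return sorted(ch for ch, t in channel_temps(log_text).items() if t > limit_c)


def dead_queue_count(messages_text):
    """Count syslog lines that report the 'dead queue' symptom."""
    needle = "killing requests for dead queue"
    return sum(needle in line for line in messages_text.splitlines())


# Sample lines copied from the example output in this notice.
SAMPLE = """\
07/18/12 20:51:41: Measured chip temperature at Channel 0 is 146
07/18/12 20:51:41: Measured chip temperature at Channel 1 is 149
07/18/12 20:51:41: Measured chip temperature at Channel 2 is 151
07/18/12 20:51:41: Measured chip temperature at Channel 3 is 146
07/18/12 20:51:41: Measured chip temperature at Channel 4 is 151
"""

if __name__ == "__main__":
    print(channel_temps(SAMPLE))
    print(hot_channels(SAMPLE))
```

In practice the text would come from saving the MegaCLI output shown above to a file and reading it back. The limit should be replaced with a value appropriate for the installed controller.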

Workaround/Solution

The solution for this issue is to update the active CIMC and BIOS software to the following levels.

For UCS C220 M3 servers (stand-alone or UCSM-managed):

  • CIMC Firmware 1.4(7g) (BIOS 1.4.7c.0) or higher, or
  • CIMC Firmware 1.5(1b) (BIOS 1.5.1c.0) or higher

For UCS C240 M3 servers (stand-alone or UCSM-managed):

  • CIMC Firmware 1.4(5h) (BIOS 1.4.5e.0) or higher, or
  • CIMC Firmware 1.5(1b) (BIOS 1.5.1c.0) or higher
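
When auditing several servers, the minimum CIMC levels above can be encoded in a small check. The sketch below is illustrative only. It assumes CIMC version strings always follow the `major.minor(patchletter)` shape used in this notice, that the letter suffix sorts alphabetically, and that release trains newer than 1.5 carry the fix while trains older than 1.4 do not. None of these assumptions is confirmed by the notice, and the names `MIN_FIXED`, `parse`, and `has_fix` are hypothetical.

```python
import re

# Minimum fixed CIMC levels per release train, copied from the lists above.
MIN_FIXED = {
    "C220": {(1, 4): "1.4(7g)", (1, 5): "1.5(1b)"},
    "C240": {(1, 4): "1.4(5h)", (1, 5): "1.5(1b)"},
}

# Assumed version format: major.minor(patch + optional lowercase letters).
VERSION = re.compile(r"^(\d+)\.(\d+)\((\d+)([a-z]*)\)$")


def parse(version):
    """Turn '1.4(7g)' into the comparable tuple (1, 4, 7, 'g')."""
    match = VERSION.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized CIMC version: {version!r}")
    major, minor, patch, suffix = match.groups()
    return (int(major), int(minor), int(patch), suffix)


def has_fix(model, version):
    """Return True if the CIMC version meets the minimum level for the model."""
    parsed = parse(version)
    train = parsed[:2]
    minimums = MIN_FIXED[model]
    if train in minimums:
        return parsed >= parse(minimums[train])
    # Assumption: trains newer than every listed train include the fix,
    # and trains older than every listed train do not.
    return train > max(minimums)


if __name__ == "__main__":
    for model, version in [("C220", "1.4(6d)"), ("C240", "1.4(5h)")]:
        print(model, version, has_fix(model, version))
```

The check covers only the CIMC firmware string. The matching BIOS level listed beside each CIMC release would need to be verified separately.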

Cisco recommends updating CIMC and BIOS along with other system components using the Host Upgrade Utility (HUU) to ensure that all levels are compatible with each other.

Process for Upgrading Firmware Using Host Upgrade Utility (HUU)

  1. Download the latest HUU from Cisco.com.
  2. Follow the procedure in the HUU Quick Start Guide.

DDTS

To follow the bug ID link below and see detailed bug information, you must be a registered customer and you must be logged in.

DDTS                                    Description
CSCub12154 (registered customers only)  C240: LSI Chip temperature very high when system is idle

How To Identify Firmware Levels

  1. Configure CIMC with an IP address.
  2. From a browser, connect to CIMC.
  3. Log in with the username "admin" and the password "password".
  4. The BIOS and firmware versions are displayed on the server summary page.

For More Information

If you require further assistance, or if you have any further questions regarding this field notice, please contact the Cisco Systems Technical Assistance Center (TAC).

Receive Email Notification For New Field Notices

Cisco Notification Service—Set up a profile to receive email updates about reliability, safety, network security, and end-of-sale issues for the Cisco products you specify.