November 4, 2009
NOTICE:
THIS FIELD NOTICE IS PROVIDED ON AN "AS IS" BASIS AND DOES NOT IMPLY ANY KIND OF GUARANTEE OR WARRANTY, INCLUDING THE WARRANTY OF MERCHANTABILITY. YOUR USE OF THE INFORMATION ON THE FIELD NOTICE OR MATERIALS LINKED FROM THE FIELD NOTICE IS AT YOUR OWN RISK. CISCO RESERVES THE RIGHT TO CHANGE OR UPDATE THIS FIELD NOTICE AT ANY TIME.
Revision History
- Revision 1.0, 04-NOV-2009: Initial Public Release
Products Affected
- MCS-7835-I2-xxx: 7835-I2 Bare Metal, Appliance Servers, and IBM x3650 Customer-provided servers.
- MCS-7845-I2-xxx: 7845-I2 Bare Metal, Appliance Servers, and IBM x3650 Customer-provided servers.
Problem Description
To be exposed to this issue, a server in the Products Affected section above must be running one of the affected application versions listed in the table found in the Workaround/Solution section.
Certain IBM RAID driver versions can cause instability in the RAID environment and lead to hard drives being marked as Read-Only. The affected RAID drivers are contained in certain Application versions listed in the table in the Workaround/Solution section. Since Write access to the RAID array is required with Unified Communications Applications, this problem can prevent critical files from being written to the array and eventually can cause a service outage.
For a list of affected and fixed Application versions, please see the table in the Workaround/Solution section.
Either the Root or Common partition can become Read-Only.
Background
In this Field Notice, the term "Appliance Server" refers to a turnkey software appliance, which is a server purchased from Cisco that has software pre-installed before shipping.
Problem Symptoms
Affected servers using one of the affected versions may suddenly experience a loss of service.
There are two ways to determine if a system is affected:
- CLI/Console Commands
- Examine log files
To determine if the Root partition is affected, do the following from a CLI or console session to the server:
- Type the command "utils iothrottle enable" without quotes and press Enter.
- The server should return a message which reads "I/O throttling has been enabled".
- Type the command "utils iothrottle disable" without quotes and press Enter.
- The server should return a message which reads "I/O throttling has been disabled".
If the server does not return both messages, the Root partition is in Read-Only mode.
To determine if the Common partition is affected, do the following from a CLI or console session to the server:
- Enter the command "file list activelog syslog/* detail" without quotes and press Enter.
- The server will return a listing that includes the size of the file "CiscoSyslog", e.g.: 589,219 CiscoSyslog
- Two minutes later, enter the same command again.
- Compare the two sizes returned for "CiscoSyslog". If the size is unchanged, the Common partition is in Read-Only mode.
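The size comparison above can be sketched in Python for anyone who captures the two listings as text. This is only an illustrative sketch: the function names are hypothetical, and it assumes the listing shows the size immediately before the file name, as in the "589,219 CiscoSyslog" example. The full output format of the command is not described in this notice.

```python
import re
from typing import Optional


def ciscosyslog_size(listing: str, filename: str = "CiscoSyslog") -> Optional[int]:
    """Return the size shown for `filename` in a captured listing, or None.

    Assumption: the size (with optional thousands separators) is the token
    directly before the file name at the end of a line.
    """
    pattern = re.compile(r"(\d[\d,]*)\s+" + re.escape(filename) + r"\s*$")
    for line in listing.splitlines():
        match = pattern.search(line)
        if match:
            return int(match.group(1).replace(",", ""))
    return None


def common_partition_looks_read_only(first_listing: str, second_listing: str) -> bool:
    """Compare two listings taken about two minutes apart.

    An unchanged CiscoSyslog size is the symptom described in the notice.
    """
    before = ciscosyslog_size(first_listing)
    after = ciscosyslog_size(second_listing)
    if before is None or after is None:
        raise ValueError("CiscoSyslog was not found in one of the listings")
    return before == after
```

For example, passing the same "589,219 CiscoSyslog" line as both listings returns True, which corresponds to the Read-Only symptom.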
Any of the following Log messages may be visible on the server:
From the RTMT-System Logs or "messages" file on the local server:
kern 2 kernel: EXT3-fs error (device sda2): ext3_journal_start_sb: Detected aborted journal
kern 2 kernel: Remounting filesystem read-only
From the RTMT-Application Logs or "CiscoSyslog" file:
SyslogSeverityMatchFound events generated: SeverityMatch - Critical kernel: EXT3-fs error (device sdb1) in start_transaction: Journal has aborted
-or-
SyslogSeverityMatchFound Detail:SyslogSeverityMatchFound events generated: SeverityMatch - Critical kernel: EXT3-fs error (device sdb1) in start_transaction: Journal has aborted
From the Console:
EXT3-fs error (device sda6) in start_transaction: Journal has aborted
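If log files have been exported from the server, the messages listed above can be searched for with a short script. The following Python sketch is not a Cisco tool; the function name and regular expression are assumptions built only from the sample messages in this notice, and real log lines may carry additional prefixes.

```python
import re
from typing import List, Tuple

# Matches the journal-abort messages quoted in this notice and captures
# the device name (for example sda2, sdb1, sda6).
JOURNAL_ABORT = re.compile(
    r"EXT3-fs error \(device (?P<device>\w+)\)[^\n]*?"
    r"(?:Detected aborted journal|Journal has aborted)"
)
REMOUNT_READ_ONLY = "Remounting filesystem read-only"


def scan_log(text: str) -> Tuple[List[str], bool]:
    """Return (sorted affected devices, whether a read-only remount was logged)."""
    devices = set()
    remounted = False
    for line in text.splitlines():
        match = JOURNAL_ABORT.search(line)
        if match:
            devices.add(match.group("device"))
        if REMOUNT_READ_ONLY in line:
            remounted = True
    return sorted(devices), remounted
```

Running scan_log over the three sample excerpts above would report the devices sda2, sda6, and sdb1, and flag the read-only remount.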
Workaround/Solution
Do not request any Hardware Replacements (RMA) to resolve this issue. This issue is recoverable in the field via software upgrades.
Workaround & Recovery
A temporary workaround is to use the Unified Communications Manager Recovery CD to restore write access to the file system. The Unified Communications Manager Recovery CD can be used on any server and application experiencing this issue. This process can be used to recover both Root and Common partition file systems that are affected. The latest Unified Communications Manager Recovery CD is available here:
Unified Communications Manager 7.1(3a) Recovery CD (registered customers only)
Use the following steps to recover write access to the file system:
- Boot the system using the recovery disk.
- From the recovery CD menu select option 'f' to run file system check.
- When completed select option 'q' to quit recovery CD.
- Eject the CD when prompted and reboot the system.
Solution
The permanent solution is to migrate to a fixed version either by upgrading or performing a fresh install. The exact action required depends on which partition is affected.
Use the commands shown in the Problem Symptoms section to determine which partition is affected. Do not assume that the server filesystem is healthy if service has not yet been affected.
If neither the Root nor Common partition is affected, then:
- Perform a backup of the existing version, following the instructions for that application.
- Upgrade to a fixed version following the normal upgrade procedures for that application.
- Perform a backup of the upgraded version.
If only the Root partition is affected, then:
- Use the Recovery CD to restore the file system, following the instructions above from the Workaround section.
- Perform a backup of the existing version, following the instructions for that application.
- Upgrade to a fixed version following the normal upgrade procedures for that application.
- Perform a backup of the upgraded version.
If the Common partition is affected, and a valid backup exists, then:
- Perform a Fresh install of the current/affected version.
- Restore the system from the backup data.
- Upgrade to a fixed version.
- Perform a backup of the upgraded version.
Please complete the restore and upgrade steps as soon as possible after performing the Fresh Install, since the system remains exposed to the issue until both steps are completed.
If the Common partition is affected, and a valid backup does not exist, then perform a Fresh Install of a fixed version.
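The four cases above can be summarized as a small decision function. This Python sketch is only an aid for reading the procedure, not a replacement for it; the function name and step wording are illustrative. It assumes that a system with both partitions affected follows the Common partition path, since the notice restricts the Root-only case with the word "only".

```python
from typing import List

BACKUP_EXISTING = "Perform a backup of the existing version"
UPGRADE = "Upgrade to a fixed version"
BACKUP_UPGRADED = "Perform a backup of the upgraded version"


def remediation_steps(root_affected: bool, common_affected: bool,
                      valid_backup: bool) -> List[str]:
    """Return the ordered steps for the case described in the notice."""
    if common_affected:
        # Assumption: the Common partition cases also cover systems where
        # the Root partition is affected as well.
        if valid_backup:
            return [
                "Perform a fresh install of the current/affected version",
                "Restore the system from the backup data",
                UPGRADE,
                BACKUP_UPGRADED,
            ]
        return ["Perform a fresh install of a fixed version"]

    steps = []
    if root_affected:
        steps.append("Use the Recovery CD to restore the file system")
    steps += [BACKUP_EXISTING, UPGRADE, BACKUP_UPGRADED]
    return steps
```

For example, remediation_steps(True, False, True) returns the Recovery CD step followed by backup, upgrade, and a second backup.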
See the following table showing affected and fixed versions:
- Cisco Unified Communications Manager. Affected Version: 7.x. Fixed Version: 7.0(2a)SU2 and later; 7.1(2b) and later. Availability of Solution: All available on Software Download site.
- Cisco Unity Connection. Affected Version: 7.x. Fixed Version: 7.0(2a)SU2 and later; 7.1(2b) and later. Availability of Solution: All available on Software Download site.
- Cisco Unified Presence. Affected Version: 7.0(x). Fixed Version: 7.0(5) and later. Availability of Solution: Available on Software Download site.
- Cisco Emergency Responder. Affected Version: 7.0(3a). Fixed Version: 7.1(1) and later. Availability of Solution: Available on Software Download site.
- Cisco Unified Mobility Advantage. Affected Version: 7.1(x). Fixed Version: 7.1(3) and later. Availability of Solution: Available on Software Download site.
- Cisco Unified Mobility (also known as MobilityManager). Bug ID: Not Affected. Affected Version: Not Affected. Fixed Version: Not Affected. Availability of Solution: Not Affected.
- Cisco MeetingPlace Express. Bug ID: Not Affected. Affected Version: Not Affected. Fixed Version: Not Affected. Availability of Solution: Not Affected.
For More Information
If you require further assistance, or if you have any further questions regarding this field notice, please contact the Cisco Systems Technical Assistance Center (TAC).
Receive Email Notification For New Field Notices
Cisco Notification Service—Set up a profile to receive email updates about reliability, safety, network security, and end-of-sale issues for the Cisco products you specify.