
Cisco ONS 15500 Series

15540-CPU Switchover Problem


Updated April 29, 2003

April 25, 2003


Products Affected

Product: 15540-CPU=

Top Assembly Part Number: 800-15708-05

Top Assembly Rev.: A0

Comments:

Top Assembly Number (TAN) 800-15708-06 and above are not affected by this issue.

Printed Circuit Assembly (PCA) Version 73-5621-08 and above are not affected by this issue.
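The TAN threshold above can be checked mechanically when auditing an inventory list. The following is a minimal sketch, not a Cisco tool: the function names are hypothetical, and it assumes the last hyphen-separated field of the TAN is a numeric version that orders numerically. It only reports whether a TAN is at or above 800-15708-06. A result of False means the card is not cleared by this notice and should be checked further, not that it is confirmed affected.

```python
from typing import Tuple

# First Top Assembly Number that the notice lists as not affected.
FIRST_UNAFFECTED_TAN = "800-15708-06"


def split_part_number(part_number: str) -> Tuple[str, int]:
    """Split '800-15708-05' into the base '800-15708' and the version 5.

    Assumption: the final hyphen-separated field is a decimal version.
    """
    base, _, version = part_number.strip().rpartition("-")
    if not base or not version.isdigit():
        raise ValueError(f"unexpected part number format: {part_number!r}")
    return base, int(version)


def tan_is_cleared(tan: str) -> bool:
    """Return True if the TAN is at or above the first unaffected version."""
    base, version = split_part_number(tan)
    fixed_base, fixed_version = split_part_number(FIRST_UNAFFECTED_TAN)
    if base != fixed_base:
        raise ValueError(f"{tan!r} is not a 15540-CPU top assembly number")
    return version >= fixed_version


if __name__ == "__main__":
    for tan in ("800-15708-05", "800-15708-06"):
        print(tan, "cleared" if tan_is_cleared(tan) else "check further")
```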

Problem Description

In extremely rare situations, errata in the system controller used on the CPU card may cause the system to:

  • Crash due to Memory ECC errors

  • Show a bus error exception

  • Corrupt data

In these situations, the CPU may become unresponsive and will not respond to an NMI request.

Background

This was observed while performing software upgrades to the 15540 ESP.

Problem Symptoms

Following a switchover after a software upgrade, the Active/Standby CPU may become unresponsive and require a manual reboot.

Workaround/Solution

If you are currently running the affected Functional Image version of the 15540-CPU in your 15540 System, please follow the upgrade procedure below to upgrade the Functional Image on the 15540-CPU to version 1.25 or later.

ONS15540 Image Upgrade Procedure for the Redundant Processors

  1. Download latest Cisco IOS® Software, ROMMON and CPU FPGA images onto the Active and Standby Bootflash. As of April, 2003, the latest versions are Cisco IOS Software Release 12.1(12c)EV1, ROMMON MANOPT_RM.srec.121-10r.EV1, and CPU FPGA fi-ons15540-ph0cpu.A.1-25.exo. These images are available from the Software Center (registered customers only) on Cisco.com.

  2. Verify that the system is configured to auto-boot. If it is not, configure it using the following commands:

    15540# configure terminal 
    man(config)# config-register 0x2102
    
  3. Configure the boot system command to use the new image as the boot image and remove the old boot system command configuration.

    15540# configure terminal
    man(config)# boot system flash bootflash:ons15540-i-mz.121-12c.EV1.bin
    
  4. Save the current configuration by issuing the write memory command on the Active console.

  5. Reload the standby CPU from the Active CPU, which will bring up the standby CPU with the new Cisco IOS software image.

    15540# redundancy reload peer
    
  6. Verify that the standby CPU came back up properly by issuing the show redundancy command from the Active CPU console.

    Note: You might see CPU_REDUN-3-DRIVER_MISSING messages on the Active CPU console while the standby CPU is reloading. These messages are expected while the Active CPU is still running the old image and can be ignored.

  7. Reload the Active CPU by issuing the following command from the Active CPU console:

    15540# redundancy switch-activity
    
  8. Verify that both the Active and standby CPUs came back up properly by issuing the show redundancy command from the Active CPU console. Also verify that both CPUs are now running the same release of Cisco IOS software.

  9. Go to redundancy config mode and enable the standby privilege-mode by issuing the following command:

    15540# configure terminal
    man(config)# redundancy 
    man(config-red)# standby privilege-mode enable
    
  10. From the console port, reprogram the Active CPU FPGA first using the following steps (this cannot be done remotely):

    To determine which CPU is active issue the show redundancy command.

    If the Active CPU is in slot 6 then,

    15540# reprogram bootflash:fi-ons15540-ph0cpu.A.1-25.exo 6
    

    Note: Disregard the message advising you to upgrade the standby CPU first. If you upgrade the standby CPU first, the reprogram may fail.

    If the Active CPU is in slot 7 then,

    15540# reprogram bootflash:fi-ons15540-ph0cpu.A.1-25.exo 7
    

    You should see the Controller successfully Reprogrammed message on the console when the reprogram completes. Ignore the message requesting a power cycle of the chassis or CPU, as this is done in step 14 below.

  11. Now reprogram the Active CPU ROMMON image using the following command:

    15540# reprogram bootflash:MANOPT_RM.srec.121-10r.EV1 rommon
    

    After the reprogram, the Active CPU reloads, which triggers a CPU switchover, and the peer CPU takes over as the new Active CPU.

  12. Now reprogram the new Active CPU FPGA using the following steps:

    To determine which CPU is active issue the show redundancy command.

    If the Active CPU is in slot 6 then,

    15540# reprogram bootflash:fi-ons15540-ph0cpu.A.1-25.exo 6
    

    If the Active CPU is in slot 7 then,

    15540# reprogram bootflash:fi-ons15540-ph0cpu.A.1-25.exo 7
    

    You should see the Controller successfully Reprogrammed message on the console when the reprogram completes. Ignore the message requesting a power cycle of the chassis or CPU, as this is done in step 14 below.

  13. Now reprogram the new Active CPU ROMMON image using the following command from the Active CPU console:

    15540# reprogram bootflash:MANOPT_RM.srec.121-10r.EV1 rommon
    

    After the reprogram, the Active CPU reloads, which triggers a CPU switchover, and the peer CPU takes over as the new Active CPU.

  14. To make the new CPU FPGA image take effect, follow one of the following procedures:

    1. Power cycle the chassis, which automatically loads the new CPU FPGA and ROMMON images.

      or

    2. If you do not want to power cycle the chassis, follow this alternative procedure:

      1. Remove and reinsert the standby CPU first. Wait two seconds between removal and insertion.

        The standby CPU automatically reboots with the new Cisco IOS software image. Wait for the standby CPU to come up fully and make sure it reaches the Standby HOT state.

      2. Now remove and reinsert the Active CPU. Wait two seconds between removal and insertion. This initiates a CPU switchover, and the new standby CPU automatically reboots with the new Cisco IOS software image. Wait for the standby CPU to come up fully, then use the show redundancy command to make sure the Active and standby CPUs are both in the normal state.

For additional information on updating functional images, please consult "Managing Your Cisco ONS 15540 (ESP/ESPx) System" in the user documentation.
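The order of the reprogram commands in steps 10 through 13 depends on which slot holds the Active CPU. The sketch below generates that command sequence for a given starting slot. It is an illustration only: the function name reprogram_plan is hypothetical, it assumes the CPU cards are in slots 6 and 7 as in the examples above, and it assumes each ROMMON reprogram ends in a switchover to the peer as the procedure describes. It does not talk to the system, and it does not replace confirming the Active slot with show redundancy before each FPGA reprogram.

```python
from typing import List

# Image names as given in step 1 of the procedure.
FPGA_IMAGE = "bootflash:fi-ons15540-ph0cpu.A.1-25.exo"
ROMMON_IMAGE = "bootflash:MANOPT_RM.srec.121-10r.EV1"

# Assumption: the redundant CPU cards sit in slots 6 and 7.
CPU_SLOTS = (6, 7)


def reprogram_plan(active_slot: int) -> List[str]:
    """Return the reprogram commands for steps 10-13, in order.

    The current Active CPU is reprogrammed first (FPGA, then ROMMON).
    The ROMMON reprogram reloads it, the peer becomes Active, and the
    same two commands are repeated for the peer's slot.
    """
    if active_slot not in CPU_SLOTS:
        raise ValueError(f"active slot must be one of {CPU_SLOTS}")
    peer_slot = CPU_SLOTS[1] if active_slot == CPU_SLOTS[0] else CPU_SLOTS[0]

    plan = []
    for slot in (active_slot, peer_slot):
        plan.append(f"reprogram {FPGA_IMAGE} {slot}")
        plan.append(f"reprogram {ROMMON_IMAGE} rommon")
    return plan


if __name__ == "__main__":
    for step, command in enumerate(reprogram_plan(active_slot=6), start=10):
        print(f"step {step}: {command}")
```

Generating the list up front makes it easy to review the slot numbers before typing anything on the console, which matters because the FPGA reprogram must target the Active CPU first.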

How To Identify Hardware Levels

To determine the functional image version of the CPU, use either the show hardware detail or show hardware linecard slot# command in privileged EXEC mode.

The following example shows the functional image information for the CPU card in slot 6:

15540# show hardware linecard 6 
-------------------------------------------------- 
Slot Number : 6/* 
Controller Type : 0x1000 
On-Board Description : Queens_CPU_PHASE_0 
Orderable Product Number: 15540-CPU= 
Board Part Number : 73-5621-06 
Board Revision : 08 
Serial Number : CAB0531JX96 
Manufacturing Date : 10/03/2001 
Hardware Version : 6.2 
RMA Number : 0 
RMA Failure Code : 0 
Functional Image Version: 1.24 
Function-ID : 0
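When checking many systems, the Functional Image Version field can be pulled out of captured command output and compared against 1.25, the first fixed version named in the workaround. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the field is assumed to appear exactly as in the example above, and the version is assumed to be major.minor with both parts compared as integers.

```python
import re
from typing import Optional, Tuple

# First functional image version that contains the fix, per the workaround.
FIXED_VERSION = (1, 25)

# Trimmed copy of the example output above, used only for demonstration.
SAMPLE_OUTPUT = """\
Slot Number : 6/*
Orderable Product Number: 15540-CPU=
Hardware Version : 6.2
Functional Image Version: 1.24
Function-ID : 0
"""


def functional_image_version(show_output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from 'Functional Image Version: X.Y'."""
    match = re.search(r"Functional Image Version\s*:\s*(\d+)\.(\d+)", show_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def needs_functional_image_upgrade(show_output: str) -> bool:
    """Return True if the reported version is older than 1.25."""
    version = functional_image_version(show_output)
    if version is None:
        raise ValueError("no Functional Image Version field found")
    return version < FIXED_VERSION


if __name__ == "__main__":
    print(functional_image_version(SAMPLE_OUTPUT))
    print(needs_functional_image_upgrade(SAMPLE_OUTPUT))
```

Comparing integer tuples instead of strings or floats avoids misordering versions such as 1.9 and 1.25.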

For More Information

If you require further assistance, or if you have any further questions regarding this field notice, please contact the Cisco Systems Technical Assistance Center (TAC).

Receive Email Notification For New Field Notices

Product Alert Tool - Set up a profile to receive email updates about reliability, safety, network security, and end-of-sale issues for the Cisco products you specify.