Cisco Catalyst 8500 Series Campus Switch Routers

Field Notice: Catalyst 8540 Switch Processor Cell Stuck Issues


Updated August 21, 2000
August 15, 2000



Products Affected

Product HW Rev.
C8546MSR-MSP-FCL=   8.2  
C8542CSR-SP=   7.2  

Problem Description

Cell stuck problems have been identified on the switch processors (SPs) of Catalyst 8540 Campus Switch Routers (CSRs) and Multiservice Switch Routers (MSRs). SPs at hardware rev 7.2 (C8542CSR-SP) or 8.2 (C8546MSR-MSP-FCL) that do not have a "1" set in the RMA field are subject to these problems. The resulting symptoms are connectivity related: traffic through internal static VCs can be affected, which in turn can disrupt internal connections between ports.

Background

High fallout in manufacturing testing uncovered timing problems in the SPs for the CSR and MSR. Improvements have been made in two areas. First, all new SP cards (H/W rev 8.0 for CSR and 8.3 for MSR) have gone through improved diagnostic testing that ensures better timing margins are met. Second, IOS releases since 12.0(4a)W(X)5(11) and 12.0(4a)W5(11a) include code that increases the tolerance by decreasing the noise voltage levels.

Problem Symptoms

1) Cell stuck issue. This issue is caused by a marginal timing problem in the SP cards. Code was added to the 12.0(4a)W(X)5(11) release for the CSR and the 12.0(4a)W5(11a) release for the MSR to increase the tolerance by decreasing the noise voltage levels. In addition, manufacturing has implemented a test that tightly screens for this problem before any new or RMA'd card ships.

This screening process took effect on 9/1/1999.

a) How to troubleshoot:

Use the show switch fabric, show epc queue, and show epc status commands.

     router#sh sw fabr
     MMC Switch Fabric (idb=0x60CF1788)

     Key: Rej. Cells - # cells rejected due to lack of resources or policing (16-bit)
     Inv. Cells - # good cells that came in on a non-existent conn.
     Mem Buffs - # cell buffers currently in use
     RX Cells - # rx cells (16-bit)
     TX Cells - # tx cells (16-bit)
     Rx HEC - # cells Received with HEC errors
     Tx PERR - # cells with memory parity errors
     MSC# Rej. Cells Inv. Cells Mem. Buffs RxCells Tx Cells Rx HEC Tx PErr
     ----- ----------- ------------ ----------- ----------- ---------- ---------- ----------
     MSC 0: 0 110018 0 0 0 0 0
     MSC 1: 0 231044 0 0 0 0 0
     MSC 2: 0 234283 0 0 0 0 0
     MSC 3: 0 232492 0 0 0 0 0
     MSC 4: 0 242004 0 0 0 0 0
     MSC 5: 0 120995 345 0 0 0 0 <--- mbufs held on one msc only
     MSC 6: 0 111466 0 0 0 0 0
     MSC 7: 0 334398 0 0 0 0 0

     Switch Fabric Statistics

     Rejected Cells: 0
     Invalid Cells: 1616700
     Memory Buffers: 345
     Rx Cells: 0
     Tx Cells: 0
     RHEC: 0
     TPE: 0

Note: Run the show switch fabric command under the lightest possible traffic conditions, because actual traffic may be using the mbufs.

     router#sh epc q
     INT X-INT VCI QCNT VCI QCNT
     <---- epc queue empty


     router# sh sw fabr
     MMC Switch Fabric (idb=0x60CF1788)

     Key: Rej. Cells - # cells rejected due to lack of resources or policing (16-bit)
     Inv. Cells - # good cells that came in on a non-existent conn.
     Mem Buffs - # cell buffers currently in use
     RX Cells - # rx cells (16-bit)
     TX Cells - # tx cells (16-bit)
     Rx HEC - # cells Received with HEC errors
     Tx PERR - # cells with memory parity errors

     MSC# Rej. Cells Inv. Cells Mem. Buffs RxCells Tx Cells Rx HEC Tx PErr
     ----- ----------- ------------ ----------- ----------- ---------- ---------- ----------
     MSC 0: 0 1932 0 0 0 0 0
     MSC 1: 0 4056 0 0 0 0 0
     MSC 2: 0 4101 0 0 0 0 0
     MSC 3: 0 4082 0 0 0 0 0
     MSC 4: 0 4251 0 0 0 0 0
     MSC 5: 0 2127 345 0 0 0 0 <---- mbufs held steady and only on one msc
     MSC 6: 0 1958 0 0 0 0 0
     MSC 7: 0 5874 0 0 0 0 0

     Switch Fabric Statistics

     Rejected Cells: 0
     Invalid Cells: 28381
     Memory Buffers: 345
     Rx Cells: 0
     Tx Cells: 0
     RHEC: 0
     TPE: 0

Note: Run the show switch fabric command under the lightest possible traffic conditions, because actual traffic may be using the mbufs.

     router#sh epc status
     Status of GigabitEthernet0/0/0: OK
     Status of GigabitEthernet0/0/1: OK
     Status of GigabitEthernet1/0/0: OK
     Status of GigabitEthernet1/0/1: OK
     Status of GigabitEthernet2/0/0: OK <--- all epc status ok
     Status of GigabitEthernet2/0/1: OK
     Status of GigabitEthernet3/0/0: OK
     Status of GigabitEthernet3/0/1: OK
     Status of GigabitEthernet9/0/0: OK
     Status of GigabitEthernet9/0/1: OK
     Status of GigabitEthernet10/0/0: OK
     Status of GigabitEthernet10/0/1: OK
     Status of GigabitEthernet11/0/0: OK
     Status of GigabitEthernet11/0/1: OK
     Status of GigabitEthernet12/0/0: OK
     Status of GigabitEthernet12/0/1: OK
     router#

The show switch fabric command is a clear-on-read command. If the Mem. Buffs value in show switch fabric holds constant on only one MSC chip, show epc status reports OK for all ports, and show epc queue is empty, then this is a cell stuck issue.
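
The decision rule above can be sketched as a small script. This is only an illustration, assuming the per-MSC rows have the layout shown in the captures (the third counter after the MSC number is Mem. Buffs). The function names and the two boolean arguments are hypothetical and are not part of any Cisco tool.

```python
import re

# Per-chip rows look like "MSC 5: 0 120995 345 0 0 0 0";
# the third counter after the MSC number is Mem. Buffs.
MSC_ROW = re.compile(r"^\s*MSC\s+(\d+):\s+(\d+)\s+(\d+)\s+(\d+)\b")


def mem_buffs_by_msc(capture):
    """Map MSC number -> Mem. Buffs for one 'show switch fabric' capture."""
    buffs = {}
    for line in capture.splitlines():
        match = MSC_ROW.match(line)
        if match:
            buffs[int(match.group(1))] = int(match.group(4))
    return buffs


def looks_like_cell_stuck(first, second, all_epc_status_ok, epc_queue_empty):
    """Apply the cell stuck criteria to two captures taken some time apart."""
    before = mem_buffs_by_msc(first)
    after = mem_buffs_by_msc(second)
    # MSCs that hold buffers in either capture.
    holding = {m for m in before.keys() | after.keys()
               if before.get(m, 0) > 0 or after.get(m, 0) > 0}
    if len(holding) != 1:
        return False  # buffers on no MSC, or on more than one MSC
    (msc,) = holding
    held_constant = before.get(msc) == after.get(msc)
    return held_constant and all_epc_status_ok and epc_queue_empty


# A few rows copied from the two captures above.
FIRST = """
MSC 4: 0 242004 0 0 0 0 0
MSC 5: 0 120995 345 0 0 0 0
MSC 6: 0 111466 0 0 0 0 0
"""
SECOND = """
MSC 4: 0 4251 0 0 0 0 0
MSC 5: 0 2127 345 0 0 0 0
MSC 6: 0 1958 0 0 0 0 0
"""

print(looks_like_cell_stuck(FIRST, SECOND, True, True))  # True
```

The epc status and epc queue results are passed in as booleans here so the sketch stays focused on the Mem. Buffs comparison. As noted above, captures taken under heavy traffic may show mbufs in legitimate use, so a positive result is only a hint.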

b) Options for resolution:

  • run 12.0(4a)W(X)5(11) for CSR and 12.0(4a)W5(11a) for MSR or later
  • replace the SP cards
  • run 12.0(4a)W(X)5(11) for CSR and 12.0(4a)W5(11a) for MSR and replace the SP cards






2) Port stuck issue. This issue occurs when an applet is downloaded to a port in the box and one or more cells containing the applet are lost. Code was added to the 12.0(4a)W(X)5(11) CSR software release and the 12.0(4a)W5(11a) MSR software release to verify the integrity of the applet download and to time out an incomplete download, which prevents this problem.

a) How to troubleshoot:

Use the show switch fabric, show epc queue, show epc status, and show controller commands.


     router1#show sw fab
     MMC Switch Fabric (idb=0x60CF1788)

     Key: Rej. Cells - # cells rejected due to lack of resources or policing (16-bit)
     Inv. Cells - # good cells that came in on a non-existent conn.
     Mem Buffs - # cell buffers currently in use
     RX Cells - # rx cells (16-bit)
     TX Cells - # tx cells (16-bit)
     Rx HEC - # cells Received with HEC errors
     Tx PERR - # cells with memory parity errors

     MSC# Rej. Cells Inv. Cells Mem. Buffs Rx Cells Tx Cells Rx HEC Tx PErr
     ----- ----------- ------------ ----------- ----------- ---------- ---------- ----------
     MSC 0: 389023 7896 14177 0 0 0 0
     MSC 1: 0 32709 2070 0 0 0 0
     MSC 2: 0 0 0 0 0 0 0
     MSC 3: 0 0 0 0 0 0 0
     MSC 4: 0 0 0 0 0 0 0
     MSC 5: 0 0 0 0 0 0 0
     MSC 6: 0 6170 1351 0 0 0 0
     MSC 7: 0 9624 1280 0 0 0 0

     Switch Fabric Statistics

     Rejected Cells: 389023
     Invalid Cells: 56399
     Memory Buffers: 18878
     Rx Cells: 0
     Tx Cells: 0
     RHEC: 0
     TPE: 0

     router1#show sw fab
     MMC Switch Fabric (idb=0x60CF1788)

     Key: Rej. Cells - # cells rejected due to lack of resources or policing (16-bit)
     Inv. Cells - # good cells that came in on a non-existent conn.
     Mem Buffs - # cell buffers currently in use
     RX Cells - # rx cells (16-bit)
     TX Cells - # tx cells (16-bit)
     Rx HEC - # cells Received with HEC errors
     Tx PERR - # cells with memory parity errors

     MSC# Rej. Cells Inv. Cells Mem. Buffs Rx Cells Tx Cells Rx HEC Tx PErr
     ----- ----------- ------------ ----------- ----------- ---------- ---------- ----------
     MSC 0: 2189 6 14177 0 0 0 0 <------ Mem bufs held on more than one MSC
     MSC 1: 0 36 2070 0 0 0 0
     MSC 2: 0 0 0 0 0 0 0
     MSC 3: 0 0 0 0 0 0 0
     MSC 4: 0 0 0 0 0 0 0
     MSC 5: 0 0 0 0 0 0 0
     MSC 6: 0 6 1351 0 0 0 0
     MSC 7: 0 10 1280 0 0 0 0

     Switch Fabric Statistics

     Rejected Cells: 2189
     Invalid Cells: 58
     Memory Buffers: 18878
     Rx Cells: 0
     Tx Cells: 0
     RHEC: 0
     TPE: 0

Note: Run the show switch fabric command under the lightest possible traffic conditions, because actual traffic may be using the mbufs.

     router1#show epc q
     INT X-INT VCI QCNT VCI QCNT
     Gi0/0/0 Gi1/0/0 67 640 62 0
     Gi0/0/0 Gi1/0/0 71 546 66 0
     Gi0/0/1 Gi1/0/0 67 135 147 0
     Gi0/0/1 Gi1/0/0 69 18 149 0
     Gi1/0/0 SRP 35 0 342 1791
     Gi1/0/0 Gi0/0/0 62 0 67 640
     Gi1/0/0 Gi0/0/0 66 0 71 546
     Gi1/0/0 Gi0/0/1 147 0 67 135
     Gi1/0/0 Gi0/0/1 149 0 69 18
     Gi1/0/0 Gi1/0/1 152 0 67 639 <----- g 1/0/0 always involved
     Gi1/0/0 Gi12/0/0 577 0 67 640
     Gi1/0/0 Gi12/0/0 578 0 68 16
     Gi1/0/0 Gi12/0/0 579 0 69 38
     Gi1/0/0 Gi12/0/0 580 0 70 16
     Gi1/0/0 Gi12/0/1 662 0 67 640
     Gi1/0/0 Gi12/0/1 666 0 71 640
     Gi1/0/1 Gi1/0/0 67 639 152 0
     Gi12/0/0 Gi1/0/0 67 640 577 0
     Gi12/0/0 Gi1/0/0 68 16 578 0
     Gi12/0/0 Gi1/0/0 69 38 579 0
     Gi12/0/0 Gi1/0/0 70 16 580 0
     Gi12/0/1 Gi1/0/0 67 640 662 0
     Gi12/0/1 Gi1/0/0 71 640 666 0

     router1#sh epc stat
     Status of GigabitEthernet0/0/0: OK
     Status of GigabitEthernet0/0/1: OK
     Status of GigabitEthernet1/0/0: not OK <------ g 1/0/0 in trouble
     Status of GigabitEthernet1/0/1: OK
     Status of GigabitEthernet2/0/0: OK
     Status of GigabitEthernet2/0/1: OK
     Status of GigabitEthernet12/0/0: OK
     Status of GigabitEthernet12/0/1: OK
     router1#

The show switch fabric command is a clear-on-read command. If the Mem. Buffs value in show switch fabric holds constant and is spread across more than one MSC chip, and show epc status reports one port as not OK, then this is a port stuck issue.
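
To narrow down which port is stuck, the show epc status and show epc queue captures can be scanned together. The following is a minimal sketch, assuming the line layouts shown in the captures above. The function names are hypothetical, and the idea of intersecting the interface pairs from every queue row is just one way to express "always involved".

```python
import re

# Matches lines such as "Status of GigabitEthernet1/0/0: not OK".
STATUS_ROW = re.compile(r"Status of (\S+):\s*(not OK|OK)")


def ports_not_ok(epc_status_output):
    """Return interfaces that 'show epc status' reports as not OK."""
    bad = []
    for line in epc_status_output.splitlines():
        match = STATUS_ROW.search(line)
        if match and match.group(2) == "not OK":
            bad.append(match.group(1))
    return bad


def ports_in_every_queue_row(epc_queue_output):
    """Return interface names that appear in every 'show epc queue' data row."""
    common = None
    for line in epc_queue_output.splitlines():
        fields = line.split("<-")[0].split()  # drop any "<---" annotation
        # Data rows: INT, X-INT, then VCI, QCNT, VCI, QCNT as integers.
        if len(fields) != 6 or not all(f.isdigit() for f in fields[2:]):
            continue
        pair = set(fields[:2])
        common = pair if common is None else common & pair
    return common or set()


# A few lines copied from the captures above.
STATUS = """Status of GigabitEthernet0/0/1: OK
Status of GigabitEthernet1/0/0: not OK <------ g 1/0/0 in trouble
Status of GigabitEthernet1/0/1: OK"""

QUEUE = """INT X-INT VCI QCNT VCI QCNT
Gi0/0/0 Gi1/0/0 67 640 62 0
Gi1/0/0 SRP 35 0 342 1791
Gi1/0/0 Gi1/0/1 152 0 67 639 <----- g 1/0/0 always involved
Gi12/0/1 Gi1/0/0 71 640 666 0"""

print(ports_not_ok(STATUS))             # ['GigabitEthernet1/0/0']
print(ports_in_every_queue_row(QUEUE))  # {'Gi1/0/0'}
```

When the same port shows up in both results (allowing for the abbreviated Gi name in the queue output), it is a reasonable candidate for the stuck port. An empty queue capture yields an empty set.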

b) Options for resolution:

  • upgrade to CSR 12.0(4a)W(X)5(11) or later to prevent a reoccurrence
  • upgrade to MSR 12.0(4a)W5(11a) or later to prevent a reoccurrence
  • reload the box in question to clear up the problem

Workaround/Solution

Workaround
  • download 12.0(4a)WX5(11a) or later on C8540CSRs to prevent a reoccurrence
  • download 12.0(4a)W5(11a) or later on C8540MSRs to prevent a reoccurrence
  • reload the box in question to clear this occurrence of the problem

This is only a temporary workaround.

Solution

Replace the Switch Processors via the upgrade form contained in the next section of this Field Notice. Replacements should only be requested for hardware rev 7.2 for the C8542CSR-SP and 8.2 for the C8546MSR-MSP-FCL without a "1" set in the RMA field.

Replacements for MSR and CSR SPs may not be at the latest revision (8.0 for CSR and 8.3 for MSR). This is not a problem, because down-rev products have been repaired and tested for these problems. Cards that have been repaired and tested have a 1 set in the RMA field in the output of the show hardware command.

Router#show hardware

C8540 named Router, Date: 10:17:00 UTC Tue Nov 23 1999

Slot Ctrlr-Type    Part No.    Rev  Ser No    Mfg Date   RMA No.  Hw Vrs  Tst  EEP
---- ------------  ----------  ---  --------  ---------  -------  ------  ---  ---
1/*  GIGETHERNET   73-3366-02  B0   0313184I  Jan 00 00  0        2.1
4/*  Route Proc    73-3775-01  B0   03020EST  Jan 00 00  0        5.2
5/*  Switch Card   73-3327-07  B0   03030J6L  Jan 00 00  1        7.2
7/*  Switch Card   73-3327-07  B0   03030KEU  Jan 00 00  1        7.2
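
The replacement criteria can be written down as a short check. This is a sketch only, assuming the caller already knows the product ID of the card, since the show hardware output above lists the controller type and part number rather than the orderable product ID. The function and table names are hypothetical.

```python
# Hardware revisions named in this notice as affected, keyed by product ID.
AFFECTED_HW_REV = {
    "C8542CSR-SP=": "7.2",
    "C8546MSR-MSP-FCL=": "8.2",
}


def qualifies_for_replacement(product_id, hw_vrs, rma_field):
    """True if the card is at an affected revision and lacks a 1 in the RMA field."""
    affected = AFFECTED_HW_REV.get(product_id) == hw_vrs
    already_screened = rma_field.strip() == "1"
    return affected and not already_screened


# The switch cards in slots 5 and 7 above show Hw Vrs 7.2 with RMA No. 1,
# so a CSR SP with those values would not qualify.
print(qualifies_for_replacement("C8542CSR-SP=", "7.2", "1"))  # False
print(qualifies_for_replacement("C8542CSR-SP=", "7.2", "0"))  # True
```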
 

Upgrade Program

Catalyst 8540 SP Replacement

If you have any questions concerning the upgrade program, you can e-mail upgrades-info@cisco.com.

Note: Fields marked with an asterisk (*) are required fields.

Company Information
*Organization
*Address
Address line 2
*City
*State/Province
*ZIP/Postal Code
Country

Product
Product              *Quantity
C8542CSR-SP=
C8546MSR-MSP-FCL=

Contact Information
*First Name
*Last Name
*Phone (1)  Ext.
Fax (1)  Ext.
*Internet E-Mail
Please use the following format: user@domain.com
*Upgrade Order Reference Number
Please provide a number that you can use when inquiring about order status
*Serial Numbers of Affected Units (2)

(1) For phone and fax, include 011 and the country code outside North America.

(2) When entering more than one serial number in the serial number text entry field, be sure to type a comma between each serial number to separate them.

For More Information

If you require further assistance, or if you have any further questions regarding this field notice, please contact the Cisco Systems Technical Assistance Center (TAC) at (800) 553-24HR or (408) 526-7209, or send e-mail to tac@cisco.com.