Cisco Catalyst 6500 Series Switches

Troubleshooting Hardware and Common Issues on Catalyst 6500/6000 Series Switches Running Cisco IOS System Software

Document ID: 24053

Updated: May 24, 2012

Introduction

This document describes troubleshooting hardware and related common issues on Catalyst 6500/6000 switches that run Cisco IOS® system software. Cisco IOS Software refers to the single bundled Cisco IOS image for both the Supervisor Engine and Multilayer Switch Feature Card (MSFC) module. This document assumes that you have a problem symptom and that you want to get additional information about it or want to resolve it. This document is applicable to Supervisor Engine 1-, 2-, or 720-based Catalyst 6500/6000 switches.

Refer to the Naming Convention for CatOS and Cisco IOS Software Images section of the document System Software Conversion from CatOS to Cisco IOS for Catalyst 6500/6000 Switches in order to understand the naming convention of software images.

Refer to these documents in order to troubleshoot a system that runs Catalyst OS (CatOS) on the Supervisor Engine and Cisco IOS Software on the MSFC:

Prerequisites

Requirements

There are no specific requirements for this document.

Components Used

This document is not restricted to specific software and hardware versions.

Conventions

Refer to Cisco Technical Tips Conventions for more information on document conventions.

Troubleshoot Error Messages in the Syslog or Console

The system messages are printed on the console if console logging is enabled, or in the syslog if syslog is enabled. Some of the messages are for informational purposes only and do not indicate an error condition. For an overview of the system error messages, refer to System Messages Overview.

Enable the appropriate level of logging, and configure the switch to log the messages to a syslog server. For further configuration information, refer to the Step-by-Step Instructions to Configure IOS Devices section of the document Resource Manager Essentials and Syslog Analysis: How-To.
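
This is a minimal configuration sketch of logging to a syslog server; the server address 192.168.1.100 and the trap level are examples only and must be adapted to your environment:

Switch#configure terminal
Switch(config)#logging 192.168.1.100

!--- The syslog server address shown here is only an example.

Switch(config)#logging trap informational
Switch(config)#end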

In order to monitor the logged messages, issue the show logging command. Or, monitor the messages periodically from a network management station such as CiscoWorks or HP OpenView.

In order to better understand a specific system message, refer to Messages and Recovery Procedures (Catalyst 6500/6000 Cisco IOS system software).

If you are still unable to determine the problem, or if the error message is not present in the documentation, contact the Cisco Technical Support escalation center.

The error message %CONST_DIAG-SP-4-ERROR_COUNTER_WARNING: Module 4 Error counter exceeds threshold appears on the console of the Catalyst 6500. This issue can have two causes:

  • A poor connection to the backplane (a bent connector pin or a poor electrical connection)

  • An early indication of a failing module

In order to resolve this issue, set the diagnostic bootup level to complete, and then firmly reseat module 4 in the chassis. This catches any latent hardware failure and also resolves any backplane connection issue.
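
For example, this minimal sequence sets the diagnostic bootup level to complete; the module is then physically reseated so that the full diagnostics run when it comes back online:

Switch#configure terminal
Switch(config)#diagnostic bootup level complete
Switch(config)#end

!--- Physically reseat module 4 so that complete diagnostics
!--- run when the module comes back online.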

The show diagnostic sanity Command

The show diagnostic sanity command runs a set of predetermined checks on the configuration, along with a combination of certain system states, and then compiles a list of warning conditions. The checks are designed to look for anything that seems out of place and are intended to serve as an aid for troubleshooting and for maintenance of system sanity. The command does not modify any existing variables or system states. It reads the system variables that correspond to the configuration and the states in order to raise warnings if there is a match to a set of predetermined combinations. The command does not impact switch functionality, and you can use it in a production network environment. The only limitation during the run is that the command reserves the file system for a finite time while it accesses the boot images and tests their validity. The command is supported in Cisco IOS Software Release 12.2(18)SXE1 or later.

The command checks the configuration for settings that appear valid but that can have negative implications, and warns the user in these cases:

  • Trunking—The trunk mode is "on", or the port is trunking in "auto" mode; a trunk port has its mode set to desirable but is not trunking; or the trunk port negotiates to half duplex.

  • Channeling—The channeling mode is "on", or a port is not channeling and the mode is set to desirable.

  • Spanning Tree—One of these is set to default:

    • root max age

    • root forward delay

    • max age

    • max forward delay

    • hello time

    • port cost

    • port priority

    Or, the spanning tree root is not set for a VLAN.

  • UDLD—The port has UniDirectional Link Detection (UDLD) disabled, shut down, or in an undetermined state.

  • Flow control and PortFast—The port has receive flow control disabled or has PortFast enabled.

  • High Availability—A redundant Supervisor Engine is present, but high availability (HA) is disabled.

  • Boot string and configuration register—The boot string is empty or specifies an invalid file as the boot image, or the configuration register is set to anything other than 0x2, 0x102, or 0x2102.

  • IGMP Snooping—Internet Group Management Protocol (IGMP) snooping is disabled; IGMP snooping is disabled while Router-Port Group Management Protocol (RGMP) is enabled; or multicast is enabled globally but disabled on the interface.

  • SNMP Community access strings—The access strings (rw, ro, rw-all) are set to the default.

  • Ports—A port negotiates to half duplex or it has a duplex/VLAN mismatch.

  • Inline power ports—An inline-power port is in any of these states:

    • denied

    • faulty

    • other

    • off

  • Modules—A module is in any state other than "ok".

  • Tests—Lists the system diagnostic tests that failed at bootup.

  • Default gateway(s) unreachable—Pings the default gateways in order to list those that cannot be reached.

  • Bootflash—Checks whether the bootflash is correctly formatted and has enough space to hold a crashinfo file.

This is example output:

Note: The actual output can vary, based on the software version.

IOSSwitch>show diagnostic sanity 
Status of the default gateway is:
10.6.144.1 is alive 

The following active ports have auto-negotiated to half-duplex:
4/1 

The following vlans have a spanning tree root of 32k:
1 

The following ports have a port cost different from the default:
4/48,6/1 

The following ports have UDLD disabled:
4/1,4/48,6/1 

The following ports have a receive flowControl disabled:
4/1,4/48,6/1 

The value for Community-Access on read-only operations for 
SNMP is the same as default. Please verify that this is the best 
value from a security point of view. 

The value for Community-Access on read-write operations for SNMP is 
the same as default. Please verify that this is the best value from 
a security point of view. 

The value for Community-Access on read-write-all operations for SNMP 
is the same as default. Please verify that this is the best value from 
a security point of view.

Please check the status of the following modules:
8,9


Module 2 had a MINOR_ERROR.


The Module 2 failed the following tests:

TestIngressSpan


The following ports from Module2 failed test1:

1,2,4,48

Refer to the show diagnostic sanity section of the Command Reference Guide.

Supervisor Engine or Module Problems

Supervisor Engine LED in Red/Amber or Status Indicates Faulty

If your switch Supervisor Engine LED is red, or the status shows faulty, there can be a hardware problem. You can get a system error message that is similar to this:

%DIAG-SP-3-MINOR_HW: 
   Module 1: Online Diagnostics detected Minor Hardware Error

Complete these steps for additional troubleshooting:

  1. Console in to the Supervisor Engine and issue the show diagnostic module {1 | 2} command, if possible.

    Note: You must set the diagnostic level to complete so that the switch can perform a full suite of tests in order to identify any hardware failure. Performance of the complete online diagnostic test increases the bootup time slightly. Bootup at the minimal level does not take as long as at the complete level, but detection of potential hardware problems on the card still occurs. If you set the diagnostic test level to bypass, no diagnostic tests are performed. Issue the diagnostic bootup level {complete | minimal | bypass} global configuration command in order to toggle between the diagnostic levels. The default diagnostic level is minimal, whether with CatOS or Cisco IOS system software.

    Note: Online diagnostics are not supported for Supervisor Engine 1-based systems that run Cisco IOS Software.

    This output shows an example of failure:

    Router#show diagnostic mod 1
    Current Online Diagnostic Level = Complete
    
    Online Diagnostic Result for Module 1 : MINOR ERROR
    
    Test Results: (. = Pass, F = Fail, U = Unknown)
    
    1 . TestNewLearn             : .
    2 . TestIndexLearn           : .
    3 . TestDontLearn            : .
    4 . TestConditionalLearn     : F
    5 . TestBadBpdu              : F
    6 . TestTrap                 : .
    7 . TestMatch                : .
    8 . TestCapture              : F
    9 . TestProtocolMatch        : .
    10. TestChannel              : .
    11. IpFibScTest              : .
    12. DontScTest               : .
    13. L3Capture2Test           : F
    14. L3VlanMetTest            : .
    15. AclPermitTest            : .
    16. AclDenyTest              : .
    17. TestLoopback:
              
       Port  1  2
       ----------
             .  . 
    
    18. TestInlineRewrite:
    
       Port  1  2
       ----------
             .  . 

    If the power-on diagnostics return a failure, which an F indicates in the test results, perform these steps:

    1. Reseat the module firmly and make sure that the captive screws are fully tightened.

    2. Move the module to a known good, working slot on the same chassis or a different chassis.

      Note: The Supervisor Engine 1 or 2 can go in either slot 1 or slot 2 only.

    3. Troubleshoot to eliminate the possibility of a faulty module.

      Note: In some rare circumstances, a faulty module can cause the Supervisor Engine to be reported as faulty.

      In order to eliminate the possibility, perform one of these steps:

      • If you recently inserted a module and the Supervisor Engine began to report problems, remove the module that you inserted last and reseat it firmly. If you still receive messages that indicate that the Supervisor Engine is faulty, reboot the switch without that module. If the Supervisor Engine functions properly, there is a possibility that the module is faulty. Inspect the backplane connector on the module in order to be sure that there is no damage. If there is no visual damage, try the module in another slot or in a different chassis. Also, inspect for bent pins on the slot connector on the backplane. Use a flashlight, if necessary, when you inspect the connector pins on the chassis backplane. If you still need assistance, contact Cisco Technical Support.

      • If you are not aware of any recently added module, and replacement of the Supervisor Engine does not fix the problem, there is a possibility that the module is seated improperly or is faulty. In order to troubleshoot, remove all the modules except the Supervisor Engine from the chassis. Power up the chassis and make sure that the Supervisor Engine comes up without any failure. If the Supervisor Engine comes up without any failures, begin to insert modules one at a time until you determine which module is faulty. If the Supervisor Engine does not fail again, there is a possibility that one of the modules was not seated correctly. Observe the switch and, if you continue to have problems, create a service request with Cisco Technical Support in order to troubleshoot further.

    After you perform each of these steps, issue the show diagnostic module module_# command. Observe if the module still shows the failure status. If the failure status still appears, capture the log from the troubleshooting steps that you performed and create a service request with Cisco Technical Support for further assistance.

    Note: If you run the Cisco IOS Software Release 12.1(8) train, the diagnostics are not fully supported. You get false failure messages when diagnostics are enabled. The diagnostics are supported in Cisco IOS Software Release 12.1(8b)EX4 and later, and for Supervisor Engine 2-based systems, in Cisco IOS Software Release 12.1(11b)E1 and later.

    Also, refer to Field Notice: Diagnostics Incorrectly Enabled in Cisco IOS Software Release 12.1(8b)EX2 and 12.1(8b)EX3 for more information.

  2. If the switch does not boot and fails the self diagnostics during the boot sequence, capture the output and create a service request with Cisco Technical Support for further assistance.

  3. If you do not see any hardware failure in the boot sequence or in the output of the show diagnostics module {1 | 2} command, issue the show environment status and show environment temperature commands in order to check the outputs related to environment conditions and look for any other failed components.

    cat6knative#show environment status
    backplane: 
      operating clock count: 2
      operating VTT count: 3
    fan-tray 1: 
      fan-tray 1 fan-fail: OK
    VTT 1: 
      VTT 1 OK: OK
      VTT 1 outlet temperature: 35C
    VTT 2: 
      VTT 2 OK: OK
      VTT 2 outlet temperature: 31C
    VTT 3: 
      VTT 3 OK: OK
      VTT 3 outlet temperature: 33C
    clock 1: 
      clock 1 OK: OK, clock 1 clock-inuse: in-use
    clock 2: 
      clock 2 OK: OK, clock 2 clock-inuse: not-in-use
    power-supply 1: 
      power-supply 1 fan-fail: OK
      power-supply 1 power-output-fail: OK
    module 1: 
      module 1 power-output-fail: OK
      module 1 outlet temperature: 28C
      module 1 device-2 temperature: 32C
      RP 1 outlet temperature: 34C
      RP 1 inlet temperature: 34C
      EARL 1 outlet temperature: 34C
      EARL 1 inlet temperature: 28C
    module 3: 
      module 3 power-output-fail: OK
      module 3 outlet temperature: 39C
      module 3 inlet temperature: 23C
      EARL 3 outlet temperature: 33C
      EARL 3 inlet temperature: 30C
    module 4: 
      module 4 power-output-fail: OK
      module 4 outlet temperature: 38C
      module 4 inlet temperature: 26C
      EARL 4 outlet temperature: 37C
      EARL 4 inlet temperature: 30C
    module 5: 
      module 5 power-output-fail: OK
      module 5 outlet temperature: 39C
      module 5 inlet temperature: 31C
    module 6: 
      module 6 power-output-fail: OK
      module 6 outlet temperature: 35C
      module 6 inlet temperature: 29C
      EARL 6 outlet temperature: 39C
      EARL 6 inlet temperature: 30C

    If you see any system component (fan, voltage termination [VTT]) failure, create a service request with Cisco Technical Support and provide the command output.

    If you see a failed status in this output for any of the modules, issue the hw-module module module_# reset command, or reseat the module in the same slot or in a different slot in order to try to recover it. Also, see the Troubleshoot a Module That Does Not Come Online or Indicates Faulty or Other Status section of this document for further assistance.

  4. If the status indicates OK, as the sample output in Step 3 shows, issue the show environment alarms command in order to check for an environment alarm.

    If there are no alarms, the output is similar to this:

    cat6knative#show environment alarm
    environmental alarms:
      no alarms
    

    However, if there is an alarm, the output is similar to this:

    cat6knative#show environment alarm
    environmental alarms:
    system minor alarm on VTT 1 outlet temperature (raised 00:07:12 ago)
    system minor alarm on VTT 2 outlet temperature (raised 00:07:10 ago)
    system minor alarm on VTT 3 outlet temperature (raised 00:07:07 ago)
    system major alarm on VTT 1 outlet temperature (raised 00:07:12 ago)
    system major alarm on VTT 2 outlet temperature (raised 00:07:10 ago)
    system major alarm on VTT 3 outlet temperature (raised 00:07:07 ago)

Switch is in Continuous Booting Loop, in ROMmon mode, or Missing the System Image

If your switch Supervisor Engine is in a continuous booting loop, in ROM monitor (ROMmon) mode, or is missing the system image, the problem is most likely not a hardware problem.

The Supervisor Engine goes into ROMmon mode or fails to boot when the system image is either corrupt or missing. For instructions on how to recover the Supervisor Engine, refer to Recovering a Catalyst 6500/6000 Running Cisco IOS System Software from a Corrupted or Missing Boot Loader Image or ROMmon Mode.

You can boot the Cisco IOS image from either sup-bootflash: or slot0: (the PC card slot). Keep a copy of the system image on both devices for faster recovery. If your Supervisor Engine 2 bootflash device has only 16 MB, an upgrade to 32 MB can be necessary in order to support the newer system images. For more information, refer to Catalyst 6500 Series Supervisor Engine 2 Boot ROM and Bootflash Device Upgrade Installation Note.
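
This is a minimal sketch of a BOOT variable configuration that points to the image on both devices; the image filename is only an illustration, taken from the example output later in this document:

Switch#configure terminal
Switch(config)#boot system flash sup-bootflash:c6sup22-jsv-mz.121-8b.E9
Switch(config)#boot system flash slot0:c6sup22-jsv-mz.121-8b.E9
Switch(config)#end
Switch#write memory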

Standby Supervisor Engine Module is Not Online or Status Indicates Unknown

This section outlines common reasons that the standby Supervisor Engine module does not come on line and how to solve each problem. You can determine that the Supervisor Engine module does not come on line in one of these ways:

  • The output of the show module command shows the status other or faulty.

  • The amber status LED is lit.

Common Reasons/Solutions

  • Console in to the standby Supervisor Engine in order to determine if it is in ROMmon mode or in continuous reboot. If the Supervisor Engine is in one of these states, refer to Recovering a Catalyst 6500/6000 Running Cisco IOS System Software from a Corrupted or Missing Boot Loader Image or ROMmon Mode.

    Note: If the active and standby Supervisor Engines do not run the same Cisco IOS Software release, the standby can fail to come on line. For example, a Supervisor Engine can fail to come on line in a situation in which:

    • The active Supervisor Engine runs Route Processor Redundancy Plus (RPR+) mode.

      Note: RPR+ mode is available in Cisco IOS Software Release 12.1(11)EX and later.

    • The standby Supervisor Engine runs a software version in which RPR/RPR+ mode is not available, such as Cisco IOS Software Release 12.1(8b)E9.

    In this case, the second Supervisor Engine fails to come on line because the redundancy mode is enhanced high system availability (EHSA), by default. The standby Supervisor Engine fails to negotiate with the active Supervisor Engine. Be sure that both Supervisor Engines run the same Cisco IOS Software level.

    This output shows the Supervisor Engine in slot 2 in ROMmon mode. You must console in to the standby Supervisor Engine in order to recover it. For recovery procedures, refer to Recovering a Catalyst 6500/6000 Running Cisco IOS System Software from a Corrupted or Missing Boot Loader Image or ROMmon Mode.

    tpa_data_6513_01#show module
    Mod Ports Card Type                              Model              Serial No.
    --- ----- -------------------------------------- ------------------ -----------
      1    2  Catalyst 6000 supervisor 2 (Active)    WS-X6K-S2U-MSFC2   SAD0628035C
      2    0  Supervisor-Other                       unknown            unknown
      3   16  Pure SFM-mode 16 port 1000mb GBIC      WS-X6816-GBIC      SAL061218K3
      4   16  Pure SFM-mode 16 port 1000mb GBIC      WS-X6816-GBIC      SAL061218K8
      5    0  Switching Fabric Module-136 (Active)   WS-X6500-SFM2      SAD061701YC
      6    1  1 port 10-Gigabit Ethernet Module      WS-X6502-10GE      SAD062003CM
    
    Mod MAC addresses                       Hw    Fw           Sw           Status
    --- ---------------------------------- ------ ------------ ------------ -------
      1  0001.6416.0342 to 0001.6416.0343   3.9   6.1(3)       7.5(0.6)HUB9 Ok      
      2  0000.0000.0000 to 0000.0000.0000   0.0   Unknown      Unknown      Unknown 
      3  0005.7485.9518 to 0005.7485.9527   1.3   12.1(5r)E1   12.1(13)E3,  Ok      
      4  0005.7485.9548 to 0005.7485.9557   1.3   12.1(5r)E1   12.1(13)E3,  Ok      
      5  0001.0002.0003 to 0001.0002.0003   1.2   6.1(3)       7.5(0.6)HUB9 Ok      
      6  0002.7ec2.95f2 to 0002.7ec2.95f2   1.0   6.3(1)       7.5(0.6)HUB9 Ok      
    
    Mod Sub-Module                  Model           Serial           Hw     Status 
    --- --------------------------- --------------- --------------- ------- -------
      1 Policy Feature Card 2       WS-F6K-PFC2     SAD062802AV      3.2    Ok     
      1 Cat6k MSFC 2 daughterboard  WS-F6K-MSFC2    SAD062803TX      2.5    Ok     
      3 Distributed Forwarding Card WS-F6K-DFC      SAL06121A19      2.1    Ok     
      4 Distributed Forwarding Card WS-F6K-DFC      SAL06121A46      2.1    Ok     
      6 Distributed Forwarding Card WS-F6K-DFC      SAL06261R0A      2.3    Ok     
      6 10GBASE-LR Serial 1310nm lo WS-G6488        SAD062201BN      1.1    Ok
  • Make sure that the Supervisor Engine module is properly seated in the backplane connector. Also, make sure that the Supervisor Engine installation screw is completely tightened. Refer to Catalyst 6500 Series Switch Module Installation Note for more information.

  • In order to identify whether the standby Supervisor Engine is faulty, issue the redundancy reload peer command from the active Supervisor Engine (see the example below). From a console connection to the standby Supervisor Engine, observe the boot sequence in order to identify any hardware failures.

    If the standby Supervisor Engine still does not come on line, create a service request with Cisco Technical Support in order to troubleshoot further. When you create the service request, provide the log of the switch output that you collected and the troubleshooting steps that you performed.
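
    This is a minimal sketch of the peer reload; the device prompt is illustrative, and the reload must be confirmed at the prompt:

    Router#redundancy reload peer

    !--- Confirm the prompt, and then watch the standby console
    !--- for any diagnostic or boot failures.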

Show Module Output Gives "not applicable" for SPA Module

This status appears because the PA-1XCHSTM1/OC3 does not have diagnostic support in the SRB release train. When the show module command is issued while the switch runs SRB code, the not applicable status is displayed. This does not mean that the status of the SPA Interface Processor goes unchecked, because the overall diagnostics still give the proper results. From the SRC release train onward, the output displays correctly. This behavior is caused by a bug in the SRB code, which is documented in Cisco bug ID CSCso02832 (registered customers only) .

Standby Supervisor Engine Reloads Unexpectedly

This section discusses the common reasons why the Catalyst switch standby supervisor unexpectedly reloads.

Common Reasons/Solutions

  • The active supervisor resets the standby supervisor after a failure to synchronize the startup configuration. The issue can be due to consecutive wr mem commands that management stations perform in a short span of time (1 to 3 seconds), which lock the startup configuration and cause the synchronization to fail. If the first sync process has not completed when the second wr mem is issued, there is a sync failure on the standby supervisor, and sometimes the standby supervisor reloads or resets. This issue is documented in Cisco bug ID CSCsg24830 (registered customers only) . This synchronization failure can be identified by this error message:

    %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup configuration to
    the standby Router
    %PFINIT-SP-1-CONFIG_SYNC_FAIL: Sync'ing the startup configuration
    to the standby Router FAILED
  • The active supervisor does not synchronize its configuration with the standby supervisor. This condition can be transient and caused by the temporary use of the configuration file by another process. For example, if you entered the show configuration command or the show running-config command in order to view the configuration, the configuration file is locked. This issue is documented in Cisco bug ID CSCeg21028 (registered customers only) . This synchronization failure can be identified by this error message:

    %PFINIT-SP-1-CONFIG_SYNC_FAIL_RETRY: Sync'ing the startup 
    configuration to the standby Router FAILED, the file may be already locked by a command

Even After You Remove the Modules, the show run Command Still Shows Information About the Removed Module Interfaces

When you physically remove a module from the chassis, the configuration for the module in that slot still appears. This behavior is by design and allows for easier replacement of the module. If you insert the same type of module in the slot, the switch uses the configuration of the module that was previously in the slot. If you insert a different type of module into the slot, the module configuration is cleared. In order to remove the configuration automatically once a module is taken out of a slot, issue the module clear-config command in global configuration mode. Make sure that you issue the command before the modules are removed from the slot; the command does not clear the old configuration of modules that have already been removed. This command clears the module configuration from the output of the show running-config command and the interface details from the output of the show ip interface brief command. In Cisco IOS Software Release 12.2(18)SXF and later, it also removes the count of interface types from the show version command.
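
This is a minimal sketch of the configuration; it must be in place before the module is removed:

Switch#configure terminal
Switch(config)#module clear-config
Switch(config)#end
Switch#write memory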

Switch Has Reset/Rebooted on Its Own

If your switch has reset on its own without any manual intervention, follow these steps in order to identify the problem:

Common Reasons/Solutions

  • The switch can have had a software crash. Issue the dir bootflash: command, which displays the MSFC (route processor [RP]) bootflash device, and the dir slavebootflash: command in order to check for a software crash.

    The output in this section shows that crashinfo has been recorded in the RP bootflash:. Make sure that the crashinfo that you view is of the most recent crash. Issue the more bootflash:filename command in order to display the crashinfo file. In this example, the command is more bootflash:crashinfo_20020829-112340.

    cat6knative#dir bootflash:
    Directory of bootflash:/
    
        1  -rw-     1693168   Jul 24 2002 15:48:22  c6msfc2-boot-mz.121-8a.EX
        2  -rw-      183086   Aug 29 2002 11:23:40  crashinfo_20020829-112340
        3  -rw-    20174748   Jan 30 2003 11:59:18  c6sup22-jsv-mz.121-8b.E9
        4  -rw-        7146   Feb 03 2003 06:50:39  test.cfg
        5  -rw-       31288   Feb 03 2003 07:36:36  01_config.txt
        6  -rw-       30963   Feb 03 2003 07:36:44  02_config.txt
    
    31981568 bytes total (9860396 bytes free)

    The dir sup-bootflash: command displays the Supervisor Engine bootflash: device. You can also issue the dir slavesup-bootflash: command in order to display the standby Supervisor Engine bootflash: device. This output shows crashinfo recorded in the Supervisor Engine bootflash: device:

    cat6knative11#dir sup-bootflash:
    Directory of sup-bootflash:/
    
        1  -rw-    14849280   May 23 2001 12:35:09  c6sup12-jsv-mz.121-5c.E10
        2  -rw-       20176   Aug 02 2001 18:42:05  crashinfo_20010802-234205
    
    !--- Output suppressed.
    
    

    If the command output indicates that a software crash occurred at the time you suspected that the switch rebooted, contact Cisco Technical Support. Provide the output of the show tech-support command and the show logging command, as well as the output of the crashinfo file. In order to send the file, transfer it via TFTP from the switch to a TFTP server, and attach the file to the case.

  • If there is no crashinfo file, check the power source for the switch to make sure that it did not fail. If you use an uninterruptible power supply (UPS), make sure that it works properly. If you still cannot determine the problem, contact the Cisco Technical Support escalation center.

DFC-Equipped Module Has Reset on Its Own

If a Distributed Forwarding Card (DFC)-equipped module has reset on its own without a user reload, check the bootflash of the DFC to see whether it crashed. If a crash information file is available, you can find the cause of the crash. Issue the dir dfc#module_#-bootflash: command in order to verify whether there is a crash information file and when it was written. If the DFC reset matches the crashinfo timestamp, issue the more dfc#module_#-bootflash:filename command. Or, issue the copy dfc#module_#-bootflash:filename tftp command in order to transfer the file via TFTP to a TFTP server.

cat6knative#dir dfc#6-bootflash:
Directory of dfc#6-bootflash:/
-#- ED ----type---- --crc--- -seek-- nlen -length- -----date/time------ name 
1   ..   crashinfo 2B745A9A   C24D0   25   271437 Jan 27 2003 20:39:43 crashinfo_
 20030127-203943
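
This example transfers the crashinfo file to a TFTP server; the filename matches the directory listing above, while the TFTP server address is only an illustration:

cat6knative#copy dfc#6-bootflash:crashinfo_20030127-203943 tftp
Address or name of remote host []? 192.168.1.100
Destination filename [crashinfo_20030127-203943]?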

After you have the crashinfo file available, collect the output of the show logging command and the show tech-support command, and contact Cisco Technical Support for further assistance.

Troubleshoot a Module That Does Not Come Online or Indicates Faulty or Other Status

This section outlines common reasons that one of the modules can fail to come on line and how to solve the problem. You can determine that a module does not come on line in one of these ways:

  • The output of the show module command shows one of these statuses:

    • other

    • unknown

    • faulty

    • errdisable

    • power-deny

    • power-bad

  • The amber or red status LED is lit.

Common Reasons/Solutions

  • Check the Supported Hardware section of the Catalyst 6500 Series Release Notes of the relevant release. If the module is not supported in the software that you currently run, download the required software from the Cisco IOS Software Center (registered customers only) .

  • If the status is power-deny, the switch does not have enough power available to power this module. Issue the show power command in order to confirm if enough power is available. See the Troubleshoot C6KPWR-4-POWRDENIED: insufficient power, module in slot [dec] power denied or %C6KPWR-SP-4-POWRDENIED: insufficient power, module in slot [dec] power denied Error Messages section of this document.

  • If the status is power-bad, the switch is able to see a card but is unable to allocate power. This is possible if the Supervisor Engine is not able to access the serial PROM (SPROM) contents on the module in order to determine the identification of the line card. You can issue the show idprom module slot command in order to verify whether the SPROM is readable. If the SPROM is not accessible, you can reset the module.

  • Make sure that the module is properly seated and screwed in completely. If the module still does not come on line, issue the diagnostic bootup level complete global configuration command in order to make sure that the diagnostic is enabled. Then, issue the hw-module module slot_number reset command. If the module still does not come on line, inspect the backplane connector on the module to make sure that there is no damage. If there is no visual damage, try the module in another slot or a different chassis. Also, inspect for bent pins on the slot connector on the backplane. Use a flashlight, if necessary, when you inspect the connector pins on the chassis backplane.

  • Issue the show diagnostic module slot_number command in order to identify any hardware failures on the module. Issue the diagnostic bootup level complete global configuration command in order to enable complete diagnostics. You must have complete diagnostics enabled so that the switch can perform full diagnostics on the module. If you have minimal diagnostics enabled and you change to complete diagnostics, the module must reset so that the switch can perform the full diagnostics. In the example in this section, the first show diagnostic module command output is inconclusive because many of the tests were performed in minimal mode. The example then shows how to set the diagnostic level to complete and issue the show diagnostic module command again in order to see the complete results.

    Note: The Gigabit Interface Converters (GBICs) were not installed in the sample module. Therefore, the integrity tests were not performed. The GBIC integrity test is performed only on copper GBICs (WS-G5483= ).

    cat6native#show diagnostic module 3
    Current Online Diagnostic Level = Minimal
    
    Online Diagnostic Result for Module 3 : PASS
    Online Diagnostic Level when Module 3 came up = Minimal
    
    Test Results: (. = Pass, F = Fail, U = Unknown)
    
    1 . TestGBICIntegrity : 
    
       Port  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
       ----------------------------------------------------
             U  U  U  U  U  U  U  U  U  U  U  U  U  U  U  U 
    
    2 . TestLoopback : 
    
       Port  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
       ----------------------------------------------------
             .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  . 
    
    3 . TestDontLearn                 : U
    4 . TestConditionalLearn          : .
    5 . TestStaticEntry               : U
    6 . TestCapture                   : U
    7 . TestNewLearn                  : .
    8 . TestIndexLearn                : U
    9 . TestTrap                      : U
    10. TestIpFibShortcut             : .
    11. TestDontShortcut              : U
    12. TestL3Capture                 : U
    13. TestL3VlanMet                 : .
    14. TestIngressSpan               : .
    15. TestEgressSpan                : .
    16. TestAclPermit                 : U
    17. TestAclDeny                   : U
    18. TestNetflowInlineRewrite : 
    
       Port  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
       ----------------------------------------------------
             U  U  U  U  U  U  U  U  U  U  U  U  U  U  U  U 
    
    !--- Tests that are marked "U" were skipped because a minimal 
    !--- level of diagnostics was enabled.
    
    cat6knative#configure terminal
    Enter configuration commands, one per line.  End with CNTL/Z.
    cat6knative(config)#diagnostic bootup level complete
    
    !--- This command enables complete diagnostics.
    
    cat6knative(config)#end
    cat6knative#
    *Feb 18 13:13:03 EST: %SYS-5-CONFIG_I: Configured from console by console
    cat6knative#
    cat6knative#hw-module module 3 reset
    Proceed with reload of module? [confirm]
    % reset issued for module 3
    cat6knative#
    *Feb 18 13:13:20 EST: %C6KPWR-SP-4-DISABLED: power to module in slot 3 set off 
     (Reset)
    *Feb 18 13:14:12 EST: %DIAG-SP-6-RUN_COMPLETE: Module 3: Running Complete Online 
     Diagnostics...
    *Feb 18 13:14:51 EST: %DIAG-SP-6-DIAG_OK: Module 3: Passed Online Diagnostics
    *Feb 18 13:14:51 EST: %OIR-SP-6-INSCARD: Card inserted in slot 3, interfaces 
     are now online 
    cat6knative#show diagnostic module 3  
    Current Online Diagnostic Level = Complete
    
    Online Diagnostic Result for Module 3 : PASS
    Online Diagnostic Level when Module 3 came up = Complete
    
    Test Results: (. = Pass, F = Fail, U = Unknown)
    
    1 . TestGBICIntegrity : 
    
       Port  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
       ----------------------------------------------------
             U  U  U  U  U  U  U  U  U  U  U  U  U  U  U  U 
    
    !--- The result for this test is unknown ("U", untested) 
    !--- because no copper GBICS are plugged in.
    
    
    2 . TestLoopback : 
    
       Port  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
       ----------------------------------------------------
             .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  . 
    
    3 . TestDontLearn                 : .
    4 . TestConditionalLearn          : .
    5 . TestStaticEntry               : .
    6 . TestCapture                   : .
    7 . TestNewLearn                  : .
    8 . TestIndexLearn                : .
    9 . TestTrap                      : .
    10. TestIpFibShortcut             : .
    11. TestDontShortcut              : .
    12. TestL3Capture                 : .
    13. TestL3VlanMet                 : .
    14. TestIngressSpan               : .
    15. TestEgressSpan                : .
    16. TestAclPermit                 : .
    17. TestAclDeny                   : .
    18. TestNetflowInlineRewrite : 
    
       Port  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
       ----------------------------------------------------
             .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  . 
  • Issue the show tech-support command and the show logging command. Look for any other messages that relate to this module in order to troubleshoot further.

    If the module still does not come on line, create a service request with Cisco Technical Support in order to troubleshoot further. Provide the log of the switch output that you collected and the troubleshooting steps that you performed.

Inband Communication Failure

The Supervisor Engine can log messages that indicate an inband communication failure. The messages that the switch logs look similar to these:

InbandKeepAliveFailure:Module 1 not responding over inband
InbandKeepAlive:Module 2 inband rate: rx=0 pps, tx=0 pps
ProcessStatusPing:Module 1 not responding over SCP
ProcessStatusPing:Module 1 not responding... resetting module

Common Cause/Solution 1

When the management interface of the switch processes heavy traffic, the switch logs InbandKeepAliveFailure error messages. These are possible causes:

  • Busy Supervisor Engine

  • Spanning tree protocol loop

  • ACLs and QoS policers have throttled or dropped traffic over the inband communications channel

  • Port ASIC synchronization problems

  • Switch Fabric Module problems

In order to resolve the issue, follow these instructions:

  1. Issue the show processes cpu command in order to determine which process causes the issue (see the example after these steps). Refer to Catalyst 6500/6000 Switch High CPU Utilization in order to address the root cause.

  2. A mis-seated or faulty Supervisor Engine module can also generate these communication failure messages. In order to recover from these error messages, schedule a maintenance window and reseat the Supervisor Engine module.
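
This is a minimal example of the CPU check mentioned in step 1; the device prompt is illustrative:

Cat6500#show processes cpu

!--- Review the five-second, one-minute, and five-minute utilization
!--- and note the processes that consume the most CPU time.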

Error "System returned to ROM by power-on (SP by abort)"

A Cisco Catalyst 6500/6000 that runs Cisco IOS Software can appear to reload with this reset reason:

System returned to ROM by power-on (SP by abort)

A Catalyst 6500/6000 with an SP configuration register that allows break (for example, 0x2) enters ROMmon diagnostic mode when it receives a console break signal, so the system appears to crash. A mismatch of the configuration register settings on the SP and RP can cause this type of reload. Specifically, the Supervisor Engine Switch Processor (SP) configuration register can be set to a value that does not ignore break, while the Multilayer Switch Feature Card (MSFC) Route Processor (RP) configuration register is set to a value that does ignore break. For example, the Supervisor Engine SP can be set to 0x2 and the MSFC RP to 0x2102.

For more information, refer to IOS Catalyst 6500/6000 Resets with Error "System returned to ROM by power-on (SP by abort)".

A Cisco Catalyst 6500/6000 that runs Cisco IOS Software can also boot the old image on the sup-bootdisk regardless of the BOOT variable configuration in the running configuration. Even though the BOOT variable is configured to boot from external flash, the switch boots only the old image on the sup-bootdisk. The cause of this issue is a mismatch of the configuration register settings on the SP and RP.

On the RP, issue the show bootvar command:

Switch#sh boot
BOOT variable = 
sup-bootdisk:s72033-advipservicesk9_wan-mz.122-18.SXF7.bin,1;
CONFIG_FILE variable =
BOOTLDR variable =
Configuration register is 0x2102

On the SP, issue the show bootvar command:

Switch-sp#sh boot
BOOT variable = bootdisk:s72033-advipservicesk9_wan-mz.122-18.SXF7.bin,1;
CONFIG_FILE variable does not exist
BOOTLDR variable does not exist
Configuration register is 0x2101

This mismatch causes the switch to boot the previous image regardless of the BOOT variable configuration in the running configuration. In order to resolve this problem, issue the config-register 0x2102 command in global configuration mode, and then confirm that both the SP and RP report the same configuration register value. Save the configuration and reload the switch.
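
This is a minimal sketch of the fix; the device prompt is illustrative:

Switch#configure terminal
Switch(config)#config-register 0x2102
Switch(config)#end
Switch#write memory
Switch#reload

!--- After the reload, verify the value with the show bootvar
!--- command on both the RP and the SP.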

Error: NVRAM: nv->magic != NVMAGIC, invalid nvram

This error message indicates that the NVRAM has issues. If you erase the NVRAM and reload the switch, the switch can recover the NVRAM.

If this does not resolve the issue, format the NVRAM. In both cases, back up the NVRAM contents before you proceed. This error message is displayed only when NVRAM debugging is enabled.
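
This is a minimal sketch of the erase procedure; the backup step assumes a reachable TFTP server, and each prompt must be confirmed interactively:

Switch#copy startup-config tftp:

!--- Back up the configuration first; supply the TFTP server
!--- address and filename when prompted.

Switch#erase nvram:

!--- Confirm the prompt.

Switch#reload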

Error: Switching Bus FIFO counter stuck

The error message CRIT_ERR_DETECTED Module 7 - Error: Switching Bus FIFO counter stuck indicates that the module has not seen activity on the data switching bus.

This error can occur if the newly inserted module was not firmly seated in the chassis initially or was pushed in too slowly.

Reseat the module in order to resolve the problem.

Error: Counter exceeds threshold, system operation continue

A Catalyst 6500 VSS cluster can encounter this error message:

%CONST_DIAG-4-ERROR_COUNTER_WARNING: Module [dec] Error counter exceeds 
   threshold, system operation continue.

The TestErrorCounterMonitor has detected that an error counter in the specified module has exceeded a threshold. Specific data about the error counter will be sent in a separate system message. The TestErrorCounterMonitor is a non-disruptive health-monitoring background process that periodically polls the error counters and interrupt counters of each line card or supervisor module in the system.

%CONST_DIAG-4-ERROR_COUNTER_DATA: ID:[dec] IN:[dec] PO:[dec] RE:[dec] RM:[dec]
   DV:[dec] EG:[dec] CF:[dec] TF:[dec]

The TestErrorCounterMonitor has detected that an error counter in the specified module has exceeded a threshold. This message contains specific data about the error counter, including the ASIC and register of the counter, and the error count.

This error message is received when an ASIC on the linecard receives packets with a bad CRC. The issue may be local to this module or may be triggered by some other faulty module in the chassis.

For example:

%CONST_DIAG-SW1_SP-4-ERROR_COUNTER_WARNING: Module 2 
   Error counter exceeds threshold, system operation continue.

This error can occur if the newly inserted module was not firmly seated.

Reseat the module in order to resolve the problem.

Error: No more SWIDB can be allocated

This error message is received when the maximum number of Software Interface Descriptor Blocks (SWIDBs) is reached:

%INTERFACE_API-SP-1-NOMORESWIDB: No more SWIDB can be allocated, maximum allowed 12000

Refer to Maximum Number of Interfaces and Subinterfaces for Cisco IOS Platforms: IDB Limits for more information on IDB limits.

When you try to convert a non-switchport interface to a switchport, the switch rejects the command with an error:

Switch(config)#interface gigabit ethernet 7/29
Switch(config-if)#switchport
%Command rejected: Cannot convert port.
Maximum number of interfaces reached.

This is the output of the show idb command:

AMC440E-SAS01#show idb

Maximum number of Software IDBs 12000.  In use 11999.

                       HWIDBs     SWIDBs
Active                    218        220
Inactive                11779      11779
Total IDBs              11997      11999
Size each (bytes)        3392       1520
Total bytes          40693824   18238480

This example shows that the Total IDBs number (under the SWIDBs column) has reached the maximum number of IDBs limit. When you delete a subinterface, the Active and Inactive numbers in the SWIDBs column change; however, the Total IDBs number remains in the memory.

In order to resolve this issue, reload the switch in order to clear the IDB database. Otherwise, once the limit is reached, you must reuse the deleted subinterfaces instead of creating new ones.

SYSTEM INIT: INSUFFICIENT MEMORY TO BOOT THE IMAGE!

An error message similar to this one is reported when the Cisco Catalyst 6500 switch fails to boot a specified Cisco IOS Software release:

00:00:56: %SYS-SP-2-MALLOCFAIL: Memory allocation of 2177024 bytes failed from 0x40173D8C,
alignment 8 
Pool: Processor  Free: 1266272  Cause: Not enough free memory 
Alternate Pool: None  Free: 0  Cause: No Alternate pool 

-Process= "TCAM Manager process", ipl= 0, pid= 112
-Traceback= 4016F4D0 40172688 40173D94 40577FF8 4055DB04 4055DEDC
SYSTEM INIT: INSUFFICIENT MEMORY TO BOOT THE IMAGE!

%Software-forced reload

This issue commonly occurs when there is not enough DRAM available for the image in Flash to decompress.

In order to resolve this issue, perform one of these options:

Troubleshoot CatOS to Cisco IOS Software or Cisco IOS Software to CatOS Conversion

If you have difficulty with a conversion from CatOS to Cisco IOS system software or Cisco IOS Software to CatOS, refer to these documents for assistance:

Problem when User Attempts to Access the NVRAM After Cisco IOS to CatOS Conversion

If the NVRAM gets corrupted or the value of the CONFIG_FILE variable is set from the MSFC ROMmon during the conversion from Cisco IOS to CatOS, you can experience problems when you try to access the NVRAM from the MSFC. You can also get error messages that are similar to these:

Router#write memory
     startup-config file open failed (Not enough space)
Router#dir nvram:
     Directory of nvram:/       
    
%Error calling getdents for nvram:/ (Unknown error 89)

When the MSFC loads with the CONFIG_FILE variable set in ROMmon, the user is unable to save the configuration to NVRAM. The show startup-config command also fails with error code 89. This issue is seen on a Catalyst 6500 with Supervisor Engine 720, in hybrid mode, that runs Cisco IOS Software Release 12.2(14)SX2 on the MSFC3.

These are the workarounds if the CONFIG_FILE is set:

  1. Upgrade the MSFC3 code to Cisco IOS Software Release 12.2(17a)SX or later. For more information on how to upgrade the software image on MSFC, refer to How to Upgrade Software Images on Catalyst Switch Layer 3 Modules.

  2. Unset the CONFIG_FILE variable from the MSFC ROMmon.

    In order to enter into ROMmon mode, reload the MSFC and then press the Ctrl+Break key during the first 60 seconds of startup. Once the MSFC enters into the ROMmon mode, issue these commands in order to unset the CONFIG_FILE:

    • rommon 2 >priv
      
      !--- Press Enter or Return.
      !--- You have entered ROMmon privileged mode.
      !--- You see this output:
      
      You now have access to the full set of monitor commands.
      Warning: some commands will allow you to destroy your
      configuration and/or system images and could render
      the machine unbootable.
    • rommon 3 >unset CONFIG_FILE
      
      !--- Press Enter or Return.
      !--- This unsets the CONFIG_FILE variable.
      
      
    • rommon 4 >sync
      
      !--- Press Enter or Return.
      
      
    • rommon 5 >reset
      
      
      !--- Press Enter or Return.
      
      

If the NVRAM gets corrupted during the conversion from Cisco IOS to CatOS, erase the NVRAM to resolve the issue. In order to erase the NVRAM, enter into ROMmon mode and then issue these commands:

  • rommon 1 >priv
    
    
    !--- Press Enter or Return.
    !--- You have entered ROMmon privileged mode.
    !--- You see this output:
    
    You now have access to the full set of monitor commands.
    Warning: some commands will allow you to destroy your
    configuration and/or system images and could render
    the machine unbootable.
  • rommon 2 >nvram_erase
    
    
    !--- Press Enter or Return.
    !--- Be sure to enter these parameters exactly:
    !--- The first line is a "be" (no space) followed by six zeros ("000000").
    !--- The next line is a "2" (no space) followed by five zeros ("00000").
    
    Enter in hex the start address [0xbe020000]:  be000000
    
    
    !--- Press Enter or Return.
    
    Enter in hex the test size or length in bytes [0x100]:  200000
    
    
    !--- Press Enter or Return.
    !--- After the NVRAM erase has completed, issue the reset command.
    
    
    rommon 3 >reset
    
    !--- Press Enter or Return.
    
    

    Note: On the Supervisor Engine 720, the nvram_erase command is available in the Route Processor (MSFC) ROMmon; it is not a valid command in the Switch Processor (Supervisor Engine) ROMmon.

Unable to Boot with Cisco IOS Software when User Converts from CatOS to Cisco IOS

If you try to boot Cisco IOS Software from disk0 or slot0 during the conversion process, you can get an error message similar to this:

*** TLB (Store) Exception ***
Access address = 0x10000403
PC = 0x8000fd60, Cause = 0xc, Status Reg = 0x30419003
 
monitor: command "boot" aborted due to exception

This error message can be hardware or software related and can result in a boot loop or the switch getting stuck in ROM Monitor (ROMmon) mode.

Complete these steps in order to resolve this issue:

  1. This issue can be caused by a software image with a bad checksum. Re-download the Cisco IOS Software image from the TFTP server.

  2. If a re-download does not resolve the issue, format the Flash card and re-download the Cisco IOS Software image.

    Refer to PCMCIA Filesystem Compatibility Matrix and Filesystem Information for information on how to erase the Flash.

  3. This issue can also be due to a hardware fault, but the error message does not indicate which hardware component causes the problem. Try to boot the Cisco IOS Software from another Flash card.

Interface/Module Connectivity Problems

Connectivity Problem or Packet Loss with WS-X6548-GE-TX and WS-X6148-GE-TX Modules Used in a Server Farm

When you use either the WS-X6548-GE-TX or WS-X6148-GE-TX modules, there is a possibility that individual port utilization can lead to connectivity problems or packet loss on the surrounding interfaces. Especially when you use EtherChannel and Remote Switched Port Analyzer (RSPAN) on these line cards, you can potentially see slow response due to packet loss. These line cards are oversubscribed cards that are designed to extend Gigabit Ethernet to the desktop and might not be ideal for server farm connectivity. On these modules, there is a single 1-Gigabit Ethernet uplink from the port ASIC that supports eight ports. These cards share a 1 Mb buffer between a group of ports (1-8, 9-16, 17-24, 25-32, 33-40, and 41-48) because each block of eight ports is 8:1 oversubscribed. The aggregate throughput of each block of eight ports cannot exceed 1 Gbps. Table 4 in the Cisco Catalyst 6500 Series 10/100- & 10/100/1000-Mbps Ethernet Interface Modules shows the different types of Ethernet interface modules and the supported buffer size per port.

Oversubscription happens due to multiple ports combined into a single Pinnacle ASIC. The Pinnacle ASIC is a direct memory access (DMA) engine that transfers packets between backplane switching bus and the network ports. If any port in this range receives or transmits traffic at a rate that exceeds its bandwidth or utilizes a large amount of buffers to handle bursts of traffic, the other ports in the same range can potentially experience packet loss. The buffer assignment on these modules is documented in Buffers, Queues & Thresholds on Catalyst 6500 Ethernet Modules.

A SPAN destination is a very common cause since it is not uncommon to copy traffic from an entire VLAN or multiple ports to a single interface. On a card with individual interface buffers, the packets that exceed the bandwidth of the destination port are silently dropped and no other ports are affected. With a shared buffer, this causes connectivity problems for the other ports on this range. In most scenarios, shared buffers do not result in any problems. Even with eight gigabit attached workstations, it is rare that the provided bandwidth is exceeded.

The switch can experience degradation in services when you configure local SPAN, especially if the session monitors a large number of source ports. The problem remains if the session monitors certain VLANs and a large number of ports is assigned to any of those VLANs.

Even though SPAN is done in hardware, there is a performance impact because the switch now carries twice as much traffic. Since each line card replicates the traffic at ingress, whenever a port is monitored, all ingress traffic is doubled when it hits the fabric. The capture of traffic from a large number of busy ports on a line card can fill up the fabric connection, especially with the WS-X6548-GE-TX cards, which only have an 8-Gbps fabric connection.

The WS-X6548-GE-TX, WS-X6548V-GE-TX, WS-X6148-GE-TX, and WS-X6148V-GE-TX modules have a limitation with EtherChannel. For EtherChannel, the data from all links in a bundle goes to the port ASIC, even though the data is destined for another link. This data consumes bandwidth in the 1-Gigabit Ethernet link. For these modules, the sum total of all data on an EtherChannel cannot exceed 1 Gigabit.

Check this output in order to verify that the module experiences drops related to over utilized buffers:

  • CatOS

    Cat6500 (enable) show asicreg <mod/port> pinnacle err

    Check these registers in the output. If the values are non-zero, there were drops due to buffer overruns.

    015B: PI_PBT_S_QOS3_OUTLOST_REG = 0011

    015F: PI_PBT_S_HOLD_REG = D26C

  • NativeIOS

    Cat6500# show counters interface gigabitEthernet <mod/port> | include qos3Outlost

    51. qos3Outlost = 768504851

Run the show commands several times in order to check whether the counters steadily increment. The asicreg outputs are cleared every time they are run, so values that remain non-zero across successive runs indicate active drops. Based on the rate of traffic, this data might need to be collected over several minutes in order to see significant increments.

Workaround

Complete these steps:

  1. Isolate any ports that might be consistently oversubscribed to their own range of ports in order to minimize the impact of drops to other interfaces.

    For example, if you have a server connected to port 1 which is oversubscribing the interface, this can lead to slow response if you have several other servers connected to the ports in the range 2-8. In this case, move the oversubscribing server to port 9 in order to free up the buffer in the first block of ports 1-8. On newer software versions, SPAN destinations have the buffering automatically moved to the interface so it does not impact the other ports in its range. Refer to Cisco bug IDs CSCed25278 (registered customers only) (CatOS) and CSCin70308 (registered customers only) (NativeIOS) for more information.

  2. Disable head-of-line (HOL) blocking, which causes the port to use its interface buffers instead of the shared buffers.

    This results in drops on only the single over-utilized port. Since the interface buffers (32 k) are significantly smaller than the 1 Mb shared buffer, there can potentially be more packet loss on the individual ports. This is only recommended for extreme cases where slower clients or SPAN ports cannot be moved to other line cards that offer dedicated interface buffers.

    • NativeIOS

      Router(config)# interface gigabitethernet <mod/port>

      Router(config-if)# hol-blocking disable

      Once this is disabled, the drops move to the interface counters and can be seen with the show interface gigabit <mod/port> command. The other ports are no longer affected provided that they are also not individually bursting. Since it is recommended to keep HOL blocking enabled, this information can be used to find the device that overruns the buffers on the range of ports and move it to another card or an isolated range on the card so HOL blocking can be re-enabled.

    • CatOS

      Console> (enable) set port hol-blocking <mod/port> disable

      Once this is disabled, the drops move to the interface counters and can be seen with the show mac <mod/port> command. The other ports are no longer affected provided that they are not also individually bursting. Since it is recommended to keep HOL blocking enabled, this information can be used to find the device that overruns the buffers on the range of ports and move it to another card or an isolated range on the card so HOL blocking can be re-enabled.

  3. When you configure a SPAN session, make sure that the destination port does not report any errors on that specific interface. In order to check any possible errors on the destination port, check the output of the show interface <interface type> <interface number> command for IOS or the output of the show port counters <mod/port> command in CatOS to see if there are any output drops or errors. The device connected to the destination port and the port itself must have the same speed and duplex settings to avoid any errors on the destination port.

  4. Consider a move to Ethernet modules that do not have oversubscribed ports. Refer to Cisco Catalyst 6500 Series Switches - Relevant Interfaces and Modules for more information on the supported modules.
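
As an illustration of step 3, here is a minimal sketch of a local SPAN session and an error check on its destination port. The session number and interface numbers are assumptions for this example; substitute the values that apply to your network.

Router(config)#monitor session 1 source interface gigabitethernet 1/1 both
Router(config)#monitor session 1 destination interface gigabitethernet 2/1
Router(config)#end

!--- Check the destination port for output drops or errors.

Router#show interfaces gigabitethernet 2/1 counters errors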

Workstation Is Unable to Log In to Network During Startup/Unable to Obtain DHCP Address

Protocols that run on the switch can introduce initial connectivity delay. There is a possibility that you have this problem if you observe any of these symptoms when you power up or reboot a client machine:

  • A Microsoft networking client displays No Domain Controllers Available.

  • DHCP reports No DHCP Servers Available.

  • A Novell Internetwork Packet Exchange (IPX) networking workstation does not have the Novell Login screen upon bootup.

  • An AppleTalk networking client displays Access to your AppleTalk network has been interrupted. To re-establish your connection, open and close the AppleTalk control panel. There is also a possibility that the AppleTalk client Chooser application either does not display a zone list or displays an incomplete zone list.

  • IBM Network Stations can have one of these messages:

    • NSB83619--Address resolution failed

    • NSB83589--Failed to boot after 1 attempt

    • NSB70519--Failed to connect to a server

Common Reasons/Solutions

Interface delay can result in the symptoms that the section Workstation Is Unable to Log In to Network During Startup/Unable to Obtain DHCP Address lists. These are common causes of interface delay:

  • Spanning Tree Protocol (STP) delay

  • EtherChannel delay

  • Trunking delay

  • Autonegotiation delay

For more information about these delays and possible solutions, refer to Using PortFast and Other Commands to Fix Workstation Startup Connectivity Delays.
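
As a hedged example of one possible fix, PortFast can be enabled on an access port that connects to a single workstation in order to bypass the STP listening and learning delay. The interface number is an assumption for this example; only apply PortFast to ports that connect to end hosts.

cat6knative(config)#interface gigabitethernet 4/2
cat6knative(config-if)#switchport
cat6knative(config-if)#switchport mode access
cat6knative(config-if)#spanning-tree portfast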

If you still have issues after you review and follow the procedure, contact Cisco Technical Support.

Troubleshoot NIC Compatibility Issues

You can have network interface card (NIC) compatibility or misconfiguration issues with the switch if you have any of these problems:

  • A server/client connection to the switch does not come up.

  • You have autonegotiation issues.

  • You see errors on the port.

Common Reasons/Solutions

The reason for these symptoms can be:

  • A known NIC driver issue

  • Speed-duplex mismatch

  • Autonegotiation problems

  • Cabling problems

In order to troubleshoot further, refer to Troubleshooting Cisco Catalyst Switches to NIC Compatibility Issues.
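
If a speed-duplex mismatch is suspected, one hedged approach is to hard-code the same speed and duplex on both the switch port and the NIC. The interface and values in this sketch are assumptions for illustration:

cat6knative(config)#interface fastethernet 3/1
cat6knative(config-if)#speed 100
cat6knative(config-if)#duplex full

!--- Configure the NIC on the attached device to the same fixed speed and duplex.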

Interface Is in errdisable Status

If the interface status is errdisable in the show interface status command output, the interface has been disabled because of an error condition. Here is an example of the interface in errdisable status:

cat6knative#show interfaces gigabitethernet 4/1 status 

Port    Name               Status       Vlan       Duplex  Speed Type
Gi4/1                      err-disabled 100          full   1000 1000BaseSX

Or, you can see messages similar to these if the interface has been disabled because of an error condition:

%SPANTREE-SP-2-BLOCK_BPDUGUARD: 
   Received BPDU on port GigabitEthernet4/1 with BPDU Guard enabled. Disabling port.
%PM-SP-4-ERR_DISABLE: 
   bpduguard error detected on Gi4/1, putting Gi4/1 in err-disable state

This example message displays when a bridge protocol data unit (BPDU) is received on a host port that has BPDU guard enabled. The actual message depends on the reason for the error condition.

There are various reasons for the interface to go into errdisable. The reason can be:

  • Duplex mismatch

  • Port channel misconfiguration

  • BPDU guard violation

  • UDLD condition

  • Late-collision detection

  • Link-flap detection

  • Security violation

  • Port Aggregation Protocol (PAgP) flap

  • Layer 2 protocol tunneling (l2ptguard) error

  • DHCP snooping rate-limit

In order to enable an errdisabled port, complete these steps:

  1. Unplug the cable from one end of the connection.

  2. Reconfigure the interfaces.

    For example, if the interfaces are in an errdisabled state due to an EtherChannel misconfiguration, reconfigure the interface ranges for the EtherChannel.

  3. Shut down the ports on both ends.

  4. Plug the cables back into both switches.

  5. Issue the no shutdown command on the interfaces.
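
As a hedged illustration of steps 3 and 5, this sketch shuts down and reenables the port after the root cause is addressed; the interface is taken from the earlier errdisable example and is only an illustration:

cat6knative(config)#interface gigabitethernet 4/1
cat6knative(config-if)#shutdown
cat6knative(config-if)#no shutdown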

You can also issue the errdisable recovery cause cause command, where cause is the specific error reason, in order to set up a timeout mechanism that automatically reenables the port after a configured timer period.

Note: The error condition reoccurs if you do not resolve the root cause of the issue.
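
For illustration, this sketch enables automatic recovery for ports that are errdisabled by BPDU guard and sets the recovery interval; the cause and the 300-second interval are example values only:

cat6knative(config)#errdisable recovery cause bpduguard
cat6knative(config)#errdisable recovery interval 300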

In order to determine the reason for the errdisable status, issue the show errdisable recovery command.

cat6knative#show errdisable recovery 
ErrDisable Reason    Timer Status
-----------------    --------------
udld                 Enabled
bpduguard            Enabled
security-violatio    Enabled
channel-misconfig    Enabled
pagp-flap            Enabled
dtp-flap             Enabled
link-flap            Enabled
l2ptguard            Enabled
psecure-violation    Enabled

Timer interval: 300 seconds

Interfaces that will be enabled at the next timeout:

Interface    Errdisable reason    Time left(sec)
---------    -----------------    --------------
 Gi4/1           bpduguard             270

After you know the cause of the errdisable state, troubleshoot the problem and fix the root cause of the issue. For example, your port can be in errdisable because of the receipt of a BPDU on a PortFast-enabled access port, as in the example. You can troubleshoot whether a switch was accidentally connected to that port or whether a hub was connected that created a looping condition. In order to troubleshoot other scenarios, refer to the specific feature information in the product documentation.

Refer to Errdisable Port State Recovery on the Cisco IOS Platforms for more comprehensive information on the errdisable status.

If you still have issues after you review and troubleshoot on the basis of this information, contact Cisco Technical Support for further assistance.

Troubleshoot Interface Errors

If you see errors in the show interface command output, check the state and health of the interface that encounters the problems. Also check whether traffic passes through the interface. Refer to Step 12 of Troubleshooting WS-X6348 Module Port Connectivity on a Catalyst 6500/6000 Running Cisco IOS System Software.

cat6knative#show interfaces gigabitethernet 1/1
GigabitEthernet1/1 is up, line protocol is up (connected)
  Hardware is C6k 1000Mb 802.3, address is 0001.6416.042a (bia 0001.6416.042a)
  Description: L2 FX Trunk to tpa_data_6513_01
  MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec, 
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Full-duplex mode, link type is autonegotiation, media type is SX
  output flow-control is unsupported, input flow-control is unsupported, 1000Mb/s
  Clock mode is auto
  input flow-control is off, output flow-control is off
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:00:01, output 00:00:28, output hang never
  Last clearing of "show interface" counters never
  Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue :0/40 (size/max)
  5 minute input rate 118000 bits/sec, 289 packets/sec
  5 minute output rate 0 bits/sec, 0 packets/sec
     461986872 packets input, 33320301551 bytes, 0 no buffer
     Received 461467631 broadcasts, 0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 137 overrun, 0 ignored
     0 input packets with dribble condition detected
     64429726 packets output, 4706228422 bytes, 0 underruns
     0 output errors, 0 collisions, 2 interface resets
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier
     0 output buffer failures, 0 output buffers swapped out
cat6knative#

Also, you may see errors in the show interfaces interface-id counters errors command output. If so, check for errors that are associated with the interface. Refer to Step 14 of Troubleshooting WS-X6348 Module Port Connectivity on a Catalyst 6500/6000 Running Cisco IOS System Software.

cat6knative#show interfaces gigabitethernet 3/1 counters errors 

Port        Align-Err    FCS-Err   Xmit-Err    Rcv-Err UnderSize OutDiscards
Gi3/1               0          0          0          0         0           0

Port      Single-Col Multi-Col  Late-Col Excess-Col Carri-Sen     Runts    Giants
Gi3/1              0         0         0          0         0         0         0

Port       SQETest-Err Deferred-Tx IntMacTx-Err IntMacRx-Err Symbol-Err
Gi3/1                0           0            0            0          0

Common Reasons/Solutions

  • The reasons that the interface shows errors can be:

    • Physical layer issues, such as a faulty cable or NIC

    • Configuration issues, such as a speed-duplex mismatch

    • Performance issues, such as oversubscription

    In order to understand and troubleshoot these issues, refer to Troubleshooting Switch Port and Interface Problems.

  • At times, error counters are incremented incorrectly because of a software bug or a hardware limitation. These are some of the known counter issues with the Catalyst 6500/6000 platform that runs Cisco IOS Software:

    Symptom: Giants on IEEE 802.1Q trunk interfaces on Supervisor Engine 720-based switches.
    Description: A Catalyst 6500 series switch may report giants for packet sizes that are above 1496 bytes and are received tagged on a trunk over the Supervisor Engine 720 ports. You can also see this issue on 67xx line cards. The issue is cosmetic, and the switch forwards the packets. The issue also occurs with Inter-Switch Link (ISL) trunks. Refer to Cisco bug IDs CSCec62587 (registered customers only) and CSCed42859 (registered customers only) for details.
    Fix: Cisco IOS Software Release 12.2(17b)SXA and later; Cisco IOS Software Release 12.2(18)SXD and later.

    Symptom: Giants on 802.1Q trunk interfaces on Supervisor Engine 2-based switches.
    Description: The switch counts packets that fall in the range of 1497 to 1500 bytes on a non-native VLAN on the 802.1Q trunk port as giants. This is a cosmetic issue, and the switch forwards the packets. Refer to Cisco bug ID CSCdw04642 (registered customers only) for details.
    Fix: Not currently available.

    Symptom: Excessive output drop counters in the show interface command output on Gigabit interfaces, even during low traffic conditions.
    Description: Excessive output drop counters are seen in the show interface command output on Gigabit interfaces, even during low traffic conditions. Refer to Cisco bug ID CSCdv86024 (registered customers only) for details.
    Fix: Cisco IOS Software Release 12.1(8b)E12 and later; Cisco IOS Software Release 12.1(11b)E8 and later; Cisco IOS Software Release 12.1(12c)E1 and later; Cisco IOS Software Release 12.1(13)E1 and later.

    Symptom: The port channel interface has incorrect statistics in the output of the show interface command for the bits per second (bps) and packets per second (pps) rates.
    Description: When you use Cisco IOS Software and a port channel is defined on two Fast Ethernet ports, and traffic is generated through the port channel, the physical interfaces have the correct rate statistics. However, the port channel interface has incorrect statistics. Refer to Cisco bug ID CSCdw23826 (registered customers only) for details.
    Fix: Cisco IOS Software Release 12.1(8a)EX; Cisco IOS Software Release 12.1(11b)E1; Cisco IOS Software Release 12.1(13)E1.

If you still have issues after you review and troubleshoot on the basis of the documents that this section mentions, contact Cisco Technical Support for further assistance.

You Receive %PM_SCP-SP-3-GBIC_BAD: GBIC integrity check on port x failed: bad key Error Messages

GBICs that work in software releases that are earlier than Cisco IOS Software Release 12.1(13)E fail after you upgrade.

With Cisco IOS Software Release 12.1(13)E system software, ports with GBICs that have a bad GBIC EEPROM checksum are not allowed to come up. This is the expected behavior for 1000BASE-TX (copper) and Coarse Wavelength Division Multiplexing (CWDM) GBICs. However, the behavior is incorrect for other GBICs. With earlier releases, ports with other GBICs that had checksum errors were allowed to come up.

This error message is printed when this error occurs in Cisco IOS Software Release 12.1(13)E:

%PM_SCP-SP-3-GBIC_BAD: GBIC integrity check on port 1/2 failed: bad key

Issue the show interface command in order to display this output:

Router#show interface status

Port    Name               Status       Vlan       Duplex  Speed Type
Gi2/1                      faulty       routed       full   1000 bad EEPROM

This problem is fixed in Cisco IOS Software Releases 12.1(13)E1, 12.1(14)E, and later releases.

For further details about this issue, refer to Field Notice: GBIC EEPROM Errors Incorrect in Cisco IOS® Software Release 12.1(13)E for the Catalyst 6000.

You Get COIL Error Messages on WS-X6x48 Module Interfaces

You may see one or more of these error messages in the syslogs or show log command output:

  • Coil Pinnacle Header Checksum

  • Coil Mdtif State Machine Error

  • Coil Mdtif Packet CRC Error

  • Coil Pb Rx Underflow Error

  • Coil Pb Rx Parity Error

If you have connectivity issues with the hosts that connect to the WS-X6348 module or other 10/100 modules, or if you see error messages similar to the ones listed in this section and a group of 12 ports is stuck and does not pass traffic, perform these steps:

  1. Disable and enable the interfaces.

  2. Issue the hw-module module module_# reset command in order to soft reset the module (see the example after these steps).

  3. Perform one of these actions in order to hard reset the module:

    • Physically reseat the card.

    • Issue the no power enable module module_# global configuration command and the power enable module module_# global configuration command.
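
As a hedged illustration of steps 2 and 3, these commands soft reset and then power cycle module 4; the module number is an assumption for this example:

cat6knative#hw-module module 4 reset

!--- Or, power cycle the module in order to hard reset it.

cat6knative#configure terminal
cat6knative(config)#no power enable module 4
cat6knative(config)#power enable module 4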

After you perform these steps, contact Cisco Technical Support with the information if you encounter one or more of these issues:

  • The module does not come on line.

  • The module comes on line, but a group of 12 interfaces fails diagnostics.

    You can see this in the output from the show diagnostic module module_# command.

  • The module is stuck in the other state when you boot.

  • All port LEDs on the module become amber.

  • All interfaces are in the errdisabled state.

    You can see this when you issue the show interfaces status module module_# command.

Refer to Troubleshooting WS-X6348 Module Port Connectivity on a Catalyst 6500/6000 Running Cisco IOS System Software for detailed troubleshooting.

Troubleshoot WS-X6x48 Module Connectivity Problems

If you have connectivity issues with the connection of the hosts on the WS-X6348 module or other 10/100 modules, refer to Troubleshooting WS-X6348 Module Port Connectivity on a Catalyst 6500/6000 Running Cisco IOS System Software for detailed troubleshooting.

If you still have issues after you review and troubleshoot on the basis of the document Troubleshooting WS-X6348 Module Port Connectivity on a Catalyst 6500/6000 Running Cisco IOS System Software, contact Cisco Technical Support for further assistance.

Troubleshoot STP Issues

Spanning tree-related issues can cause connectivity problems in a switched network. For step-by-step troubleshooting and guidelines to prevent spanning-tree issues, refer to Troubleshooting STP on Catalyst Switch Running Cisco IOS System Software.

Unable to Use Telnet Command to Connect to Switch

Cause

Like every Cisco IOS device, the Catalyst 6500 switch also allows only a limited number of Telnet sessions. If you reach this limit, the switch does not allow further vty sessions. In order to verify if you run into this problem, connect to the console of the Supervisor Engine. Issue the show user command. The command-line interface (CLI) output from this command shows how many lines are currently occupied:

Cat6500#show user
Line     User    Host(s)      Idle     Location
0 con 0         10.48.72.118 00:00:00 
1 vty 0         10.48.72.118 00:00:00 10.48.72.118
2 vty 1         10.48.72.118 00:00:00 10.48.72.118
3 vty 2         10.48.72.118 00:00:00 10.48.72.118
4 vty 3         10.48.72.118 00:00:00 10.48.72.118
*5 vty 4         idle         00:00:00 10.48.72.118

Solutions

Complete these steps:

  1. Based on the output of the show user command, issue the clear line line_number command in order to clear obsolete sessions.

    Cat6500#show user
    Line     User    Host(s)      Idle     Location
    0 con 0         10.48.72.118 00:00:00 
    1 vty 0         10.48.72.118 00:00:00 10.48.72.118
    2 vty 1         10.48.72.118 00:00:00 10.48.72.118
    3 vty 2         10.48.72.118 00:00:00 10.48.72.118
    4 vty 3         10.48.72.118 00:00:00 10.48.72.118
    *5 vty 4         idle         00:00:00 10.48.72.118
    
    Cat6500#clear line 1
    
    Cat6500#clear line 2
    
    
    !--- Output suppressed.
    
    
  2. Configure idle timeout for the vty sessions and console line in order to clear any inactive sessions. This example shows the configuration to use in order to set the idle timeout to 10 minutes:

    Cat6500#configure terminal
    Enter configuration commands, one per line.  End with CNTL/Z.
    Cat6500(config)#line vty 0 4
    
    Cat6500(config-line)#exec-timeout ?
      <0-35791>  Timeout in minutes
    Cat6500(config-line)#exec-timeout 10 ?
      <0-2147483>  Timeout in seconds
      <cr>
    Cat6500(config-line)#exec-timeout 10 0
    
    Cat6500(config-line)#exit
    Cat6500(config)#line con 0
    
    Cat6500(config-line)#exec-timeout 10 0
    
    Cat6500(config-line)#exit
    Cat6500(config)#
  3. You can also raise the number of available vty sessions. For example, use the line vty 0 6 command instead of the line vty 0 4 command.

In some cases, the show user command output can show no active vty sessions, but a connection to the switch with the telnet command still fails with this error message:

% telnet connections not permitted from this terminal

In this case, verify that you have correctly configured the vty lines. Issue the transport input all command in order to allow the vty lines to accept all transport protocols.
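
For example, this sketch applies the command to the vty lines shown earlier; adjust the line range to match your configuration:

Cat6500(config)#line vty 0 4
Cat6500(config-line)#transport input all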

Unable to Console into the Standby Unit with RADIUS Authentication

Problem

The Catalyst 6500 switches operate as a Virtual Switching System (VSS) pair. When you try to log in on the console of the standby switch, authentication fails with this RADIUS log message:

%RADIUS-4-RADIUS_DEAD: RADIUS server 10.50.245.20:1812,1813 is not responding.

Authentication through Telnet to the standby supervisor works fine, and the console login on the active supervisor also works fine. The problem occurs only with the connection to the console of the standby supervisor.

Solution

RADIUS authentication on the console of the standby unit is not possible because the standby supervisor does not have IP connectivity for AAA authentication. You need to use a fallback option, such as the local database.
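
A minimal sketch of such a fallback, which assumes a hypothetical local user named admin, adds the local database as a backup authentication method; adapt the method list to your existing AAA configuration:

Cat6500(config)#username admin privilege 15 secret <password>
Cat6500(config)#aaa authentication login default group radius local

!--- The local keyword lets the login fall back to the local database
!--- when the RADIUS server is unreachable, as it is from the standby console.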

Giant Packet Counters on VSL Interfaces

Sometimes giant packet counters on VSL interfaces increment even if no giant data packets are sent through the system.

Packets that traverse the VSL interfaces carry a 32-byte VSL header in addition to the normal MAC header. Ideally, this header would be excluded from packet-size classification, but the port ASIC includes it. As a result, control packets that are close to the 1518-byte limit for regular-sized packets can be classified as giant packets.

Currently, there are no workarounds for this issue.

Multiple VLANs Appear on the Switch

You can see multiple VLANs on the switch that were not there before. For example:

Vlan982         unassigned      YES unset  administratively down down    
Vlan983         unassigned      YES unset  administratively down down    
Vlan984         unassigned      YES unset  administratively down down    
Vlan985         unassigned      YES unset  administratively down down    
Vlan986         unassigned      YES unset  administratively down down    
Vlan987         unassigned      YES unset  administratively down down    
Vlan988         unassigned      YES unset  administratively down down    
Vlan989         unassigned      YES unset  administratively down down    
Vlan990         unassigned      YES unset  administratively down down    
Vlan991         unassigned      YES unset  administratively down down    
Vlan992         unassigned      YES unset  administratively down down    
Vlan993         unassigned      YES unset  administratively down down    
Vlan994         unassigned      YES unset  administratively down down    
Vlan995         unassigned      YES unset  administratively down down    
Vlan996         unassigned      YES unset  administratively down down    
Vlan997         unassigned      YES unset  administratively down down    
Vlan998         unassigned      YES unset  administratively down down    
Vlan999         unassigned      YES unset  administratively down down    
Vlan1000        unassigned      YES unset  administratively down down    
Vlan1001        unassigned      YES unset  administratively down down    
Vlan1002        unassigned      YES unset  administratively down down    
Vlan1003        unassigned      YES unset  administratively down down    
Vlan1004        unassigned      YES unset  administratively down down    
Vlan1005        unassigned      YES unset  administratively down down

This behavior occurs when a vlan filter command, such as vlan filter Traffic-Capture vlan-list 1 - 700, is added to the configuration. Any VLANs in the specified list that are not already configured are created as Layer 3 VLAN interfaces.

Power Supply and Fan Problems

Power Supply INPUT OK LED Does Not Light Up

If the power supply INPUT OK LED does not light up after you turn on the power switch, issue the show power status all command. Look for the status of the power supply, as this example shows:

cat6knative#show power status all           
                        Power-Capacity PS-Fan Output Oper
PS   Type               Watts   A @42V Status Status State
---- ------------------ ------- ------ ------ ------ -----
1    WS-CAC-2500W       2331.00 55.50  OK     OK     on 
2    none
                        Pwr-Requested  Pwr-Allocated  Admin Oper
Slot Card-Type          Watts   A @42V Watts   A @42V State State
---- ------------------ ------- ------ ------- ------ ----- -----
1    WS-X6K-S2U-MSFC2    142.38  3.39   142.38  3.39  on    on
2    WSSUP1A-2GE         142.38  3.39   142.38  3.39  on    on
3    WS-X6516-GBIC       231.00  5.50   231.00  5.50  on    on
4    WS-X6516-GBIC       231.00  5.50   231.00  5.50  on    on
5    WS-X6500-SFM2       129.78  3.09   129.78  3.09  on    on
6    WS-X6502-10GE       226.80  5.40   226.80  5.40  on    on
cat6knative#

If the status is not OK, as in this example, follow the steps indicated in the Troubleshooting the Power Supply section of the document Troubleshooting (Catalyst 6500 series switches) in order to troubleshoot further.

Troubleshoot C6KPWR-4-POWRDENIED: insufficient power, module in slot [dec] power denied or %C6KPWR-SP-4-POWRDENIED: insufficient power, module in slot [dec] power denied Error Messages

If you get this message in the log, the message indicates that there is not enough power to turn on the module. The [dec] in the message indicates the slot number:

%OIR-SP-6-REMCARD: Card removed from slot 9, interfaces disabled
%C6KPWR-4-POWRDENIED: insufficient power, module in slot 9 power denied
%C6KPWR-SP-4-POWRDENIED: insufficient power, module in slot 9 power denied

Issue the show power command in order to find the mode of power supply redundancy.

cat6knative#show power
system power redundancy mode = redundant
system power total = 27.460A
system power used = 25.430A
system power available = 2.030A
FRU-type       #    current   admin state oper
power-supply   1    27.460A   on          on
power-supply   2    27.460A   on          on
module         1    3.390A    on          on
module         2    3.390A    on          on
module         3    5.500A    on          on
module         5    3.090A    on          on
module         7    5.030A    on          on
module         8    5.030A    on          on
module         9    5.030A    on          off (FRU-power denied).

This output shows you that the power supply mode is redundant and that one power supply is not enough to power the whole chassis. You can choose one of these two options:

  • Get a higher-wattage power supply.

    For example, if the current power supply is 1300W AC, get a 2500W AC or 4000W AC power supply.

  • Make the power supply redundancy mode combined.

    Here is an example:

    cat6knative(config)#power redundancy-mode combined 
    cat6knative(config)#
     %C6KPWR-SP-4-PSCOMBINEDMODE: power supplies set to combined mode. 
    

In the combined mode, both power supplies provide power. However, in this mode, if one power supply fails, you lose power to the module again because the power supply that remains cannot supply power to the whole chassis.

Therefore, the better option is to use a higher-wattage power supply.

Power that is reserved for an empty slot cannot be reallocated. If, for example, slot 6 is empty and slot 2 has only 68 watts available, you cannot reallocate the 282 watts that are reserved for slot 6 to slot 2 in order to have more wattage available for slot 2.

Each slot has its own power allocation, and, if the slot is not in use, the power cannot be reallocated to a different slot. There is no command to disable the reserved power for an empty slot.

Note: In order to use the full power capacity of the power supplies, make sure that the switch is connected to a 220 VAC source instead of a 110 VAC source (if the power supply supports 220 VAC).

For more information about power management, refer to Power Management for Catalyst 6000 Series Switches.

FAN LED Is Red or Shows failed in the show environment status Command Output

If you issue the show environment status command and see that the fan assembly has failed, follow the steps in the Troubleshooting the Fan Assembly section of the document Troubleshooting (Catalyst 6500 series switches) in order to identify the problem.

Here is an example:

cat6knative#show environment status                              
backplane: 
  operating clock count: 2
  operating VTT count: 3
fan-tray 1: 
  fan-tray 1 fan-fail: failed

!--- Output suppressed.

"Diagnostic level complete" causes a crash on 6500

This issue is seen on older Cisco IOS Software Version 12.1 releases, which have reached End of Support (EoS)/End of Life (EoL). In order to resolve this issue, set the diagnostic bootup level back to the default of minimal, or upgrade the Cisco IOS Software on the device to the latest version.
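
A hedged example of the first option sets the bootup diagnostic level back to the default of minimal:

Cat6500(config)#diagnostic bootup level minimal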
