Information About Troubleshooting the Software Configuration
Software Failure on a Switch
Switch software can be corrupted during an upgrade, by downloading the incorrect file to the switch, or by deleting the image file. In all of these cases, the switch has no connectivity. Follow the steps described in the Recovering from a Software Failure section to recover from a software failure.
Lost or Forgotten Password on a Device
The default configuration for the device allows an end user with physical access to the device to recover from a lost password by interrupting the boot process during power-on and by entering a new password. These recovery procedures require that you have physical access to the device.
Note |
On these devices, a system administrator can disable some of the functionality of this feature by allowing an end user to reset a password only by agreeing to return to the default configuration. If you are an end user trying to reset a password when password recovery has been disabled, a status message reminds you to return to the default configuration during the recovery process. |
Note |
You cannot recover the encryption password key when the Cisco WLC configuration is copied from one Cisco WLC to another (in case of an RMA). |
Follow the steps described in the section Recovering from a Lost or Forgotten Password to recover from a lost or forgotten password.
Power over Ethernet Ports
A Power over Ethernet (PoE) switch port automatically supplies power to one of these connected devices if the switch detects that there is no power on the circuit:
-
a Cisco pre-standard powered device (such as a Cisco IP Phone or a Cisco Aironet Access Point)
-
an IEEE 802.3af-compliant powered device
-
an IEEE 802.3at-compliant powered device
A powered device can receive redundant power when it is connected to a PoE switch port and to an AC power source. The device does not receive redundant power when it is only connected to the PoE port.
After the switch detects a powered device, the switch determines the device power requirements and then grants or denies power to the device. The switch can also detect the real-time power consumption of the device by monitoring and policing the power usage.
For more information, see the "Configuring PoE" chapter in the Interface and Hardware Component Configuration Guide (Catalyst 9400 Switches).
Refer to the section Scenarios to Troubleshoot Power over Ethernet (PoE) for various PoE troubleshooting scenarios.
Disabled Port Caused by Power Loss
If a powered device (such as a Cisco IP Phone 7910) that is connected to a PoE device port and powered by an AC power source loses power from the AC power source, the device might enter an error-disabled state. To recover from an error-disabled state, enter the shutdown interface configuration command, and then enter the no shutdown interface configuration command. You can also configure automatic recovery on the device to recover from the error-disabled state.
On a device, the errdisable recovery cause loopback and the errdisable recovery interval seconds global configuration commands automatically take the interface out of the error-disabled state after the specified period of time.
Disabled Port Caused by False Link-Up
If a Cisco powered device is connected to a port and you configure the port by using the power inline never interface configuration command, a false link-up can occur, placing the port into an error-disabled state. To take the port out of the error-disabled state, enter the shutdown and the no shutdown interface configuration commands.
You should not connect a Cisco powered device to a port that has been configured with the power inline never command.
Ping
The device supports IP ping, which you can use to test connectivity to remote hosts. Ping sends an echo request packet to an address and waits for a reply. Ping returns one of these responses:
-
Normal response—The normal response (hostname is alive) occurs in 1 to 10 seconds, depending on network traffic.
-
Destination does not respond—If the host does not respond, a no-answer message is returned.
-
Unknown host—If the host does not exist, an unknown host message is returned.
-
Destination unreachable—If the default gateway cannot reach the specified network, a destination-unreachable message is returned.
-
Network or host unreachable—If there is no entry in the route table for the host or network, a network or host unreachable message is returned.
Refer to the section Executing Ping to understand how ping works.
Layer 2 Traceroute
The Layer 2 traceroute feature allows the switch to identify the physical path that a packet takes from a source device to a destination device. Layer 2 traceroute supports only unicast source and destination MAC addresses. Traceroute finds the path by using the MAC address tables of the devices in the path. When the Device detects a device in the path that does not support Layer 2 traceroute, the Device continues to send Layer 2 trace queries and lets them time out.
The Device can only identify the path from the source device to the destination device. It cannot identify the path that a packet takes from source host to the source device or from the destination device to the destination host.
Layer 2 Traceroute Guidelines
-
Cisco Discovery Protocol (CDP) must be enabled on all the devices in the network. For Layer 2 traceroute to function properly, do not disable CDP.
If any devices in the physical path are transparent to CDP, the switch cannot identify the path through these devices.
-
A device is reachable from another device when you can test connectivity by using the ping privileged EXEC command. All devices in the physical path must be reachable from each other.
-
The maximum number of hops identified in the path is ten.
-
You can enter the traceroute mac or the traceroute mac ip privileged EXEC command on a device that is not in the physical path from the source device to the destination device. All devices in the path must be reachable from this switch.
-
The traceroute mac command output shows the Layer 2 path only when the specified source and destination MAC addresses belong to the same VLAN. If you specify source and destination MAC addresses that belong to different VLANs, the Layer 2 path is not identified, and an error message appears.
-
If you specify a multicast source or destination MAC address, the path is not identified, and an error message appears.
-
If the source or destination MAC address belongs to multiple VLANs, you must specify the VLAN to which both the source and destination MAC addresses belong. If the VLAN is not specified, the path is not identified, and an error message appears.
-
The traceroute mac ip command output shows the Layer 2 path when the specified source and destination IP addresses belong to the same subnet. When you specify the IP addresses, the device uses the Address Resolution Protocol (ARP) to associate the IP addresses with the corresponding MAC addresses and the VLAN IDs.
-
If an ARP entry exists for the specified IP address, the device uses the associated MAC address and identifies the physical path.
-
If an ARP entry does not exist, the device sends an ARP query and tries to resolve the IP address. If the IP address is not resolved, the path is not identified, and an error message appears.
-
When multiple devices are attached to one port through hubs (for example, multiple CDP neighbors are detected on a port), the Layer 2 traceroute feature is not supported. When more than one CDP neighbor is detected on a port, the Layer 2 path is not identified, and an error message appears.
-
This feature is not supported in Token Ring VLANs.
-
Layer 2 traceroute opens a listening socket on User Datagram Protocol (UDP) port 2228 that can be accessed remotely from any IPv4 address and does not require any authentication. This UDP socket allows VLAN information, links, the presence of particular MAC addresses, and CDP neighbor information to be read from the device. This information can eventually be used to build a complete picture of the Layer 2 network topology.
-
Layer 2 traceroute is enabled by default and can be disabled by running the no l2 traceroute command in global configuration mode. To re-enable Layer 2 traceroute, use the l2 traceroute command in global configuration mode.
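Two of the guidelines above can be checked offline before you run a trace: the MAC addresses must be unicast, and for traceroute mac ip the two IP addresses must belong to the same subnet. The following Python sketch illustrates both checks. The function names and the prefix-length parameter are illustrative assumptions and are not part of any Cisco tool; the switch performs its own validation and displays an error message when a check fails.

```python
import ipaddress


def is_multicast_mac(mac: str) -> bool:
    """Return True if the MAC address has the group (multicast) bit set."""
    # Accept Cisco dotted (0000.0201.0601), colon, or hyphen notation.
    digits = mac.replace(".", "").replace(":", "").replace("-", "")
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    first_octet = int(digits[:2], 16)
    # The least-significant bit of the first octet marks group addresses,
    # which includes the broadcast address.
    return bool(first_octet & 0x01)


def same_subnet(src_ip: str, dst_ip: str, prefix_len: int) -> bool:
    """Return True if both IPv4 addresses fall in the same subnet."""
    src_net = ipaddress.ip_interface(f"{src_ip}/{prefix_len}").network
    dst_net = ipaddress.ip_interface(f"{dst_ip}/{prefix_len}").network
    return src_net == dst_net
```

For example, is_multicast_mac("0100.5e00.0001") returns True, so that address would not be a valid source or destination for Layer 2 traceroute.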
IP Traceroute
You can use IP traceroute to identify the path that packets take through the network on a hop-by-hop basis. The command output displays all network layer (Layer 3) devices, such as routers, that the traffic passes through on the way to the destination.
Your Device can participate as the source or destination of the traceroute privileged EXEC command and might or might not appear as a hop in the traceroute command output. If the Device is the destination of the traceroute, it is displayed as the final destination in the traceroute output. Intermediate devices do not show up in the traceroute output if they are only bridging the packet from one port to another within the same VLAN. However, if the intermediate Device is a multilayer Device that is routing a particular packet, this device shows up as a hop in the traceroute output.
The traceroute privileged EXEC command uses the Time To Live (TTL) field in the IP header to cause routers and servers to generate specific return messages. Traceroute starts by sending a User Datagram Protocol (UDP) datagram to the destination host with the TTL field set to 1. If a router finds a TTL value of 1 or 0, it drops the datagram and sends an Internet Control Message Protocol (ICMP) time-to-live-exceeded message to the sender. Traceroute finds the address of the first hop by examining the source address field of the ICMP time-to-live-exceeded message.
To identify the next hop, traceroute sends a UDP packet with a TTL value of 2. The first router decrements the TTL field by 1 and sends the datagram to the next router. The second router sees a TTL value of 1, discards the datagram, and returns the time-to-live-exceeded message to the source. This process continues until the TTL is incremented to a value large enough for the datagram to reach the destination host (or until the maximum TTL is reached).
To learn when a datagram reaches its destination, traceroute sets the UDP destination port number in the datagram to a very large value that the destination host is unlikely to be using. When a host receives a datagram destined to itself containing a destination port number that is unused locally, it sends an ICMP port-unreachable error to the source. Because all errors except port-unreachable errors come from intermediate hops, the receipt of a port-unreachable error means that this message was sent by the destination host.
See Example: Performing a Traceroute to an IP Host for an example of the IP traceroute process.
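The TTL mechanism described above can be illustrated with a small simulation. This Python sketch does not send any packets; it models a fixed list of routers and shows which node would answer each probe and with which ICMP message. The function name, the path representation, and the default maximum TTL of 30 are assumptions made for illustration.

```python
def trace_path(routers, destination, max_ttl=30):
    """Simulate TTL-based hop discovery along a fixed path.

    routers: ordered list of router addresses between source and destination.
    Returns a list of (ttl, responder, icmp_message) tuples.
    """
    hops = []
    for ttl in range(1, max_ttl + 1):
        remaining = ttl
        responder = None
        for router in routers:
            # A router that receives a TTL of 1 (or 0) drops the datagram
            # and reports time-to-live exceeded to the sender.
            if remaining <= 1:
                responder = router
                break
            remaining -= 1  # otherwise decrement the TTL and forward
        if responder is not None:
            hops.append((ttl, responder, "time-exceeded"))
            continue
        # The datagram reached the destination host, where the chosen
        # UDP port is unused, so the host answers port-unreachable.
        hops.append((ttl, destination, "port-unreachable"))
        break
    return hops
```

With two routers in the path, the third probe is the first one to reach the destination, and the simulated trace ends there. If max_ttl is smaller than the number of routers, the trace stops without ever reaching the destination.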
Time Domain Reflector Guidelines
You can use the Time Domain Reflector (TDR) feature to diagnose and resolve cabling problems. When running TDR, a local device sends a signal through a cable and compares the reflected signal to the initial signal. TDR can detect these cabling problems:
-
Open, broken, or cut twisted-pair wires—The wires are not connected to the wires from the remote device.
-
Shorted twisted-pair wires—The wires are touching each other or the wires from the remote device. For example, a shorted twisted pair can occur if one wire of the twisted pair is soldered to the other wire.
If one of the twisted-pair wires is open, TDR can find the length at which the wire is open.
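As a rough illustration of how a reflection delay maps to a cable length, the general time-domain reflectometry relationship is distance = (propagation speed x round-trip time) / 2. The Python sketch below applies that relationship. It is not a description of the switch's internal algorithm, and the default velocity factor of 0.65 is an assumed, illustrative value; the real figure depends on the cable.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458


def fault_distance_m(round_trip_s: float, velocity_factor: float = 0.65) -> float:
    """Estimate the distance to a cable fault from the reflection delay.

    round_trip_s: time between sending the pulse and seeing its reflection.
    velocity_factor: signal speed in the cable as a fraction of the speed
        of light (assumed value; check the cable datasheet).
    """
    if round_trip_s < 0:
        raise ValueError("round_trip_s must not be negative")
    if not 0 < velocity_factor <= 1:
        raise ValueError("velocity_factor must be in (0, 1]")
    # The pulse travels to the fault and back, so halve the round trip.
    return SPEED_OF_LIGHT_M_PER_S * velocity_factor * round_trip_s / 2
```

Under these assumptions, a reflection that arrives 200 nanoseconds after the pulse corresponds to a fault roughly 19.5 meters along the cable.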
Use TDR to diagnose and resolve cabling problems in these situations:
-
Replacing a device
-
Setting up a wiring closet
-
Troubleshooting a connection between two devices when a link cannot be established or when it is not operating properly
When you run TDR, the device reports accurate information in these situations:
-
The cable for the gigabit link is a solid-core cable.
-
The open-ended cable is not terminated.
When you run TDR, the device does not report accurate information in these situations:
-
The cable for the gigabit link is a twisted-pair cable or is in series with a solid-core cable.
-
The link is a 10-megabit or a 100-megabit link.
-
The cable is a stranded cable.
-
The link partner is a Cisco IP Phone.
-
The link partner is not IEEE 802.3 compliant.
See Running TDR and Displaying the Results for the TDR commands.
Debug Commands
Caution |
Because debugging output is assigned high priority in the CPU process, it can render the system unusable. For this reason, use debug commands only to troubleshoot specific problems or during troubleshooting sessions with Cisco technical support staff. It is best to use debug commands during periods of lower network traffic and fewer users. Debugging during these periods decreases the likelihood that increased debug command processing overhead will affect system use. |
All debug commands are entered in privileged EXEC mode, and most debug commands take no arguments.
System Report
System reports or crashinfo files save information that helps Cisco technical support representatives to debug problems that caused the Cisco IOS image to fail (crash). It is necessary to quickly and reliably collect critical crash information with high fidelity and integrity. Further, it is necessary to collect this information and bundle it in a way that it can be associated or identified with a specific crash occurrence.
System reports are generated in these situations:
-
In case of a switch failure—A system report is generated on the switch that failed.
-
In case of a switchover—System reports are generated only on high availability (HA) member switches. Reports are not generated for non-HA members.
The system does not generate reports in case of a reload.
During a process crash, the following is collected locally from the switch:
-
Full process core
-
Tracelogs
-
IOS syslogs (not guaranteed in case of non-active crashes)
-
System process information
-
Bootup logs
-
Reload logs
-
Certain types of /proc information
This information is stored in separate files, which are then archived and compressed into one bundle. This makes it convenient to get a crash snapshot in one place, and the bundle can then be moved off the switch for analysis. The report is generated before the switch goes down to ROMMON/bootloader.
Except for the full core and tracelogs, everything else is a text file.
Use the request platform software process core fed active command to generate the core dump.
h2-macallan1# request platform software process core fed active
Process : fed main event (28155) encountered fatal signal 6
Process : fed main event stack :
SUCCESS: Core file generated.
h2-macallan1#dir bootflash:core
Directory of bootflash:/core/
178483 -rw- 1 May 23 2017 06:05:17 +00:00 .callhome
194710 drwx 4096 Aug 16 2017 19:42:33 +00:00 modules
178494 -rw- 10829893 Aug 23 2017 09:46:23 +00:00 h2-macallan1_RP_0_fed_28155_20170823-094616-UTC.core.gz
Crashinfo Files
By default, the system report file is generated and saved in the /crashinfo directory. If it cannot be saved to the crashinfo partition for lack of space, it is saved to the /flash directory.
To display the files, enter the dir crashinfo: command. The following is sample output of a crashinfo directory:
Switch#dir crashinfo:
Directory of crashinfo:/
23665 drwx 86016 Jun 9 2017 07:47:51 -07:00 tracelogs
11 -rw- 0 May 26 2017 15:32:44 -07:00 koops.dat
12 -rw- 4782675 May 29 2017 15:47:16 -07:00 system-report_1_20170529-154715-PDT.tar.gz
1651507200 bytes total (1519386624 bytes free)
System reports are located in the crashinfo directory in the following format:
system-report_[switch number]_[date]-[timestamp]-UTC.gz
After a switch crashes, check for a system report file. The name of the most recently generated system report file is stored in the last_systemreport file under the crashinfo directory. The system report and crashinfo files assist TAC while troubleshooting the issue.
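When many report files have been copied off several switches, it can help to extract the switch number and timestamp from each file name. The following Python sketch is one way to do that, based only on the name patterns that appear in this section: the .gz and .tar.gz variants, and the hostname-prefixed form used from Cisco IOS XE Amsterdam 17.3.x. The regular expression and function name are assumptions for illustration, and other releases may name files differently.

```python
import re
from datetime import datetime

# Matches, for example:
#   system-report_1_20170529-154715-PDT.tar.gz
#   system-report_1_20150909-092728-UTC.gz
#   HOSTNAME-system-report_20191021-200748-UTC.tar.gz
REPORT_NAME = re.compile(
    r"^(?:(?P<hostname>.+)-)?system-report_"
    r"(?:(?P<switch>\d+)_)?"
    r"(?P<date>\d{8})-(?P<time>\d{6})-(?P<tz>[A-Z]+)"
    r"\.(?:tar\.)?gz$"
)


def parse_report_name(name):
    """Split a system-report file name into its parts, or return None."""
    match = REPORT_NAME.match(name)
    if match is None:
        return None
    return {
        "hostname": match["hostname"],
        "switch": int(match["switch"]) if match["switch"] else None,
        # The time zone abbreviation is kept as text and not converted.
        "timestamp": datetime.strptime(
            match["date"] + match["time"], "%Y%m%d%H%M%S"
        ),
        "timezone": match["tz"],
    }
```

Files that do not follow these patterns, such as koops.dat or the -info.txt companion files, return None and can be skipped.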
The generated system report can be copied off the switch using TFTP, HTTP, and several other options.
Switch#copy crashinfo: ?
crashinfo: Copy to crashinfo: file system
flash: Copy to flash: file system
ftp: Copy to ftp: file system
http: Copy to http: file system
https: Copy to https: file system
null: Copy to null: file system
nvram: Copy to nvram: file system
rcp: Copy to rcp: file system
running-config Update (merge with) current system configuration
scp: Copy to scp: file system
startup-config Copy to startup configuration
syslog: Copy to syslog: file system
system: Copy to system: file system
tftp: Copy to tftp: file system
tmpsys: Copy to tmpsys: file system
The general syntax for copying to a TFTP server is as follows:
Switch#copy crashinfo: tftp:
Source filename [system-report_1_20150909-092728-UTC.gz]?
Address or name of remote host []? 1.1.1.1
Destination filename [system-report_1_20150909-092728-UTC.gz]?
The tracelogs can be collected by issuing the request platform software trace archive command. This command provides time period options. The command syntax is as follows:
Switch#request platform software trace archive ?
last Archive trace files of last x days
target Location and name for the archive file
Tracelogs stored in the crashinfo: or flash: directory from within the last 3650 days can be collected.
Switch# request platform software trace archive last ?
<1-3650> Number of days (1-3650)
Switch#request platform software trace archive last 3650 days target ?
crashinfo: Archive file name and location
flash: Archive file name and location
Note |
It is important to clear the system reports or trace archives from the flash or crashinfo directory once they are copied out, so that space remains available for tracelogs and other purposes. |
In a complex network, it is difficult to track the origin of a system-report file. This task is easier if the system-report files are uniquely identifiable. Starting with the Cisco IOS XE Amsterdam 17.3.x release, the hostname is prepended to the system-report file name, making the reports uniquely identifiable.
The following example displays system-report files with the hostname prepended:
HOSTNAME#dir flash:/core | grep HOSTNAME
40486 -rw- 108268293 Oct 21 2019 16:07:50 -04:00 HOSTNAME-system-report_20191021-200748-UTC.tar.gz
40487 -rw- 17523 Oct 21 2019 16:07:56 -04:00 HOSTNAME-system-report_20191021-200748-UTC-info.txt
40484 -rw- 48360998 Oct 21 2019 16:55:24 -04:00 HOSTNAME-system-report_20191021-205523-UTC.tar.gz
40488 -rw- 14073 Oct 21 2019 16:55:26 -04:00 HOSTNAME-system-report_20191021-205523-UTC-info.txt
Onboard Failure Logging on the Switch
You can use the onboard failure logging (OBFL) feature to collect information about the Device. The information includes uptime, temperature, and voltage information and helps Cisco technical support representatives to troubleshoot Device problems. We recommend that you keep OBFL enabled and do not erase the data stored in the flash memory.
By default, OBFL is enabled. It collects information about the Device and small form-factor pluggable (SFP) modules. The Device stores this information in the flash memory:
-
CLI commands—Record of the OBFL CLI commands that are entered on a standalone Device.
-
Message—Record of the hardware-related system messages generated by a standalone Device.
-
Power over Ethernet (PoE)—Record of the power consumption of PoE ports on a standalone Device.
-
Temperature—Temperature of a standalone Device.
-
Uptime data—Time when a standalone Device starts, the reason the Device restarts, and the length of time the Device has been running since it last restarted.
-
Voltage—System voltages of a standalone Device.
You should manually set the system clock or configure it by using Network Time Protocol (NTP).
When the Device is running, you can retrieve the OBFL data by using the show logging onboard privileged EXEC commands. If the Device fails, contact your Cisco technical support representative to find out how to retrieve the data.
When an OBFL-enabled Device is restarted, there is a 10-minute delay before logging of new data begins.
Fan Failures
By default, the feature is disabled. When more than one of the fans fails in a field-replaceable unit (FRU) or in a power supply, the device does not shut down, and this error message appears:
WARNING:Fan PS1/0 in slot 1 has the error: Error Status,
Please replace it with a new fan.
The device might overheat and shut down.
When an individual fan fails, the following message appears:
The fan in slot PS17/1 is encountering a failure condition
The following message appears when the entire fan tray fails and the system shuts down:
Shutting down system now because the fans in slot PS17 have all failed.
To restart the device, it must be power cycled.
For more information on fan failures, refer to the Cisco Catalyst 9400 Series Switches Hardware Installation Guide.
Possible Symptoms of High CPU Utilization
Excessive CPU utilization might result in the following symptoms, although these symptoms might also result from other causes:
-
Spanning tree topology changes
-
EtherChannel links brought down due to loss of communication
-
Failure to respond to management requests (ICMP ping, SNMP timeouts, slow Telnet or SSH sessions)
-
UDLD flapping
-
IP SLAs failures because of SLAs responses beyond an acceptable threshold
-
DHCP or IEEE 802.1x failures if the switch does not forward or respond to requests