Troubleshooting System Management Issues
The system management features of the Cisco Nexus 5000 Series switch allow you to monitor and manage your network for efficient device use, role-based access control, SNMP communications, diagnostics, and logging.
This chapter describes how to identify and resolve problems that can occur with system management and the Cisco Nexus 5000 Series switch.
SNMP memory usage continuously increasing
The show proc mem | inc snmp command shows continuously increasing SNMP memory usage.
SNMP memory usage increases when SNMP requests are processed from different monitoring stations. Typically, this situation stabilizes over time. If the memory increases continuously without stabilizing, then some of the SNMP requests are causing a memory leak.
Review the output from the show system internal snmp mem-stats detail command.
Take example snapshots with the following commands while processing SNMP requests:
- show clock
- show system internal mem-stats detail
- show tech snmp
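The comparison of successive snapshots can be sketched as follows. This is a minimal illustration, not a Cisco tool: it assumes you have already copied the SNMP memory figure from each snapshot by hand into a list, oldest first, and the function name and the 1% tolerance are hypothetical choices.

```python
from typing import Sequence


def is_steadily_increasing(samples: Sequence[int], tolerance: float = 0.01) -> bool:
    """Return True if every sample exceeds the previous one by more than
    `tolerance` (a fraction of the previous value).

    `samples` holds the SNMP memory figure from successive snapshots,
    oldest first. The tolerance is an assumed value, not a Cisco threshold.
    """
    if len(samples) < 3:
        return False  # too few snapshots to call it a trend
    return all(
        later > earlier * (1 + tolerance)
        for earlier, later in zip(samples, samples[1:])
    )
```

Usage stabilizing over time returns False, which matches the expected behavior described above. Growth at every snapshot returns True and suggests collecting the show tech snmp output for further analysis.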
SNMP not responding
SNMP requests receive no response or a delayed response.
If the switch CPU utilization is high during SNMP operations such as GET, GETNEXT, and WALK, the response may be very slow, or there may be no response at all, which results in a time-out.
While SNMP is not responding, check CPU utilization with the following commands:
- show proc cpu history
- show proc cpu sort
The output from these commands shows which Nexus 5000 component is using the greatest amount of CPU resources.
SNMP not responding and show snmp command reports SNMP has timed out
SNMP is not responding and the show snmp command reports that SNMP has timed out.
The SNMP process might have exited, but the process did not crash.
Use the show system internal sysmgr service name snmpd command. The output should show the state as “SRV_STATE_HANDSHAKED”.
Service "snmpd" ("snmpd", 74):
UUID = 0x1A, PID = 4131, SAP = 28
State: SRV_STATE_HANDSHAKED (entered at time Mon Jun 14 17:12:15 2010).
Time of last restart: Mon Jun 14 17:12:14 2010.
The service never crashed since the last reboot.
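If the output is captured by a script, the state check can be automated. The sketch below assumes the output has the layout shown above. The function names are hypothetical, and the regular expression is an assumption based only on that sample.

```python
import re
from typing import Optional

# Matches the "State:" line of the sample output shown above.
STATE_PATTERN = re.compile(r"State:\s+(SRV_STATE_\w+)")


def service_state(output: str) -> Optional[str]:
    """Extract the service state name, or None if no State line is found."""
    match = STATE_PATTERN.search(output)
    return match.group(1) if match else None


def snmpd_is_healthy(output: str) -> bool:
    """True only when the reported state is SRV_STATE_HANDSHAKED."""
    return service_state(output) == "SRV_STATE_HANDSHAKED"
```

Any other state, or output with no State line, is treated as unhealthy.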
Not able to perform SNMP SET operation
The following error appears when trying to perform the SNMP SET operation:
bash-2.05b$ snmpset -v2c -c private 10.78.25.211 .1.3.6.1.4.1.9.9.305.1.1.6.0 i 1
The SNMP community does not have write permission.
Check the output of the show snmp community command to ensure that the write permission is enabled.
Community Group / Access context acl_filter
Only “network-admin” has write permissions.
snmpset -v2c -c public 10.78.25.211 .1.3.6.1.4.1.9.9.305.1.1.6.0 i 1
SNMP on BRIDGE-MIB
The SNMP GET on BRIDGE-MIB operation does not return correct values and results in errors.
The BRIDGE-MIB may not be supported.
Check the release notes to make sure that BRIDGE-MIB is supported on NX-OS Release 4.2(1) or later releases.
System is not responsive
System performance is significantly slower, or the system is unresponsive.
Some system resources may be over-utilized. For example, an incorrect logging level might generate many messages resulting in an impact on system resources.
Check the logging level on the chassis. If the logging level is set to a verbose value, such as 6 or 7, many messages are generated and performance can be impacted. Use the following commands to display the amount of resources that are being used.
- show proc cpu | inc syslogd
- show proc cpu
- show run | inc logging
- show system resources
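The logging-level review can be sketched as a small filter over the show run | inc logging output. This is an illustration only: it assumes each relevant line follows the logging level <feature> <log-level> form with a numeric level, and the function name and default threshold of 6 are hypothetical.

```python
from typing import Dict


def verbose_logging_features(config: str, threshold: int = 6) -> Dict[str, int]:
    """Return {feature: level} for 'logging level' lines at or above threshold.

    Assumes lines of the form 'logging level <feature> <log-level>' where
    the level is a number; lines in any other form are ignored.
    """
    flagged: Dict[str, int] = {}
    for line in config.splitlines():
        parts = line.split()
        if len(parts) < 4 or parts[:2] != ["logging", "level"]:
            continue
        if not parts[-1].isdigit():
            continue
        level = int(parts[-1])
        if level >= threshold:
            # Everything between the keyword pair and the level is the feature.
            flagged[" ".join(parts[2:-1])] = level
    return flagged
```

An empty result means no feature is logging at level 6 or 7, so the slowdown probably has another cause.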
Syslog server not getting messages from DUT
Although the syslog server is configured, the destination syslog server is not receiving messages from the device under test (DUT).
Syslog server might not be accessible or the logging level might not be appropriate.
- Check to see if the destination syslog server is accessible from VRF management. Use the ping <dest-ip> vrf management command to ping the server.
- Check that the syslog configuration on the DUT has use-vrf management.
logging server 10.193.12.1 5 use-vrf management
- Check that the appropriate logging level is enabled to send logging messages. Use the show logging info command. If the logging level is not appropriate, then set the appropriate level using the logging level <feature> <log-level> command.
High CPU Utilization
The CPU experiences brief periods of high utilization.
Brief high utilization is caused by CPU multitasking.
Spikes of high CPU utilization on the Cisco Nexus 5000/5500 switch are normal activity.
The show system resources command displays the high-level CPU utilization for the supervisor module. The show process cpu command with the sort option displays all of the processes sorted by the highest CPU utilization per process. The show process cpu history command displays the CPU history in three increments: 60 seconds, 60 minutes, and 72 hours. Viewing the CPU history is valuable when correlating a network event with the past CPU utilization.
Cisco NX-OS takes advantage of preemptive CPU multitasking, so processes can use an idle CPU to complete tasks faster. Therefore, the show process cpu history command might display CPU spikes that are not necessarily a problem. Additional investigation is required if the average CPU utilization remains close to 100%.
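The distinction between a brief spike and a sustained high average can be sketched as follows. The samples are assumed to be CPU percentages that you have collected yourself at regular intervals. The function name and the 90% cutoff for "close to 100%" are hypothetical, not Cisco thresholds.

```python
from statistics import mean
from typing import Sequence


def needs_investigation(cpu_percent: Sequence[float], threshold: float = 90.0) -> bool:
    """True when the average utilization across all samples is at or above
    `threshold`. Isolated spikes do not trigger this on their own, because
    they barely move the average.
    """
    if not cpu_percent:
        return False  # no data, nothing to conclude
    return mean(cpu_percent) >= threshold
```

A series such as 10, 12, 100, 11, 9 contains a spike but averages well below the cutoff, so it is treated as normal activity.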
switch# show processes cpu sort
PID    Runtime(ms)  Invoked   uSecs  1Sec    Process
-----  -----------  --------  -----  ------  -----------
 3611     57354660  30766347   1864    7.0%  statsclient
 4011    110298193  27004447   4084    5.3%  fcpc
 3561     96792384  87683659   1103    3.5%  gatosusd
 3685          862      8678     99    1.8%  netstack
    1        39116    447596     87    0.0%  init
switch# show processes cpu history
(ASCII graph of CPU utilization history; not reproduced here)
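When the show processes cpu sort output is captured to a file, the busiest processes can be extracted with a short parser. The sketch below assumes the six-column layout shown in the sample above (PID, Runtime(ms), Invoked, uSecs, 1Sec, Process). The class and function names are hypothetical.

```python
from typing import List, NamedTuple


class ProcessCpu(NamedTuple):
    pid: int
    name: str
    cpu_percent: float


def parse_cpu_sort(output: str) -> List[ProcessCpu]:
    """Parse rows of the six-column layout and return them sorted by
    the 1Sec CPU percentage, highest first. Header, separator, and
    prompt lines are skipped because they do not start with a numeric PID.
    """
    rows = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 6 or not fields[0].isdigit() or not fields[4].endswith("%"):
            continue
        rows.append(
            ProcessCpu(int(fields[0]), fields[5], float(fields[4].rstrip("%")))
        )
    return sorted(rows, key=lambda row: row.cpu_percent, reverse=True)
```

The first element of the returned list is the process to look at first. For the sample above, that is statsclient.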
Traps not received
The SNMP host does not receive traps.
The traps might not be enabled or the SNMP host might not be accessible.
The following are possible causes:
- Traps might not be enabled.
- The SNMP host might not be accessible.
- A firewall might be blocking access.
- An access list might be blocking UDP port 162.
Use the following commands to check whether the proper VRF is configured for the SNMP host and that the trap is enabled:
- snmp-server enable traps <trapname>
- snmp-server host <x.x.x.x> use-vrf <vrf-name>
where x.x.x.x is the IP address of the trap receiving device.
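A saved copy of the running configuration can be checked for both settings with a few lines of script. This is a rough sketch that only looks for the two command forms listed above. The function name is hypothetical, and a missing use-vrf setting is only a problem if the trap host is not reachable through the default VRF.

```python
from typing import List


def trap_config_gaps(running_config: str, host: str) -> List[str]:
    """Return a list of human-readable gaps in the trap configuration.

    Looks for at least one 'snmp-server enable traps ...' line and for a
    'snmp-server host <host> ...' line that carries a use-vrf setting.
    """
    lines = [line.split() for line in running_config.splitlines()]
    gaps = []
    if not any(p[:3] == ["snmp-server", "enable", "traps"] for p in lines):
        gaps.append("no 'snmp-server enable traps' line found")
    host_lines = [p for p in lines if p[:3] == ["snmp-server", "host", host]]
    if not host_lines:
        gaps.append(f"no 'snmp-server host {host}' line found")
    elif not any("use-vrf" in p for p in host_lines):
        gaps.append(f"host {host} has no use-vrf setting")
    return gaps
```

An empty list means both settings are present. It does not rule out a firewall or access list blocking UDP port 162.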
DNS resolution not working correctly
When specifying a host name using DNS or VRF, the host name is not resolved and an error occurs.
The DNS client is not configured correctly.
Use the following commands to configure the DNS client:
- config t
- vrf context management
- ip host <name> <address1> [<address2>... <address6>]
- ip domain-name <name> [use-vrf <vrf-name>]
- ip domain-list <name> [use-vrf <vrf-name>]
- ip name-server <server-address1> [<server-address2>... <server-address6>] [use-vrf <vrf-name>]
- ip domain lookup
- show hosts
- copy running-config startup-config
Specified domain not removed from domain-list
When using the no ip domain-list <name> command to remove a specified domain from the domain-list, only the most recently added domain is removed.
The no ip domain-list <name> command is not locating the specified domain.
There are two possible workarounds:
- To remove a domain that is not the most recently added one, use the no ip domain-list <name> command to temporarily remove every domain added after it, starting with the most recent, until you reach the desired domain. Remove the desired domain, and then add the temporarily removed domains back to the domain-list.
- Alternatively, copy the startup-config, delete the desired domain with a text editor, and then load the edited startup-config back onto the device.
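The first workaround can be expressed as a command plan. The sketch below assumes you know the order in which the domains were added (oldest first) and that each no ip domain-list command removes the most recently added entry, as described above. The function name and the example domains are hypothetical.

```python
from typing import List


def removal_plan(domains: List[str], target: str) -> List[str]:
    """Build the command sequence that removes `target` from the domain-list.

    `domains` is ordered oldest first. Domains added after the target are
    removed newest first, then the target itself, and finally the removed
    domains are added back in their original order.
    """
    if target not in domains:
        raise ValueError(f"{target} is not in the domain-list")
    newer = domains[domains.index(target) + 1:]
    commands = [f"no ip domain-list {name}" for name in reversed(newer)]
    commands.append(f"no ip domain-list {target}")
    commands += [f"ip domain-list {name}" for name in newer]
    return commands
```

For a list of a.example, b.example, and c.example with a.example as the target, the plan removes c.example, b.example, and a.example in that order, then re-adds b.example and c.example.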