General Troubleshooting

Guidelines for Troubleshooting

When you troubleshoot issues with Cisco UCS Central or a component that it manages, follow the guidelines listed in the following table.

Table 1. Troubleshooting Guidelines

Guideline

Description

Check the release notes to see if the issue is a known problem.

The release notes are available at: Cisco UCS Central Release Notes.

Take screenshots of the fault or error message dialog box, the FSM for the component, and other relevant areas.

These screenshots provide visual cues about the state of Cisco UCS Central when the problem occurred. If your computer does not have software to take screenshots, check the documentation for your operating system.

Record the steps that you took directly before the issue occurred.

If you have access to screen or keystroke recording software, repeat the steps you took and record what occurs in Cisco UCS Central.

If you do not have access to that type of software, repeat the steps you took and make detailed notes of each step and of what happens in Cisco UCS Central after it.

Create a technical support file.

The information about the current state of Cisco UCS Central and the Cisco UCS domains is helpful to Cisco support. It frequently provides the information needed to identify the source of the problem.

Technical Support Files

When you encounter an issue that requires troubleshooting, or you need to request assistance from the Cisco Technical Assistance Center (TAC), collect as much information as possible. Cisco UCS Central outputs this information into a tech support file that you can send to TAC.

The following describes how to generate technical support log files through the HTML5 GUI and through the CLI. This guide does not support versions of Cisco UCS Central with the FLEX GUI.

Creating a Technical Support File in the Cisco UCS Central CLI

Use the show tech-support command to output information about a Cisco UCS domain that you can send to Cisco Technical Assistance Center.

Procedure

  Command or Action Purpose
Step 1

UCS-A # connect local-mgmt {a | b}

Enters local management mode.

Step 2

UCS-A (local-mgmt) # show tech-support detail

Produces a detailed report (tgz file) that you can send to Cisco TAC to debug.

Step 3

UCS-A (local-mgmt) # copy volatile:/ <filename>.tar {scp | ftp | sftp | tftp }: user_name@IP_address | username’s password: password

Copies the output file to an external location.

The SCP and FTP commands require an absolute path for the target location. The path to your home directory cannot include special symbols, such as ‘~’.
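
The path rules above can be checked before you run the copy command. The following Python sketch is illustrative only: the function name validate_remote_path and the sample paths are hypothetical and are not part of any Cisco tool. It encodes only the two rules stated above (the path must be absolute and must not contain '~').

```python
def validate_remote_path(path: str) -> list[str]:
    """Return a list of problems with a remote target path (empty if none)."""
    problems = []
    # Rule 1: the target location must be an absolute path.
    if not path.startswith("/"):
        problems.append("path must be absolute (start with '/')")
    # Rule 2: home-directory shortcuts such as '~' are not allowed.
    if "~" in path:
        problems.append("path must not contain '~'")
    return problems


if __name__ == "__main__":
    # Hypothetical example paths, used only to exercise the check.
    for candidate in ("/home/admin/techsupport.tar", "~/techsupport.tar"):
        print(candidate, validate_remote_path(candidate) or "ok")
```

An empty list means the path satisfies both rules. Other special symbols may also be rejected by the copy command, but the source names only '~', so the sketch checks only that one.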

Creating a Tech Support File in the Cisco UCS Central GUI

The following steps describe how to generate a tech support file in the HTML5 GUI.

Procedure


Step 1

Click on the System Tools icon and choose Tech Support.

In v1.4, the System Tools icon is called the Operations icon.

Step 2

From the Domain list, click a domain or UCS Central.

Step 3

Click Generate Tech Support.

The Generate Tech Support dialog opens.
Step 4

Select Include System data such as policies and inventory.

Step 5

Click Yes.

Step 6

After Cisco UCS Central produces the report, select it.

Step 7

Click Download to download the report to your local system so you can email it to Cisco TAC.


Inventory Data Sync

When you register a Cisco UCS Manager domain to Cisco UCS Central, Cisco UCS Central performs a full inventory. After the initial inventory, Cisco UCS Central only performs a partial inventory, which consists of the delta between the previous inventory and the current one.
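
To make the idea of a partial (delta) inventory concrete, the following Python sketch compares two inventory snapshots. It is a conceptual illustration only and does not represent how Cisco UCS Central implements inventory processing. The snapshot layout (a dictionary keyed by component ID) and the name inventory_delta are assumptions made for the example.

```python
def inventory_delta(previous: dict, current: dict) -> dict:
    """Return what was added, removed, or changed between two snapshots.

    Both snapshots are assumed to map a component ID to its reported state.
    """
    added = {k: current[k] for k in current.keys() - previous.keys()}
    removed = sorted(previous.keys() - current.keys())
    changed = {
        k: current[k]
        for k in current.keys() & previous.keys()
        if current[k] != previous[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


if __name__ == "__main__":
    # Hypothetical component IDs and states.
    before = {"blade-1": "ok", "blade-2": "ok"}
    after = {"blade-2": "degraded", "blade-3": "ok"}
    print(inventory_delta(before, after))
```

Only the differences are reported, which is why a delta is much smaller than a full inventory when few components have changed.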

After an update, it's common to see an inventory out of sync. If inventory data is out of sync between Cisco UCS Manager and Cisco UCS Central, the status updates from Cisco UCS Manager do not display on Cisco UCS Central. On Cisco UCS Central, the inventory status displays as In Progress, but does not change to OK.

Note

Acceptable latency between Cisco UCS Manager and Cisco UCS Central is less than 300ms.
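
One rough way to compare a link against the 300 ms limit is to time a TCP connection to the HTTPS port. The following Python sketch is an approximation under stated assumptions: the source does not define how latency is measured, TCP connect time only approximates round-trip latency, and the function names are hypothetical.

```python
import socket
import time

LATENCY_LIMIT_MS = 300.0  # limit taken from the note above


def tcp_connect_ms(host: str, port: int = 443, timeout: float = 5.0) -> float:
    """Time a single TCP handshake to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # the connection is closed as soon as it is established
    return (time.perf_counter() - start) * 1000.0


def latency_acceptable(samples_ms: list[float],
                       limit_ms: float = LATENCY_LIMIT_MS) -> bool:
    """Return True if there is at least one sample and the worst is below the limit."""
    return bool(samples_ms) and max(samples_ms) < limit_ms
```

For example, collect several samples with tcp_connect_ms against the management address of the Cisco UCS Manager domain, then pass the list to latency_acceptable. Judging by the worst sample is a conservative choice made for this sketch.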


In the CLI, verify that the pmon state lists all of the Cisco UCS Central DME processes.

Procedure

  Command or Action Purpose
Step 1

UCSC# connect local-mgmt

Connects to local management.
Step 2

UCSC(local-mgmt)# show pmon state

Note 

If your Cisco UCS domain is running Cisco UCS Manager v2.2.3 or earlier, and you are using a WAN environment with low bandwidth and high latency, inventory processing may time out. To fix this issue, install the latest version of Cisco UCS Manager.

Step 3

Alternatively, in the UI, click the Alerts icon, then click Internal Services.

Refreshing the Inventory

Manually refresh the inventory for the specific Cisco UCS domain using the Cisco UCS Central CLI.

Procedure

  Command or Action Purpose
Step 1

connect resource-mgr

Connects to the resource manager.
Step 2

scope domain-mgmt

Connects to domain management.
Step 3

show ucs-domain

Displays all registered domains and their IDs.
Step 4

scope ucs-domain <domain-ID>

Connects to the chosen domain.
Step 5

refresh-inventory

Refreshes the inventory.
Step 6

commit-buffer

Commits the transaction.
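
If you refresh inventories often, it can help to generate the command sequence for a known domain ID. The following Python sketch only builds the list of CLI commands from the procedure above and does not connect to Cisco UCS Central. The function name and the sample ID are hypothetical, and the show ucs-domain step is omitted because the ID is assumed to be known already.

```python
def refresh_inventory_commands(domain_id: str) -> list[str]:
    """Return the CLI commands that refresh the inventory of one domain."""
    domain_id = domain_id.strip()
    if not domain_id:
        raise ValueError("domain_id must not be empty")
    return [
        "connect resource-mgr",
        "scope domain-mgmt",
        f"scope ucs-domain {domain_id}",
        "refresh-inventory",
        "commit-buffer",
    ]


if __name__ == "__main__":
    # "1008" is a placeholder; use an ID reported by show ucs-domain.
    print("\n".join(refresh_inventory_commands("1008")))
```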

Disk Space Issues

The following describes disk space issues you could encounter:

Issue Resolution

Scale issues

If you have many domains, increase the VM resources (memory, storage, and so on) to at least double the recommended values.

Excess images and files

Ensure that you clean up your unused images and technical support files.

Log files grew unbounded

Some users experienced a problem where syslog files grew larger than expected. These files consumed as much available space as possible, triggering alerts. This also prevented signing in and other admin functions. This issue has been fixed.

Boot Flash Full

If you enabled statistics collection with the internal statistics database, it could fill up the boot flash partition. To fix this issue, drop the internal statistics database and disable statistics collection. Alternatively, configure an external statistics database for statistics collection.
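
On a host where Python is available, a partition's usage can be checked with the standard library. The following sketch is a generic illustration and makes several assumptions: the Cisco UCS Central appliance may not provide a Python interpreter, the 90 percent warning threshold is arbitrary, and the function names are hypothetical.

```python
import shutil


def percent_used(total: int, used: int) -> float:
    """Return used space as a percentage of total space."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * used / total


def check_partition(path: str, warn_at: float = 90.0) -> tuple[float, bool]:
    """Return (percent used, True if usage is at or above warn_at)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    pct = percent_used(usage.total, usage.used)
    return pct, pct >= warn_at


if __name__ == "__main__":
    pct, nearly_full = check_partition("/")
    print(f"/ is {pct:.1f}% used; nearly full: {nearly_full}")
```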


Attention

Contact Cisco TAC if your /bootflash partition becomes full and you have stats collection enabled. Please note that the Statistics Management feature is being deprecated and will not be supported after Cisco UCS Central release 1.5.


Procedure

  Command or Action Purpose
Step 1

UCSC# connect stats-mgr

Connects to the statistics manager.

Step 2

scope collection-policy

Enters collection policy mode.

Step 3

/collection-policy # set collection-interval never

Disables statistics collection.

Step 4

/collection-policy* # commit-buffer

Commits the transaction.

Port Configuration with a Firewall

The following table lists the ports that you must configure:

Issue Resolution

Ports open on Cisco UCS Central

  • HTTPS_PORT="https"(443) – Communications from Cisco UCS Central to Cisco UCS domain(s) and Cisco UCS Central GUI. Always required.

  • HTTP_PORT="http"(80) – Communications from Cisco UCS Central to Cisco UCS domain(s). This port is configurable, and only required for the Flash-based Cisco UCS Central UI.

  • PRIVATE_PORT=(843) – Cisco UCS Central communications from Flash UI to Cisco UCS Central VM. Only required for the Flash-based Cisco UCS Central UI. This is not required if using the new HTML-5 UI.

Note 

For Cisco UCS Manager domains v2.2(1b) and below, you also must open the following NFS ports:

  • LOCKD_TCPPORT=32803 – Linux NFS lock

  • MOUNTD_PORT=892 – Linux NFS mount

  • RQUOTAD_PORT=875 – Linux remote quota server port (NFS)

  • STATD_PORT=32805 – Linux – Used by NFS file locking service – lock recovery

  • NFS_PORT="nfs"(2049) – Linux NFS listening port

  • RPC_PORT="sunrpc"(111) – Linux RPCBIND listening port

Port open on Cisco UCS Manager

  • HTTPS_PORT="https"(443) – Communications from Cisco UCS Central to Cisco UCS domain(s). Always required.
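
To confirm that a firewall allows the required HTTPS port, you can attempt a TCP connection from a machine on the relevant network segment. The following Python sketch is a minimal reachability test, not a Cisco utility. The names port_open and ALWAYS_REQUIRED_PORTS are hypothetical, and a successful connection shows only that the port accepts TCP connections, not that the service behind it is healthy.

```python
import socket

# The only port the table above marks as always required.
ALWAYS_REQUIRED_PORTS = {"https": 443}


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Replace the placeholder address with your Cisco UCS Central or domain IP.
    host = "192.0.2.10"
    for name, port in ALWAYS_REQUIRED_PORTS.items():
        print(f"{name} ({port}) reachable: {port_open(host, port)}")
```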

DNS Troubleshooting

You can configure the DNS server from the Cisco UCS Central HTML-5 UI.

  • If Cisco UCS Central fails to resolve domain names, check that the DNS server is added to the /etc/resolv.conf file.

  • Check for any errors in the /var/log/core/svc_cor_controllerAG.log.
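
The two checks above (is the DNS server listed, and does a name resolve) can be sketched in Python as follows. This is an illustration under assumptions: it expects the standard Linux resolver file format, in which each DNS server appears on a line that begins with the keyword nameserver, and the function names are hypothetical.

```python
import socket


def parse_nameservers(resolver_text: str) -> list[str]:
    """Return the addresses on 'nameserver' lines of resolver file content."""
    servers = []
    for line in resolver_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines ('#' or ';').
        if not line or line[0] in "#;":
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def resolves(name: str) -> bool:
    """Return True if the system resolver can resolve the given name."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except socket.gaierror:
        return False
```

For example, pass the content of the resolver file to parse_nameservers and confirm that the address of your DNS server is in the returned list, then call resolves with a host name that the server should know.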

Host Firmware Package Policy Issues

Beginning with Cisco UCS Central release 1.4, you can exclude components from your host firmware package policy. When excluding components, be aware of the following:

  • The global-default host firmware package policy includes all components. If you create a new custom host firmware package policy, it automatically excludes the local disk component.

  • Host firmware package policies created in Cisco UCS Central v1.3, or previous versions, do not support excluding components. These policies do not change when you upgrade to Cisco UCS Central v1.4.

  • If you create your own custom host firmware package policy with excluded components, you cannot include it in a service profile associated with a server running a Cisco UCS Manager version prior to 2.2.7. If you do, the following error displays during service profile association:

    ucs domain does not have the matching server capabilities for this service-profile

    You can either remove all excluded components in the host firmware package policy, or upgrade your version of Cisco UCS Manager to the latest version.
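
The version rule above can be expressed as a simple comparison. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that the "2.2.7" threshold corresponds to release strings written either as 2.2.7 or in the 2.2(7x) style, which the source does not state.

```python
import re

# Earliest Cisco UCS Manager version that accepts excluded components,
# according to the text above.
MIN_VERSION = (2, 2, 7)


def parse_version(text: str) -> tuple[int, ...]:
    """Extract up to three numeric fields from a version string."""
    numbers = re.findall(r"\d+", text)
    if not numbers:
        raise ValueError(f"no version numbers found in {text!r}")
    return tuple(int(n) for n in numbers[:3])


def supports_excluded_components(ucsm_version: str) -> bool:
    """Return True if the version is 2.2.7 or later."""
    return parse_version(ucsm_version) >= MIN_VERSION


if __name__ == "__main__":
    for version in ("2.2(3a)", "2.2(7b)", "3.1(1e)"):
        print(version, supports_excluded_components(version))
```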

Private VLAN Issues

The following issues could cause PVLAN configuration to fail:

  • VLAN referenced by a global service profile, port, or port channel that does not exist or has been deleted.

  • VLAN referenced by a port or port channel that is not created in the appropriate cloud.

  • VLAN referenced by a global service profile, port, or port channel that is not created under the appropriate domain groups.

  • VLAN ID/Name is overlapping with other VLANs that exist locally on a Cisco UCS domain.

  • More than one secondary VLAN is referring to the same primary VLAN.

  • The secondary VLAN referenced by the global service profile, port, or port channel does not refer to a valid primary VLAN.

  • The secondary VLAN referenced by the global service profile, port, or port channel refers to a primary VLAN that was deleted.
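
Several of the causes listed above are consistency rules that can be checked mechanically. The following Python sketch models three of them (ID overlap with local VLANs, a secondary VLAN without a valid primary, and more than one secondary VLAN per primary). The Vlan class and the function name are hypothetical, and the data model is a simplification that does not reflect the Cisco UCS Central object model.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Vlan:
    name: str
    vlan_id: int
    primary: Optional[str] = None  # name of the primary VLAN; None for a primary


def pvlan_problems(global_vlans: list[Vlan], local_ids: set[int]) -> list[str]:
    """Return human-readable descriptions of PVLAN consistency problems."""
    problems = []
    primaries = {v.name for v in global_vlans if v.primary is None}
    for v in global_vlans:
        if v.vlan_id in local_ids:
            problems.append(f"{v.name}: ID {v.vlan_id} overlaps a local VLAN")
        if v.primary is not None and v.primary not in primaries:
            problems.append(f"{v.name}: primary VLAN {v.primary} is not valid")
    refs = Counter(v.primary for v in global_vlans if v.primary is not None)
    for primary, count in refs.items():
        if count > 1:
            problems.append(f"{primary}: referenced by {count} secondary VLANs")
    return problems
```

An empty result means that none of the three modeled rules is violated. The remaining causes in the list, such as a VLAN created in the wrong cloud or domain group, need information that this simplified model does not carry.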

Fixing Private VLAN Issues

To fix the PVLAN configuration issues:

Procedure


Step 1

Check the configuration and FSM status of the global service profiles, ports, or port channels.

Step 2

Analyze the domain and system faults for any related failures.


Smart Call Home Issues

The following table lists issues related to Smart Call Home:

Issues

Resolution

Configuration changes or issues

When you change a configuration, or enable or disable call home, you can check the status at System Configuration > Smart Call Home > Configuration Status. Internally identified errors, such as "invalid certificate in the transport gateway", display here.

Registration email not received

When you enable Smart Call Home for the first time, you should receive an email confirming or requesting registration within five minutes. If you do not receive the email, send another inventory from System Configuration > Smart Call Home > Basic > Operations > Send System Inventory Now.

Viewing logs

  • Smart Call Home logs are located at /var/log/gch and /var/log/resource-mgr.

  • Each Data Management Engine (DME) logs the call home events that are raised within that DME. For example, the core DME raises a process core dumped event. The information specific to the event is located in /var/log/core/svc_core_dme.log.

  • Audit logs and tech support files capture specific configuration changes made in the Smart Call Home section.

Smart Software Licensing Issues

The first five Cisco UCS Central domains are currently licensed at no charge. For more domains, there is a charge. Support for the initial, or additional domains, is available as a paid option with the licenses.

Note

There is a 120-day grace period after the registration of the first domain. You can register any number of domains during this grace period. After the grace period expires, a license is required to prevent licensing fault alarms.
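
The licensing arithmetic above can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that the grace period is counted in calendar days from the registration date of the first domain. The source does not say whether the last day is inclusive.

```python
from datetime import date, timedelta

FREE_DOMAINS = 5          # first five domains are licensed at no charge
GRACE_PERIOD_DAYS = 120   # grace period after the first domain registers


def grace_period_end(first_registration: date) -> date:
    """Return the date 120 days after the first domain was registered."""
    return first_registration + timedelta(days=GRACE_PERIOD_DAYS)


def paid_licenses_needed(registered_domains: int) -> int:
    """Return how many domains exceed the no-charge allowance."""
    if registered_domains < 0:
        raise ValueError("registered_domains must not be negative")
    return max(0, registered_domains - FREE_DOMAINS)


if __name__ == "__main__":
    print(grace_period_end(date(2016, 1, 1)))  # 2016-04-30
    print(paid_licenses_needed(8))             # 3
```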


The following table lists issues related to Smart Software licensing:

Issues

Resolution

Registration failed due to network issue

Review the network connectivity to the Cisco Smart Software Licensing portal.

Deregistration failed due to network issue

Manually remove the product instance from the Cisco Smart Software Licensing portal.

Smart Software Licensing tech support commands

(resource-mgr) /smart-license # generate techsupport
(resource-mgr) /smart-license* # commit-buffer

Smart Software Licensing show commands

(policy-mgr) /org/device-profile/smart-license # show smart-license
(resource-mgr) /smart-license # show license usage
(resource-mgr) /smart-license # show license summary
(resource-mgr) /smart-license # show license status
(resource-mgr) /smart-license # show license udi
(resource-mgr) /smart-license # show license techsupport
(resource-mgr) /smart-license # show license all

Smart Software Licensing logs

/var/log/resource-mgr/svc_sam_cloudAG.log

/var/log/resource-mgr/svc_rsrcMgr_dme.log

DME Logs

The following table lists the Data Management Engine (DME) logs used in Cisco UCS Central:

Issue Resolution

Mgmt-controller (core) DME

Applies VM settings like IP address, DNS, NTP.

  • Located in /var/log/core

  • svc_core_dme.log – DME log

  • svc_core_controllerAG.log – runs scripts to configure VM

  • svc_core_secAG.log – authentication errors (local/ldap)

Policy-mgr DME

Policy management, ID Pool management.

  • Located in /var/log/policy-mgr

  • svc_pol_dme.log – DME log

  • svc_sam_pkiAG.log – certificate maintenance.

Resource-mgr DME

Service profiles, VLANS/VSANS, Inventory.

  • Located in /var/log/resource-mgr

  • svc_rsrcMgr_dme.log – DME log

Identifier-mgr DME

Management for IDs.

  • Located in /var/log/identifier-mgr

  • svc_idm_dme.log – DME log

Service Registry DME

Monitors DME status, registered domain status.

  • Located in /var/log/service-reg

  • svc_reg_dme.log – DME log

Operation-mgr DME

Backup, and firmware management.

  • Located in /var/log/operation-mgr

  • svc_ops_dme.log – DME log

  • svc_ops_imgMgmtAG.log - image management

Stats-mgr DME

Statistics collection from Cisco UCS domains.

  • Located in /var/log/stats-mgr

  • svc_statsMgr_dme.log – DME log

Central-mgr DME

Single entry point for XML API.

  • Located in /var/log/central-mgr

  • svc_centralMgr_dme.log – DME log
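
When you collect logs for several DMEs, a lookup table saves retyping paths. The following Python sketch transcribes the directories and file names from the table above into a dictionary. The names DME_LOGS and log_paths are hypothetical, and the sketch only builds path strings without reading any files.

```python
import posixpath

# DME name -> (log directory, log file names), transcribed from the table above.
DME_LOGS = {
    "mgmt-controller": ("/var/log/core",
                        ["svc_core_dme.log", "svc_core_controllerAG.log",
                         "svc_core_secAG.log"]),
    "policy-mgr": ("/var/log/policy-mgr",
                   ["svc_pol_dme.log", "svc_sam_pkiAG.log"]),
    "resource-mgr": ("/var/log/resource-mgr", ["svc_rsrcMgr_dme.log"]),
    "identifier-mgr": ("/var/log/identifier-mgr", ["svc_idm_dme.log"]),
    "service-reg": ("/var/log/service-reg", ["svc_reg_dme.log"]),
    "operation-mgr": ("/var/log/operation-mgr",
                      ["svc_ops_dme.log", "svc_ops_imgMgmtAG.log"]),
    "stats-mgr": ("/var/log/stats-mgr", ["svc_statsMgr_dme.log"]),
    "central-mgr": ("/var/log/central-mgr", ["svc_centralMgr_dme.log"]),
}


def log_paths(dme: str) -> list[str]:
    """Return the full log file paths for one DME; raises KeyError if unknown."""
    directory, files = DME_LOGS[dme]
    return [posixpath.join(directory, name) for name in files]


if __name__ == "__main__":
    for path in log_paths("operation-mgr"):
        print(path)
```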

Cisco UCS Central Processes

The following table lists the Cisco UCS Central processes:

Service Name Description

core-svc_cor_secAG

Implements authentication-related features, such as local and remote authentication

identifier-mgr-svc_idm_dme

Manages ID pools and allocates unique IDs in the system

core-solr.sh

SOLR process

resource-mgr-svc_sam_snmpTrapAG

Sends SNMP traps from resource-mgr

central-mgr-svc_centralMgr_dme

Cisco UCS Central NBAPI provider, which forwards NBAPI requests to a specific DME

policy-mgr-svc_pol_dme

Manages Cisco UCS Central policies

identifier-mgr-svc_sam_snmpTrapAG

Sends SNMP traps from identifier-mgr

core-svc_cor_snmpTrapAG

Sends SNMP traps from mgmt-controller

operation-mgr-svc_ops_dme

Operations manager DME

policy-mgr-svc_sam_pkiAG

Provides PKI related service for policy-mgr DME

core-httpd.sh

Starts httpd process

gch-call_home

Cisco GCH call home process, which forwards the callhome/smartlicense message to Cisco Cloud Smartlicense Manager

service-reg-svc_sam_snmpTrapAG

Sends SNMP traps from service-reg

core-svc_cor_sessionmgrAG

Session auditing for Cisco UCS Central HA implementation

core-svc_cor_dme

Manages the configuration for the Cisco UCS Central VM (mgmt-controller DME)

resource-mgr-svc_sam_cloudAG

GCH callhome, smartlicense application gateway

stats-mgr-svc_sam_snmpTrapAG

Sends SNMP traps from stats-mgr

service-reg-svc_reg_dme

Implements the registration service for the different DMEs and Cisco UCS Manager

operation-mgr-svc_ops_imgMgmtAG

Image management application gateway for operations manager

resource-mgr-svc_rsrcMgr_dme

Resource manager DME where Cisco UCS Manager inventory is kept and which manages GSP

core-tomcat.sh

Controlling script for tomcat process

service-reg-svc_sam_controller

AG to implement Cisco UCS Central HA service

operation-mgr-svc_sam_snmpTrapAG

Sends SNMP traps from operation-mgr

sam_cores_mon.sh

Script to monitor and manage Cisco UCS Central coredump file

core-svc_cor_controllerAG

AG to configure Cisco UCS Central VM policies

service-reg-svc_sam_licenseAG

License AG for domain base license.

core-sam_nfs_mon.sh

Script to monitor NFS

gch-xosdsd

Infrastructure process for implementing GCH smartlicense feature

policy-mgr-svc_sam_snmpTrapAG

Sends SNMP traps from policy-mgr

stats-mgr-svc_statsMgr_dme

Statistics manager which collects statistics from different Cisco UCS Manager domains and generates the statistics report
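
When you compare the process list against what is actually running, a set difference shows what is missing. The following Python sketch lists the eight DME process names from the table above and reports any that are absent from an observed list. The names EXPECTED_DME_PROCESSES and missing_dme_processes are hypothetical. How you obtain the observed names (for example, by reading the show pmon state output) is left out, because the source does not show that output format.

```python
# DME process names transcribed from the table above.
EXPECTED_DME_PROCESSES = frozenset({
    "identifier-mgr-svc_idm_dme",
    "central-mgr-svc_centralMgr_dme",
    "policy-mgr-svc_pol_dme",
    "operation-mgr-svc_ops_dme",
    "core-svc_cor_dme",
    "service-reg-svc_reg_dme",
    "resource-mgr-svc_rsrcMgr_dme",
    "stats-mgr-svc_statsMgr_dme",
})


def missing_dme_processes(observed: list[str]) -> list[str]:
    """Return expected DME process names that are not in the observed list."""
    return sorted(EXPECTED_DME_PROCESSES - set(observed))


if __name__ == "__main__":
    # Hypothetical observation in which two DMEs are not listed.
    seen = sorted(EXPECTED_DME_PROCESSES - {"stats-mgr-svc_statsMgr_dme",
                                            "core-svc_cor_dme"})
    print(missing_dme_processes(seen))
```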