Troubleshooting Issues with Cisco UCS Manager


This chapter describes solutions that you can implement when you troubleshoot issues with Cisco UCS Manager.

This chapter includes the following sections:

Troubleshooting Boot Issues

Troubleshooting KVM Issues

Troubleshooting VM Issues

Troubleshooting Cisco UCS Manager Issues

Troubleshooting Fabric Interconnect Issues

Troubleshooting Boot Issues

This section includes the following topics:

Reboot Warning Does Not Display

Server Does Not Boot from OS Installed on eUSB

Reboot Warning Does Not Display

Problem

The system fails to produce a reboot warning that lists any dependencies.

Possible Cause

This problem can be caused by changes to a vNIC template or a vHBA template. Reboot warnings occur when the back-end returns a list of dependencies. When you update the template type for a vNIC or vHBA template and make changes to any boot-related properties without applying changes between steps, the back-end systems are not triggered to return a list of dependencies.

Recommended Action

To ensure that template changes are applied, follow these steps in the Cisco UCS Manager GUI:


Step 1 In the vNIC template or vHBA template, do the following:

a. Change the template type from Initial Template to Updating Template.

b. Click Save Changes.

Step 2 Make any additional changes to the reboot-related values and click Save Changes.

A reboot warning and the list of dependencies appear.


Server Does Not Boot from OS Installed on eUSB

Problem

The eUSB embedded inside the Cisco UCS server includes an operating system. However, the server does not boot from that operating system.

Possible Cause

This problem can occur when, after associating the server with the service profile, the eUSB is not at the top of the actual boot order for the server.

Recommended Action

To ensure that the server boots from the operating system on the eUSB, follow these steps in the Cisco UCS Manager GUI:

Procedure


Step 1 On the Servers tab, do the following to verify the boot policy configuration:

a. Navigate to the service profile associated with the server.

b. On the Boot Order tab, ensure that the Local Disk is configured as the first device in the boot policy.

Step 2 On the Equipment tab, do the following to verify the actual boot order for the server:

a. Navigate to the server.

b. On the General tab, expand the Boot Order Details area and verify that the eUSB is listed as the first device on the Actual Boot Order tab.

For example, the first device should be VM eUSB DISK.

Step 3 If the eUSB is not the first device in the actual boot order, do the following:

a. On the General tab for the server, click the following links in the Actions area:

Click KVM Console to launch the KVM console

Click Boot Server to boot the server

b. In the KVM console, while the server is booting, press F2 to enter the BIOS setup.

c. In the BIOS utility, click the Boot Options tab.

d. Click Hard Disk Order.

e. Configure Boot Option #1 to the eUSB.

For example, set this option to VM eUSB DISK.

f. Press F10 to save and exit.


Troubleshooting KVM Issues

This section includes the following topics:

BadFieldException When Launching the KVM Viewer

KVM Console Failure

KVM Fails to Open

BadFieldException When Launching the KVM Viewer

Problem

The BadFieldException error appears when the KVM viewer is launched.

Possible Cause

This problem can occur because Java Web Start disables the cache by default when it is used with an application that uses native libraries.

Recommended Action

To eliminate this error, follow these steps:

Procedure


Step 1 Choose Start > Control Panel > Java.

Step 2 Click the General tab.

Step 3 In the Temporary Internet Files area, click Settings.

Step 4 Select the Keep temporary files on my computer check box.

Step 5 Click OK.


KVM Console Failure

Problem

The KVM console fails to launch and the JRE displays the following message:

Unable to launch the application.

Possible Cause

This problem can occur if several KVM consoles are launched simultaneously.

Recommended Action

To launch the KVM console, follow these steps:

Procedure


Step 1 If possible, close one or more of the open KVM consoles.

Step 2 If the KVM console still fails to launch, close all open KVM consoles and relaunch the KVM console.


KVM Fails to Open

Problem

The first time you attempt to open the KVM on a server, the KVM fails to launch.

Possible Cause

This problem can be caused by a JRE version incompatibility.

Recommended Action

To launch the KVM console, follow these steps:

Procedure


Step 1 Upgrade to JRE 1.6_11.

Step 2 Reboot the server.

Step 3 Launch the KVM console.
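
Before upgrading, it can help to compare the installed JRE version against the version named in Step 1. The following Python sketch is illustrative only. It assumes that "1.6_11" refers to the legacy Java version string 1.6.0_11 and that any later version is also acceptable; neither assumption is stated in this guide. The function names are hypothetical.

```python
import re

def parse_jre_version(text):
    """Parse a legacy Java version string such as '1.6.0_11' into a tuple of ints.

    A missing micro or update field is treated as 0 (assumption).
    """
    match = re.fullmatch(r"(\d+)\.(\d+)(?:\.(\d+))?(?:_(\d+))?", text.strip())
    if match is None:
        raise ValueError(f"unrecognized JRE version: {text!r}")
    return tuple(int(part or 0) for part in match.groups())

# Version named in Step 1, read here as 1.6.0 update 11 (assumption).
REQUIRED = parse_jre_version("1.6_11")

def meets_required(installed):
    """Return True when the installed version is at least the required one."""
    return parse_jre_version(installed) >= REQUIRED

print(meets_required("1.6.0_07"))  # False
print(meets_required("1.6.0_20"))  # True
```

The installed version string is typically reported by the java -version command on the computer that launches the KVM console.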


Troubleshooting VM Issues

This section includes the following topic:

Error: "Currently connected network interface x uses Distributed Virtual Switch (uusid: y) which is accessed on the host via a switch that has no free ports"

Error: "Currently connected network interface x uses Distributed Virtual Switch (uusid: y) which is accessed on the host via a switch that has no free ports"

Problem

The following error appears:

Currently connected network interface x uses Distributed Virtual Switch (uusid:y) which is 
accessed on the host via a switch that has no free ports.

Possible Cause

This problem can be caused by one of the following issues:

After powering off or migrating a VM from one host to another, the vSphere server fails to recompute the numPortsAvailable property in the hostProxySwitch object.

The cumulative number of vNICs for the VMs powered on an ESX host matches or exceeds the number of dynamic vNICs configured in the server's service profile.

After migrating a VM from one data-store to another data-store on the same server, the server incorrectly detects an increase in the number of DVS ports being used by all of the VMs powered on the host.
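
The second cause in the list above is a simple capacity condition. As a rough illustration, the following Python sketch checks it for one host. The function name and the sample counts are hypothetical; in practice the vNIC counts come from the VM settings and the dynamic vNIC count comes from the service profile.

```python
def dvs_ports_exhausted(vnics_per_powered_on_vm, dynamic_vnics_in_profile):
    """Return True when the powered-on VMs use at least as many vNICs
    as the number of dynamic vNICs configured in the service profile."""
    used = sum(vnics_per_powered_on_vm)
    return used >= dynamic_vnics_in_profile

# Hypothetical host with 4 dynamic vNICs in its service profile.
print(dvs_ports_exhausted([2, 1, 1], 4))  # True: no free port is left
print(dvs_ports_exhausted([2, 1], 4))     # False: one port is still free
```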

Recommended Action

To eliminate this error, follow these steps:

Procedure


Step 1 Identify the cause of the error.

Step 2 If the error resulted from powering off a VM, or from migrating a VM from one host to another, do the following:

a. Migrate a second VM from the ESX host to another system.

b. When a second port is made available, do one of the following:

Power on a VM.

Migrate a VM back to the ESX host.

Step 3 If the error resulted from migrating a VM instance from one data-store to another data-store on the same server, do the following:

a. Shut down all of the VMs on the ESX host.

b. Retry the migration.


Troubleshooting Cisco UCS Manager Issues

This section includes the following topics:

Error: "Fatal error: event sequencing is skewed "

HDD Metrics Not Updated in the Cisco UCS Manager GUI

Cisco UCS Manager Reports More Disks in Server than Total Slots Available

Error: "Fatal error: event sequencing is skewed "

Problem

After the computer resumes from sleep mode, the Cisco UCS Manager GUI displays the following message:

Fatal error: event sequencing is skewed.

Possible Cause

This problem can occur if the Cisco UCS Manager GUI was running when the computer went to sleep. Because the JRE does not have a sleep detection mechanism, the GUI is unable to re-track all of the messages that it received before the computer went into sleep mode. After multiple retries, this event sequencing error is logged.


Note Always exit the Cisco UCS Manager GUI before putting your computer to sleep.


Recommended Action

To eliminate this error, follow these steps:


Step 1 In the Cisco UCS Manager GUI, if a Connection Error dialog box appears, do one of the following:

Click Re-login to log back in to Cisco UCS Manager GUI.

Click Exit to exit the Cisco UCS Manager GUI.


HDD Metrics Not Updated in the Cisco UCS Manager GUI

Problem

After hot-swapping, removing, or adding a hard drive, the updated hard disk drive (HDD) metrics do not appear in the Cisco UCS Manager GUI.

Possible Cause

This problem occurs because Cisco UCS Manager gathers HDD metrics only during a system boot. If a hard drive is added or removed after a system boot, the Cisco UCS Manager GUI does not update the HDD metrics.

Recommended Action

To update the HDD metrics, follow these steps:


Step 1 Reboot the server.


Cisco UCS Manager Reports More Disks in Server than Total Slots Available

Problem

Cisco UCS Manager reports that a server has more disks than the total disk slots available in the server. For example, Cisco UCS Manager reports three disks for a server with two disk slots as follows:

RAID Controller 1: 
           Local Disk 1: 
               Product Name: 73GB 6Gb SAS 15K RPM SFF HDD/hot plug/drive sled mounted 
               PID: A03-D073GC2 
               Serial: D3B0P99001R9 
               Presence: Equipped 
           Local Disk 2:  
               Product Name: 
               Presence: Equipped 
               Size (MB): Unknown 
           Local Disk 5: 
               Product Name: 73GB 6Gb SAS 15K RPM SFF HDD/hot plug/drive sled mounted 
               Serial: D3B0P99001R9 
               HW Rev: 0 
               Size (MB): 70136

Possible Cause

This problem is typically caused by a communication failure between Cisco UCS Manager and the server that reports the inaccurate information.

Recommended Action

To update the server information, follow these steps:


Step 1 Decommission the server.

Step 2 Recommission the server.
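
To spot this condition in a saved copy of the disk report, a small script can count the reported disks and look for repeated serial numbers. The following Python sketch is illustrative only: it assumes the report is available as plain text in the layout shown in the Problem description (abbreviated here), and the function names are hypothetical.

```python
import re
from collections import defaultdict

# Abbreviated copy of the report from the Problem description.
REPORT = """\
RAID Controller 1:
    Local Disk 1:
        Serial: D3B0P99001R9
        Presence: Equipped
    Local Disk 2:
        Presence: Equipped
        Size (MB): Unknown
    Local Disk 5:
        Serial: D3B0P99001R9
        Size (MB): 70136
"""

def parse_disks(report):
    """Return {disk_id: {field: value}} for every 'Local Disk N:' entry."""
    disks = {}
    current = None
    for raw in report.splitlines():
        line = raw.strip()
        header = re.fullmatch(r"Local Disk (\d+):", line)
        if header:
            current = int(header.group(1))
            disks[current] = {}
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            disks[current][key.strip()] = value.strip()
    return disks

def find_anomalies(disks, total_slots):
    """Flag a disk count above the slot count and serials reported twice."""
    by_serial = defaultdict(list)
    for disk_id, fields in disks.items():
        serial = fields.get("Serial")
        if serial:
            by_serial[serial].append(disk_id)
    return {
        "too_many_disks": len(disks) > total_slots,
        "duplicate_serials": {s: ids for s, ids in by_serial.items() if len(ids) > 1},
    }

print(find_anomalies(parse_disks(REPORT), total_slots=2))
# {'too_many_disks': True, 'duplicate_serials': {'D3B0P99001R9': [1, 5]}}
```

In the example report, Local Disk 1 and Local Disk 5 carry the same serial number, which suggests that one physical drive is being reported twice.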


Troubleshooting Fabric Interconnect Issues

This section includes the following topics:

Recovering a Fabric Interconnect from the Boot Loader Prompt

Resolving Fabric Interconnect Cluster ID Mismatch

Recovering a Fabric Interconnect from the Boot Loader Prompt

Problem

The fabric interconnect fails to start.

Possible Cause

This problem can be caused by one of the following issues, which require you to use the boot loader prompt to recover the fabric interconnect:

The kickstart image is corrupted or nonfunctional for other reasons.

The file system on the bootflash memory is corrupted.

Recommended Action

To recover the fabric interconnect from the boot loader prompt, follow these steps:


Step 1 Verify the following physical connections on the fabric interconnect:

A console port on the first fabric interconnect is physically connected to a computer terminal or a console server.

The management Ethernet port (mgmt0) is connected to an external hub, switch, or router.

For a cluster configuration, the L1 ports on both of the fabric interconnects are directly connected to each other and the L2 ports on both of the fabric interconnects are directly connected to each other.

Step 2 Verify that the console port parameters on the computer terminal (or console server) that is attached to the console port are set as follows:

9600 baud

8 data bits

No parity

1 stop bit

Step 3 Identify and download the following firmware images and place them on an SCP server:

Kickstart

System

Cisco UCS Manager

Step 4 Collect the following information:

Full path and filenames for the firmware images

Hostname of the SCP server

Username and password to access the files on the SCP server

Step 5 Connect to the console port.

Step 6 Power cycle the fabric interconnect:

a. Turn off the power to the fabric interconnect.

b. Turn on the power to the fabric interconnect.

You can see the power-on self-test (POST) messages as the fabric interconnect boots.

Step 7 To get the loader prompt, in the console, press one of the following key combinations as it boots:

Ctrl+l

Ctrl+Shift+r

You may need to press the selected key combination multiple times before the screen displays the loader prompt.

Step 8 (Optional) Set the MAC address for the fabric interconnect:

loader> set mac mac_address

Step 9 Set the IP address and netmask for the fabric interconnect:

loader > set ip ip_address netmask

Step 10 Set the gateway for the fabric interconnect:

loader > set gw gateway

Step 11 Specify the kickstart image to use for the fabric interconnect:

loader > boot scp://scp_server_ip/path_relative_to_scp_root/kickstart_filename

This command takes you to the kickstart prompt.

Step 12 Initialize the fabric interconnect and create the partitions in the bootflash:

switch(boot)# init system

Step 13 Enter the configuration mode:

switch(boot)# config terminal

Step 14 Enter the management 0 interface mode:

switch(config)# interface mgmt0

Step 15 Configure the IP address and netmask for the fabric interconnect:

switch(config-if)# ip address ip_address netmask

Step 16 Administratively enable the interface:

switch(config-if)# no shut

Step 17 Configure the default gateway for the fabric interconnect:

switch(config-if)# ip default-gateway gateway

Step 18 Complete the configuration of the IP address for the fabric interconnect:

switch(config-if)# end

Step 19 Copy the kickstart image to bootflash:

switch(config)# copy scp: bootflash:

Example:

switch(config)# copy scp: bootflash:
Enter source filename: /ucs/images/ucs-6100-k9-kickstart.4.1.3.N2.1.3.1.bin
Enter hostname for the scp server: 192.168.10.10
Enter username: user1
user1@192.168.10.10's password:
ucs-6100-k9-kickstart.4.1.3.N2.1.3.1.bin 100%   21MB   4.1MB/s   00:05

As shown in the example, you are prompted for additional information, such as the name of the kickstart image file including the path from the SCP root.

This example uses a .bin file. Under some circumstances, Cisco TAC may provide you with a .gbin file for the new image.

Step 20 Copy the system image to bootflash:

switch(config)# copy scp: bootflash:

You are prompted for the name and location of the system image file.

Step 21 Copy the management image to bootflash:

switch(config)# copy scp: bootflash:

You are prompted for the name and location of the management image file.

Step 22 Perform a dir on the bootflash to ensure that the files were copied correctly.

Step 23 Copy the management image in the bootflash and rename the management image to ensure that it is Cisco UCS Manager-compliant.

switch(boot)# copy bootflash:management_image bootflash:nuova-sim-mgmt-nsg.0.1.0.001.bin

The filename nuova-sim-mgmt-nsg.0.1.0.001.bin makes the management image Cisco UCS Manager-compliant.

Step 24 Load the system image from the bootflash.

switch(boot)# load bootflash:system_image

Step 25 Connect to Cisco UCS Manager and upgrade the Cisco UCS Manager firmware to the release of Cisco UCS that you copied into bootflash.

Step 26 Upgrade the fabric interconnects after you complete the Cisco UCS Manager firmware upgrade and the Cisco UCS instance restarts.

a. When High Availability is active, identify the subordinate fabric interconnect and upgrade the firmware for that fabric interconnect.

b. When High Availability is active again, force the newly upgraded fabric interconnect to become the primary fabric interconnect in the cluster.

c. Upgrade the firmware of the new subordinate fabric interconnect.

Upgrade both of the fabric interconnects so that they boot correctly in the future. If you do not perform this step, you may have to reboot from the loader prompt again.

Follow the instructions in the appropriate upgrade guide for this release. Assuming that you are upgrading to the same release as other components in the Cisco UCS instance, you do not have to upgrade the other components, such as the IOMs or servers.
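
Because the loader prompt offers little feedback on mistyped addresses, it can be useful to check the values gathered for the set ip, set gw, and boot steps before connecting to the console. The following Python sketch is one way to do that with the standard ipaddress module. It is illustrative only: the function names are hypothetical, the management IP address and gateway in the usage example are made-up values, and the sketch assumes that the gateway should sit in the same subnet as the management interface.

```python
import ipaddress

def check_mgmt_settings(ip_address, netmask, gateway):
    """Validate the values used with 'set ip' and 'set gw'.

    Raises ValueError if an address is malformed or if the gateway is
    outside the management subnet (same-subnet rule is an assumption).
    """
    iface = ipaddress.IPv4Interface(f"{ip_address}/{netmask}")
    gw = ipaddress.IPv4Address(gateway)
    if gw not in iface.network:
        raise ValueError(f"gateway {gw} is outside {iface.network}")
    if gw == iface.ip:
        raise ValueError("gateway and management IP address are identical")
    return iface

def loader_commands(ip_address, netmask, gateway, scp_server, kickstart_path):
    """Build the loader-prompt commands in the form shown in the procedure."""
    check_mgmt_settings(ip_address, netmask, gateway)
    return [
        f"set ip {ip_address} {netmask}",
        f"set gw {gateway}",
        # The path is relative to the SCP root, so drop any leading slash.
        f"boot scp://{scp_server}/{kickstart_path.lstrip('/')}",
    ]

# Hypothetical management IP and gateway; the SCP server and image path
# are taken from the copy example in the procedure.
for command in loader_commands(
    "192.168.10.20",
    "255.255.255.0",
    "192.168.10.1",
    "192.168.10.10",
    "/ucs/images/ucs-6100-k9-kickstart.4.1.3.N2.1.3.1.bin",
):
    print(command)
```

The printed lines can then be typed at the loader prompt in order.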

Resolving Fabric Interconnect Cluster ID Mismatch

Problem

When you set up two fabric interconnects to support a high availability cluster and connect the L1 ports and L2 ports, a fabric interconnect cluster ID mismatch can occur. This type of mismatch means that the cluster fails and Cisco UCS Manager cannot be initialized.

Recommended Action

To resolve a fabric interconnect cluster ID mismatch, follow these steps:


Step 1 In Cisco UCS Manager CLI, connect to fabric interconnect B and enter the erase configuration command.

All configuration on the fabric interconnect is erased.

Step 2 Reboot fabric interconnect B.

After rebooting, fabric interconnect B detects the presence of fabric interconnect A and downloads the cluster ID from fabric interconnect A. The cluster can then be formed.