- Overview
- Tools Used in Troubleshooting
- Installation
- Upgrade
- Licenses
- High Availability
- VSM and VEM Modules
- Ports
- Port Profiles
- Port Channels and Trunking
- Layer 2 Switching
- VLANs
- Private VLANs
- NetFlow
- Access Control Lists
- Quality of Service
- SPAN
- Multicast IGMP Snooping
- DHCP, DAI, and IPSG
- System
- Network Segmentation Manager
- Ethanalyzer
- Before Contacting Technical Support
- Information About High Availability
- Problems with High Availability
- Failover Clusters and the Microsoft SCVMM
- Selecting Storage During VM Deployment on Failover Clusters from the Microsoft SCVMM
- Live Migration Fails Due to Network Bandwidth
- Cluster IP Resource Fails to Come Up
- High Availability Troubleshooting Commands
High Availability
This chapter describes how to identify and resolve problems related to high availability.
Information About High Availability
The purpose of high availability (HA) is to limit the impact of failures—both hardware and software— within a system. The Cisco NX-OS operating system is designed for high availability at the network, system, and service levels.
The following Cisco NX-OS features minimize or prevent traffic disruption in the event of a failure:
- Redundancy—Redundancy at every aspect of the software architecture.
- Isolation of processes—Isolation between software components to prevent a failure within one process from disrupting other processes.
- Restartability—Most system functions and services are isolated so that they can be restarted independently after a failure while other services continue to run. In addition, most system services can perform stateful restarts, which allow the service to resume operations transparently to other services.
- Supervisor stateful switchover—Active/standby dual supervisor configuration. The state and configuration remain constantly synchronized between two Virtual Supervisor Modules (VSMs) to provide a seamless and stateful switchover in the event of a VSM failure.
Problems with High Availability
System-Level High Availability
The Cisco Nexus 1000V supports redundant VSM VMs—a primary and a secondary—that run as an HA pair. Dual VSMs operate in an active/standby capacity in which only one of the VSMs is active at any given time, while the other acts as a standby backup. The state and configuration remain constantly synchronized between the two VSMs to provide a stateful switchover if the active VSM fails.
Single or Dual Supervisors
The Cisco Nexus 1000V system is made up of the following:
- VEMs that run within virtualization servers (these VEMs are represented as modules within the VSM)
- A remote management component, such as the Microsoft SCVMM
- One or two VSMs that run within VMs
Network-Level High Availability
The Cisco Nexus 1000V HA at the network level includes port channels and the Link Aggregation Control Protocol (LACP). A port channel bundles physical links into a channel group to create a single logical link that provides the aggregate bandwidth of up to eight physical links. If a member port within a port channel fails, the traffic that was previously carried over the failed link switches to the remaining member ports within the port channel.
Additionally, the LACP allows you to configure up to 16 interfaces into a port channel. A maximum of eight interfaces can be active, and a maximum of eight interfaces can be placed in a standby state.
For additional information about port channels and the LACP, see the Cisco Nexus 1000V for Microsoft Hyper-V Layer 2 Switching Configuration Guide.
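As an illustrative sketch only (the configuration guide cited above is the authoritative source), an LACP port channel on the Cisco Nexus 1000V is typically formed through an Ethernet port profile applied to the uplinks; the profile name below is a placeholder:

```
n1000v(config)# port-profile type ethernet uplink-pc
n1000v(config-port-prof)# channel-group auto mode active
n1000v(config-port-prof)# no shutdown
n1000v(config-port-prof)# state enabled
```

With `channel-group auto mode active`, member interfaces that inherit this profile negotiate the channel through LACP, so a failed member's traffic shifts to the remaining links as described above.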
Failover Clusters and the Microsoft SCVMM
Failover clustering is a host-side feature that provides high availability and scalability to multiple server workloads. For a Cisco Nexus 1000V switch to be considered a high availability device, the switch must meet the following criteria:
- The VM must be set to High Availability > True to be considered part of a failover cluster. That is, the VM can be moved automatically by the cluster in the event of a host failure.
- The high availability VM should be stored in one of the following types of Internet Protocol (IP) based storage facilities to accommodate live migration for a failover cluster:
– Clustered shared volumes (iSCSI, and so on)
When clusters are managed by the Microsoft SCVMM, certain criteria must be met for the Microsoft SCVMM to manage the VM as part of a failover cluster. Specifically, the logical switch on the hosts of the failover cluster must be configured for high availability.
High Availability Logical Switch Criteria and Behavior
- A logical switch is considered to be highly available when it carries the same uplink networks on all the nodes of the cluster.
- If some adapters carry the same uplink networks across all nodes of the cluster and other adapters do not, only the adapters that carry the same uplink networks are considered highly available.
- A VM that is not configured for high availability can be connected to any switch in the failover cluster (logical or standard switch).
- A high availability VM can only be connected to uplinks that are high availability and are part of a logical switch.
Selecting Storage During VM Deployment on Failover Clusters from the Microsoft SCVMM
The failover cluster managed by the Microsoft SCVMM has more than one associated storage device. By default, the Microsoft SCVMM chooses the storage based on its own deployment algorithm, which might not be the storage that you want. To select the storage manually, follow these steps:
Step 1 Launch the Microsoft SCVMM UI.
Step 2 In the Migrate VM Wizard screen, change the storage of the VM and the VM hard disk to the appropriate storage.
Step 3 Pin the selection to the Microsoft SCVMM UI.
Live Migration Fails Due to Network Bandwidth
When a workload VM is carrying high traffic, VM live migration might not be allowed by the Microsoft SCVMM. The Microsoft SCVMM performs checks during live migration and decides the feasibility of moving the VM based on many factors, one of which is VM port traffic. From the perspective of the Microsoft SCVMM, when a VM is transmitting or receiving large amounts of traffic, it is not feasible to move the VM because it might result in a loss of bandwidth.
Step 1 Launch the Microsoft SCVMM UI.
Step 2 In a Microsoft SCVMM PowerShell window, enter the Move-SCVirtualMachine cmdlet.
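A hedged sketch of that cmdlet invocation follows; the VM and destination host names are hypothetical placeholders, and the exact parameters you need depend on your environment:

```powershell
# Hypothetical names; adjust to your environment.
$vm = Get-SCVirtualMachine -Name "Workload-VM1"
$destHost = Get-SCVMHost -ComputerName "HyperV-Node2"
Move-SCVirtualMachine -VM $vm -VMHost $destHost
```

Moving the VM from PowerShell bypasses the bandwidth feasibility check that blocks the migration in the SCVMM UI.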
Cluster IP Resource Fails to Come Up
Cluster validation is an important tool used by large deployments to validate cluster configurations. When a virtual switch is deployed on the management NIC of the host with a static IP address, and the failover cluster already exists, the cluster IP resource might fail to come up. When this problem occurs, although the cluster IP address and DNS are reachable by conventional means (ping), the cluster validation tool fails.

Note This problem is a known issue with the Microsoft SCVMM and is seen only with static IP addresses, not when the host management IP address is assigned through DHCP.
There is no known workaround for this issue. We recommend that you create clusters after you deploy the Cisco Nexus 1000V on the management IP address.
High Availability Troubleshooting Commands
You can use the commands in this section to troubleshoot problems related to high availability.
To list process logs and cores, enter these commands:
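On Cisco NX-OS these are typically the following commands; the `switch#` prompt is illustrative:

```
switch# show processes log
switch# show cores
```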
To check the redundancy status, enter this command:
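The standard NX-OS command for this check is sketched below; the prompt is illustrative:

```
switch# show system redundancy status
```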
To check the system internal redundancy status, enter this command:
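A sketch of the corresponding NX-OS command:

```
switch# show system internal redundancy status
```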
To check the system internal sysmgr state, enter this command:
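A sketch of the corresponding NX-OS command:

```
switch# show system internal sysmgr state
```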
To reload a module, enter this command:
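The standard NX-OS form is `reload module` followed by the module number; module 2 below assumes the secondary VSM occupies slot 2:

```
switch# reload module 2
```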
This command reloads the secondary VSM.

Note Entering the reload command without specifying a module reloads the whole system.
To attach to the standby VSM console, enter this command:
The standby VSM console is not accessible externally but can be accessed from the active VSM through the attach module module-number command.
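For example, assuming the standby VSM is module 2:

```
switch# attach module 2
```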