Cisco Nexus 1000V Switch for Microsoft Hyper-V

Nexus 1000V on Hyper-V Troubleshoot Guide

Document ID: 116402

Updated: Oct 01, 2013

Contributed by Louis Watta and Matthew Wronkowski, Cisco TAC Engineers.

Introduction

This document describes the procedures used in order to troubleshoot Cisco Nexus 1000V (N1KV) Series switches on Microsoft (MS) Hyper-V servers. The implementation on Hyper-V differs significantly from the implementation on ESXi, and this document covers the issues that are encountered most frequently as a result.

Much of the information described in this document comes directly from the Engineering New Product Introduction (NPI), and from issues encountered during beta testing. This document is dynamic in nature, and will be updated accordingly.

Prerequisites

Requirements

Cisco recommends that you have knowledge of these topics:

  • N1KV Series Switches
  • MS Hyper-V Servers

Components Used

This document is not restricted to specific software and hardware versions. 

The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, make sure that you understand the potential impact of any command.

Installer Application Issues

There are many issues with the Installer application, and this section describes the most common ones.

Use the Installer Application with Caution

Here are some reasons why you should use this application with caution:

  • It does not wait long enough for the Virtual Supervisor Module (VSM) to start on many platforms, so the installation often fails.
  • It moves the management (mgmt) interface to a MS Logical Switch and does not inform you, even though you might not want the mgmt interface moved.
  • The logical switch that is created cannot use a teamed interface. This means that there is no redundancy for the switch or the mgmt interface.
  • You cannot simply add another Network Interface Card (NIC) to the logical switch in order to make it teamed; you must create a new switch with a teamed interface, and move everything over in order to have redundancy.
  • The application does not recognize if the hosts that you install the VSMs on are part of a cluster. This means that the virtual disks are installed in local storage, not cluster storage.
  • The application creates a network uplink with a system network set. As a result, every network segment that uses that uplink must also be set as a system network. This is a major bug, so be aware of it.

Installer Application Log Location

The Installer application is only meant to work in greenfield environments. Do not attempt to use the application in a previously-established configuration. In order to verify whether the Installer application encountered errors, navigate to C: > Users > <username> > AppData > Local > Temp > 2 > Nexus1000vInstaller_xxxxxxxx.txt, and check the log.
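
As a minimal sketch of the log check, this Python helper finds the newest Nexus1000vInstaller_*.txt file in a directory and returns the lines that look like failures. The function names and the keyword list are illustrative assumptions and are not taken from the Installer application; the actual wording in the log might differ, so adjust the keywords as needed. Pass the Temp directory from the path above as log_dir.

```python
from pathlib import Path

# Assumption: failures in the log contain one of these words.
SUSPECT_WORDS = ("error", "fail", "exception")


def newest_installer_log(log_dir):
    """Return the most recently modified Nexus1000vInstaller_*.txt, or None."""
    logs = list(Path(log_dir).glob("Nexus1000vInstaller_*.txt"))
    if not logs:
        return None
    return max(logs, key=lambda p: p.stat().st_mtime)


def suspect_lines(log_path):
    """Return (line_number, text) pairs for lines that contain a suspect word."""
    hits = []
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if any(word in line.lower() for word in SUSPECT_WORDS):
                hits.append((number, line.rstrip()))
    return hits
```

If newest_installer_log returns None, the Installer application has not written a log in that directory, which usually means that you are looking at the Temp folder of a different user or session.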

Installer Application Migrates the Mgmt NIC

The default (and only) behavior of the Installer application is to display and use the physical NIC to which the mgmt interface is connected. When you run the Installer application, you are only able to choose one NIC - the mgmt NIC.

The Installer application:

  1. Creates a MS logical switch
  2. Adds the two hosts that have the VSMs to the logical switch
  3. Migrates the mgmt NIC to a virtual NIC on the logical switch
  4. Adds the VSM network connections to that logical switch

This image illustrates how one of the hosts now has a MS logical switch assigned, and a virtual NIC that carries the mgmt traffic:

You can see that the uplink is defined with No Uplink Team when you view the logical switch that is created. This is a problem because you cannot add another NIC or a teamed NIC to this switch. You are not allowed to change the type of Team once the switch is created. Also, the Installer application does not allow you to add a teamed interface.

In order to change the switch to teamed, you must remove it and add it back with a teaming set. This is possible, but tedious. Without a team, the switch and the mgmt interface have no redundancy.

The Logical Switch for Mgmt and VSM is Not Created on All Switches

This is another problem because the logical switch is created only on the two hosts that run the VSMs, so the VSMs are tied to those two hosts. As a result, live migration and High Availability (HA) are limited to two hosts. You have the option to migrate the other Hyper-V hosts to the MS logical switch that is created, but this is not completed automatically by the Installer application.

The Installer Application does Not Utilize Cluster-Based Storage

When the VSM Virtual Machine (VM) settings are created, the Availability has a value of Low. MS only allows VMs with Availability values of High to be included on Cluster-Based Storage. This places the VSM Virtual Disk (VD) and VM information on the local storage of the Hyper-V host. Again, this limits live-migration and HA for the VSM VMs.


Note: Unfortunately, no procedures have been discovered to change the Availability settings for the VSM once it is created.  

Issues with the VSM Configuration That the Installer Application Creates

The Installer application creates a very basic configuration on the VSM, and imports some of that configuration to System Center Virtual Machine Manager (SCVMM).

The application performs these actions from the N1KV side:

  • Creates a default logical network
  • Creates a default network segment pool
  • Creates a default uplink network
  • Creates a default eth port-profile with Channel group mac pinning
  • Creates a default veth port-profile for no-restriction
  • Creates a default IP pool template
  • Creates the N1KV logical Switch in SCVMM

The application not only creates these settings on the VSM, but it populates this information into SCVMM when it creates the logical switch.

The resulting configuration is mostly sound, but the uplink network is problematic. This is how the network uplink is created:

nsm network uplink n1kv_uplink_network_1_VSM-install1
 import port-profile n1kv_uplink_network_policy_VSM-install1
 allow network segment pool n1kv_network_segment_pool_VSM-install1
 native network segment n1kv_vmaccess_1_VSM-install1
 system network uplink
 publish network uplink

The uplink is created with system network uplink set, which causes an issue. If you have an uplink with system network uplink set, then all the network segments and port-profiles that use that uplink must be system as well. This means that you are limited to 32 network segments that are able to use that uplink.

The problem is not obvious until you add a network segment. This example shows what happens if you build a new network segment and IP pool template for VLAN 152:

VSM-install1(config)# nsm ip pool template vlan-152
VSM-install1(config-ip-pool-template)# ip address 192.168.152.2 192.168.152.253
VSM-install1(config-ip-pool-template)# network 192.168.152.0 255.255.255.0
VSM-install1(config-ip-pool-template)# default-router 192.168.152.1

VSM-install1(config)# nsm network segment segment-vlan-152
VSM-install1(config-net-seg)# switchport mode access
VSM-install1(config-net-seg)# switchport access vlan 152
VSM-install1(config-net-seg)# ip pool import template vlan-152
VSM-install1(config-net-seg)# member-of network segment pool n1kv_network_segment_pool_VSM-install1
VSM-install1(config-net-seg)# publish network segment
VSM-install1(config-net-seg)#

Refresh the SCVMM N1KV extension, and add the VM network for the network segment that you created. When you attempt to assign a VM to the new VM network, you get these errors:

Error (12700)
 Failed while applying switch port settings 'Ethernet Switch Port Profile Settings'
 on switch 'n1kv_VSM-install1': A device attached to the system is not functioning.
 (0x8007001F). Unknown error (0x8005)

Error (26908)
 Virtual switch on host to which the virtual network adapter is to be connected
 (n1kv_VSM-install1) is a non compliant logical switch instance


These errors occur because the network uplink carries a system network and the network segment does not. You have two options: either create a new network uplink without a system network, or add a system network to your new network segment. 

SCVMM Cannot Connect to the VSM

The connectivity between the VSM and SCVMM is different with Hyper-V than with ESXi. In the Hyper-V solution, SCVMM talks to the Nexus 1000V API. This means that the connection is established and maintained from the SCVMM host. When the show svs connection command is used on the VSM, it shows nothing; there is no SVS connection in this solution.

SCVMM also polls the VSM once every thirty minutes. This means that you must force a refresh if you want to see the changes from the VSM appear on SCVMM immediately.  

Verify that the Provider is Installed

The provider for Hyper-V is similar to the plugin for N1KV on ESXi. The difference is that there is no unique provider for each VSM. You only need to run the provider install once. This populates SCVMM with the information that is needed in order to understand how to talk to the VSM.

The provider is registered in the Windows Registry. You can search for VSEM in the registry, or navigate to this location:


If you are in a position where you cannot delete the provider, then you can delete the entry in the registry and restart the SCVMM service.

Note the location for the module in the registry entry. The provider Dynamic Link Library (DLL) should be installed in c:\Program Files\Cisco\Nexus1000V, along with a PowerShell script that is used in order to install the provider. Ensure that the DLL is present.

Note: If the DLL is corrupted, you must remove it and re-install. 

Uninstall/Reinstall the Provider with the Control Panel

An uninstall of the provider is completed via a program uninstall from the Hyper-V Server 2012 control panel. In order to reinstall the provider, double-click the provider installer. 

Check the Extension Compliance

Ensure that the provider extension is active and compliant in SCVMM. Navigate to Settings > Configuration Providers. Verify that the Cisco Systems Nexus 1000V Extension is active. This means that the extension is used by SCVMM.

  

Verify Connectivity Between VSM and SCVMM

SCVMM talks to the VSM, so you must troubleshoot from the SCVMM host.

Verify that:

  • You can ping the VSM from the SCVMM host.
  • You can browse via a web browser to the N1KV Application Programming Interface (API).

If you cannot ping the VSM, then verify the Windows firewalls and check for network connectivity issues. There is no requirement that the VSM and SCVMM must be on the same subnet.

In order to verify the API, use Internet Explorer (IE) and browse to the VSM REST API with this string: http://<vsm-ip>/api/n1kv.

You should receive this output:

If you cannot reach the API, then verify that:

  • There are no internet proxies configured on the SCVMM host. SCVMM inherits proxies if they are defined in IE. Check the Internet Settings in IE in order to verify whether a proxy is defined. You might be required to add an exception for the VSM.
  • The web server and API are accessible on the VSM. Verify that the http-server is enabled on the VSM, and check whether any firewalls block port 80 traffic.

Note: Currently the VSM processes API calls for HTTP or HTTPS, but SCVMM is limited to HTTP only.
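
The browser check can also be scripted. This Python sketch builds the API URL given above and requests it with proxies explicitly bypassed, so that a proxy inherited from the system settings does not mask the result. The function names and the timeout value are illustrative assumptions. An HTTP error status (for example, a request for credentials) is still reported as reachable, because it proves that the web server on the VSM answered.

```python
import urllib.error
import urllib.request


def vsm_api_url(vsm_ip):
    """Build the REST API URL described in the text (HTTP on port 80)."""
    return "http://{}/api/n1kv".format(vsm_ip)


def probe_vsm_api(url, timeout=5.0):
    """Return (reachable, detail). Proxies are bypassed on purpose."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return True, "HTTP {}".format(response.status)
    except urllib.error.HTTPError as exc:
        # The web server answered, so the network path and the port are fine.
        return True, "HTTP {}".format(exc.code)
    except (urllib.error.URLError, OSError) as exc:
        # Connection refused, timeout, or name resolution failure.
        return False, str(exc)
```

Run the probe from the SCVMM host, since that is the side that initiates the connection. If the probe succeeds but SCVMM still cannot connect, the proxy settings on the SCVMM host are the most likely difference.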

Virtual Ethernet Module (VEM) Issues

N1KV on Hyper-V uses L3 control only. There is no way to control Hyper-V with L2 control. The configuration of L3 control on Hyper-V is much easier than a similar configuration on VMware. There is no need to dedicate a NIC to the VEM; the VSM talks directly to the Hyper-V Server 2012 management NIC. There is no requirement that the management NIC must be attached to the VEM, which means that you do not need a special veth port-profile for the L3 control.

The installation of the VEM is also much easier. There is no VMware Update Manager (VUM) component with SCVMM. The ability to install extension components is built directly into SCVMM. If the VEM is not installed on the Hyper-V host, then SCVMM copies and installs the VEM on the target Hyper-V host automatically. If you want to manually install the VEM, it is a simple double-click of the VEM installer application on the host. Uninstall is also a simple program remove from the Control Panel. 

Hyper-V Host does Not Install to N1KV

A common error you might encounter is that a Hyper-V host is not added to the N1KV through SCVMM. There are multiple verifications that must be made in order to troubleshoot this issue.

Here is a typical error you might see in SCVMM when the VEM install fails:

Check for Old Network Teams on the Hyper-V Host

There might be an old team from another N1KV on the Hyper-V host. If so, you must delete the old team before you add the host to the N1KV. On the Hyper-V host, run Powershell and enter the Get-NetSwitchTeam command. If an old team appears, then you must remove it with the Remove-NetSwitchTeam command.

PS C:\> Get-NetSwitchTeam

Name: HPV7b9901d8-70b8-4063-b60e-bcd6679384f7 <<<< Logical Switch name is "HPV"

Members: Ethernet

PS C:\> Remove-NetSwitchTeam -Name HPV7b9901d8-70b8-4063-b60e-bcd6679384f7
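
If you check several hosts, it can help to turn the pasted Get-NetSwitchTeam output into ready-to-run removal commands. This Python sketch extracts the team names and builds one Remove-NetSwitchTeam command per team. The function names and the keep parameter are illustrative assumptions; the script cannot tell which teams are stale, so review the list and pass any team that is still in use in keep.

```python
import re

# Matches "Name: <team>" or "Name   : <team>" at the start of a line.
_NAME_LINE = re.compile(r"^\s*Name\s*:\s*(\S+)", re.MULTILINE)


def switch_team_names(get_netswitchteam_output):
    """Extract team names from pasted Get-NetSwitchTeam output."""
    return _NAME_LINE.findall(get_netswitchteam_output)


def removal_commands(team_names, keep=()):
    """Build one Remove-NetSwitchTeam command per team that is not in keep."""
    return [
        "Remove-NetSwitchTeam -Name {}".format(name)
        for name in team_names
        if name not in keep
    ]
```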

 

The Maximum Transmission Unit (MTU) of the NICs and the N1KV do Not Match

MTU settings in Hyper-V are set per NIC through the NIC settings. When you create a team, MS mandates that the MTU settings of all the NICs in the team are identical. 

There are two ways to set and verify MTU settings. The first is through the Network Adapter settings. The second is to use PowerShell. Here is an example that illustrates the use of PowerShell in order to get and set the MTU setting at the same time:

PS C:\Program Files (x86)\cisco\Nexus1000V> Get-NetAdapterAdvancedProperty -RegistryKeyword *jumbo* -Name "<adapter name>" | Set-NetAdapterAdvancedProperty -RegistryValue <mtu value>

New Configuration does Not Work Due to Stale/Old N1KV Configuration

You might encounter an issue where there is a stale N1KV configuration on the Hyper-V host that does not allow it to be added to the new configuration. Usually when you delete the old N1KV from SCVMM or Hyper-V Manager, it cleans up the configuration. However, there might be a case where you must check and delete the old N1KV configuration from the Hyper-V host registry.

Enter the Regedit command, and delete the N1KV configuration at this location:

HKEY_LOCAL_MACHINE > SYSTEM > CurrentControlSet > Services > VMSMP > Parameters > SwitchList

After you delete the registry entry, clean up via the Hyper-V Manager and reboot. 

Required Drivers are Not Found Error

You might receive an error that the required drivers or MSI are not found when you attempt to add a Hyper-V host to the N1KV. Here is a sample of the error from the Jobs Window:

This usually means that the N1KV VEM code does not exist on the SCVMM server. The SCVMM server must verify the extension that is installed on the Hyper-V host. Even if the VEM code is already installed on the Hyper-V host, the N1KV VEM installer must be copied to a directory on the SCVMM server.

Verify that the N1KV VEM installer is copied to C:\ProgramData\Switch Extension Drivers on the SCVMM server. If it does not exist, then copy the file to the directory, and add the Hyper-V host to the N1KV.

VEM Module does Not Appear on the VSM

In this case, everything appears to work in SCVMM, but the module never appears on the VSM. This rarely happens with Hyper-V, since the configuration is so simple. When it does happen, there are a few simple things to try.

Restart the N1KV Process on the Hyper-V Host

Use the Task Manager or Services in order to restart the N1KV process on the Hyper-V host that presents the problem. 

In Task Manager, locate the N1KV service, right-click it, and select Restart.


The VEM Team is Not Created Correctly

When you create the logical switch in SCVMM, you can choose either No Team or Team. With the N1KV, you must always choose Team, even if you have only one NIC attached.

Here is a screenshot that illustrates where to set the Team setting for the logical switch:

All the VETH Ports are Down After a Host-Reboot

Hyper-V is very capable when a host with running VMs is rebooted; if it sees that VMs are powered-on and that an administrator issued a reboot, it pauses the state of the VMs and reboots. When the system comes back online, it attempts to bring the VMs back online as soon as possible. This assumes that you did not live-migrate all the VMs off the host before the reboot.

The problem is that Hyper-V brings the VMs back online before the VEM process actually starts. The workaround is to set the VMs with an autostart delay. Engineering recommends that a thirty-second delay is used in order to allow the VEM and VSM to communicate before Hyper-V tries to resume/power-on all of the VMs.
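
One way to apply the delay to many VMs is to generate the PowerShell commands from a list of VM names. This Python sketch prints one Set-VM command per VM with the AutomaticStartDelay parameter of the Hyper-V PowerShell module. The function name is an illustrative assumption, and the thirty-second default comes from the recommendation above. Review the generated commands before you run them on the Hyper-V host.

```python
def autostart_commands(vm_names, delay_seconds=30):
    """Build Set-VM commands that set an automatic start delay per VM."""
    if delay_seconds < 0:
        raise ValueError("delay_seconds must not be negative")
    commands = []
    for name in vm_names:
        # Single-quote the name for PowerShell; a literal ' is doubled.
        quoted = "'{}'".format(name.replace("'", "''"))
        commands.append(
            "Set-VM -Name {} -AutomaticStartDelay {}".format(quoted, delay_seconds)
        )
    return commands


if __name__ == "__main__":
    # Hypothetical VM names, for illustration only.
    for command in autostart_commands(["web01", "db01"]):
        print(command)
```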

 

Unable to Find Compliant Switch Error

When you attempt to create or move a VM to the N1KV, or to live-migrate a VM from one host to another, you might receive this error:

This is a warning message more than an error. Even though it appears as an error in the job screen, it does not indicate that something is severely broken. The issue is that SCVMM tries to keep a compliant state between itself, the VSM, and the VEM. For some reason, SCVMM occasionally thinks that the state is out of sync, and determines that for certain hosts the N1KV is non-compliant. The compliance of the individual hosts is monitored under Fabric > Logical switches > <your N1KV logical switch>.

Click the Hosts button on the ribbon at the top of the screen:

If the host is non-compliant, then you must attempt to remediate the host. Select the host that is out of compliance, and click on the Remediate button at the top of the screen. This triggers SCVMM to sync the data between itself, the VSM, and the VEM module. After a few minutes, the state changes to Compliant, and you do not see any errors.

Note: The compliance state does not always immediately update to Compliant. Wait a minute or two and try again if it does not work.

Other Issues and Useful Commands

This section describes several miscellaneous issues and useful commands for N1KV on Hyper-V. 

VSM Cannot be Nested on Hyper-V

You cannot currently run the VSM on a nested (virtual) Hyper-V host, although this works on ESXi. Engineering is aware of the issue, but it is low priority at the moment, so be aware of that restriction. However, you can run the VSM on a nested ESXi host, so that is one possible workaround.

Virtual Machine Queuing (VMQ)

VMQ is almost identical to the VMware Virtual Machine Device Queue (VMDQ). VMQ requires that the physical NIC supports VMQ. The NIC creates a network queue for each VM on the system, which allows the network traffic to flow directly from the hypervisor to the VM. This improves network performance for the VMs.

Note: In order to use VMQ, the physical NIC on the system must support VMQ/VMDQ. Current Cisco VIC adapters do NOT support VMQ/VMDQ.

Powershell Commands Used to Check VMQ

There are two useful commands used in order to check for VMQ information through Powershell on the Hyper-V host:

  • Get-NetAdapterVmq
  • Get-NetAdapterVmqQueue

Use of Vemcmd Commands in Order to Check VMQ

This is the primary command used in order to display information about VETHs for which queue(s) have been allocated:

>vemcmd show vmq allocation
  LTL   VSM Port  Phy LTL  Queue id  Team queue id
   49     Veth13       17         1             49
                       18         2             49
   50     Veth14       17         2             50
                       18         3             50
   51     Veth16       19         1             51
                       20         1             51
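
If you need to compare allocations across hosts, the output can be parsed into a per-VETH structure. This Python sketch assumes the column layout shown above, where the first row for a VETH has five fields and each continuation row has three; the function name is illustrative, and other software releases might print the table differently.

```python
def parse_vmq_allocation(output):
    """Map each VSM port to a list of (phy_ltl, queue_id, team_queue_id)."""
    allocations = {}
    current = None
    for line in output.splitlines():
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # prompt, header, or blank line
        if len(fields) == 5:
            # First row for a VETH: LTL, VSM port, then the three numbers.
            current = fields[1]
            numbers = fields[2:]
        elif len(fields) == 3 and current is not None:
            # Continuation row: another physical NIC for the same VETH.
            numbers = fields
        else:
            continue
        allocations.setdefault(current, []).append(tuple(int(n) for n in numbers))
    return allocations
```

A VETH that is missing from the result has no queue allocated on any physical NIC.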

Use of Vemcmd in Order to View VMQ Resources

This command displays information about VMQ-enabled Physical NICs:

>vemcmd show vmq resources
  LTL   VSM Port  Max queues  Free queues
   17     Eth3/1          16           10
   18     Eth3/2          16           10
   19     Eth3/3           8            7
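
The resources table lends itself to a simple check for physical NICs that are close to queue exhaustion. This Python sketch parses the four columns shown above and reports the ports whose free queue count falls below a threshold. The function names and the threshold are illustrative assumptions.

```python
def parse_vmq_resources(output):
    """Parse 'vemcmd show vmq resources' rows into dictionaries."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have exactly four fields and start with the numeric LTL.
        if len(fields) == 4 and fields[0].isdigit():
            rows.append({
                "ltl": int(fields[0]),
                "port": fields[1],
                "max_queues": int(fields[2]),
                "free_queues": int(fields[3]),
            })
    return rows


def nics_low_on_queues(rows, min_free=1):
    """Return the ports that have fewer than min_free free queues."""
    return [row["port"] for row in rows if row["free_queues"] < min_free]
```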

Useful PowerShell Commands

There are several Powershell commands that pull or push data into the VSM. This allows you to script installation and orchestration of VMs to the N1KV. It also allows you to pull more detailed information that shows relationships between SCVMM and N1KV objects.

Use Powershell from SCVMM

You must ensure that you use a PowerShell session that has the SCVMM plugins loaded. The easiest way to accomplish this is to launch PowerShell from the SCVMM console.

The Get-SCPortClassification Command

This command is used in order to view the link between an SCVMM port-classification and the N1KV port-profile to which it is linked:

PS C:\Users\Administrator.HYPERV> Get-SCPortClassification

Name              : NexusNoRestrict-2
Description       :
ServerConnection  : Microsoft.SystemCenter.VirtualMachineManager.
                    Remoting.ServerConnection
ID                : 9f8819c1-8b53-42bd-a6fd-0173804e3194
IsViewOnly        : False
ObjectType        : PortClassification
MarkedForDeletion : False
IsFullyCached     : True

The Get-SCVirtualNetworkAdapterExtensionPortProfile Command

This command is used in order to view information about the uplink port-profile:

PS C:\Users\Administrator.HYPERV> Get-SCVirtualNetworkAdapterExtensionPortProfile

Name                    : NoRest-unicast-norest
ExternalId              : 308ad66b-7c42-4067-90af-13f7a6e59afe
NetworkEntityAccessType : ExternallyManaged
VirtualSwitchExtension  : n1kv-test
Tags                    : {}
AllowedVNicType         : Both
MaxNumberOfPorts        : 32
MaxNumberOfPortsPerHost : 216
ProfileData             : 0
ServerConnection        : Microsoft.SystemCenter.VirtualMachineManager.
                          Remoting.ServerConnection
ID                      : 8934a01c-0cb7-4ee2-ae9d-21ff5b26568f
IsViewOnly              : False
ObjectType              : VirtualSwitchExtensionVirtualPortProfile
MarkedForDeletion       : False
IsFullyCached           : True

The Get-SCConfigurationProvider Command

This command is used in order to view information about the Provider Extensions loaded on the SCVMM server:

PS C:\Users\Administrator.HYPERV> Get-SCConfigurationProvider

Name              : Cisco Systems Nexus 1000V
Type              : VirtualSwitchExtensionManager
Description       : Provider for Cisco Systems Nexus 1000V
                    Virtual Switch Extension Manager
LatestVersion     : 1.0
PublishDate       :
Publisher         : Cisco Systems, Inc.
Manufacturer      : Cisco Systems, Inc.
Model             : {Nexus 1000V}
Error             :
ServerConnection  : Microsoft.SystemCenter.VirtualMachineManager.
                    Remoting.ServerConnection
ID                : 22a8f431-b5fe-4ee8-a0f5-9b5a99f723f2
IsViewOnly        : False
ObjectType        : ConfigurationProvider
MarkedForDeletion : False
IsFullyCached     : True

Vemcmd and Vemlog Locations

VEM commands are available at C: > Program Files (x86) > Cisco > Nexus1000V.

Verify Physical Adapter Inventory through the Registry

In order to verify the physical adapter connectivity to the N1KV in the registry, access this registry location:

  • Registry Hive: HKEY_LOCAL_MACHINE > SYSTEM > CurrentControlSet
  • Registry Key: Services > Nexus1000V > Parameters > HostPhyAdapters

Cannot Delete N1KV Objects Due to Temporary Template

You might encounter this issue if you implemented templates and built VMs through the SCVMM Service Template application, and allowed self-service users to create their own VMs. This temporary template is not a viewable object through SCVMM. You must use the SCVMM PowerShell in order to delete the temporary template with this command:

 Get-SCVMTemplate | where {$_.Name -like "Temporary*"} | Remove-SCVMTemplate 

VMs Assigned to N1KV Receive Logical Switch Compliance Errors

Sometimes compliance errors are just a function of the way SCVMM operates. The N1KV might be fully compliant in SCVMM, but you still receive compliance errors.

You might also receive this message, where you are not allowed to choose or modify any network settings for a VM:

This occurs when one of the nodes of the MS Cluster has problems. SCVMM discovers that not all of the nodes are in compliance, and does not allow you to make changes until you remove or fix the node with the problem. This is expected behavior in SCVMM.

In order to determine which node has the problems, use SCVMM or Failover Cluster Manager, and fix the problem node. If you cannot fix the node, then you must remove it from the cluster or pause it. Once that is complete, you have the ability to add VMs to the N1KV and modify them.
