
Cisco Nexus 5000 Series Switches

Cisco NX-OS Software Virtual PortChannel: Fundamental Concepts 5.0.


Contents

Virtual PortChannel Technology

vPC Basics

vPC Peer Link

vPC Peer-Keepalive or Fault-Tolerant Link

vPC Ports, and Orphaned Ports

vPC Topology with Fabric Extenders

Traffic Flows

Dual-Control Plane with Single Layer 2 Node Behavior

Link Aggregation Group Identifier

System ID in a vPC System

Primary and Secondary vPC Roles

Spanning Tree

Cisco Discovery Protocol

Cisco Fabric Services over Ethernet Synchronization Protocol

vPC Configuration Changes When the Peer Link Fails

Peer Configuration Check Bypass (for Cisco Nexus 5000 Series running an NX-OS release earlier than 5.0(2)N1(1))

vPC Reload Restore

vPC Configuration Consistency

vPC Consistency Checks

vPC Configuration Synchronization

Duplicate Frames Prevention in vPC

vPC and Object Tracking

In-Service Software Upgrade and vPC

Interactions Between vPC and Routing

HSRP Gateway Considerations

HSRP Configuration and Best Practices for vPC

ARP Synchronization

Peer Gateway

Layer 3 Link Between vPC Peers

Layer 3 Link to the Core

Interactions with Multicast

IGMP Snooping and vPC

Protocol Independent Multicast and vPC

vPC Failure Scenarios

vPC Member Port Failure

vPC Complete Dual-Active Failure (Double Failure)

vPC Peer-Link Failure

vPC Peer-Keepalive Failure

Examples

vPC with Fabric Extender Active-Active Design

vPC Configuration Best Practices

vPC Domain Configuration

vPC Role and Priority

Reload Restore

Peer Gateway

vPC Peer Link

vPC Peer Keepalive

vPC Ports

LACP

For More Information


Virtual PortChannel Technology

Virtual PortChannels (vPCs) allow links that are physically connected to two different Cisco® switches to appear to a third downstream device to be coming from a single device and as part of a single PortChannel. The third device can be a switch, a server, or any other networking device that supports IEEE 802.3ad PortChannels.

Cisco NX-OS Software vPCs and Cisco Catalyst® Virtual Switching Systems (VSS) are similar technologies. For Cisco EtherChannel technology, the term “multichassis EtherChannel” (MCEC) refers to either technology interchangeably.

vPC allows the creation of Layer 2 PortChannels that span two switches. At the time of this writing, vPC is implemented on the Cisco Nexus® 7000 and 5000 Series platforms (with or without Cisco Nexus 2000 Series Fabric Extenders).

vPC Basics

The fundamental concepts of vPC are described at http://www.cisco.com/en/US/prod/collateral/switches/ps9441/ps9402/white_paper_c11-516396.html.

vPCs consist of two vPC peer switches connected by a peer link. Of the vPC peers, one is primary and one is secondary. The system formed by the switches is referred to as a vPC domain.

Following is a list of some possible Cisco Nexus vPC topologies:

vPC on the Cisco Nexus 7000 Series (topology A): This topology consists of access layer switches dual-homed to the Cisco Nexus 7000 Series with a switch PortChannel with Gigabit Ethernet or 10 Gigabit Ethernet links. This topology can also consist of hosts connected with virtual PortChannels to each Cisco Nexus 7000 Series Switch.

vPC on Cisco Nexus 5000 Series (topology B): This topology consists of switches dual-connected to the Cisco Nexus 5000 Series with a switch PortChannel with 10 Gigabit Ethernet links, with one or more links to each Cisco Nexus 5000 Series Switch. Like topology A, topology B can consist of servers connected to each Cisco Nexus 5000 Series Switch via virtual PortChannels.

vPC on the Cisco Nexus 5000 Series with a Cisco Nexus 2000 Series Fabric Extender single-homed (also called straight-through mode) (topology C): This topology consists of a Cisco Nexus 2000 Series Fabric Extender single-homed with one to eight 10 Gigabit Ethernet links (depending on the fabric extender model) to a single Cisco Nexus 5000 Series Switch, and of Gigabit Ethernet or 10 Gigabit Ethernet-connected servers that form virtual PortChannels to the fabric extender devices. Note that each fabric extender connects to a single Cisco Nexus 5000 Series Switch and not to both, and that the virtual PortChannel can be formed only by connecting the server network interface cards (NICs) to two fabric extenders, where fabric extender 1 depends on Cisco Nexus 5000 Series Switch 1 and fabric extender 2 depends on Cisco Nexus 5000 Series Switch 2. If both fabric extender 1 and fabric extender 2 depend on switch 1, or both depend on switch 2, the PortChannel cannot be formed.

Dual-homing of the Cisco Nexus 2000 Series Fabric Extender (topology D): This topology is also called Cisco Nexus 2000 Series Fabric Extender (FEX for short) Active/Active. In this topology each FEX is connected to each Cisco Nexus 5000 Series device with a virtual PortChannel. With this topology, the server cannot create a PortChannel split between two fabric extenders. The servers can still be dual-homed with active-standby or active-active transmit-load-balancing (TLB) teaming.

Note: Topologies B, C, and D are not mutually exclusive. You can have an architecture that uses these three topologies concurrently.

Figure 1 illustrates topologies A and B. Figure 2 illustrates topologies C and D.

Figure 1. vPC Topologies A and B

Figure 2. vPC Topologies C and D

Figure 3 illustrates the main vPC components. Switches 1 and 2 are the vPC peer switches. The vPC peer switches are connected through a link called a peer link, also known as a multichassis EtherChannel trunk (MCT).

Figure 3 shows devices (switch 3, switch 4, and server 2) that are connected to the vPC peers (which could be Cisco Nexus 7000 or 5000 Series Switches). Switches 3 and 4 are configured with a normal PortChannel configuration, while switches 1 and 2 are configured with a virtual PortChannel.

Figure 3. vPC Components

vPC Peer Link

The vPC peer link is the most important connectivity element in the vPC system. This link is used to create the illusion of a single control plane by forwarding Bridge Protocol data units (BPDUs) or Link Aggregation Control Protocol (LACP) packets to the primary vPC switch from the secondary vPC switch.

The peer link is used to synchronize MAC addresses between aggregation groups 1 and 2 and to synchronize IGMP entries for the purpose of IGMP snooping. It also provides the necessary transport for multicast traffic and for traffic to and from orphaned ports. The term “orphaned ports” refers to switch ports connected to single-attached hosts, or vPC ports whose members are all connected to a single vPC peer.

In the case of a vPC device that is also a Layer 3 switch, the peer link also carries Hot Standby Router Protocol (HSRP) frames.

For a vPC to forward a VLAN, that VLAN must exist on the peer link and on both vPC peers, and it must appear in the allowed list of the switch port trunk for the vPC itself. If either of these conditions is not met, the VLAN is not displayed when you enter the command show vpc brief, nor is it a vPC VLAN.
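The rule above amounts to a set intersection. The following Python sketch is only an illustration of that rule, not of how NX-OS implements it; the function name, parameter names, and VLAN numbers are hypothetical.

```python
def vpc_forwarding_vlans(peer_link_allowed, local_vlans, peer_vlans, vpc_trunk_allowed):
    """Return the VLANs a vPC can forward under the rule described above.

    A VLAN qualifies only if it exists on the peer link and on both vPC
    peers, and is also in the allowed list of the vPC trunk itself.
    """
    exists_everywhere = set(peer_link_allowed) & set(local_vlans) & set(peer_vlans)
    return sorted(exists_everywhere & set(vpc_trunk_allowed))


# Hypothetical example: VLAN 60 is missing on the peer switch and
# VLAN 70 is missing on the peer link, so neither is a vPC VLAN.
active = vpc_forwarding_vlans(
    peer_link_allowed={23, 50, 60},
    local_vlans={23, 50, 60, 70},
    peer_vlans={23, 50, 70},
    vpc_trunk_allowed={23, 50, 60, 70},
)
print(active)  # [23, 50]
```

In this example only VLANs 23 and 50 satisfy every condition, so only those would be expected in the active VLAN list for the vPC.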

When a PortChannel is defined as a vPC peer link, Bridge Assurance is automatically configured on the peer link.

vPC Peer-Keepalive or Fault-Tolerant Link

A routed “link” (it is more accurate to say “path”) is used to resolve dual-active scenarios in which the peer link connectivity is lost. This link is referred to as a vPC peer-keepalive or fault-tolerant link. The peer-keepalive traffic is often transported over the management network through the management 0 port of the Cisco Nexus 5000 Series Switch or the management 0 ports on each Cisco Nexus 7000 Series supervisor. The peer-keepalive traffic is typically routed over a dedicated Virtual Routing and Forwarding (VRF) instance (which could be the management VRF, for example).

The keepalive can be carried over a routed infrastructure; it does not need to be a direct point-to-point link, and, in fact, it is desirable to carry the peer-keepalive traffic on a different network instead of on a straight point-to-point link.

vPC Ports, and Orphaned Ports

A vPC port is a port that is assigned to a vPC channel group. The ports that form the virtual PortChannel are split between the vPC peers and are referred to as vPC member ports.

A non-vPC port, also known as an orphaned port, is a port that is not part of a vPC.

Figure 4 shows different types of ports connected to a vPC system. Switch 1 and Host 3 connect via vPCs. The ports connecting devices in a non-vPC mode to a vPC topology are referred to as orphaned ports. Switch 2 connects to the Cisco Nexus Switch with a regular spanning-tree configuration: thus, one link is forwarding, and one link is blocking. These links connect to the Cisco Nexus Switch with orphaned ports.

Figure 4. vPC Ports and Orphan Ports

Server 6 connects to a Cisco Nexus Switch with an active-standby teaming configuration. The ports that server 6 connects to on the Cisco Nexus Switch are orphaned ports.

vPC Topology with Fabric Extenders

Figure 5 illustrates another vPC topology consisting of Cisco Nexus 5000 Series Switches and Cisco Nexus 2000 Series Fabric Extenders (in straight-through mode: that is, each fabric extender is single-attached to a Cisco Nexus 5000 Series Switch).

Figure 5 shows devices that are connected to the vPC peer (Cisco Nexus 5000 Series Switches 5k01 and 5k02) with a PortChannel (a vPC); for example, server 2, which is configured for NIC teaming with the IEEE 802.3ad option.

Servers 1 and 3 connect to orphan ports.

Figure 5. vPC Components with the Fabric Extender (FEX)

To summarize, a vPC system consists of the following components:

Two peer devices: the vPC peers, of which one is primary and one is secondary; both are part of a vPC domain

A Layer 3 Gigabit Ethernet link called a peer-keepalive link to resolve dual-active scenarios

A redundant 10 Gigabit Ethernet PortChannel called a peer link which is used to carry traffic from one system to the other when needed and to synchronize forwarding tables

vPC member ports forming the virtual PortChannel

Traffic Flows

vPC configurations are optimized to help ensure that traffic through a vPC-capable system is symmetric. In Figure 6, for example, the flow on the left (in blue) reaching a Cisco Nexus switch (Agg1 in the figure) from the core is forwarded toward the access layer switch (Acc1 in the figure) without traversing the peer Cisco Nexus switch device (Agg2). Similarly, traffic from the server directed to the core reaches a Cisco Nexus Switch (Agg1), and the receiving Cisco Nexus Switch routes this traffic directly to the core without unnecessarily passing it to the peer Cisco Nexus device. This process occurs regardless of which Cisco Nexus device is the primary HSRP device for a given VLAN.

Figure 6. Traffic Flows with vPC

Dual-Control Plane with Single Layer 2 Node Behavior

While still operating with two separate control planes, vPC helps ensure that the neighboring devices connected in vPC mode see the vPC peers as a single spanning-tree and LACP entity. For this to happen, the system has to perform IEEE 802.3ad control-plane operations in a slightly modified way (which is not noticeable to the neighbor switch).

Link Aggregation Group Identifier

IEEE 802.3ad specifies the standard implementation of PortChannels. PortChannel specifications provide LACP as a standard protocol, which enables negotiation of port bundling.

LACP makes misconfiguration less likely, because if ports are mismatched, they will not form a PortChannel.

Consider example A in Figure 8, in which switch 1 connects to switch 2. Port 1 on switch 1 connects to port 4 on switch 2, and port 2 on switch 1 connects to port 6 on switch 2.

Now imagine that the administrator configured a PortChannel on switch 1 between ports 1 and 2, while on switch 2 the PortChannel is configured between ports 4 and 5. Without LACP, the ports could be configured in channel-group mode, and you would not discover that this is a misconfiguration until you notice that traffic has dropped.

LACP discerns that the only ports that can be bundled are port 1 going to port 4. According to the IEEE specifications, to allow LACP to determine whether a set of links connect to the same system and to determine whether those links are compatible with aggregation, you need to be able to establish two types of identification:

You need a globally unique identifier for each system that participates in link aggregation (that is, the switch itself needs to be unique). This number is referred to as the system ID and is composed of a priority and a MAC address that uniquely identifies the switch. Figure 7 illustrates the system ID.

You need a means of identifying a link aggregation group.

For more information, please refer to the IEEE 802.3ad standard, Amendment to Carrier Sense Multiple Access with Collision Detection (CSMA/CD) Access Method and Physical Layer Specifications - Aggregation of Multiple Link Segments.

Figure 7. Components of the System ID

In Figure 8, switch 1 announces ports 1 and 2 as part of the same aggregation group, and similarly switch 2 announces ports 4 and 5 as part of the same aggregation group. Because ports 3 and 6 are not part of the group, they cannot be bundled with the PortChannel.

Example A in Figure 8 shows an extreme case in which the PortChannel consists of an individual port only. If the negotiation had failed between switches 1 and 2, the links would still operate as normal, individual IEEE 802.3 links.

Switches 1 and 2 decide which ports can be bundled together based on the link aggregation group identifier (LAGID). This number includes the system identifier (in other words, an ID for the physical switch) and a key that identifies the aggregation group itself (that is, the equivalent of the channel group number).

As a first approximation, the LAGID is composed of the system ID of both systems and the channel group number used in both systems.

Example B in Figure 8 shows the case in which the ports are correctly wired. Assuming that the ports on switch 1 (system ID 1) are bundled as channel group 100 and the ports on switch 2 (system ID 2) are bundled as channel group 200, the LAGID would appear to be as follows: [1, 100, 2, 200].

Example C in Figure 8 illustrates how the PortChannel is requested between switch 2 and two separate upstream switches, switches 1 and 3, where switch 1 and 3 form a vPC system.

Without vPC, the system ID for switch 1 would differ from the system ID for switch 3 because the MAC addresses of the two switches are different, and switch 2 would not bundle the two links.

With vPC, switches 1 and 3 present an identical system ID, so switch 2 believes it is connected to a single upstream device.

Figure 8. LACP Behavior with Various Wiring Configurations
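The bundling decision described for examples A and C can be sketched in Python as follows. This is a deliberately simplified model that follows the "first approximation" of the LAGID given above: system priorities, port priorities, and LACP timers are ignored, and all names (`Port`, `bundles`, the system ID strings, the channel group numbers) are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Port(NamedTuple):
    system_id: str          # system ID advertised by the switch that owns the port
    key: Optional[int]      # channel group number, or None if the port is not in one
    number: int


def bundles(links):
    """Group links by LAGID: (local system, local key, remote system, remote key)."""
    groups = defaultdict(list)
    for local, remote in links:
        if local.key is None or remote.key is None:
            continue  # one end is not in a channel group, so the link stays individual
        lagid = (local.system_id, local.key, remote.system_id, remote.key)
        groups[lagid].append((local.number, remote.number))
    return dict(groups)


# Example A: port 1 -> port 4 and port 2 -> port 6; port 6 is not in the channel group.
example_a = bundles([
    (Port("sw1", 100, 1), Port("sw2", 200, 4)),
    (Port("sw1", 100, 2), Port("sw2", None, 6)),
])

# Example C, seen from switch 2: one uplink to switch 1 and one to switch 3.
without_vpc = bundles([
    (Port("sw2", 200, 1), Port("sw1", 100, 1)),
    (Port("sw2", 200, 2), Port("sw3", 100, 1)),
])
with_vpc = bundles([
    (Port("sw2", 200, 1), Port("vpc-domain", 100, 1)),
    (Port("sw2", 200, 2), Port("vpc-domain", 100, 1)),
])

print(example_a)
print(len(without_vpc), len(with_vpc))
```

In this model, example A yields a single bundle containing only the port 1 to port 4 link. For example C, two distinct upstream system IDs produce two separate LAGIDs, while a shared system ID places both uplinks under one LAGID.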

System ID in a vPC System

Spanning tree and LACP use the switch MAC address for, respectively, the bridge ID field in the spanning-tree BPDU and as part of the LACP LAGID. In a single chassis, they use the systemwide MAC address for this purpose. For systems that use vPCs, use of the systemwide MAC address would not work because the vPC peers need to appear as a single entity, as shown in example C in Figure 8. To meet this requirement, vPC offers both an automatic configuration and a manual configuration of the system ID for the vPC peers.

The automatic solution implemented by vPC consists of generation of a system ID composed of a priority and a MAC address, with the MAC derived from a reserved pool of MAC addresses combined with the domain ID specified in the vPC configuration. The domain ID is encoded in the last octet and the trailing 2 bits of the previous octet of the MAC address.

By configuring domain IDs to be different on adjacent vPC complexes (and to be identical on both peers of the same vPC complex), you help ensure the uniqueness of the system ID for LACP negotiation purposes. You also help ensure that the spanning-tree BPDUs use a MAC address that is representative of the vPC complex.
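The encoding described above (the domain ID occupying the last octet plus the trailing 2 bits of the previous octet, 10 bits in total) can be illustrated with a short Python sketch. The base address used here is a made-up placeholder, not Cisco's reserved pool, and the function name is hypothetical; the sketch only shows how two different domain IDs lead to two different MAC addresses.

```python
# Placeholder base address with the low 10 bits clear. This is NOT the
# reserved Cisco pool; it is an assumption made only for illustration.
HYPOTHETICAL_POOL_BASE = 0x02_00_00_00_00_00


def vpc_system_mac(domain_id: int, base: int = HYPOTHETICAL_POOL_BASE) -> str:
    """Place a 10-bit domain ID in the last octet and the 2 bits before it."""
    if not 0 <= domain_id < (1 << 10):
        raise ValueError("domain ID must fit in 10 bits")
    if base & 0x3FF:
        raise ValueError("base must leave the low 10 bits clear")
    value = base | domain_id
    # Render the 48-bit value as six colon-separated octets.
    return ":".join(f"{(value >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


print(vpc_system_mac(2))     # 02:00:00:00:00:02
print(vpc_system_mac(257))   # 02:00:00:00:01:01
```

Domain ID 257 does not fit in a single octet, so its high bit spills into the trailing bits of the fifth octet, which is the behavior the text describes.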

You can override the automatic generation of the system ID by using the command-line interface (CLI) and configuring the system ID on both vPC peers manually, as follows:

(config-vpc-domain)# system-mac <mac>

Primary and Secondary vPC Roles

In a vPC system, one vPC switch is defined as primary and one is defined as secondary, based on defined priorities. The lower number has higher priority, so it wins. Also, these roles are nonpreemptive, so a device may be operationally primary, but secondary from a configuration perspective.

To understand the operational role of a vPC member, you need to consider the status of the peer-keepalive link and the peer link.

When the two vPC systems are joined to form a vPC domain, the priority decides which device is the vPC primary and which is the vPC secondary. If the primary device were to reload, when the system comes back online and connectivity to the vPC secondary device (now the operational primary) is restored, the operational role of the secondary device (operational primary) will not change, to avoid unnecessary disruptions. This behavior is achieved with a sticky-bit method, whereby the sticky information is not saved in the startup configuration, thus making the device that is up and running win over the reloaded device. Hence, the vPC primary becomes the vPC operational secondary.

If the peer link is disconnected but the vPC peers are still connected through the vPC peer keepalive link, the vPC operational roles stay unchanged.

If both the peer link and peer-keepalive link are disconnected, both vPC peers become operational primary, but upon reconnection of the peer-keepalive link and the peer link, the vPC secondary device (operational primary) keeps the primary role, and the vPC primary becomes the operational secondary device.
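The nonpreemptive behavior for the initial election and the reload case can be sketched as a small state model. This Python sketch is an interpretation of the description above, not the NX-OS implementation: the `Peer` class, its fields, and the function names are hypothetical, priority ties are not modeled, and recovery from a dual-active condition is intentionally left out.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    role_priority: int          # lower number wins the initial election
    sticky: bool = False        # runtime-only flag; not saved across a reload
    operational: str = "none"


def form_domain(a: Peer, b: Peer):
    """Assign operational roles when two peers establish or re-establish adjacency."""
    holders = [p for p in (a, b) if p.sticky]
    if len(holders) == 2:
        raise ValueError("dual-active recovery is not modeled in this sketch")
    if len(holders) == 1:
        primary = holders[0]                                   # running device keeps its role
    else:
        primary = min((a, b), key=lambda p: p.role_priority)   # first election
    secondary = b if primary is a else a
    primary.operational, primary.sticky = "primary", True
    secondary.operational, secondary.sticky = "secondary", False
    return primary, secondary


def reload(peer: Peer) -> None:
    """A reload clears the sticky flag because it is not stored in the startup configuration."""
    peer.sticky = False
    peer.operational = "none"


def peer_lost(survivor: Peer) -> None:
    """The surviving device takes over as operational primary."""
    survivor.operational, survivor.sticky = "primary", True


sw1 = Peer("sw1", role_priority=100)
sw2 = Peer("sw2", role_priority=200)
form_domain(sw1, sw2)        # sw1 wins on priority
reload(sw1)
peer_lost(sw2)               # sw2 becomes operational primary
form_domain(sw1, sw2)        # sw1 returns, but sw2 keeps the primary role
print(sw1.operational, sw2.operational)
```

After the final step, sw1 is still the configured primary (lower priority number) but is operationally secondary, which matches the nonpreemptive behavior described in this section.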

Spanning Tree

vPC modifies the way in which spanning tree works on the switch to help ensure that the vPC peers in a vPC domain appear as a single spanning-tree entity on vPC ports. Also, vPC helps ensure that devices can connect to a vPC domain in a non-vPC fashion with classic spanning-tree topology. vPC is designed to support hybrid topologies. Depending on the Cisco NX-OS Software release, this can be achieved in slightly different ways.

In all Cisco NX-OS releases, the peer link is always forwarding because of the need to keep the MAC address tables and Internet Group Management Protocol (IGMP) entries synchronized.

vPC by default ensures that only the primary switch forwards BPDUs on vPCs. This modification is strictly limited to vPC member ports. As a result, the BPDUs that may be received by the secondary vPC peer on a vPC port are forwarded to the primary vPC peer through the peer link for processing.

Note: Non-vPC ports operate like regular spanning-tree ports. The special behavior of the primary vPC member applies only to ports that are part of a vPC.

Starting from Cisco NX-OS Releases 4.2(6) and 5.0(2), vPC allows the user to choose the peer-switch option. This option optimizes the behavior of spanning tree with vPC as follows:

The vPC primary and secondary are both root devices and both originate BPDUs

The BPDUs originated by both the vPC primary and the vPC secondary have the same designated bridge ID on vPC ports

The BPDUs originated by the vPC primary and secondary on non-vPC ports maintain the local bridge ID instead of the vPC bridge ID and advertise the Bridge ID of the vPC system as the root

The peer-switch option has the following advantages:

It reduces the traffic loss upon restoration of the peer link after a failure.

It reduces the disruption associated with a dual-active failure (whereby both vPC members become primary). Both devices keep sending BPDUs with the same bridge ID information on vPC member ports, which prevents errdisable from potentially disabling the PortChannel for an attached device.

It reduces the potential loss of BPDUs if the primary and secondary roles change.

Cisco Discovery Protocol

From the perspective of the Cisco Discovery Protocol, the presence of vPC does not hide the fact that the two Cisco Nexus Switches are two distinct devices, as illustrated by the following output:

tc-nexus5k01# show cdp neigh
Capability Codes: R - Router, T - Trans-Bridge, B - Source-Route-Bridge
                  S - Switch, H - Host, I - IGMP, r - Repeater,
                  V - VoIP-Phone, D - Remotely-Managed-Device,
                  s - Supports-STP-Dispute
Device-ID                      Local Intrfce  Hldtme  Capability  Platform   Port ID
tc-nexus7k01-vdc2(TBM12162254) Eth2/1         158     R S I s     N7K-C7010  Eth2/9
tc-nexus7k02-vdc2(TBM12193229) Eth2/2         158     R S I s     N7K-C7010  Eth2/9

Cisco Fabric Services over Ethernet Synchronization Protocol

The vPC peers use the Cisco Fabric Services protocol to synchronize forwarding-plane information and implement necessary configuration checks.

vPC peers must synchronize the Layer 2 forwarding table - that is, the MAC address information between the vPC peers. This way, if one vPC peer learns a new MAC address, that MAC address is also programmed on the Layer 2 forwarding table of the other peer device.

The Cisco Fabric Services protocol travels on the peer link and does not require any configuration by the user.

To help ensure that the peer link communication for the Cisco Fabric Services over Ethernet protocol is always available, spanning tree has been modified to keep the peer-link ports always forwarding.

The Cisco Fabric Services over Ethernet protocol is also used to perform compatibility checks that validate whether vPC member ports can form the channel, to synchronize the IGMP snooping status, to monitor the status of the vPC member ports, and to synchronize the Address Resolution Protocol (ARP) table (starting from Cisco NX-OS 4.2(6) and in future 5.0 releases).

If the peer link is disconnected between two vPC peers, the synchronization between vPC peers is interrupted, which may lead to traffic drop for multicast traffic and to flooding for unicast traffic.

vPC Configuration Changes When the Peer Link Fails

The correct sequence for setting up vPCs requires that the two participating vPC switches see each other over the peer link and that they can communicate over the vPC peer keepalive link.

Figure 9 illustrates this fundamental concept: the configuration depicted in Figure 9 (2) requires starting from Figure 9 (1). If you try to configure a vPC as in Figure 9 (2) without having established vPC peer-link and vPC peer-keepalive connectivity, the vPC ports will not go into the forwarding state.

Once you are in the state depicted in Figure 9 (1) (which is a fully formed vPC domain), peer-keepalive connectivity is not strictly required in order to create or modify vPCs. In other words, you can configure the vPC in Figure 9 (2) even if vPC peer-keepalive connectivity is lost after the configuration depicted in Figure 9 (1).

Functional vPC peer-keepalive connectivity is necessary for the correct behavior of vPC in the presence of failures, but from a traffic forwarding and configuration perspective, a temporary failure of the peer-keepalive link does not have any impact.

The vPC peer-link connectivity is more important for the correct traffic forwarding operations as well as for the ability to create or modify vPCs.

Figure 9. Set Up for vPC Domain, Peer Link, and vPC Member Ports

The initial implementation of vPC did not allow configuration changes while the peer link was disconnected, to avoid inconsistencies that could bring the links down when the peer switch reconnected. Thus, if the peer link was lost, interfaces on the vPC primary could not recover from a flap (that is, if they went down, they stayed down until the peer link was restored), and while the peer link was down, a new vPC member port could not be activated.

Example A in Figure 10 shows the case of a single vPC peer. The user cannot activate any vPC member port until a vPC peer switch is present (example B in Figure 10).

Figure 10. vPC Peers Must Be Connected for Interfaces to Be Activated

Because of this behavior, if the peer-link connection is lost, by default the user cannot add any vPC ports and activate them, nor can an interface flap. If a vPC interface flaps, the port will stay down after flapping.

For example, imagine a vPC setup with PortChannel 8 configured as vPC 8:

vPC status
----------------------------------------------------------------------------
id     Port        Status Consistency Reason                     Active vlans
------ ----------- ------ ----------- -------------------------- -----------
8      Po8         up     success     success                    23,50

After the peer-link failure, only the primary keeps the vPC interfaces up. If the interface associated with PortChannel 8 flaps, it does not come back up.

vPC status
----------------------------------------------------------------------------
id     Port        Status Consistency Reason                     Active vlans
------ ----------- ------ ----------- -------------------------- -----------
8      Po8         down   failed      Peer-link is down

To restore connectivity, you need to first restore the peer link, and then enter a shut/no shut command under the interface on the primary device.

Similarly, if you want to create a new vPC interface and to activate it, a vPC peer needs to be present and connected through the peer link. While a peer link is not connected, vPC prevents activation of any new vPC member port.

This vPC behavior may manifest itself in several scenarios:

A vPC pair in which the peer link is lost but the peer-keepalive is still connected (shown as case 1 in Figure 11)

A vPC pair in which peer link and peer-keepalive links are lost (split brain)

A switch that was part of a vPC but has been reloaded; upon coming online, the vPC peer is unavailable (shown as case 2 in Figure 11)

A vPC switch that has never been part of a vPC domain because it has just been powered up and configured for vPC, but no peer-link and peer-keepalive connectivity has been established yet

Figure 11. Failure Scenarios: Peer Link Loss (1) and Reload (2)

Case 1 has been addressed by the Cisco Nexus 7000 Series with Cisco NX-OS 4.2(3) (CSCsz67416). This solution allows the vPC device to shut/no shut existing vPC member ports as long as the vPC secondary device can still be reached through the peer-keepalive link. The same behavior on the Cisco Nexus 5000 Series requires configuration of the keyword peer-config-check-bypass under the vPC domain configuration, but this process is superseded by NX-OS 5.0(2)N1(1), which integrates CSCsz67416.

For case 1, to add new vPCs you need the reload restore command on the Cisco Nexus 7000 Series and on the Cisco Nexus 5000 Series running NX-OS 5.0(2)N1(1) or later. You can achieve the same results with peer-config-check-bypass on a Cisco Nexus 5000 Series switch running an earlier release.

The case of complete disconnection between the vPC primary and secondary devices (split brain) is addressed in the Cisco Nexus 7000 Series or Cisco Nexus 5000 Series running NX-OS 5.0(2)N1(1) or later by both CSCsz67416 (for existing vPC member ports) and reload restore (for new vPC member ports). The equivalent command on the Cisco Nexus 5000 Series for earlier releases is peer-config-check-bypass.

The third case, reload of a vPC device, is addressed with the vPC reload restore command.

The resolution of the fourth case is not currently contemplated in any Cisco NX-OS release. For a vPC port to be activated, the user is expected to first create a functional vPC configuration composed of two vPC peer switches as it is depicted in Figure 9 (1).

Peer Configuration Check Bypass (for Cisco Nexus 5000 Series running an NX-OS release earlier than 5.0(2)N1(1))

To override the default vPC behavior, which prevents activation of new vPC member ports when the peer link is down, you can use the command peer-config-check-bypass under the vPC domain configuration (on the Cisco Nexus 5000 Series only).

As an example:

vpc domain 2
role priority 100
peer-keepalive destination 10.51.35.18
peer-config-check-bypass

The peer-config-check-bypass command was introduced in Cisco NX-OS 4.1(3)N2(1), and it will be superseded by the vPC reload restore command when that command is available.

With this command in place, even if the Cisco Nexus 5000 Series peer link is down, a vPC member port can flap, and you can also create new vPC member ports and activate them.

Figure 12 illustrates the peer-config-check-bypass feature. You need to start from a configuration (1) in which the vPC peers are connected with the peer link. If the peer link fails, the vPC member ports stay active on the primary device (2). If a port flaps or if you add a new vPC member port, these ports can be activated (3).

Figure 12. The peer-config-check-bypass Feature

With this configuration in place and with the peer link down, you need to be cautious when adding new vPCs. The configurations need to be replicated on both devices to avoid a Type-1 inconsistency at peer-link restoration.

If the primary switch reloads, the machine restarts as if the vPC domain never existed: that is, you cannot activate existing vPC member ports or create and activate new vPCs until a functional redundant vPC configuration is put in place (which means until the primary device is connected to a vPC peer that is up and running).

vPC Reload Restore

The vPC reload restore feature was introduced with Cisco NX-OS 5.0(2). With vPC reload restore, when the peer link is down, or when both the peer link and the peer-keepalive link are lost, the vPC primary can activate new vPC ports.

If the vPC peer switch is brought online or connected to the existing vPC peer switch, the configuration checks are performed; if inconsistencies are found, the vPC ports are shut down.

In addition, with reload restore a vPC peer can reload, and after a reload, if no vPC peer exists (a condition that is detected by a timer), the user can create vPCs on the standalone vPC peer.

Upon reload, Cisco NX-OS starts a user-configurable timer (with a default of 240 seconds) that defines how long the standalone vPC device waits to detect a vPC peer. If the peer-link port comes up physically or if the peer keepalive is functional, the timer is stopped and the device waits for the peer adjacency to form. If at timer expiration no peer-keepalive- or peer-link-up packets have been received, Cisco NX-OS assumes the primary role. The software reinitializes the vPCs, activating its local ports. Because there are no peers, the consistency check is bypassed for the local vPC ports.
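The timer logic can be summarized in a few lines of Python. This is a sketch of the decision as described in this section, under the assumption that peer detection is reduced to a list of timestamped events; the function name, event labels, and return strings are hypothetical.

```python
PEER_EVENTS = {"peer-link-up", "peer-keepalive"}


def reload_restore_decision(events, timeout_s=240):
    """Decide what a reloaded vPC device does, given (seconds_since_boot, kind) events.

    240 seconds is the default timer value mentioned in the text; the
    event representation is an assumption made for this sketch.
    """
    for seconds, kind in sorted(events):
        if seconds < timeout_s and kind in PEER_EVENTS:
            # A sign of the peer arrived in time: stop the timer and wait
            # for the peer adjacency to form.
            return "wait-for-peer-adjacency"
    # Timer expired with no sign of the peer: take the primary role and
    # bring up the local vPC ports without a consistency check.
    return "assume-primary-and-bypass-consistency-check"


print(reload_restore_decision([]))
print(reload_restore_decision([(30, "peer-keepalive")]))
print(reload_restore_decision([(300, "peer-link-up")]))
```

The third call shows the case in which the peer shows up only after the timer has expired: in this model the device has already assumed the primary role by then.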

The following output shows the status of a virtual PortChannel configured on a standalone vPC system with reload restore:

----------------------------------------------------------------------
id Port Status Consistency Reason                     Active vlans
-- ---- ------ ----------- -------------------------- ------------
51 Po51 up     success     Type checks were bypassed  10-14,21-24
                           for the vPC                ,50,60

vPC Configuration Consistency

The Ethernet PortChannel capability allows links to be bundled to form a single entity if certain compatibility conditions are met. The following is a list of conditions that are verified before ports can form a regular PortChannel (this list refers to regular PortChannels, not vPCs specifically). Members must:

Have the same port mode configured

Have the same speed configured; if they are configured with speed AUTO, they have to negotiate the same speed when they become active, and if a member negotiates a different speed, it will be suspended

Have the same maximum transmission unit (MTU) value configured

Have the same duplex mode configured

Have the same Ethernet layer (switchport or no switchport) configured

Not be SPAN ports

Have the same storm control configured

Have the same flow control configured

Have common capabilities

Be switching ports (Layer 2)

Have the same port access VLAN

Have the same port native VLAN

Have the same port-allowed VLAN list

vPC Consistency Checks

Similar to regular PortChannels, virtual PortChannels are subject to consistency checks and compatibility checks. During a compatibility check, one vPC peer conveys configuration information to the other vPC peer to verify that vPC member ports can actually form a PortChannel. For example, if two ports that are going to join the channel carry a different set of VLANs, this is a misconfiguration.

Depending on the severity of the misconfiguration, vPC may either warn the user (Type-2 misconfiguration) or suspend the PortChannel (Type-1 misconfiguration). In the specific case of a VLAN mismatch, only the VLAN that differs between the vPC member ports will be suspended on all the vPC PortChannels.
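
As a rough illustration of the two reactions and of the VLAN-mismatch special case, the following Python sketch models the outcome of a check. The function names and return labels are hypothetical, and the model assumes that the VLANs compared are the allowed VLAN lists of the two vPC member ports.

```python
def react_to_inconsistency(kind):
    """Map the inconsistency type to the reaction described in the text."""
    reactions = {
        "type1": "suspend",  # Type-1: the PortChannel is suspended
        "type2": "warn",     # Type-2: the user is warned, ports stay up
    }
    return reactions[kind]


def vlan_mismatch_result(local_vlans, peer_vlans):
    """Split VLANs into those that stay active and those that are suspended.

    Only VLANs that differ between the two vPC member ports are suspended;
    VLANs configured on both sides keep forwarding.
    """
    local_vlans, peer_vlans = set(local_vlans), set(peer_vlans)
    return {
        "active": local_vlans & peer_vlans,
        "suspended": local_vlans ^ peer_vlans,
    }
```

With `vlan_mismatch_result({10, 20, 30}, {10, 20, 40})`, VLANs 10 and 20 remain active and VLANs 30 and 40 are the ones suspended.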

You can verify the consistency between vPC peers by using the command show vpc consistency-parameter:

tc-nexus5k02# show vpc consistency-parameter

Inconsistencies can be global or interface specific:

Global inconsistencies: Type-1 global inconsistencies affect all vPC member ports (but do not affect non-vPC ports)

Interface-specific inconsistencies: Type-1 interface-specific inconsistencies affect only the interface itself

Examples of areas where Type-1 inconsistencies may occur include:

Multiple Spanning Tree (MST) region definition (VLAN-to-instance mapping)

MTU value

Spanning-tree global settings (Bridge Assurance, loop guard, and root guard)

Configuration changes to the following (these affect only individual vPCs for all VLANs on the vPC):

- PortChannel mode

- Trunk mode

- Spanning-tree interface settings

Note: Mismatched quality-of-service (QoS) definitions were originally Type-1 inconsistencies, but in newer releases are Type-2 inconsistencies. For the Cisco Nexus 5000 Series, starting from Cisco NX-OS 5.0(2)N1(1) QoS inconsistencies are categorized as Type 2, and so they do not bring down vPC member ports if the configurations differ between vPC peers.

The main inconsistencies that you need to be aware of are listed in Table 1. This table also shows which inconsistencies are global (that is, which bring down all vPCs) and indicates recommended operations to avoid disruptions.

Table 1. vPC Consistency Checks

Type-1 Inconsistency                              Impact       Recommendation
------------------------------------------------  -----------  -------------------------------------------------
VLAN to MST Region mapping mismatch               Global       Pre-provision and MAP all VLANs on the MST region
System MTU                                        Global       Operate change during maintenance window
Rapid-PVST+ Asymmetrically Disabled               Global       Disabling STP is NOT a Best practice
STP global settings (BA, Loop Guard, Root Guard)  Global       Use per-interface STP configurations
STP Mode mismatch                                 Global       None (Network misconfiguration)
Port-channel mode (active/on)                     vPC          Operate change during maintenance window
Port MTU/Link Speed/Duplex mode/QoS               vPC          Operate change during maintenance window
Trunk mode and Native VLAN                        vPC          Operate change during maintenance window
STP interface settings                            vPC          Operate change during maintenance window
Asymmetric VLANs on the trunk                     VLAN on vPC  Acceptable Impact

vPC Configuration Synchronization

A vPC allows two links that are physically connected to two Cisco Nexus switches to appear as a single PortChannel. Some configurations must be identical on both switches for vPCs to forward traffic. Such configurations include port mode, channel mode, speed, and duplex.

The config-sync command simplifies the management of vPCs by synchronizing vPC configurations between primary and secondary vPC peers.

vPC config-sync is currently available on the Cisco Nexus 5000 Series starting from Cisco NX-OS 5.0(2)N1(1).

The config-sync feature uses the concept of the configuration profile. The switch profile is the construct that allows configurations to be applied both locally and on the config-sync peer. The config-sync peer definition is independent of the vPC peer definition and is specified in the switch profile configuration mode as follows:

Nexus5000(config-sp)# sync-peers destination {destination IPs}+ [source <source IP> | vrf <vrf>]

Note: Even if the config-sync peer is the same as the vPC peer device, the config-sync infrastructure has been designed so that it can be decoupled from vPC. Thus, you need to define the config-sync peer even in the presence of a vPC configuration.

After the config-sync peer has been defined, the configuration that uses vPC config-sync appears as follows:

Switch# config sync
Switch(config-sync)# switch-profile profiledefinition
Switch(config-sp)# interface Port-channel100
Switch(config-sp-if)# interface Ethernet1/1
Switch(config-sp-if)# channel-group 100
Switch(config-sp-if)# exit
Switch(config-sp)# commit

Configurations are applied only after the user enters a commit command. The configuration is synchronized with the remote peer through the mgmt0 interface using the routable Cisco Fabric Services over IP protocol. If the remote peer cannot be reached, the configuration is applied only locally.

All commit operations follow the two-phase commit approach: If the config-sync peer is reachable, either the configuration is fully committed on both peers or it is rolled back on both. If the config-sync peer is not reachable, then the configuration is applied only locally. When the peer becomes reachable, the configurations are merged.
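
The two-phase commit behavior can be illustrated with a generic Python sketch. It is a textbook prepare-then-commit model under stated assumptions, not the config-sync implementation: the `Switch` class, its `accepts` flag (standing in for whatever verification the real feature performs), and the result labels are hypothetical, and the later merge step is not modeled.

```python
class Switch:
    """Minimal stand-in for a config-sync participant (hypothetical)."""

    def __init__(self, name, reachable=True, accepts=True):
        self.name = name
        self.reachable = reachable
        self.accepts = accepts  # whether verification of the change succeeds
        self.running = []       # committed configuration lines
        self._staged = None

    def prepare(self, lines):
        """Phase 1: verify and stage the change without applying it."""
        if not self.accepts:
            return False
        self._staged = list(lines)
        return True

    def commit(self):
        """Phase 2: apply the staged change."""
        self.running.extend(self._staged)
        self._staged = None

    def rollback(self):
        """Discard anything staged in phase 1."""
        self._staged = None


def config_sync_commit(local, peer, lines):
    """Commit on both peers or on neither; fall back to local-only if the peer is unreachable."""
    participants = [local, peer] if peer.reachable else [local]
    if all(sw.prepare(lines) for sw in participants):
        for sw in participants:
            sw.commit()
        return "committed-on-both" if peer.reachable else "local-only"
    for sw in participants:
        sw.rollback()
    return "rolled-back"
```

The key property is that a failure on either reachable peer leaves both running configurations untouched.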

Duplicate Frames Prevention in vPC

One of the most important forwarding rules for vPC is that a frame that enters the vPC peer switch from the peer link cannot exit the switch from a vPC member port.

Figure 13 shows switches 3 and 4 connected to 5k01 and 5k02 with vPCs Po51 and Po52. If one of the hosts connected to switch 4 sends either an unknown unicast or a broadcast, this traffic may get hashed to port eth2/2 on PortChannel 52. 5k02 receives the broadcast and needs to forward it to the peer link for the potential orphan ports on 5k01 to receive it.

Upon receiving the broadcast, 5k01 detects that this frame is coming from the vPC peer link. Therefore, it does not forward it to port 2/9 or 2/10; if it did, a duplicate frame on switch 3 or 4, respectively, would be created.

If a host on switch 4 sends a broadcast, 5k02 will correctly forward it to Po51 on port 2/9 and place it on the peer link. 5k01 will prevent this broadcast frame from exiting onto port 2/9 or 2/10 because this frame entered 5k01 from a vPC peer link. Should eth2/2 on switch 3 go down, port 2/9 on 5k01 would become an orphan port and as a result will receive traffic that traverses the peer link.

Figure 13. vPC Does Not Introduce Duplicate Frames
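
The forwarding rule illustrated in Figure 13 can be expressed as a short Python sketch of the flood decision on one vPC peer. The port classification strings and the orphan port `eth1/1` are hypothetical additions for illustration; ports 2/9 and 2/10 follow the example in the text.

```python
def flood_egress_ports(ingress, ports):
    """Return the ports a flooded frame may exit, given its ingress port.

    `ports` maps a port name to its kind: "peer-link", "vpc-member",
    or "orphan". A frame that entered from the peer link must never
    exit on a vPC member port.
    """
    came_from_peer_link = ports[ingress] == "peer-link"
    allowed = []
    for name, kind in ports.items():
        if name == ingress:
            continue  # basic Layer 2 rule: never send back out the ingress port
        if came_from_peer_link and kind == "vpc-member":
            continue  # vPC duplicate prevention rule
        allowed.append(name)
    return sorted(allowed)


# Port layout of 5k01 in the example (eth1/1 is a hypothetical orphan port).
ports_5k01 = {
    "peer-link": "peer-link",
    "eth2/9": "vpc-member",
    "eth2/10": "vpc-member",
    "eth1/1": "orphan",
}
```

Here `flood_egress_ports("peer-link", ports_5k01)` returns only `["eth1/1"]`. If port 2/9 is reclassified as `"orphan"` because its vPC partner link failed, the same call also returns `eth2/9`, matching the behavior described above.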

It is also important to realize that a topology based on PortChannels does not introduce loops, even if the peer link is lost and all the ports are forwarding. Figure 14 shows why.

Figure 14 shows the worst-case scenario of a vPC dual-active failure in which both peer-link and peer-keepalive-link connectivity are lost. In this particular case, one switch is running spanning tree (switch 4) with links that are not in PortChannel mode, and the other switches are configured in PortChannel mode.

With all links forwarding, a broadcast frame or an unknown unicast generated on switch 4, for example, is forwarded on both links directed to switches 1 and 2. When these two frames arrive on switch 3, they are not sent back to the PortChannel because doing so would break the basic rule of Layer 2 forwarding: a frame cannot return to the port from which it originated.

Figure 14. Worst Case of Dual-Active Failure

vPC and Object Tracking

The object tracking feature available in Cisco NX-OS can be used to associate the vPC status of a vPC device with the status of the interfaces that are tracked. The following example clarifies this. At the time of this writing, object tracking is available only on the Cisco Nexus 7000 Series.

You can use a single 10 Gigabit Ethernet card on the Cisco Nexus 7000 Series for both core connectivity and the peer link, but this is not the best option. If you lose the 10 Gigabit Ethernet card on the vPC primary device, you lose not only core connectivity, but also the peer link. As a result, ports will be shut down on the peer vPC device, isolating the servers completely.

You can address this specific configuration requirement with a tracking configuration. The objects being tracked are the uplinks to the core and the peer link. If these links are lost, vPCs local to the switch are shut down so that traffic can continue on the vPC peer.

To configure this feature, use the following command syntax:

! Track the vpc peer link
track 1 interface port-channel110 line-protocol
! Track the uplinks to the core
track 2 interface Ethernet7/9 line-protocol
track 3 interface Ethernet7/10 line-protocol
! Combine all tracked objects into one.
! "OR" means that this object goes down only if ALL objects are down
! --> we have lost all connectivity to the core and the peer link
track 10 list boolean OR
object 1
object 2
object 3
! If object 10 goes down on the primary vPC peer,
! system will switch over to other vPC peer and disable all local vPCs
vpc domain 1
track 10
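
The semantics of the boolean OR list in this configuration can be checked with a small Python model. It only mirrors the logic stated in the comments above; the function names are hypothetical, and the interface names are taken from the example.

```python
def tracked_list_up(member_states):
    """A boolean OR list is up if ANY member object is up."""
    return any(member_states.values())


def should_suspend_local_vpcs(member_states):
    """Local vPCs are shut down only when the whole list is down,
    that is, when the peer link and all core uplinks are lost."""
    return not tracked_list_up(member_states)


# True means the line protocol of the tracked interface is up.
states = {
    "port-channel110": False,  # vPC peer link
    "Ethernet7/9": False,      # core uplink
    "Ethernet7/10": True,      # core uplink
}
```

With one uplink still up, `should_suspend_local_vpcs(states)` is `False`; it becomes `True` only when all three tracked interfaces are down.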

In-Service Software Upgrade and vPC

In the presence of vPC, you can upgrade a device such as a Cisco Nexus 7000 Series Switch using In-Service Software Upgrade (ISSU) with no disruption to the traffic. However, if someone modifies the vPC configuration during the upgrade, the change causes an inconsistency between the vPC peer devices (the one being upgraded and the other device).

To avoid this undesirable situation, vPC can lock the configuration on the device that is not undergoing the upgrade and release it when the upgrade is complete.

Starting from Cisco NX-OS 4.2(1)N1(1), you can upgrade the Cisco Nexus 5000 Series with ISSU. In this case, the control plane can be unavailable for up to 80 seconds, while the data plane continues to forward traffic. Because of this behavior, for a Cisco Nexus 5000 Series Switch to undergo ISSU, it cannot be the root switch of a Layer 2 topology, nor can it have designated ports (except edge ports and the vPC peer link).

To use ISSU on a Cisco Nexus 5000 Series vPC topology, make sure that the peer keepalive is not a Layer 2 link between the Cisco Nexus 5000 Series Switches (such a link would create a designated port).

ISSU on the Cisco Nexus 5000 Series requires Bridge Assurance to be disabled on all links except the vPC peer link. You can then use ISSU with the vPC peer link configured for Bridge Assurance (which is the default configuration).

Interactions Between vPC and Routing

vPC and routing can coexist without problems on the same switch. A Layer 3 switch configured for vPC provides an aggregation layer that is Layer 3 connected to the core and Layer 2 connected to the access layer with vPCs.

Be sure to distinguish between a design in which the vPC switch is routing on Layer 3 or Layer 2 links, and a design in which the vPC switch is specifically exchanging routing updates over the Layer 2 vPC links. This latter scenario is typically relevant only to data center interconnect (DCI) designs, a topic that is not discussed in this guide.

HSRP Gateway Considerations

The use of HSRP in the context of vPC does not require any special configuration. The active HSRP interface answers ARP requests like normal HSRP deployments do, but with vPC both HSRP interfaces (active and standby) can forward traffic.

HSRP Configuration and Best Practices for vPC

The configuration on the HSRP primary device looks like this:

interface Vlan50
no shutdown
ip address 10.50.0.251/24
hsrp 50
preempt delay minimum 180
priority 150
timers 1 3
ip 10.50.0.1

The configuration on the HSRP secondary device looks like this:

interface Vlan50
no shutdown
ip address 10.50.0.252/24
hsrp 50
preempt delay minimum 180
priority 130
timers 1 3
ip 10.50.0.1

The most significant difference between the HSRP implementation of a non-vPC configuration and a vPC configuration is that the HSRP MAC addresses of a vPC configuration are programmed with the G (gateway) flag on both systems, compared with a non-vPC configuration, in which only the active HSRP interface can program the MAC address with the G flag.

Given this fact, routable traffic can be forwarded by both the vPC primary device (with HSRP) and the vPC secondary device (with HSRP), with no need to send this traffic to the HSRP primary device.

Without this flag, traffic sent to the MAC address would not be routed.

The following code shows the MAC address table programming on the vPC peer with HSRP in active mode for a given VLAN and the vPC peer with HSRP in standby mode for that same VLAN.

vPC HSRP on active:
G - 0000.0c07.ac01 static
vPC HSRP on standby:
G - 0000.0c07.ac01 static
In a non-vPC environment, the HSRP MAC looks as follows:
On Active: G - 0000.0c07.ac01 static
On Standby: * - 0000.0c07.ac01 static

ARP Synchronization

Starting from Cisco NX-OS 5.0(2) and 4.2(6), Layer 3 vPC peers synchronize their respective ARP tables. This feature is transparently enabled and helps ensure faster convergence time upon reload of a vPC switch. When two switches are reconnected after a failure, they use Cisco Fabric Services protocol over Ethernet to perform bulk synchronization of the ARP table.

Peer Gateway

If a host or a switch forwards a frame to the Layer 3 gateway and this Layer 3 gateway is present on a vPC pair of switches, everything works as expected so long as the frame is destined to the HSRP MAC address.

If the frame that is sent to the Layer 3 gateway uses the burned-in MAC address (BIA) of one of the vPC peers instead of the HSRP MAC address, the PortChannel hashing of the frame may forward it to the wrong vPC peer, which would then just bridge the frame to the other vPC peer.

This scenario can be problematic because if the vPC peer that owns the MAC address routes the frame to a vPC member port, this frame will not be able to leave the switch, because the vPC duplicate prevention rule would apply: no frame that comes from a peer link is allowed to exit the switch on a vPC member port.

Figure 15 shows the case in which device A sends traffic to remote MAC (RMAC) address A with a PortChannel hash that forwards the traffic to switch B. The result is that the frame cannot get to server B because of the duplicate prevention rule.

Figure 15. The problem addressed by the peer-gateway feature

To address this forwarding scenario, you should configure the peer-gateway command under the vPC domain. This command enables the vPC peers to exchange information about their respective BIA MAC addresses so they can forward the traffic locally without having to send it over the peer link.
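
The effect of the peer-gateway command on a received frame can be sketched in Python as follows. This is a simplified model of the behavior described in this section, not of the hardware programming; the function name, the result labels, and the BIA values in the usage note are hypothetical, while the HSRP MAC address is the one shown earlier in this document.

```python
HSRP_MAC = "0000.0c07.ac01"  # HSRP virtual MAC shown earlier in the text


def classify_frame(dst_mac, own_bia, peer_bia, peer_gateway_enabled):
    """Decide what a vPC peer does with a frame received on a vPC member port."""
    if dst_mac == HSRP_MAC or dst_mac == own_bia:
        # Both peers act as gateway for the HSRP MAC; the local BIA is local.
        return "route-locally"
    if dst_mac == peer_bia:
        # Without peer-gateway the frame is bridged to the peer, which can
        # then no longer send it out a vPC member port.
        if peer_gateway_enabled:
            return "route-locally"
        return "bridge-over-peer-link"
    return "bridge-normally"
```

For example, with hypothetical BIAs `"0022.5579.0001"` (local) and `"0022.5579.0002"` (peer), a frame addressed to the peer BIA is bridged over the peer link by default and routed locally once peer-gateway is enabled.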

Layer 3 Link Between vPC Peers

In vPC designs, you should make sure to include a Layer 3 link or VLAN between the Layer 3 switching vPC peers so that the routing areas are adjacent. Also, you can consider HSRP tracking in non-vPC designs, but not in vPC designs.

HSRP tracking is not recommended for the reasons illustrated in Figure 16. Imagine that traffic from n5k on VLAN 60 needs to be routed to n5k on VLAN 50. As a result of a core link failure, HSRP tracking shuts down switch virtual interface (SVI) 60 on Agg2 and forces the VLAN 60-to-VLAN 50 traffic to Agg1. Frames that the PortChannel hash delivers to Agg2 must then cross the peer link to reach Agg1, which routes them from SVI 60 to SVI 50 and tries to forward them on Po52 to reach n5k. Because Po52 is a vPC member port and the frames arrived over the peer link, vPC prevents this forwarding behavior, as previously explained.

Because of this behavior, you should create a Layer 3 path on the peer link between the routing engines on Agg2 and Agg1 instead of using HSRP tracking.

Figure 16. HSRP Tracking Is Not Needed or Suitable for vPC Designs

The following code shows how to create a Layer 3 link to connect the aggregation layer switches to reroute the traffic to Agg1 if the routed uplinks of Agg2 fail:

tc-nexus7k01-vdc2(config)# vlan 3
tc-nexus7k01-vdc2(config-vlan)# name l3_vlan
tc-nexus7k01-vdc2(config-vlan)# exit
tc-nexus7k02-vdc2(config)# int vlan 3
tc-nexus7k02-vdc2(config-if)# ip address 10.3.0.2 255.255.255.252
tc-nexus7k02-vdc2(config-if)# ip router ospf 1 area 0.0.0.0
tc-nexus7k02-vdc2(config-if)# no shut
tc-nexus7k01-vdc2(config)# int port-channel 10
tc-nexus7k01-vdc2(config-if)# switchport trunk allowed vlan add 3
tc-nexus7k01-vdc2# show ip ospf neigh
 OSPF Process ID 1 VRF default
 Total number of neighbors: 3
 Neighbor ID     Pri State            Up Time  Address         Interface
 128.0.0.3         1 FULL/DR          01:03:05 10.51.35.126    Vlan10

Layer 3 Link to the Core

At the time of this writing, we recommend the use of Layer 3 links to connect the vPC aggregation layer with the Layer 3 core instead of the use of vPC PortChannels for Layer 3 connectivity.

Figure 17 shows why. The design on the left shows a router connected with a Layer 3 vPC to Cisco Nexus Switches Switch1 and Switch2. At the time of this writing this design does not work. Imagine that client 1 sends traffic to server 1. Router 1 has Switch1 and Switch2 as neighbors, so it load-balances the routed traffic to the BIA MAC addresses of both Switch1 and Switch2. The PortChannel hashing is independent of this choice and may forward the routed frame with the BIA MAC address of Switch2 to Switch1 (and the frame with the BIA MAC address of Switch1 to Switch2). In this case, the frame would traverse the peer link and then be routed to the PortChannel Po2. At this point, the duplicate prevention rule would intervene, and the frame would be dropped.

Thus, at the time of this writing the connectivity between the core and the aggregation layers needs to follow the topology depicted on the right side of Figure 17.

Figure 17. Interactions Between vPC and Routing

Interactions with Multicast

This section discusses the most important interactions between multicast and vPC.

IGMP Snooping and vPC

Layer 2 forwarding of multicast traffic with vPC is based on a modified IGMP snooping behavior that consists mostly of synchronizing the IGMP entries between vPC primary and secondary devices.

In a vPC implementation, IGMP traffic entering a Cisco NX-OS device through a vPC PortChannel triggers hardware programming for the multicast entry on both vPC member devices. The synchronization of the IGMP information is performed over the peer link (the M1-to-M2 link in Figure 18) using Cisco Fabric Services over Ethernet.

Figure 18. IGMP Snooping with vPC

You can verify the vPC operations with IGMP by using this command:

switch# show ip igmp snooping statistics vlan 10
…..
CFS packets sent over VPC peer link: 13
CFS packets received over VPC peer link: 13
CFS packet errors: 0

When multicast traffic reaches a vPC peer, this traffic is replicated to the ports that joined a given group as well as to the peer link. The usual duplicate prevention rule of vPC applies, and as Figure 19 shows, the traffic goes from S1 to S2 over the peer link (M1 to M2), but Link 4 (L4) does not forward this traffic because L4 is a vPC member port.

Figure 19. Multicast Traffic Forwarding with vPC

Multicast traffic is copied over the peer link to help ensure that orphan ports get the multicast stream and to help with failure scenarios, such as the loss of Link 3 (L3) in Figure 19. This happens regardless of the presence of receivers on the vPC peer.

Because of this behavior, it is important to size the peer link properly so that it does not become the bottleneck in the infrastructure. As a best practice for vPC designs, provision the peer link with sufficient links for the bandwidth needs of your multicast traffic. Remember that all multicast traffic traverses the peer link.

Protocol Independent Multicast and vPC

At the time of this writing, vPC works with Protocol Independent Multicast Any Source Multicast (PIM-ASM) but not with Bidirectional PIM (Bidir-PIM) or PIM Source-Specific Multicast (PIM-SSM).

In PIM-Sparse Mode the PIM Designated Router (DR) encapsulates the traffic from a given source and unicasts it to the rendezvous point. Conversely, traffic from a source is drawn toward the PIM designated router for forwarding on a VLAN.

In vPC environments, both aggregation-layer devices operate as PIM designated routers. This behavior allows a multicast source to send traffic and have the traffic hashed to either vPC peer, which will then simply forward the traffic to the rendezvous point.

When a receiver is located in a vPC VLAN, the IGMP reports are synchronized, and Layer 3 forwarding entries (*, G) are created on both vPC peers. Both vPC peers send PIM (*, G) joins to the upstream rendezvous point. As a result, both vPC peer switches draw traffic, causing temporary duplicates.

After a multicast source starts sending traffic, only one vPC peer becomes the forwarder for a given source and sends (S, G) joins. The choice of the forwarder is based on the distance to the source (if the distances are identical, the vPC primary is chosen) and converges on the designated data forwarder for these VLANs on a per-stream basis, to prevent duplicates.

In summary, with the dual-designated-router approach, both vPC peers have IGMP routes, but only one of the peers has the Outgoing Interface List for (S, G).
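
The forwarder selection rule described above (closest to the source wins, with the vPC primary as the tiebreaker) can be sketched in Python. The `Peer` structure and the integer `distance_to_source` are hypothetical stand-ins for whatever metric the switches actually compare.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    name: str
    is_vpc_primary: bool
    distance_to_source: int  # hypothetical stand-in for the route metric


def elect_forwarder(a, b):
    """Pick the single vPC peer that forwards a given (S, G) stream."""
    if a.distance_to_source != b.distance_to_source:
        # The peer closer to the source becomes the forwarder.
        return a if a.distance_to_source < b.distance_to_source else b
    # Identical distances: the vPC primary is chosen.
    return a if a.is_vpc_primary else b
```

Because the election is made per stream, different sources may end up with different forwarders on the same pair of switches.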

As with Layer 2 traffic, multicast traffic received from the core is copied to the peer link to reach potential orphan ports.

vPC Failure Scenarios

This section describes the expected behavior of a vPC design for various link failures.

vPC Member Port Failure

If one vPC member port goes down (for instance, if a link from a NIC goes down), the member is removed from the PortChannel without bringing down the vPC entirely. The switch on which the remaining port is located then allows frames received from the peer link to be sent out that port, which is now treated like an orphan port (recall the vPC duplicate avoidance technique). The Layer 2 forwarding table on the switch that detected the failure is also updated so that the MAC addresses that were associated with the vPC port point to the peer link.

vPC Complete Dual-Active Failure (Double Failure)

If both the peer link and the peer-keepalive link are disconnected, the Cisco Nexus switch does not bring down the vPC, because neither Cisco Nexus switch can discriminate between a vPC device reload and a combined peer-link and peer-keepalive-link failure.

The main problem with a dual-active scenario is the lack of synchronization between the vPC peers over the peer link. This behavior causes IGMP snooping to malfunction, which in turn causes multicast traffic to drop.

As described previously, a vPC topology intrinsically protects against loops in dual-active scenarios. Each vPC peer, upon losing peer-link connectivity, starts forwarding BPDUs on vPC member ports. With the peer-switch feature, both vPC peers send BPDUs with the same bridge ID to help ensure that the downstream device does not detect a spanning-tree misconfiguration.

When the peer link and the peer-keepalive link are lost simultaneously, both vPC peers become operational primary. At the time of this writing, when connectivity between the peers is restored, the configured vPC secondary (which became operational primary) remains operational primary, and the configured vPC primary becomes operational secondary.

If you want to restore the primary role on the vPC primary, you can change the priority on one vPC switch and then flap the peer-link, which causes renegotiation of the primary and secondary roles. This procedure is disruptive and it is described in the section “vPC Role and Priority” under “vPC Domain Configuration”.

vPC Peer-Link Failure

To prevent problems caused by dual-active devices, vPC shuts down vPC member ports on the secondary switch when the peer link is lost but the peer keepalive is still present.

When the peer link fails, the vPC peers verify their reachability over the peer-keepalive link, and if they can communicate they take the following actions:

The operational secondary vPC peer (which may not match the configured secondary because vPC is nonpreemptive) brings down the vPC member ports, including the vPC member ports located on the fabric extenders in the case of a Cisco Nexus 5000 Series design with fabric extenders in straight-through mode.

The secondary vPC peer brings down the vPC VLAN SVIs: that is, all SVIs for the VLANs that happen to be configured on the vPC peer link, whether or not they are used on a vPC member port.

Note: To keep the SVI interface up when a peer link fails, use the command dual-active exclude interface-vlan.

At the time of this writing, if the peer link is lost first, the vPC secondary shuts down the vPC member ports. If this failure is followed by a vPC peer-keepalive failure, the vPC secondary keeps the interfaces shut down. This behavior may change in the future with the introduction of the autorecovery feature, which will allow the secondary device to bring up the vPC ports as a result of this sequence of events.

vPC Peer-Keepalive Failure

If connectivity of the peer-keepalive link is lost but peer-link connectivity is not changed, nothing happens; both vPC peers continue to synchronize MAC address tables, IGMP entries, and so on. The peer-keepalive link is mostly used when the peer link is lost, and the vPC peers use the peer keepalive to resolve the failure and determine which device should shut down the vPC member ports.
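
The three failure cases in this section can be summarized as a small decision function. The Python sketch below is only a summary of the behavior stated in the text; the function name and result labels are hypothetical, and the double failure is modeled as a simultaneous loss (the case in which the peer link fails first and the keepalive fails later is not modeled).

```python
def vpc_failure_outcome(peer_link_up, keepalive_up):
    """Summarize the expected vPC reaction to a link-state combination."""
    if peer_link_up:
        # Peer link intact: peers keep synchronizing, with or without keepalive.
        return "no-change"
    if keepalive_up:
        # Peer link lost, peer still reachable: the operational secondary
        # brings down its vPC member ports and its vPC VLAN SVIs.
        return "secondary-shuts-vpc-ports-and-svis"
    # Both lost at once: neither peer can tell a failure from a peer reload,
    # so vPCs stay up and both peers become operational primary.
    return "dual-active-both-primary"
```

Calling the function for all four combinations reproduces the scenarios described under vPC Complete Dual-Active Failure, vPC Peer-Link Failure, and vPC Peer-Keepalive Failure.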

Examples

Figure 20 illustrates what happens during vPC peer-link failure for vPC ports. Agg1 is the vPC primary, and Agg2 is the vPC secondary.

The sequence of events is as follows:

The vPC peer link fails, but Agg1 and Agg2 can still communicate through the routed path with the vPC peer-keepalive protocol

Eth2/9 and eth2/10 on Agg2 are shut down because they are part of vPC Po51 and Po52 respectively, and Agg2 is the operational secondary vPC device

SVI VLAN50 (vPC-VLAN) is shut down on the operational secondary device to prevent traffic from the core routers from reaching the vPC secondary device on which the vPC ports are shut down

Figure 20. Peer-Link Failure

As a result of the peer-link failure, all traffic in Figure 20 takes the path on the left through the vPC primary device. This is true both for the client-to-server traffic and the server-to-client traffic.

The following show command entered on the secondary vPC peer illustrates the results of the vPC peer-link failure:

tc-nexus7k02-vdc2# show vpc br
vPC domain id                   : 1
Peer status                     : peer link is down
vPC keep-alive status           : peer is alive
vPC role                        : secondary
Dual Active Detected
vPC Peer Link Status
---------------------------------------------------------------------
id   Port   Status Active vlans
--   ----   ------ --------------------------------------------------
1    Po10   down   -
vPC status
----------------------------------------------------------------------
id   Port   Status Consistency Reason                     Active vlans
--   ----   ------ ----------- -------------------------- ------------
51   Po51   down   success     success                    -

The access switch uses the remaining link:

tc-nexus5k01# show port-channel summary
--------------------------------------------------------------------------------
Group Port-       Type     Protocol  Member Ports
      Channel
--------------------------------------------------------------------------------
51    Po51(SU)    Eth      LACP      Eth2/1(P)    Eth2/2(D)

The peer-keepalive communication helps ensure that the loss of the peer-link path does not introduce any unwanted flooding or split-subnet scenarios.

Figure 21 shows the failure scenario in the presence of a fabric extender. The vPC operational secondary shuts down the vPC member port to host 1, which is directly attached to N5k01, and the vPC member port to host 2, which is connected to Cisco Nexus 2000 Series Fabric Extender N2k01.

Figure 21. vPC Peer-Link Failure on the Cisco Nexus 5000 Series

vPC with Fabric Extender Active-Active Design

The case of a fabric extender dual-connected to the Cisco Nexus 5000 Series in vPC mode is slightly different from that of other vPC designs.

Starting from Cisco NX-OS 4.1(3), you can connect a fabric extender to two Cisco Nexus 5000 Series devices configured for vPC. The fabric extender is a satellite switch that depends on the Cisco Nexus 5000 Series for both configuration and forwarding.

In such a design, both Cisco Nexus 5000 Series Switches have equal rights to configure the fabric extender switch ports.

To address this design, in which each fabric extender is controlled by two entities (the two Cisco Nexus 5000 Series Switches), the implementation relies on the modeling of each fabric extender port as if it were two independent ports configured for vPC. The same fabric extender port appears on each Cisco Nexus 5000 Series Switch, and the Cisco Nexus 5000 Series vPC peers operate as if these two ports were forming a PortChannel - and in fact the Cisco Nexus 5000 Series Switches are configured in vPC mode according to all the previously described guidelines.

The 10 Gigabit Ethernet ports connecting the Cisco Nexus® 5000 to the fabric extender (switch-port mode fabric) are configured as vPC member ports, and the individual ports on the fabric extender, such as port eth100/1/1, appear on both nexus5k01 and nexus5k02, as shown in Figure 22.

Figure 22. Fabric Extender Active-Active Design

To keep the nexus5k01 and nexus5k02 configurations synchronized, starting from Cisco NX-OS 5.0(2)N1(1) you can use the configuration synchronization feature to define the fabric extender port configuration in a switch profile to help ensure consistency between the two configurations.

With this topology, PortChannels work on fabric extenders, but you cannot create a vPC from a server that is split between two fabric extenders (for this, you need to use the fabric extender straight-through topology).

The failure scenarios previously described for vPC member ports apply equally to the fabric extender ports. If the peer link is lost, the vPC secondary device shuts down the fabric ports that are connected to the secondary Cisco Nexus 5000 Series device.

vPC Configuration Best Practices

vPC Domain Configuration

vPC Role and Priority

A domain needs to be defined (as indicated by the domain ID) as well as priorities to define primary and secondary roles in the vPC configuration. The lower number has higher priority, so it wins. For two switches (vPC peers) to form a vPC system, the domain IDs of these switches need to match. As previously described, the domain ID is used to generate the LAGID in the LACP negotiation.

agg1(config)# vpc domain <domain-id>
agg1(config-vpc-domain)# role priority 100
agg2(config)# vpc domain <domain-id - same as agg1>
agg2(config-vpc-domain)# role priority 110

Note that the role is nonpreemptive, so a device may be operationally primary but secondary from a configuration perspective. Because spanning tree is preemptive, this behavior may result in a mismatch between the spanning-tree root and the vPC operational primary device, with no consequences for traffic forwarding.
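
The combination of "lower number wins" and nonpreemptive behavior can be illustrated with a short Python sketch. The function name is hypothetical, and the model assumes that the two configured priorities differ, because the tiebreaker is not covered in this section.

```python
def elect_operational_primary(priorities, current_primary=None):
    """Return the operational primary for a vPC domain.

    `priorities` maps a switch name to its configured role priority.
    An established primary keeps the role (nonpreemptive); otherwise
    the switch with the lowest priority number wins.
    """
    if current_primary in priorities:
        return current_primary
    return min(priorities, key=priorities.get)


roles = {"agg1": 100, "agg2": 110}
```

With no established primary, `elect_operational_primary(roles)` selects agg1. If agg2 took over during a failure, `elect_operational_primary(roles, current_primary="agg2")` keeps agg2 as operational primary even though agg1 has the better configured priority.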

Although mismatched spanning-tree and vPC priorities do not affect traffic forwarding, you still should keep the priorities matched to have the spanning-tree root and vPC primary on the same device and the spanning-tree secondary root and vPC secondary on the same device where applicable (this recommendation applies only at the aggregation layer). The main benefit is easier management. When the peer-switch command is used, both devices are configured with the same spanning-tree priority, so this recommendation does not apply.

After failover, the vPC operational primary and vPC operational secondary do not match the original configuration. You can restore matching by following these configuration steps: from the vPC operational primary, change the role priority to a numerically high value (the alias that follows uses 32767) and then enter a shut/no shut command on the peer-link PortChannel.

You can also use a script such as the following:

7k-1(config)# cli alias name vpcpreempt conf t ; vpc domain <domain-id> ;
role priority 32767 ; int <peer-link> ; shut ; no sh *

Reload Restore

If the Cisco NX-OS version supports vPC reload restore, you should configure this feature under the vPC domain configuration:

vpc domain 1
role priority 100
peer-keepalive destination 10.51.35.140 source 10.51.35.133
reload restore

If you have a Cisco Nexus 5000 Series Switch and the reload restore feature is not available, you can configure peer-config-check-bypass as follows:

vpc domain 2
role priority 100
peer-keepalive destination 10.51.35.18
peer-config-check-bypass

Peer Gateway

If the vPC switch is also performing Layer 3 switching, it is useful to add the peer-gateway configuration in the vPC domain definition:

vpc domain 1
role priority 100
peer-keepalive destination 10.51.35.140 source 10.51.35.133
peer-gateway
reload restore

vPC Peer Link

The peer-link PortChannel connects the vPC peers and carries all access VLANs (defined by the user). This link also carries additional traffic that the user does not need to define: specifically, BPDUs, HSRP hellos, and MAC address synchronization between the vPC peers.

This link is by far the most important component of the vPC system. Although its failure does not disrupt existing vPC flows, its failure can impair the establishment of new flows and isolate orphan ports. Configuring the peer link in a redundant fashion helps ensure essentially uninterrupted connectivity between the vPC peers. The following script illustrates how to configure the peer link, which in this case is PortChannel 10:

agg(config)# interface port-channel10
agg(config-if)# vpc peer-link
agg(config-if)# switchport trunk allowed vlan <all access VLANs>

Configuring the peer link automatically enables Bridge Assurance on it. This configuration is compatible with ISSU, so you can keep Bridge Assurance enabled on this link.

The peer link carries a copy of the multicast traffic regardless of whether there are orphan ports that need to receive it. You should provision the bandwidth for the peer link accordingly.

vPC Peer Keepalive

The peer-keepalive connectivity should never be carried as a VLAN on the peer link; otherwise, it will not provide any benefit. Instead, it should be carried over a routed infrastructure, and it does not need to be a direct point-to-point link.

The following configuration illustrates the use of a dedicated Gigabit Ethernet interface for this purpose:

vrf context vpc-keepalive
interface Ethernet8/16
  description tc-nexus7k02-vdc2 - vPC Heartbeat Link
  vrf member vpc-keepalive
  ip address 192.168.1.1/24
  no shutdown
vpc domain 1
  peer-keepalive destination 192.168.1.2 source 192.168.1.1 vrf vpc-keepalive

You should not use the mgmt0 interface for a direct back-to-back connection between Cisco Nexus 7000 Series systems because you cannot determine which supervisor is active at any given time. You can use it instead on the Cisco Nexus 5000 Series.

The mgmt0 interface can be used both for management and for routing the peer keepalive through the out-of-band management network. In this case, each Cisco Nexus 7000 Series Switch is connected to the management network through mgmt0 of supervisor slots 5 and 6 and the Cisco Nexus 5000 Series through the single mgmt0 interface.

By following this approach, regardless of which supervisor is active, the Cisco Nexus 7000 Series Switch has one of the mgmt0 interfaces connected to the management network, which can then be used for peer-keepalive purposes.

vPC Ports

A vPC is configured by bundling Layer 2 ports (switch ports) into a PortChannel on each Cisco Nexus switch and then applying the vpc command to that PortChannel, as shown in the following code. The system returns an error message if the PortChannel was not previously configured as a switch port.

agg1(config)# interface ethernet2/9
agg1(config-if)# channel-group 51 mode active
agg1(config-if)# interface port-channel 51
agg1(config-if)# switchport
agg1(config-if)# vpc 51
!
agg2(config)# interface ethernet2/9
agg2(config-if)# channel-group 51 mode active
agg2(config-if)# interface port-channel 51
agg2(config-if)# switchport
agg2(config-if)# vpc 51

If the consistency check does not show success, you should verify the consistency parameters. Typical reasons that a vPC may not form include the following:

A VLAN that is defined in the trunk does not exist, or it is not defined on the peer link

One member port is configured as an access port and the other as a trunk port

The VLANs carried on the trunk do not match between the two peers
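The first and last causes in the preceding list come down to comparing VLAN lists. The following Python sketch shows one way to do that offline, using list strings in the form that the show commands print (for example, 10-14,21-24,50,60). The function names expand_vlans and missing_on_peer_link are hypothetical helpers, not part of any Cisco tool.

```python
def expand_vlans(spec):
    """Expand a VLAN list such as '10-14,21-24,50,60' into a set of integers."""
    vlans = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            vlans.update(range(int(low), int(high) + 1))
        else:
            vlans.add(int(part))
    return vlans


def missing_on_peer_link(vpc_allowed, peer_link_allowed):
    """Return VLANs allowed on the vPC trunk but not carried on the peer link."""
    return sorted(expand_vlans(vpc_allowed) - expand_vlans(peer_link_allowed))


# VLAN 70 is allowed on the vPC trunk but not on the peer link.
print(missing_on_peer_link("10-14,21-24,50,60,70", "10-14,21-24,50,60"))
```

The same expand_vlans helper can be used to compare the allowed VLAN lists of the two peers for a given vPC.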

The following example shows how to verify that the vPC configuration is consistent between two vPC peers for the global consistency parameter as well as for a specific PortChannel:

tc-nexus7k01-vdc2# show vpc consistency-parameters global
tc-nexus7k01-vdc2# show vpc consistency-parameters int port-channel 51
    Legend:
        Type 1 : vPC will be suspended in case of mismatch
Name                    Type  Local Value             Peer Value
-------------           ----  ----------------------  -----------------------
STP Port Type           1     Default                 Default
STP Port Guard          1     None                    None
STP MST Simulate PVST   1     Default                 Default
Allowed VLANs           -     10-14,21-24,50,60       10-14,21-24,50,60
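The logic implied by the legend in this output can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not the NX-OS implementation: the function name check_consistency is hypothetical, and rows whose type is shown as "-" are assumed here to be reported without suspending the whole vPC.

```python
def check_consistency(rows):
    """rows: iterable of (name, type, local_value, peer_value) tuples.

    Returns (suspend_vpc, mismatched_names). A mismatch on a Type 1
    parameter suspends the vPC; other mismatches are only reported
    (assumption made for this sketch).
    """
    mismatches = [(name, ptype) for name, ptype, local, peer in rows
                  if local != peer]
    suspend_vpc = any(ptype == "1" for _, ptype in mismatches)
    return suspend_vpc, [name for name, _ in mismatches]


rows = [
    ("STP Port Type", "1", "Default", "Default"),
    ("STP Port Guard", "1", "None", "None"),
    ("STP MST Simulate PVST", "1", "Default", "Default"),
    ("Allowed VLANs", "-", "10-14,21-24,50,60", "10-14,21-24,50,60"),
]
print(check_consistency(rows))
```

With the values from the output above, every row matches, so the sketch reports no mismatch and no suspension.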

After a port is defined as part of a PortChannel, any additional configurations, such as activation or disablement of Bridge Assurance or trunking mode, are performed in the interface PortChannel configuration mode. Trying to configure spanning-tree properties for the physical interface instead of the PortChannel will result in an error message.

LACP

You should use LACP for dynamic bundling of the ports in the vPC group, because LACP verifies that the ports being bundled are actually part of the same physical or virtual switch, preventing erroneous configurations.

For example, if the PortChannel is configured as active on the Cisco Nexus 7000 Series Switch and the downstream switch is not configured for a PortChannel, the PortChannel ports are shown in the individual (I) state and run regular spanning tree.

After the access-layer switches are configured for LACP, the negotiation completes and the PortChannel forms:

tc-nexus5k01(config)# int eth2/1-2
tc-nexus5k01(config-if-range)# channel-group 51 mode passive

The PortChannel on the Cisco Nexus 5000 Series access switches becomes active, indicating that the LACP negotiation is functioning between the upstream vPC system and the Cisco Nexus 5000 Series:

tc-nexus5k01# show port-channel summary
Flags:  D - Down        P - Up in port-channel (members)
        I - Individual  H - Hot-standby (LACP only)
        s - Suspended   r - Module-removed
        S - Switched    R - Routed
        U - Up (port-channel)
--------------------------------------------------------------------------------
Group Port-       Type     Protocol  Member Ports
      Channel
--------------------------------------------------------------------------------
51    Po51(SU)    Eth      LACP      Eth2/1(P)    Eth2/2(P)

The PortChannel on the Cisco Nexus 7000 Series Switch also becomes active because of the LACP negotiation:

tc-nexus7k01-vdc2# show vpc br
[…]
vPC Peer-link status
---------------------------------------------------------------------
id   Port   Status Active VLANs
--   ----   ------ --------------------------------------------------
1    Po10   up     10-14,21-24,50,60
vPC status
----------------------------------------------------------------------
id   Port   Status Consistency Reason                     Active VLANs
--   ----   ------ ----------- -------------------------- ------------
51   Po51   up     success     success                    10-14,21-24
                                                          ,50,60

If the PortChannel ports are suspended, a mismatch occurred in the PortChannel ports between the switches that are supposed to bring up the PortChannel. For example, a vPC on the Cisco Nexus 7000 Series is configured with ports that individually connect to two different PortChannels on the Cisco Nexus 5000 Series.

Alternatively, if the access-layer ports are not configured for a channel, the Cisco Nexus 7000 and 5000 Series will operate normally with spanning tree. If the ports on the Cisco Nexus 5000 Series are configured in passive channel-group mode and the Cisco Nexus 7000 Series ports are not configured for PortChannels, the Cisco Nexus 7000 and 5000 Series run spanning tree again on those ports.
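The combinations discussed in this section can be summarized in a small Python sketch. It is a simplified model, not device behavior captured from a switch: the function name bundle_outcome is hypothetical, None stands for a side with no channel-group configured, static ("on") bundling is not modeled, and the passive/passive case is not covered by the text above and is included only on the basis of general LACP behavior, in which neither side initiates negotiation.

```python
def bundle_outcome(side_a, side_b):
    """Return 'bundled' or 'not bundled' for two LACP modes.

    Each side is 'active', 'passive', or None (no channel-group configured).
    Per the text, ports that do not bundle run regular spanning tree.
    """
    for mode in (side_a, side_b):
        if mode not in ("active", "passive", None):
            raise ValueError("mode not modeled in this sketch: %r" % (mode,))
    if side_a is None or side_b is None:
        return "not bundled"  # one side is not configured for a PortChannel
    if "active" in (side_a, side_b):
        return "bundled"      # at least one side initiates LACP
    return "not bundled"      # passive/passive: assumption, general LACP behavior


print(bundle_outcome("active", "passive"))  # Nexus 7000 active, Nexus 5000 passive
print(bundle_outcome("active", None))       # downstream switch not configured
print(bundle_outcome(None, "passive"))      # Nexus 7000 ports not configured
```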

For More Information

Cisco Nexus® 5000 page: http://www.cisco.com/go/nexus5000

Cisco Nexus® 7000 page: http://www.cisco.com/go/nexus7000