Guidelines and Limitations for VXLAN
VXLAN has the following guidelines and limitations:
| ACL Direction | ACL Type | VTEP Type | Port Type | Flow Direction | Traffic Type | Supported |
|---|---|---|---|---|---|---|
| Ingress | PACL | Ingress VTEP | L2 port | Access to Network [GROUP:encap direction] | Native L2 traffic [GROUP:inner] | YES |
|  | VACL | Ingress VTEP | VLAN | Access to Network [GROUP:encap direction] | Native L2 traffic [GROUP:inner] | YES |
| Ingress | RACL | Ingress VTEP | Tenant L3 SVI | Access to Network [GROUP:encap direction] | Native L3 traffic [GROUP:inner] | YES |
| Egress | RACL | Ingress VTEP | Uplink L3/L3-PO/SVI | Access to Network [GROUP:encap direction] | VXLAN encap [GROUP:outer] | NO |
| Ingress | RACL | Egress VTEP | Uplink L3/L3-PO/SVI | Network to Access [GROUP:decap direction] | VXLAN encap [GROUP:outer] | NO |
| Egress | PACL | Egress VTEP | L2 port | Network to Access [GROUP:decap direction] | Native L2 traffic [GROUP:inner] | NO |
|  | VACL | Egress VTEP | VLAN | Network to Access [GROUP:decap direction] | Native L2 traffic [GROUP:inner] | NO |
| Egress | RACL | Egress VTEP | Tenant L3 SVI | Network to Access [GROUP:decap direction] | Post-decap L3 traffic [GROUP:inner] | YES |
| ACL Direction | ACL Type | VTEP Type | Port Type | Flow Direction | Traffic Type | Supported |
|---|---|---|---|---|---|---|
| Ingress | PACL | Ingress VTEP | L2 port | Access to Network [GROUP:encap direction] | Native L2 traffic [GROUP:inner] | YES (works only for base port PO) |
| Egress | PACL | Egress VTEP | L2 port | Network to Access [GROUP:decap direction] | Native L2 traffic [GROUP:inner] | NO |
| Ingress | VACL | Ingress VTEP | VLAN | Access to Network [GROUP:encap direction] | Native L2 traffic [GROUP:inner] | YES |
| Egress | VACL | Egress VTEP | VLAN | Network to Access [GROUP:decap direction] | Native L2 traffic [GROUP:inner] | YES |
| Ingress | RACL | Ingress VTEP | Tenant L3 SVI | Access to Network [GROUP:encap direction] | Native L3 traffic [GROUP:inner] | YES |
| Egress | RACL | Egress VTEP | Tenant L3 SVI | Network to Access [GROUP:decap direction] | Post-decap L3 traffic [GROUP:inner] | YES |
| Ingress | RACL | Egress VTEP | Uplink L3/L3-PO/SVI | Network to Access [GROUP:decap direction] | VXLAN encap [GROUP:outer] | NO |
| Egress | RACL | Ingress VTEP | Uplink L3/L3-PO/SVI | Access to Network [GROUP:encap direction] | VXLAN encap [GROUP:outer] | NO |
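As a hedged illustration of the supported ingress PACL case in the tables above, the following sketch applies an IPv4 port ACL to a Layer 2 access port in the access-to-network direction. The ACL name, the matched subnet, and the interface and VLAN numbers are hypothetical values chosen for illustration; they do not come from this guide.

```text
! Hypothetical ACL name and addressing, for illustration only
ip access-list PACL-TENANT-A
  10 permit ip 10.1.1.0/24 any
  20 deny ip any any

interface Ethernet1/10
  switchport
  switchport access vlan 100
  ! Ingress PACL on the Layer 2 port: matches native (inner) traffic
  ! before it is VXLAN encapsulated
  ip port access-group PACL-TENANT-A in
```

Filtering at the access edge in the ingress direction avoids the unsupported uplink RACL cases, because the match is made on native traffic rather than on the encapsulated packet.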
-
Non-blocking Multicast (NBM) running on a VXLAN enabled switch is not supported. Feature nbm may disrupt VXLAN underlay multicast forwarding.
-
The lacp vpc-convergence command can be configured in VXLAN and non-VXLAN environments that have vPC port channels to hosts that support LACP.
-
When entering the no feature pim command, NVE ownership on the route is not removed so the route stays and traffic continues to flow. Aging is done by PIM. PIM does not age out entries having a VXLAN encap flag.
-
Beginning with Cisco NX-OS Release 7.0(3)I7(3), Fibre Channel over Ethernet (FCoE) N-port virtualization (NPV) can co-exist with VXLAN on different fabric uplinks but on same or different front panel ports on the Cisco Nexus 93180YC-EX and 93180YC-FX switches.
Fibre Channel N-port virtualization (NPV) can co-exist with VXLAN on different fabric uplinks but on same or different front panel ports on the Cisco Nexus 93180YC-FX switches. VXLAN can only exist on the Ethernet front panel ports, but not on the FC front panel ports.
-
Beginning with Cisco NX-OS Release 7.0(3)I7(3), VXLAN is supported on the Cisco Nexus 9348GC-FXP switch.
-
When SVI is enabled on a VTEP (flood and learn, or EVPN) regardless of ARP suppression, make sure that ARP-ETHER TCAM is carved using the hardware access-list tcam region arp-ether 256 double-wide command. This is not applicable to the Cisco Nexus 9200 and 9300-EX platform switches and Cisco Nexus 9500 platform switches with 9700-EX line cards.
-
IP unnumbered for the VXLAN underlay is supported starting with Cisco NX-OS Release 7.0(3)I7(2). Only a single unnumbered link between the same two devices (for example, spine to leaf) is supported. If multiple physical links connect the same leaf and spine, you must use a single Layer 3 port channel with an unnumbered link.
-
For information about the load-share keyword usage for the PBR with VXLAN feature, see the Guidelines and Limitations section of the Configuring Policy-Based Routing chapter of the Cisco Nexus 9000 Series NX-OS Unicast Routing Configuration Guide, Release 7.x.
-
For Cisco NX-OS Release 7.0(3)F3(3) the following features are not supported:
-
VXLAN with vPC is not supported.
-
DHCP snooping, ACL, and QoS policies are not supported on VXLAN VLANs.
-
IGMP snooping is not supported on VXLAN enabled VLANs.
-
-
Beginning with Cisco NX-OS Release 7.0(3)F3(3), VXLAN Layer 2 Gateway is supported on the 9636C-RX line card. VXLAN and MPLS cannot be enabled on the Cisco Nexus 9508 switch at the same time.
-
Beginning with Cisco NX-OS Release 7.0(3)F3(3), if VXLAN is enabled, the Layer 2 Gateway cannot be enabled when there is any line card other than the 9636C-RX.
-
Beginning with Cisco NX-OS Release 7.0(3)F3(3), PIM/ASM is supported in the underlay ports. PIM-BiDir is not supported. For more information, see the Cisco Nexus 9000 Series NX-OS Multicast Routing Configuration Guide, Release 7.x.
-
Beginning with Cisco NX-OS Release 7.0(3)F3(3), IPv6 hosts routing in the overlay is supported.
-
Beginning with Cisco NX-OS Release 7.0(3)F3(3), ARP suppression is supported.
-
Beginning with Cisco NX-OS Release 7.0(3)I7(1), the load-share keyword has been added to the Configuring a Route Policy procedure for the PBR over VXLAN feature.
For more information, see the Cisco Nexus 9000 Series NX-OS Unicast Routing Configuration Guide, Release 7.x.
-
Beginning with Cisco NX-OS Release 7.0(3)I6(1), a new CLI command lacp vpc-convergence is added for better convergence of Layer 2 EVPN VXLAN:
interface port-channel10
  switchport
  switchport mode trunk
  switchport trunk allowed vlan 1001-1200
  spanning-tree port type edge trunk
  spanning-tree bpdufilter enable
  lacp vpc-convergence
  vpc 10

interface Ethernet1/34    <- The port-channel member-port is configured with LACP-active mode (for example, no changes are done at the member-port level.)
  switchport
  switchport mode trunk
  switchport trunk allowed vlan 1001-1200
  channel-group 10 mode active
  no shutdown
-
Beginning with Cisco NX-OS Release 7.0(3)I6(1), port-VLAN with VXLAN is supported on Cisco Nexus 9300-EX and 9500 Series switches with 9700-EX line cards with the following exceptions:
-
Only Layer 2 (no routing) is supported with port-VLAN with VXLAN on these switches.
-
No inner VLAN mapping is supported.
-
-
Beginning with Cisco NX-OS Release 7.0(3)I6(1), VXLAN is supported on Cisco Nexus 3232C and 3264Q switches. Cisco Nexus 3232C and 3264Q switches do not support inter-VNI routing.
IGMP snooping on VXLAN enabled VLANs is not supported in Cisco Nexus 3232C and 3264Q switches. VXLAN with flood and learn and Layer 2 EVPN is supported in Cisco Nexus 3232C and 3264Q switches.
-
The system nve ipmc CLI command is not applicable to the Cisco Nexus 9200 and 9300-EX platform switches and Cisco Nexus 9500 platform switches with 9700-EX line cards.
-
Bind NVE to a loopback address that is separate from other loopback addresses that are required by Layer 3 protocols. A best practice is to use a dedicated loopback address for VXLAN. This best practice should be applied not only for the VPC VXLAN deployment, but for all VXLAN deployments.
-
To remove configurations from an NVE interface, we recommend manually removing each configuration rather than using the default interface nve command.
-
When SVI is enabled on a VTEP (flood and learn or EVPN), make sure that ARP-ETHER TCAM is carved using the hardware access-list tcam region arp-ether 256 CLI command. This is not applicable to Cisco Nexus 9200 and 9300-EX Series switches and Cisco Nexus 9500 Series switches with 9700-EX line cards.
-
show commands with the internal keyword are not supported.
-
FEX ports do not support IGMP snooping on VXLAN VLANs.
-
Beginning with Cisco NX-OS Release 7.0(3)I4(2), VXLAN is supported for the Cisco Nexus 93108TC-EX and 93180YC-EX switches and for Cisco Nexus 9500 Series switches with the X9732C-EX line card.
-
DHCP snooping (Dynamic Host Configuration Protocol snooping) is not supported on VXLAN VLANs.
-
RACLs are not supported on Layer 3 uplinks for VXLAN traffic. Egress VACL support is not available for decapsulated packets in the network-to-access direction on the inner payload.
As a best practice, use PACLs/VACLs for the access to the network direction.
-
QoS classification is not supported for VXLAN traffic in the network to access direction on the Layer 3 uplink interface.
-
The QoS buffer-boost feature is not applicable for VXLAN traffic.
-
For 7.0(3)I1(2), Cisco Nexus 9500 platform switches do not support VXLAN tunnel endpoint functionality; however, they can be used as spines.
-
SVI and subinterfaces as uplinks are not supported.
-
VTEPs do not support VXLAN encapsulated traffic over parent interfaces if subinterfaces are configured. This is regardless of VRF participation.
-
VTEPs do not support VXLAN encapsulated traffic over subinterfaces. This is regardless of VRF participation or IEEE 802.1q encapsulation.
-
Mixing subinterfaces for VXLAN and non-VXLAN enabled VLANs is not supported.
-
Point to multipoint Layer 3 and SVI uplinks are not supported.
-
For 7.0(3)I2(1) and later, a FEX HIF (FEX host interface port) is supported for a VLAN that is extended with VXLAN.
-
In an ingress replication vPC setup, Layer 3 connectivity is needed between vPC peer devices. This connectivity carries the traffic when the Layer 3 uplink (underlay) connectivity is lost for one of the vPC peers.
-
Rollback is not supported on VXLAN VLANs that are configured with the port VLAN mapping feature.
-
The VXLAN UDP port number is used for VXLAN encapsulation. For Cisco Nexus NX-OS, the UDP port number is 4789. It complies with IETF standards and is not configurable.
-
For 7.0(3)I2(1) and later, VXLAN is supported on Cisco Nexus 9500 Series switches with the following line cards:
-
9500-R
-
9564PX
-
9564TX
-
9536PQ
-
9700-EX
-
9700-FX
-
-
Cisco Nexus 9300 Series switches with 100G uplinks only support VXLAN switching/bridging. (7.0(3)I2(1) and later)
Cisco Nexus 9200, Cisco Nexus 9300-EX, and Cisco Nexus 9300-FX platform switches do not have this restriction.
Note
For VXLAN routing support, a 40G uplink module is required.
-
For 7.0(3)I2(1) and later, MDP is not supported for VXLAN configurations.
-
For 7.0(3)I2(1) and later, bidirectional PIM is not supported for underlay multicast.
-
Consistency checkers are not supported for VXLAN tables.
-
ARP suppression is supported for a VNI only if the VTEP hosts the First-Hop Gateway (Distributed Anycast Gateway) for this VNI. The VTEP and SVI for this VLAN must be properly configured for the Distributed Anycast Gateway operation (for example, global anycast gateway MAC address configured and anycast gateway with the virtual IP address on the SVI).
-
ARP suppression is a per-L2VNI fabric-wide setting in the VXLAN fabric. Enable or disable this feature consistently across all VTEPs in the fabric. Inconsistent ARP suppression configuration across VTEPs is not supported.
-
Cisco Nexus 9200 platform switches that have the Application Spine Engine (ASE2) have a Layer 3 VXLAN (SVI) throughput issue: data loss occurs for packets of sizes 99–122 (7.0(3)I3(1) and later).
-
For the NX-OS 7.0(3)I2(3) release, the VXLAN network identifier (VNID) 16777215 is reserved and should not be configured explicitly.
-
For 7.0(3)I4(1) and later, VXLAN supports In Service Software Upgrade (ISSU).
-
VXLAN does not support co-existence with the GRE tunnel feature or the MPLS (static or segment-routing) feature on Cisco Nexus 9000 Series switches with a Network Forwarding Engine (NFE).
-
VTEP connected to FEX host interface ports is not supported (7.0(3)I2(1) and later).
-
In Cisco NX-OS Release 7.0(3)I4(1), resilient hashing (port-channel load-balancing resiliency) and VXLAN configurations are not compatible with VTEPs using ALE uplink ports.
Note
Resilient hashing is disabled by default.
-
If multiple VTEPs use the same multicast group address for underlay multicast but have different VNIs, the VTEPs should have at least one VNI in common. Doing so ensures that NVE peer discovery occurs and underlay multicast traffic is forwarded correctly. For example, leafs L1 and L4 could have VNI 10 and leafs L2 and L3 could have VNI 20, and both VNIs could share the same group address. When leaf L1 sends traffic to leaf L4, the traffic could pass through leaf L2 or L3. Because NVE peer L1 is not learned on leaf L2 or L3, the traffic is dropped. Therefore, VTEPs that share a group address need to have at least one VNI in common so that peer learning occurs and traffic is not dropped. This requirement applies to VXLAN bud-node topologies.
-
The NVE source-interface loopback for a VTEP must use an IPv4 address. Using an IPv6 address for the NVE source interface is not supported.
-
The next-hop address in the overlay (in BGP L2VPN EVPN address family updates) must resolve in the underlay URIB to the same address family. For example, a fabric that uses IPv4 VTEP (NVE source loopback) addresses should have BGP L2VPN EVPN peering only over IPv4 addresses.
-
The following features are not supported:
-
Consistency checkers are not supported for VXLAN tables.
-
DHCP snooping and DAI features are not supported on VXLAN VLANs.
-
IPv6 for VXLAN EVPN ESI MH is not supported.
-
Native VLANs for VXLAN are not supported. All traffic on VXLAN Layer 2 trunks needs to be tagged. This limitation is applicable to Cisco Nexus 9300 and 9500 switches with 95xx line cards. This is not applicable to Cisco Nexus 9200, 9300-EX, 9300-FX, and 9500 platform switches with -EX or -FX line cards.
-
QoS buffer-boost is not applicable for VXLAN traffic.
-
QoS classification is not supported for VXLAN traffic in the network-to-host direction as ingress policy on uplink interface.
-
Static MAC pointing to remote VTEP (VXLAN Tunnel End Point) is not supported with BGP EVPN (Ethernet VPN).
-
TX SPAN (Switched Port Analyzer) for VXLAN traffic is not supported for the access-to-network direction.
-
VXLAN routing and VXLAN Bud Nodes features on the 3164Q platform are not supported.
-
-
The following ACL related features are not supported:
-
Egress RACL that is applied on an uplink Layer 3 interface that matches on the inner or outer payload in the access-to-network direction (encapsulated path).
-
Ingress RACL that is applied on an uplink Layer 3 interface that matches on the inner or outer payload in the network-to-access direction (decapsulated path).
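The guidelines above recommend binding NVE to a dedicated IPv4 loopback that is separate from the loopbacks used by Layer 3 protocols. A minimal sketch of that layout follows; the loopback numbers and all IP addresses are assumptions made for illustration only.

```text
feature nv overlay

! loopback0: used by Layer 3 protocols (hypothetical address)
interface loopback0
  ip address 10.0.0.1/32

! loopback1: dedicated to VXLAN, IPv4 only (hypothetical address)
interface loopback1
  ip address 10.10.10.1/32

interface nve1
  no shutdown
  source-interface loopback1
```

Keeping the VTEP address on its own loopback means that changes to the routing-protocol loopback do not affect the VXLAN tunnel source, and the reverse.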
-
Considerations for VXLAN Deployment
-
When configuring VXLAN BGP EVPN, only the "System Routing Mode: Default" is applicable for the following hardware platforms:
-
Cisco Nexus 9200/9300-EX/FX/FX2
-
Cisco Nexus 9300 platform switches
-
Cisco Nexus 9500 platform switches with X9500 line cards
-
Cisco Nexus 9500 platform switches with X9700-EX/FX/FX2 line cards
-
-
The “System Routing Mode: template-vxlan-scale” is not applicable to Cisco NX-OS Release 7.0(3)I5(2) and later.
-
When using VXLAN BGP EVPN in combination with Cisco NX-OS Release 7.0(3)I4(x) or NX-OS Release 7.0(3)I5(1), the “System Routing Mode: template-vxlan-scale” is required on the following hardware platforms:
-
Cisco Nexus 9300-EX Switches
-
Cisco Nexus 9500 Switches with X9700-EX line cards
-
-
Changing the “System Routing Mode” requires a reload of the switch.
-
A loopback address is required when using the source-interface config command. The loopback address represents the local VTEP IP.
-
During boot-up of a switch (7.0(3)I2(2) and later), you can use the source-interface hold-down-time hold-down-time command to suppress advertisement of the NVE loopback address until the overlay has converged. The range for the hold-down-time is 0 - 2147483647 seconds. The default is 300 seconds.
-
To establish IP multicast routing in the core, IP multicast configuration, PIM configuration, and RP configuration are required.
-
VTEP-to-VTEP unicast reachability can be configured through any IGP.
-
In VXLAN flood and learn mode (7.0(3)I1(2) and earlier), the default gateway for VXLAN VLANs should be provisioned on external routing devices.
In VXLAN flood and learn mode (7.0(3)I2(1) and later), the default gateway for VXLAN VLAN is recommended to be a centralized gateway on a pair of vPC devices with FHRP (First Hop Redundancy Protocol) running between them.
In BGP EVPN, it is recommended to use the anycast gateway feature on all VTEPs.
-
For flood and learn mode (7.0(3)I2(1) and later), only a centralized Layer 3 gateway is supported. Anycast gateway is not supported. The recommended Layer 3 gateway design is a pair of switches in vPC acting as the centralized Layer 3 gateway, with FHRP running on the SVIs. The same SVIs cannot span multiple VTEPs, even if different IP addresses in the same subnet are used.
Note
When configuring SVI with flood and learn mode on the central gateway leaf, it is mandatory to configure hardware access-list tcam region arp-ether size double-wide. (You must decrease the size of an existing TCAM region before using this command.)
For example:
hardware access-list tcam region arp-ether 256 double-wide
Note
Configuring the hardware access-list tcam region arp-ether size double-wide is not required on Cisco Nexus 9200 Series switches.
-
When configuring ARP suppression with BGP-EVPN, use the hardware access-list tcam region arp-ether size double-wide command to accommodate ARP in this region. (You must decrease the size of an existing TCAM region before using this command.)
Note
This step is required for Cisco Nexus 9300 switches (NFE/ALE) and Cisco Nexus 9500 switches with N9K-X9564PX, N9K-X9564TX, and N9K-X9536PQ line cards. This step is not needed with Cisco Nexus 9200 switches, Cisco Nexus 9300-EX switches, or Cisco Nexus 9500 switches with N9K-X9732C-EX line cards.
-
VXLAN tunnels cannot have more than one underlay next hop on a given underlay port. For example, on a given output underlay port, only one destination MAC address can be derived as the outer MAC address.
This is a per-port limitation, not a per-tunnel limitation: two tunnels that are reachable through the same underlay port cannot drive two different outer MAC addresses.
-
When changing the IP address of a VTEP device, you must shut the NVE interface before changing the IP address.
-
As a best practice, the RP for the multicast group should be configured only on the spine layer. Use the anycast RP for RP load balancing and redundancy.
The following is an example of an anycast RP configuration on spines:
ip pim rp-address 1.1.1.10 group-list 224.0.0.0/4
ip pim anycast-rp 1.1.1.10 1.1.1.1
ip pim anycast-rp 1.1.1.10 1.1.1.2
Note
-
1.1.1.10 is the anycast RP IP address that is configured on all RPs participating in the anycast RP set.
-
1.1.1.1 is the local RP IP.
-
1.1.1.2 is the peer RP IP.
-
-
Static ingress replication and BGP EVPN ingress replication do not require any IP Multicast routing in the underlay.
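As a rough sketch of the two ingress replication modes mentioned above, the fragments below show how a VNI might be configured without any underlay multicast group. The two options are alternatives, not one combined configuration. The VNI number, loopback number, and peer addresses are hypothetical.

```text
! Option 1: static ingress replication, remote VTEPs listed explicitly
interface nve1
  no shutdown
  source-interface loopback1
  member vni 10100
    ingress-replication protocol static
      peer-ip 10.10.10.2
      peer-ip 10.10.10.3

! Option 2: BGP EVPN ingress replication, remote VTEPs learned through BGP
interface nve1
  no shutdown
  source-interface loopback1
  host-reachability protocol bgp
  member vni 10100
    ingress-replication protocol bgp
```

In both options the VNI has no mcast-group statement, which is why PIM and RP configuration are not needed in the underlay for these modes.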
vPC Considerations for VXLAN Deployment
-
As a best practice, when feature vpc is enabled or disabled on a VTEP, the NVE interfaces on both the vPC primary and the vPC secondary must be shut down before the change is made. Enabling feature vpc without the vPC domain being properly configured will result in the NVE loopback being held administratively down until the configuration is completed and the vPC peer-link is brought up.
-
Bind NVE to a loopback address that is separate from other loopback addresses that are required by Layer 3 protocols. A best practice is to use a dedicated loopback address for VXLAN.
-
On vPC VXLAN, it is recommended to increase the delay restore interface-vlan timer under the vPC configuration if the number of SVIs is scaled up. For example, if there are 1000 VNIs with 1000 SVIs, it is recommended to increase the delay restore interface-vlan timer to 45 seconds.
-
If a ping is initiated to the attached hosts on VXLAN VLAN from a vPC VTEP node, the source IP address used by default is the anycast IP that is configured on the SVI. This ping can fail to get a response from the host in case the response is hashed to the vPC peer node. This issue can happen when a ping is initiated from a VXLAN vPC node to the attached hosts without using a unique source IP address. As a workaround for this situation, use VXLAN OAM or create a unique loopback on each vPC VTEP and route the unique address via a backdoor path.
-
The loopback address used by NVE needs to be configured to have a primary IP address and a secondary IP address.
The secondary IP address is used for all VXLAN traffic, which includes multicast and unicast encapsulated traffic.
-
vPC peers must have identical configurations.
-
Consistent VLAN to VN-segment mapping.
-
Consistent NVE1 binding to the same loopback interface
-
Using the same secondary IP address.
-
Using different primary IP addresses.
-
-
Consistent VNI to group mapping.
-
-
For multicast, the vPC node that receives the (S, G) join from the RP (rendezvous point) becomes the DF (designated forwarder). On the DF node, encap routes are installed for multicast.
Decap routes are installed based on the election of a decapper from between the vPC primary node and the vPC secondary node. The winner of the decap election is the node with the least cost to the RP. However, if the cost to the RP is the same for both nodes, the vPC primary node is elected.
The winner of the decap election has the decap mroute installed. The other node does not have a decap route installed.
-
On a vPC device, BUM traffic (broadcast, unknown-unicast, and multicast traffic) from hosts is replicated on the peer-link. A copy is made of every native packet and each native packet is sent across the peer-link to service orphan-ports connected to the peer vPC switch.
To prevent traffic loops in VXLAN networks, native packets ingressing the peer-link cannot be sent to an uplink. However, if the peer switch is the encapper, the copied packet traverses the peer-link and is sent to the uplink.
Note
Each copied packet is sent on a special internal VLAN (VLAN 4041).
-
When peer-link is shut, the loopback interface used by NVE on the vPC secondary is brought down and the status is Admin Shut. This is done so that the route to the loopback is withdrawn on the upstream and that the upstream can divert all traffic to the vPC primary.
Note
Orphans connected to the vPC secondary will experience loss of traffic for the period that the peer-link is shut. This is similar to Layer 2 orphans in a vPC secondary of a traditional vPC setup.
-
When the vPC domain is shut, the loopback interface used by NVE on the VTEP with shutdown vPC domain is brought down and the status is Admin Shut. This is done so that the route to the loopback is withdrawn on the upstream and that the upstream can divert all traffic to the other vPC VTEP.
-
When peer-link is no-shut, the NVE loopback address is brought up again and the route is advertised upstream, attracting traffic.
-
For vPC, the loopback interface has 2 IP addresses: the primary IP address and the secondary IP address.
The primary IP address is unique and is used by Layer 3 protocols.
The secondary IP address on the loopback is necessary because the NVE interface uses it for the VTEP IP address. The secondary IP address must be the same on both vPC peers.
-
The vPC peer-gateway feature must be enabled on both peers.
As a best practice, use the peer-switch, peer-gateway, ip arp synchronize, and ipv6 nd synchronize configurations for improved convergence in vPC topologies.
In addition, increase the STP hello timer to 4 seconds to avoid unnecessary TCN generations when vPC role changes occur.
The following is an example (best practice) of a vPC configuration:
switch# sh ru vpc
version 6.1(2)I3(1)
feature vpc
vpc domain 2
  peer-switch
  peer-keepalive destination 172.29.206.65 source 172.29.206.64
  peer-gateway
  ipv6 nd synchronize
  ip arp synchronize
-
When the NVE or loopback is shut in vPC configurations:
-
If the NVE or loopback is shut only on the primary vPC switch, the global VXLAN vPC consistency checker fails. Then the NVE, loopback, and vPCs are taken down on the secondary vPC switch.
-
If the NVE or loopback is shut only on the secondary vPC switch, the global VXLAN vPC consistency checker fails. Then the NVE, loopback, and secondary vPC are brought down on the secondary. Traffic continues to flow through the primary vPC switch.
As a best practice, you should keep both the NVE and loopback up on both the primary and secondary vPC switches.
-
Redundant anycast RPs configured in the network for multicast load-balancing and RP redundancy are supported on vPC VTEP topologies.
-
Enabling the vpc peer-gateway configuration is mandatory. For peer-gateway functionality, at least one backup routing SVI must be enabled across the peer-link and configured with PIM. This provides a backup routing path when the VTEP completely loses connectivity to the spine. Remote peer reachability is rerouted over the peer-link in this case.
The following is an example of backup SVI with PIM enabled:
switch# sh ru int vlan 2
interface Vlan2
  description backupl_svi_over_peer-link
  no shutdown
  ip address 30.2.1.1/30
  ip router ospf 1 area 0.0.0.0
  ip pim sparse-mode
  ip igmp static-oif route-map match-mcast-groups
route-map match-mcast-groups permit 1
  match ip multicast group 225.1.1.1/32
Note
In BUD node topologies, the backup SVI needs to be added as a static OIF for each underlay multicast group.
Note
The SVI must be configured on both vPC peers and requires PIM to be enabled.
-
As a best practice when changing the secondary IP address of an anycast vPC VTEP, the NVE interfaces on both the vPC primary and the vPC secondary should be shut before the IP changes are made.
-
Using the ip forward command enables the VTEP to forward the VXLAN de-capsulated packet destined to its router IP to the SUP/CPU.
-
Before configuring it as an SVI, the backup VLAN needs to be configured on Cisco Nexus 9200, 9300-EX, 9300-FX, and 9300-FX2 platform switches as an infra-VLAN with the system nve infra-vlans command.
-
When ARP suppression is enabled or disabled in a vPC setup, downtime is required because the global VXLAN vPC consistency checker fails and the VLANs are suspended if ARP suppression is enabled or disabled on only one side.
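The loopback and timer guidance in this section can be sketched as follows for one peer of a vPC VTEP pair. This is an illustrative fragment only: the vPC domain number, loopback number, and addresses are assumptions, and ip pim sparse-mode on the loopback applies only when the underlay uses multicast.

```text
interface loopback1
  ! Primary address: unique per vPC peer (the other peer would use a different one)
  ip address 10.10.10.1/32
  ! Secondary address: identical on both vPC peers, used as the VTEP IP
  ip address 10.10.10.100/32 secondary
  ip pim sparse-mode

interface nve1
  no shutdown
  source-interface loopback1

vpc domain 2
  peer-switch
  peer-gateway
  ip arp synchronize
  ipv6 nd synchronize
  ! Raised for a large SVI count, following the 45-second example above
  delay restore interface-vlan 45
```

The other vPC peer would carry the same configuration except for the primary loopback address, which keeps the consistency requirements listed above satisfied.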
Network Considerations for VXLAN Deployments
-
MTU Size in the Transport Network
Due to the MAC-to-UDP encapsulation, VXLAN introduces 50-byte overhead to the original frames. Therefore, the maximum transmission unit (MTU) in the transport network must be increased by 50 bytes. If the overlays use a 1500-byte MTU, the transport network must be configured to accommodate 1550-byte packets at a minimum. Jumbo-frame support in the transport network is required if the overlay applications tend to use larger frame sizes than 1500 bytes.
-
ECMP and LACP Hashing Algorithms in the Transport Network
As described in a previous section, Cisco Nexus 9000 Series Switches introduce a level of entropy in the source UDP port for ECMP and LACP hashing in the transport network. To take advantage of this entropy, the transport network should use an ECMP or LACP hashing algorithm that takes the UDP source port as an input, which achieves the best load-sharing results for VXLAN encapsulated traffic.
-
Multicast Group Scaling
The VXLAN implementation on Cisco Nexus 9000 Series Switches uses multicast tunnels for broadcast, unknown unicast, and multicast traffic forwarding. Ideally, one VXLAN segment mapping to one IP multicast group is the way to provide the optimal multicast forwarding. It is possible, however, to have multiple VXLAN segments share a single IP multicast group in the core network. VXLAN can support up to 16 million logical Layer 2 segments, using the 24-bit VNID field in the header. With one-to-one mapping between VXLAN segments and IP multicast groups, an increase in the number of VXLAN segments causes a parallel increase in the required multicast address space and the number of forwarding states on the core network devices. At some point, multicast scalability in the transport network can become a concern. In this case, mapping multiple VXLAN segments to a single multicast group can help conserve multicast control plane resources on the core devices and achieve the desired VXLAN scalability. However, this mapping comes at the cost of suboptimal multicast forwarding. Packets forwarded to the multicast group for one tenant are now sent to the VTEPs of other tenants that are sharing the same multicast group. This causes inefficient utilization of multicast data plane resources. Therefore, this solution is a trade-off between control plane scalability and data plane efficiency.
Despite the suboptimal multicast replication and forwarding, having multitenant VXLAN networks share a multicast group does not affect the Layer 2 isolation between the tenant networks. After receiving an encapsulated packet from the multicast group, a VTEP checks and validates the VNID in the VXLAN header of the packet. The VTEP discards the packet if the VNID is unknown to it. Only when the VNID matches one of the VTEP’s local VXLAN VNIDs does it forward the packet to that VXLAN segment. Other tenant networks will not receive the packet. Thus, the segregation between VXLAN segments is not compromised.
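To make the MTU and multicast group considerations above concrete, the following sketch raises the MTU on an underlay uplink and maps two VXLAN segments to one shared multicast group. With a 1500-byte overlay MTU, the minimum underlay MTU is 1500 + 50 = 1550 bytes; the jumbo value of 9216 used here is an assumed choice, as are the interface, VNI, and group values.

```text
! Underlay uplink: a jumbo MTU leaves room for the 50-byte VXLAN overhead
interface Ethernet1/49
  no switchport
  mtu 9216

! Two VXLAN segments sharing one underlay multicast group
interface nve1
  member vni 10100
    mcast-group 239.1.1.1
  member vni 10200
    mcast-group 239.1.1.1
```

Sharing the group reduces multicast state in the core at the cost of delivering BUM traffic for one segment to VTEPs that host only the other, which is the trade-off described above.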
Considerations for the Transport Network
The following are considerations for the configuration of the transport network:
-
On the VTEP device:
-
Enable and configure IP multicast.*
-
Create and configure a loopback interface with a /32 IP address.
(For vPC VTEPs, you must configure primary and secondary /32 IP addresses.)
-
Enable IP multicast on the loopback interface. *
-
Advertise the loopback interface /32 addresses through the routing protocol (static route) that runs in the transport network.
-
Enable IP multicast on the uplink outgoing physical interface. *
-
-
Throughout the transport network:
-
Enable and configure IP multicast.*
-
On Cisco Nexus 9200, 9300-EX, 9300-FX, and 9300-FX2 platform switches, the system nve infra-vlans command is required; otherwise, VXLAN traffic (IP/UDP 4789) is actively treated by the switch. The following list is not exhaustive, but it covers the most common scenarios that require a system nve infra-vlans definition.
Every VLAN that is not associated with a VNI (vn-segment) is required to be configured as system nve infra-vlans in the following cases:
In the case of VXLAN flood and learn as well as VXLAN EVPN, the presence of non-VXLAN VLANs could be related to:
-
An SVI related to a non-VXLAN VLAN is used for backup underlay routing between vPC peers via a vPC peer-link (backup routing).
-
An SVI related to a non-VXLAN VLAN is required for connecting downstream routers (external connectivity, dynamic routing over vPC).
-
An SVI related to a non-VXLAN VLAN is required for per Tenant-VRF peering (L3 route sync and traffic between vPC VTEPs in a Tenant VRF).
-
An SVI related to a non-VXLAN VLAN is used for first-hop routing toward endpoints (Bud-Node).
In the case of VXLAN flood and learn, the presence of non-VXLAN VLANs could be related to:
-
An SVI related to a non-VXLAN VLAN is used for an underlay uplink toward the spine (Core port).
The rule of defining VLANs as system nve infra-vlans can be relaxed for special cases such as:
-
An SVI related to a non-VXLAN VLAN that does not transport VXLAN traffic (IP/UDP 4789).
-
Non-VXLAN VLANs that are not associated with an SVI or not transporting VXLAN traffic (IP/UDP 4789).
Note
You must not configure certain combinations of infra-VLANs that are 512 apart, for example, 2 and 514 or 10 and 522. This applies specifically, but not exclusively, to the "Core port" scenario that is described for VXLAN flood and learn.
Note
* Not required for static ingress replication or BGP EVPN ingress replication.
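The VTEP-side steps listed above might look like the following for a multicast underlay. This is a sketch under stated assumptions: OSPF is assumed as the underlay routing protocol, the RP address is reused from the anycast RP example earlier in this document, and the interface numbers, addresses, and backup VLAN ID are hypothetical.

```text
feature ospf
feature pim

router ospf 1

! RP for the underlay multicast groups
ip pim rp-address 1.1.1.10 group-list 224.0.0.0/4

! Loopback with a /32 address, advertised into the underlay and PIM enabled
interface loopback1
  ip address 10.10.10.1/32
  ip router ospf 1 area 0.0.0.0
  ip pim sparse-mode

! Uplink toward the spine with PIM enabled
interface Ethernet1/49
  no switchport
  ip address 10.0.0.1/30
  ip router ospf 1 area 0.0.0.0
  ip pim sparse-mode
  no shutdown

! On Cisco Nexus 9200, 9300-EX, 9300-FX, and 9300-FX2 only:
! declare the non-VXLAN backup routing VLAN as an infra-VLAN
system nve infra-vlans 2
```

For static or BGP EVPN ingress replication, the PIM-related lines would be omitted, in line with the starred items above.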
Considerations for Tunneling VXLAN
DC Fabrics with VXLAN BGP EVPN are becoming the transport infrastructure for overlays. These overlays, often originated on the server (Host Overlay), require integration or transport over the top of the existing transport infrastructure (Network Overlay).
Nested VXLAN (Host Overlay over Network Overlay) support has been added starting with Cisco NX-OS Release 7.0(3)I7(4) on the Cisco Nexus 9200, 9300-EX, 9300-FX, 9300-FX2, 9500-EX, and 9500-FX platform switches.
To provide Nested VXLAN support, the switch hardware and software must differentiate between two different VXLAN profiles:
-
VXLAN originated behind the Hardware VTEP for transport over VXLAN BGP EVPN (nested VXLAN)
-
VXLAN originated behind the Hardware VTEP to integrate with VXLAN BGP EVPN (BUD Node)
The detection of the two different VXLAN profiles is automatic and no specific configuration is needed for nested VXLAN. As soon as VXLAN encapsulated traffic arrives in a VXLAN enabled VLAN, the traffic is transported over the VXLAN BGP EVPN enabled DC Fabric.
The following attachment modes are supported for Nested VXLAN:
-
Untagged traffic (in native VLAN on a trunk port or on an access port)
-
Tagged traffic (tagged VLAN on an IEEE 802.1Q trunk port)
-
Untagged and tagged traffic that is attached to a vPC domain
-
Untagged traffic on a Layer 3 interface of a Layer 3 port-channel interface
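Because nested VXLAN needs no specific configuration, an attachment looks like any other port in a VXLAN-enabled VLAN. The fragment below is a hedged sketch of the tagged attachment mode only; the VLAN, VNI, and interface numbers are assumptions and are not taken from this guide.

```text
feature vn-segment-vlan-based

! VXLAN-enabled VLAN that receives the host overlay traffic
vlan 100
  vn-segment 10100

! Tagged attachment on an IEEE 802.1Q trunk port
interface Ethernet1/10
  switchport
  switchport mode trunk
  switchport trunk allowed vlan 100
```

Host-originated VXLAN packets that arrive tagged in VLAN 100 would then be carried over the fabric in the VNI mapped to that VLAN.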