Cisco ACI Forwarding

Forwarding Within the Fabric

ACI Fabric Optimizes Modern Data Center Traffic Flows

The Cisco ACI architecture addresses the limitations of traditional data center design and provides support for the increased east-west traffic demands of modern data centers.

Today, application design drives east-west traffic from server to server through the data center access layer. Applications driving this shift include big data distributed processing designs like Hadoop, live virtual machine or workload migration as with VMware vMotion, server clustering, and multi-tier applications.

North-south traffic drives traditional data center design with core, aggregation, and access layers, or collapsed core and access layers. Client data comes in from the WAN or Internet, a server processes it, and then it exits the data center. Because WAN or Internet bandwidth constrains total throughput, data center hardware can be oversubscribed. However, these Layer 2 designs require Spanning Tree Protocol to block loops, which limits available bandwidth on blocked links and can force traffic onto suboptimal paths.

In traditional data center designs, IEEE 802.1Q VLANs provide logical segmentation of Layer 2 boundaries or broadcast domains. However, VLANs use network links inefficiently, requirements for device placement in the data center network can be rigid, and the maximum of 4094 usable VLANs can be a limitation. As IT departments and cloud providers build large multi-tenant data centers, these VLAN limitations become problematic.

A spine-leaf architecture addresses these limitations. The ACI fabric appears as a single switch to the outside world, capable of bridging and routing. Moving Layer 3 routing to the access layer would limit the Layer 2 reachability that modern applications require. Applications like virtual machine workload mobility and some clustering software require Layer 2 adjacency between source and destination servers. By routing at the access layer, only servers connected to the same access switch with the same VLANs trunked down would be Layer 2-adjacent. In ACI, VXLAN solves this dilemma by decoupling Layer 2 domains from the underlying Layer 3 network infrastructure.

Figure 1. ACI Fabric

As traffic enters the fabric, ACI encapsulates it and applies policy to it, forwards it as needed across the fabric through a spine switch (a maximum of two hops), and de-encapsulates it upon exiting the fabric. Within the fabric, ACI uses the Intermediate System-to-Intermediate System (IS-IS) protocol and the Council of Oracle Protocol (COOP) for all forwarding of endpoint-to-endpoint communications. This keeps all fabric links active, provides equal-cost multipath (ECMP) forwarding, and enables fast reconvergence. To propagate routing information between the software-defined networks within the fabric and routers external to the fabric, ACI uses the Multiprotocol Border Gateway Protocol (MP-BGP).

VXLAN in ACI

VXLAN in Cisco ACI is a protocol that extends Layer 2 segments over Layer 3 infrastructure, enabling the creation of scalable and flexible overlay networks in data centers.

  • All traffic in the Cisco ACI fabric is normalized as VXLAN packets, encapsulating external VLAN, VXLAN, and NVGRE packets.

  • VXLAN enables isolated broadcast and failure domains (bridge domains), reducing the risk of large failure domains.

  • The VXLAN header includes policy attributes and a 24-bit VNID, supporting up to 16 million unique Layer 2 segments.

VXLAN Encapsulation and Policy Enforcement in Cisco ACI

VXLAN in Cisco ACI encapsulates various packet types and enforces policy attributes, allowing for distributed and consistent policy enforcement across the fabric.

  • Encapsulates external VLAN, VXLAN, and NVGRE packets into VXLAN packets at ingress.

  • Supports isolated broadcast and failure domains (bridge domains) in the overlay.

  • Decouples application policy EPG identity from forwarding, enabling flexible endpoint placement.

  1. Ingress packets are encapsulated as VXLAN packets.

  2. Policy attributes are carried in every packet for distributed enforcement.

  3. Forwarding is not constrained by encapsulation type or overlay network.

  • Layer 2 MAC address: Source and destination fields for efficient forwarding.

  • Layer 3 IP address: Source and destination fields for scalable routing.

  • VNID: 24-bit field for up to 16 million unique segments.
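The VXLAN header layout described above can be sketched in a few lines of Python. This is a minimal illustration of the standard 8-byte VXLAN header defined in RFC 7348, where the 24-bit VNID yields roughly 16 million segments; note that Cisco ACI's internal VXLAN format additionally carries policy attributes (such as the source EPG class) in fields that are reserved in standard VXLAN, which is not modeled here.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field is valid (RFC 7348)

def build_vxlan_header(vnid: int) -> bytes:
    """Build the 8-byte standard VXLAN header for a 24-bit VNID."""
    if not 0 <= vnid < 2**24:  # 24 bits -> 16,777,216 possible segments
        raise ValueError("VNID must fit in 24 bits")
    # Layout: flags(1 byte) + reserved(3) + VNID(3) + reserved(1).
    # Cisco ACI's iVXLAN carries policy attributes in otherwise-reserved
    # fields; this sketch models only the standard header.
    return struct.pack("!B3s3sB", VXLAN_FLAG_VNI_VALID, b"\x00" * 3,
                       vnid.to_bytes(3, "big"), 0)

def parse_vnid(header: bytes) -> int:
    """Extract the 24-bit VNID from bytes 4-6 of a VXLAN header."""
    return int.from_bytes(header[4:7], "big")

hdr = build_vxlan_header(0x123456)
assert len(hdr) == 8
assert parse_vnid(hdr) == 0x123456
```

The 24-bit field is what lifts the segment limit from 4094 usable VLANs (12-bit ID) to about 16 million VXLAN segments.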

Table 1. VXLAN and Traditional VLAN Comparison

Attributes         | VXLAN                             | VLAN
-------------------|-----------------------------------|------------------------------
Address Space      | 16 million segments (24-bit VNID) | 4096 segments (12-bit VLAN ID)
Overlay Capability | Yes, over Layer 3                 | No, Layer 2 only


Note

VXLAN in Cisco ACI allows for scalable, policy-driven Layer 2 overlays across Layer 3 infrastructure, supporting large multitenant data centers.


Figure 2. Cisco ACI Encapsulation Normalization
Figure 3. Cisco ACI VXLAN Packet Format

Example: VXLAN Overlay in Cisco ACI

For example, an application endpoint host can be placed anywhere in the data center network, regardless of the Layer 3 boundary, and still maintain Layer 2 adjacency through the VXLAN overlay network in Cisco ACI.

Layer 3 VNIDs Facilitate Transporting Inter-subnet Tenant Traffic

Layer 3 VNIDs are identifiers assigned to each tenant VRF in the Cisco ACI fabric that enable routing and transport of inter-subnet tenant traffic across the fabric.

  • Each tenant VRF receives a single L3 VNID, which is used to route traffic between subnets within that VRF.

  • The ACI fabric provides a distributed virtual default gateway, sharing the same IP and MAC address across all ingress interfaces for a tenant subnet.

  • VXLAN encapsulation and VTEP devices decouple endpoint identity from location, allowing efficient forwarding and routing without spanning-tree protocol.

How Layer 3 VNIDs Enable Inter-subnet Routing in Cisco ACI

The Cisco ACI fabric provides tenant default gateway functionality that routes between VXLAN networks. For each tenant, the fabric provides a virtual default gateway that spans all of the leaf switches assigned to the tenant. This gateway operates at the ingress interface of the first leaf switch connected to the endpoint, and all ingress interfaces share the same router IP and MAC address for a given tenant subnet.

The Cisco ACI fabric decouples the tenant endpoint address (its identifier) from the location of the endpoint, which is defined by its VXLAN tunnel endpoint (VTEP) address. Forwarding within the fabric is performed between VTEPs.

  • Each VTEP has a switch interface on the local LAN segment for local endpoint communication through bridging.

  • Each VTEP has an IP interface to the transport IP network, with a unique IP address identifying the VTEP device on the infrastructure VLAN.

The VTEP device uses its IP interface to encapsulate Ethernet frames and transmit them to the transport network. It also discovers remote VTEPs and learns remote MAC Address-to-VTEP mappings through its IP interface.

The VTEP in Cisco ACI maps the internal tenant MAC or IP address to a location using a distributed mapping database. After a lookup, the VTEP sends the original data packet encapsulated in VXLAN to the destination VTEP. The destination leaf switch de-encapsulates the packet and delivers it to the receiving host. This model enables a full mesh, single hop, loop-free topology without the need for spanning-tree protocol.
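The lookup-then-encapsulate flow described above can be sketched as a small Python model. All values here are hypothetical (made-up endpoint MACs and VTEP IPs), and the "encapsulation" is a stand-in byte prefix rather than a real VXLAN header; the point is the identity-to-location lookup replacing flood-and-learn forwarding.

```python
# Hypothetical sketch of the distributed mapping database: each endpoint
# identity (MAC) maps to the VTEP IP of the leaf that currently hosts it.
mapping_db = {
    "00:50:56:aa:01:01": "10.0.64.95",   # endpoint MAC -> VTEP IP (leaf-101)
    "00:50:56:aa:02:02": "10.0.64.96",   # endpoint MAC -> VTEP IP (leaf-102)
}

def forward(dst_mac: str, frame: bytes) -> tuple:
    """Look up the destination VTEP, then 'encapsulate' the original frame."""
    dst_vtep = mapping_db[dst_mac]        # lookup, not flood-and-learn
    vxlan_packet = b"VXLAN:" + frame      # stand-in for real encapsulation
    # The packet is routed on the outer IP header toward dst_vtep, which
    # de-encapsulates it and delivers the original frame to the host.
    return dst_vtep, vxlan_packet

vtep, pkt = forward("00:50:56:aa:02:02", b"original-ethernet-frame")
assert vtep == "10.0.64.96"
assert pkt.endswith(b"original-ethernet-frame")
```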

The VXLAN segments are independent of the underlying network topology, and the underlying IP network between VTEPs is independent of the VXLAN overlay. Routing is based on the outer IP address header, with the initiating VTEP as the source and the terminating VTEP as the destination.

For each tenant VRF, Cisco ACI assigns a single L3 VNID. Traffic is transported across the fabric according to the L3 VNID, and at the egress leaf switch, ACI routes the packet from the L3 VNID to the VNID of the egress subnet.

Traffic arriving at the fabric ingress and sent to the Cisco ACI default gateway is routed into the Layer 3 VNID, providing efficient forwarding for intra-tenant routed traffic. For example, traffic between two VMs in the same tenant but on different subnets only needs to reach the ingress switch interface before being routed to the correct destination.
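The two-stage VNID handling described above — route into the VRF's L3 VNID at ingress, rewrite to the egress subnet's VNID at the egress leaf — can be sketched as follows. All VNID and subnet values are hypothetical, chosen only to make the rewrite step concrete.

```python
# Hypothetical values: one L3 VNID per tenant VRF, plus a per-subnet
# (bridge-domain) VNID for each subnet in that VRF.
VRF_L3_VNID = 2916352
SUBNET_VNID = {"10.1.1.0/24": 15630200,
               "10.1.2.0/24": 15597969}

def ingress_route(dst_subnet: str) -> int:
    """Ingress leaf routes inter-subnet traffic into the VRF's L3 VNID."""
    return VRF_L3_VNID

def egress_rewrite(transit_vnid: int, dst_subnet: str) -> int:
    """Egress leaf maps the L3 VNID to the VNID of the egress subnet."""
    assert transit_vnid == VRF_L3_VNID   # transit was tagged per-VRF
    return SUBNET_VNID[dst_subnet]

# Traffic between two subnets of the same VRF crosses the fabric once,
# tagged with the single L3 VNID, and is rewritten only at egress.
transit = ingress_route("10.1.2.0/24")
assert egress_rewrite(transit, "10.1.2.0/24") == 15597969
```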

To distribute external routes within the fabric, Cisco ACI route reflectors use multiprotocol BGP (MP-BGP). The fabric administrator provides the autonomous system (AS) number and specifies the spine switches that become route reflectors.


Note


Cisco ACI does not support IP fragmentation. Therefore, when you configure Layer 3 Outside (L3Out) connections to external routers, or Multi-Pod connections through an Inter-Pod Network (IPN), we recommend that the interface MTU be set appropriately on both ends of each link.

IGP protocol packets (EIGRP, OSPFv3) are constructed based on the interface MTU size. In Cisco ACI, if the CPU MTU size is less than the interface MTU size and the constructed packet size exceeds the CPU MTU, the kernel drops the packet; this occurs especially with IPv6. To avoid such control-packet drops, always configure the same MTU value on both the control plane and the interface.

On some platforms, such as Cisco ACI, Cisco NX-OS, and Cisco IOS, the configurable MTU value does not account for the Ethernet header: the configured value matches the IP MTU and excludes the 14-18 byte Ethernet header. Other platforms, such as IOS-XR, include the Ethernet header in the configured MTU value. A configured value of 9000 therefore results in a maximum IP packet size of 9000 bytes on Cisco ACI, Cisco NX-OS, and Cisco IOS, but a maximum IP packet size of 8986 bytes on an IOS-XR untagged interface.
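The per-platform MTU accounting above reduces to simple arithmetic, sketched here as a small helper. The function name and parameters are illustrative, not from any vendor API; the 14-byte figure is the untagged Ethernet header length.

```python
def max_ip_packet(configured_mtu: int, includes_ethernet_header: bool,
                  ethernet_header_len: int = 14) -> int:
    """Largest IP packet for a configured MTU, per platform convention.

    On Cisco ACI / NX-OS / IOS the configured MTU is the IP MTU itself;
    on IOS-XR it includes the Ethernet header (14 bytes when untagged).
    """
    if includes_ethernet_header:
        return configured_mtu - ethernet_header_len
    return configured_mtu

# A configured value of 9000 on each side of a link:
assert max_ip_packet(9000, includes_ethernet_header=False) == 9000  # ACI/NX-OS/IOS
assert max_ip_packet(9000, includes_ethernet_header=True) == 8986   # IOS-XR, untagged
```

The mismatch matters because ACI does not fragment: if one end computes a larger maximum IP packet than the other will accept, large packets are silently dropped.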

For the appropriate MTU values for each platform, see the relevant configuration guides.

We highly recommend that you test the MTU using CLI-based commands. For example, on the Cisco NX-OS CLI, use a command such as ping 1.1.1.1 df-bit packet-size 9000 source-interface ethernet 1/1.


Figure 4. ACI Decouples Identity and Location
Figure 5. Layer 3 VNIDs Transport ACI Inter-subnet Tenant Traffic