Implementing BGP

Border Gateway Protocol (BGP) is an Exterior Gateway Protocol (EGP) that allows you to create loop-free interdomain routing between autonomous systems. An autonomous system is a set of routers under a single technical administration. Routers in an autonomous system can use multiple Interior Gateway Protocols (IGPs) to exchange routing information inside the autonomous system and an EGP to route packets outside the autonomous system.

This module provides conceptual and configuration information on BGP.


Tip


You can programmatically configure BGP and retrieve operational data using the openconfig-network-instance.yang OpenConfig data model. To get started with using data models, see the Programmability Configuration Guide for Cisco 8000 Series Routers.


Prerequisites for Implementing BGP

  • You must be in a user group associated with a task group that includes the proper task IDs. The command reference guides include the task IDs required for each command. If you suspect user group assignment is preventing you from using a command, contact your AAA administrator for assistance.

  • The current Internet BGP table contains approximately 1.1 million IPv4 routes and 200,000 IPv6 routes. With an average of two paths per route, the BGP process typically requires around 5.5 GB of RAM to manage the full Internet BGP table. As the IPv6 Internet table continues to expand, the memory requirements for the BGP process are expected to increase. Therefore, Cisco recommends using the Service Edge (SE) version of Route Processor (RP) or Route Switch Processor (RSP) cards, or fixed chassis, on routers that will maintain a full BGP table.
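As a rough back-of-the-envelope check of the figures above, the following Python sketch derives the average memory per path that they imply. The per-path value is only an arithmetic consequence of the numbers quoted here (taking 1 GB as 10^9 bytes, which is an assumption), not a figure published by Cisco.

```python
# Figures quoted in the prerequisite above.
ipv4_routes = 1_100_000
ipv6_routes = 200_000
paths_per_route = 2        # average stated in the text
bgp_ram_bytes = 5.5e9      # 5.5 GB, assuming 1 GB = 10**9 bytes

# Total number of paths held by the BGP process.
total_paths = (ipv4_routes + ipv6_routes) * paths_per_route

# Average memory per path implied by the quoted RAM requirement.
bytes_per_path = bgp_ram_bytes / total_paths

print(f"total paths: {total_paths}")
print(f"implied average memory per path: {bytes_per_path:.0f} bytes")
```

Changing `ipv6_routes` or `paths_per_route` gives a quick feel for how growth in the table or in path diversity scales the estimate.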

BGP Functional Overview

BGP uses TCP as its transport protocol. Two BGP routers form a TCP connection between one another (peer routers) and exchange messages to open and confirm the connection parameters.

BGP routers exchange network reachability information. This information is mainly an indication of the full paths (BGP autonomous system numbers) that a route should take to reach the destination network. This information helps construct a graph that shows which autonomous systems are loop free and where routing policies can be applied to enforce restrictions on routing behavior.

Any two routers forming a TCP connection to exchange BGP routing information are called peers or neighbors. BGP peers initially exchange their full BGP routing tables. After this exchange, incremental updates are sent as the routing table changes. BGP keeps a version number of the BGP table, which is the same for all of its BGP peers. The version number changes whenever BGP updates the table due to routing information changes. Keepalive packets are sent to ensure that the connection is alive between the BGP peers and notification packets are sent in response to error or special conditions.
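The relationship between the table version and the initial full exchange followed by incremental updates can be illustrated with a small Python model. This is a teaching sketch only: the class and method names are hypothetical, withdrawals are not modeled, and it does not describe the actual IOS XR data structures.

```python
class BgpTable:
    """Toy model: one table version shared by all peers."""

    def __init__(self):
        self.version = 0
        self.routes = {}        # prefix -> (attributes, version of last change)
        self.peer_version = {}  # peer -> table version last sent to that peer

    def update(self, prefix, attrs):
        # Any change to the table bumps the single table version.
        self.version += 1
        self.routes[prefix] = (attrs, self.version)

    def add_peer(self, peer):
        # A new peer has seen nothing, so its first sync is the full table.
        self.peer_version[peer] = 0

    def sync(self, peer):
        # Send only the entries that changed since the peer's last sync.
        seen = self.peer_version[peer]
        delta = {p: a for p, (a, v) in self.routes.items() if v > seen}
        self.peer_version[peer] = self.version
        return delta


table = BgpTable()
table.update("192.0.2.0/24", "path A")
table.update("198.51.100.0/24", "path B")
table.add_peer("10.0.0.1")
full = table.sync("10.0.0.1")            # initial exchange: whole table
table.update("198.51.100.0/24", "path C")
delta = table.sync("10.0.0.1")           # afterwards: only what changed
print(len(full), delta)
```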


Note


Changing the ASN of the BGP process by using commit replace is not currently supported.


BGP Router Identifier

For BGP sessions between neighbors to be established, BGP must be assigned a router ID. The router ID is sent to BGP peers in the OPEN message when a BGP session is established.

BGP attempts to obtain a router ID in the following ways (in order of preference):

  • By means of the address configured using the bgp router-id command in router configuration mode.

  • By using the highest IPv4 address on a loopback interface in the system if the router is booted with saved loopback address configuration.

  • By using the primary IPv4 address of the first loopback address that gets configured if there are not any in the saved configuration.

If none of these methods for obtaining a router ID succeeds, BGP does not have a router ID and cannot establish any peering sessions with BGP neighbors. In such an instance, an error message is entered in the system log, and the show bgp summary command displays a router ID of 0.0.0.0.

After BGP has obtained a router ID, it continues to use it even if a better router ID becomes available. This usage avoids unnecessary flapping for all BGP sessions. However, if the router ID currently in use becomes invalid (because the interface goes down or its configuration is changed), BGP selects a new router ID (using the rules described) and all established peering sessions are reset.
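The selection order and the "keep the current ID while it is valid" rule can be sketched in Python as follows. The function and parameter names are hypothetical, and the sketch does not model how a later change to the bgp router-id command interacts with an ID already in use, because the text above does not describe that case.

```python
import ipaddress


def pick_router_id(configured=None, saved_loopbacks=(), new_loopbacks=()):
    """Apply the three methods in order of preference; None means no router ID.

    configured      -- address set with 'bgp router-id', if any
    saved_loopbacks -- loopback IPv4 addresses present in the saved config at boot
    new_loopbacks   -- primary loopback addresses in the order they were configured
    """
    if configured:
        return configured
    if saved_loopbacks:
        # Highest address numerically, not lexically ("10.0.0.10" > "10.0.0.9").
        return str(max(ipaddress.IPv4Address(a) for a in saved_loopbacks))
    if new_loopbacks:
        return new_loopbacks[0]
    return None  # no sessions can be established; summary shows 0.0.0.0


def keep_or_reselect(current, current_valid, **candidates):
    """Keep a valid router ID even if a better one exists; otherwise reselect."""
    if current is not None and current_valid:
        return current
    return pick_router_id(**candidates)


print(pick_router_id(saved_loopbacks=["10.0.0.9", "10.0.0.10"]))
print(keep_or_reselect("10.0.0.1", False, saved_loopbacks=["10.0.0.10"]))
```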


Note


We strongly recommend that the bgp router-id command is configured to prevent unnecessary changes to the router ID (and consequent flapping of BGP sessions).


BGP Route Distinguisher

In network design solutions where customer equipment is dual-homed and Fast Reroute is required, such as in EVPN and BGP PIC Edge solutions, the Route Distinguisher (RD) associated with each VRF must be unique per Provider Edge (PE) router. In other design scenarios, while it isn’t mandatory for the RD to be unique per PE, it is highly recommended to make it unique. This practice facilitates easier transitions to dual-homed solutions in the future.

There are a few options for keeping the RD unique per device:

  • Manual configuration: You must manually assign a unique value per device in the network. For example, in this scenario:

    • Leaf (ToR) = RD 1

    • Edge DCI Gateway = RD 2

    • Remote PE = RD 3

  • Use the rd auto command under the VRF. For this command to assign a unique route distinguisher on each router, each router must have a unique BGP router-id. The rd auto command then assigns a Type 1 route distinguisher to the VRF in the format ip-address:number. The IP address is the one specified by the BGP router-id statement, and the number (which is derived as an unused index in the 0 to 65535 range) is unique across the VRFs.
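The ip-address:number format produced by rd auto can be illustrated with a short Python sketch. The function name is hypothetical, and picking the lowest unused index is an assumption made for illustration; the text only states that an unused index in the 0 to 65535 range is used.

```python
def rd_auto(router_id, used_indexes):
    """Return a Type 1 RD string 'ip-address:number' for one VRF.

    router_id    -- the BGP router-id of this router (must be unique per router)
    used_indexes -- set of indexes already assigned to other VRFs on this router
    """
    for index in range(0, 65536):
        if index not in used_indexes:   # assumption: lowest unused index wins
            used_indexes.add(index)
            return f"{router_id}:{index}"
    raise RuntimeError("no unused index left in the 0 to 65535 range")


used_on_pe1 = set()
print(rd_auto("192.0.2.1", used_on_pe1))   # first VRF on PE1
print(rd_auto("192.0.2.1", used_on_pe1))   # second VRF on PE1
print(rd_auto("192.0.2.2", set()))         # first VRF on a different PE
```

Because the router-id differs per router and the index differs per VRF, the resulting RDs are unique per PE and per VRF, which is the property the dual-homed designs above require.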


Note


In a DCI deployment, for route re-originate with stitching-rt for a particular VRF, using the same Route Distinguisher (RD) between edge DCI gateway and MPLS-VPN PE or same RD between edge DCI gateway and Leaf (ToR) is not supported.


BGP Maximum Prefix - Discard Extra Paths

The IOS XR BGP maximum-prefix feature limits the number of prefixes that can be received from a neighbor for a given address family. The limit is configured by the user. By default, when the number of prefixes received exceeds the configured maximum, BGP sends a cease notification to the neighbor and terminates the session. The session stays down until you clear it manually with the clear bgp command. Alternatively, you can use the maximum-prefix command with the restart keyword to configure a period after which the session is brought up automatically.
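The default behavior described above (terminate on excess, then recover by manual clear or by the restart interval) can be modeled with a few lines of Python. The class and its methods are hypothetical names for illustration, not an IOS XR interface, and real session handling involves far more state.

```python
class NeighborAf:
    """Toy model of one neighbor address family with a maximum-prefix limit."""

    def __init__(self, max_prefixes, restart_minutes=None):
        self.max_prefixes = max_prefixes
        self.restart_minutes = restart_minutes  # None: no 'restart' keyword
        self.prefixes = set()
        self.state = "Established"

    def receive(self, prefix):
        if self.state != "Established":
            return
        self.prefixes.add(prefix)
        if len(self.prefixes) > self.max_prefixes:
            # Default action: cease notification, then the session goes down.
            self.prefixes.clear()
            self.state = "Idle"

    def clear(self):
        # Models manual recovery with the clear bgp command.
        self.state = "Established"

    def timer_expired(self, minutes_elapsed):
        # Models automatic recovery when the restart interval is configured.
        if (self.state == "Idle" and self.restart_minutes is not None
                and minutes_elapsed >= self.restart_minutes):
            self.state = "Established"


nbr = NeighborAf(max_prefixes=2, restart_minutes=5)
for p in ["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"]:
    nbr.receive(p)
print(nbr.state)          # session is down after the third prefix
nbr.timer_expired(5)
print(nbr.state)          # back up once the restart interval has passed
```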


Note


Starting IOS-XR Release 7.3.1, the router does not apply default limits if the user does not configure the maximum number of prefixes for the address family.


Discard Extra Paths

An option to discard extra paths is added to the maximum-prefix configuration. Configuring the discard extra paths option drops all excess prefixes received from the neighbor when the prefixes exceed the configured maximum value. This drop does not, however, result in session flap.

The benefits of discard extra paths option are:

  • Limits the memory footprint of BGP.

  • Stops the flapping of the peer if the paths exceed the set limit.

When the discard extra paths configuration is removed, BGP sends a route-refresh message to the neighbor if it supports the refresh capability; otherwise the session is flapped.

Similarly, the following actions apply when the maximum prefix value is changed:

  • If the maximum value alone is changed, a route-refresh message is sourced, if applicable.

  • If the new maximum value is greater than the current prefix count state, the new prefix states are saved.

  • If the new maximum value is less than the current prefix count state, then some existing prefixes are deleted to match the new configured state value.

There is currently no way to control which prefixes are deleted.
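A minimal Python sketch of the discard behavior and of lowering the limit is shown below. The names are hypothetical, route-refresh handling is not modeled, and the choice of which prefixes to delete when the limit is lowered is arbitrary here, consistent with the statement that this cannot be controlled.

```python
class DiscardingNeighborAf:
    """Toy model of maximum-prefix with the discard extra paths option."""

    def __init__(self, max_prefixes):
        self.max_prefixes = max_prefixes
        self.prefixes = []
        self.discarded = 0
        self.state = "Established"

    def receive(self, prefix):
        if prefix in self.prefixes:
            return
        if len(self.prefixes) >= self.max_prefixes:
            # Excess prefix is dropped; the session is not flapped.
            self.discarded += 1
            return
        self.prefixes.append(prefix)

    def set_maximum(self, new_max):
        self.max_prefixes = new_max
        # A lower limit deletes existing prefixes down to the new value.
        # Dropping the most recently learned ones is an arbitrary choice.
        del self.prefixes[new_max:]


nbr = DiscardingNeighborAf(max_prefixes=2)
for p in ["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"]:
    nbr.receive(p)
print(nbr.state, len(nbr.prefixes), nbr.discarded)
nbr.set_maximum(1)
print(len(nbr.prefixes))
```

The sketch also makes the main risk visible: once prefixes are discarded, this router's table no longer matches what the neighbor advertised.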

Configure Discard Extra Paths

The discard extra paths option in the maximum-prefix configuration allows you to drop all excess prefixes received from the neighbor when the prefixes exceed the configured maximum value. This drop does not, however, result in session flap.


Note


  • When the router drops prefixes, it is inconsistent with the rest of the network, resulting in possible routing loops.

  • If prefixes are dropped, the standby and active BGP sessions may drop different prefixes. Consequently, an NSR switchover results in inconsistent BGP tables.

  • The discard extra paths configuration cannot co-exist with the soft reconfig configuration.

  • When the system runs out of physical memory, the BGP process exits and you must manually restart bpm by using the process restart bpm command.


Perform this task to configure BGP maximum-prefix discard extra paths.

SUMMARY STEPS

  1. configure
  2. router bgp as-number
  3. neighbor ip-address
  4. address-family { ipv4 | ipv6 } unicast
  5. maximum-prefix maximum discard-extra-paths
  6. Use the commit or end command.

DETAILED STEPS


Step 1

configure

Example:
RP/0/RP0/CPU0:router# configure

Enters XR Config mode.

Step 2

router bgp as-number

Example:
RP/0/RP0/CPU0:router(config)# router bgp 10 

Specifies the autonomous system number and enters the BGP configuration mode, allowing you to configure the BGP routing process.

Step 3

neighbor ip-address

Example:
RP/0/RP0/CPU0:router(config-bgp)# neighbor 10.0.0.1 

Places the router in neighbor configuration mode for BGP routing and configures the neighbor IP address as a BGP peer.

Step 4

address-family { ipv4 | ipv6 } unicast

Example:
RP/0/RP0/CPU0:router(config-bgp-nbr)# address-family ipv4 unicast 

Specifies either the IPv4 or IPv6 address family and enters address family configuration submode.

Step 5

maximum-prefix maximum discard-extra-paths

Example:
RP/0/RP0/CPU0:router(config-bgp-nbr-af)# maximum-prefix 1000 discard-extra-paths 

Configures a limit on the number of prefixes allowed from the neighbor and discards the extra paths when that limit is exceeded.

Step 6

Use the commit or end command.

commit —Saves the configuration changes and remains within the configuration session.

end —Prompts user to take one of these actions:
  • Yes — Saves configuration changes and exits the configuration session.

  • No —Exits the configuration session without committing the configuration changes.

  • Cancel —Remains in the configuration session, without committing the configuration changes.


Example

The following example shows how to configure discard extra paths feature for the IPv4 address family:


RP/0/RP0/CPU0:router# configure
RP/0/RP0/CPU0:router(config)# router bgp 10
RP/0/RP0/CPU0:router(config-bgp)# neighbor 10.0.0.1
RP/0/RP0/CPU0:router(config-bgp-nbr)# address-family ipv4 unicast
RP/0/RP0/CPU0:router(config-bgp-nbr-af)# maximum-prefix 1000 discard-extra-paths
RP/0/RP0/CPU0:router(config-bgp-nbr-af)# commit

The following screen output shows details about the discard extra paths option:


RP/0/RP0/CPU0:ios# show bgp neighbor 10.0.0.1

BGP neighbor is 10.0.0.1
Remote AS 10, local AS 10, internal link
Remote router ID 0.0.0.0
BGP state = Idle (No best local address found)
Last read 00:00:00, Last read before reset 00:00:00
Hold time is 180, keepalive interval is 60 seconds
Configured hold time: 180, keepalive: 60, min acceptable hold time: 3
Last write 00:00:00, attempted 0, written 0
Second last write 00:00:00, attempted 0, written 0
Last write before reset 00:00:00, attempted 0, written 0
Second last write before reset 00:00:00, attempted 0, written 0
Last write pulse rcvd not set last full not set pulse count 0
Last write pulse rcvd before reset 00:00:00
Socket not armed for io, not armed for read, not armed for write
Last write thread event before reset 00:00:00, second last 00:00:00
Last KA expiry before reset 00:00:00, second last 00:00:00
Last KA error before reset 00:00:00, KA not sent 00:00:00
Last KA start before reset 00:00:00, second last 00:00:00
Precedence: internet
Multi-protocol capability not received
Received 0 messages, 0 notifications, 0 in queue
Sent 0 messages, 0 notifications, 0 in queue
Minimum time between advertisement runs is 0 secs

For Address Family: IPv4 Unicast
BGP neighbor version 0
Update group: 0.1 Filter-group: 0.0 No Refresh request being processed
Route refresh request: received 0, sent 0
0 accepted prefixes, 0 are bestpaths
Cumulative no. of prefixes denied: 0. 
Prefix advertised 0, suppressed 0, withdrawn 0
Maximum prefixes allowed 10 (discard-extra-paths) <<<<<<<<<<<<<<<<<<<<<
Threshold for warning message 75%, restart interval 0 min
AIGP is enabled
An EoR was not received during read-only mode
Last ack version 1, Last synced ack version 0
Outstanding version objects: current 0, max 0
Additional-paths operation: None
Send Multicast Attributes

Connections established 0; dropped 0
Local host: 0.0.0.0, Local port: 0, IF Handle: 0x00000000
Foreign host: 10.0.0.1, Foreign port: 0
Last reset 00:00:00

Restrictions

These restrictions apply to the discard extra paths feature:

  • When the router drops prefixes, it is inconsistent with the rest of the network, resulting in possible routing loops.

  • If prefixes are dropped, the standby and active BGP sessions may drop different prefixes. Consequently, an NSR switchover results in inconsistent BGP tables.

  • The discard extra paths configuration cannot co-exist with the soft reconfig configuration.

BGP Labeled Unicast

The BGP Labeled Unicast (LU) feature, also known as unified MPLS, provides MPLS transport between Provider Edge (PE) routers that are separated by either many IGP boundaries (intra-AS) or by many autonomous systems (inter-AS). You can advertise the loopback prefixes of PEs and their MPLS label bindings by using iBGP between area border routers (ABRs) and eBGP between autonomous system border routers (ASBRs). If the PEs are in different autonomous systems (ASes), you can use Multihop eBGP between them to exchange the VPN routes. You can run 6PE and other services between the PEs that have BGP LU connectivity.

The BGP LU feature lowers the IGP labeled prefix scale and adjacency scale values. Because routers that are not configured with BGP LU must not have these scale values lowered, the lower values apply only when you configure the hw-module command. Configuring this command is therefore mandatory before you enable the BGP LU feature. Restart the router for the hw-module command configuration to take effect.

Restrictions

  • Cisco 8000 supports only per-vrf label mode.

  • You can use LDP or Segment Routing (SR) as the transport underlay. You cannot use TE as the transport underlay.

  • BGP PIC edge feature is not supported.

  • L3VPN and 6VPE over BGP LU feature is not supported.

  • BGP PIC core feature is supported.

  • The label-allocation-mode command is deprecated from Release 7.4.1. Use the label mode command under the configured address family instead.

Supported features

The following features are supported:

  • BGP LU with inter-AS option C

  • 6PE over MPLS transport using LDP or Segment Routing.

  • BGP PIC core

Topology

Figure 1. BGP Labeled Unicast (Intra-Autonomous System) Control Plane and Data Plane

The above diagram shows how PE1 is connected with PE2 through MPLS connectivity. PE1 and PE2 are separated by many areas within the same AS. Consider three network areas: OSPF1, OSPF2, and OSPF3. Each of these areas runs OSPF separately. LDP acts as transport between each of these areas. To establish a connection between the Provider Edge routers PE1 and PE2, iBGP advertisements are sent from PE2 to PE1 through P3, ASBR2, P1 and ASBR1, P2. PE1 must learn the loopback address of PE2 to establish a connection between the loopback address of PE1 and the loopback address of PE2.

PE2 advertises its loopback address, 10.1.1.7, with a BGP label to ASBR2 through iBGP. The label advertised is implicit null. ASBR2 allocates local label 14003 for the loopback address 10.1.1.7 and sends it to ASBR1. ASBR1 allocates its own label, 14005, for the loopback address 10.1.1.7 and sends it to PE1. PE1 has now learned the prefix of loopback address 10.1.1.7 and the BGP label 14005. The BGP next hop for PE1 is ASBR1.

When PE1 transmits an IP packet destined to the loopback address 10.1.1.7, it imposes two labels: the BGP-LU label 14005 and, above it, the transport LDP label 24000. The transport LDP label carries the packet to ASBR1. When ASBR1 receives the packet, it contains only the BGP-LU label, 14005. ASBR1 swaps the BGP-LU label from 14005 to 14003, imposes transport LDP label 24001, and sends the packet to ASBR2. When ASBR2 receives the packet, the BGP-LU label for the loopback address 10.1.1.7 on ASBR2 is implicit null, so ASBR2 pushes only the transport label, 24002. The transport label carries the packet to PE2.
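The label operations in this intra-AS walk-through can be traced with a short Python sketch that treats the label stack as a list. The label values are the ones given in the text. Removing the transport label before each ASBR is modeled as a plain pop, which the text implies (each ASBR receives only the BGP-LU label) but does not name, so treat that step as an assumption.

```python
stack = []  # label stack; the last element is the top of the stack

# PE1: impose the BGP-LU label, then the transport LDP label above it.
stack.append(14005)
stack.append(24000)
pe1_out = list(stack)

# Transport label is gone by the time the packet reaches ASBR1 (assumed pop).
stack.pop()
asbr1_in = list(stack)

# ASBR1: swap BGP-LU label 14005 -> 14003, then impose transport label 24001.
stack[-1] = 14003
stack.append(24001)
asbr1_out = list(stack)

# Transport label is gone by the time the packet reaches ASBR2 (assumed pop).
stack.pop()

# ASBR2: its BGP-LU label for 10.1.1.7 is implicit null, so the BGP-LU label
# is removed and only transport label 24002 is pushed toward PE2.
stack.pop()
stack.append(24002)
asbr2_out = list(stack)

for name, value in [("PE1 out", pe1_out), ("ASBR1 in", asbr1_in),
                    ("ASBR1 out", asbr1_out), ("ASBR2 out", asbr2_out)]:
    print(f"{name}: {value}")
```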

Figure 2. BGP Labeled Unicast (Inter-Autonomous System Option C) Control Plane and Data Plane

ASBR2 prefers the IGP MPLS path over the BGP path for 10.1.1.7. It advertises the LDP local label as the BGP label to ASBR1. An LDP swap operation takes place on ASBR2.

The above figure shows how PE1 is connected with PE2 through MPLS connectivity using eBGP. In this scenario, eBGP runs between ASBR1 and ASBR2. PE2 advertises the BGP-LU label, which has a value of implicit null, to ASBR2 through iBGP. The loopback address is also known to ASBR2 through the IGP. ASBR2 prefers the IGP path with LDP label 24002. ASBR2 allocates local label 24004 to loopback 10.1.1.7 and advertises it to ASBR1. ASBR1 creates a local label, 14005, and advertises it to PE1. PE1 is now aware of the loopback address 10.1.1.7. The IP packet that PE1 transmits to ASBR1 has two labels: the BGP label 14005 and the transport label 24000. The packet received by ASBR1 has only the BGP-LU label 14005. ASBR1 swaps the BGP-LU label from 14005 to 24004. When the packet reaches ASBR2, LDP label 24002 is pushed and ASBR2 transmits the packet to PE2.

Figure 3. 6PE over BGP LU (Inter-AS Option C) Control Plane and Data Plane

The above illustration shows how PE1 is connected with PE2 through MPLS connectivity using Multihop eBGP between multiple ASes. Multihop eBGP runs between PE1 and PE2, and PE1 and PE2 can exchange 6PE routes with labels over this session. The label value for 6PE is IPv6 explicit null. When PE2 advertises the IPv6 prefix 10::2/128, the label is always the explicit null label. The BGP label and the LDP label constitute the top two labels. The 6PE label, which is IPv6 explicit null, constitutes the bottom label. The IPv6 packet reaches PE1 with destination IP 10::2, and label imposition takes place there. The 6PE label of value 2 is imposed first, the BGP label 14005 is imposed next, and then the next hop LDP label 14005 for the BGP LU next hop is imposed. ASBR1 swaps the BGP-LU label from 14005 to 24004 and forwards the packet to ASBR2. ASBR2 adds the LDP label on top of the 6PE label 2 and forwards the packet to P3, where the LDP label is popped, so PE2 receives the packet with only the 6PE explicit null label. PE2 performs an IPv6 lookup and forwards the packet.

Configure BGP Labeled Unicast


Router(config)# hw-module profile cef bgplu enable
Router(config)# router bgp 1
Router(config-bgp)# bgp router-id 2001:DB8::1
Router(config-bgp)# address-family ipv6 unicast
Router(config-bgp-af)# redistribute connected route-policy set-lbl-idx
Router(config-bgp-af)# allocate-label all
Router(config-bgp-af)# exit
Router(config-bgp)# neighbor 2001:DB8::2
Router(config-bgp-nbr)# remote-as 1
Router(config-bgp-nbr)# update-source Loopback 0
Router(config-bgp-nbr)# address-family ipv6 labeled-unicast
Router(config-bgp-nbr-af)# route-policy pass-all in
Router(config-bgp-nbr-af)# route-policy pass-all out

/* Note: Restart the router for the hw-module command configuration to take effect. */

Running Configuration

!
hw-module profile cef bgplu enable
!
router bgp 1
 bgp router-id 2001:DB8::1
 address-family ipv6 unicast
  redistribute connected route-policy set-lbl-idx
  allocate-label all
 !
 neighbor 2001:DB8::2
  remote-as 1
  update-source Loopback0
  address-family ipv6 labeled-unicast
   route-policy pass-all in
   route-policy pass-all out
  !
 !
!

Verification

Router# show bgp ipv6 unicast labels
  Network              Next Hop         Rcvd Label      Local Label

BGP Labeled Unicast over RSVP-TE

Table 1. Feature History Table

Feature Name: BGP Labeled Unicast over RSVP-TE

Release Information: Release 7.11.1

Feature Description: You can now steer the MPLS traffic as per your requirement instead of relying on what the IGP directs.

This feature extends the BGP Labeled Unicast (LU) functionality over the RSVP-TE protocol. BGP LU advertises label bindings while RSVP-TE establishes the traffic engineering paths that you specify. This feature allows the Provider Edge (PE) routers to forward incoming traffic using the label bindings along the specific path reserved using RSVP-TE. This ability to provide explicit routing ensures optimal use of your network resources.

The feature introduces these changes:

CLI:

YANG Data Models:

The BGP Labeled Unicast over RSVP-TE feature enables routers to forward BGP labeled unicast traffic to the BGP-LU next-hop router through Resource Reservation Protocol - Traffic Engineering (RSVP-TE) tunnels. With this feature, you can choose the tunnel (path) that transports the traffic as per your requirement. For example, Autoroute Announce (AA) tunnels can be used exclusively for the traffic that is sent to the tunnel destination address. All other traffic, by default, is routed through the Forwarding-Adjacency (FA) tunnels.

Figure 4. BGP Labeled Unicast over RSVP-TE

In this example, BGP-LU connects ASBR1 with CE1, and ASBR2 with CE2. ASBR1 is configured with two RSVP-TE tunnels to ASBR2. The tunnel configured with FA is connected to the primary IP address (10.1.1.2), and the tunnel configured with AA is connected to the secondary IP address (10.1.1.3). ASBR2 sets the next-hop of the BGP-LU prefixes received from CE2 with the secondary IP address (10.1.1.3). So, ASBR1 uses the AA tunnel to forward packets that are destined for CE2.


Note


If a BGP-LU route is learned via two BGP-NH routers, and if one BGP-NH router is reachable via RSVP-TE, and the other BGP-NH router is reachable via regular next-hop, the path reachable via regular next-hop is selected for forwarding.
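The tunnel choice in the example and the rule in this note can be sketched in Python. This is a simplified model under stated assumptions: the function names and the path tuples are hypothetical, the address 10.1.1.3 is the AA tunnel destination from the example, and real path selection depends on the full RIB and TE state.

```python
# AA tunnel destination taken from the example: secondary address of ASBR2.
AA_TUNNEL_DESTINATIONS = {"10.1.1.3": "AA tunnel"}
DEFAULT_TUNNEL = "FA tunnel"


def tunnel_for_next_hop(bgp_next_hop):
    """AA tunnels carry only traffic sent to their destination address;
    everything else uses the FA tunnel by default."""
    return AA_TUNNEL_DESTINATIONS.get(bgp_next_hop, DEFAULT_TUNNEL)


def select_paths(paths):
    """paths: list of (bgp_next_hop, reachability) tuples, where reachability
    is 'rsvp-te' or 'regular'. When both kinds are present, the paths
    reachable through a regular next hop are selected for forwarding."""
    regular = [p for p in paths if p[1] == "regular"]
    return regular if regular else list(paths)


print(tunnel_for_next_hop("10.1.1.3"))   # BGP-LU prefixes learned from CE2
print(tunnel_for_next_hop("10.1.1.2"))
print(select_paths([("nh-a", "rsvp-te"), ("nh-b", "regular")]))
```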


The Fast Reroute (FRR) mechanism protects the transported traffic against link and node failures.

Restrictions

The following restrictions apply for the BGP-LU over RSVP-TE feature:

  • Configuring BGP-LU over RSVP-TE along with BGP-LU (over NH) and Class-based forwarding (CBF) is not allowed. You must disable BGP-LU and CBF configurations before enabling the BGP-LU over RSVP-TE feature. Otherwise, the router displays an error message.

  • BGP-LU over RSVP-TE feature is not supported on Q100-based line cards.

  • BGP-LU over SR-TE is not supported.

  • Services such as L3VPN, 6PE, and 6VPE are not supported.

  • You can use LDP or Segment Routing (SR) as the transport underlay. But, you cannot use TE as the transport underlay.

  • Reaching ASBR (BGP-NH) through regular NH and RSVP-TE is not supported.
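The first restriction can be turned into a simple pre-check on a candidate configuration. The Python sketch below only compares configuration lines as text; the function name is hypothetical, and the three command strings are the hw-module profile commands used in the configuration example in this section.

```python
BGPLU = "hw-module profile cef bgplu enable"
CBF = "hw-module profile cef cbf enable"
BGPLU_OVER_RSVPTE = "hw-module profile cef bgplu-over-rsvpte enable"


def conflicting_profiles(config_lines):
    """Return the profile lines that must be removed before BGP-LU over
    RSVP-TE can be enabled; an empty list means there is no conflict."""
    lines = {line.strip() for line in config_lines}
    if BGPLU_OVER_RSVPTE not in lines:
        return []
    return sorted(lines & {BGPLU, CBF})


candidate = [
    "hw-module profile cef bgplu enable",
    "hw-module profile cef bgplu-over-rsvpte enable",
]
print(conflicting_profiles(candidate))
```

A non-empty result corresponds to the case in which the router would reject the configuration with an error message.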

Configure BGP-LU over RSVP-TE

Configuration Example

This example shows how to configure the BGP-LU over RSVP-TE feature.

/* Disable BGP-LU and CBF.*/
Router(config)# no hw-module profile cef bgplu enable  
Router(config)# no hw-module profile cef cbf enable 
/* Enable BGP-LU over RSVP-TE.*/
Router(config)# hw-module profile cef bgplu-over-rsvpte enable 

Note


By default, this feature supports a maximum of 1k tunnels. To increase the capacity to support 5k tunnels, run the hw-module profile cef te-tunnel highscale-no-ldp-over-te command.


/* Configure loopback interfaces. */
Router(config)# interface Loopback1001 
Router(config-if)# ipv4 address 10.10.10.10 255.255.255.255 
Router(config-if)# exit 
Router(config)# interface tunnel-te1 
Router(config-if)# ipv4 unnumbered Loopback0 
Router(config-if)# autoroute announce 
Router(config-if)# exit 
Router(config-if)# destination 10.10.10.11 
Router(config-if)# path-option 1 dynamic 
/* Configure BGP.*/
Router(config)# router bgp 100 
Router(config-bgp)# bgp router-id 10.10.10.10 
Router(config-bgp)# address-family ipv4 unicast 
Router(config-bgp-af)# allocate-label all unlabeled-path 
Router(config-bgp-af)# exit 
Router(config-bgp)# address-family ipv6 unicast 
Router(config-bgp-af)# exit 
/* Configure BGP Neighbor.*/
Router(config-bgp)# neighbor 10.0.0.1 
Router(config-bgp-nbr)# remote-as 200 
Router(config-bgp-nbr)# update-source Loopback0 
Router(config-bgp-nbr)# address-family ipv4 labeled-unicast 
Router(config-bgp-nbr-af)# route-policy PASS-ALL in 
Router(config-bgp-nbr-af)# route-policy PASS-ALL out 
Router(config-bgp-nbr-af)# next-hop-self 
Router(config-bgp-nbr-af)# exit 
/* Configure MPLS LDP.*/
Router(config)# mpls ldp  
Router(config-ldp)# router-id 10.1.1.1 
Router(config-ldp)# interface tunnel-te1 
Router(config-ldp)# exit 

Note


Reload the router for the hw-module commands to take effect.


Running Configuration
Router configuration:
!
hw-module profile cef bgplu-over-rsvpte enable
!
router bgp 200
 nsr
 bgp router-id 10.1.1.1
 mpls activate
  interface Bundle-Ether10
  interface Bundle-Ether40
  interface Bundle-Ether100
  interface Bundle-Ether101
  interface HundredGigE0/0/0/22
 !
 bgp graceful-restart
 ibgp policy out enforce-modifications
 address-family ipv4 unicast
  additional-paths receive
  additional-paths send
  additional-paths selection route-policy INSTALL_BACKUP
  network 10.1.1.5/32
  allocate-label all unlabeled-path
 !
 neighbor 10.1.4.1             
  remote-as 200
  bfd fast-detect
  bfd multiplier 3
  bfd minimum-interval 100
  update-source Loopback0
  address-family ipv4 labeled-unicast
   next-hop-self
   soft-reconfiguration inbound always
   !
 neighbor 10.1.5.1       
  remote-as 200
  bfd fast-detect
  bfd multiplier 3
  bfd minimum-interval 100
  update-source Loopback0
  address-family ipv4 labeled-unicast
   next-hop-self
   soft-reconfiguration inbound always
  !
  !
 neighbor 10.1.6.1           
  remote-as 200
  bfd fast-detect
  bfd multiplier 3
  bfd minimum-interval 100
  address-family ipv4 labeled-unicast
   next-hop-self
   route-policy PASS-ALL in
   route-reflector-client
   route-policy PASS-ALL out
  !

Enabling LDP (to assign labels to the tunnel):

mpls ldp
 router-id 10.1.1.1
 address-family ipv4
  label
   local
    allocate for ldp-acl
   !
  !
router isis core
 is-type level-2-only
 net 49.1111.0000.0001.00
 nsr
 nsf cisco
 log adjacency changes
 address-family ipv4 unicast
  metric-style wide
  mpls traffic-eng level-2-only
  mpls traffic-eng router-id Loopback0
  mpls traffic-eng igp-intact
 !
 address-family ipv6 unicast
  metric-style wide
  maximum-paths 64
 !
 interface Bundle-Ether40
  circuit-type level-2-only
  point-to-point
  address-family ipv4 unicast
   metric 10
  !
  address-family ipv6 unicast
   metric 10
  !
 interface Bundle-Ether100
  circuit-type level-2-only
  point-to-point
  address-family ipv4 unicast
   metric 10
  !
  address-family ipv6 unicast
   metric 10
  !
 interface Bundle-Ether101
  circuit-type level-2-only
  point-to-point
  address-family ipv4 unicast
   metric 10
  !
  address-family ipv6 unicast
   metric 10
  !

Tunnel Configuration:

interface tunnel-te141
 description PE1-PE4
 ipv4 unnumbered Loopback0
 signalled-bandwidth 1000000
 autoroute announce
 !
 destination 10.1.4.1
 fast-reroute
 path-protection
 !
 path-option 1 explicit name R1-R4-141
!
interface tunnel-te142
 description PE1-PE4
 ipv4 unnumbered Loopback0
 shutdown
 signalled-bandwidth 1000000
 autoroute announce
 !
 destination 10.1.4.1
 fast-reroute
 path-option 1 explicit name R1-R4-142
!
interface tunnel-te13641
 ipv4 unnumbered Loopback0
 signalled-bandwidth 1000000
 autoroute announce
 !
 destination 10.1.4.1
 path-option 1 explicit name R1-R3-R6-R4-Phy protected-by 2
 path-option 2 explicit name R1-R3-R6-R4-Bundle
!
!

mpls traffic-eng
 interface Bundle-Ether10
 !
 interface Bundle-Ether100
  backup-path tunnel-te 13641
 !
 interface Bundle-Ether101
  backup-path tunnel-te 13641

Verification

Verify the details of route paths:

Router# show cef 209.165.200.225/27
Tue Jun  6 13:59:39.649 UTC
201.1.1.10/32, version 838761, internal 0x5000001 0x40 (ptr 0xb6848370) [1], 0x600 (0xb67bc1d8), 0xa08 (0xbbc3c0d8)
 Updated Jun  6 13:56:34.879
 Prefix Len 32, traffic index 0, precedence n/a, priority 4
  gateway array (0xc020eac8) reference count 3, flags 0x100078, source rib (7), 0 backups
                [2 type 5 flags 0x441 (0xc1807b38) ext 0x0 (0x0)]
  LW-LDI[type=5, refc=3, ptr=0xb67bc1d8, sh-ldi=0xc1807b38]
  gateway array update type-time 1 Jun  6 13:56:34.879
 LDI Update time Jun  6 13:56:34.879
 LW-LDI-TS Jun  6 13:56:34.879
   via 10.1.4.1/32, 60047 dependencies, recursive [flags 0x6000]
    path-idx 0 NHID 0x0 [0x97518b90 0x0]
    recursion-via-/32
    next hop 10.1.4.1/32 via 24000/0/21
     local label 36112 
     next hop 10.1.4.1/32 tt141        labels imposed {ImplNull 34184}
     next hop 10.1.4.1/32 tt142        labels imposed {ImplNull 34184}
     next hop 10.1.4.1/32 tt13641      labels imposed {ImplNull 34184}
   via 10.1.5.1/32, 30045 dependencies, recursive, backup [flags 0x6100]
    path-idx 1 NHID 0x0 [0x97524fc0 0x0]
    recursion-via-/32
    next hop 10.1.5.1/32 via 24002/0/21
     local label 36112 
     next hop 10.1.5.1/32 tt13651      labels imposed {ImplNull 39146}
          
    Load distribution: 0 (refcount 2)
          
    Hash  OK  Interface                 Address
    0     Y   recursive                 24000/0        
Router# show route 10.1.4.1
Tue Jun  6 14:02:31.653 UTC

Routing entry for 10.1.4.1/32
  Known via "isis core", distance 115, metric 20, type level-2
  Installed Jun  6 13:59:07.013 for 00:03:24
  Routing Descriptor Blocks
    10.1.4.1, from 10.1.4.1, via tunnel-te141
      Route metric is 20
    10.1.4.1, from 10.1.4.1, via tunnel-te142
      Route metric is 20
    10.1.4.1, from 10.1.4.1, via tunnel-te13641
      Route metric is 20
  No advertising protos. 
Router# show route summary
Wed May 31 17:47:01.203 UTC
Route Source                     Routes     Backup     Deleted     Memory(bytes)
connected                        536        2          0           116248       
local                            539        0          0           116424       
local LSPV                       1          0          0           216          
local SMIAP                      1          0          0           216          
application fib_mgr              0          0          0           0            
static                           4          0          0           904          
bgp 200                          48152      60         0           11936632     
te-client                        0          0          0           0            
isis core                        14056       534        0           4088288      
dagr                             0          0          0           0            
vxlan                            0          0          0           0            
Total                            61364      596        0           16202240   

Verify the details of LSP tunnel:

Router# show mpls forwarding prefix 209.165.200.225/27
Tue Jun  6 14:00:17.601 UTC
Local  Outgoing    Prefix             Outgoing     Next Hop        Bytes       
Label  Label       or ID              Interface                    Switched    
------ ----------- ------------------ ------------ --------------- ------------
36112  34184       209.165.200.225/27                 10.1.4.1        0           
       39146       209.165.200.225/27                 10.1.5.1        0            

Verify the contents of the Fast Reroute (FRR) database:

Router# show mpls traffic-eng fast-reroute database
Tue Jun  6 14:01:59.907 UTC
Tunnel head FRR information:
Tunnel       Out Intf : Label   FRR Intf : Label   Status 
------------ ------------------ ------------------ -------
tt141        BE100:Pop          tt13641:Pop        Ready  
tt142        BE101:Pop          tt13641:Pop        Ready  

Verify the forwarding information on tunnels:

Router# show mpls traffic-eng forwarding tunnel-id 141
Mon Jun  5 23:46:04.961 UTC
P2P tunnels:

Tunnel ID                  Ingress IF     Egress IF      In lbl  Out lbl        Backup 
-------------------------- -------------- -------------- ------- -------------- -------
10.1.1.1 141_10                         -          BE100 81920   3              tt13641
Displayed 1 tunnel heads, 0 label P2P rewrites
Displayed 0 tunnel heads, 0 label P2MP rewrites

Verify the utilization of banks in the NPU resources:

Router# show grid pool 2 bank 13 
Wed May 31 17:46:56.848 UTC

Bank Ptr                      : 0x308d069d38
Bank ID                       : 13
Pool                          : GLIF (id 2)
Bank Start                    : 530295
Bank End                      : 589823
Max Bank Size                 : 59529
Max Resource Pages            : 1861
Available resource IDs        : 11375 (19.108% free)
Bank statistics:                         Success      Error         (since last clear)
  Resource IDs reserved                    51728          0           51728          0
  Resource IDs returned                     3574          0            3574          0
Client                        : lsd
  Resource IDs reserved                        2          0               2          0
  Resource IDs returned                        0          0               0          0
current usage                 : 2
Client                        : rib-v4
  Resource IDs reserved                    51726          0           51726          0
  Resource IDs returned                     3574          0            3574          0
current usage                 : 48152
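
The counters in this sample output are consistent with one another, which can be checked with a short Python sketch. The variable names are illustrative and the values are copied from the output above. The relationships used here (bank size derived from the start and end IDs, usage as reserved minus returned) are inferred from these numbers and are assumptions, not a documented formula.

```python
# Values copied from the sample "show grid pool 2 bank 13" output.
bank_start, bank_end = 530295, 589823
reserved, returned = 51728, 3574
client_usage = {"lsd": 2, "rib-v4": 48152}

# Assumption: the bank covers an inclusive range of resource IDs.
bank_size = bank_end - bank_start + 1

# Assumption: IDs in use = IDs reserved minus IDs returned.
in_use = reserved - returned

available = bank_size - in_use
percent_free = 100 * available / bank_size

print(bank_size, in_use, available, f"{percent_free:.3f}%")
# prints: 59529 48154 11375 19.108%
```

The result matches the Max Bank Size and Available resource IDs fields, and the in-use count equals the sum of the per-client current usage values.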
  

Exclusion of Label Allocation for Non-Advertised Routes

Table 2. Feature History Table

Feature Name

Release Information

Feature Description

Exclusion of Label Allocation for Non-Advertised Routes

Release 7.10.1

MPLS label allocation is now more flexible: you can assign labels only to the routes that are advertised to peers, which improves label space management and hardware resource utilization.

Prior to this release, labels were allocated regardless of whether the routes were advertised, which resulted in inefficient use of label space.

This feature lets you control label allocation for routes that are not advertised to peers, so that labels are assigned only to the routes that are advertised.

You configure this feature on Provider Edge (PE) routers that work as autonomous system border routers (ASBRs).

In route-policy configuration mode, set the community attribute to either no-advertise or no-export for the routes that are not going to be advertised to peers. After the community attribute is set through the route policy, the router doesn’t allocate labels to those routes.


Note


The no-export community applies only to eBGP, whereas the no-advertise community can be used for both eBGP and iBGP.
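
As a rough illustration of the rule described above, the following Python sketch models the allocation decision. It is a simplified model, not the router's implementation: the function name allocates_label is hypothetical, and the sketch deliberately ignores the eBGP and iBGP distinction.

```python
# Well-known communities that, per the description above, suppress label allocation.
NON_ADVERTISED = {"no-advertise", "no-export"}

def allocates_label(communities):
    """Return True if a route carrying these communities would get a label.

    Simplified model: a route tagged with no-advertise or no-export is
    treated as non-advertised, so no label is allocated for it.
    """
    return NON_ADVERTISED.isdisjoint(communities)

print(allocates_label({"1:1"}))                  # True
print(allocates_label({"1:1", "no-advertise"}))  # False
```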


How to exclude label allocation for non-advertised routes

Configuration Example

This example shows how to set the community to no-advertise for the routes that are not going to be advertised to any peer.
/*Configure the community set*/
Router(config)#community-set no-advertise
Router(config-comm)#no-advertise
Router(config-comm)#end-set

/*Configure the route policy*/
Router(config)#route-policy set-no-advertise
Router(config-rpl)#set community no-advertise additive
Router(config-rpl)#end-policy
Router(config)#route-policy pass_all
Router(config-rpl)#  pass
Router(config-rpl)#end-policy

/*Apply the route policy as inbound route policy*/
Router(config)#router bgp 1
Router(config-bgp)# neighbor 192.0.2.1
Router(config-bgp-nbr)#  remote-as 1
Router(config-bgp-nbr)#  update-source Loopback0
Router(config-bgp-nbr)#  address-family ipv4 unicast
Router(config-bgp-nbr-af)#   route-policy set-no-advertise in
Router(config-bgp-nbr-af)#   route-policy pass_all out
Router(config-bgp-nbr-af)#commit

Running Configuration

community-set no-advertise
  no-advertise
end-set
  !
!
route-policy set-no-advertise
  set community no-advertise additive
end-policy
  !
!
route-policy pass_all
  pass
end-policy
!

Verification

Use the show bgp vpnv6 unicast rd command to verify that the community is set to no-advertise.

Router# show bgp vpnv6 unicast rd 2001:DB8:0:ABCD::1

BGP routing table entry for 0:ABCD::1 Route Distinguisher: 2001:DB8
Versions:
  Process           bRIB/RIB  SendTblVer
  Speaker               19207        19207
Paths: (1 available, best #1, not advertised to any peer)
  Not advertised to any peer
  Path #1: Received by speaker 0
  Not advertised to any peer
  Local, (Received from a RR-client)
    192.0.2.254 from 192.0.2.1 (192.0.2.1)
      Received Label 16
      Origin IGP, metric 3, localpref 3, aigp metric 3, valid, internal, best, group-best, import-candidate, not-in-vrf
      Received Path ID 0, Local Path ID 1, version 19207
      Community: 1:1 no-advertise 
      Extended community: Color:3333 RT:2001:DB8
      AIGP set by inbound policy metric
      Total AIGP metric 3

EIBGP Policy-Based Multipath with Equal Cost Multipath

Table 3. Feature History Table

Feature Name

Release Name

Description

EIBGP Policy-Based Multipath with Equal Cost Multipath

Release 7.10.1

You can gain control over traffic distribution and load balancing by applying policy-based multipath selection across iBGP, eBGP, and eiBGP. The selection uses BGP communities, nexthops, and path types.

Additionally, by employing the equal cost multipath (ECMP) option in eiBGP, this feature provides the capability to select ECMP across the iBGP paths chosen for eiBGP.

The feature introduces these changes:

CLI:

  • The keywords route-policy and equal-cost are added to the maximum-paths command.

YANG Data Model:

  • Cisco-IOS-XR-um-router-bgp-cfg

(see GitHub, YANG Data Models Navigator)

Overview

Policy-based multipath selection in BGP now operates at the default Virtual Routing and Forwarding (VRF) level and is extended to iBGP, eBGP, and eiBGP, using communities as the underlying mechanism. With communities, you can select multiple paths based on specific policy criteria, which gives you finer control over routing decisions within the BGP network.

eiBGP traditionally implements the unequal-cost multipath (UCMP) capability to enable the use of both iBGP and eBGP paths. With the equal-cost multipath (ECMP) option, this feature ensures that the nexthop IGP metric is the same across the chosen iBGP paths. The metric is not compared between eBGP and iBGP paths because they are different path types.

Topology

This topology illustrates a network of BGP peers, denoted R1 through R6. Consider a scenario in which you are transitioning from eBGP multipaths to iBGP multipaths. Throughout this transition, eBGP and iBGP must operate simultaneously to allow a seamless migration.

Topology Setup

This topology showcases distinct path types, where eBGP paths are visually depicted using a red-colored line labeled as 1, and the iBGP paths are visually illustrated using a green-colored line labeled as 2.

Expected Behavior

For the CE routers (CE1, CE2, CE3, CE4, CE5, and CE6), the preferred path for prefixes is through eBGP, specifically from the R4 router. Although paths might also exist from the R5 and R6 routers and from the R1 and R2 routers through iBGP, best-path selection prioritizes the eBGP multipaths from R4. This is the classic behavior. Classic eiBGP uses unequal-cost paths, so metrics are disregarded. However, you rely on the IGP metric for optimal performance.

After Implementing This Feature

The iBGP paths with the shortest AS-PATH length are chosen for R5 and R6 router paths. The same iBGP multipath selection process applies to paths from the R1 and R2 routers. As a result, the R1 and R2 routers establish an iBGP peering session with the R3 router. Therefore, a combination of eBGP and iBGP paths, referred to as eiBGP, is now available for prefixes advertised to hosts beyond the CE devices. The CE routers require load balancing of prefixes to the R3 and R4 routers. However, paths originating from the R5 and R6 routers and from the R1 and R2 routers must be excluded. Therefore, you must configure an additive community on the R1 and R2 routers toward the R5 and R6 routers.

With the setup depicted in the topology, you can establish the coexistence of both eBGP and iBGP, thus enabling seamless transition from utilizing eBGP multipaths to iBGP multipaths. By including the default VRF in policy-based multipath selection, you apply route policies to control how traffic is distributed within your network. By leveraging the BGP attributes such as BGP communities, nexthops, and path types within these route policies, you determine path selection. For example, you can use BGP communities to prioritize certain routes or manipulate nexthops to direct traffic over specific paths. This enables you to optimize routing decisions based on your specific requirements and goals, allowing you to gain control over traffic distribution and load-balancing capabilities across various BGP variations within your network.

By enabling ECMP, you allow a router to distribute traffic evenly across multiple equal-cost paths. This ensures that each path carries a portion of the traffic load, preventing any single path from becoming overwhelmed. By enabling the ECMP option in eiBGP, you allow the router to consider multiple iBGP paths with equal costs as viable options for traffic distribution. These paths are treated as equal-cost paths. This enhances load balancing in your network.

Benefits

This feature, with the inclusion of policy-based multipath selection, enables you to gain control over traffic distribution and load-balancing capabilities across various BGP variations, including iBGP, eBGP, and eiBGP. This is achieved through the utilization of BGP communities, nexthops, and path types.

Neglecting the utilization of BGP communities, nexthops, and path types within the default VRF during policy-based multipath selection can lead to limited control over traffic routing. The absence of BGP communities hinders the ability to apply specific policies to route updates, while ignoring nexthops and path types diminishes the accuracy of path selection decisions. This may result in suboptimal traffic distribution and load balancing.

If you don’t apply ECMP within eiBGP, the router depends on its default path selection procedure to designate a single optimal route from the available iBGP paths. This approach does not provide the load balancing and traffic distribution advantages of ECMP.

Restrictions for EIBGP Policy-Based Multipath with Equal Cost Multipath

The following restrictions apply:

  • Configuring eiBGP along with either eBGP or iBGP is not allowed.

  • The maximum-paths route policy allows for checks on community, nexthop, and path type only.

  • The usage of the Accumulated Interior Gateway Protocol (AIGP) metric attribute is restricted only to equal-cost EIBGP scenarios.

  • The OpenConfig model is not supported.

  • When configuring eBGP and iBGP multipath together, it is possible to assign distinct or identical route policies to each of them. However, the selection of the policy to be applied between eBGP and iBGP is determined by the bestpath path type of the prefixes. If a prefix is determined to have a better path via iBGP, the iBGP route policy will be applied, while for prefixes where eBGP is deemed better, the eBGP route policy will be applied.

Configure EIBGP Policy-Based Multipath with Equal Cost Multipath

Configuration Example

Perform the following steps to configure EIBGP Policy-Based Multipath with Equal Cost Multipath:

  • Configure the community, path-type, or nexthop.

  • Configure the route-policy with the multipath selection and equal-cost multipath for eiBGP.

Configure the community-set on the R1 and R2 routers


Router(config)# community-set ABC
Router(config-comm)# 2:1
Router(config-comm)# end-set

Configure the route-policy and equal-cost multipath option for eiBGP

The route-policy EIBGP is configured on the R1 and R2 routers. This route policy examines the communities attached to BGP routes and acts on the community values. If the community matches ABC, the path passes the policy and is eligible for multipath; all other paths are dropped from multipath selection. Among the eligible paths, the router selects a path for multipath if it matches the best path's metric and has the same path type (iBGP or eBGP) as the best path. If the path type differs from that of the best path, the path must be the best among the paths of its own type. In addition to community, you can also use path-type or next-hop as a route-policy match option.


Router(config)# route-policy EIBGP
Router(config-rpl)# if community matches-any ABC then
Router(config-rpl-if)# pass
Router(config-rpl-if)# else
Router(config-rpl-else)# drop
Router(config-rpl-else)# endif
Router(config-rpl)# end-policy
Router(config)# router bgp 100
Router(config-bgp)# address-family ipv4 unicast
Router(config-bgp-af)# maximum-paths eibgp 32 equal-cost route-policy EIBGP
Router(config-bgp-af)# commit

Running Configuration

community-set ABC
 2:1
end-set
!

route-policy EIBGP
  if community matches-any ABC then
    pass
  else
    drop
  endif
end-policy
!
router bgp 100
 address-family ipv4 unicast
  maximum-paths eibgp 32 equal-cost route-policy EIBGP
!

Verification

Verify that the router supports eiBGP multipath for this destination and that the route entries have been successfully received and processed.

Router# show bgp 203.0.113.99/32  
BGP routing table entry for 203.0.113.99/32
Versions:
  Process           bRIB/RIB  SendTblVer
  Speaker                 27           27
Last Modified: Feb 23 16:08:54.000 for 04:12:23
Paths: (7 available, best #2)
  Advertised IPv4 Unicast paths to update-groups (with more than one peer):
    0.1 0.4
  Path #1: Received by speaker 0
  Not advertised to any peer
  200 300
    209.165.200.11  from 209.165.200.11 (192.168.0.3), -> From R4
   Origin IGP, localpref 100, valid, external, multipath
      Received Path ID 0, Local Path ID 0, version 0
      Community: 2:1
      Origin-AS validity: (disabled)

  Path #2: Received by speaker 0
  Advertised IPv4 Unicast paths to update-groups (with more than one peer):
    0.1 0.4
  200 300
    209.165.201.1  from 209.165.201.1  (209.165.201.1) -> From R4
   Origin IGP, localpref 100, valid, external, best, group-best, multipath
      Received Path ID 0, Local Path ID 1, version 27
      Community: 2:1
      Origin-AS validity: (disabled)

  Path #3: Received by speaker 0
  Not advertised to any peer
  200 300, (Received from a RR-client)
    192.168.2.6 (metric 2) from 198.51.100.1  (198.51.100.1)  -> From R3
   Origin IGP, localpref 100, valid, internal, multipath, backup, add-path 
      Received Path ID 0, Local Path ID 2, version 6
      Community: 2:1
  Path #4: Received by speaker 0
  Not advertised to any peer
  200 300, (Received from a RR-client)
    192.168.0.6 (metric 2) from 192.0.2.1 (192.0.2.1)  -> From R5 
      Origin IGP, localpref 100, valid, internal
      Received Path ID 0, Local Path ID 0, version 0
      Community: 11:11 99:99

  Path #5: Received by speaker 0
  Not advertised to any peer
  200 300, (Received from a RR-client)
    192.168.0.2 (metric 5) from 192.168.0.2 (192.168.0.2)  -> From R2 
      Origin IGP, localpref 100, valid, internal
      Received Path ID 0, Local Path ID 0, version 0
      Community: 2:1 99:99
/* The router does not select Path 5, even though it satisfies the route-policy community constraint, because it has a higher metric (i.e., metric 5) than the best path of its path type (i.e., iBGP metric 2). */

  Path #6: Received by speaker 0
  Not advertised to any peer
  200 300, (Received from a RR-client)
    192.168.0.4 (metric 2) from 192.168.0.4 (192.168.0.4) -> From R5
      Origin IGP, localpref 100, valid, internal
      Received Path ID 0, Local Path ID 0, version 0
      Community: 11:11 99:99

  Path #7: Received by speaker 0
  Not advertised to any peer
  100 300, (Received from a RR-client)
    192.168.0.5 (metric 2) from 192.168.0.5 (192.168.0.5) -> From R3
   Origin IGP, localpref 100, valid, internal, multipath 
      Received Path ID 0, Local Path ID 0, version 0
      Community: 2:1
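
The multipath flags in this output can be reproduced with a small Python model. This is a sketch under stated assumptions, not the BGP implementation: the Path tuple and function names are hypothetical, eBGP paths are given a placeholder metric of 0 because the output shows no IGP metric for them, and "best of its path type" is approximated as the lowest metric among the paths of that type that pass the policy.

```python
from collections import namedtuple

Path = namedtuple("Path", "num path_type metric communities")

# Paths transcribed from the sample output above.
paths = [
    Path(1, "external", 0, {"2:1"}),
    Path(2, "external", 0, {"2:1"}),             # best path
    Path(3, "internal", 2, {"2:1"}),
    Path(4, "internal", 2, {"11:11", "99:99"}),
    Path(5, "internal", 5, {"2:1", "99:99"}),
    Path(6, "internal", 2, {"11:11", "99:99"}),
    Path(7, "internal", 2, {"2:1"}),
]

ABC = {"2:1"}  # community-set ABC

def passes_policy(path):
    # route-policy EIBGP: if community matches-any ABC then pass, else drop
    return bool(path.communities & ABC)

def multipath_candidates(all_paths):
    """Return the path numbers this simplified model marks as multipath."""
    eligible = [p for p in all_paths if passes_policy(p)]
    selected = []
    for path_type in {p.path_type for p in eligible}:
        same_type = [p for p in eligible if p.path_type == path_type]
        lowest = min(p.metric for p in same_type)
        selected += [p.num for p in same_type if p.metric == lowest]
    return sorted(selected)

print(multipath_candidates(paths))  # [1, 2, 3, 7]
```

Paths 4 and 6 fail the community check, and Path 5 passes it but has a higher metric than the other eligible iBGP paths, which agrees with the flags shown in the output.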

ECMP Out of Resource Avoidance

Table 4. Feature History Table

Feature Name

Release Information

Feature Description

ECMP Out of Resource Avoidance

Release 24.2.11

You can now ensure minimum packet loss and service disruption during network reconfigurations or migrations by preventing Equal-Cost Multi-Path (ECMP) Out of Resource (OOR) conditions. This feature allows BGP to delay route updates and FIB to delay programming the routes in hardware when resources are low, thus avoiding system overload.

The feature introduces these changes:

CLI:

YANG Data Models:

  • Cisco-IOS-XR-um-router-bgp-cfg.yang

  • Cisco-IOS-XR-ipv4-bgp-oper.yang

  • Cisco-IOS-XR-fib-common-cfg.yang

  • Cisco-IOS-XR-fib-common-oper.yang

(see GitHub, YANG Data Models Navigator)

Cisco 8000 routers may encounter transient Equal-Cost Multi-Path (ECMP) resource shortages (Out of Resource condition) and subsequent traffic drops for IP-BGP routes under the following conditions:

  • Data center migrations or network maintenance events, such as data center cost-in and cost-out.

  • The introduction of new data center sites, which can lead to network instability and a temporary increase in ECMP resource usage.

After the network stabilizes, the router gracefully recovers from the ECMP spike. However, the traffic that was dropped during an OOR condition doesn’t automatically recover.

Avoiding OOR Conditions

This feature allows the hardware resource usage to be tracked using an inline resource tracking mechanism within the Forwarding Information Base (FIB). Inline resource tracking provides real-time feedback on resource consumption directly within the FIB, which is beneficial for managing hardware resources more effectively. This approach allows for admission control mechanisms within the Border Gateway Protocol (BGP) and the FIB. These mechanisms can cache updates and delay certain operations until the OOR condition is resolved, ensuring that the system doesn’t exceed its resource capacity.

When the resource utilization reaches a predefined threshold, BGP delays best path selection and route installation into the Routing Information Base (RIB), while the FIB delays hardware programming. This delay is configurable and is designed to prevent the system from reaching a state where it can’t accommodate new routes because of resource constraints. By allowing BGP to delay route updates and FIB to delay hardware programming when resources are low, the system can avoid entering an OOR state, thereby achieving minimal to zero traffic loss and improved network performance.

FIB Dampening

When the resource usage reaches the configured dampening threshold, instead of immediately programming every route update into the hardware, the FIB consolidates or caches the route updates in the CPU memory, and delays the hardware programming. This approach prevents a sudden overload of the network's resources and keeps traffic flowing without interruption, even when resources are low.

FIB dampening is disabled by default. You can enable it through CEF configuration.

Dampening Switchover

Dampening Switchover is a mechanism that can detect when the state of route churn stabilizes. Once stability is detected, the route updates of stable state are programmed into the hardware.

Forced Switchover

If the network continues to experience churn and the dampening switchover algorithm can’t find a stable state, a forced dampening switchover occurs when the maximum dampening duration expires. The default maximum dampening duration is 5 minutes (300 seconds).

During a forced switchover, some routes may be switched to Destination-Based Load Balancing (DLB) mode. This switch depends on the hardware resource usage. If the hardware resource usage exceeds the configured DLB threshold, the system may enter the DLB mode.

Destination-Based Load Balancing (DLB)

Routes are programmed in DLB mode only under specific conditions:

  • New Route Installation: If a new route is being installed and the current hardware resource usage exceeds the configured DLB threshold, the route is programmed in DLB mode to prevent an OOR condition.

  • Dampening Switchover: During a forced dampening switchover, if the hardware resource usage is above the DLB threshold, the routes are programmed in DLB mode.

Uni-path Mode

DLB operates in a uni-path mode, which means that when DLB is triggered, the router selects a single path for forwarding traffic instead of multiple equal-cost paths. This is a protective measure to prevent the system from hitting an Out of Resource (OOR) condition.

Link-Over-Subscription Risk

When DLB mode is activated, the ability to spread data traffic evenly across multiple paths (ECMP) isn’t available. This can lead to a risk of link over-subscription, as traffic that could have been distributed across several paths is now sent over a single path.

Automatic Switching Between DLB and ECMP

The system automatically switches between DLB and ECMP modes based on the current hardware resource utilization. If the hardware resource usage falls below the configured DLB threshold, the system reverts to using ECMP for the affected routes. Conversely, if the resource usage reaches the configured DLB threshold again, the system switches back to DLB mode.
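
The threshold behavior described in these sections can be summarized with a short Python sketch. It is a simplified model under stated assumptions: the function name fib_mode is hypothetical, the default thresholds of 70% and 90% are the defaults listed in the configuration section of this guide, and the model treats reaching a threshold as crossing it. It also ignores the additional conditions (new route installation or forced switchover) under which DLB is actually applied.

```python
def fib_mode(usage_pct, dampening_threshold=70, dlb_threshold=90):
    """Classify FIB behavior for a given hardware resource usage (percent)."""
    if usage_pct >= dlb_threshold:
        return "dlb"        # uni-path programming to avoid an OOR condition
    if usage_pct >= dampening_threshold:
        return "dampening"  # cache updates and delay hardware programming
    return "ecmp"           # normal multipath programming

for usage in (50, 75, 95):
    print(usage, fib_mode(usage))
```

Checking the DLB threshold first keeps the function well defined even when a configuration sets the DLB threshold below the dampening threshold.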

Limitations for ECMP OOR Avoidance

These limitations apply to the ECMP OOR Avoidance feature:

  • Resource accounting is designed only for deployments without MPLS in the path. Deployments with MPLS in the path, such as IGP with MPLS or BGP LU/VPN, fall outside this design. When MPLS is present and the system detects a significant number of Link Down Indications (LDIs) with the MPLS protocol (more than approximately 1000 LDIs), the system self-adjusts by increasing the resource count to account for the maximum MPLS paths. MPLS resource usage is increased only after the system identifies considerable usage, to prevent internal labels (such as the BFD internal label) from being misclassified as an MPLS deployment.

  • Resource accounting will only cover recursive and non-recursive LDI utilized by FIB. Other objects or features (for example, L2) that reserve ECMP or members will not be accounted for.

  • The inline resource accounting in FIB may not align with the SDK resource accounting that is displayed in the show controller npu resource command output.

  • FIB is not expected to transition LDIs from one load-balancing level to another (for example, from SHLDI to REC_SHLDI or to PHLDI). If such a transition occurs, the system disables resource monitoring accounting and triggers a warning message to alert the user. This precaution is necessary because different counters are used for different levels, and transitions could lead to inaccuracies in resource accounting.

  • Resource accounting is not enabled for management interfaces and special (drop) adjacencies.

Configure BGP for ECMP OOR Avoidance

In BGP, you must configure the ECMP delay duration and the resource usage threshold limit.

Procedure

Step 1

Execute the prefix-ecmp-delay interval_value oor-threshold threshold_value command to configure the ECMP delay duration and the OOR threshold value.

Example:
router bgp 100
  address-family ipv4 unicast
    prefix-ecmp-delay 10000 oor-threshold 30

In this sample configuration, when the resource usage exceeds a threshold of 30%, programming of new routes into the hardware is delayed by 10 seconds (10000 ms).

Currently, this command is supported only in global Address Family Identifier (AFI) and Subsequent Address Family Identifiers (SAFI) for IPv4 and IPv6.
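
The effect of the sample configuration can be expressed as a minimal Python sketch. The function name and parameters are hypothetical, and the defaults simply mirror the sample values (10000 ms and 30%); this illustrates the described behavior and is not the BGP implementation.

```python
def bgp_install_delay_ms(usage_pct, oor_threshold=30, ecmp_delay_ms=10000):
    """Return the delay (ms) applied to new routes at a given resource usage."""
    # Per the description above, the delay applies only when usage exceeds the threshold.
    return ecmp_delay_ms if usage_pct > oor_threshold else 0

print(bgp_install_delay_ms(25))  # 0
print(bgp_install_delay_ms(45))  # 10000
```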

Step 2

Execute the show bgp ipv4 unicast process detail performance-statistics | b OOR command or show bgp ipv4 unicast process detail | b OOR command to verify the configuration.

Example:
Router# show bgp ipv4 unicast process detail performance-statistics | b OOR
Fri Jun  7 17:35:20.284 UTC
OOR queue Info:
 Oldest Queue Num: 0
 Recent Queue Num: 0
 Prefix count HWM: 40000
 Delayed Paths count: 30680000
 Delayed Nets count: 280000
 Processed Nets count: 270000
 Last delayed Q time: May 29 22:30:23.412
 Last processed Q time: May 29 22:31:35.409
 Last OOR recovery time: ---
 Q-num  Q-size   Expiry-Time
  1     0       ---
  2     0       ---
  3     0       ---
  4     0       ---
  5     0       ---
Example:
Router# show bgp ipv4 unicast process detail | b OOR 
Fri Jun  7 17:38:18.613 UTC
 OOR Flag 0 OOR Threshold 0
 Prefix Download Delay 10000
Dampening is not enabled

Step 3

Execute the show bgp prefix detail command to view the details of BGP prefix delays.

Example:
Router# show bgp 209.165.201.9/27 detail
Wed Jul 31 14:01:13.358 EDT
BGP routing table entry for 209.165.201.9/27
Versions:
  Process           bRIB/RIB   SendTblVer
  Speaker           18490149     18490149
    Flags: 0x00023201+0x28010000+0x00000000 multipath; 
Last Modified: Jul 30 19:17:47.643 for 18:43:25
Last Delayed at: Jul 30 19:10:32.643
Paths: (16 available, best #1)
  Advertised IPv4 Unicast paths to update-groups (with more than one peer):
    10.1 0.7 0.8 
  Advertised IPv4 Unicast paths to peers (in unique update groups):
    172:23:1:79::2                          
  Path #1: Received by speaker 0
  Flags: 0x3000000001078001+0x00, import: 0x020
  Advertised IPv4 Unicast paths to update-groups (with more than one peer):
    10.1 0.7 0.8 
  Advertised IPv4 Unicast paths to peers (in unique update groups):
    172:23:1:79::2                          
  9001 64313 56001 58505, (received & used)
    209.165.201.2 from 209.165.201.2 (10.1.1.1), if-handle 0x00000000
      Origin IGP, localpref 100, valid, external, best, group-best, multipath
      Received Path ID 0, Local Path ID 1, version 18490149
      Origin-AS validity: (disabled)
  Path #2: Received by speaker 0
  Flags: 0x3000000001038001+0x00, import: 0x020
  Not advertised to any peer
  9002 64313 56001 58505, (received & used)
    209.165.200.2 from 209.165.200.2 (10.1.1.2), if-handle 0x00000000
      Origin IGP, localpref 100, valid, external, group-best, multipath
      Received Path ID 0, Local Path ID 0, version 0
      Origin-AS validity: (disabled)
  Path #3: Received by speaker 0
  Flags: 0x3000000001038001+0x00, import: 0x020
  Not advertised to any peer
  9003 64313 56001 58505, (received & used)
    209.165.202.2 from 209.165.202.2 (50.1.1.3), if-handle 0x00000000
      Origin IGP, localpref 100, valid, external, group-best, multipath
      Received Path ID 0, Local Path ID 0, version 0
      Origin-AS validity: (disabled)
  Path #4: Received by speaker 0
  Flags: 0x3000000001038001+0x00, import: 0x020
  Not advertised to any peer
  9004 64313 56001 58505, (received & used)
    209.165.200.6 from 209.165.200.6 (10.1.1.4), if-handle 0x00000000
      Origin IGP, localpref 100, valid, external, group-best, multipath
      Received Path ID 0, Local Path ID 0, version 0
      Origin-AS validity: (disabled)
....

The Last Delayed at field in the sample output indicates that the BGP prefix download to the RIB has been delayed.


Configure Dampening and DLB Modes

In FIB, you must enable dampening and DLB modes.

Procedure

Step 1

To enable dampening and DLB features with their default values, use the cef load-balancing recursive oor mode dampening-and-dlb command.

Example:
Router(config)# cef load-balancing recursive oor mode dampening-and-dlb

The default hardware usage values for FIB dampening and DLB are 70% and 90% respectively. The default FIB dampening switchover duration is 300 seconds.

  1. To manually configure the FIB dampening switchover duration, use the cef load-balancing recursive oor mode dampening-and-dlb max-duration value command.

    Example:
    Router(config)# cef load-balancing recursive oor mode dampening-and-dlb max-duration 500

    The FIB dampening switchover duration value ranges from 1 second to 600 seconds. With this command, FIB dampening and DLB are enabled with the default hardware usage thresholds (70% and 90%).

  2. To manually configure the FIB dampening threshold value, FIB dampening maximum switchover duration, and DLB threshold value, use the cef load-balancing recursive oor mode dampening-and-dlb dampening resource-threshold mbb_threshold max-duration value dlb resource-threshold dlb_threshold command.

    Example:
    Router(config)# cef load-balancing recursive oor mode dampening-and-dlb dampening resource-threshold 80 max-duration 400 dlb resource-threshold 50 

    The FIB dampening threshold value ranges from 1 through 99, the FIB dampening switchover duration value ranges from 1 second to 600 seconds, and the DLB threshold value ranges from 1 through 99.
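
When this command is generated by a script, the documented ranges can be validated before the command is pushed to the router. The following Python sketch assumes a hypothetical helper named build_oor_command; it only assembles the command string shown above and checks the ranges stated in this step.

```python
def build_oor_command(dampening_threshold, max_duration, dlb_threshold):
    """Build the dampening-and-dlb command after checking the documented ranges."""
    checks = (
        ("dampening threshold", dampening_threshold, 1, 99),
        ("max-duration", max_duration, 1, 600),
        ("DLB threshold", dlb_threshold, 1, 99),
    )
    for name, value, low, high in checks:
        if not low <= value <= high:
            raise ValueError(f"{name} {value} is outside the range {low}-{high}")
    return (
        "cef load-balancing recursive oor mode dampening-and-dlb "
        f"dampening resource-threshold {dampening_threshold} "
        f"max-duration {max_duration} "
        f"dlb resource-threshold {dlb_threshold}"
    )

print(build_oor_command(80, 400, 50))
```

Called with the values from the example (80, 400, and 50), the helper returns the same command line as the example.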

Step 2

When Hierarchical Load Balancing (HLB) routes are present, configure the cef load-balancing mode hierarchical ecmp min-paths value command.

Example:
Router(config)# cef load-balancing mode hierarchical ecmp min-paths 100

The minimum paths value ranges from 1 through 128.

Note

 

Before Release 24.2.1, the cef hierarchical-load-balancing ecmp min-paths value command was used to enable HLB with ECMP.

Step 3

Monitor the syslog messages to see whether dampening or DLB is triggered. If the syslog messages are not displayed by default on the console, use the show logging | i OOR command to view them.

Example:
Router#show logging | i OOR
Fri Jun  7 02:05:08.556 EDT
RP/0/RP0/CPU0:Jun  7 01:50:52.159 EDT: fib_mgr[408]: %ROUTING-FIB-4-LB_OOR_DAMPENING_HANDLING : Enter Load Balancing OOR Dampening mode. HW resmon: 58% 
LC/0/1/CPU0:Jun  7 01:50:52.159 EDT: fib_mgr[253]: %ROUTING-FIB-4-LB_OOR_DAMPENING_HANDLING : Enter Load Balancing OOR Dampening mode. HW resmon: 58% 
LC/0/6/CPU0:Jun  7 01:50:52.159 EDT: fib_mgr[265]: %ROUTING-FIB-4-LB_OOR_DAMPENING_HANDLING : Enter Load Balancing OOR Dampening mode. HW resmon: 58% 
RP/0/RP1/CPU0:Jun  7 01:50:52.158 EDT: fib_mgr[213]: %ROUTING-FIB-4-LB_OOR_DAMPENING_HANDLING : Enter Load Balancing OOR Dampening mode. HW resmon: 58% 
RP/0/RP1/CPU0:Jun  7 01:50:56.219 EDT: fib_mgr[213]: %ROUTING-FIB-4-LB_OOR_DLB_HANDLING : Enter Load Balancing OOR DLB (uni-path) mode. HW resmon: 85% 
RP/0/RP0/CPU0:Jun  7 01:50:56.220 EDT: fib_mgr[408]: %ROUTING-FIB-4-LB_OOR_DLB_HANDLING : Enter Load Balancing OOR DLB (uni-path) mode. HW resmon: 85% 
LC/0/6/CPU0:Jun  7 01:50:56.223 EDT: fib_mgr[265]: %ROUTING-FIB-4-LB_OOR_DLB_HANDLING : Enter Load Balancing OOR DLB (uni-path) mode. HW resmon: 85% 
LC/0/1/CPU0:Jun  7 01:50:56.224 EDT: fib_mgr[253]: %ROUTING-FIB-4-LB_OOR_DLB_HANDLING : Enter Load Balancing OOR DLB (uni-path) mode. HW resmon: 85% 
LC/0/6/CPU0:Jun  7 01:50:56.931 EDT: npu_drvr[296]: %PLATFORM-OFA-4-OOR_YELLOW : NPU 1, Table npu, Resource stage1_lb_member 
RP/0/RP1/CPU0:Jun  7 01:55:56.357 EDT: fib_mgr[213]: %ROUTING-FIB-4-LB_OOR_DAMPENING_EXIT : Exit FIB Load Balancing OOR Dampening. HW resmon: 85% 
RP/0/RP0/CPU0:Jun  7 01:55:56.386 EDT: fib_mgr[408]: %ROUTING-FIB-4-LB_OOR_DAMPENING_EXIT : Exit FIB Load Balancing OOR Dampening. HW resmon: 85% 
LC/0/6/CPU0:Jun  7 01:55:56.888 EDT: fib_mgr[265]: %ROUTING-FIB-4-LB_OOR_DAMPENING_EXIT : Exit FIB Load Balancing OOR Dampening. HW resmon: 85% 
LC/0/1/CPU0:Jun  7 01:55:56.975 EDT: fib_mgr[253]: %ROUTING-FIB-4-LB_OOR_DAMPENING_EXIT : Exit FIB Load Balancing OOR Dampening. HW resmon: 85% 
LC/0/1/CPU0:Jun  7 02:04:10.037 EDT: fib_mgr[253]: %ROUTING-FIB-4-LB_OOR_DAMPENING_HANDLING : Enter Load Balancing OOR Dampening mode. HW resmon: 84% 
LC/0/6/CPU0:Jun  7 02:04:10.039 EDT: fib_mgr[265]: %ROUTING-FIB-4-LB_OOR_DAMPENING_HANDLING : Enter Load Balancing OOR Dampening mode. HW resmon: 84% 
RP/0/RP0/CPU0:Jun  7 02:04:10.048 EDT: fib_mgr[408]: %ROUTING-FIB-4-LB_OOR_DAMPENING_HANDLING : Enter Load Balancing OOR Dampening mode. HW resmon: 84% 
RP/0/RP1/CPU0:Jun  7 02:04:10.055 EDT: fib_mgr[213]: %ROUTING-FIB-4-LB_OOR_DAMPENING_HANDLING : Enter Load Balancing OOR Dampening mode. HW resmon: 84% 

This sample output shows the history of routes entering and exiting the dampening and DLB modes.
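
When many nodes log these messages, it can help to reduce them to a list of events. The following Python sketch assumes that the messages keep the format shown in the sample output; the regular expression and the function name parse_oor_events are illustrative assumptions, not a Cisco-provided parser.

```python
import re

# Matches the fib_mgr load-balancing OOR messages in the sample output above.
OOR_LINE = re.compile(
    r"^(?P<node>\S+?):(?P<time>\w{3}\s+\d+ [\d:.]+) \w+: fib_mgr\[\d+\]: "
    r"%ROUTING-FIB-4-(?P<event>LB_OOR_\w+) : .*HW resmon: (?P<resmon>\d+)%"
)


def parse_oor_events(log_text):
    """Return (node, time, event, resmon_percent) for each OOR message."""
    events = []
    for line in log_text.splitlines():
        match = OOR_LINE.match(line.strip())
        if match:
            events.append(
                (match["node"], match["time"], match["event"], int(match["resmon"]))
            )
    return events


sample = """\
RP/0/RP0/CPU0:Jun  7 01:50:52.159 EDT: fib_mgr[408]: %ROUTING-FIB-4-LB_OOR_DAMPENING_HANDLING : Enter Load Balancing OOR Dampening mode. HW resmon: 58%
LC/0/6/CPU0:Jun  7 01:50:56.931 EDT: npu_drvr[296]: %PLATFORM-OFA-4-OOR_YELLOW : NPU 1, Table npu, Resource stage1_lb_member
LC/0/1/CPU0:Jun  7 01:55:56.975 EDT: fib_mgr[253]: %ROUTING-FIB-4-LB_OOR_DAMPENING_EXIT : Exit FIB Load Balancing OOR Dampening. HW resmon: 85%
"""

for event in parse_oor_events(sample):
    print(event)
```

Lines from other processes, such as the npu_drvr message, are skipped because they do not carry the fib_mgr OOR mnemonics.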

Step 4

To verify the hardware resource usage of the platform, run the show cef misc command.

Example:
IPv4
Router# show cef misc location 0/6/CPU0 | i LVL
Fri Jun  7 02:04:47.585 EDT
LVL1 LB Group:        Max: 8192; Used: 823(10%); high watermark: 1042, Jun  7 01:50:56.223 (LB OOR threshold: Dampening,40%; DLB,85%);
LVL1 LB Member Paths: Max: 16384; Used: 7566(46%); high watermark: 13969, Jun  7 02:04:19.891 (LB OOR threshold: Dampening,40%; DLB,85%);
LVL2 LB Group:        Max: 8192; Used: 3289(40%); high watermark: 3289, Jun  7 02:04:11.571;
LVL2 LB Member Paths: Max: 16384; Used: 4671(28%); high watermark: 4671, Jun  7 02:04:11.571

Example:
IPv6
Router# show cef ipv6 misc location 0/6/CPU0 | i LVL 
Fri Jun  7 02:04:54.442 EDT
LVL1 LB Group:        Max: 8192; Used: 823(10%); high watermark: 1042, Jun  7 01:50:56.223 (LB OOR threshold: Dampening,40%; DLB,85%);
LVL1 LB Member Paths: Max: 16384; Used: 7566(46%); high watermark: 13969, Jun  7 02:04:19.891 (LB OOR threshold: Dampening,40%; DLB,85%);
LVL2 LB Group:        Max: 8192; Used: 3289(40%); high watermark: 3289, Jun  7 02:04:11.571;
LVL2 LB Member Paths: Max: 16384; Used: 4671(28%); high watermark: 4671, Jun  7 02:04:11.571

This example shows that the percentage of hardware resource used (46%) is greater than the configured dampening percentage (40%).

Note

 

Since IPv4 and IPv6 counters share the same resources, the hardware usage values in both IPv4 and IPv6 outputs are identical.
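
The comparison between the used percentage and the configured thresholds can be scripted. The following Python sketch assumes the line format shown in the show cef misc sample output. The function name classify_usage, the state labels, and the use of a greater-than-or-equal comparison are assumptions made for illustration; the router's exact comparison logic is not documented here.

```python
import re

USAGE = re.compile(r"^(LVL\d LB [A-Za-z ]+?):\s+Max: (\d+); Used: (\d+)\((\d+)%\)")
THRESHOLDS = re.compile(r"Dampening,(\d+)%; DLB,(\d+)%")


def classify_usage(line):
    """Return (resource_name, used_percent, state) for one LVL line, or None."""
    usage = USAGE.match(line.strip())
    if usage is None:
        return None
    name, used_pct = usage.group(1), int(usage.group(4))
    thresholds = THRESHOLDS.search(line)
    if thresholds is None:
        # LVL2 lines in the sample output carry no OOR thresholds.
        return name, used_pct, "no-threshold"
    dampening, dlb = int(thresholds.group(1)), int(thresholds.group(2))
    if used_pct >= dlb:  # assumption: >= rather than >
        state = "above-dlb-threshold"
    elif used_pct >= dampening:
        state = "above-dampening-threshold"
    else:
        state = "below-thresholds"
    return name, used_pct, state


line = (
    "LVL1 LB Member Paths: Max: 16384; Used: 7566(46%); high watermark: 13969, "
    "Jun  7 02:04:19.891 (LB OOR threshold: Dampening,40%; DLB,85%);"
)
print(classify_usage(line))
```

For the LVL1 LB Member Paths line in the sample output, the sketch reports that usage is above the dampening threshold but below the DLB threshold, which matches the explanation above.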

Step 5

To verify entries that are queued in the FIB OOR retry queue based on the object queue ID, use the show cef object-queue queue-id queue_id command.

Example:
Router# show cef object-queue queue-id 23 detail location 0/6/CPU0      
Fri Jun  7 00:57:19.942 EDT
OBJ_PARTITION_MARKER id:PiDLB
 objs:0, walks:0, walked pl:0 route:0, active N, last-obj-add:Not Yet Recorded
 ptr: 0x308c152610
 obj type: OBJ_MARKER, flags: 0, refcnt: 0
 update time May 31 13:53:49.105
OBJ_PARTITION_MARKER id:MBBO
 objs:42, walks:0, walked pl:0, last-obj-add:Jun  7 00:57:14.996
 ptr: 0x308c152a90
 obj type: OBJ_MARKER, flags: 0, refcnt: 0
 update time May 31 13:53:49.105
PATHLIST pl:0x3094a09f98 paths:50 pl-type:Shared refcnt:500
    1st prefix dependent: default 0xe0000000 209.1.83.1/32 leaf:0x309dbadfa8
 ptr: 0x308c3ddb40
 obj type: QUEUE-EXTENSION, flags: 0, refcnt: 0
 update time Jun  7 00:57:08.820
PATHLIST pl:0x3094a1de98 paths:54 pl-type:Shared refcnt:1500
    1st prefix dependent: default 0xe0000000 209.1.85.1/32 leaf:0x309dbcd3a8
 ptr: 0x308c3c87c8
 obj type: QUEUE-EXTENSION, flags: 0, refcnt: 0
 update time Jun  7 00:57:09.697
OBJ_PARTITION_MARKER id:MBBN
 objs:48, walks:7, walked pl:687 route:161479, merged-pl:17581, max-dur:300s, sleep:0, force:0, active Y, last-obj-add:Jun  7 00:57:14.994
 ptr: 0x308c152f10
 obj type: OBJ_MARKER, flags: 0, refcnt: 0
 update time May 31 13:53:49.103
 OOR Dampening - MBB Switchover History, num entries 7
  -------------------------------------------------------------------------------------------------
 |      Time Stamp     | resource avail check (nhg/mem) | wlk-pl | pl-left | mbb2dlb | RM low/peak |
  -------------------------------------------------------------------------------------------------
 | Jun  1 18:09:18.592 | 155  / 5665  .vs. 59   / 2097  | 59     | 0       |  0      |  16% /  49% |
 | Jun  1 18:25:03.488 | 0    / 0     .vs. 371  / 3661  | 371    | 0       |  0      |  17% /  39% |
 | Jun  1 18:25:06.688 | 0    / 0     .vs. 23   / 1273  | 23     | 0       |  0      |  27% /  35% |
 | Jun  1 18:25:27.936 | 5    / 230   .vs. 62   / 3236  | 62     | 0       |  0      |  14% /  33% |
 | Jun  3 16:54:41.920 | 111  / 4970  .vs. 58   / 2119  | 58     | 0       |  0      |  23% /  51% |
 | Jun  3 18:47:12.128 | 79   / 4497  .vs. 46   / 1908  | 46     | 0       |  0      |  26% /  52% |
  --------------------------------------------------------------------------------------------------
PATHLIST pl:0x3094a1bf98 paths:69 pl-type:Shared refcnt:1500
    1st prefix dependent: default 0xe0000000 209.1.85.1/32 leaf:0x309dbcd3a8
 ptr: 0x308c3e3370
 obj type: QUEUE-EXTENSION, flags: 0, refcnt: 0
 update time Jun  7 00:57:08.817
PATHLIST pl:0x3094a1ff98 paths:61 pl-type:Shared refcnt:500
    1st prefix dependent: default 0xe0000000 209.1.83.1/32 leaf:0x309dbadfa8
 ptr: 0x308c3d6f68
 obj type: QUEUE-EXTENSION, flags: 0, refcnt: 0
 update time Jun  7 00:57:08.567

This example indicates that the system is in dampening state.

MBBO (old path) has 54 paths, and MBBN (new path) has 69 paths.

PiDLB indicates that the prefix or route is programmed in uni-path to avoid ECMP OOR condition.

  1. To verify the event history of dampening switchover and DLB recovery, run the show cef object-queue queue-id queue_id detail command.

    Example:
    Dampening switchover
    Router# show cef ipv6 object-queue queue-id 23 detail location 0/6/CPU0 | b MBB
    Fri Jun  7 01:03:59.295 EDT
    OBJ_PARTITION_MARKER id:MBBO
     objs:0, walks:0, walked pl:0, last-obj-add:Jun  7 00:56:47.889
     ptr: 0x308cc88390
     obj type: OBJ_MARKER, flags: 0, refcnt: 0
     update time May 31 13:53:49.418
    OBJ_PARTITION_MARKER id:MBBN
     objs:0, walks:7, walked pl:102 route:25251, merged-pl:162, max-dur:300s, sleep:0, force:1, active N, last-obj-add:Jun  7 00:56:56.796
     ptr: 0x308cc88810
     obj type: OBJ_MARKER, flags: 0, refcnt: 0
     update time May 31 13:53:49.418
     OOR Dampening - MBB Switchover History, num entries 7
      -------------------------------------------------------------------------------------------------
     |      Time Stamp     | resource avail check (nhg/mem) | wlk-pl | pl-left | mbb2dlb | RM low/peak |
      -------------------------------------------------------------------------------------------------
     | May 31 22:24:51.840 | 53   / 229   .vs. 9    / 120   | 9      | 0       |  0      |  53% /  54% |
     | May 31 22:25:43.296 | 0    / 0     .vs. 3    / 42    | 3      | 0       |  0      |  24% /  24% |
     | Jun  3 16:53:30.624 | 227  / 1558  .vs. 24   / 325   | 24     | 0       |  0      |  37% /  45% |
     | Jun  3 18:45:44.320 | 304  / 2246  .vs. 51   / 645   | 51     | 0       |  0      |  37% /  50% |
     | Jun  3 18:46:34.496 | 0    / 0     .vs. 1    / 15    | 1      | 0       |  0      |  39% /  39% |
     | Jun  3 18:47:12.128 | 1    / 13    .vs. 1    / 15    | 1      | 0       |  0      |  26% /  26% |
     | Jun  7 01:01:55.840 | 0    / 0     .vs. 13   / 342   | 13   F | 0       |  12      |  46% /  48% |
      --------------------------------------------------------------------------------------------------
    OBJ_PARTITION_MARKER id:MBBNR
     objs:0, walks:1, walked pl:13, merged-pl:0, last-obj-add:Jun  7 00:56:56.796
     ptr: 0x308cc88c90
     obj type: OBJ_MARKER, flags: 0, refcnt: 0
     update time May 31 13:53:49.418
     OOR Dampening - HLB Site Routes MBB Switchover History, num entries 1
      ---------------------------------------------------------------
     |      Time Stamp     | wlk-pl | wlk-lf | pl-left | RM low/peak |
      ---------------------------------------------------------------
     | Jun  7 01:01:55.840 | 13   F | 13     | 0       |  16% /  16% |
      ---------------------------------------------------------------
    OBJ_PARTITION_MARKER id:OOR
     objs:0, walks:3, walked pl:10, last-obj-add:Jun  3 18:36:16.526
     ptr: 0x308cc89110
     obj type: OBJ_MARKER, flags: 0, refcnt: 0
     update time May 31 13:53:49.418
    

    In this sample output,

    • Dampening switchover is configured with a maximum duration (max-dur) of 300 seconds (5 minutes). The objects remain in the dampening queue until this timer expires. After five minutes, the routes are programmed in ECMP mode or DLB mode, depending on the hardware resource state.

    • MBB Switchover History displays the dampening switchovers that occurred, with the timestamp of each.

      • pl-left =0 implies an empty object queue.

      • mbb2dlb =12 indicates that dampening switchover has happened and 12 routes will be programmed in DLB mode.

      • F indicates dampening switchover by force.

      • active N indicates that the system is not in dampening state.

    • HLB Site Routes MBB Switchover History displays the HLB site route switchovers that occurred, with the timestamp of each.

      HLB routes use non-recursive resources.
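
The fields described above can be extracted from a history row programmatically. The following Python sketch assumes the six-column row layout of the MBB Switchover History table in the sample output; the function name parse_mbb_row and the dictionary keys are hypothetical.

```python
def parse_mbb_row(row):
    """Parse one data row of the 'MBB Switchover History' table."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    if len(cells) != 6:
        raise ValueError("expected 6 columns in an MBB switchover history row")
    timestamp, _resource_check, wlk_pl, pl_left, mbb2dlb, rm = cells
    wlk_parts = wlk_pl.split()
    rm_low, rm_peak = (int(part.strip().rstrip("%")) for part in rm.split("/"))
    return {
        "timestamp": timestamp,
        "walked_pathlists": int(wlk_parts[0]),
        "forced": "F" in wlk_parts[1:],  # F marks a switchover by force
        "pl_left": int(pl_left),         # 0 implies an empty object queue
        "mbb2dlb": int(mbb2dlb),         # routes to be programmed in DLB mode
        "rm_low": rm_low,
        "rm_peak": rm_peak,
    }


row = "| Jun  7 01:01:55.840 | 0    / 0     .vs. 13   / 342   | 13   F | 0       |  12      |  46% /  48% |"
print(parse_mbb_row(row))
```

For the last row of the sample table, the result shows a forced switchover with 12 routes moved to DLB mode and an empty object queue.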

    Example:
    DLB recovery
    Router# show cef object-queue queue-id 23 detail location 0/6/CPU0  
    Fri Jun  7 02:16:29.223 EDT
    OBJ_PARTITION_MARKER id:PiDLB
     objs:1536, walks:3, walked pl:3 route:37, active Y, last-obj-add:Jun  7 02:09:11.828
     ptr: 0x308c152610
     obj type: OBJ_MARKER, flags: 0, refcnt: 0
     update time May 31 13:53:49.104
     OOR Dampening - PI-DLB Recovery History, num entries 3
      ---------------------------------------------------------------
     |      Time Stamp     | wlk-pl | wlk-lf | pl-left | RM low/peak |
      ---------------------------------------------------------------
     | Jun  7 02:04:08.832 | 1      | 16     | 1511    |  84% /  85% |
     | Jun  7 02:04:11.008 | 1      | 18     | 1525    |  84% /  85% |
     | Jun  7 02:04:20.096 | 1      | 3      | 1536    |  84% /  85% |
      ---------------------------------------------------------------
    PATHLIST pl:0x30a51b6698 paths:15 pl-type:Shared refcnt:10
        1st prefix dependent: default 0xe0000000 207.1.89.101/32 leaf:0x30a59daaa8
     ptr: 0x30a4fa1068
     obj type: QUEUE-EXTENSION, flags: 0, refcnt: 0
     update time Jun  7 01:50:56.233
    PATHLIST pl:0x30a51b6798 paths:10 pl-type:Shared refcnt:9
        1st prefix dependent: default 0xe0000000 207.1.89.103/32 leaf:0x30a59daba8
     ptr: 0x30a4fa10f0
     obj type: QUEUE-EXTENSION, flags: 0, refcnt: 0
     update time Jun  7 01:50:56.233
    

    In this sample output,

    • active Y indicates that the DLB state is active.

    • PI-DLB Recovery History displays the number of pathlists and leafs that are yet to be walked.

      • The objs value and pl-left value will match most of the time.

        Note

         

        The object queues of different line cards, for example, LC1 and LC2, can have similar or slightly different values.

Step 6

To verify if the route is installed in DLB mode, use the show cef ipv4 | ipv6 command.

Example:
Router# show cef 209.165.200.225
Mon Nov 27 17:56:39.569 PST
198.0.0.2/32, version 12, PI-DLB, internal 0x1000001 0x0 (ptr 0x62f656d0) [1], 0x0 (0x0), 0x0 (0x0)
Updated Nov 27 17:55:40.203
Prefix Len 32, traffic index 0, precedence n/a, priority 0
  gateway array (0x6323a8d0) reference count 2, flags 0x2010, source rib (7), 0 backups
                [1 type 3 flags 0x48449 (0x6329c0d8) ext 0x0 (0x0)]
  LW-LDI[type=0, refc=0, ptr=0x0, sh-ldi=0x0]
  gateway array update type-time 1 Nov 27 17:55:40.203
   via 10.0.0.2/32, 5 dependencies, recursive [flags 0x0]
    path-idx 0 NHID 0x0 [0x62f65cd8 0x0], Internal 0x643fc0a0
    next hop 10.0.0.2/32 via 10.0.0.2/32
   via 11.0.0.2/32, 3 dependencies, recursive [flags 0x0]
    path-idx 1 NHID 0x0 [0x62f65a68 0x0], Internal 0x643fc1d0
    next hop 10.10.10.2/32 via 10.09.0.2/32
 
    Load distribution: 0 (refcount 2)
 
    Hash  OK  Interface                 Address	
    0     Y   UNKNOWN intf 0x00000014   10.0.1.2

This sample output shows that the route is installed in DLB mode, and the single path is picked by Hash calculations.
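
To check several prefixes for DLB mode, the PI-DLB flag on the first line of the show cef output can be tested in a script. The following Python sketch assumes the comma-separated header format shown in the sample output; the function name is_pi_dlb is hypothetical.

```python
def is_pi_dlb(show_cef_output):
    """Return True if the prefix header line carries the PI-DLB flag."""
    for line in show_cef_output.splitlines():
        fields = [field.strip() for field in line.split(",")]
        # The prefix header is the line whose second field is "version N".
        if len(fields) > 1 and fields[1].startswith("version"):
            return "PI-DLB" in fields
    return False


header = "198.0.0.2/32, version 12, PI-DLB, internal 0x1000001 0x0 (ptr 0x62f656d0) [1], 0x0 (0x0), 0x0 (0x0)"
print(is_pi_dlb(header))
```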


Protection of Directly Connected EBGP Neighbors through Interface-Based LPTS Identifier

Table 5. Feature History Table

Feature Name

Release Name

Description

Protection of Directly Connected EBGP Neighbors through Interface-Based LPTS Identifier

Release 7.10.1

We have enhanced the network security for directly connected eBGP neighbors by ensuring that only packets originating from designated eBGP neighbors can traverse through a single interface, thus preventing IP spoofing. This is made possible because we've now added an interface identifier for Local Packet Transport Services (LPTS). LPTS filters and polices the packets based on the type of flow rate you configure.

The feature introduces these changes:

CLI:

YANG Data Model:

Local Packet Transport Services (LPTS) maintains tables describing all packet flows destined for the secure domain router (SDR), making sure that packets are delivered to their intended destinations.

With respect to BGP sessions, LPTS bindings can be categorized as follows:

  • BGP Known: These LPTS entries correspond to BGP sessions with established neighbors.

  • BGP Configured Peer: LPTS entries in this category are designated to receive the initial packets (TCP SYN and 3rd ACK) from specifically configured BGP neighbors.

  • BGP Default Entries: This category encompasses LPTS entries that capture all packets originating from unconfigured BGP neighbors.

An attacker who spoofs a packet using the exact combination of source IP, destination IP, source port, and destination port, and then floods these packets from another interface within the same VRF, will cause the packet to match the BGP known LPTS entry. As a result, the packet will traverse up to the TCP layer and potentially be dropped at that level. All BGP known LPTS entries share a common LPTS policer, which means that packets arriving through any of these entries will be policed at the specified rate.

However, if the attacker sends these packets at a rate exceeding the policer's defined rate, this will lead to congestion in this flow, adversely impacting BGP established peers. As a result, these BGP sessions may experience instability, which could lead to flapping.

This feature enables you to protect your network by adding an interface identifier for LPTS in directly connected eBGP neighbors. LPTS filters and polices the packets based on the type of flow rate you configure. This feature ensures that only packets originating from designated eBGP neighbors can traverse through a single interface, thus preventing IP spoofing. The interface identifier is passed to LPTS and TCP only when all of the following criteria are met:

  • The BGP peer is configured to be external.

  • The Fast External Failover (FEF) is not disabled.

  • The BGP peer is directly connected.

  • The BGP peer is not a dynamic peer.

  • eBGP multihop is not enabled.

  • The default eBGP TTL is used.

  • The "ignore connected" option is not configured.

  • A non-link local IPv6 neighbor address is configured.
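
The criteria above amount to a single conjunction of conditions. The following Python sketch models them with a hypothetical Neighbor record whose field names are invented for illustration; it is not BGP's internal data structure. The last criterion is modeled as "the neighbor address is not an IPv6 link-local address", which is an assumption about how it applies to IPv4 neighbors.

```python
from dataclasses import dataclass


@dataclass
class Neighbor:
    """Hypothetical summary of the neighbor attributes that the criteria refer to."""
    is_external: bool
    fast_external_failover: bool
    directly_connected: bool
    is_dynamic: bool
    ebgp_multihop: bool
    default_ebgp_ttl: bool
    ignore_connected: bool
    ipv6_link_local_address: bool


def interface_id_passed(n: Neighbor) -> bool:
    """Return True only when every documented criterion is met."""
    return (
        n.is_external
        and n.fast_external_failover
        and n.directly_connected
        and not n.is_dynamic
        and not n.ebgp_multihop
        and n.default_ebgp_ttl
        and not n.ignore_connected
        and not n.ipv6_link_local_address
    )


eligible = Neighbor(True, True, True, False, False, True, False, False)
multihop = Neighbor(True, True, True, False, True, True, False, False)
print(interface_id_passed(eligible), interface_id_passed(multihop))
```

In this model, enabling eBGP multihop on an otherwise eligible neighbor is enough to stop the interface identifier from being passed.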

In the LPTS binding process through the LPTS socket option, BGP generates a tuple for the interface identifier for every directly connected eBGP neighbor.

The configured BGP LPTS entry will only match an incoming connection (TCP SYN packet) if it is received from the programmed interface.

The BGP default entry handles incoming connections, or any other packets, received on interfaces other than the specified ones. These packets are subjected to rigorous policing and forwarded to TCP for reset generation. As a result, any spoofed packets arriving from non-desired interfaces will not affect the BGP configured peer LPTS entries.

Upon receiving a passive connection from the programmed interface and establishing it at the TCP level, TCP will inherit the same interface for the BGP known LPTS entry, which will be created for this specific connection.

Packets that match the source IP, destination IP, source port, destination port, and VRF information of an established connection, but are received from a different interface, will not be matched to the LPTS entry. As a result, these packets will be directed to the BGP default entry. This mechanism ensures that spoofed packets originating from non-desired interfaces will not affect the BGP known peer LPTS entries.

During the bind process for an active connection, BGP will also furnish the interface identifier. TCP will incorporate this interface information into the LPTS entry corresponding to the active connection, effectively safeguarding BGP known LPTS entries against spoofed packets that might match this connection but originate from a different interface.
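
The effect of the interface identifier on matching can be illustrated with a toy model. The following Python sketch is a simplification under stated assumptions: the Packet and KnownEntry types and the classify function are hypothetical, only BGP known and BGP default entries are modeled, and an interface of None stands for the "any" interface that is used when no identifier is bound.

```python
from typing import List, NamedTuple, Optional


class Packet(NamedTuple):
    vrf: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    in_interface: str


class KnownEntry(NamedTuple):
    vrf: str
    remote_ip: str
    remote_port: int
    local_ip: str
    local_port: int
    interface: Optional[str]  # None models "any" (no interface identifier)


def classify(packet: Packet, known_entries: List[KnownEntry]) -> str:
    """Return which category of LPTS entry the packet would match in this model."""
    for entry in known_entries:
        same_flow = (
            entry.vrf, entry.remote_ip, entry.remote_port,
            entry.local_ip, entry.local_port,
        ) == (
            packet.vrf, packet.src_ip, packet.src_port,
            packet.dst_ip, packet.dst_port,
        )
        if same_flow and entry.interface in (None, packet.in_interface):
            return "bgp-known"
    return "bgp-default"


secure = [KnownEntry("default", "192.0.2.3", 179, "192.0.2.1", 57342, "Gi0/2/0/1")]
genuine = Packet("default", "192.0.2.3", 179, "192.0.2.1", 57342, "Gi0/2/0/1")
spoofed = genuine._replace(in_interface="Gi0/2/0/3")
print(classify(genuine, secure), classify(spoofed, secure))
```

With the interface bound, the spoofed packet falls through to the default entry; with an entry whose interface is None, the same spoofed packet would match the known entry and share its policer.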

Configure Protection of Directly Connected EBGP Neighbors through Interface-Based LPTS Identifier

To enable Local Packet Transport Services (LPTS) secure binding, perform the following steps:


Router(config)# router bgp 100
Router(config-bgp)# bgp lpts-secure-binding

Running Configuration

router bgp 100
 bgp lpts-secure-binding

Verification

Verify the LPTS bindings along with the connected interface identifier:

Router# show lpts pifib entry brief 

 IPv4    default  TCP    any          [0x00000003]      10.10.10.1,23756 10.10.10.2,179
 IPv4    default  TCP    any          0/0/CPU0           10.10.10.1,179 10.10.10.2
 IPv4    default  TCP    Gi0/2/0/1    [0x00000003]       192.0.2.1,57342 192.0.2.3,179
 IPv4    default  TCP    Gi0/2/0/1    0/0/CPU0           192.0.2.1,179 192.0.2.3
 IPv4    default  TCP    any          [0x00000003]       209.165.201.1,179 209.165.201.4,52798
 IPv4    default  TCP    any          0/0/CPU0           209.165.201.1,179 209.165.201.0/24
 IPv4    default  TCP    Gi0/2/0/3    [0x00000003]       172.16.0.1,179 172.16.0.5,49505
 IPv4    default  TCP    Gi0/2/0/3    0/0/CPU0           172.16.0.1,179 172.16.0.5
 IPv4    default  TCP    any          [0x00000003]       192.168.0.1,179 192.168.0.6,32909
 IPv4    default  TCP    any          0/0/CPU0           192.168.0.1,179 192.168.0.6

Verify that the LPTS secure binding is enabled:

Router# show bgp process | in LPTS

Wed Dec 14 14:28:33.779 PST
LPTS secure binding is enabled

Verify that the status of the connected interface identifier in LPTS is active:

Router# show bgp neighbor 192.0.2.3 detail | in Connected

Wed Dec 14 14:28:51.814 PST
  Connected IFH: 0x1000080, IFH in LPTS 0x1000080
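
This check can be automated across neighbors. The following Python sketch assumes the output format shown above and treats two equal, nonzero interface handles as the active state; that interpretation and the function name ifh_in_lpts_active are assumptions for illustration.

```python
import re

IFH_LINE = re.compile(r"Connected IFH: (0x[0-9a-fA-F]+), IFH in LPTS (0x[0-9a-fA-F]+)")


def ifh_in_lpts_active(output):
    """Return True/False for the IFH comparison, or None if the line is absent."""
    match = IFH_LINE.search(output)
    if match is None:
        return None
    connected = int(match.group(1), 16)
    in_lpts = int(match.group(2), 16)
    # Assumption: active means both handles are equal and nonzero.
    return connected != 0 and connected == in_lpts


print(ifh_in_lpts_active("  Connected IFH: 0x1000080, IFH in LPTS 0x1000080"))
```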

Convergence for BGP Labeled Unicast PIC Edge

Table 6. Feature History Table

Feature Name

Release Information

Feature Description

Convergence for BGP Labeled Unicast PIC Edge

Release 7.7.1

This feature improves the convergence time of BGP labeled unicast (LU) routes to subseconds when an ingress provider edge router fails or loses PE router connectivity, and another PE router needs to be connected. This feature minimizes traffic drops when the primary paths fail for the BGP LU routes.

The BGP Labeled Unicast (LU) PIC Edge feature enables you to create and store both the primary and backup paths in the Routing Information Base (RIB), Forwarding Information Base (FIB), and Cisco Express Forwarding. When the router detects a failure, the backup or alternate path immediately takes over. This enables fast failover and subsecond convergence.

For BGP LU PIC Edge to work, the edge iBGP devices, such as ingress PEs and Autonomous System Border Router (ASBR), must support BGP PIC and must receive backup BGP next hop.

The topology diagram given below illustrates the Convergence for BGP Labeled Unicast PIC Edge feature. The topology is explained as follows:

  • The BGP LU PIC Edge feature is enabled on a provider edge router, PE1.

  • PE1 learns the BGP LU prefix from the remote PE router, PE2.

  • PE1 routes traffic through the Area Border Routers, ABR1, ABR2 and ABR3. If one of them fails, the preprogrammed backup of the failed ABR routes the traffic.

  • PE2 is marked as the backup or alternate next hop and is programmed into the FIB of PE1.

  • When PE1 learns PE2 is not reachable through ABR1, it immediately changes the BGP next hop for the PE1's prefix to ABR2.

  • The switchover occurs in less than a second regardless of the number of prefixes.

  • Subsecond convergence occurs although updates to multiple BGP prefixes are pending.
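
The reason the switchover time does not depend on the number of prefixes can be illustrated with a toy model. In the following Python sketch, many prefixes point to one shared path list that holds a preprogrammed primary and backup next hop, so a failure is handled by a single update. The PathList class and the generated prefixes are hypothetical; this is a conceptual sketch, not a description of how the FIB is implemented.

```python
class PathList:
    """Toy shared path list with a preprogrammed primary and backup next hop."""

    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup
        self.primary_up = True

    def active_next_hop(self):
        return self.primary if self.primary_up else self.backup


shared = PathList(primary="ABR1", backup="ABR2")

# 1000 illustrative prefixes all reference the same path list object.
fib = {f"10.{i // 256}.{i % 256}.0/24": shared for i in range(1000)}

# Failure of the primary: one update on the shared object, no per-prefix work.
shared.primary_up = False

print(len(fib), {pl.active_next_hop() for pl in fib.values()})
```

After the single update, every prefix resolves through ABR2, whether the table holds one prefix or one thousand.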

Topology

Figure 5. BGP LU PIC Edge

Guidelines and Limitations

This feature supports BGP multipath, which allows the router to install multiple internal BGP paths and multiple external BGP paths in the forwarding table. The multiple paths enable BGP to load balance traffic across multiple links.

The convergence time is independent of the BGP LU route scale.

Configure Convergence for BGP Labeled Unicast PIC Edge

Perform the following steps to configure Convergence for BGP Labeled Unicast PIC Edge:

  • Configure BGP labeled unicast and attach a route policy to the BGP address families.

  • Configure BGP labeled unicast multipath and attach a route policy to the BGP address families.


Router(config)# route-policy BGP-PIC-EDGE
Router(config-rpl)# set path-selection backup 1 install
Router(config-rpl)# end-policy
Router(config)# router bgp 200
Router(config-bgp)# bgp router-id 192.168.1.0
Router(config-bgp)# address-family ipv4 unicast
Router(config-bgp-af)# additional-paths receive
Router(config-bgp-af)# additional-paths send
Router(config-bgp-af)# additional-paths selection route-policy BGP-PIC-EDGE

/* Perform the following steps to configure BGP labeled unicast multipath and attach a route policy to the BGP address families: */
Router(config)# route-policy BGP-PIC-EDGE-MULTIPATH
Router(config-rpl)# set path-selection backup 1 install multipath-protect
Router(config-rpl)# end-policy
Router(config)# router bgp 200
Router(config-bgp)# bgp router-id 192.168.1.0
Router(config-bgp)# address-family ipv4 unicast
Router(config-bgp-af)# maximum-paths ibgp 2
Router(config-bgp-af)# additional-paths receive
Router(config-bgp-af)# additional-paths send
Router(config-bgp-af)# additional-paths selection route-policy BGP-PIC-EDGE-MULTIPATH

Running Configuration

route-policy BGP-PIC-EDGE 
 set path-selection backup 1 install
 end-policy
router bgp 200
 bgp router-id 192.168.1.0
 address-family ipv4 unicast
  additional-paths receive
  additional-paths send
  additional-paths selection route-policy BGP-PIC-EDGE

route-policy BGP-PIC-EDGE-MULTIPATH
 set path-selection backup 1 install multipath-protect
 end-policy
router bgp 200
 bgp router-id 192.168.1.0
 address-family ipv4 unicast
  maximum-paths ibgp 2
  additional-paths receive
  additional-paths send
  additional-paths selection route-policy BGP-PIC-EDGE-MULTIPATH

Verification

Verify that the backup path is established.

Router# show cef 192.0.2.1/32
192.168.0.0/32, version 31, internal 0x5000001 0x40 (ptr 0x901d2370) [1], 0x0 (0x90d2beb8), 0xa08 (0x91c74378)
 Prefix Len 32, traffic index 0, precedence n/a, priority 4
   via 203.0.113.1/32, 3 dependencies, recursive [flags 0x6000]  << Primary Path
    path-idx 0 NHID 0x0 [0x90319650 0x0]
    recursion-via-/32
    next hop 192.51.100.1/32 via 24006/0/21
    next hop 209.165.200.225/32 Hu0/0/0/25   labels imposed {24002 24000}
    next hop 10.0.0.1/32 Hu0/0/0/26   labels imposed {24002 24000}
   via 203.0.113.2/32, 2 dependencies, recursive, backup [flags 0x6100]  << Backup Path
    path-idx 1 NHID 0x0 [0x903197b8 0x0]
    recursion-via-/32
    next hop 209.165.200.225/32 via 24005/0/21
    next hop 192.51.100.1/32 Hu0/0/0/25   labels imposed {24001 24000}
    next hop 10.0.0.1/32 Hu0/0/0/26   labels imposed {24001 24000}
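
The presence of a backup path can also be confirmed in a script. The following Python sketch assumes the format of the via lines in the sample output, where the backup path carries the keyword backup after recursive; the function name list_paths is hypothetical.

```python
import re

VIA_LINE = re.compile(r"^\s*via (\S+), \d+ dependencies, recursive(, backup)?")


def list_paths(show_cef_output):
    """Return (next_hop, role) for each recursive via line in the output."""
    paths = []
    for line in show_cef_output.splitlines():
        match = VIA_LINE.match(line)
        if match:
            role = "backup" if match.group(2) else "primary"
            paths.append((match.group(1), role))
    return paths


sample = """\
   via 203.0.113.1/32, 3 dependencies, recursive [flags 0x6000]
    next hop 192.51.100.1/32 via 24006/0/21
   via 203.0.113.2/32, 2 dependencies, recursive, backup [flags 0x6100]
    next hop 209.165.200.225/32 via 24005/0/21
"""

print(list_paths(sample))
```

If no entry with the backup role is returned, the backup path has not been installed for that prefix.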

Black Box Monitoring

Table 7. Feature History Table

Feature Name

Release Information

Feature Description

Black Box Monitoring

Release 7.3.2

This feature enables you to set up a forwarding path on the router that you can use to probe customer circuits for system metrics specific to the network devices. Such monitoring helps you meet the service level agreements with your customers.

This feature uses a technique whereby a dummy BGP session is established across the GRE encapsulation and decapsulation infrastructure. To terminate the dummy BGP session, the router peers to an address that is configured on the peering fabric (PF), which means that the PF is, in essence, peering to itself.

To make this work, connect two interfaces of the router to each other with a physical cable. After the two interfaces are connected, place one of them into a VRF so that the BGP session can come up. A router does not normally attempt to establish a BGP session to itself, so you must separate the routing tables by using a VRF. The other interface remains a normal interface in the global VRF, with the same configuration that is typically used on a PF peering interface.

Configuration Example

Perform the following steps to configure BGP and a GRE tunnel.

/* Configure the Local Proxy ARP on the Bundle-Ether interfaces.*/
Router(config)# interface Bundle-Ether1.1
Router(config-if)# ipv4 address 10.1.1.1 255.255.255.240
Router(config-if)# local-proxy-arp
Router(config-if)# encapsulation dot1q 12
Router(config-if)# ipv4 access-group aa-acl ingress
Router(config-if)# exit
Router(config)# interface Bundle-Ether2.1
Router(config-if)# vrf aa
Router(config-if)# ipv4 address 10.1.1.2 255.255.255.240
Router(config-if)# local-proxy-arp
Router(config-if)# encapsulation dot1q 12
Router(config-if)# exit

/* Configure a bundle on FortyGigE interfaces.*/
Router(config)# interface FortyGigE0/0/0/46
Router(config-if)# bundle id 1 mode on
Router(config-if)# exit
Router(config)# interface FortyGigE0/0/0/47
Router(config-if)# bundle id 2 mode on
Router(config-if)# exit

/* Configure the access list.*/
Router(config)# ipv4 access-list aa-acl
Router(config-ipv4-acl)# 1 permit icmp any host 10.1.1.1 echo-reply
Router(config-ipv4-acl)# 2 permit ipv4 any any nexthop1 ipv4 100.100.2.2
Router(config-ipv4-acl)# 10 permit tcp any eq bgp any
Router(config-ipv4-acl)# 20 permit tcp any any eq bgp

/* Configure BGP.*/
Router(config)# router bgp 100
Router(config-bgp)# bgp router-id 10.10.10.10
Router(config-bgp)# bgp log neighbor changes detail
Router(config-bgp)# address-family ipv4 unicast
Router(config-bgp-af)# maximum-paths ebgp 64
Router(config-bgp-af)# maximum-paths ibgp 64
Router(config-bgp-af)# exit
Router(config-bgp)# address-family vpnv4 unicast
Router(config-bgp-af)# exit

/* Configure the VRF neighbor and apply the route policy. */
Router(config-bgp)# vrf aa
Router(config-bgp-vrf)# rd auto
Router(config-bgp-vrf)# address-family ipv4 unicast
Router(config-bgp-vrf-af)# exit
Router(config-bgp-vrf)# neighbor 10.1.1.1
Router(config-bgp-vrf-nbr)# remote-as 200
Router(config-bgp-vrf-nbr)# ebgp-multihop 4
Router(config-bgp-vrf-nbr)# address-family ipv4 unicast
Router(config-bgp-vrf-nbr-af)# send-community-ebgp
Router(config-bgp-vrf-nbr-af)# route-policy pass-all in
Router(config-bgp-vrf-nbr-af)# route-policy pass-all out

/* Configure loopback interfaces. */
Router(config)# interface Loopback1001
Router(config-if)# ipv4 address 10.10.10.10 255.255.255.255
Router(config-if)# exit
Router(config)# interface Loopback1002
Router(config-if)# vrf aa
Router(config-if)# ipv4 address 10.10.10.10 255.255.255.255

/* Configure a class map. */
Router(config)# class-map type traffic match-all aa
Router(config-cmap)# match protocol gre
Router(config-cmap)# match destination-address ipv4 10.10.10.10 255.255.255.255
Router(config-cmap)# end-class-map

/* Configure a policy map. */
Router(config)# policy-map type pbr pmap1
Router(config-pmap)# class type traffic aa
Router(config-pmap-c)# decapsulate gre
Router(config-pmap-c)# class type traffic class-default
Router(config-pmap-c)# end-policy-map

/* Configure VRF policy. */
Router(config)# vrf-policy
Router(config-vrf)# vrf default address-family ipv4 policy type pbr input pmap1
Router(config-vrf)# exit

/* Configure the GRE tunnel interface. */
Router(config)# interface tunnel-ip 1100
Router(config-if)# ipv4 unnumbered Loopback1001
Router(config-if)# tunnel mode gre ipv4 encap
Router(config-if)# tunnel source Loopback1001
Router(config-if)# tunnel destination 200.1.2.1
Router(config-if)# logging events link-status

Running Configuration


interface Bundle-Ether1.1
 ipv4 address 10.1.1.1 255.255.255.240
 local-proxy-arp
 encapsulation dot1q 12
 ipv4 access-group aa-acl ingress

interface Bundle-Ether2.1
 vrf aa
 ipv4 address 10.1.1.2 255.255.255.240
 local-proxy-arp
 encapsulation dot1q 12

interface FortyGigE0/0/0/46
 bundle id 1 mode on

interface FortyGigE0/0/0/47
 bundle id 2 mode on

ipv4 access-list aa-acl
 1 permit icmp any host 10.1.1.1 echo-reply
 2 permit ipv4 any any nexthop1 ipv4 100.100.2.2
 10 permit tcp any eq bgp any
 20 permit tcp any any eq bgp

router bgp 100
 bgp router-id 10.10.10.10
 bgp log neighbor changes detail
 address-family ipv4 unicast
  maximum-paths ebgp 64
  maximum-paths ibgp 64
 !
 address-family vpnv4 unicast
 !
 vrf aa
  rd auto
  address-family ipv4 unicast
  !
  neighbor 10.1.1.1
   remote-as 200
   ebgp-multihop 4
   address-family ipv4 unicast
    send-community-ebgp
    route-policy pass-all in
    route-policy pass-all out

interface Loopback1001
 ipv4 address 10.10.10.10 255.255.255.255

interface Loopback1002
 vrf aa
 ipv4 address 10.10.10.10 255.255.255.255

class-map type traffic match-all aa
 match protocol gre
 match destination-address ipv4 10.10.10.10 255.255.255.255
 end-class-map

policy-map type pbr pmap1
 class type traffic aa
  decapsulate gre
 class type traffic class-default
 end-policy-map
!
vrf-policy
 vrf default address-family ipv4 policy type pbr input pmap1

interface tunnel-ip1100
 ipv4 unnumbered Loopback1001
 tunnel mode gre ipv4 encap
 tunnel source Loopback1001
 tunnel destination 200.1.2.1
 logging events link-status

Verification

Verify the configuration of black box monitoring.

Router# show bgp vrf aa neighbors
BGP neighbor is 10.1.1.1, vrf aa
 Remote AS 200, local AS 100, external link
 Remote router ID 200.1.2.1
  BGP state = Established, up for 00:12:35
  NSR State: None
  Last read 00:00:30, Last read before reset 00:00:00
  Hold time is 180, keepalive interval is 60 seconds
  Configured hold time: 180, keepalive: 60, min acceptable hold time: 3
  Last write 00:00:30, attempted 19, written 19
  Second last write 00:01:30, attempted 19, written 19
  Last write before reset 00:00:00, attempted 0, written 0
  Second last write before reset 00:00:00, attempted 0, written 0
  Last write pulse rcvd  Sep 29 05:50:49.983 last full not set pulse count 30
  Last write pulse rcvd before reset 00:00:00
Connections established 1; dropped 0
  Local host: 10.1.1.2, Local port: 52660, IF Handle: 0x00000000
  Foreign host: 10.1.1.1, Foreign port: 179
  Last reset 00:00:00
  External BGP neighbor may be up to 4 hops away.

BGP Labeled Unicast Version 6

Table 8. Feature History Table

Feature Name

Release Information

Feature Description

BGP Labeled Unicast Version 6

Release 7.3.16

This feature extends the BGP Labeled Unicast (LU) functionality over IPv6. This feature provides connectivity between PEs to run services, such as L3VPN and 6VPE. This feature allows the PEs to transport traffic across autonomous system (AS) boundaries.

BGP LU allows you to transport MPLS traffic across IGP boundaries. By advertising loopbacks and label bindings across IGP boundaries, routers can communicate with routers in remote areas that do not share the same local IGP.

Overview of BGP Labeled Unicast

The BGP Labeled Unicast (LU) feature, also known as unified MPLS, provides MPLS transport between Provider Edge (PE) routers that are separated by either many IGP boundaries (intra-AS) or by many autonomous systems (inter-AS). BGP LU advertises the loopback prefixes of the PEs and their MPLS label bindings, using iBGP between area border routers (ABRs) and eBGP between autonomous system border routers (ASBRs). If the PEs are in different autonomous systems (ASes), you can use multihop eBGP between them to exchange the VPN routes. You can run 6PE and other services between the PEs that have BGP LU connectivity.

The BGP LU feature lowers the IGP labeled prefix scale and adjacency scale values. To prevent these scale values from being lowered on routers that are not configured with BGP LU, the feature must be enabled explicitly: you must configure the hw-module profile cef bgplu enable command before you enable the BGP LU feature. Reload the router for the hw-module command configuration to take effect.

The BGP Labeled Unicast Version 6 (BGP LU v6) feature extends the BGP Labeled Unicast (LU) functionality over IPv6.

Restrictions

  • 6VPE over BGP LU feature is not supported.

  • Inter-AFI is not supported.

  • BGP PIC core feature is not supported.

  • Coexistence of 6PE with the same neighbor is not supported.

  • Coexistence of BGP LU version 6 with the IPv6 unicast address family is not supported.

  • VPNV6 over BGP LU v6 is not supported.

  • Link-local addresses are not supported.

  • Rewrite cases, in which BGP LU is itself the transport, are not supported.

  • Carrier Supporting Carrier Version 6 is not supported.

  • Inter-AS Option-C with BGP LU Version 6 is not supported.

Configure BGP Labeled Unicast Version 6


Router(config)# hw-module profile cef bgplu enable
Router(config)# router bgp 1
Router(config-bgp)# bgp router-id 2001:DB8::1
Router(config-bgp)# address-family ipv6 unicast
Router(config-bgp-af)# redistribute connected route-policy set-lbl-idx
Router(config-bgp-af)# allocate-label all
Router(config-bgp-af)# exit
Router(config-bgp)# neighbor 2001:DB8::2
Router(config-bgp-nbr)# remote-as 1
Router(config-bgp-nbr)# update-source Loopback 0
Router(config-bgp-nbr)# address-family ipv6 labeled-unicast
Router(config-bgp-nbr-af)# route-policy pass-all in
Router(config-bgp-nbr-af)# route-policy pass-all out
Router(config-bgp-nbr-af)# commit

Note


Reload the router for the hw-module profile cef bgplu enable command to take effect.

Running Configuration

hw-module profile cef bgplu enable
router bgp 1
 bgp router-id 2001:DB8::1
 address-family ipv6 unicast
  redistribute connected route-policy set-lbl-idx
  allocate-label all
  exit
 neighbor 2001:DB8::2
  remote-as 1
  update-source Loopback 0
  address-family ipv6 labeled-unicast
   route-policy pass-all in
   route-policy pass-all out

Verification

Verify that the BGP LU has been configured.

Router# show hw-module profile cef
Thu Jun 17 00:06:32.974 UTC
------------------------------------------------------------------------------------ 
Knob                        Status        Applied   Action         
------------------------------------------------------------------------------------
BGPLU                      Configured      Yes       None    
LPTS ACL                   Unconfigured    Yes       None           
Dark Bandwidth             Unconfigured    Yes       None           
MPLS Per Path Stats        Unconfigured    Yes       None           
Tunnel TTL Decrement       Unconfigured    Yes       None           
High-Scale No-LDP-Over-TE  Unconfigured    Yes       None           
IPv6 Hop-limit Punt        Unconfigured    Yes       None           
IP Redirect Punt           Unconfigured    Yes       None           

Verify the details of route paths along with the BGP and transport label information.

Router# show cef ipv6 192:168:9::80/128
Wed Jun 16 07:42:04.789 UTC
192:168:9::80/128, version 27, internal 0x5000001 0x40 (ptr 0x93f2d478) [1], 0x0 (0x93ef6cc0), 0xa08 (0x9460a8a8)
 Updated Jun 16 07:36:00.189
 Prefix Len 128, traffic index 0, precedence n/a, priority 4, encap-id 0x1001000000001
   via 10:0:1::51/128, 3 dependencies, recursive [flags 0x6000]
    path-idx 0 NHID 0x0 [0x94720660 0x0]
    recursion-via-/128
    next hop 10:0:1::51/128 via 16061/0/21
     next hop fe80::7af8:c2ff:fee4:20c0/128 Hu0/0/0/27   labels imposed {16061 25001}
/*
16061 - Transport Label
25001 - BGP Label
*/

Verify the BGP LU version 6 routes and the BGP label information in the BGP process.

Router# show bgp ipv6 unicast labels
Wed Jun 16 07:34:58.968 UTC
BGP router identifier 10.0.1.50, local AS number 1
BGP generic scan interval 60 secs
Non-stop routing is enabled
BGP table state: Active
Table ID: 0xe0800000   RD version: 6
BGP main routing table version 6
BGP NSR Initial initsync version 3 (Reached)
BGP NSR/ISSU Sync-Group versions 0/0
BGP scan interval 60 secs

Status codes: s suppressed, d damped, h history, * valid, > best
              i - internal, r RIB-failure, S stale, N Nexthop-discard
Origin codes: i - IGP, e - EGP, ? - incomplete
   Network             Next Hop        Rcvd Label      Local Label
*> 192:168::/64        192:168:1::70   nolabel         24006
*>i192:168:9::80/128   10:0:1::51      25001           nolabel

Processed 2 prefixes, 2 paths

BGP Default Limits

Table 9. Feature History Table

Feature Name

Release Information

Feature Description

Support for Increased Number of BGP Peers

Release 7.3.1

This feature is now enhanced to support 750 IPv4 and 750 IPv6 BGP peers.

BGP imposes maximum limits on the number of neighbors that can be configured on the router and on the maximum number of prefixes that are accepted from a peer for a given address family. This limitation safeguards the router from resource depletion caused by misconfiguration, either locally or on the remote neighbor. The following limits apply to BGP configurations:

  • The default maximum number of peers that can be configured is 4000. The default can be changed using the bgp maximum neighbor command. The limit range is 1 to 15000. Any attempt to configure additional peers beyond the maximum limit or set the maximum limit to a number that is less than the number of peers currently configured will fail.

  • To prevent a peer from flooding BGP with advertisements, a limit is placed on the number of prefixes that are accepted from a peer for each supported address family. The default limits can be overridden through configuration of the maximum-prefix limit command for the peer for the appropriate address family. The following default limits are used if the user does not configure the maximum number of prefixes for the address family:
    • IPv4 Unicast: 1048576

    • IPv4 Labeled-unicast: 131072

    • IPv4 Tunnel: 1048576

    • IPv6 Unicast: 524288

    • IPv6 Labeled-unicast: 131072

    • IPv4 Multicast: 131072

    • IPv6 Multicast: 131072

    • IPv4 MVPN: 2097152

    • VPNv4 Unicast: 2097152

    • IPv4 MDT: 131072

    • VPNv6 Unicast: 1048576

    • L2VPN EVPN: 2097152

    A cease notification message is sent to the neighbor and the peering with the neighbor is terminated when the number of prefixes received from the peer for a given address family exceeds the maximum limit (either set by default or configured by the user) for that address family.

    It is possible that the maximum number of prefixes for a neighbor for a given address family has been configured after the peering with the neighbor has been established and a certain number of prefixes have already been received from the neighbor for that address family. A cease notification message is sent to the neighbor and peering with the neighbor is terminated immediately after the configuration if the configured maximum number of prefixes is fewer than the number of prefixes that have already been received from the neighbor for the address family.
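The prefix-limit behavior described above can be sketched in Python. This is an illustrative model only, not router code: the names DEFAULT_MAX_PREFIXES and check_prefix_limit and the return strings "ok" and "cease" are hypothetical, and the default values are copied from the list above.

```python
from typing import Optional

# Default per-address-family prefix limits, copied from the list above.
DEFAULT_MAX_PREFIXES = {
    "ipv4 unicast": 1048576,
    "ipv4 labeled-unicast": 131072,
    "ipv4 tunnel": 1048576,
    "ipv6 unicast": 524288,
    "ipv6 labeled-unicast": 131072,
    "ipv4 multicast": 131072,
    "ipv6 multicast": 131072,
    "ipv4 mvpn": 2097152,
    "vpnv4 unicast": 2097152,
    "ipv4 mdt": 131072,
    "vpnv6 unicast": 1048576,
    "l2vpn evpn": 2097152,
}


def check_prefix_limit(address_family: str, received: int,
                       configured_limit: Optional[int] = None) -> str:
    """Return "cease" if the peer exceeds its prefix limit, otherwise "ok".

    A configured limit (hypothetical parameter) overrides the default for
    the address family, mirroring the maximum-prefix override described
    in the text.
    """
    if configured_limit is not None:
        limit = configured_limit
    else:
        limit = DEFAULT_MAX_PREFIXES[address_family]
    # The session is terminated only when the count exceeds the limit.
    return "cease" if received > limit else "ok"
```

The same check covers the case where a lower limit is configured after the session is established: calling the function with the already-received count and the new limit returns "cease" if the count is above the new limit.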

BGP Next Hop Tracking

Table 10. Feature History Table

Feature Name

Release Information

Feature Description

BGP Next Hop Tracking

Release 24.3.1

Introduced in this release on: Fixed Systems (8200, 8700); Modular Systems (8800 [LC ASIC: P100]) (select variants only*)

The BGP next-hop tracking feature allows refined route resolution, avoiding aggregate routes and oscillation risks by filtering based on prefix length and source protocols, configurable through the nexthop trigger-delay and nexthop route-policy commands.

* The BGP next hop tracking functionality is now extended to:

  • 8212-48FH-M

  • 8711-32FH-M

  • 88-LC1-52Y8H-EM

  • 88-LC1-12TH24FH-E

BGP receives notifications from the Routing Information Base (RIB) when next-hop information changes (event-driven notifications). BGP obtains next-hop information from the RIB to:

  • Determine whether a next hop is reachable.

  • Find the fully recursed IGP metric to the next hop (used in the best-path calculation).

  • Validate the received next hops.

  • Calculate the outgoing next hops.

  • Verify the reachability and connectedness of neighbors.

BGP is notified when any of the following events occurs:

  • Next hop becomes unreachable

  • Next hop becomes reachable

  • Fully recursed IGP metric to the next hop changes

  • First hop IP address or first hop interface change

  • Next hop becomes connected

  • Next hop becomes unconnected

  • Next hop becomes a local address

  • Next hop becomes a nonlocal address


Note


Reachability and recursed metric events trigger a best-path recalculation.


Event notifications from the RIB are classified as critical and noncritical. Notifications for critical and noncritical events are sent in separate batches. However, a noncritical event is sent along with the critical events if the noncritical event is pending and there is a request to read the critical events.

  • Critical events are related to the reachability (reachable and unreachable), connectivity (connected and unconnected), and locality (local and nonlocal) of the next hops. Notifications for these events are not delayed.

  • Noncritical events include only the IGP metric changes. These events are sent at an interval of 3 seconds. A metric change event is batched and sent 3 seconds after the last one was sent.

You can use the nexthop trigger-delay command to specify a minimum batching interval for critical and noncritical events. The trigger delay is address family dependent.
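The classification of next-hop events, and the way pending noncritical events are sent along with critical ones, can be modeled with a short Python sketch. The event names, the classify function, and the read_critical_batch function are hypothetical labels chosen for illustration; timers are deliberately left out, and the first-hop change event is omitted because the text does not classify it.

```python
# Event classes as described in the text: reachability, connectivity, and
# locality changes are critical; IGP metric changes are noncritical.
CRITICAL_EVENTS = {
    "reachable", "unreachable",
    "connected", "unconnected",
    "local", "nonlocal",
}
NONCRITICAL_EVENTS = {"metric-change"}


def classify(event: str) -> str:
    """Return "critical" or "noncritical" for a next-hop event name."""
    if event in CRITICAL_EVENTS:
        return "critical"
    if event in NONCRITICAL_EVENTS:
        return "noncritical"
    raise ValueError(f"unclassified event: {event}")


def read_critical_batch(pending_critical: list, pending_noncritical: list) -> list:
    """Model a request to read critical events.

    Any noncritical events that are already pending are delivered in the
    same batch; both pending queues are emptied.
    """
    batch = list(pending_critical) + list(pending_noncritical)
    pending_critical.clear()
    pending_noncritical.clear()
    return batch
```

For example, if "unreachable" is pending as a critical event and "metric-change" is pending as a noncritical event, a single read returns both and leaves nothing queued.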

The BGP next-hop tracking feature allows you to specify that BGP routes are resolved using only next hops whose routes have the following characteristics:

  • To avoid the aggregate routes, the prefix length must be greater than a specified value.

  • The source protocol must be from a selected list, ensuring that BGP routes are not used to resolve next hops that could lead to oscillation.

This route policy filtering is possible because the RIB identifies the source protocol of the route that resolved a next hop, as well as the mask length associated with that route. Use the nexthop route-policy command to specify the route policy.
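The two filtering criteria can be expressed as a small Python predicate. This is a sketch of the decision only, under the assumption that the RIB reports the resolving route as a prefix plus a source-protocol name; the ResolvingRoute class, the usable_for_resolution function, and the protocol strings are hypothetical.

```python
import ipaddress
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class ResolvingRoute:
    """The RIB route that resolves a BGP next hop (hypothetical model)."""
    prefix: str      # for example "10.1.1.1/32"
    protocol: str    # for example "ospf", "static", "connected", "bgp"


def usable_for_resolution(route: ResolvingRoute, min_prefix_len: int,
                          allowed_protocols: Set[str]) -> bool:
    """Accept the route only if it passes both criteria from the text."""
    network = ipaddress.ip_network(route.prefix)
    # Criterion 1: the prefix length must be greater than the specified
    # value, which excludes short aggregate routes.
    long_enough = network.prefixlen > min_prefix_len
    # Criterion 2: the source protocol must be in the selected list, which
    # keeps BGP routes from resolving BGP next hops.
    trusted_source = route.protocol in allowed_protocols
    return long_enough and trusted_source
```

With a minimum length of 16 and the allowed set {"ospf", "static", "connected"}, a /32 learned from OSPF is accepted, while a /8 aggregate or a /32 learned from BGP is rejected.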

Next Hop as the IPv6 Address of Peering Interface

BGP can carry IPv6 prefixes over an IPv4 session. The next hop for the IPv6 prefixes can be set through a nexthop policy. If the policy is not configured, the next hop is set to the IPv6 address of the peering interface (the IPv6 neighbor interface or the IPv6 update source interface, if either is configured).

If the nexthop policy is not configured and neither the IPv6 neighbor interface nor the IPv6 update source interface is configured, the next hop is the IPv4 mapped IPv6 address.
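The selection order described above can be summarized in a Python sketch that uses the standard ipaddress module. The function name select_ipv6_next_hop and its parameters are hypothetical, and the sketch assumes that the IPv4 address being mapped is an address of the IPv4 session; the text does not state which IPv4 address is used.

```python
import ipaddress
from typing import Optional


def select_ipv6_next_hop(policy_next_hop: Optional[str],
                         peering_interface_ipv6: Optional[str],
                         session_ipv4: str) -> ipaddress.IPv6Address:
    """Pick the next hop for IPv6 prefixes carried over an IPv4 session."""
    # 1. A next hop set through the nexthop policy takes precedence.
    if policy_next_hop is not None:
        return ipaddress.IPv6Address(policy_next_hop)
    # 2. Otherwise use the IPv6 address of the peering interface, if one
    #    is configured.
    if peering_interface_ipv6 is not None:
        return ipaddress.IPv6Address(peering_interface_ipv6)
    # 3. Otherwise fall back to the IPv4-mapped IPv6 address (::ffff:a.b.c.d).
    #    Assumption: the mapped address is derived from session_ipv4.
    ipv4 = ipaddress.IPv4Address(session_ipv4)
    return ipaddress.IPv6Address("::ffff:" + str(ipv4))
```

The ipv4_mapped attribute of the returned address recovers the embedded IPv4 address in the fallback case and is None in the other two cases shown here.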

IPv6 Multiprotocol BGP Peering Using a Global Address

When all ECMP links except one are shut down, the next hop changes from the global address to the link-local address, which causes transient traffic loss on all flows for a few seconds.

To avoid traffic loss on the undisturbed path, configure the set next-hop ipv6-global command under the BGP table-policy.

For directly connected eBGP neighbors, BGP installs the global IPv6 address as the next hop for multipath routes, and installs the link-local address and interface handle (ifhandle) for single-path routes. You can configure the set next-hop ipv6-global command under the BGP table-policy as follows to set the global IPv6 address as the next hop:


route-policy RESILIENT-HASH-V6
  if destination in (1000:1000::/32 le 128) or destination in (2000:1000::/32 le 128) then
    set load-balance ecmp-consistent
    set next-hop ipv6-global
    pass
  endif
  pass
end-policy

Scoped IPv4 Table Walk

When a next-hop notification is received, BGP determines which address family to process by first dereferencing the gateway context associated with the next hop, and then looking into the gateway context to determine which address families are using it. The IPv4 unicast address families share the same gateway context, because they are registered with the IPv4 unicast table in the RIB. As a result, the global IPv4 unicast table is processed when an IPv4 unicast next-hop notification is received from the RIB. A mask is maintained in the next hop, indicating that the next hop belongs to IPv4 unicast. This scoped table walk localizes the processing in the appropriate address family table.

Reordered Address Family Processing

The software walks address family tables based on the numeric value of the address family. When a next-hop notification batch is received, however, the address families are processed in the following order:

  • IPv4 tunnel

  • VPNv4 unicast

  • IPv4 labeled unicast

  • IPv4 unicast

  • IPv4 multicast

  • IPv6 unicast
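The reordering can be illustrated with a short Python sketch that sorts a set of address families into the processing order listed above. The names NEXT_HOP_PROCESSING_ORDER and order_for_next_hop_batch are hypothetical, and placing unlisted address families last is an assumption made for this sketch, not documented behavior.

```python
from typing import Iterable, List

# Processing order for a next-hop notification batch, as listed above.
NEXT_HOP_PROCESSING_ORDER = [
    "ipv4 tunnel",
    "vpnv4 unicast",
    "ipv4 labeled unicast",
    "ipv4 unicast",
    "ipv4 multicast",
    "ipv6 unicast",
]


def order_for_next_hop_batch(address_families: Iterable[str]) -> List[str]:
    """Sort address families into the next-hop processing order."""
    rank = {af: index for index, af in enumerate(NEXT_HOP_PROCESSING_ORDER)}
    # Assumption: families that are not in the list are processed last.
    # sorted() is stable, so they keep their relative input order.
    return sorted(address_families, key=lambda af: rank.get(af, len(rank)))
```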

New Thread for Next-Hop Processing

The critical-event thread in the spkr process handles only next-hop, Bidirectional Forwarding Detection (BFD), and fast-external-failover (FEF) notifications. This critical-event thread ensures that BGP convergence is not adversely impacted by other events that may take a significant amount of time.

show, clear, and debug Commands

The show bgp nexthops command provides statistical information about next-hop notifications, the amount of time spent in processing those notifications, and details about each next hop registered with the RIB. The clear bgp nexthop performance-statistics command clears the cumulative processing statistics reported by the show bgp nexthops command, which helps in monitoring. The clear bgp nexthop registration command performs an asynchronous registration of the next hop with the RIB.

The debug bgp nexthop command displays information on next-hop processing. The in keyword displays debug information about next-hop notifications received from the RIB. The out keyword displays debug information only about what BGP sends to the RIB, that is, the registration of next hops with the RIB.

BGP Configuration

BGP in Cisco IOS XR software follows a neighbor-based configuration model that requires that all configurations for a particular neighbor be grouped in one place under the neighbor configuration. Peer groups are not supported for either sharing configuration between neighbors or for sharing update messages. The concept of peer group has been replaced by a set of configuration groups to be used as templates in BGP configuration and automatically generated update groups to share update messages between neighbors.

Configuration Modes

BGP configurations are grouped into modes. The following sections show how to enter some of the BGP configuration modes. From a mode, you can enter the ? command to display the commands available in that mode.

Router Configuration Mode

The following example shows how to enter router configuration mode:


  Router# configure
  Router(config)# router bgp 140
  Router(config-bgp)# 
  
Router Address Family Configuration Mode

The following example shows how to enter router address family configuration mode:


  Router(config)# router bgp 112
  Router(config-bgp)# address-family ipv4 unicast
  Router(config-bgp-af)#
  
Neighbor Configuration Mode

The following example shows how to enter neighbor configuration mode:


  Router(config)# router bgp 140
  Router(config-bgp)# neighbor 10.0.0.1
  Router(config-bgp-nbr)#
  
VRF Configuration Mode

The following example shows how to enter VPN routing and forwarding (VRF) configuration mode:


  Router(config)# router bgp 140
  Router(config-bgp)# vrf vrf_A
  Router(config-bgp-vrf)#
  
VRF Neighbor Configuration Mode

The following example shows how to enter VRF neighbor configuration mode:


  Router(config)# router bgp 140
  Router(config-bgp)# vrf vrf_A
  Router(config-bgp-vrf)# neighbor 11.0.1.2
  Router(config-bgp-vrf-nbr)# 
  
VRF Neighbor Address Family Configuration Mode

The following example shows how to enter VRF neighbor address family configuration mode:


  Router(config)# router bgp 112
  Router(config-bgp)# vrf vrf_A
  Router(config-bgp-vrf)# neighbor 11.0.1.2
  Router(config-bgp-vrf-nbr)# address-family ipv4 unicast
  Router(config-bgp-vrf-nbr-af)#
  
VPNv6 Address Family Configuration Mode

The following example shows how to enter VPNv6 address family configuration mode:


  Router(config)# router bgp 150
  Router(config-bgp)# address-family vpnv6 unicast
  Router(config-bgp-af)#
  
L2VPN Address Family Configuration Mode

The following example shows how to enter L2VPN address family configuration mode:


  Router(config)# router bgp 100
  Router(config-bgp)# address-family l2vpn vpls-vpws
  Router(config-bgp-af)#
  

Neighbor Submode 

Cisco IOS XR BGP uses a neighbor submode to make it possible to enter configurations without having to prefix every configuration with the neighbor keyword and the neighbor address:

  • In the neighbor submode, commands do not need a “neighbor x.x.x.x” prefix. The configuration is as follows:

    
      Router(config-bgp)# neighbor 192.23.1.2
      Router(config-bgp-nbr)# remote-as 2002
      Router(config-bgp-nbr)# address-family ipv4 unicast
  • An address family configuration submode inside the neighbor configuration submode is available for entering address family-specific neighbor configurations. In the Cisco IOS XR software, the configuration is as follows:

    
      Router(config-bgp)# neighbor 2002::2
      Router(config-bgp-nbr)# remote-as 2023
      Router(config-bgp-nbr)# address-family ipv6 unicast
      Router(config-bgp-nbr-af)# next-hop-self
      Router(config-bgp-nbr-af)# route-policy one in

Configuration Templates

The af-group, session-group, and neighbor-group configuration commands provide template support for the neighbor configuration in Cisco IOS XR software.

The af-group command is used to group address family-specific neighbor commands within an IPv4 or IPv6 address family. Neighbors that have the same address family configuration can use the address family group (af-group) name for their address family-specific configuration. A neighbor inherits the configuration from an address family group by way of the use command. If a neighbor is configured to use an address family group, the neighbor (by default) inherits the entire configuration from the address family group. However, a neighbor does not inherit all of the configuration from the address family group if items are explicitly configured for the neighbor. The address family group configuration is entered under the BGP router configuration mode. The following example shows how to enter address family group configuration mode:


Router(config)# router bgp 140
Router(config-bgp)# af-group afmcast1 address-family ipv4 unicast
Router(config-bgp-afgrp)#
    

The session-group command allows you to create a session group from which neighbors can inherit address family-independent configuration. A neighbor inherits the configuration from a session group by way of the use command. If a neighbor is configured to use a session group, the neighbor (by default) inherits the entire configuration of the session group. A neighbor does not inherit all of the configuration from a session group if a configuration is done directly on that neighbor. The following example shows how to enter session group configuration mode:


  Router(config)# router bgp 140
  Router(config-bgp)# session-group session1
  Router(config-bgp-sngrp)# 
   

The neighbor-group command helps you apply the same configuration to one or more neighbors. Neighbor groups can include session groups and address family groups and can comprise the complete configuration for a neighbor. After a neighbor group is configured, a neighbor can inherit the configuration of the group using the use command. If a neighbor is configured to use a neighbor group, the neighbor inherits the entire BGP configuration of the neighbor group.

The following example shows how to enter neighbor group configuration mode:


  Router(config)# router bgp 123
  Router(config-bgp)# neighbor-group nbrgroup1
  Router(config-bgp-nbrgrp)#
    

The following example shows how to enter neighbor group address family configuration mode:


 Router(config)# router bgp 140
 Router(config-bgp)# neighbor-group nbrgroup1
 Router(config-bgp-nbrgrp)# address-family ipv4 unicast
 Router(config-bgp-nbrgrp-af)#
    
A neighbor does not inherit all of the configuration from the neighbor group if items are explicitly configured for the neighbor. In addition, some part of the configuration of the neighbor group could be hidden if a session group or address family group is also being used.

Configuration grouping has the following effects in Cisco IOS XR software:

  • Commands entered at the session group level define address family-independent commands (the same commands as in the neighbor submode).

  • Commands entered at the address family group level define address family-dependent commands for a specified address family (the same commands as in the neighbor-address family configuration submode).

  • Commands entered at the neighbor group level define address family-independent commands and address family-dependent commands for each address family (the same as all available neighbor commands), and define the use command for the address family group and session group commands.

Template Inheritance Rules

In Cisco IOS XR software, BGP neighbors or groups inherit configuration from other configuration groups.

For address family-independent configurations:

  • Neighbors can inherit from session groups and neighbor groups.

  • Neighbor groups can inherit from session groups and other neighbor groups.

  • Session groups can inherit from other session groups.

  • If a neighbor uses a session group and a neighbor group, the configurations in the session group are preferred over the global address family configurations in the neighbor group.

For address family-dependent configurations:

  • Address family groups can inherit from other address family groups.

  • Neighbor groups can inherit from address family groups and other neighbor groups.

  • Neighbors can inherit from address family groups and neighbor groups.

Configuration group inheritance rules are numbered in order of precedence as follows:

  1. If the item is configured directly on the neighbor, that value is used. In the example that follows, the advertisement interval is configured both on the neighbor group and neighbor configuration and the advertisement interval being used is from the neighbor configuration:

    
      Router(config)# router bgp 140
      Router(config-bgp)# neighbor-group AS_1
      Router(config-bgp-nbrgrp)# advertisement-interval 15
      Router(config-bgp-nbrgrp)# exit
      Router(config-bgp)# neighbor 10.1.1.1
      Router(config-bgp-nbr)# remote-as 1
      Router(config-bgp-nbr)# use neighbor-group AS_1
      Router(config-bgp-nbr)# advertisement-interval 20
    
    

    The following output from the show bgp neighbors command shows that the advertisement interval used is 20 seconds:

    
      Router# show bgp neighbors 10.1.1.1
      
      BGP neighbor is 10.1.1.1, remote AS 1, local AS 140, external link
       Remote router ID 0.0.0.0
        BGP state = Idle
        Last read 00:00:00, hold time is 180, keepalive interval is 60 seconds
        Received 0 messages, 0 notifications, 0 in queue
        Sent 0 messages, 0 notifications, 0 in queue
        Minimum time between advertisement runs is 20 seconds
      
       For Address Family: IPv4 Unicast
        BGP neighbor version 0
        Update group: 0.1
        eBGP neighbor with no inbound or outbound policy; defaults to 'drop'
        Route refresh request: received 0, sent 0
        0 accepted prefixes
        Prefix advertised 0, suppressed 0, withdrawn 0, maximum limit 524288
        Threshold for warning message 75%
      
        Connections established 0; dropped 0
        Last reset 00:00:14, due to BGP neighbor initialized
        External BGP neighbor not directly connected.
      
  2. Otherwise, if an item is configured both in a session-group or neighbor-group that the neighbor inherits from and directly on the neighbor, the configuration on the neighbor is used. If the neighbor inherits from a session-group or af-group and has no directly configured value, the value in the session-group or af-group is used. In the example that follows, the advertisement interval is configured on a neighbor group and a session group and the advertisement interval value being used is from the session group:

    
      Router(config)# router bgp 140
      Router(config-bgp)# session-group AS_2
      Router(config-bgp-sngrp)# advertisement-interval 15
      Router(config-bgp-sngrp)# exit
      Router(config-bgp)# neighbor-group AS_1
      Router(config-bgp-nbrgrp)# advertisement-interval 20
      Router(config-bgp-nbrgrp)# exit
      Router(config-bgp)# neighbor 192.168.0.1
      Router(config-bgp-nbr)# remote-as 1
      Router(config-bgp-nbr)# use session-group AS_2
      Router(config-bgp-nbr)# use neighbor-group AS_1
    
    The following output from the show bgp neighbors command shows that the advertisement interval used is 15 seconds:
    
      Router# show bgp neighbors 192.168.0.1
      
      BGP neighbor is 192.168.0.1, remote AS 1, local AS 140, external link
       Remote router ID 0.0.0.0
        BGP state = Idle
        Last read 00:00:00, hold time is 180, keepalive interval is 60 seconds
        Received 0 messages, 0 notifications, 0 in queue
        Sent 0 messages, 0 notifications, 0 in queue
        Minimum time between advertisement runs is 15 seconds
      
       For Address Family: IPv4 Unicast
        BGP neighbor version 0
        Update group: 0.1
        eBGP neighbor with no inbound or outbound policy; defaults to 'drop'
        Route refresh request: received 0, sent 0
        0 accepted prefixes
        Prefix advertised 0, suppressed 0, withdrawn 0, maximum limit 524288
        Threshold for warning message 75%
      
        Connections established 0; dropped 0
        Last reset 00:03:23, due to BGP neighbor initialized
        External BGP neighbor not directly connected.
      
  3. Otherwise, if the neighbor uses a neighbor group and does not use a session group or address family group, the configuration value can be obtained from the neighbor group either directly or through inheritance. In the example that follows, the advertisement interval from the neighbor group is used because it is not configured directly on the neighbor and no session group is used:

    
      Router(config)# router bgp 140
      Router(config-bgp)# session-group AS_2
      Router(config-bgp-sngrp)# advertisement-interval 20
      Router(config-bgp-sngrp)# exit
      Router(config-bgp)# neighbor-group AS_1
      Router(config-bgp-nbrgrp)# advertisement-interval 15
      Router(config-bgp-nbrgrp)# exit
      Router(config-bgp)# neighbor 192.168.1.1
      Router(config-bgp-nbr)# remote-as 1
      Router(config-bgp-nbr)# use neighbor-group AS_1
       
    The following output from the show bgp neighbors command shows that the advertisement interval used is 15 seconds:
    
       Router# show bgp neighbors 192.168.1.1
      
      BGP neighbor is 192.168.1.1, remote AS 1, local AS 140, external link
       Remote router ID 0.0.0.0
        BGP state = Idle
        Last read 00:00:00, hold time is 180, keepalive interval is 60 seconds
        Received 0 messages, 0 notifications, 0 in queue
        Sent 0 messages, 0 notifications, 0 in queue
        Minimum time between advertisement runs is 15 seconds
      
       For Address Family: IPv4 Unicast
        BGP neighbor version 0
        Update group: 0.1
        eBGP neighbor with no inbound or outbound policy; defaults to 'drop'
        Route refresh request: received 0, sent 0
        0 accepted prefixes
        Prefix advertised 0, suppressed 0, withdrawn 0, maximum limit 524288
        Threshold for warning message 75%
      
        Connections established 0; dropped 0
        Last reset 00:01:14, due to BGP neighbor initialized
        External BGP neighbor not directly connected.
      
    To illustrate the same rule, the following example shows how to set the advertisement interval to 15 (from the session group) and 25 (from the neighbor group). The advertisement interval set in the session group overrides the one set in the neighbor group. The inbound policy is set to POLICY_1 from the neighbor group.
    
      Router(config)# router bgp 140
      Router(config-bgp)# session-group ADV
      Router(config-bgp-sngrp)# advertisement-interval 15
      Router(config-bgp-sngrp)# exit
      Router(config-bgp)# neighbor-group ADV_2
      Router(config-bgp-nbrgrp)# advertisement-interval 25
      Router(config-bgp-nbrgrp)# address-family ipv4 unicast
      Router(config-bgp-nbrgrp-af)# route-policy POLICY_1 in
      Router(config-bgp-nbrgrp-af)# exit
      Router(config-bgp-nbrgrp)# exit
      Router(config-bgp)# neighbor 192.168.2.2
      Router(config-bgp-nbr)# remote-as 1
      Router(config-bgp-nbr)# use session-group ADV
      Router(config-bgp-nbr)# use neighbor-group ADV_2
      
    The following output from the show bgp neighbors command shows that the advertisement interval used is 15 seconds:
    
      Router# show bgp neighbors 192.168.2.2
      
      BGP neighbor is 192.168.2.2, remote AS 1, local AS 140, external link
       Remote router ID 0.0.0.0
        BGP state = Idle
        Last read 00:00:00, hold time is 180, keepalive interval is 60 seconds
        Received 0 messages, 0 notifications, 0 in queue
        Sent 0 messages, 0 notifications, 0 in queue
        Minimum time between advertisement runs is 15 seconds
      
       For Address Family: IPv4 Unicast
        BGP neighbor version 0
        Update group: 0.1
        eBGP neighbor with no outbound policy; defaults to 'drop'
        Route refresh request: received 0, sent 0
        Inbound path policy configured
        Policy for incoming advertisements is POLICY_1
        0 accepted prefixes
        Prefix advertised 0, suppressed 0, withdrawn 0, maximum limit 524288
        Threshold for warning message 75%
      
        Connections established 0; dropped 0
        Last reset 00:02:03, due to BGP neighbor initialized
        External BGP neighbor not directly connected.
      
  4. Otherwise, the default value is used. In the example that follows, neighbor 10.0.101.5 has the minimum time between advertisement runs set to 30 seconds (default) because the advertisement interval is configured neither directly on the neighbor nor in the neighbor group that it uses (AS_1):

    
      Router(config)# router bgp 140
      Router(config-bgp)# neighbor-group AS_1
      Router(config-bgp-nbrgrp)# remote-as 1
      Router(config-bgp-nbrgrp)# exit
      Router(config-bgp)# neighbor-group adv_15
      Router(config-bgp-nbrgrp)# remote-as 10
      Router(config-bgp-nbrgrp)# advertisement-interval 15
      Router(config-bgp-nbrgrp)# exit
      Router(config-bgp)# neighbor 10.0.101.5
      Router(config-bgp-nbr)# use neighbor-group AS_1
      Router(config-bgp-nbr)# exit
      Router(config-bgp)# neighbor 10.0.101.10
      Router(config-bgp-nbr)# use neighbor-group adv_15
      
    The following output from the show bgp neighbors command shows that the advertisement interval used is 30 seconds:
    
      Router# show bgp neighbors 10.0.101.5
      
      BGP neighbor is 10.0.101.5, remote AS 1, local AS 140, external link
       Remote router ID 0.0.0.0
        BGP state = Idle
        Last read 00:00:00, hold time is 180, keepalive interval is 60 seconds
        Received 0 messages, 0 notifications, 0 in queue
        Sent 0 messages, 0 notifications, 0 in queue
        Minimum time between advertisement runs is 30 seconds
      
       For Address Family: IPv4 Unicast
        BGP neighbor version 0
        Update group: 0.2
        eBGP neighbor with no inbound or outbound policy; defaults to 'drop'
        Route refresh request: received 0, sent 0
        0 accepted prefixes
        Prefix advertised 0, suppressed 0, withdrawn 0, maximum limit 524288
        Threshold for warning message 75%
      
        Connections established 0; dropped 0
        Last reset 00:00:25, due to BGP neighbor initialized
        External BGP neighbor not directly connected.
      

The inheritance rules used when groups are inheriting configuration from other groups are the same as the rules given for neighbors inheriting from groups.
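
The precedence illustrated by the two examples above can be sketched in Python. This is an illustrative model, not router code: the function name and parameters are hypothetical, and the first rule (a value configured directly on the neighbor takes precedence) is assumed from the rules that precede these examples. The 30-second default is the value shown in the sample output for an eBGP neighbor.

```python
from typing import Optional

# Default taken from the sample output above (eBGP neighbor); an assumption
# for any other neighbor type.
DEFAULT_ADVERTISEMENT_INTERVAL = 30  # seconds


def effective_advertisement_interval(
    neighbor_value: Optional[int],
    session_group_value: Optional[int],
    neighbor_group_value: Optional[int],
    default: int = DEFAULT_ADVERTISEMENT_INTERVAL,
) -> int:
    """Return the first configured value in precedence order, else the default."""
    for value in (neighbor_value, session_group_value, neighbor_group_value):
        if value is not None:
            return value
    return default


# Neighbor 192.168.2.2: session group ADV sets 15, neighbor group ADV_2 sets 25.
print(effective_advertisement_interval(None, 15, 25))      # 15
# Neighbor 10.0.101.5: neighbor group AS_1 sets no interval.
print(effective_advertisement_interval(None, None, None))  # 30
```

The session-group value is checked before the neighbor-group value because that is the order the first example demonstrates.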

Viewing Inherited Configurations

You can use the following show commands to view BGP inherited configurations:

show bgp neighbors

Use the show bgp neighbors command to display information about the BGP configuration for neighbors.

  • Use the configuration keyword to display the effective configuration for the neighbor, including any settings that have been inherited from session groups, neighbor groups, or address family groups used by this neighbor.

  • Use the inheritance keyword to display the session groups, neighbor groups, and address family groups from which this neighbor is capable of inheriting configuration.

The show bgp neighbors command examples that follow are based on this sample configuration:


  Router(config)# router bgp 142
  Router(config-bgp)# af-group GROUP_3 address-family ipv4 unicast
  Router(config-bgp-afgrp)# next-hop-self
  Router(config-bgp-afgrp)# route-policy POLICY_1 in
  Router(config-bgp-afgrp)# exit
  Router(config-bgp)# session-group GROUP_2
  Router(config-bgp-sngrp)# advertisement-interval 15
  Router(config-bgp-sngrp)# exit
  Router(config-bgp)# neighbor-group GROUP_1
  Router(config-bgp-nbrgrp)# use session-group GROUP_2
  Router(config-bgp-nbrgrp)# ebgp-multihop 3
  Router(config-bgp-nbrgrp)# address-family ipv4 unicast
  Router(config-bgp-nbrgrp-af)# weight 100
  Router(config-bgp-nbrgrp-af)# send-community-ebgp
  Router(config-bgp-nbrgrp-af)# exit
  Router(config-bgp-nbrgrp)# exit
  Router(config-bgp)# neighbor 192.168.0.1
  Router(config-bgp-nbr)# remote-as 2
  Router(config-bgp-nbr)# use neighbor-group GROUP_1
  Router(config-bgp-nbr)# address-family ipv4 unicast
  Router(config-bgp-nbr-af)# use af-group GROUP_3
  Router(config-bgp-nbr-af)# weight 200
  
show bgp af-group

Use the show bgp af-group command to display address family groups:

  • Use the configuration keyword to display the effective configuration for the address family group, including any settings that have been inherited from address family groups used by this address family group.

  • Use the inheritance keyword to display the address family groups from which this address family group is capable of inheriting configuration.

  • Use the users keyword to display the neighbors, neighbor groups, and address family groups that inherit configuration from this address family group.

The show bgp af-group sample commands that follow are based on this sample configuration:


  Router(config)# router bgp 140
  Router(config-bgp)# af-group GROUP_3 address-family ipv4 unicast
  Router(config-bgp-afgrp)# remove-private-as
  Router(config-bgp-afgrp)# route-policy POLICY_1 in
  Router(config-bgp-afgrp)# exit
  Router(config-bgp)# af-group GROUP_1 address-family ipv4 unicast
  Router(config-bgp-afgrp)# use af-group GROUP_2
  Router(config-bgp-afgrp)# maximum-prefix 2500 75 warning-only
  Router(config-bgp-afgrp)# default-originate
  Router(config-bgp-afgrp)# exit
  Router(config-bgp)# af-group GROUP_2 address-family ipv4 unicast
  Router(config-bgp-afgrp)# use af-group GROUP_3
  Router(config-bgp-afgrp)# send-community-ebgp
  Router(config-bgp-afgrp)# send-extended-community-ebgp
  Router(config-bgp-afgrp)# capability orf prefix both
      

The following example displays sample output from the show bgp af-group command using the configuration keyword. This example shows from where each configuration item was inherited. The default-originate command was configured directly on this address family group (indicated by [ ]). The remove-private-as command was inherited from address family group GROUP_2, which in turn inherited from address family group GROUP_3:


  Router# show bgp af-group GROUP_1 configuration 
  
  af-group GROUP_1 address-family ipv4 unicast
    capability orf prefix-list both           [a:GROUP_2]
    default-originate                         []
    maximum-prefix 2500 75 warning-only       []
    route-policy POLICY_1 in                  [a:GROUP_2 a:GROUP_3]
    remove-private-AS                         [a:GROUP_2 a:GROUP_3]
    send-community-ebgp                       [a:GROUP_2]
    send-extended-community-ebgp              [a:GROUP_2]
  
  

The following example displays sample output from the show bgp af-group command using the users keyword:


  Router# show bgp af-group GROUP_2 users
  
  IPv4 Unicast: a:GROUP_1
  
  

The following example displays sample output from the show bgp af-group command using the inheritance keyword. This shows that the specified address family group GROUP_1 directly uses the GROUP_2 address family group, which in turn uses the GROUP_3 address family group:


  Router# show bgp af-group GROUP_1 inheritance 
  
  IPv4 Unicast: a:GROUP_2 a:GROUP_3
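
The bracketed source tags in the outputs above can be reproduced with a small Python model of the sample af-group configuration. This is a sketch under simplifying assumptions: the dictionary layout and function names are hypothetical, commands are compared as whole strings, and a setting found in a nearer group is assumed to win over the same setting further up the chain.

```python
from typing import Dict, List

# Hypothetical model of the sample configuration: each group records the
# group named in its "use af-group" line and the commands configured on it.
AF_GROUPS: Dict[str, dict] = {
    "GROUP_3": {"use": None,
                "commands": ["remove-private-as", "route-policy POLICY_1 in"]},
    "GROUP_2": {"use": "GROUP_3",
                "commands": ["send-community-ebgp",
                             "send-extended-community-ebgp",
                             "capability orf prefix both"]},
    "GROUP_1": {"use": "GROUP_2",
                "commands": ["maximum-prefix 2500 75 warning-only",
                             "default-originate"]},
}


def inheritance_chain(group: str) -> List[str]:
    """Follow 'use af-group' links outward from the given group."""
    chain: List[str] = []
    seen = {group}
    parent = AF_GROUPS[group]["use"]
    while parent is not None and parent not in seen:  # guard against loops
        chain.append(parent)
        seen.add(parent)
        parent = AF_GROUPS[parent]["use"]
    return chain


def effective_configuration(group: str) -> Dict[str, List[str]]:
    """Map each command to the groups it was inherited through ([] = direct)."""
    effective: Dict[str, List[str]] = {
        cmd: [] for cmd in AF_GROUPS[group]["commands"]
    }
    chain = inheritance_chain(group)
    for depth, ancestor in enumerate(chain):
        for cmd in AF_GROUPS[ancestor]["commands"]:
            # Keep the first (nearest) source found for a command.
            effective.setdefault(cmd, [f"a:{g}" for g in chain[: depth + 1]])
    return effective


print("IPv4 Unicast:", " ".join(f"a:{g}" for g in inheritance_chain("GROUP_1")))
for cmd, source in sorted(effective_configuration("GROUP_1").items()):
    print(f"{cmd:40s}[{' '.join(source)}]")
```

The command strings printed here are the ones entered in the sample configuration, so their spelling can differ from the normalized form that the router displays.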
  
show bgp session-group

Use the show bgp session-group command to display session groups:

  • Use the configuration keyword to display the effective configuration for the session group, including any settings that have been inherited from session groups used by this session group.

  • Use the inheritance keyword to display the session groups from which this session group is capable of inheriting configuration.

  • Use the users keyword to display the session groups, neighbor groups, and neighbors that inherit configuration from this session group.

The output from the show bgp session-group command is based on the following session group configuration:


  Router(config)# router bgp 113
  Router(config-bgp)# session-group GROUP_1
  Router(config-bgp-sngrp)# use session-group GROUP_2
  Router(config-bgp-sngrp)# update-source Loopback 0
  Router(config-bgp-sngrp)# exit
  Router(config-bgp)# session-group GROUP_2
  Router(config-bgp-sngrp)# use session-group GROUP_3
  Router(config-bgp-sngrp)# ebgp-multihop 2
  Router(config-bgp-sngrp)# exit
  Router(config-bgp)# session-group GROUP_3
  Router(config-bgp-sngrp)# dmz-link-bandwidth
  

The following is sample output from the show bgp session-group command with the configuration keyword:


  Router# show bgp session-group GROUP_1 configuration 
  
  session-group GROUP_1
   ebgp-multihop 2         [s:GROUP_2]
   update-source Loopback0 []
   dmz-link-bandwidth      [s:GROUP_2 s:GROUP_3]
  

The following is sample output from the show bgp session-group command with the inheritance keyword showing that the GROUP_1 session group inherits session parameters from the GROUP_3 and GROUP_2 session groups:


  Router# show bgp session-group GROUP_1 inheritance 
  
  Session: s:GROUP_2 s:GROUP_3
  

The following is sample output from the show bgp session-group command with the users keyword showing that both the GROUP_1 and GROUP_2 session groups inherit session parameters from the GROUP_3 session group:


  Router# show bgp session-group GROUP_3 users 
  
  Session: s:GROUP_1 s:GROUP_2
  
show bgp neighbor-group

Use the show bgp neighbor-group command to display neighbor groups:

  • Use the configuration keyword to display the effective configuration for the neighbor group, including any settings that have been inherited from neighbor groups used by this neighbor group.

  • Use the inheritance keyword to display the address family groups, session groups, and neighbor groups from which this neighbor group is capable of inheriting configuration.

  • Use the users keyword to display the neighbors and neighbor groups that inherit configuration from this neighbor group.

The examples are based on the following group configuration:


  Router(config)# router bgp 140
  Router(config-bgp)# af-group GROUP_3 address-family ipv4 unicast
  Router(config-bgp-afgrp)# remove-private-as
  Router(config-bgp-afgrp)# soft-reconfiguration inbound
  Router(config-bgp-afgrp)# exit
  Router(config-bgp)# af-group GROUP_2 address-family ipv4 unicast
  Router(config-bgp-afgrp)# use af-group GROUP_3
  Router(config-bgp-afgrp)# send-community-ebgp
  Router(config-bgp-afgrp)# send-extended-community-ebgp
  Router(config-bgp-afgrp)# capability orf prefix both
  Router(config-bgp-afgrp)# exit
  Router(config-bgp)# session-group GROUP_3
  Router(config-bgp-sngrp)# timers 30 90
  Router(config-bgp-sngrp)# exit
  Router(config-bgp)# neighbor-group GROUP_1
  Router(config-bgp-nbrgrp)# remote-as 1982
  Router(config-bgp-nbrgrp)# use neighbor-group GROUP_2
  Router(config-bgp-nbrgrp)# address-family ipv4 unicast
  Router(config-bgp-nbrgrp-af)# exit
  Router(config-bgp-nbrgrp)# exit
  Router(config-bgp)# neighbor-group GROUP_2
  Router(config-bgp-nbrgrp)# use session-group GROUP_3
  Router(config-bgp-nbrgrp)# address-family ipv4 unicast
  Router(config-bgp-nbrgrp-af)# use af-group GROUP_2
  Router(config-bgp-nbrgrp-af)# weight 100
   

The following is sample output from the show bgp neighbor-group command with the configuration keyword. The source of each configuration setting is shown to the right of the command. In the output that follows, the remote autonomous system is configured directly on neighbor group GROUP_1, and the send community setting is inherited from neighbor group GROUP_2, which in turn inherits the setting from address family group GROUP_2:


  Router# show bgp neighbor-group GROUP_1 configuration 
  
     neighbor-group GROUP_1
      remote-as 1982                   []
      timers 30 90                     [n:GROUP_2 s:GROUP_3]
      address-family ipv4 unicast      []
       capability orf prefix-list both [n:GROUP_2 a:GROUP_2]
       remove-private-AS               [n:GROUP_2 a:GROUP_2 a:GROUP_3]
       send-community-ebgp             [n:GROUP_2 a:GROUP_2]
       send-extended-community-ebgp    [n:GROUP_2 a:GROUP_2]
       soft-reconfiguration inbound    [n:GROUP_2 a:GROUP_2 a:GROUP_3]
       weight 100                      [n:GROUP_2]
  
  

The following is sample output from the show bgp neighbor-group command with the inheritance keyword. This output shows that the specified neighbor group GROUP_1 inherits session (address family-independent) configuration parameters from neighbor group GROUP_2. Neighbor group GROUP_2 inherits its session parameters from session group GROUP_3. It also shows that the GROUP_1 neighbor group inherits IPv4 unicast configuration parameters from the GROUP_2 neighbor group, which in turn inherits them from the GROUP_2 address family group, which itself inherits them from the GROUP_3 address family group:


  Router# show bgp neighbor-group GROUP_1 inheritance 
  
      Session:      n:GROUP_2 s:GROUP_3
      IPv4 Unicast: n:GROUP_2 a:GROUP_2 a:GROUP_3
  
  

The following is sample output from the show bgp neighbor-group command with the users keyword. This output shows that the GROUP_1 neighbor group inherits session (address family-independent) configuration parameters from the GROUP_2 neighbor group. The GROUP_1 neighbor group also inherits IPv4 unicast configuration parameters from the GROUP_2 neighbor group:


  Router# show bgp neighbor-group GROUP_2 users 
  
  Session:      n:GROUP_1
  IPv4 Unicast: n:GROUP_1
  

No Default Address Family

BGP does not support the concept of a default address family. An address family must be explicitly configured under the BGP router configuration for the address family to be activated in BGP. Similarly, an address family must be explicitly configured under a neighbor for the BGP session to be activated under that address family. A neighbor can be configured without any address family being configured at the BGP router configuration level. However, an address family must be configured at the BGP router configuration level before it can be configured under a neighbor.

Neighbor Address Family Combinations

For the default VRF, both the IPv4 Unicast and IPv4 Labeled-unicast address families are supported under the same neighbor.

For a non-default VRF, the IPv4 Unicast and IPv4 Labeled-unicast address families are not supported together under the same neighbor. However, the configuration is accepted on the router with the following error:

bgp[1051]: %ROUTING-BGP-4-INCOMPATIBLE_AFI : IPv4 Unicast and IPv4 Labeled-unicast Address families together are not supported under the same neighbor.

When one BGP session has both the IPv4 unicast and IPv4 labeled-unicast AFI/SAFI, the routing behavior is nondeterministic, and prefixes may not be advertised correctly. Incorrect prefix advertisement results in reachability issues. To avoid such issues, you must explicitly configure a route policy to advertise prefixes either through the IPv4 unicast or through the IPv4 labeled-unicast address family.

Routing Policy Enforcement

External BGP (eBGP) neighbors must have an inbound and outbound policy configured. If no policy is configured, no routes are accepted from the neighbor, nor are any routes advertised to it. This added security measure ensures that routes cannot accidentally be accepted or advertised in the case of a configuration omission error.


Note


This enforcement affects only eBGP neighbors (neighbors in a different autonomous system than this router). For internal BGP (iBGP) neighbors (neighbors in the same autonomous system), all routes are accepted or advertised if there is no policy.


Table Policy

The table policy feature in BGP allows you to configure traffic index values on routes as they are installed in the global routing table. This feature is enabled using the table-policy command and supports the BGP policy accounting feature.

BGP policy accounting uses traffic indices that are set on BGP routes to track various counters.

Table policy also provides the ability to drop routes from the RIB based on match criteria. This feature can be useful in certain applications and should be used with caution as it can easily create a routing ‘black hole’ where BGP advertises routes to neighbors that BGP does not install in its global routing table and forwarding table.

BGP Update Group

When a change to the configuration occurs, the router automatically recalculates update group memberships and applies the changes.

For the best optimization of BGP update group generation, we recommend that the network operator keeps outbound routing policy the same for neighbors that have similar outbound policies. This feature contains commands for monitoring BGP update groups.

BGP Update Generation and Update Groups

The BGP Update Groups feature separates BGP update generation from neighbor configuration. The BGP Update Groups feature introduces an algorithm that dynamically calculates BGP update group membership based on outbound routing policies. This feature does not require any configuration by the network operator. Update group-based message generation occurs automatically and independently.
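
The grouping idea can be sketched in Python as follows. This is only a conceptual model: the neighbor addresses, policy names, and the choice of an (address family, outbound policy) pair as the grouping key are hypothetical, and the router's actual membership calculation may take more outbound attributes into account.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

OutboundKey = Tuple[str, str]  # (address family, outbound route policy)

# Hypothetical neighbors and the outbound settings that shape their updates.
NEIGHBORS: Dict[str, OutboundKey] = {
    "10.0.101.1": ("ipv4 unicast", "POLICY_A"),
    "10.0.101.2": ("ipv4 unicast", "POLICY_A"),
    "10.0.101.3": ("ipv4 unicast", "POLICY_B"),
}


def compute_update_groups(
    neighbors: Dict[str, OutboundKey]
) -> Dict[OutboundKey, List[str]]:
    """Place neighbors with identical outbound settings in the same group."""
    groups: Dict[OutboundKey, List[str]] = defaultdict(list)
    for address, key in neighbors.items():
        groups[key].append(address)
    return dict(groups)


for key, members in compute_update_groups(NEIGHBORS).items():
    # Updates are built once per group and then sent to every member.
    print(key, "->", members)
```

Neighbors that share the same outbound policy land in the same group, so an update only has to be generated once for them.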

BGP Cost Community

The BGP cost community is a nontransitive extended community attribute that is passed to internal BGP (iBGP) and confederation peers but not to external BGP (eBGP) peers. The cost community feature allows you to customize the local route preference and influence the best-path selection process by assigning cost values to specific routes. The extended community format defines generic points of insertion (POI) that influence the best-path decision at different points in the best-path algorithm.

How BGP Cost Community Influences the Best Path Selection Process

The cost community attribute influences the BGP best-path selection process at the point of insertion (POI). By default, the POI follows the Interior Gateway Protocol (IGP) metric comparison. When BGP receives multiple paths to the same destination, it uses the best-path selection process to determine which path is the best path. BGP automatically makes the decision and installs the best path in the routing table. The POI allows you to assign a preference to a specific path when multiple equal cost paths are available. If the POI is not valid for local best-path selection, the cost community attribute is silently ignored.

Cost communities are sorted first by POI and then by community ID. Multiple paths can be configured with the cost community attribute for the same POI. The path with the lowest cost community ID is considered first. In other words, all cost community paths for a specific POI are considered, starting with the one with the lowest cost community ID. Paths that do not carry a cost community (for the POI and community ID being evaluated) are assigned the default cost value (2147483647). If the cost community values are equal, the comparison proceeds to the next lowest community ID for this POI.

To select the path with the lower cost community, BGP walks through the cost communities of both paths simultaneously. It maintains two pointers into the cost community chain, one for each path, and at each step of the walk advances both pointers to the next applicable cost community for the given POI, in order of community ID. The walk stops when a best path is chosen or the comparison is a tie. At each step of the walk, the following checks are done:


  If neither pointer refers to a cost community,
     Declare a tie;
  Elseif a cost community is found for one path but not for the other,
     Choose the path with cost community as best path;
  Elseif the Community ID from one path is less than the other,
     Choose the path with the lesser Community ID as best path;
  Elseif the Cost from one path is less than the other,
     Choose the path with the lesser Cost as best path;
  Else Continue.
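
The walk can also be expressed as a short Python sketch. It follows the checks listed above literally and is not taken from any router implementation. The function name, the (community ID, cost) tuple layout, and the "a"/"b" return values are hypothetical, and each input list is assumed to be already filtered to a single POI.

```python
from typing import List, Optional, Tuple

CostCommunity = Tuple[int, int]  # (community ID, cost) for one POI


def compare_cost_communities(
    path_a: List[CostCommunity], path_b: List[CostCommunity]
) -> Optional[str]:
    """Return "a" or "b" for the preferred path, or None for a tie."""
    a = sorted(path_a)  # walk in order of community ID
    b = sorted(path_b)
    i = j = 0
    while True:
        cc_a = a[i] if i < len(a) else None
        cc_b = b[j] if j < len(b) else None
        if cc_a is None and cc_b is None:
            return None                      # neither pointer refers to one: tie
        if cc_a is None:
            return "b"                       # only path b carries a cost community
        if cc_b is None:
            return "a"                       # only path a carries a cost community
        if cc_a[0] != cc_b[0]:
            return "a" if cc_a[0] < cc_b[0] else "b"   # lesser community ID
        if cc_a[1] != cc_b[1]:
            return "a" if cc_a[1] < cc_b[1] else "b"   # lesser cost
        i += 1                               # equal so far: continue the walk
        j += 1


print(compare_cost_communities([(1, 100)], [(1, 200)]))                    # a
print(compare_cost_communities([(1, 100), (2, 50)], [(1, 100), (2, 10)]))  # b
print(compare_cost_communities([(1, 100)], [(1, 100)]))                    # None
```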
  

Note


Paths that are not configured with the cost community attribute are considered by the best-path selection process to have the default cost value (half of the maximum value [4294967295] or 2147483647).


Applying the cost community attribute at the POI allows you to assign a value to a path originated or learned by a peer in any part of the local autonomous system or confederation. The cost community can be used as a “tie breaker” during the best-path selection process. Multiple instances of the cost community can be configured for separate equal cost paths within the same autonomous system or confederation. For example, a lower cost community value can be applied to a specific exit path in a network with multiple equal cost exit points, and the specific exit path is preferred by the BGP best-path selection process.


Note


The cost community comparison in BGP is enabled by default. Use the bgp bestpath cost-community ignore command to disable the comparison.


Cost Community Support for Aggregate Routes and Multipaths

The BGP cost community feature supports aggregate routes and multipaths. The cost community attribute can be applied to either type of route. The cost community attribute is passed to the aggregate or multipath route from component routes that carry the cost community attribute. Only unique IDs are passed, and only the highest cost of any individual component route is applied to the aggregate for each ID. If multiple component routes contain the same ID, the highest configured cost is applied to the route. For example, the following two component routes are configured with the cost community attribute using an inbound route policy:

  • 10.0.0.1
    • POI=IGP

    • cost community ID=1

    • cost number=100

  • 192.168.0.1
    • POI=IGP

    • cost community ID=1

    • cost number=200

    If these component routes are aggregated or configured as a multipath, the cost value 200 is advertised, because it has the highest cost.

    If one or more component routes do not carry the cost community attribute or the component routes are configured with different IDs, then the default value (2147483647) is advertised for the aggregate or multipath route. For example, the following three component routes are configured with the cost community attribute using an inbound route policy. However, the component routes are configured with two different IDs.

  • 10.0.0.1
    • POI=IGP

    • cost community ID=1

    • cost number=100

  • 172.16.0.1
    • POI=IGP

    • cost community ID=2

    • cost number=100

  • 192.168.0.1
    • POI=IGP

    • cost community ID=1

    • cost number=200

    The single advertised path includes the aggregate cost communities as follows:

    {POI=IGP, ID=1, Cost=2147483647} {POI=IGP, ID=2, Cost=2147483647}
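
Both examples can be reproduced with the following Python sketch of the aggregation rule as described in this section. The function name and the data layout (one dictionary per component route, keyed by POI and community ID) are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

DEFAULT_COST = 2147483647
Key = Tuple[str, int]  # (POI, community ID)


def aggregate_cost_communities(
    components: List[Dict[Key, int]]
) -> Dict[Key, int]:
    """Combine the cost communities of component routes into one set."""
    all_keys = set()
    for component in components:
        all_keys.update(component)

    aggregate: Dict[Key, int] = {}
    for key in sorted(all_keys):
        if all(key in component for component in components):
            # Every component carries this ID: the highest cost is applied.
            aggregate[key] = max(component[key] for component in components)
        else:
            # At least one component lacks this ID: the default is advertised.
            aggregate[key] = DEFAULT_COST
    return aggregate


# First example: 10.0.0.1 and 192.168.0.1 both carry ID 1.
print(aggregate_cost_communities([{("IGP", 1): 100}, {("IGP", 1): 200}]))

# Second example: 172.16.0.1 carries ID 2 instead of ID 1.
print(aggregate_cost_communities(
    [{("IGP", 1): 100}, {("IGP", 2): 100}, {("IGP", 1): 200}]
))
```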

Influencing Route Preference in a Multiexit IGP Network

This figure shows an IGP network with two autonomous system boundary routers (ASBRs) on the edge. Each ASBR has an equal cost path to network 10.8/16.

Figure 6. Multiexit Point IGP Network

Both paths are considered to be equal by BGP. If multipath load sharing is configured, both paths are installed in the routing table and are used to balance the traffic load. If multipath load balancing is not configured, BGP selects the path that was learned first as the best path and installs this path in the routing table. This behavior may not be desirable under some conditions. For example, the path is learned from ISP1 PE2 first, but the link between ISP1 PE2 and ASBR1 is a low-speed link.

The cost community attribute can be used to influence the BGP best-path selection process by applying a lower cost community value to the path learned by ASBR2. For example, the following configuration is applied to ASBR2:


Router(config)# route-policy ISP2_PE1
Router(config-rpl)# set extcommunity cost (1:1)
Router(config-rpl)# end-policy

The preceding route policy applies a cost community number of 1 to the 10.8.0.0 route. By default, the path learned from ASBR1 is assigned a cost community number of 2147483647. Because the path learned from ASBR2 has a lower cost community number, that path is preferred.

Adding Routes to the Routing Information Base

If a nonsourced path becomes the best path after the best-path calculation, BGP adds the route to the Routing Information Base (RIB) and passes the cost communities along with the other IGP extended communities.

When a route with paths is added to the RIB by a protocol, the RIB checks the current best paths for the route and the added paths for cost extended communities. If cost extended communities are found, the RIB compares the sets of cost communities. If the comparison does not result in a tie, the appropriate best path is chosen. If the comparison results in a tie, the RIB proceeds with the remaining steps of the best-path algorithm. If a cost community is not present in either the current best paths or the added paths, the RIB continues with the remaining steps of the best-path algorithm.

BGP DMZ Aggregate Bandwidth

Table 11. Feature History Table

Feature Name: Removal of Link-Bandwidth Extended Community to iBGP Peers

Release Information: Release 7.3.2

Feature Description: The demilitarized zone (DMZ) link-bandwidth extended community allows BGP to send traffic over multiple internal BGP (iBGP) learned paths. The traffic that is sent is proportional to the bandwidth of the links that are used to exit the autonomous system. By default, iBGP propagates the DMZ link-bandwidth community. This feature minimizes the risk of exposing the community parameters, which are used to control the routing policy in the service provider network, to network zones where they are not recognized or not required.

BGP supports aggregating the dmz-link bandwidth values of external BGP (eBGP) multipaths when advertising the route to an internal BGP (iBGP) peer.

There is no explicit command to aggregate bandwidth. The bandwidth is aggregated if the following conditions are met:

  • The network has multipaths and all the multipaths have link-bandwidth values.

  • The next-hop attribute is set to next-hop-self, which sets the next-hop attribute for all routes advertised to the specified neighbor to the address of the local router.

  • There is no outbound policy configured that might change the dmz-link bandwidth value.

  • If the dmz-link bandwidth value is not known for any one of the multipaths (eBGP or iBGP), the dmz-link value for all multipaths, including the best path, is not downloaded to the Routing Information Base (RIB).

  • The dmz-link bandwidth value of an iBGP multipath is not considered during aggregation.

  • The route that is advertised with aggregate value can be best path or add-path.

  • Add-path does not qualify for DMZ link bandwidth aggregation because the next hop is preserved. Configuring next-hop-self for add-path is not supported.

  • For the VPNv4 and VPNv6 AFIs, if the dmz link-bandwidth value is configured using an outbound route policy, specify the route table or use the additive keyword. Otherwise, the routes are not imported on the receiving peer.

extcommunity-set bandwidth dmz_ext
   1:8000
 end-set
 !
 route-policy dmz_rp_vpn
   set extcommunity bandwidth dmz_ext additive     <<< 'additive' keyword.
   pass
 end-policy

Removal of Link-Bandwidth Extended Community to iBGP Peers

The demilitarized zone (DMZ) link-bandwidth extended community allows BGP to send traffic over multiple internal BGP (iBGP) learned paths. The traffic that is sent is proportional to the bandwidth of the links that are used to exit the autonomous system. By default, iBGP propagates the DMZ link-bandwidth community. The Removal of Link-Bandwidth Extended Community to iBGP Peers feature provides the flexibility to remove the DMZ link-bandwidth community, which minimizes the risk of exposing the community parameters to network zones where they are not recognized or are unnecessary.

Configuration Example

The following examples show how to configure route policies that remove the link-bandwidth extended communities.


/* Delete all the extended communities. */
Router(config)# route-policy dmz_del_all 
Router(config-rpl)# delete extcommunity bandwidth all
Router(config-rpl)# pass
Router(config-rpl)# end-policy

/* Delete only the extended communities that match an extended community mentioned in the list. */ 
Router(config)# route-policy dmz_CE1_del_non_match
Router(config-rpl)# if destination in (10.9.9.9/32) then 
Router(config-rpl-if)# delete extcommunity bandwidth in (10:7000)
Router(config-rpl-if)# endif
Router(config-rpl)# pass
Router(config-rpl)# end-policy

/* Delete only the extended communities that match the parameters passed to the policy. */
Router(config)# route-policy dmz_del_param2($a,$b)
Router(config-rpl)# if destination in (10.9.9.9/32) then 
Router(config-rpl-if)# delete extcommunity bandwidth in ($a:$b)
Router(config-rpl-if)# endif
Router(config-rpl)# pass
Router(config-rpl)# end-policy

Verification

Verify the configuration that allows the user to remove a particular extended community.

Router# show bgp 10.9.9.9/32
Fri Aug 27 13:15:05.833 EDT
BGP routing table entry for 10.9.9.9/32
Versions:
Process bRIB/RIB SendTblVer
Speaker 15 15
Last Modified: Aug 27 13:06:45.000 for 00:08:21
Paths: (3 available, best #1)
Advertised IPv4 Unicast paths to peers (in unique update groups):
13.13.13.5
Path #1: Received by speaker 0
Advertised IPv4 Unicast paths to peers (in unique update groups):
13.13.13.5
10
10.10.10.1 from 10.10.10.1 (192.168.0.1)
Origin incomplete, metric 0, localpref 100, valid, external, best, group-best, multipath
Received Path ID 0, Local Path ID 1, version 15
Extended community: LB:10:48
Origin-AS validity: (disabled)
Path #2: Received by speaker 0
Not advertised to any peer
10
11.11.11.3 from 11.11.11.3 (192.168.0.3)
Origin incomplete, metric 0, localpref 100, valid, external, multipath
Received Path ID 0, Local Path ID 0, version 0
Extended community: LB:10:48
Origin-AS validity: (disabled)
Path #3: Received by speaker 0
Not advertised to any peer
10
12.12.12.4 from 12.12.12.4 (192.168.0.4)
Origin incomplete, metric 0, localpref 100, valid, external, multipath
Received Path ID 0, Local Path ID 0, version 0
Extended community: LB:10:48
Origin-AS validity: (disabled)


Configuring BGP DMZ Aggregate Bandwidth: Example

This is a sample configuration for Border Gateway Protocol Demilitarized Zone (BGP DMZ) link bandwidth. Consider the topology, R1---(iBGP)---R2---(iBGP)---R3:

  1. On R1:
    bgp: prefix p/n has:
    path 1(bestpath)       with LB value 100
    path 2(ebgp multipath) with LB value 30
    path 3(ebgp multipath) with LB value 50
    
    When best path is advertised to R2, send aggregated dmz-link bandwidth value of 180; aggregated value of paths 1, 2 and 3.
  2. On R2:
    bgp: prefix p/n has:
    path 1(bestpath)       with LB value 60
    path 2(ebgp multipath) with LB value 200
    path 3(ebgp multipath) with LB value 50
    
    When best path is advertised to R3, send aggregated dmz-link bandwidth value of 310; aggregated value of paths 1, 2 and 3.
  3. On R3:
    bgp: prefix p/n has:
    path 1(bestpath)       with LB 180 {learned from R1}
    path 2(ibgp multipath) with LB 310 {learned from R2}
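
The aggregation in this topology can be checked with a short Python sketch. The function name aggregate_dmz_bandwidth and the list-based path representation are illustrative assumptions, not IOS XR constructs; the LB values are the ones listed for R1 and R2 above.

```python
def aggregate_dmz_bandwidth(path_bandwidths):
    """Return the aggregate link-bandwidth value sent with the best path.

    path_bandwidths holds the LB value of the best path followed by the LB
    values of its multipaths (a hypothetical representation for illustration).
    """
    return sum(path_bandwidths)


r1_paths = [100, 30, 50]   # best path and two eBGP multipaths on R1
r2_paths = [60, 200, 50]   # best path and two eBGP multipaths on R2

r1_advertised = aggregate_dmz_bandwidth(r1_paths)
r2_advertised = aggregate_dmz_bandwidth(r2_paths)

# On R3, traffic is shared in proportion to the two aggregate values.
total = r1_advertised + r2_advertised
r3_shares = {
    "learned from R1": r1_advertised / total,
    "learned from R2": r2_advertised / total,
}
print(r1_advertised, r2_advertised, r3_shares)
```

The sketch only reproduces the arithmetic of the example; it does not model how BGP encodes or signals the extended community.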
    

Configuring Policy-based Link Bandwidth: Example

This is a sample configuration for policy-based DMZ link bandwidth. The link-bandwidth ext-community can be set on a per-path basis either at the neighbor-in or neighbor-out policy attach-points. The dmz-link-bandwidth knob is configured under eBGP neighbor configuration mode. All paths received from that particular neighbor will be marked with the link-bandwidth extended community when sent to iBGP peers.

  1. Configure inbound or outbound route-policy.
    extcommunity-set bandwidth dmz_ext
      1:1290400000
    end-set
    !
    route-policy dmz_rp
      set extcommunity bandwidth dmz_ext
      pass
    end-policy
    !
    
     neighbor 10.0.101.1
      remote-as 1001
      address-family ipv4 unicast
       route-policy dmz_rp in          <<< Inbound route-policy.
       route-policy pass out
      !
    
  2. Configure dmz-link-bandwidth under BGP neighbor.
    neighbor 10.0.101.2
      remote-as 1001
      dmz-link-bandwidth               <<< Under neighbor.
      address-family ipv4 unicast
       route-policy pass in
       route-policy pass out
      !
    

64-ECMP Support for BGP

IOS XR supports configuration of up to 64 equal cost multipath (ECMP) next hops for BGP. 64-ECMP is required in networks where heavily loaded routers must load balance traffic over as many as 64 LSPs.

BGP Best Path Algorithm

BGP routers typically receive multiple paths to the same destination. The BGP best-path algorithm determines the best path to install in the IP routing table and to use for forwarding traffic. This section describes the Cisco IOS XR software implementation of BGP best-path algorithm, as specified in Section 9.1 of the Internet Engineering Task Force (IETF) Network Working Group draft-ietf-idr-bgp4-24.txt document.

The BGP best-path algorithm implementation is in three parts:

  • Part 1—Compares two paths to determine which is better.

  • Part 2—Iterates over all paths and determines which order to compare the paths to select the overall best path.

  • Part 3—Determines whether the old and new best paths differ enough so that the new best path should be used.


Note


The order of comparison determined by Part 2 is important because the comparison operation is not transitive; that is, if three paths, A, B, and C exist, such that when A and B are compared, A is better, and when B and C are compared, B is better, it is not necessarily the case that when A and C are compared, A is better. This nontransitivity arises because the multi exit discriminator (MED) is compared only among paths from the same neighboring autonomous system (AS) and not among all paths.


Comparing Pairs of Paths

Perform the following steps to compare two paths and determine the better path:

  1. If either path is invalid (for example, a path has the maximum possible MED value or it has an unreachable next hop), then the other path is chosen (provided that the path is valid).

  2. If the paths have unequal pre-bestpath cost communities, the path with the lower pre-bestpath cost community is selected as the best path.

  3. If the paths have unequal weights, the path with the highest weight is chosen.

    Note


    The weight is entirely local to the router, and can be set with the weight command or using a routing policy.


  4. If the paths have unequal local preferences, the path with the higher local preference is chosen.


    Note


    If a local preference attribute was received with the path or was set by a routing policy, then that value is used in this comparison. Otherwise, the default local preference value of 100 is used. The default value can be changed using the bgp default local-preference command.


  5. If one of the paths is a redistributed path, which results from a redistribute or network command, then it is chosen. Otherwise, if one of the paths is a locally generated aggregate, which results from an aggregate-address command, it is chosen.


    Note


    Step 1 through Step 4 implement the “Path Selection with BGP” section of RFC 1268.


  6. If the paths have unequal AS path lengths, the path with the shorter AS path is chosen. This step is skipped if bgp bestpath as-path ignore command is configured.


    Note


    When calculating the length of the AS path, confederation segments are ignored, and AS sets count as 1.



    Note


    eiBGP specifies internal and external BGP multipath peers. eiBGP allows simultaneous use of internal and external paths.


  7. If the paths have different origins, the path with the lower origin is selected. Interior Gateway Protocol (IGP) is considered lower than EGP, which is considered lower than INCOMPLETE.

  8. If appropriate, the MED of the paths is compared. If they are unequal, the path with the lower MED is chosen.

    A number of configuration options exist that affect whether or not this step is performed. In general, the MED is compared if both paths were received from neighbors in the same AS; otherwise the MED comparison is skipped. However, this behavior is modified by certain configuration options, and there are also some corner cases to consider.

    If the bgp bestpath med always command is configured, then the MED comparison is always performed, regardless of neighbor AS in the paths. Otherwise, MED comparison depends on the AS paths of the two paths being compared, as follows:

    • If a path has no AS path or the AS path starts with an AS_SET, then the path is considered to be internal, and the MED is compared with other internal paths.

    • If the AS path starts with an AS_SEQUENCE, then the neighbor AS is the first AS number in the sequence, and the MED is compared with other paths that have the same neighbor AS.

    • If the AS path contains only confederation segments or starts with confederation segments followed by an AS_SET, then the MED is not compared with any other path unless the bgp bestpath med confed command is configured. In that case, the path is considered internal and the MED is compared with other internal paths.

    • If the AS path starts with confederation segments followed by an AS_SEQUENCE, then the neighbor AS is the first AS number in the AS_SEQUENCE, and the MED is compared with other paths that have the same neighbor AS.


    Note


    If no MED attribute was received with the path, then the MED is considered to be 0 unless the bgp bestpath med missing-as-worst command is configured. In that case, if no MED attribute was received, the MED is considered to be the highest possible value.


  9. If one path is received from an external peer and the other is received from an internal (or confederation) peer, the path from the external peer is chosen.

  10. If the paths have different IGP metrics to their next hops, the path with the lower IGP metric is chosen.

  11. If the paths have unequal IP cost communities, the path with the lower IP cost community is selected as the best path.

  12. If all path parameters in Step 1 through Step 10 are the same, then the router IDs are compared. If the path was received with an originator attribute, then that is used as the router ID to compare; otherwise, the router ID of the neighbor from which the path was received is used. If the paths have different router IDs, the path with the lower router ID is chosen.


    Note


    Where the originator is used as the router ID, it is possible to have two paths with the same router ID. It is also possible to have two BGP sessions with the same peer router, and therefore receive two paths with the same router ID.


  13. If the paths have different cluster lengths, the path with the shorter cluster length is selected. If a path was not received with a cluster list attribute, it is considered to have a cluster length of 0.

  14. Finally, the path received from the neighbor with the lower IP address is chosen. Locally generated paths (for example, redistributed paths) are considered to have a neighbor IP address of 0.
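
The pairwise comparison can be sketched in Python as follows. This is a simplified illustration, not the IOS XR implementation: it covers only weight, local preference, AS path length, origin, MED, external versus internal, IGP metric, and router ID, and it omits validity checks, cost communities, locally originated paths, cluster length, and neighbor address. The Path class, its field names, and the better function are hypothetical.

```python
from dataclasses import dataclass

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}  # lower is preferred


@dataclass
class Path:
    name: str
    weight: int = 0
    local_pref: int = 100
    as_path: tuple = ()       # AS_SEQUENCE only; first entry is the neighbor AS
    origin: str = "igp"
    med: int = 0
    external: bool = True
    igp_metric: int = 0
    router_id: int = 0


def neighbor_as(path):
    # A path with an empty AS path is treated as internal (None).
    return path.as_path[0] if path.as_path else None


def better(a, b, med_always=False):
    """Return the preferred of two valid paths (subset of the steps above)."""
    if a.weight != b.weight:
        return a if a.weight > b.weight else b
    if a.local_pref != b.local_pref:
        return a if a.local_pref > b.local_pref else b
    if len(a.as_path) != len(b.as_path):
        return a if len(a.as_path) < len(b.as_path) else b
    if ORIGIN_RANK[a.origin] != ORIGIN_RANK[b.origin]:
        return a if ORIGIN_RANK[a.origin] < ORIGIN_RANK[b.origin] else b
    # MED is compared only for the same neighbor AS unless med_always is set.
    if (med_always or neighbor_as(a) == neighbor_as(b)) and a.med != b.med:
        return a if a.med < b.med else b
    if a.external != b.external:
        return a if a.external else b
    if a.igp_metric != b.igp_metric:
        return a if a.igp_metric < b.igp_metric else b
    return a if a.router_id <= b.router_id else b


# Three paths that show why the comparison is not transitive.
a = Path("A", as_path=(1,), med=20, igp_metric=10, router_id=1)
b = Path("B", as_path=(2,), med=0, igp_metric=20, router_id=2)
c = Path("C", as_path=(1,), med=10, igp_metric=30, router_id=3)
print(better(a, b).name, better(b, c).name, better(a, c).name)
```

With these made-up values, A is preferred over B and B over C on IGP metric, yet C is preferred over A because the two share a neighbor AS and C has the lower MED.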

Order of Comparisons

The second part of the BGP best-path algorithm implementation determines the order in which the paths should be compared. The order of comparison is determined as follows:

  1. The paths are partitioned into groups such that within each group the MED can be compared among all paths. The same rules as in the Comparing Pairs of Paths section are used to determine whether MED can be compared between any two paths. Normally, this comparison results in one group for each neighbor AS. If the bgp bestpath med always command is configured, then there is just one group containing all the paths.

  2. The best path in each group is determined. Determining the best path is achieved by iterating through all paths in the group and keeping track of the best one seen so far. Each path is compared with the best-so-far, and if it is better, it becomes the new best-so-far and is compared with the next path in the group.

  3. A set of paths is formed containing the best path selected from each group in Step 2. The overall best path is selected from this set of paths, by iterating through them as in Step 2.
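
The three steps above can be sketched in Python. The dictionary-based paths, the stand-in comparator better (MED within the same neighbor AS, then IGP metric), and the function names are assumptions made for illustration; a real comparison would apply every step from the Comparing Pairs of Paths section.

```python
from collections import defaultdict
from itertools import permutations


def better(a, b):
    # Stand-in pairwise comparison: MED only for the same neighbor AS,
    # then the lower IGP metric. Ties keep the first argument.
    if a["neighbor_as"] == b["neighbor_as"] and a["med"] != b["med"]:
        return a if a["med"] < b["med"] else b
    return a if a["igp_metric"] <= b["igp_metric"] else b


def best_of(paths):
    # Step 2: iterate, keeping track of the best path seen so far.
    best = paths[0]
    for path in paths[1:]:
        best = better(best, path)
    return best


def select_best(paths):
    # Step 1: one group per neighbor AS, so MED is comparable within a group.
    groups = defaultdict(list)
    for path in paths:
        groups[path["neighbor_as"]].append(path)
    # Step 3: compare the winners of each group.
    return best_of([best_of(group) for group in groups.values()])


paths = [
    {"name": "A", "neighbor_as": 1, "med": 20, "igp_metric": 10},
    {"name": "B", "neighbor_as": 2, "med": 0, "igp_metric": 20},
    {"name": "C", "neighbor_as": 1, "med": 10, "igp_metric": 30},
]

grouped = {select_best(list(p))["name"] for p in permutations(paths)}
naive = {best_of(list(p))["name"] for p in permutations(paths)}
print("grouped:", grouped, "naive:", naive)
```

With these sample values, grouping first yields the same winner for every arrival order, whereas iterating over the ungrouped list can yield different winners depending on the order.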

Best Path Change Suppression

The third part of the implementation determines whether the best-path change can be suppressed; that is, whether the router should switch to the new best path or continue using the existing one. The existing best path can continue to be used if the new one is identical up to the point at which the best-path selection algorithm becomes arbitrary (the router ID comparison). Continuing to use the existing best path can avoid churn in the network.


Note


This suppression behavior does not comply with the IETF Network Working Group draft-ietf-idr-bgp4-24.txt document, but is specified in the IETF Network Working Group draft-ietf-idr-avoid-transition-00.txt document.


The suppression behavior can be turned off by configuring the bgp bestpath compare-routerid command. If this command is configured, the new best path is always preferred to the existing one.

Otherwise, the following steps are used to determine whether the best-path change can be suppressed:

  1. If the existing best path is no longer valid, the change cannot be suppressed.

  2. If either the existing or new best paths were received from internal (or confederation) peers or were locally generated (for example, by redistribution), then the change cannot be suppressed. That is, suppression is possible only if both paths were received from external peers.

  3. If the paths were received from the same peer (the paths would have the same router-id), the change cannot be suppressed. The router ID is calculated using rules in Comparing Pairs of Paths section.

  4. If the paths have different weights, local preferences, origins, or IGP metrics to their next hops, then the change cannot be suppressed. Note that all these values are calculated using the rules in the Comparing Pairs of Paths section.

  5. If the paths have different-length AS paths and the bgp bestpath as-path ignore command is not configured, then the change cannot be suppressed. Again, the AS path length is calculated using the rules in Comparing Pairs of Paths section.

  6. If the MED of the paths can be compared and the MEDs are different, then the change cannot be suppressed. The decision as to whether the MEDs can be compared is exactly the same as the rules in Comparing Pairs of Paths section, as is the calculation of the MED value.

  7. If none of the conditions in Step 1 through Step 6 apply, the change can be suppressed.
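
The suppression checks can be expressed as a single predicate. The following Python sketch is illustrative only: the Path fields, the can_suppress function, and its keyword arguments (which stand in for the bgp bestpath compare-routerid, bgp bestpath as-path ignore, and bgp bestpath med always commands) are hypothetical names, and MED comparability is reduced to a neighbor AS match.

```python
from dataclasses import dataclass


@dataclass
class Path:
    valid: bool = True
    external: bool = True
    router_id: str = "0.0.0.0"
    weight: int = 0
    local_pref: int = 100
    origin: str = "igp"
    igp_metric: int = 0
    as_path: tuple = ()
    med: int = 0


def can_suppress(old, new, compare_routerid=False,
                 as_path_ignore=False, med_always=False):
    """Return True if the change from old to new best path can be suppressed."""
    if compare_routerid:                      # suppression turned off
        return False
    if not old.valid:                         # step 1
        return False
    if not (old.external and new.external):   # step 2
        return False
    if old.router_id == new.router_id:        # step 3
        return False
    if ((old.weight, old.local_pref, old.origin, old.igp_metric)
            != (new.weight, new.local_pref, new.origin, new.igp_metric)):
        return False                          # step 4
    if not as_path_ignore and len(old.as_path) != len(new.as_path):
        return False                          # step 5
    med_comparable = med_always or old.as_path[:1] == new.as_path[:1]
    if med_comparable and old.med != new.med:
        return False                          # step 6
    return True                               # step 7


old = Path(router_id="192.168.0.3", as_path=(10,))
new = Path(router_id="192.168.0.1", as_path=(10,))
print(can_suppress(old, new), can_suppress(old, new, compare_routerid=True))
```

In this example the two external paths differ only in router ID, so the existing best path is kept unless router ID comparison is forced.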

Administrative Distance

An administrative distance is a rating of the trustworthiness of a routing information source. In general, the higher the value, the lower the trust rating.

Normally, a route can be learned through more than one protocol. Administrative distance is used to discriminate between routes learned from more than one protocol. The route with the lowest administrative distance is installed in the IP routing table. By default, BGP uses the administrative distances shown in the BGP Default Administrative Distances table.

Table 12. BGP Default Administrative Distances

Distance    Default Value    Function
External    20               Applied to routes learned from eBGP.
Internal    200              Applied to routes learned from iBGP.
Local       200              Applied to routes originated by the router.


Note


Distance does not influence the BGP path selection algorithm, but it does influence whether BGP-learned routes are installed in the IP routing table.


In most cases, when a route is learned through eBGP, it is installed in the IP routing table because of its distance (20). Sometimes, however, two ASs have an IGP-learned back-door route and an eBGP-learned route. Their policy might be to use the IGP-learned path as the preferred path and to use the eBGP-learned path when the IGP path is down.

Figure 7. Back Door Example

In the Back Door Example figure, Routers A and C and Routers B and C are running eBGP. Routers A and B are running an IGP (such as Routing Information Protocol [RIP], Interior Gateway Routing Protocol [IGRP], Enhanced IGRP, or Open Shortest Path First [OSPF]). The default distances for RIP, IGRP, Enhanced IGRP, and OSPF are 120, 100, 90, and 110, respectively. All these distances are higher than the default distance of eBGP, which is 20. Usually, the route with the lowest distance is preferred.

Router A receives updates about 160.10.0.0 from two routing protocols: eBGP and IGP. Because the default distance for eBGP is lower than the default distance of the IGP, Router A chooses the eBGP-learned route from Router C. If you want Router A to learn about 160.10.0.0 from Router B (IGP), establish a BGP back door.

In the following example, a network back-door is configured:


Router(config)# router bgp 100
Router(config-bgp)# address-family ipv4 unicast
Router(config-bgp-af)# network 160.10.0.0/16 backdoor

Router A treats the eBGP-learned route as local and installs it in the IP routing table with a distance of 200. The network is also learned through Enhanced IGRP (with a distance of 90), so the Enhanced IGRP route is successfully installed in the IP routing table and is used to forward traffic. If the Enhanced IGRP-learned route goes down, the eBGP-learned route is installed in the IP routing table and is used to forward traffic.

Although BGP treats network 160.10.0.0 as a local entry, it does not advertise network 160.10.0.0 as it normally would advertise a local entry.
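
The effect of the back door on route installation can be modeled with a few lines of Python. This is a sketch of the distance comparison only; the dictionary keys and the installed_route function are hypothetical, and the distance values are the defaults quoted above.

```python
DEFAULT_DISTANCE = {
    "ebgp": 20,
    "ibgp": 200,
    "bgp-local": 200,
    "eigrp": 90,
    "ospf": 110,
    "rip": 120,
}


def installed_route(sources, backdoor=False):
    """Return the source whose route is installed for a single prefix.

    sources lists the protocols offering the prefix. With backdoor=True,
    the eBGP-learned route is treated as local (distance 200).
    """
    def distance(source):
        if backdoor and source == "ebgp":
            return DEFAULT_DISTANCE["bgp-local"]
        return DEFAULT_DISTANCE[source]

    return min(sources, key=distance)


print(installed_route(["ebgp", "eigrp"]))                 # no back door
print(installed_route(["ebgp", "eigrp"], backdoor=True))  # back door configured
print(installed_route(["ebgp"], backdoor=True))           # IGP route is down
```

The three calls correspond to the default behavior, the back-door case in which the Enhanced IGRP route wins, and the fallback to the eBGP-learned route when the IGP route is unavailable.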

Route Dampening

Route dampening is a BGP feature that minimizes the propagation of flapping routes across an internetwork. A route is considered to be flapping when it is repeatedly available, then unavailable, then available, then unavailable, and so on.

For example, consider a network with three BGP autonomous systems: autonomous system 1, autonomous system 2, and autonomous system 3. Suppose the route to network A in autonomous system 1 flaps (it becomes unavailable). Under circumstances without route dampening, the eBGP neighbor of autonomous system 1 to autonomous system 2 sends a withdraw message to autonomous system 2. The border router in autonomous system 2, in turn, propagates the withdrawal message to autonomous system 3. When the route to network A reappears, autonomous system 1 sends an advertisement message to autonomous system 2, which sends it to autonomous system 3. If the route to network A repeatedly becomes unavailable, then available, many withdrawal and advertisement messages are sent. Route flapping is a problem in an internetwork connected to the Internet, because a route flap in the Internet backbone usually involves many routes.

Minimize Flapping

The route dampening feature minimizes the flapping problem as follows. Suppose again that the route to network A flaps. The router in autonomous system 2 (in which route dampening is enabled) assigns network A a penalty of 1000 and moves it to history state. The router in autonomous system 2 continues to advertise the status of the route to neighbors. The penalties are cumulative. When the route flaps so often that the penalty exceeds a configurable suppression limit, the router stops advertising the route to network A, regardless of how many times it flaps. Thus, the route is dampened.

The penalty placed on network A is decayed until the reuse limit is reached, upon which the route is once again advertised. At half of the reuse limit, the dampening information for the route to network A is removed.


Note


No penalty is applied to a BGP peer reset when route dampening is enabled, even though the reset withdraws the route.
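
The interaction between the penalty, the suppression limit, and the reuse limit can be illustrated with a small Python model. It assumes exponential decay with a fixed half-life; the half-life, suppression limit, and reuse limit values below are assumptions chosen for illustration (check the configured values on your router), and only the per-flap penalty of 1000 comes from the description above.

```python
import math

FLAP_PENALTY = 1000       # penalty per flap, as described above
HALF_LIFE_MIN = 15.0      # assumed half-life in minutes
SUPPRESS_LIMIT = 2000     # assumed suppression limit
REUSE_LIMIT = 750         # assumed reuse limit


def decayed(penalty, minutes):
    """Penalty remaining after the given number of minutes."""
    return penalty * 0.5 ** (minutes / HALF_LIFE_MIN)


def minutes_until_reuse(penalty):
    """Minutes of stability needed before the penalty reaches the reuse limit."""
    if penalty <= REUSE_LIMIT:
        return 0.0
    return HALF_LIFE_MIN * math.log2(penalty / REUSE_LIMIT)


# Three flaps, one minute apart: penalties accumulate faster than they decay.
penalty = 0.0
for _ in range(3):
    penalty = decayed(penalty, 1.0) + FLAP_PENALTY

suppressed = penalty > SUPPRESS_LIMIT
print(round(penalty), suppressed, round(minutes_until_reuse(penalty), 1))
```

Under these assumed values, three closely spaced flaps push the penalty past the suppression limit, and the route is advertised again only after the penalty has decayed to the reuse limit.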


BGP Routing Domain Confederation

One way to reduce the iBGP mesh is to divide an autonomous system into multiple sub-autonomous systems and group them into a single confederation. To the outside world, the confederation looks like a single autonomous system. Each autonomous system is fully meshed within itself and has a few connections to other autonomous systems in the same confederation. Although the peers in different autonomous systems have eBGP sessions, they exchange routing information as if they were iBGP peers. Specifically, the next hop, MED, and local preference information is preserved. This feature allows you to retain a single IGP for all of the autonomous systems.

BGP Optimal Route Reflector

BGP-ORR (optimal route reflector) enables virtual route reflector (vRR) to calculate the best path from a route reflector (RR) client's point of view.

BGP ORR calculates the best path by:

  1. Running SPF multiple times in the context of its RR clients or RR clusters (set of RR clients)

  2. Saving the result of different SPF runs in separate databases

  3. Using these databases to influence the BGP best-path decision, thereby allowing BGP to use and announce the best path that is optimal from the client’s point of view


Note


Enabling the ORR feature increases the memory footprint of BGP and RIB. As the number of vRRs configured in the network increases, ORR adversely impacts BGP convergence.


In an autonomous system, a BGP route reflector acts as a focal point and advertises routes to its peers (RR clients) along with the RR's computed best path. Since the best path advertised by the RR is computed from the RR's point of view, the RR's placement becomes an important deployment consideration.

With network function virtualization (NFV) becoming a dominant technology, service providers (SPs) are hosting virtual RR functionality in a cloud using servers. A vRR can run on a control plane device and can be placed anywhere in the topology or in an SP data center. Cisco IOS XRv 9000 Router can be implemented as a vRR over an NFV platform in an SP data center. vRR allows SPs to scale memory and CPU usage of RR deployments significantly. Moving an RR out of its optimal placement requires vRRs to implement ORR functionality that calculates the best path from an RR client's point of view.

BGP ORR offers these benefits:

  • Calculates the best path from the point of view of an RR client.

  • Enables vRR to be placed anywhere in the topology or in an SP data center.

  • Allows SPs to scale memory and CPU usage of RR deployments.

Use Case

Consider a BGP Route Reflector topology where:

  • Router R1, R2, R3, R4, R5 and R6 are route reflector clients

  • Router R1 and R4 advertise 6/8 prefix to vRR

Figure 8. BGP-ORR Topology


vRR receives prefix 6/8 from R1 and R4. Without BGP ORR configured in the network, the vRR selects R4 as the closest exit point for RR clients R2, R3, R5, and R6, and reflects the 6/8 prefix learned from R4 to all of them. From the topology, it is evident that the best exit for R2 is R1 and not R4. The mismatch arises because the vRR calculates the best path from its own point of view.

When the BGP ORR is configured in the network, the vRR calculates the shortest exit point in the network from R2’s point of view (ORR Root: R2) and determines that R1 is the closest exit point to R2. vRR then reflects the 6/8 prefix learned from R1 to R2.

Configuring BGP ORR includes:

  • enabling ORR on the RR for the client whose shortest exit point is to be determined

  • applying the ORR configuration to the neighbor

Enabling ORR on vRR for R2 (RR client)
For example, to determine the shortest exit point for R2, configure ORR on the vRR with the IP address of R2, which is 192.0.2.2. Use 6500 as the AS number and g1 as the ORR (root) policy name:

router bgp 6500
 address-family ipv4 unicast
   optimal-route-reflection g1 192.0.2.2 
commit

Applying the ORR configuration to the neighbor
Next, apply the ORR policy to BGP neighbor R2. This enables the RR to advertise to R2 the best path calculated using the root IP address, 192.0.2.2, that is configured in ORR (root) policy g1:

router bgp 6500
 neighbor 192.0.2.2
  address-family ipv4 unicast
   optimal-route-reflection g1  
commit

Configuring MPLS Traffic-Engineering on Root Router

The root routers advertise the Multi Protocol Label Switching (MPLS) TE router-ID that matches with the configured root address on the RR. So, you must configure the root router with a minimal MPLS TE configuration to advertise this MPLS TE router-ID. The minimal set of commands that you need to configure depends on the operating system of the root router.

The following is a sample configuration on the root router:

router isis 100
 is-type level-2-only
 net 49.0001.0000.0000.0001.00
 distribute link-state
  metric-style wide
  mpls traffic-eng level-2-only
  mpls traffic-eng router-id Loopback0
!
mpls traffic-eng

Verification

To verify whether R2 received the best exit, execute the show bgp <prefix> command (from R2) in EXEC mode. In the above example, R1 and R4 advertise the 6/8 prefix; run the show bgp 6.0.0.0/8 command:

R2# show bgp 6.0.0.0/8
Tue Apr  5 20:21:58.509 UTC
BGP routing table entry for 6.0.0.0/8
Versions:
  Process           bRIB/RIB  SendTblVer
  Speaker                  8           8
Last Modified: Apr  5 20:00:44.022 for 00:21:14
Paths: (1 available, best #1)
  Not advertised to any peer
  Path #1: Received by speaker 0
  Not advertised to any peer
  Local
    192.0.2.1 (metric 20) from 203.0.113.1 (192.0.2.1)
      Origin incomplete, metric 0, localpref 100, valid, internal, best, group-best
      Received Path ID 0, Local Path ID 1, version 8
      Originator: 192.0.2.1, Cluster list: 203.0.113.1

The output shows that the best path for R2 is through R1 (192.0.2.1), with a path metric of 20.

Execute the show bgp command from the vRR to determine the best path calculated for R2 by ORR. R2 has its own update-group because it has a different best path (or different policy configured) than those of other peers:

VRR#show bgp 6.0.0.0/8
Thu Apr 28 13:36:42.744 UTC
BGP routing table entry for 6.0.0.0/8
Versions:
Process bRIB/RIB SendTblVer
Speaker 13 13
Last Modified: Apr 28 13:36:26.909 for 00:00:15
Paths: (2 available, best #2)
Advertised to update-groups (with more than one peer):
0.2
Path #1: Received by speaker 0
ORR bestpath for update-groups (with more than one peer):
0.1
Local, (Received from a RR-client)
192.0.2.1 (metric 30) from 192.0.2.1 (192.0.2.1)
Origin incomplete, metric 0, localpref 100, valid, internal, add-path
Received Path ID 0, Local Path ID 2, version 13
Path #2: Received by speaker 0
Advertised to update-groups (with more than one peer):
0.2
ORR addpath for update-groups (with more than one peer):
0.1
Local, (Received from a RR-client)
192.0.2.4 (metric 20) from 192.0.2.4 (192.0.2.4)
Origin incomplete, metric 0, localpref 100, valid, internal, best, group-best
Received Path ID 0, Local Path ID 1, version 13


Note


Path #1 is advertised to update-group 0.1. R2 is in update-group 0.1.


Execute the show bgp command for update-group 0.1 to verify whether R2 is in update-group 0.1.

VRR#show bgp update-group 0.1
Thu Apr 28 13:38:18.517 UTC

Update group for IPv4 Unicast, index 0.1:
Attributes:
Neighbor sessions are IPv4
Internal
Common admin
First neighbor AS: 65000
Send communities
Send GSHUT community if originated
Send extended communities
Route Reflector Client
ORR root (configured): g1; Index: 0
4-byte AS capable
Non-labeled address-family capable
Send AIGP
Send multicast attributes
Minimum advertisement interval: 0 secs
Update group desynchronized: 0
Sub-groups merged: 0
Number of refresh subgroups: 0
Messages formatted: 5, replicated: 5
All neighbors are assigned to sub-group(s)
Neighbors in sub-group: 0.2, Filter-Groups num:1
Neighbors in filter-group: 0.2(RT num: 0)
192.0.2.2

For further verification, check the contents of the table created on vRR as a result of configuring the g1 policy. From R2’s point of view, the cost of reaching R1 is 20 and the cost of reaching R4 is 30. Therefore, the closest and best exit for R2 is through R1:

VRR#show orrspf database g1
Thu Apr 28 13:39:20.333 UTC

ORR policy: g1, IPv4, RIB tableid: 0xe0000011
Configured root: primary: 192.0.2.2, secondary: NULL, tertiary: NULL
Actual Root: 192.0.2.2, Root node: 2000.0100.1002.0000

Prefix Cost
203.0.113.1 30
192.0.2.1 20
192.0.2.2 0
192.0.2.3 30
192.0.2.4 30
192.0.2.5 10
192.0.2.6 20

Number of mapping entries: 8
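
The exit selection that ORR performs for R2 can be reproduced from the numbers in the outputs above. In this Python sketch, the cost table is copied from the show orrspf database g1 output and the vRR metrics from the show bgp output on the vRR; the variable and function names are hypothetical.

```python
# Cost of reaching each address from the ORR root (R2), from the output above.
cost_from_root = {
    "203.0.113.1": 30,
    "192.0.2.1": 20,
    "192.0.2.2": 0,
    "192.0.2.3": 30,
    "192.0.2.4": 30,
    "192.0.2.5": 10,
    "192.0.2.6": 20,
}

# IGP metric to each next hop as seen by the vRR itself (show bgp on the vRR).
cost_from_vrr = {"192.0.2.1": 30, "192.0.2.4": 20}

# Routers that advertise the 6/8 prefix: R1 and R4.
exit_points = ["192.0.2.1", "192.0.2.4"]


def closest_exit(costs, candidates):
    """Return the candidate next hop with the lowest cost."""
    return min(candidates, key=lambda next_hop: costs[next_hop])


print("vRR view:", closest_exit(cost_from_vrr, exit_points))
print("R2 view: ", closest_exit(cost_from_root, exit_points))
```

The two calls give different answers because the same candidates are ranked with different cost tables, which is the behavior ORR is designed to provide.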

RPL - if prefix is-best-path/is-best-multipath

Border Gateway Protocol (BGP) routers receive multiple paths to the same destination. By default, the BGP best-path algorithm decides which path to install in the IP routing table and use for forwarding traffic.

BGP assigns the first valid path as the current best path. It then compares the best path with the next path in the list and continues until it reaches the end of the list of valid paths, applying all the rules used to determine the best path. When there are multiple paths for a given address prefix, BGP:
  • Selects one of the paths as the best path as per the best-path selection rules.

  • Installs the best path in its forwarding table. Each BGP speaker advertises only the best-path to its peers.


Note


The advertisement rule of sending only the best path does not convey the full routing state of a destination, present on a BGP speaker to its peers.

After a BGP speaker receives a path from one of its peers, the path is used by the peer for forwarding packets. All other peers receive the same path from this peer. This leads to consistent routing in a BGP network. To improve link bandwidth utilization, most BGP implementations select additional paths that satisfy certain conditions as multipaths and install them in the forwarding table. Incoming packets for such destinations are load-balanced across the best path and the multipaths. As a result, the forwarding table can contain paths that are not advertised to the peers. The route reflector (RR) determines the best path and the multipaths, and can mark them with different communities. This feature allows BGP to signal the local decision made by the RR or border router: the path selected by the RR is marked using a community string (for example, if is-best-path then community 100:100). The controller checks which best path is sent to all routers. When the best-path computation is carried out, there is one best path, sometimes along with equal paths and a few non-equal paths. Thus, the requirement for is-best-path and is-equal-best-path.

The BGP best-path algorithm decides the best path to install in the IP routing table and to use for forwarding traffic. This enhancement within RPL allows you to create policies that act on that decision, for example by adding a community string that marks the locally selected best path. With the introduction of BGP Additional Paths (Add-Path), BGP signals more than the best path. BGP can signal the best path, all the paths that are equivalent to the best path in accordance with the BGP multipath rules, and all backup paths.

Remotely Triggered Blackhole Filtering with RPL Next-hop Discard Configuration

Remotely triggered black hole (RTBH) filtering is a technique for quickly dropping undesirable traffic at the edge of the network, before it enters a protected network. Traffic is dropped based on either source addresses or destination addresses by forwarding it to a null0 interface. RTBH filtering based on a destination address is commonly known as Destination-based RTBH filtering, whereas RTBH filtering based on a source address is known as Source-based RTBH filtering.

RTBH filtering is one of the many techniques in the security toolkit that can be used together to enhance network security in the following ways:

  • Effectively mitigate DDoS and worm attacks

  • Quarantine all traffic destined for the target under attack

  • Enforce blocklist filtering

Configure Destination-based RTBH Filtering

RTBH is implemented by defining a route policy (RPL) that discards undesirable traffic at the next hop using the set next-hop discard command.

RTBH filtering sets the next-hop of the victim's prefix to the null interface. The traffic destined to the victim is dropped at the ingress.

The set next-hop discard configuration is used in the neighbor inbound policy. When this configuration is applied to a path, the primary next-hop remains associated with the actual path, but the RIB is updated with the next-hop set to Null0. Even if the primary received next-hop is unreachable, the RTBH path is considered reachable and is a candidate in the best-path selection process. The RTBH path is readvertised to other peers with either the received next-hop or nexthop-self, based on normal BGP advertisement rules.

A typical deployment scenario for RTBH filtering requires running internal Border Gateway Protocol (iBGP) at the access and aggregation points and configuring a separate device in the network operations center (NOC) to act as a trigger. The triggering device sends iBGP updates to the edge that cause undesirable traffic to be forwarded to a null0 interface and dropped.

Consider the following topology, where a rogue router is sending traffic to a border router.

Figure 9. Topology to Implement RTBH Filtering

Configurations applied on the Trigger Router

Configure a static route redistribution policy that sets a community on static routes marked with a special tag, and apply it in BGP:

route-policy RTBH-trigger
  if tag is 777 then
    set community (1234:4321, no-export) additive
    pass
  else
    pass
  endif
end-policy

router bgp 65001
 address-family ipv4 unicast
  redistribute static route-policy RTBH-trigger
 !
 neighbor 192.168.102.1 
  remote-as 65001
  address-family ipv4 unicast
   route-policy bgp_all in
   route-policy bgp_all out

Configure a static route with the special tag for the prefix that has to be black-holed:

router static
 address-family ipv4 unicast
 10.7.7.7/32 Null0 tag 777

Configurations applied on the Border Router

Configure a route policy that matches the community set on the trigger router and configure set next-hop discard:

route-policy RTBH
  if community matches-any (1234:4321) then
    set next-hop discard
  else
    pass
  endif
end-policy

Apply the route policy on the iBGP peers:

router bgp 65001
 address-family ipv4 unicast
 !
 neighbor 192.168.102.2 
  remote-as 65001
  address-family ipv4 unicast
   route-policy RTBH in
   route-policy bgp_all out
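
The interaction between the trigger policy and the border policy can be modeled in a few lines of Python. This is a minimal sketch and not device code: the Route class and both function names are hypothetical, and only the tag (777), the community (1234:4321), and the Null0 next hop come from the example configuration.

```python
from dataclasses import dataclass, field
from typing import Optional

RTBH_COMMUNITY = "1234:4321"  # community from the example configuration
RTBH_TAG = 777                # static-route tag from the example configuration


@dataclass
class Route:
    prefix: str
    next_hop: str
    tag: Optional[int] = None
    communities: set = field(default_factory=set)


def trigger_policy(route: Route) -> Route:
    """Model of RTBH-trigger: tagged static routes get the RTBH community."""
    if route.tag == RTBH_TAG:
        route.communities |= {RTBH_COMMUNITY, "no-export"}
    return route  # both branches of the policy pass the route


def border_policy(route: Route) -> Route:
    """Model of RTBH: routes carrying the community point to Null0."""
    if RTBH_COMMUNITY in route.communities:
        route.next_hop = "Null0"  # stands in for 'set next-hop discard'
    return route


victim = border_policy(trigger_policy(Route("10.7.7.7/32", "192.168.102.2", tag=777)))
print(victim.prefix, victim.next_hop)  # 10.7.7.7/32 Null0
```

A route without the tag passes through both functions unchanged, which mirrors the pass branches of the two policies.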

Default Address Family for show Commands

Most of the show commands provide address family (AFI) and subaddress family (SAFI) arguments (see RFC 1700 and RFC 2858 for information on AFI and SAFI). The Cisco IOS XR software parser provides the ability to set the afi and safi so that it is not necessary to specify them while running a show command. The parser commands are:

  • set default-afi { ipv4 | ipv6 | all }

  • set default-safi { unicast | multicast | all }

The parser automatically sets the default afi value to ipv4 and the default safi value to unicast. You need to use the parser commands only to change the default afi value from ipv4 or the default safi value from unicast. Any afi or safi keyword specified in a show command overrides the values set using the parser commands. Use the show default-afi-safi-vrf command to check the currently set values of the afi and safi.
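
The precedence between the parser defaults and the keywords given on a show command can be illustrated with a small Python model. The ShowParser class is hypothetical and exists only for illustration. The initial values (ipv4, unicast) and the override rule are taken from the description above.

```python
from typing import Optional, Tuple


class ShowParser:
    """Hypothetical model of the parser defaults used by show commands."""

    def __init__(self) -> None:
        # Initial values stated in the text: afi ipv4, safi unicast.
        self.default_afi = "ipv4"
        self.default_safi = "unicast"

    def resolve(self, afi: Optional[str] = None,
                safi: Optional[str] = None) -> Tuple[str, str]:
        """Keywords given on the show command win over the parser defaults."""
        return (afi or self.default_afi, safi or self.default_safi)


parser = ShowParser()
print(parser.resolve())                              # ('ipv4', 'unicast')
parser.default_afi = "ipv6"                          # models: set default-afi ipv6
print(parser.resolve())                              # ('ipv6', 'unicast')
print(parser.resolve(afi="ipv4", safi="multicast"))  # ('ipv4', 'multicast')
```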

TCP Maximum Segment Size

Maximum Segment Size (MSS) is the largest amount of data that a computer or a communication device can receive in a single, unfragmented TCP segment. All TCP sessions are bounded by a limit on the number of bytes that can be transported in a single packet; this limit is the MSS. TCP breaks up the data in the transmit queue into chunks of at most MSS bytes before passing them down to the IP layer.

The TCP MSS value depends on the maximum transmission unit (MTU) of an interface, which is the maximum length of data that a protocol can transmit in one instance. The maximum TCP packet length is determined by both the MTU of the outbound interface on the source device and the MSS announced by the destination device during the TCP setup process. The closer the MSS is to the MTU, the more efficient the transfer of BGP messages. Each direction of data flow can use a different MSS value.
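
The relationship between MTU, the announced MSS, and the number of segments can be worked through in Python. This is a sketch under stated assumptions: IPv4 and TCP headers without options (20 bytes each), and a 4096-byte payload chosen because it is the standard maximum BGP message size. The function names are illustrative.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def mss_from_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - TCP_HEADER


def send_mss(local_mtu: int, peer_announced_mss: int) -> int:
    """The sender is bounded by its own MTU and by the peer's announced MSS."""
    return min(mss_from_mtu(local_mtu), peer_announced_mss)


def segments_needed(payload: int, mss: int) -> int:
    """Number of TCP segments needed to carry the payload (ceiling division)."""
    return -(-payload // mss)


print(mss_from_mtu(1500))           # 1460
print(send_mss(1500, 536))          # 536
print(segments_needed(4096, 1460))  # 3
print(segments_needed(4096, 536))   # 8
```

The last two lines show why an MSS close to the MTU is more efficient: the same 4096 bytes need 3 segments at an MSS of 1460 but 8 segments at an MSS of 536.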

Per Neighbor TCP MSS

The per neighbor TCP MSS feature allows you to create unique TCP MSS profiles for each neighbor. Per neighbor TCP MSS is supported in two modes: neighbor group and session group. Previously, TCP MSS configuration was available only at the global level in the BGP configuration.

The per neighbor TCP MSS feature allows you to:

  • Enable per neighbor TCP MSS configuration.

  • Disable TCP MSS for a particular neighbor in the neighbor group or session group using the inheritance-disable command.

  • Unconfigure TCP MSS value. On unconfiguration, TCP MSS value in the protocol control block (PCB) is set to the default value.


    Note


    The default TCP MSS value is either 536 or 1460 bytes (octets). An MSS default of 1460 means that TCP segments the data in the transmit queue into 1460-byte chunks before passing the packets to the IP layer.


To configure per neighbor TCP MSS, use the tcp mss command in neighbor, neighbor group, or session group configuration mode.

For detailed configuration steps, see the Configuring Per Neighbor TCP MSS section.

For detailed steps to disable per neighbor TCP MSS, see the Disabling Per Neighbor TCP MSS section.

BGP Keychains

BGP keychains enable keychain authentication between two BGP peers. Both BGP endpoints must comply with draft-bonica-tcp-auth-05.txt. A keychain on one endpoint combined with a password on the other endpoint does not work.

BGP is able to use the keychain to implement hitless key rollover for authentication. Key rollover specification is time based, and in the event of clock skew between the peers, the rollover process is impacted. The configurable tolerance specification allows for the accept window to be extended (before and after) by that margin. This accept window facilitates a hitless key rollover for applications (for example, routing and management protocols).

The key rollover does not impact the BGP session, unless there is a keychain configuration mismatch at the endpoints resulting in no common keys for the session traffic (send or accept).
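
The effect of the tolerance on the accept window can be sketched in Python. This is an illustrative model only: the function name, the key lifetime, the tolerance of five minutes, and the timestamps are assumptions chosen for the example. The only behavior taken from the description above is that the accept window is extended before and after by the tolerance margin.

```python
from datetime import datetime, timedelta


def key_accepted(now: datetime, start: datetime, end: datetime,
                 tolerance: timedelta) -> bool:
    """A key is accepted inside its lifetime, widened on both sides by the tolerance."""
    return start - tolerance <= now <= end + tolerance


start = datetime(2024, 1, 1, 0, 0)   # illustrative accept lifetime start
end = datetime(2024, 1, 1, 12, 0)    # illustrative accept lifetime end
tolerance = timedelta(minutes=5)

# A peer whose clock is 2 minutes behind still sends with the old key at 12:02.
print(key_accepted(datetime(2024, 1, 1, 12, 2), start, end, tolerance))   # True
# A skew larger than the tolerance falls outside the accept window.
print(key_accepted(datetime(2024, 1, 1, 12, 10), start, end, tolerance))  # False
```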

BGP Nonstop Routing

The Border Gateway Protocol (BGP) Nonstop Routing (NSR) with Stateful Switchover (SSO) feature enables all BGP peerings to maintain the BGP state and ensures continuous packet forwarding during events that could interrupt service. Under NSR, events that might potentially interrupt service are not visible to peer routers. Protocol sessions are not interrupted and routing states are maintained across process restarts and switchovers.

BGP NSR provides nonstop routing during the following events:

  • Route processor switchover

  • Process crash or process failure of BGP or TCP


    Note


    BGP NSR is enabled by default. Use the nsr disable command to turn off BGP NSR. Use the no nsr disable command to turn BGP NSR back on if it has been disabled.

    In case of a process crash or process failure, NSR is maintained only if the nsr process-failures switchover command is configured. If a process of an active instance fails, the nsr process-failures switchover command configures failover as the recovery action and switches over to a standby route processor (RP) or a standby distributed route processor (DRP), thereby maintaining NSR. An example of the configuration command is RP/0/RSP0/CPU0:router(config) # nsr process-failures switchover

    The nsr process-failures switchover command maintains both the NSR and BGP sessions in the event of a BGP or TCP process crash. Without this configuration, BGP neighbor sessions flap in case of a BGP or TCP process crash. This configuration does not help if the BGP or TCP process is restarted, in which case the BGP neighbors are expected to flap.

    When the l2vpn_mgr process is restarted, the NSR client (te-control) flaps between the Ready and Not Ready state. This is the expected behavior and there is no traffic loss.


During route processor switchover and In-Service System Upgrade (ISSU), NSR is achieved by stateful switchover (SSO) of both TCP and BGP.

NSR does not force any software upgrades on other routers in the network, and peer routers are not required to support NSR.

When a route processor switchover occurs due to a fault, the TCP connections and the BGP sessions are migrated transparently to the standby route processor, and the standby route processor becomes active. The existing protocol state is maintained on the standby route processor when it becomes active, and the protocol state does not need to be refreshed by peers.

Events such as soft reconfiguration and policy modifications can trigger the BGP internal state to change. To ensure state consistency between the active and standby BGP processes during such events, the concept of a post-it is introduced. Post-its act as synchronization points.

BGP NSR provides the following features:

  • NSR-related alarms and notifications

  • Configured and operational NSR states are tracked separately

  • NSR statistics collection

  • NSR statistics display using show commands

  • XML schema support

  • Auditing mechanisms to verify state synchronization between active and standby instances

  • CLI commands to enable and disable NSR

  • Support for 5000 NSR sessions

BGP Best-External Path

The best–external path functionality supports advertisement of the best–external path to the iBGP and route reflector peers when the locally selected bestpath is from an internal peer. BGP selects one best path and one backup path to every destination. By default, BGP selects only one best path. With this functionality, BGP additionally selects another bestpath from among the remaining external paths for a prefix. Only a single path is chosen as the best–external path and is sent to other PEs as the backup path. BGP calculates the best–external path only when the best path is an iBGP path. If the best path is an eBGP path, best–external path calculation is not required.

The procedure to determine the best–external path is as follows:

  1. Determine the best path from the entire set of paths available for a prefix.

  2. Eliminate the current best path.

  3. Eliminate all the internal paths for the prefix.

  4. From the remaining paths, eliminate all the paths that have the same next hop as that of the current best path.

  5. Rerun the best path algorithm on the remaining set of paths to determine the best–external path.
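
The five steps above can be sketched in Python. The sketch makes a loud simplification: best_path is a stand-in comparator (highest local preference, then shortest AS path) and is not the full BGP best path algorithm. The Path class and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Path:
    peer: str
    next_hop: str
    internal: bool
    local_pref: int = 100
    as_path_len: int = 0


def best_path(paths: List[Path]) -> Optional[Path]:
    # Stand-in comparator (assumption): highest local-pref, then shortest AS path.
    return max(paths, key=lambda p: (p.local_pref, -p.as_path_len), default=None)


def best_external(paths: List[Path]) -> Optional[Path]:
    best = best_path(paths)                                   # step 1
    if best is None or not best.internal:
        return None  # calculated only when the best path is an iBGP path
    rest = [p for p in paths if p is not best]                # step 2
    rest = [p for p in rest if not p.internal]                # step 3
    rest = [p for p in rest if p.next_hop != best.next_hop]   # step 4
    return best_path(rest)                                    # step 5


paths = [
    Path("pe1", "10.0.0.1", internal=True, local_pref=200, as_path_len=1),
    Path("ce1", "192.0.2.1", internal=False, local_pref=100, as_path_len=1),
    Path("ce2", "192.0.2.2", internal=False, local_pref=100, as_path_len=2),
]
print(best_external(paths).peer)  # ce1
```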

BGP considers the external and confederation BGP paths for a prefix to calculate the best–external path. BGP advertises the best path and the best–external path as follows:

  • On the primary PE—advertises the best path for a prefix to both its internal and external peers

  • On the backup PE—advertises the best path selected for a prefix to the external peers and advertises the best–external path selected for that prefix to the internal peers

BGP Prefix Independent Convergence

BGP Prefix Independent Convergence (PIC) feature enables the activation of a backup path in the event of the primary path failure.

Networks use Fast reroute (FRR) to calculate the next best path (backup path) and store it in BGP and IP Routing Information Bases (RIBs). The RIBs share the backup path information with the Forwarding Information Base (FIB). BGP PIC feature uses the backup path information in the FIB to quickly switch to this path during network failure, provided the line cards are enabled for PIC.

Drawbacks of Using Prefix-Dependent Convergence

In a standard BGP network, a BGP router advertises only its best path to a destination prefix. Hence, in an autonomous system, routers running BGP are not aware of all the possible paths to a destination prefix. In the event of a link or network failure that causes the best path to fail, the following process takes place:

  1. The affected BGP router that advertised the failed best path announces a withdrawal of the path.

  2. The BGP routers that receive the best path withdrawal from the affected BGP router withdraw their own best paths and recalculate their best paths to the destination prefix.

  3. The BGP routers advertise their recalculated best paths to all neighboring routers.

  4. Each BGP router that receives a new best path from a neighboring BGP router again evaluates its own best path, and possibly withdraws and recalculates it.

  5. The BGP routers that recalculate their best paths again advertise the new paths in the network.

Because this process repeats until all the BGP routers have the best path to the destination prefix, convergence of the network takes a lot of time. This form of convergence is known as prefix-dependent convergence. If route reflectors are configured in the network, then convergence takes even longer.

Benefits of Using Prefix-Independent Convergence

When prefix-independent convergence is configured in a BGP network, all BGP routers advertise their best external paths to a destination prefix. As a result, all BGP routers are aware of multiple best external paths to a destination prefix.

Each BGP router selects a backup path from the available best external paths, and downloads it to its FIB. Hence, the FIB on each BGP router contains a best path and a best external path to a destination prefix. In the event of a link or network failure that causes the best path to fail, the FIB on the affected BGP router can switch all its routes using the failed path to the best external path, in a single operation. Because this form of convergence takes minimal time, it is preferred in large scale network deployments.
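
The "single operation" switch can be pictured as many prefixes that reference one shared object holding the primary and backup paths. The Python model below is conceptual only and does not describe the actual FIB data structures; the PathList class, the prefix values, and the names PE1 and PE2 are illustrative assumptions.

```python
class PathList:
    """Primary and backup next hops shared by many prefixes."""

    def __init__(self, primary: str, backup: str) -> None:
        self.primary = primary
        self.backup = backup
        self.active = primary

    def primary_failed(self) -> None:
        # One write; every prefix that references this object follows it.
        self.active = self.backup


shared = PathList(primary="PE1", backup="PE2")

# 1000 prefixes all reference the same path list object.
fib = {f"10.{i // 256}.{i % 256}.0/24": shared for i in range(1000)}

shared.primary_failed()
print(len(fib))                                  # 1000
print({entry.active for entry in fib.values()})  # {'PE2'}
```

The cost of the switch does not grow with the number of prefixes, which is the property that distinguishes it from prefix-dependent convergence, where each prefix is updated individually.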

Using Prefix-Independent Convergence with Route Reflectors

For traffic from the customer edge router to a remote provider edge router, the BGP local-pref attribute is used to select the primary path (from a primary PE) and the backup path (from the backup PE). Even though the remote provider edge router receives the backup (best external) path from the backup PE, when the backup PE receives the iBGP best path from the primary PE, it withdraws the backup path from the core network. Hence, the primary and backup (best external) paths must be pre-programmed in the network for PIC to work.

When the primary path fails, the delay in convergence is because of the following process that takes place:

  1. The primary PE sends a request to the provider core network for withdrawing the primary path.

  2. The backup PE advertises the backup (best external) path as the new primary (best) path.

  3. The remote PE recalculates its primary paths on receiving the withdrawal request from the primary PE, and the new primary path from the backup PE.

  4. Traffic resumes in the network after all prefixes in the FIB are updated with the new primary path.

Hence, convergence is slow because it depends on prefixes advertised by the PE routers.

By introducing prefix-independent convergence, the following changes take place:

  • Primary and backup paths are pre-programmed in the RIB and FIB.

  • All provider edge routers receive the backup path from the FIB.

  • In the event of primary path failure, the FIB modifies LDIs to include the backup path and instantly diverts traffic along this route.


Note


To use the BGP PIC feature with route reflectors, the provider edge routers must be configured with unique route distinguishers (RDs) within the context of a VRF. Otherwise, the paths from different PEs are considered to belong to the same network, and the route reflector cannot accurately calculate the best backup path.


Backup Path Selection Process

Use the following procedure to identify the best backup path to be programmed in the RIB and FIB.

  1. Use the best path algorithm to identify the best path from the available set of paths for a prefix.

  2. Eliminate the best path.

  3. Eliminate all paths that have the same next hop as the best path.

  4. Rerun the best path algorithm on the remaining set of paths to identify the best backup path.
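
The procedure above can be sketched as a short Python function. Unlike the best–external procedure, internal paths stay eligible here. The rank function passed in is a placeholder for the best path algorithm (an assumption; here it compares only local preference), and the dictionary keys and addresses are illustrative.

```python
def backup_path(paths, rank):
    """Return the best remaining path after removing the best path and its next hop."""
    best = max(paths, key=rank)                                     # step 1
    rest = [p for p in paths if p is not best]                      # step 2
    rest = [p for p in rest if p["next_hop"] != best["next_hop"]]   # step 3
    return max(rest, key=rank, default=None)                        # step 4


paths = [
    {"via": "PE1-a", "next_hop": "1.1.1.1", "local_pref": 200},
    {"via": "PE1-b", "next_hop": "1.1.1.1", "local_pref": 150},
    {"via": "PE2", "next_hop": "2.2.2.2", "local_pref": 100},
]
# Placeholder ranking (assumption): higher local preference wins.
print(backup_path(paths, rank=lambda p: p["local_pref"])["via"])  # PE2
```

The second path through 1.1.1.1 is skipped even though it ranks higher than PE2, because a backup that shares the next hop of the best path would fail together with it.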

Configure BGP PIC in Provider Edge Networks

This section describes the procedure to configure BGP PIC for provider edge networks.

Topology

Consider the topology shown in the following illustration.

Figure 10. Prefix Independent Convergence in Provider Edge Networks

For traffic from the customer edge router CE to the provider edge router PE3, the BGP local-pref attribute is used to select CE-PE1-PE3 as the primary path, and CE-PE2-PE3 as the backup path. PE1-P-PE2 is the best internal path for the provider core network.

Before you Begin

Before you can configure the BGP PIC feature, ensure that you have configured the following:

  1. The loopback and network interfaces as per the topology.

  2. The VRFs for the provider core network.

Configuration

Use the configuration in this section to configure BGP PIC feature for the illustrated topology.

Router PE1

For traffic from Router CE to Router PE3, the eBGP path from Router CE is stored as the primary path on Router PE1.

Configure Router PE1 to install the backup (best external) path advertised by Router PE2, and the period for which the local label must be retained on convergence, as shown.

Router(config)# router bgp 10
Router(config-bgp)# vrf foo
Router(config-bgp-vrf)# address-family ipv4 unicast
Router(config-bgp-vrf-af)# additional-path install
Router(config-bgp-vrf-af)# label-retention 10

Router PE2

Configure Router PE2 to install and advertise the backup CE-PE2 path as the best external path.

Router(config)# router bgp 10
Router(config-bgp)# vrf foo
Router(config-bgp-vrf)# address-family ipv4 unicast
Router(config-bgp-vrf-af)# advertise-best-external label-alloc-mode
Router(config-bgp-vrf-af)# additional-path install

Router PE3

The iBGP path from Router PE1 (CE-PE1) is stored as the primary path on Router PE3. Configure the iBGP backup path CE-PE2 as shown.

Router(config)# router bgp 10
Router(config-bgp)# vrf foo
Router(config-bgp-vrf)# address-family ipv4 unicast
Router(config-bgp-vrf-af)# additional-path install

Verify BGP PIC

Run the following commands on Router PE3 to verify the BGP PIC feature in operation.

  1. Verify the presence of the backup path in the FIB.

    Router# show cef 1.1.1.1/32 detail
    Fri Oct 10 10:24:33.079 UTC
    1.1.1.1/32, version 1, internal 0x40000001 (0xa94c0574) [1], 0x0 (0x0), 0x0
    (0x0)
    Updated Oct 9 16:49:06.795
    Prefix Len 32, traffic index 0, precedence routine (0)
    gateway array (0xa8d9b130) reference count 4, flags 0x80200, source rib
    (3),
    [1 type 3 flags 0x901101 (0xa8ec6b90) ext 0x0 (0x0)]
    LW-LDI[type=0, refc=0, ptr=0x0, sh-ldi=0x0]
    Level 1 - Load distribution: 0
    [0] via 12.24.0.1, recursive
    via 12.24.0.1, 3 dependencies, recursive
    next hop 12.24.0.1 via 12.24.0.1/32
    via 12.24.0.2, 3 dependencies, recursive, backup
    next hop 12.24.0.2 via 12.24.0.2/32
    Load distribution: 0 (refcount 1)
    Hash OK Interface Address
    0 Y MgmtEth0/RP0/CPU0/0 12.24.0.1

  2. Verify the presence of the backup (best external) path for BGP.

    Router# show bgp vrf foo 206.1.1.1/32
    BGP routing table entry for 206.1.1.1/32
    Versions:
    Process bRIB/RIB SendTblVer
    Speaker 6 6
    Local Label: 3
    Paths: (1 available, best #1)
    Advertised to peers (in unique update groups):
    100.100.100.1
    Path #1: Received by speaker 0
    1.1.1.1 from 1.1.1.1 (200.200.200.1)
    Origin incomplete, metric 0, localpref 100, weight 32768, valid,
    internal, best
    2.2.2.2 from 2.2.2.2 (100.100.100.1)
    Origin incomplete, metric 0, localpref 100, weight 32768, valid,
    external, backup, best-external

Configure BGP PIC between Autonomous Systems

This section describes the procedure to configure BGP PIC between autonomous systems.


Note


BGP PIC is supported only for Option A and Option B scenarios. The following section describes a sample configuration for Option B.

Topology

For example, consider the topology shown in the following illustration.

Figure 11. Prefix-Independent Convergence between Autonomous Systems

For traffic from Router PE1 to Router PE2, ASBR1 is the primary router and ASBR2 is the backup router. The ASBR1-ASBR3 eBGP path is the primary path. The ASBR2-ASBR4 eBGP path is the backup path. For traffic from Router PE2 to Router PE1, ASBR3 is the primary router and ASBR4 is the backup router. The ASBR3-ASBR1 eBGP path is the primary path and the ASBR4-ASBR2 eBGP path is the backup path.

Before you Begin

Before you can configure the BGP PIC feature, ensure that you have configured the loopback and network interfaces as per the illustrated topology.

Configuration

Use the configuration in this section to configure BGP PIC feature for the illustrated topology.

Router ASBR1

Configure Router ASBR1 to install the backup (best external) path advertised by Router ASBR2, and the period for which the local label must be retained on convergence, as shown.

Router(config)# router bgp 10
Router(config-bgp)# address-family vpnv4 unicast
Router(config-bgp-af)# additional-path install
Router(config-bgp-af)# label-retention 10

The provided configuration is for traffic from Router PE1 to Router PE2. Similarly, configure Router ASBR3 for traffic from Router PE2 to Router PE1.

Router ASBR2

Configure Router ASBR2 to install and advertise the ASBR2-ASBR4 backup (best external) path, as shown.

Router(config)# router bgp 10
Router(config-bgp)# address-family vpnv4 unicast
Router(config-bgp-af)# advertise-best-external label-alloc-mode
Router(config-bgp-af)# additional-path install

The provided configuration is for traffic from Router PE1 to Router PE2. Similarly, configure Router ASBR4 for traffic from Router PE2 to Router PE1.

Verify BGP PIC

Run the following commands on Router PE2 (for traffic from Router PE1 to Router PE2) or on Router PE1 (for traffic from Router PE2 to Router PE1) to verify the BGP PIC feature in operation.

  1. Verify the presence of the backup path in the FIB.

    Router# show cef 1.1.1.1/32 detail
    
    Fri Oct 10 10:24:33.079 UTC
    1.1.1.1/32, version 1, internal 0x40000001 (0xa94c0574) [1], 0x0 (0x0), 0x0
    (0x0)
    Updated Oct 9 16:49:06.795
    Prefix Len 32, traffic index 0, precedence routine (0)
    gateway array (0xa8d9b130) reference count 4, flags 0x80200, source rib
    (3),
    [1 type 3 flags 0x901101 (0xa8ec6b90) ext 0x0 (0x0)]
    LW-LDI[type=0, refc=0, ptr=0x0, sh-ldi=0x0]
    Level 1 - Load distribution: 0
    [0] via 12.24.0.1, recursive
    via 12.24.0.1, 3 dependencies, recursive
    next hop 12.24.0.1 via 12.24.0.1/32
    via 12.24.0.2, 3 dependencies, recursive, backup
    next hop 12.24.0.2 via 12.24.0.2/32
    Load distribution: 0 (refcount 1)
    Hash OK Interface Address
    0 Y MgmtEth0/RP0/CPU0/0 12.24.0.1
    
  2. Verify the presence of the backup (best external) path for BGP.

    Router# show bgp vrf foo 206.1.1.1/32
    
    BGP routing table entry for 206.1.1.1/32
    Versions:
    Process bRIB/RIB SendTblVer
    Speaker 6 6
    Local Label: 3
    Paths: (1 available, best #1)
    Advertised to peers (in unique update groups):
    100.100.100.1
    Path #1: Received by speaker 0
    1.1.1.1 from 1.1.1.1 (200.200.200.1)
    Origin incomplete, metric 0, localpref 100, weight 32768, valid,
    internal, best
    2.2.2.2 from 2.2.2.2 (100.100.100.1)
    Origin incomplete, metric 0, localpref 100, weight 32768, valid,
    external, backup, best-external
    

Command Line Interface (CLI) Consistency for BGP Commands

The Border Gateway Protocol (BGP) commands use the disable keyword to disable a feature. The inheritance-disable keyword disables the inheritance of the feature properties from the parent level.

BGP Additional Paths

Table 13. Feature History Table

Feature Name

Release Information

Feature Description

Additional path control per neighbor

Release 7.3.15

This feature allows flexible and granular control of the advertisement of additional paths based on the neighbor outbound policy configuration.

This is done by allowing combinations of different path selection procedures to be configured, unlike singular path selection, and by extending the neighbor outbound policy to give finer control over the path types to be advertised.

This feature improves operational efficiency in managing additional paths and reduces the scale of paths in a typical clustered network architecture.

Without this feature, the number of paths can strain memory limits, and control plane convergence issues develop because of the excessive number of paths.

The Border Gateway Protocol (BGP) Additional Paths feature modifies the BGP protocol machinery so that a BGP speaker can send multiple paths for a prefix. This provides path diversity in the network. Add path enables BGP prefix independent convergence (PIC) at the edge routers.

BGP add path enables add path advertisement in an iBGP network and advertises the following types of paths for a prefix:

  • Backup paths—to enable fast convergence and connectivity restoration.

  • Group-best paths—to resolve route oscillation.

  • All paths—to emulate an iBGP full-mesh.

iBGP Multipath Load Sharing

When a Border Gateway Protocol (BGP) speaking router that has no local policy configured receives multiple network layer reachability information (NLRI) from internal BGP (iBGP) for the same destination, the router chooses one iBGP path as the best path. The best path is then installed in the IP routing table of the router. The iBGP Multipath Load Sharing feature enables the BGP speaking router to select multiple iBGP paths as the best paths to a destination. The best paths or multipaths are then installed in the IP routing table of the router.

Configure iBGP Multipath Load Sharing

Perform this task to configure the iBGP Multipath Load Sharing:

SUMMARY STEPS

  1. configure
  2. router bgp as-number
  3. address-family {ipv4 | ipv6} {unicast | multicast}
  4. maximum-paths ibgp number
  5. Use the commit or end command.

DETAILED STEPS


Step 1

configure

Example:

RP/0/RP0/CPU0:router# configure

Enters global configuration mode.

Step 2

router bgp as-number

Example:
Router(config)# router bgp 100

Specifies the autonomous system number and enters the BGP configuration mode, allowing you to configure the BGP routing process.

Step 3

address-family {ipv4 | ipv6} {unicast | multicast}

Example:
Router(config-bgp)# address-family ipv4 multicast

Specifies either the IPv4 or IPv6 address family and enters address family configuration submode.

Step 4

maximum-paths ibgp number

Example:
Router(config-bgp-af)# maximum-paths ibgp 30

Configures the maximum number of iBGP paths for load sharing.

Step 5

Use the commit or end command.

commit —Saves the configuration changes and remains within the configuration session.

end —Prompts user to take one of these actions:
  • Yes — Saves configuration changes and exits the configuration session.

  • No —Exits the configuration session without committing the configuration changes.

  • Cancel —Remains in the configuration session, without committing the configuration changes.


iBGP Multipath Loadsharing Configuration: Example

The following is a sample configuration where 30 paths are used for loadsharing:


router bgp 100
 address-family ipv4 multicast
  maximum-paths ibgp 30
 !
!
end
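
The effect of the maximum-paths ibgp limit can be sketched in Python. This is a simplified model: it treats iBGP paths as multipath candidates when they tie on a caller-supplied rank, whereas the actual multipath eligibility rules compare more attributes. The function name, the rank function, and the sample paths are assumptions for illustration.

```python
def select_multipaths(ibgp_paths, maximum_paths, rank):
    """Return up to maximum_paths iBGP paths that tie for the best rank."""
    if not ibgp_paths:
        return []
    top = max(rank(p) for p in ibgp_paths)
    equal = [p for p in ibgp_paths if rank(p) == top]
    return equal[:maximum_paths]


paths = [
    {"next_hop": "10.0.0.1", "local_pref": 100},
    {"next_hop": "10.0.0.2", "local_pref": 100},
    {"next_hop": "10.0.0.3", "local_pref": 100},
    {"next_hop": "10.0.0.4", "local_pref": 90},
]
rank = lambda p: p["local_pref"]  # placeholder ranking (assumption)

print(len(select_multipaths(paths, 30, rank)))  # 3
print(len(select_multipaths(paths, 2, rank)))   # 2
```

With the limit of 30 from the sample configuration, all three equal paths are installed; a lower limit caps the number of installed paths.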

Accumulated IGP Attribute for BGP

Table 14. Feature History Table

Feature Name

Release Information

Feature Description

Accumulated IGP Attribute for BGP

Release 7.3.2

This feature enables you to implement multiple contiguous BGP Autonomous Systems under a single administration.

You can allow BGP to make its routing decisions based on the IGP metric just as an IGP would do.

Overview of BGP AIGP

The Accumulated IGP (AIGP) attribute for BGP is an optional non-transitive BGP path attribute. IANA assigned the attribute type code for the AIGP attribute. The value field of the AIGP attribute is defined as a set of Type/Length/Value elements (TLVs). The AIGP TLV contains the accumulated IGP metric.

The AIGP feature is required in the network to simulate the current OSPF behavior of computing the distance associated with a path. OSPF or LDP carries the prefix or label information only in the local area. Then, BGP carries the prefix label to all the remote areas by redistributing the routes into BGP at area boundaries. The routes or labels are then advertised using LSPs. The next hop for the route is changed at each ABR to local router which removes the need to leak OSPF routes across area boundaries. The bandwidth available on each of the core links is mapped to OSPF cost, hence it is imperative that BGP carries this cost correctly between each of the PEs. This functionality is achieved by using the AIGP.

Originate Prefixes with AIGP

Origination of routes with the accumulated interior gateway protocol (AIGP) metric is controlled by configuration. AIGP attributes are attached to redistributed routes that satisfy the following conditions.

  • The protocol redistributing the route is enabled for AIGP.

  • The route is an interior gateway protocol (IGP) route redistributed into Border Gateway Protocol (BGP). The value assigned to the AIGP attribute is the value of the IGP next hop to the route or as set by a route-policy.

  • The route is a static route redistributed into BGP. The value assigned is the value of the next hop to the route or as set by a route-policy.

  • The route is imported into BGP through a network statement. The value assigned is the value of the next hop to the route or as set by a route-policy.

Configuration Examples

Originate prefixes with AIGP.


Router(config)# route-policy aip_policy
Router(config-rpl)# set aigp-metric igp-cost
Router(config-rpl)# end-policy
Router(config)# router bgp 100
Router(config-bgp)# address-family ipv4 unicast
Router(config-bgp-af)# redistribute ospf route-policy aip_policy

Running Configuration

route-policy aip_policy
 set aigp-metric igp-cost
end-policy
!
router bgp 100
 address-family ipv4 unicast
  redistribute ospf route-policy aip_policy

Verification

Verify the status of the AIGP attribute.

Router# show bgp 10.0.0.1
Thu Sep 30 21:21:15.279 EDT
BGP routing table entry for 10.0.0.1/32
Versions:
Process bRIB/RIB SendTblVer
Speaker 4694 4694
Last Modified: Sep 30 21:20:09.000 for 00:01:06
Paths: (2 available, best #1)
Not advertised to any peer
Path #1: Received by speaker 0
Not advertised to any peer
Local
192.168.0.1 (metric 2) from 192.168.0.1 (192.168.0.6)
Received Label 24000
Origin IGP, localpref 80, aigp metric 900, valid, internal, best, group-best, labeled-unicast
Received Path ID 1, Local Path ID 1, version 4694
Originator: 192.168.0.6, Cluster list: 192.168.0.1
Total AIGP metric 902 <-- AIGP attribute received. 
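
The values in this output are consistent with the total being the received AIGP metric (900) plus the IGP metric to the next hop (2). The Python sketch below works through that arithmetic and shows how two paths would compare on the total. The function name, the dictionary keys, and the second path are hypothetical, and preferring the lower total is a simplification of how AIGP participates in best path selection.

```python
def total_aigp(aigp_metric: int, igp_metric_to_next_hop: int) -> int:
    """Accumulated metric as seen from this router."""
    return aigp_metric + igp_metric_to_next_hop


# Values from the show bgp output: aigp metric 900, next-hop metric 2.
print(total_aigp(900, 2))  # 902

# Hypothetical second path for comparison.
paths = [
    {"next_hop": "192.168.0.1", "aigp": 900, "igp": 2},
    {"next_hop": "192.168.0.2", "aigp": 850, "igp": 100},
]
preferred = min(paths, key=lambda p: total_aigp(p["aigp"], p["igp"]))
print(preferred["next_hop"])  # 192.168.0.1
```

The second path carries a lower AIGP metric, but its total (950) is higher once the IGP metric to its next hop is added, so the first path wins the comparison.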

Accumulated Interior Gateway Protocol Attribute

The Accumulated Interior Gateway Protocol (AIGP) attribute is an optional non-transitive BGP path attribute. IANA assigned the attribute type code for the AIGP attribute. The value field of the AIGP attribute is defined as a set of Type/Length/Value elements (TLVs). The AIGP TLV contains the accumulated IGP metric.

The AIGP feature is required in the RFC 3107 network to simulate the current OSPF behavior of computing the distance associated with a path. OSPF or LDP carries the prefix or label information only in the local area. Then, BGP carries the prefix or label to all the remote areas by redistributing the routes into BGP at area boundaries. The routes or labels are then advertised using LSPs. The next hop for the route is changed at each ABR to the local router, which removes the need to leak OSPF routes across area boundaries. The bandwidth available on each of the core links is mapped to OSPF cost, hence it is imperative that BGP carries this cost correctly between each of the PEs. This functionality is achieved by using the AIGP.

BGP Accept Own

The BGP Accept Own feature enables handling of self-originated VPN routes, which a BGP speaker receives from a route-reflector (RR). A "self-originated" route is one that was originally advertised by the speaker itself. As per the BGP protocol [RFC4271], a BGP speaker rejects advertisements that were originated by the speaker itself. However, the BGP Accept Own mechanism enables a router to accept the prefixes it has advertised, when they are reflected from a route-reflector that modifies certain attributes of the prefix. A special community called ACCEPT_OWN is attached to the prefix by the route-reflector, which is a signal to the receiving router to bypass the ORIGINATOR_ID and NEXTHOP/MP_REACH_NLRI check. Generally, the BGP speaker detects prefixes that are self-originated through the self-origination check (ORIGINATOR_ID, NEXTHOP/MP_REACH_NLRI) and drops the received updates. However, with the ACCEPT_OWN community present in the update, the BGP speaker handles the route.
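
The decision described above can be sketched in Python. This is a simplified model, not the router implementation: the update is represented as a dictionary with hypothetical keys, the self-origination check is reduced to ORIGINATOR_ID and next hop comparisons, and the accept_own_enabled flag stands for the feature being configured for the neighbor. The community value 0xFFFF0001 is the ACCEPT_OWN value defined in RFC 7611.

```python
ACCEPT_OWN = 0xFFFF0001  # well-known community value defined in RFC 7611


def is_self_originated(update, router_id, local_next_hops):
    """Simplified self-origination check on ORIGINATOR_ID and next hop."""
    return (update.get("originator_id") == router_id
            or update.get("next_hop") in local_next_hops)


def accept_update(update, router_id, local_next_hops, accept_own_enabled):
    """Drop self-originated updates unless ACCEPT_OWN allows bypassing the check."""
    if not is_self_originated(update, router_id, local_next_hops):
        return True
    return accept_own_enabled and ACCEPT_OWN in update.get("communities", ())


# Illustrative values only.
reflected = {
    "originator_id": "192.0.2.11",
    "next_hop": "192.0.2.11",
    "communities": {ACCEPT_OWN},
}
print(accept_update(reflected, "192.0.2.11", {"192.0.2.11"}, True))   # True
print(accept_update(reflected, "192.0.2.11", {"192.0.2.11"}, False))  # False
```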

One of the applications of BGP Accept Own is auto-configuration of extranets within MPLS VPN networks. In an extranet configuration, routes present in one VRF are imported into another VRF on the same PE. Normally, the extranet mechanism requires that either the import-rt or the import policy of the extranet VRFs be modified to control import of the prefixes from another VRF. However, with the Accept Own feature, the route-reflector can assert that control without the need for any configuration change on the PE. This way, the Accept Own feature provides a centralized mechanism for administering control of route imports between different VRFs.

BGP Accept Own is supported only for VPNv4 and VPNv6 address families in neighbor configuration mode.
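
The self-origination check and its Accept Own bypass can be sketched in a few lines of Python. This is an illustrative model only, not the router implementation: the VpnUpdate record, the function names, and the addresses are hypothetical, and the only value taken from a specification is the ACCEPT_OWN community code point defined in RFC 7611.

```python
from dataclasses import dataclass, field

# Well-known ACCEPT_OWN community value from RFC 7611.
ACCEPT_OWN = 0xFFFF0001


@dataclass
class VpnUpdate:
    """Hypothetical, simplified view of a received VPN route."""
    originator_id: str
    next_hop: str
    communities: set = field(default_factory=set)


def is_self_originated(update, router_id, local_addresses):
    """Self-origination check: ORIGINATOR_ID or next hop points back at us."""
    return update.originator_id == router_id or update.next_hop in local_addresses


def should_accept(update, router_id, local_addresses, accept_own_enabled):
    """Return True if the update is kept, False if it is dropped."""
    if not is_self_originated(update, router_id, local_addresses):
        return True  # ordinary route, normal processing applies
    # Self-originated: keep only when accept-own is configured for the
    # neighbor and the route-reflector attached the ACCEPT_OWN community.
    return accept_own_enabled and ACCEPT_OWN in update.communities
```

In this model, a reflected route without the community is dropped, and a route that carries the community is kept only when accept-own is enabled for the neighbor.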

Route-Reflector Handling Accept Own Community and RTs

The InterAS route-reflector (InterAS-RR) originates the ACCEPT_OWN community. To minimize the propagation of prefixes with the ACCEPT_OWN community attribute, the InterAS-RR attaches the attribute using an outbound route-policy towards the originating PE. The InterAS-RR adds the ACCEPT_OWN community and modifies the set of RTs, both through route-policy, before sending the new Accept Own route to the attached PEs, including the originator, through intervening RRs.

Accept Own Configuration Example

In this configuration example:

  • PE11 is configured with Customer VRF and Service VRF.

  • OSPF is used as the IGP.

  • VPNv4 unicast and VPNv6 unicast address families are enabled between the PE and RR neighbors and IPv4 and IPv6 are enabled between PE and CE neighbors.

The Accept Own configuration works as follows:
  1. CE1 originates prefix X.

  2. Prefix X is installed in customer VRF as (RD1:X).

  3. Prefix X is advertised to IntraAS-RR11 as (RD1:X, RT1).

  4. IntraAS-RR11 advertises X to InterAS-RR1 as (RD1:X, RT1).

  5. InterAS-RR1 attaches RT2 to prefix X on the inbound and ACCEPT_OWN community on the outbound and advertises prefix X to IntraAS-RR31.

  6. IntraAS-RR31 advertises X to PE11.

  7. PE11 installs X in Service VRF as (RD2:X, RT1, RT2, ACCEPT_OWN).

Remote PE: Handling of Accept Own Routes

Remote PEs (PEs other than the originator PE) perform bestpath calculation among all the comparable routes. The bestpath algorithm has been modified to prefer an Accept Own path over a non-Accept Own path. This comparison occurs immediately before the IGP metric comparison. If the remote PE receives an Accept Own path from route-reflector 1 and a non-Accept Own path from route-reflector 2, and the paths are otherwise identical, the Accept Own path is preferred. The import operates on the Accept Own path.
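
The position of the Accept Own comparison relative to the IGP metric step can be illustrated with a small Python sketch. The Path record and the better function are hypothetical, and the sketch assumes the two paths have already tied on every earlier bestpath step.

```python
from dataclasses import dataclass


@dataclass
class Path:
    """Hypothetical path record; only the two fields relevant here."""
    source: str
    accept_own: bool
    igp_metric: int


def better(a, b):
    """Pick between two paths that tied on every earlier bestpath step."""
    # The Accept Own comparison comes first ...
    if a.accept_own != b.accept_own:
        return a if a.accept_own else b
    # ... and only then the IGP metric (lower wins; ties keep the first path).
    return a if a.igp_metric <= b.igp_metric else b
```

With this ordering, an Accept Own path from route-reflector 1 wins over a non-Accept Own path from route-reflector 2 even when the latter has the lower IGP metric.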

Configuring BGP Accept Own

Perform this task to configure BGP Accept Own:

SUMMARY STEPS

  1. configure
  2. router bgp as-number
  3. neighbor ip-address
  4. remote-as as-number
  5. update-source type interface-path-id
  6. address-family {vpnv4 unicast | vpnv6 unicast }
  7. accept-own [inheritance-disable ]

DETAILED STEPS

  Command or Action Purpose

Step 1

configure

Example:

RP/0/RP0/CPU0:router# configure

Enters global configuration mode.

Step 2

router bgp as-number

Example:
Router(config)#router bgp 100

Specifies the autonomous system number and enters the BGP configuration mode, allowing you to configure the BGP routing process.

Step 3

neighbor ip-address

Example:
Router(config-bgp)#neighbor 10.1.2.3

Places the router in neighbor configuration mode for BGP routing and configures the neighbor IP address as a BGP peer.

Step 4

remote-as as-number

Example:
Router(config-bgp-nbr)#remote-as 100

Assigns a remote autonomous system number to the neighbor.

Step 5

update-source type interface-path-id

Example:
Router(config-bgp-nbr)#update-source Loopback0

Allows sessions to use the primary IP address from a specific interface as the local address when forming a session with a neighbor.

Step 6

address-family {vpnv4 unicast | vpnv6 unicast }

Example:
Router(config-bgp-nbr)#address-family vpnv6 unicast

Specifies the address family as VPNv4 or VPNv6 and enters neighbor address family configuration mode.

Step 7

accept-own [inheritance-disable ]

Example:
Router(config-bgp-nbr-af)#accept-own

Enables handling of self-originated VPN routes that contain the ACCEPT_OWN community.

Use the inheritance-disable keyword to disable the accept-own configuration and to prevent inheritance of accept-own from a parent configuration.

BGP Link-State

BGP Link-State (LS) is an Address Family Identifier (AFI) and Sub-address Family Identifier (SAFI) originally defined to carry interior gateway protocol (IGP) link-state information through BGP. The BGP Network Layer Reachability Information (NLRI) encoding format for BGP-LS and a new BGP Path Attribute called the BGP-LS attribute are defined in RFC 7752. The identifying key of each Link-State object, namely a node, link, or prefix, is encoded in the NLRI, and the properties of the object are encoded in the BGP-LS attribute.


Note


IGPs do not use BGP LS data from remote peers. BGP does not download the received BGP LS data to any other component on the router.

An example of a BGP-LS application is the Segment Routing Path Computation Element (SR-PCE). The SR-PCE can learn the SR capabilities of the nodes in the topology and the mapping of SR segments to those nodes. This can enable the SR-PCE to perform path computations based on SR-TE and to steer traffic on paths different from the underlying IGP-based distributed best-path computation.

The following figure shows a typical deployment scenario. In each IGP area, one or more nodes (BGP speakers) are configured with BGP-LS. These BGP speakers form an iBGP mesh by connecting to one or more route-reflectors. This way, all BGP speakers (specifically the route-reflectors) obtain Link-State information from all IGP areas (and from other ASes from eBGP peers).

Exchange Link State Information with BGP Neighbor

The following example shows how to exchange link-state information with a BGP neighbor:


Router# configure
Router(config)# router bgp 1
Router(config-bgp)# neighbor 10.0.0.2
Router(config-bgp-nbr)# remote-as 1
Router(config-bgp-nbr)# address-family link-state link-state
Router(config-bgp-nbr-af)# exit

IGP Link-State Database Distribution

A given BGP node may have connections to multiple, independent routing domains. IGP link-state database distribution into BGP-LS is supported for both OSPF and IS-IS protocols in order to distribute this information on to controllers or applications that desire to build paths spanning or including these multiple domains.

To distribute OSPFv2 link-state data using BGP-LS, use the distribute link-state command in router configuration mode.


Router# configure
Router(config)# router ospf 100
Router(config-ospf)# distribute link-state instance-id 32

Usage Guidelines and Limitations

  • BGP-LS supports IS-IS and OSPFv2.

  • The identifier field of BGP-LS (referred to as the Instance-ID) identifies the IGP routing domain where the NLRI belongs. The NLRIs representing link-state objects (nodes, links, or prefixes) from the same IGP routing instance must use the same Instance-ID value.

  • When there is only a single protocol instance in the network where BGP-LS is operational, we recommend configuring the Instance-ID value to 0.

  • Assign consistent BGP-LS Instance-ID values on all BGP-LS Producers within a given IGP domain.

  • NLRIs with different Instance-ID values are considered to be from different IGP routing instances.

  • Unique Instance-ID values must be assigned to routing protocol instances operating in different IGP domains. This allows the BGP-LS Consumer (for example, SR-PCE) to build an accurate segregated multi-domain topology based on the Instance-ID values, even when the topology is advertised via BGP-LS by multiple BGP-LS Producers in the network.

  • If the BGP-LS Instance-ID configuration guidelines are not followed, a BGP-LS Consumer may see duplicate link-state objects for the same node, link, or prefix when there are multiple BGP-LS Producers deployed. This may also result in the BGP-LS Consumers getting an inaccurate network-wide topology.
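
The Instance-ID guidelines above amount to two rules: one value per IGP domain, and no value shared between domains. The following Python sketch checks a planned assignment against both rules. The tuple layout, the producer names, and the domain labels are hypothetical and serve only to illustrate the check.

```python
from collections import defaultdict


def check_instance_ids(producers):
    """Validate (producer, igp_domain, instance_id) tuples.

    Returns a list of human-readable problems; an empty list means the
    assignment follows the guidelines.
    """
    ids_by_domain = defaultdict(set)
    domains_by_id = defaultdict(set)
    for _producer, domain, instance_id in producers:
        ids_by_domain[domain].add(instance_id)
        domains_by_id[instance_id].add(domain)

    problems = []
    for domain, ids in sorted(ids_by_domain.items()):
        if len(ids) > 1:  # producers in one domain disagree
            problems.append(f"domain {domain}: inconsistent Instance-IDs {sorted(ids)}")
    for instance_id, domains in sorted(domains_by_id.items()):
        if len(domains) > 1:  # one value used by several domains
            problems.append(f"Instance-ID {instance_id}: reused by domains {sorted(domains)}")
    return problems
```

A check of this kind could be run against a deployment plan before the producers are configured, so that a BGP-LS Consumer does not see duplicate or merged link-state objects.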

Configuring BGP Link-state

To exchange BGP link-state (LS) information with a BGP neighbor, perform these steps:

Procedure

Step 1

configure

Example:

RP/0/RP0/CPU0:router# configure

Enters global configuration mode.

Step 2

router bgp as-number

Example:

Router(config)# router bgp 100

Specifies the BGP AS number and enters the BGP configuration mode, allowing you to configure the BGP routing process.

Step 3

neighbor ip-address

Example:

Router(config-bgp)# neighbor 10.0.0.2

Places the router in neighbor configuration mode for BGP routing and configures the neighbor IP address as a BGP peer.

Step 4

remote-as as-number

Example:

Router(config-bgp-nbr)# remote-as 1

Assigns a remote autonomous system number to the neighbor.

Step 5

address-family link-state link-state

Example:

Router(config-bgp-nbr)# address-family link-state link-state

Distributes BGP link-state information to the specified neighbor.

Step 6

Use the commit or end command.

commit —Saves the configuration changes and remains within the configuration session.

end —Prompts user to take one of these actions:
  • Yes — Saves configuration changes and exits the configuration session.

  • No —Exits the configuration session without committing the configuration changes.

  • Cancel —Remains in the configuration session, without committing the configuration changes.


Configuring Domain Distinguisher

To configure the domain distinguisher, a unique four-octet identifier, perform these steps:

Procedure

Step 1

configure

Example:

RP/0/RP0/CPU0:router# configure

Enters global configuration mode.

Step 2

router bgp as-number

Example:

Router(config)# router bgp 100

Specifies the BGP AS number and enters the BGP configuration mode, allowing you to configure the BGP routing process.

Step 3

address-family link-state link-state

Example:

Router(config-bgp)# address-family link-state link-state

Enters address-family link-state configuration mode.

Step 4

domain-distinguisher unique-id

Example:

Router(config-bgp-af)# domain-distinguisher 1234

Configures the unique four-octet identifier. The range is from 1 to 4294967295.

Step 5

Use the commit or end command.

commit —Saves the configuration changes and remains within the configuration session.

end —Prompts user to take one of these actions:
  • Yes — Saves configuration changes and exits the configuration session.

  • No —Exits the configuration session without committing the configuration changes.

  • Cancel —Remains in the configuration session, without committing the configuration changes.


BGP Permanent Network

The BGP permanent network feature supports static routing through BGP. BGP routes to IPv4 or IPv6 destinations (identified by a route-policy) can be administratively created and selectively advertised to BGP peers. These routes remain in the routing table until they are administratively removed. A permanent network defines a set of prefixes as permanent, so that only one BGP advertisement or withdrawal is sent upstream for that set of prefixes. For each network in the prefix-set, a BGP permanent path is created and treated as less preferred than the other BGP paths received from its peer. The BGP permanent path is downloaded into RIB when it is the best-path.

The permanent-network command in global address family configuration mode uses a route-policy to identify the set of prefixes (networks) for which permanent paths are to be configured. The advertise permanent-network command in neighbor address-family configuration mode identifies the peers to whom the permanent paths must be advertised. The permanent paths are always advertised to peers that have the advertise permanent-network configuration, even if a different best-path is available. The permanent path is not advertised to peers that are not configured to receive permanent paths.

The permanent network feature supports only prefixes in IPv4 unicast and IPv6 unicast address-families under the default Virtual Routing and Forwarding (VRF).
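
The advertisement rules for permanent paths can be summarized in a short Python sketch. It is a simplified model under stated assumptions: the dictionary representation of a path and the function name are hypothetical, and the fallback for peers without the advertise permanent-network configuration (send the best non-permanent path, or nothing) is an interpretation that this guide does not spell out.

```python
def path_to_advertise(paths, peer_has_advertise_permanent):
    """Choose the path sent to one peer for one prefix.

    `paths` is ordered best-first; each entry is a dict with a boolean
    "permanent" key (a hypothetical representation).
    """
    if peer_has_advertise_permanent:
        # Configured peers always get the permanent path when one exists,
        # even if a different best path is available.
        for path in paths:
            if path["permanent"]:
                return path
    # Other peers never receive the permanent path; fall back to the best
    # remaining path. (Assumption: the guide does not describe this case.)
    for path in paths:
        if not path["permanent"]:
            return path
    return None
```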

Restrictions

These restrictions apply while configuring the permanent network:

  • Permanent network prefixes must be specified by the route-policy on the global address family.

  • You must configure the permanent network with route-policy in global address family configuration mode and then configure it on the neighbor address family configuration mode.

  • When removing the permanent network configuration, remove the configuration in the neighbor address family configuration mode and then remove it from the global address family configuration mode.

Configuring BGP Permanent Network

Perform this task to configure BGP permanent network. You must configure at least one route-policy to identify the set of prefixes (networks) for which the permanent network (path) is to be configured.

Procedure

Step 1

configure

Example:

RP/0/RP0/CPU0:router# configure

Enters global configuration mode.

Step 2

prefix-set prefix-set-name

Example:

Router(config)# prefix-set PERMANENT-NETWORK-IPv4
Router(config-pfx)# 1.1.1.1/32,
Router(config-pfx)# 2.2.2.2/32,
Router(config-pfx)# 3.3.3.3/32
Router(config-pfx)# end-set

Enters prefix set configuration mode and defines a prefix set for contiguous and non-contiguous set of bits.

Step 3

exit

Example:

Router(config-pfx)# exit

Exits prefix set configuration mode and enters global configuration mode.

Step 4

route-policy route-policy-name

Example:

Router(config)# route-policy POLICY-PERMANENT-NETWORK-IPv4
Router(config-rpl)# if destination in PERMANENT-NETWORK-IPv4 then
Router(config-rpl)# pass
Router(config-rpl)# endif 

Creates a route policy and enters route policy configuration mode, where you can define the route policy.

Step 5

end-policy

Example:

Router(config-rpl)# end-policy

Ends the definition of a route policy and exits route policy configuration mode.

Step 6

router bgp as-number

Example:

Router(config)# router bgp 100

Specifies the autonomous system number and enters the BGP configuration mode.

Step 7

address-family { ipv4 | ipv6 } unicast

Example:

Router(config-bgp)# address-family ipv4 unicast

Specifies either an IPv4 or IPv6 address family unicast and enters address family configuration submode.

Step 8

permanent-network route-policy route-policy-name

Example:

Router(config-bgp-af)# permanent-network route-policy POLICY-PERMANENT-NETWORK-IPv4

Configures the permanent network (path) for the set of prefixes as defined in the route-policy.

Step 9

Use the commit or end command.

commit —Saves the configuration changes and remains within the configuration session.

end —Prompts user to take one of these actions:
  • Yes — Saves configuration changes and exits the configuration session.

  • No —Exits the configuration session without committing the configuration changes.

  • Cancel —Remains in the configuration session, without committing the configuration changes.

Step 10

show bgp {ipv4 | ipv6} unicast prefix-set

Example:

Router# show bgp ipv4 unicast

(Optional) Displays whether the prefix-set is a permanent network in BGP.


Advertise Permanent Network

Perform this task to identify the peers to whom the permanent paths must be advertised.

Procedure

Step 1

configure

Example:

RP/0/RP0/CPU0:router# configure

Enters global configuration mode.

Step 2

router bgp as-number

Example:
Router(config)# router bgp 100

Specifies the autonomous system number and enters the BGP configuration mode.

Step 3

neighbor ip-address

Example:

Router(config-bgp)# neighbor 10.255.255.254

Places the router in neighbor configuration mode for BGP routing and configures the neighbor IP address as a BGP peer.

Step 4

remote-as as-number

Example:
Router(config-bgp-nbr)# remote-as 4713

Assigns the neighbor a remote autonomous system number.

Step 5

address-family { ipv4 | ipv6 } unicast

Example:

Router(config-bgp-nbr)# address-family ipv4 unicast

Specifies either an IPv4 or IPv6 address family unicast and enters address family configuration submode.

Step 6

advertise permanent-network

Example:

Router(config-bgp-nbr-af)# advertise permanent-network

Specifies the peers to whom the permanent network (path) is advertised.

Step 7

Use the commit or end command.

commit —Saves the configuration changes and remains within the configuration session.

end —Prompts user to take one of these actions:
  • Yes — Saves configuration changes and exits the configuration session.

  • No —Exits the configuration session without committing the configuration changes.

  • Cancel —Remains in the configuration session, without committing the configuration changes.

Step 8

show bgp {ipv4 | ipv6} unicast neighbor ip-address

Example:

Router# show bgp ipv4 unicast neighbor 10.255.255.254

(Optional) Displays whether the neighbor is capable of receiving BGP permanent networks.


BGP-RIB Feedback Mechanism for Update Generation

The Border Gateway Protocol-Routing Information Base (BGP-RIB) feedback mechanism for update generation feature avoids premature route advertisements and subsequent packet loss in a network. This mechanism ensures that routes are installed locally, before they are advertised to a neighbor.

BGP waits for feedback from RIB indicating that the routes that BGP installed in RIB are installed in the forwarding information base (FIB) before BGP sends out updates to the neighbors. RIB uses the BCDL feedback mechanism to determine which version of the routes has been consumed by FIB, and updates BGP with that version. BGP sends out updates only for routes whose versions are at or below the version that FIB has installed. This selective update ensures that BGP does not send premature updates that attract traffic before the data plane is programmed after a router reload, LC OIR, or a link flap where an alternate path is made available.

To configure BGP to wait for feedback from RIB indicating that the routes that BGP installed in RIB are installed in FIB, before BGP sends out updates to neighbors, use the update wait-install command in router address-family IPv4 or router address-family VPNv4 configuration mode. The show bgp, show bgp neighbors, and show bgp process performance-statistics commands display the information from the update wait-install configuration.
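
The version-based gating described in this section can be pictured with a minimal Python sketch. The data layout (a mapping from prefix to the table version at which BGP installed the route) and the function name are assumptions made for illustration; they do not reflect internal data structures.

```python
def updates_ready_to_send(pending, fib_installed_version):
    """Split pending routes into (send_now, keep_waiting).

    `pending` maps prefix -> the table version at which BGP installed the
    route in RIB. Only routes at or below the version that FIB has
    acknowledged are eligible for advertisement.
    """
    send_now = sorted(p for p, v in pending.items() if v <= fib_installed_version)
    keep_waiting = sorted(p for p, v in pending.items() if v > fib_installed_version)
    return send_now, keep_waiting
```

For example, with routes at versions 5, 7, and 9 and a FIB acknowledgment for version 7, only the first two routes are advertised; the third waits for the next acknowledgment.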

Delay BGP Route Advertisements

Table 15. Feature History Table

Feature Name: Delay BGP Route Advertisements

Release Information: Release 7.5.3

Feature Description: You can now prevent traffic loss caused by premature advertising of BGP routes. You achieve this by delaying BGP start-up on the router until the Routing Information Base (RIB) is synchronized with the Forwarding Information Base (FIB). This delays BGP update generation and prevents traffic loss in the network.

You can configure a minimum delay of 1 second and a maximum delay of 600 seconds.

This feature introduces the update wait-install delay startup command.

BGP waits for feedback from the RIB indicating that the RIB is ready to forward traffic. Once the RIB is ready, BGP sends the route updates to the BGP neighbors and peer-groups. Advertising routes before the RIB is synchronized with the FIB results in traffic loss. To avoid this problem, the router must delay the BGP start-up process, which delays BGP update generation so that no traffic is lost.

To accomplish this, configure the update wait-install delay startup command to delay the generation of BGP updates. The show bgp process command displays the delay of the BGP process update since the last router reload.

You can configure a delay from a minimum of 1 second to a maximum of 600 seconds.

Restrictions

This feature is applicable for the following Address Family Identifiers (AFIs):

  • IPv4 unicast

  • IPv6 unicast

  • VPNv4 unicast

  • VPNv6 unicast

Configuration

  1. Enter the IOS XR configuration mode.

    Router# configure
  2. Specify the BGP Autonomous System Number (AS Number).

    Router(config)# router bgp 1
  3. Specify the address family (IPv4, IPv6, VPNv4, or VPNv6).

    Router(config-bgp)# address-family {ipv4 | ipv6 | vpnv4 | vpnv6} unicast
    For example,
    Router(config-bgp)# address-family ipv4 unicast
  4. Schedule the delay of the BGP process to prevent routes from being advertised to peers until RIB is synchronized.

    Router(config-bgp-af)# update wait-install delay startup (time in seconds) 
    For example,
    Router(config-bgp-af)# update wait-install delay startup 10
  5. Commit the changes.

    Router(config-bgp-af)# commit

Note


The delay time ranges from 1 second to 600 seconds.


Running Configuration

configure
router bgp 1
 address-family ipv4 unicast
  update wait-install delay startup 10
!

Verification Example

The following command displays the delay of the BGP process update:

Router# show running-config router bgp 1
router bgp 1
address-family ipv4 unicast
update wait-install delay startup 10


What to do next

You can then run the show bgp process command. The Update wait-install enabled section of the show bgp process command output displays the delay of the BGP process update since the last router reload.

Router# show bgp process
Wed Aug 24 00:40:48.649 PDT

BGP Process Information:
BGP is operating in STANDALONE mode
Autonomous System number format: ASPLAIN
Autonomous System: 100
Router ID: 192.168.0.2 (manually configured)
Default Cluster ID: 192.168.0.2
Active Cluster IDs:  192.168.0.2
------------------------------
------------------------------
Update wait-install enabled:
  ack request 2, ack rcvd 2, slow ack 0
  startup delay 10 secs

--More--

Default-originate Under VRF

BGP advertises default routes to provider-edge neighbors, based on per-VRF configuration.

User-Defined Martian Address Check

When you configure BGP on a Cisco 8000 Series Router, you can prevent routers from accessing sites with certain IP address prefixes. Routers drop packets from these IP addresses, which are known as Martian addresses. However, you can enable routers with BGP IPv4 address-family or BGP IPv6 address-family configuration to access these sites by configuring the default-martian-check disable command. The Martian addresses are the following IPv4 and IPv6 prefixes:

  • IPv4 address prefixes

    • 0.0.0.0/8

    • 127.0.0.0/8

    • 224.0.0.0/4

  • IPv6 address prefixes

    • ::

    • ::0002 - ::ffff

    • ::ffff:a.b.c.d

    • fe80:xxxx

    • ffxx:xxxx
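
As an illustration of the list above, the following Python sketch tests whether a prefix falls inside one of the default Martian blocks, using the standard ipaddress module. The CIDR forms chosen for the IPv6 entries (for example, fe80::/16 for fe80:xxxx and ff00::/8 for ffxx:xxxx) are a literal reading of the list and are assumptions; the function name is hypothetical, and the sketch does not describe how the router performs the check.

```python
import ipaddress

# Default Martian prefixes as listed above. The CIDR forms for the IPv6
# entries are interpretations of the list, not values taken from the guide.
V4_MARTIANS = [ipaddress.ip_network(p) for p in ("0.0.0.0/8", "127.0.0.0/8", "224.0.0.0/4")]
V6_MARTIANS = [ipaddress.ip_network(p) for p in (
    "::/128",         # ::
    "::ffff:0:0/96",  # ::ffff:a.b.c.d
    "fe80::/16",      # fe80:xxxx
    "ff00::/8",       # ffxx:xxxx
)]
V6_LOW_RANGE = (0x2, 0xFFFF)  # ::0002 - ::ffff


def is_default_martian(prefix):
    """Return True if `prefix` falls entirely inside a listed Martian block."""
    net = ipaddress.ip_network(prefix, strict=False)
    if net.version == 4:
        return any(net.subnet_of(m) for m in V4_MARTIANS)
    if any(net.subnet_of(m) for m in V6_MARTIANS):
        return True
    # The ::0002 - ::ffff entry is a range, so compare first and last address.
    first, last = int(net.network_address), int(net.broadcast_address)
    return V6_LOW_RANGE[0] <= first and last <= V6_LOW_RANGE[1]
```

The subnet test is deliberate: a default route such as 0.0.0.0/0 or ::/0 covers the Martian blocks but is not contained in them, so it is not flagged.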

Restrictions

Routers running OSPF or IS-IS cannot access these sites, even when the default-martian-check disable command is configured.

Configuration Example

To allow routes from Martian addresses, use the following steps:

  1. Enter BGP IPv4 or BGP IPv6 address-family configuration mode.

  2. Configure the address-family modifier as a unicast address.

  3. Disable the Martian address check.

Configuration

/* Enter BGP IPv4 or BGP IPv6 address-family configuration mode. */
Router# configure
Router(config)# router bgp 100

/* Configure the address-family modifier as unicast. */
Router(config-bgp)# address-family ipv4 unicast

/* Disable the martian address check. */
Router(config-bgp-af)# default-martian-check disable
Router(config-bgp-af)# commit

Verification

To verify if you have enabled or disabled a Martian address check, you can use the show bgp ipv4 unicast command or show bgp ipv6 unicast command:

Router# show bgp ipv6 unicast
BGP router identifier 2.2.2.1, local AS number 1
BGP generic scan interval 60 secs
Non-stop routing is enabled
BGP table state: Active
Table ID: 0xe0800000 RD version: 29
BGP main routing table version 29
BGP NSR Initial initsync version 4 (Reached)
BGP NSR/ISSU Sync-Group versions 0/0
Dampening enabled
BGP scan interval 60 secs

Status codes: s suppressed, d damped, h history, * valid, > best
i - internal, r RIB-failure, S stale, N Nexthop-discard
Origin codes: i - IGP, e - EGP, ? - incomplete
Network                   Next Hop           Metric   LocPrf     Weight Path
*>i::/0               1:1:1:1:1:1:1:1         100        0            i
* i192:1::/112        1.1.1.1                   0      100            0 ?
*>i                   1:1:1:1:1:1:1:1           0      100            0 ?
* iff11:1123::/64     1.1.1.1                   2      100            0 ?
*>i                   1:1:1:1:1:1:1:1           2      100            0 ?

BGP Multipath Enhancements

  • Overwriting of next-hop calculation for multipath prefixes is not allowed. The next-hop-unchanged multipath command disables overwriting of next-hop calculation for multipath prefixes.

  • The ability to ignore as-path onwards while computing multipath is added. The bgp multipath as-path ignore onwards command ignores as-path onwards while computing multipath.

When multiple connected routers start ignoring as-path onwards while computing multipath, it causes routing loops. Therefore, you should not configure the bgp multipath as-path ignore onwards command on routers that can form a loop.

Figure 12. Topology to illustrate formation of loops

Consider three routers R1, R2, and R3 in different autonomous systems (AS-1, AS-2, and AS-3). The routers are connected with each other. R1 announces a prefix to R2 and R3. Both R2 and R3 are configured with multipath and with the bgp multipath as-path ignore onwards command. Because R2 ignores the AS path when computing multipath, it treats the path through R3 as equal to its direct path to R1 and sends part of its traffic to R3. Similarly, R3 sends part of its traffic to R2. This creates a forwarding loop between R3 and R2. Therefore, to avoid such forwarding loops, you should not configure the bgp multipath as-path ignore onwards command on connected routers.
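
The loop in this topology can be reproduced with a small Python model. The sketch is a simplification: it treats the AS path as the only attribute that differs between paths and models the as-path ignore behavior as skipping the AS-path comparison entirely. The function names and data layout are hypothetical.

```python
def multipath_next_hops(paths, ignore_as_path):
    """Return the set of next hops a router load-shares across.

    `paths` maps next hop -> AS path (list of AS numbers). Every other
    bestpath attribute is assumed equal.
    """
    if ignore_as_path:
        return set(paths)  # AS path is not compared, so every path qualifies
    shortest = min(len(as_path) for as_path in paths.values())
    return {nh for nh, as_path in paths.items() if len(as_path) == shortest}


def forms_loop(ignore_as_path):
    """Check whether R2 and R3 end up forwarding to each other."""
    r2_paths = {"R1": [1], "R3": [3, 1]}  # direct path, and path learned via R3
    r3_paths = {"R1": [1], "R2": [2, 1]}  # direct path, and path learned via R2
    r2_hops = multipath_next_hops(r2_paths, ignore_as_path)
    r3_hops = multipath_next_hops(r3_paths, ignore_as_path)
    return "R3" in r2_hops and "R2" in r3_hops
```

When the AS path is compared, each router uses only its direct path to R1. When the AS path is ignored on both routers, each one includes the other as a next hop, which is the forwarding loop described above.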

Overview of BGP Monitoring Protocol

The BGP Monitoring Protocol (BMP) feature enables monitoring of BGP speakers (called BMP clients). You can configure a device to function as a BMP server that monitors one or several BMP clients, each of which has several active peer sessions configured. You can also configure a BMP client to connect to one or more BMP servers. The BMP feature enables configuration of multiple BMP servers (configured as primary servers) that function actively and independently of each other to monitor BMP clients simultaneously.

BMP provides access to the Adjacent Routing Information Base, Incoming (Adj-RIB-In) table of a peer on an ongoing basis and a periodic dump of certain statistics that the monitoring station can use for further analysis. BMP provides a pre-policy view of the Adj-RIB-In table of a peer.

There can be several BMP servers configured globally across all the BGP instances. The configured BMP servers are common across multiple speaker instances, and each BGP peer in an instance can be configured for monitoring by all or a subset of the BMP servers, giving an 'any-to-any' map between BGP peers and BMP servers from the point of view of a BGP speaker. If a BMP server is configured before any of the BGP peers come up, monitoring starts as soon as the BGP peers come up. A BMP server configuration can be removed only when there are no BGP peers configured to be monitored by that particular BMP server.

Sessions between BMP clients and BMP servers operate over plain TCP (no encryption/encapsulation). If a TCP session with the BMP server is not established, the client retries to connect every 7 seconds.

The BMP server does not send any messages to its clients (BGP speakers). The message flow is in one direction only, from BGP speakers to the BMP servers.

A maximum of eight BMP servers can be configured on the router. Each BMP server is specified by a server ID, and parameters such as the IP address and port number are configurable. Upon successful configuration of a BMP server with host and port details, the BGP speaker attempts to connect to the BMP server. Once the TCP connection is set up, an Initiation message is sent as the first message.

The bmp server command enables the user to configure multiple—independent and asynchronous—BMP server connections.

All neighbors for a BGP speaker need not necessarily be BMP clients. BMP clients are the ones that have direct TCP connection with a BMP server. Each of these BGP speakers can have many BGP neighbors or peers. Under a BGP speaker, if any of its neighbors are configured for BMP monitoring, only that particular peer router's messages are sent to BMP servers.

The session connection to the BMP server is attempted after an initial delay at the BMP client. This initial delay can be configured. If the initial delay is not configured, the default connection delay of 7 seconds is used. Configuring the initial delay becomes significant when the states of multiple BMP servers toggle close together and the refresh delay is small, because this can result in redundant route refreshes being generated, causing considerable network traffic and load on the device. Having different initial delays can reduce the load spike on the network and router.

After the initial delay, TCP connections to the BMP servers are attempted. Once the server connections are up, the speaker checks whether any peers are enabled for monitoring. Once a BGP peer that is already being monitored is in the “ESTAB” state, the speaker sends a “peer-up” message for that peer to the BMP server. After the BGP peer receives a route-refresh request, the neighbor sends the updates. This route refresh is initiated based on a delay configured for each BMP server, called the route refresh delay. When there are multiple neighbors to be monitored, each neighbor is assigned a refresh delay based on the BMP server it is enabled for. Once all the BGP neighbors have sent the updates in response to the refresh requests, the tables are up to date in the BMP server. If a neighbor establishes a connection after BMP monitoring has begun, it does not require a route-refresh request. All routes received from that neighbor are sent to the BMP servers.


Note


In the case of BMP Pre Inbound Policy Route Monitoring, when a new BMP server comes up, route refresh requests are sent to the peer router by the BGP speaker. However, in the case of BMP Post Inbound Policy Route Monitoring, route refresh requests are not sent to the peer routers when the new BMP server comes up because the BMP table is used for update generation.


It is advantageous to batch up refresh requests to BGP peers, if several BMP servers are activated in quick succession. Use the bmp server initial-refresh-delay command to configure a delay in triggering the refresh mechanism when the first BMP server comes up. If other BMP servers come online within this time-frame, only one set of refresh requests is sent to the BGP peers. You can also configure the bmp server initial-refresh-delay skip command to skip all refresh requests from BGP speakers and just monitor all incoming messages from the peers.
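
The batching effect of the initial refresh delay can be illustrated with a short Python sketch. It assumes a simple model in which the first BMP server to come up opens a delay window, every server that comes up inside the window shares one set of refresh requests, and a server that comes up after the window opens a new one. The function name and the timing model are assumptions, not a description of the router implementation.

```python
def count_refresh_rounds(server_up_times, initial_refresh_delay):
    """Count how many sets of refresh requests go to the BGP peers.

    `server_up_times` are the seconds at which BMP servers come up. A server
    that comes up while a delay window is open shares that window's refresh.
    """
    rounds = 0
    window_end = None
    for up_time in sorted(server_up_times):
        if window_end is None or up_time > window_end:
            rounds += 1  # first server of a new window
            window_end = up_time + initial_refresh_delay
    return rounds
```

Under this model, three servers coming up at 0, 3, and 8 seconds trigger a single set of refresh requests with a 10-second delay, but three sets with a 2-second delay.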

In a client-server configuration, it is recommended that the resource load of the devices be kept minimal and adding excessive network traffic must be avoided. In the BMP configuration, you can configure various delay timers on the BMP server to avoid flapping during connection between the server and client.

Adj-RIB-In Post-Policy View for L3VPN Address Families

Table 16. Feature History Table

Feature Name: Adj-RIB-In Post-Policy View for L3VPN Address Families

Release: Release 7.5.4

Description: You can now monitor BGP events and collect BGP route information and statistics for L3VPN address families after policy filters have been applied.

This is made possible because this feature enables the BGP Monitoring Protocol (BMP) to allow a BGP router to advertise the BGP Adj-RIB-In post-policy for L3VPN address families.

This feature introduces these changes:

  • CLI: This feature modifies the following commands:

    • show bgp bmp

    • route-monitoring inbound post-policy

  • YANG Data Model: New XPaths for

The BGP Monitoring Protocol (BMP), defined in RFC 7854, is a protocol to monitor BGP events as well as BGP route information and statistics. Using this protocol, a BMP collector can monitor various routing information bases within a BGP speaker such as Adj-RIB-In (Pre-Policy and Post-Policy), Local RIB and Adj-RIB-Out (Pre-Policy and Post-Policy). This provides comprehensive insights into real-time and historical operation of a BGP network which can be used for route monitoring, routing analytics, and traffic engineering analytics. BMP can additionally send information on peer state change events, including why a peer went down in the case of a BGP event.

The Adj-RIB-In pre-policy (also referred to as Inbound pre-policy) conveys to a BMP receiver all unprocessed routing information that has been advertised to the local BGP speaker by its peers before any inbound policy has been applied. The Adj-RIB-In post-policy (also referred to as Inbound post-policy) conveys to a BMP receiver all routing information after policy filters and/or modifications (such as addition or deletion of BGP attributes) have been applied.
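The difference between the two inbound views can be illustrated with a small Python sketch. Everything here is hypothetical: the `Route` class, the `inbound_policy` function, and the prefixes and community value are invented for illustration and do not correspond to any router API.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Route:
    prefix: str
    local_pref: int = 100
    communities: frozenset = frozenset()


def inbound_policy(route: Route) -> Optional[Route]:
    """Hypothetical inbound policy: filter one prefix, modify an attribute on tagged routes."""
    if route.prefix == "192.0.2.0/24":
        return None  # filtered out by policy
    if "65000:100" in route.communities:
        return replace(route, local_pref=200)  # attribute modification
    return route


received = [
    Route("192.0.2.0/24"),
    Route("198.51.100.0/24", communities=frozenset({"65000:100"})),
    Route("203.0.113.0/24"),
]

# Pre-policy view: everything the peer advertised, untouched.
adj_rib_in_pre = list(received)
# Post-policy view: what remains after filtering and attribute changes.
adj_rib_in_post = [r for r in map(inbound_policy, received) if r is not None]

print(len(adj_rib_in_pre), len(adj_rib_in_post))
```

A collector that receives both views can compare them to see which routes a policy dropped or changed.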

BMP provides access to the Adjacent Routing Information Base - Inbound (Adj-RIB-In) table of a peer on an ongoing basis and statistics that the monitoring station can use for further analysis. BMP allows a BGP router to advertise the pre-policy or post-policy BGP Adj-RIB-In from the specific BGP peers to a monitoring station.

The BGP Adj-RIB-In post-policy (inbound post-policy) view for L3VPN traffic shows the routing information that a BGP speaker receives from a peer after a BGP input policy is applied, and exports that route information to the BMP server. The policy instructs the router to inspect routes, filter them, and potentially modify their attributes as they are accepted from a peer, advertised to a peer, or redistributed from one routing protocol to another.

To enable the Adj-RIB-In post-policy (inbound post-policy) for L3VPN address families, you must configure the route-monitoring inbound post-policy command.

In addition to the existing RIB views available for monitoring (see Overview of BGP Monitoring Protocol), Cisco IOS XR Release 7.5.4 adds the following address families in the Adj-RIB-In Post-Policy view for monitoring an L3VPN BGP network:

  • Default VRF

    • VPNv4 Unicast

    • VPNv6 Unicast

  • Non-Default VRF

    • IPv4 Unicast

    • IPv6 Unicast

Configuration

Configure the route-monitoring inbound post-policy command to enable the Adj-RIB-In post-policy (inbound post-policy) view by performing the following actions:
Router# config
Router(config)#bmp server all
Router(config-bgp-bmp)#route-monitoring inbound post-policy 
Router(config-bgp-bmp-rmon)#commit

Running Configuration


bmp server all
 route-monitoring inbound post-policy
 !
!

Verification

Verify whether the Adj-RIB-In post-policy (inbound post-policy) configuration is done by running the show bgp bmp server <server ID> command.

Router# show bgp bmp server 1 
Tue Nov 29 19:02:27.837 IST
BMP server 1
Host 12.1.2.1 Port 16001
Connected for 05:51:09
Last Disconnect event received : 00:00:00
Precedence:  internet
BGP neighbors: 7
VRF: - (0x60000000)
Update Source: - (-)
Update Source Vrf ID: 0x0
Update Mode                       : In-Post-Policy
  In-Post-Policy
   Advertisement interval         : 15 secs
   Scanner interval               : 60 secs
Flapping Delay                    : 300 secs
Initial Delay                     : 0 secs
Initial Refresh Delay             : 1 secs
Initial Refresh Spread            : 1 secs
Stats Reporting Period            : 0 secs
Queue Route Mon Msg buffer limit  : 133693 KB (Current Server Up Count: 2)
Queue Route Mon Msg buffer usage  : 0 B
Queue write pulse sent            : Nov 29 13:13:15.484, Nov 29 13:11:53.478 (all)
Queue write pulse received        : Nov 29 13:13:15.484
Update Generation in Progress     : No
Reset Walk in Progress            : No
------More----

You can then configure the following commands:

  • bmp advertisement-interval to set the minimum interval between the sending of BMP routing updates.

  • bmp scan-time to configure scanning intervals of BMP-speaking networking devices.

Local-RIB view for IP and L3VPN Address Families

Table 17. Feature History Table

Feature Name

Release

Description

Local-RIB view for IP and L3VPN Address Families

Release 7.5.4

You can now monitor BGP events and collect BGP best path information and statistics for IP and L3VPN address families after policy filters have been applied to the received routing information.

This is made possible because this feature enables BMP to allow a BGP router to advertise the BGP Local-RIB for IP and L3VPN address families.

Operators may wish to validate the impact of policies applied to the Adj-RIB-In by analysing the final decision made by the router when installing into the Loc-RIB.

This feature introduces these changes:

  • CLI: Modifies the show bgp bmp command.

  • YANG Data Model: New XPaths for

The Local-RIB (Loc-RIB) contains the routes that are received from the BGP peers, selected by the local BGP speaker's Decision Process, and considered valid by it.

For example, the post-policy Adj-RIB-In (inbound post-policy) may contain hundreds of thousands of routes per peer, but only a few of those routes are selected and installed in the Loc-RIB after best-path selection.

A monitoring application that needs to correlate flow records to Loc-RIB entries must collect and monitor the routes that are actually selected and used. The Loc-RIB includes all selected routes received from BGP peers in addition to locally originated routes. It also contains the address family, prefixes, and attributes of those routes.
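As a rough illustration of how many post-policy paths reduce to one Loc-RIB entry per prefix, consider the Python sketch below. It assumes a heavily simplified decision process (highest local preference, then shortest AS path); the real BGP best-path algorithm has many more steps, and the `Path` type and `build_loc_rib` function are hypothetical names.

```python
from typing import NamedTuple


class Path(NamedTuple):
    prefix: str
    peer: str
    local_pref: int
    as_path_len: int


def build_loc_rib(post_policy_paths):
    """Keep one best path per prefix using a simplified two-step comparison."""
    best = {}
    for p in post_policy_paths:
        cur = best.get(p.prefix)
        # Smaller key wins: higher local-pref first, then shorter AS path.
        key = (-p.local_pref, p.as_path_len)
        if cur is None or key < (-cur.local_pref, cur.as_path_len):
            best[p.prefix] = p
    return best


paths = [
    Path("198.51.100.0/24", "peer-a", 100, 3),
    Path("198.51.100.0/24", "peer-b", 200, 5),
    Path("203.0.113.0/24", "peer-a", 100, 4),
    Path("203.0.113.0/24", "peer-c", 100, 2),
]
loc_rib = build_loc_rib(paths)
for prefix, path in sorted(loc_rib.items()):
    print(prefix, "via", path.peer)
```

Four candidate paths collapse to two Loc-RIB entries, which is the set a flow-correlation application would want to monitor.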

Starting from Release 7.5.4, the Loc-RIB view (best-path only) is available for monitoring for the following address families:

  • Default VRF

    • IPv4 Unicast

    • IPv4 Labeled Unicast

    • IPv6 Unicast

    • IPv6 Labeled Unicast

    • VPNv4 Unicast

    • VPNv6 Unicast

  • Non-Default VRF

    • IPv4 Unicast

    • IPv6 Unicast

This feature complies with RFC 9069.

Configuration

Configure the route-monitoring local-rib command to enable the local-RIB view by performing the following actions:
Router# config
Router(config)#bmp server all
Router(config-bgp-bmp)#route-monitoring local-rib 
Router(config-bgp-bmp-rmon)#commit

Running Configuration


bmp server all
 route-monitoring local-rib
 !
!

Verification

Verify whether the Local RIB (Loc-RIB) configuration is done by running the show bgp bmp server <server ID> command.
Router#show bgp bmp server 1
BMP server 1
Host 12.1.2.1 Port 16001
Connected for 06:00:39
Last Disconnect event received : 00:00:00
Precedence:  internet
BGP neighbors: 10
VRF: - (0x60000000)
Update Source: - (-)
Update Source Vrf ID: 0x0
Update Mode                       : In-Post-Policy, Local-RIB
  In-Post-Policy
   Advertisement interval         : 15 secs
   Scanner interval               : 60 secs
  Local-RIB
   Advertisement interval         : 15 secs
   Scanner interval
     Global                       : 60 secs
     IPv4 Unicast                 : 60 secs
     VPNv4 Unicast                : 60 secs
     IPv6 Unicast                 : 60 secs
     VPNv6 Unicast                : 60 secs
Flapping Delay                    : 300 secs
Initial Delay                     : 0 secs
Initial Refresh Delay             : 1 secs
Initial Refresh Spread            : 1 secs
Stats Reporting Period            : 0 secs
Queue Route Mon Msg buffer limit  : 133693 KB (Current Server Up Count: 2)
Queue Route Mon Msg buffer usage  : 0 B
Queue write pulse sent            : Nov 29 19:08:32.826, Nov 29 13:11:53.478 (all)
Queue write pulse received        : Nov 29 19:08:32.826
Update Generation in Progress     : No
Reset Walk in Progress            : No
----More-----

You can then configure the bmp advertisement-interval command to set the minimum interval between the sending of BMP routing updates.

BGP—Multiple Cluster IDs

The BGP—Multiple Cluster IDs feature allows an iBGP neighbor (usually a route reflector) to have multiple cluster IDs: a global cluster ID and additional cluster IDs that are assigned to clients (neighbors). Prior to the introduction of this feature, a device could have a single, global cluster ID.

When a network administrator configures per-neighbor cluster IDs:

  • The loop prevention mechanism based on a CLUSTER_LIST is automatically modified to take into account multiple cluster IDs.

  • A network administrator can disable client-to-client route reflection based on cluster ID.
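The following Python sketch shows one plausible reading of the modified loop-prevention check. It assumes that a received route is rejected when its CLUSTER_LIST contains the global cluster ID or any per-neighbor cluster ID configured on the device; the function names and data layout are hypothetical and are not taken from the router software.

```python
def cluster_id_for(neighbor, global_cluster_id, neighbor_cluster_ids):
    """Return the cluster ID that applies to a client, falling back to the global ID."""
    return neighbor_cluster_ids.get(neighbor, global_cluster_id)


def accept_reflected_route(cluster_list, global_cluster_id, neighbor_cluster_ids):
    """Reject the route if any locally configured cluster ID already appears in CLUSTER_LIST."""
    local_ids = {global_cluster_id} | set(neighbor_cluster_ids.values())
    return not any(cid in local_ids for cid in cluster_list)


# Hypothetical configuration: one global ID plus two per-neighbor IDs.
global_id = "1.1.1.1"
per_neighbor = {"10.0.0.2": "2.2.2.2", "10.0.0.3": "3.3.3.3"}

print(cluster_id_for("10.0.0.2", global_id, per_neighbor))
print(cluster_id_for("10.0.0.9", global_id, per_neighbor))
print(accept_reflected_route(["9.9.9.9"], global_id, per_neighbor))
print(accept_reflected_route(["9.9.9.9", "3.3.3.3"], global_id, per_neighbor))
```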

Restriction

The BGP Multiple Cluster-IDs feature works only in the default VRF.

BGP Flowspec Overview

Table 18. Feature History Table

Feature Name

Release Information

Feature Description

Scaling BGP Flowspec to 6000 Rules

Release 7.5.2

You can now assign 6000 BGP Flowspec rules for Cisco 8800 series routers and 3000 BGP Flowspec rules for Cisco 8100 and 8200 series routers. This feature thus provides enhanced mitigation against Distributed Denial-of-Service (DDoS) attacks.

In earlier releases, you could assign 2000 BGP Flowspec rules. These are one-dimensional scale numbers; the numbers vary based on other intersecting features like access control lists (ACL), Quality of Service (QoS), and Local Packet Transport Services (LPTS).

The BGP flow specification (flowspec) feature allows you to rapidly deploy and propagate filtering and policing functionality among many BGP peer routers to mitigate the effects of a distributed denial-of-service (DDoS) attack over your network.

The BGP Flowspec feature allows you to construct instructions that match a particular flow on IPv4 and IPv6 source addresses, IPv4 and IPv6 destination addresses, Layer 4 parameters, and packet specifics such as length, fragment, destination port, and source port. The instructions also specify the actions to take, such as dropping the traffic, policing it at a defined rate, or redirecting it, and are distributed through a BGP update. In the BGP update, the flowspec matching criteria are represented by BGP Network Layer Reachability Information (NLRI) and the actions are represented by BGP extended communities.

You can use the BGP Flowspec feature to mitigate DDoS attacks. When a DDoS attack occurs on a particular host inside a network, you can send a flowspec update to the border routers so that the attack traffic can be policed, dropped, or even redirected elsewhere, for example, to an appliance that cleans the traffic by filtering out the bad traffic and forwarding only the good traffic toward the affected host.

Once flowspecs have been received by a router and programmed in applicable line cards, any active L3 ports on those line cards start processing ingress traffic according to flowspec rules.

The BGP Flowspec feature cannot coexist with MAP-E and PBR on a given interface. If you configure BGP Flowspec with PBR, the router does not display any error or system message. The router ignores the BGP Flowspec configuration and the feature will not function.

Flow Specifications

A flow specification is an n-tuple consisting of several matching criteria that can be applied to IP traffic. A given IP packet matches the defined flow if it matches all the specified criteria.
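The "match all criteria" rule can be sketched in Python as a list of predicates that must all hold for a packet. The packet dictionary layout and the helper names (`dest_in`, `protocol_is`, `dest_port_is`, `matches_flow`) are assumptions made for this illustration only.

```python
import ipaddress


def dest_in(prefix):
    """Criterion: destination address falls inside the given prefix."""
    net = ipaddress.ip_network(prefix)
    return lambda pkt: ipaddress.ip_address(pkt["dst"]) in net


def protocol_is(proto):
    """Criterion: IP protocol number equals the given value."""
    return lambda pkt: pkt["proto"] == proto


def dest_port_is(port):
    """Criterion: destination port equals the given value (absent port never matches)."""
    return lambda pkt: pkt.get("dport") == port


def matches_flow(packet, flow):
    """A packet matches the flow only if every criterion in the tuple matches."""
    return all(criterion(packet) for criterion in flow)


flow = [dest_in("10.0.1.0/24"), protocol_is(6), dest_port_is(25)]

print(matches_flow({"dst": "10.0.1.7", "proto": 6, "dport": 25}, flow))
print(matches_flow({"dst": "10.0.1.7", "proto": 17, "dport": 25}, flow))
```

The second packet fails because one criterion (the protocol) does not match, even though the destination and port do.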

Every flow-spec route is effectively a rule, consisting of a matching part (encoded in the NLRI field) and an action part (encoded as a BGP extended community). The BGP flowspec rules are converted internally to equivalent C3PL policy representing match and action parameters. The match and action support can vary based on underlying platform hardware capabilities. Sections Supported Matching Criteria and Actions and Traffic Filtering Actions provide information on the supported match (tuple definitions) and action parameters.


Note


  • Cisco 8800 series routers support up to 6,000 flowspec rules.

  • Cisco 8200 and 8100 series routers support up to 3,000 flowspec rules.


Supported Matching Criteria and Actions

Table 19. Feature History Table

Feature Name

Release Name

Description

Additional BGP FlowSpec Actions for Enhanced Security

Release 7.3.3

This release introduces additional BGP FlowSpec actions for enhanced security against distributed denial-of-service (DDoS) attacks.

  • Redirect Nexthop VRF only: Redirects the traffic to a different Autonomous System Number (ASN).

  • Rate Limit and Redirect IPv4 or IPv6 Nexthop: Redirects the traffic to the indicated nexthop IPv4 or IPv6 address. Policer rate regulates the traffic.

  • Rate Limit and Redirect Nexthop VRF: Redirects the traffic to the next hop IPv4 address through a VRF. Policer rate regulates the traffic. This action is supported only on Q200 Silicon One ASIC.

Table 20. Feature History Table

Feature Name

Release Name

Description

BGP FlowSpec NLRI types

Release 7.3.15

A BGP flow specification consists of several matching criteria encoded in the NLRI that is applied to IP traffic. A given IP packet must match all the specified criteria. Network layer reachability information (NLRI) exchanges routing information and matching criteria between BGP peers, indicating how to reach the destination.

The following NLRI types are supported:

  • Type 7: IPv4 or IPv6 ICMP type

  • Type 8: IPv4 or IPv6 ICMP code

  • Type 9: IPv4 TCP flags (2 bytes include reserved bits)

  • Type 10: IPv4 Packet length

  • Type 11: IPv4 or IPv6 DSCP

  • Type 12: IPv4 fragmentation bits

BGP FlowSpec Actions

Release 7.3.15

This feature provides information on the actions that can be associated with a BGP flow. The traffic filtering flow specification is applied based on the specified rule. The following extended community values can be used to specify a particular action:

  • Set DSCP

  • Redirect IPv4 or IPv6 next hop

Overview

A flow specification NLRI type may include several components such as destination prefix, source prefix, protocol, ports, and so on. This NLRI is treated as an opaque bit string prefix by BGP. Each bit string identifies a key to a database entry with which a set of attributes can be associated. This NLRI information is encoded using MP_REACH_NLRI and MP_UNREACH_NLRI attributes. Whenever the corresponding application does not require Next-Hop information, this is encoded as a 0-octet length Next Hop in the MP_REACH_NLRI attribute, and ignored. The NLRI field of the MP_REACH_NLRI and MP_UNREACH_NLRI is encoded as a 1- or 2-octet NLRI length field followed by a variable-length NLRI value. The NLRI length is expressed in octets.
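The 1- or 2-octet NLRI length field can be illustrated with a short Python sketch. It follows the rule in the BGP flow specification RFCs (RFC 5575 and its successor RFC 8955): lengths below 240 (0xf0) use a single octet, and larger lengths use two octets with the most significant nibble of the first octet set to all ones. The function names are hypothetical.

```python
def encode_nlri_length(length):
    """Encode a flowspec NLRI length (in octets) as a 1- or 2-octet field."""
    if not 0 <= length <= 0x0FFF:
        raise ValueError("NLRI length must fit in 12 bits")
    if length < 240:
        return bytes([length])                    # single-octet form
    return (0xF000 | length).to_bytes(2, "big")   # extended form: 0xfnnn


def decode_nlri_length(data):
    """Return (length, number_of_octets_consumed) from the start of an NLRI."""
    first = data[0]
    if first < 0xF0:
        return first, 1
    return ((first & 0x0F) << 8) | data[1], 2


print(encode_nlri_length(11).hex())
print(encode_nlri_length(240).hex())
print(decode_nlri_length(bytes.fromhex("f0f0")))
```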

The flow specification NLRI type consists of several optional sub-components. A specific packet is considered to match the flow specification when it matches the intersection (AND) of all the components present in the specification. The following are the supported component types or tuples that you can define:

BGP Flowspec NLRI type

QoS Match Fields

Description and Syntax Construction

Value Input Method

Type 1

IPv4 or IPv6 destination address

Defines the destination prefix to match. Prefixes are encoded in the BGP UPDATE messages as a length in bits followed by enough octets to contain the prefix information.

Encoding: <type (1 octet), prefix length (1 octet), prefix>

Syntax:

match destination-address {ipv4 | ipv6} address/mask length

Prefix length

Type 2

IPv4 or IPv6 source address

Defines the source prefix to match.

Encoding: <type (1 octet), prefix-length (1 octet), prefix>

Syntax:

match source-address {ipv4 | ipv6} address/mask length

Prefix length

Type 3

IPv4 or IPv6 protocol

Contains a set of {operator, value} pairs that are used to match the IP protocol value byte in IP packets.

Encoding: <type (1 octet), [op, value]+>

Syntax:

match protocol {protocol-value | [min-value - max-value]}

Single value

Note

 

Multi-value range is not supported

Type 4

IPv4 or IPv6 source or destination port

Defines a list of {operation, value} pairs that matches source or destination TCP or UDP ports. Values are encoded as 1- or 2-byte quantities. Port, source port, and destination port components evaluate to FALSE if the IP protocol field of the packet has a value other than TCP or UDP, if the packet is fragmented and this is not the first fragment, or if the system is unable to locate the transport header.

Encoding: <type (1 octet), [op, value]+>

Syntax:

match source-port {source-port-value | min-value - max-value}

match destination-port {destination-port-value | min-value - max-value}

Multi-value range

Type 5

IPv4 or IPv6 destination port

Defines a list of {operation, value} pairs used to match the destination port of a TCP or UDP packet. Values are encoded as 1- or 2-byte quantities.

Encoding: <type (1 octet), [op, value]+>

Syntax:

match destination-port {destination-port-value | [min-value - max-value]}

Multi-value range

Type 6

IPv4 or IPv6 Source port

Defines a list of {operation, value} pairs used to match the source port of a TCP or UDP packet. Values are encoded as 1- or 2-byte quantities.

Encoding: <type (1 octet), [op, value]+>

Syntax:

match source-port {source-port-value | [min-value - max-value]}

Multi-value range

Type 7

IPv4 or IPv6 ICMP type

Defines a list of {operation, value} pairs used to match the type field of an ICMP packet. Values are encoded using a single byte. The ICMP type and code specifiers evaluate to FALSE whenever the protocol value is not ICMP.

Encoding: <type (1 octet), [op, value]+>

Syntax:

match {ipv4 | ipv6} icmp-type {value | min-value - max-value}

Single value

Note

 

Multi-value range is not supported

Type 8

IPv4 or IPv6 ICMP code

Defines a list of {operation, value} pairs used to match the code field of an ICMP packet. Values are encoded using a single byte.

Encoding: <type (1 octet), [op, value]+>

Syntax:

match {ipv4 | ipv6} icmp-code {value | min-value - max-value}

Single value

Note

 

Multi-value range is not supported

Type 9

IPv4 or IPv6 TCP flags (2 bytes include reserved bits)

Note

 

Reserved and NS bit not supported

Bitmask values can be encoded as a 1- or 2-byte bitmask. When a single byte is specified, it matches byte 13 of the TCP header, which contains bits 8 through 15 of the 4th 32-bit word. When a 2-byte encoding is used, it matches bytes 12 and 13 of the TCP header with the data offset field having a "don't care" value. As with port specifier, this component evaluates to FALSE for packets that are not TCP packets. This type uses the bitmask operand format, which differs from the numeric operator format in the lower nibble.

Encoding: <type (1 octet), [op, bitmask]+>

Syntax:

match tcp-flag value bit-mask mask_value

Bit mask

Type 10

IPv4 or IPv6 Packet length

Starting from Release 7.10.1, the IPv6 packet length is supported.

Note

 
  • Reserved and NS bit not supported

  • IPv4 or IPv6 support is available for the packets that are not the first fragment packets.

Match on the total IP packet length (excluding Layer 2, but including IP header). Values are encoded using 1- or 2-byte quantities.

Encoding: <type (1 octet), [op, value]+>

Syntax:

match packet length {packet-length-value | min-value - max-value}

Multi-value range

Type 11

IPv4 or IPv6 DSCP

Defines a list of {operation, value} pairs used to match the 6-bit DSCP field. Values are encoded using a single byte, where the two most significant bits are zero and the six least significant bits contain the DSCP value.

Note

 

The DSCP does not contain Flowspec statistics.

Encoding: <type (1 octet), [op, value]+>

Syntax:

match dscp {dscp-value | min-value - max-value}

Multi-value range

Type 12

IPv4 Fragmentation bits

Note

 

IPv4 support is available for the packets that are not the first fragment packets.

IPv6 BGP flowspec does not support Type 12 NLRI.

Identifies a fragment-type as the match criterion for a class map.

Encoding: <type (1 octet), [op, bitmask]+>

Syntax:

match fragment type [is-fragment]

Bit mask
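The "Encoding" entries in the table above can be made concrete with a Python sketch for IPv4 components. It assumes the numeric operator byte layout from the BGP flow specification RFCs (end-of-list bit 0x80, a 2-bit value-length field in bits 5 and 4, and the equality bit 0x01) and handles only equality matches; the function names are hypothetical. IPv6 prefix components carry an additional offset octet and are not covered here.

```python
import ipaddress


def encode_prefix_component(comp_type, prefix):
    """Encode <type, prefix length, prefix> for an IPv4 prefix (Type 1 or Type 2)."""
    net = ipaddress.IPv4Network(prefix)
    nbytes = (net.prefixlen + 7) // 8  # just enough octets to hold the prefix bits
    return bytes([comp_type, net.prefixlen]) + net.network_address.packed[:nbytes]


def encode_numeric_eq(comp_type, values):
    """Encode <type, [op, value]+> where each value is matched with 'equals'."""
    out = bytearray([comp_type])
    for i, v in enumerate(values):
        size = 1 if v < 256 else 2
        op = 0x01                                # eq bit
        op |= (0 if size == 1 else 1) << 4       # len field: value size is 1 << len
        if i == len(values) - 1:
            op |= 0x80                           # end-of-list bit on the last pair
        out.append(op)
        out += v.to_bytes(size, "big")
    return bytes(out)


# Match packets to 10.0.1.0/24, protocol TCP (6), port 25.
nlri_value = (
    encode_prefix_component(1, "10.0.1.0/24")
    + encode_numeric_eq(3, [6])
    + encode_numeric_eq(4, [25])
)
print(nlri_value.hex(" "))
```

Because the components are concatenated, the receiver evaluates them as a logical AND, which is how a packet comes to match "all the specified criteria".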

In a given flowspec rule, 2-tuple action combinations can be specified without restrictions. However, mixing address families between matching criteria and actions is not allowed. For example, IPv4 matches cannot be combined with IPv6 actions and vice versa.

Limitations for BGP FlowSpec

These limitations apply to the BGP FlowSpec feature.

  • BGP Flowspec statistics are supported when there is a policer rate limit.

    The policer action scale is limited to a maximum of 128 per slice.

  • BGP Flowspec statistics are supported for the Redirect action only when a policer is attached. BGP Flowspec statistics are not supported for the Redirect action alone.

  • VRF to default VRF redirect is not supported.

BGP Flowspec Redirect from Global VRF to L3VPN and Segment Routing Policy

Table 21. Feature History Table

Feature Name

Release Information

Feature Description

BGP Flowspec Redirect from Global VRF to L3VPN and Segment Routing policy

Release 24.2.11

You can now enhance network routing efficiency by enabling BGP Flowspec to dynamically redirect traffic to the VRF table, where the traffic searches for the destination IP address either within the L3VPN or via a segment routing policy. This improvement boosts routing adaptability and service continuity. Additionally, the protocol extension equips you to execute precise traffic actions, optimizing network performance and security.

The BGP Flowspec Redirect from Global VRF to L3VPN and Segment Routing policy feature allows traffic to be dynamically redirected to the VRF table, where the traffic searches for the destination IP address either within the L3VPN or via a segment routing policy. This improvement boosts routing adaptability and service continuity. Additionally, the protocol extension equips you to execute precise traffic actions, optimizing network performance and security.

BGP Flowspec Topology
Figure 13. Forwarding based on SR-Policy
Figure 14. Forwarding based on MPLS

Network traffic arrives from an interface VRFA. However, this interface is not specifically designated for customers. The incoming packet has a destination IP address of 10.0.0.1/8, and this IP address is not available in the global routing table; it exists only in the VRF routing table. As a result, the packet is dropped.

To address this issue, we apply certain criteria. We ensure that the IP address lookup for such packets occurs within the customer VRF rather than the global VRF. By doing so, we direct the packet to the correct routing context, allowing successful forwarding.

The forwarding process in this scenario typically happens through L3VPN or SR-Policy, which provide an effective mechanism for managing routing and forwarding in complex network environments. For this kind of route, traffic is matched against certain criteria so that the lookup for the IP address happens in the customer VRF and not the global VRF.

The BGP flowspec server is where the rule is initially programmed. These rules are then propagated to the BGP flowspec neighbor through BGP Network Layer Reachability Information (NLRI). Once the client receives the rule, it is stored in the database. When the rule becomes active, it starts taking effect. Incoming packets are matched against this active rule.

If a packet meets the criteria specified by the rule, it is redirected to the appropriate VRF instance. In the redirect action, it is crucial to specify the correct route target. This ensures that the packet is correctly routed to the intended VRF.
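The role of the route target in the redirect action can be sketched in Python. This is a simplified model that assumes the target VRF is the one whose import route-target list contains the route target carried in the redirect action; the VRF names, the dictionary layout, and the alphabetical tie-break are all hypothetical choices made for this illustration.

```python
def select_redirect_vrf(route_target, vrf_import_rts):
    """Return the name of a VRF that imports the given route target, or None.

    vrf_import_rts maps a VRF name to the set of route targets it imports.
    When several VRFs import the same route target, this sketch arbitrarily
    picks the first name in alphabetical order.
    """
    candidates = sorted(
        vrf for vrf, rts in vrf_import_rts.items() if route_target in rts
    )
    return candidates[0] if candidates else None


vrfs = {
    "vpn1": {"1:1"},
    "vpn2": {"2:2", "3:3"},
}

print(select_redirect_vrf("1:1", vrfs))   # a VRF imports this route target
print(select_redirect_vrf("9:9", vrfs))   # no VRF imports this route target
```

If the route target in the rule does not correspond to any VRF, there is no routing context to redirect into, which is why the value must be specified correctly.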

The BGP flowspec server plays a pivotal role in defining and enforcing traffic rules, allowing for fine-grained control over packet handling within the network.

Configure BGP Flowspec Redirect from Global VRF
Configuration Examples

Perform the steps given below on the BGP Flowspec controller to enable BGP Flowspec redirect from global VRF.

  • Create a Class Map - Creates a class map that matches packets to the class whose name you specify, and enters the class map configuration mode.

  • Build a Policy Map - Creates a policy map that can be attached to a flowspec to specify a service policy, and enters the policy map configuration mode.

  • Link the Class Map to the Policy Map - In the policy map configuration mode, the `class type traffic` command associates the previously configured traffic class with the policy map.

  • Define Policy Actions - Defines the actions that you want to perform.


/* Create a Class Map */
Router# config
Router(config)# class-map type traffic match-all ipv4_CM1
Router(config-cmap)# match destination-address ipv4 10.0.0.1 255.255.255.0
Router(config-cmap)# end-class-map
Router(config)# exit
Router(config)# class-map type traffic match-all ipv6_CM1
Router(config-cmap)# match destination-address ipv6 2000:0:0:1::/64
Router(config-cmap)# end-class-map
Router(config)# exit 

/* Build a Policy Map */
Router(config)# policy-map type pbr ipv4_PM1
Router(config-pmap)# class type traffic ipv4_CM1 
Router(config-pmap-c)# redirect nexthop route-target 1:1
Router(config-pmap-c)# exit 
Router(config-pmap)# class type traffic class-default 
Router(config-pmap-c)# end-policy-map
Router(config)# exit

/* Link the Class Map to the Policy Map */
Router(config)# policy-map type pbr ipv6_PM1
Router(config-pmap)# class type traffic ipv6_CM1 
Router(config-pmap-c)# redirect nexthop route-target 1:1
Router(config-pmap-c)# exit  
Router(config-pmap)# class type traffic class-default 
Router(config-pmap-c)# end-policy-map
Router(config)# exit 

/* Define Policy actions */
Router(config)# flowspec
Router(config-flowspec)# address-family ipv4
Router(config-flowspec-af)# service-policy type pbr ipv4_PM1
Router(config-flowspec-af)# exit
Router(config-flowspec)# address-family ipv6
Router(config-flowspec-af)# service-policy type pbr ipv6_PM1


For information on how to configure SR-Policy, refer to the chapter "Configure SRv6 Traffic Engineering" in the "Segment Routing Configuration Guide for Cisco 8000 Series Routers".

Similarly, for information on how to configure L3VPN, refer to the chapter "Implementing MPLS Layer 3 VPNs" in the "L3VPN Configuration Guide for Cisco 8000 Series Routers".

Running Configuration
class-map type traffic match-all ipv4_CM1
 match destination-address ipv4 10.0.0.1 255.255.255.0
 end-class-map
! 
class-map type traffic match-all ipv6_CM1
 match destination-address ipv6 2000:0:0:1::/64
 end-class-map
! 

policy-map type pbr ipv4_PM1
 class type traffic ipv4_CM1 
  redirect nexthop route-target 1:1
   
  ! 
 ! 
 class type traffic class-default 
 ! 
 end-policy-map
! 
policy-map type pbr ipv6_PM1
 class type traffic ipv6_CM1 
  redirect nexthop route-target 1:1
   
  ! 
 ! 
 class type traffic class-default 
 ! 
 end-policy-map
! 

flowspec
 address-family ipv4
  service-policy type pbr ipv4_PM1
 address-family ipv6
  service-policy type pbr ipv6_PM1


flowspec config on PE1:

flowspec
 local-install interface-all

Verification

Note


BGP Flowspec statistics will not be available until a policer action is configured.


Verify the number of BGP Flowspec entries present in the OFA object.

Router# show ofa objects pbr object-count location 0/RP0/CPU0 
Table [PBR] has 4200 entries in DB
Table [PBR] had 4200 as highest count @ Tue Feb  6 20:08:04 2024 

Verify the BGP Flowspec rules and statistics.

Router# show flowspec ipv4 detail 
Thu Jan 25 09:10:14.965 UTC

AFI: IPv4
  Flow           :Dest:10.0.0.1/8
    Actions      :Traffic-rate: 5000000 bps Redirect: VRF vpn1 Route-target: ASN2-1:1  (bgp.1)
    Statistics                        (packets/bytes)
      Matched             :                 200/25600              
      Transmitted         :                 200/25600              
      Dropped             :                   0/0                  
  Flow           :Dest:10.0.0.2/8
    Actions      :Traffic-rate: 5000000 bps Redirect: VRF vpn1 Route-target: ASN2-1:1  (bgp.1)
    Statistics                        (packets/bytes)
      Matched             :                 200/25600              
      Transmitted         :                 200/25600              
      Dropped             :                   0/0   

Traffic Filtering Actions

The default action for a traffic filtering flow specification is to accept IP traffic that matches that particular rule. The following extended community values can be used to specify particular actions:


Note


The BGP flowspec actions rate limit and redirect are not supported together.

The BGP flowspec action redirect is supported only for nexthop IPv4 and IPv6, not for nexthop VRF IPv4 and IPv6.


Type

Extended Community

PBR Action

Description

0x8006

traffic-rate 0

traffic-rate <rate>

Drop

Police

The traffic-rate extended community is a non-transitive extended community across the autonomous-system boundary and uses the following extended community encoding:

The first two octets carry the 2-octet ID, which can be assigned from a 2-byte AS number. When a 4-byte AS number is locally present, the 2 least significant bytes of such an AS number can be used. This value is informational. The remaining 4 octets carry the rate information in IEEE floating point [IEEE.754.1985] format, in bytes per second. A traffic-rate of 0 should result in all traffic for the particular flow being discarded.

Command syntax

police rate < > | drop

0x8009

traffic-marking

Set DSCP

The traffic marking extended community instructs a system to modify the differentiated service code point (DSCP) bits of a transiting IP packet to the corresponding value. This extended community is encoded as a sequence of 5 zero bytes followed by the DSCP value encoded in the 6 least significant bits of 6th byte.

Command syntax

set dscp <6 bit value>

0x0800

Redirect IP NH

Redirect IPv4 or IPv6 Nexthop

Announces the reachability of one or more flowspec NLRI. When a BGP speaker receives an UPDATE message with the redirect-to-IP extended community, it is expected to create a traffic filtering rule for every flow-spec NLRI in the message that has this path as its best path. The filter entry matches the IP packets described in the NLRI field and redirects them or copies them towards the IPv4 or IPv6 address specified in the Network Address of Next-Hop field of the associated MP_REACH_NLRI.

Note

 

The redirect-to-IP extended community is valid with any other set of flow-spec extended communities, unless that set includes a redirect-to-VRF extended community (type 0x8008). In that case, the redirect-to-IP extended community should be ignored.

Note

 

Redirect IP NH is supported only in default VRF.

Command syntax

redirect {ipv4 | ipv6} next-hop {ipv4-address | ipv6-address}
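The traffic-rate and traffic-marking encodings described in the table above can be sketched in Python. The sketch assumes the standard 8-octet extended community layout (type and subtype octets from the 0x8006 and 0x8009 values in the table, followed by 6 octets of value); the function names are hypothetical and the AS number 65000 is only an example.

```python
import struct


def encode_traffic_rate(asn, bytes_per_second):
    """Encode traffic-rate (0x8006): 2-octet AS, then a 4-octet IEEE 754 float."""
    # Only the 2 least significant bytes of a 4-byte AS number are carried.
    return struct.pack("!BBHf", 0x80, 0x06, asn & 0xFFFF, bytes_per_second)


def encode_traffic_marking(dscp):
    """Encode traffic-marking (0x8009): 5 zero bytes, then DSCP in the low 6 bits."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit value")
    return bytes([0x80, 0x09, 0, 0, 0, 0, 0, dscp])


drop = encode_traffic_rate(65000, 0.0)           # a rate of 0 means discard
police = encode_traffic_rate(65000, 1_000_000.0)
mark_ef = encode_traffic_marking(46)

print(drop.hex(), police.hex(), mark_ef.hex())
```

Unpacking the last four octets of `police` with `struct.unpack("!f", police[4:])` recovers the configured rate, which is a quick way to check a captured community against the intended policer value.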

BGP Flowspec Client-Server Controller Model

The BGP Flowspec model comprises a client and a server (Controller). The Controller is responsible for sending or injecting the flowspec NLRI entry. The client (acting as a BGP speaker) receives that NLRI and programs the hardware forwarding to act on the instruction from the Controller. An illustration of this model is provided below.

BGP Flowspec Client

Here, the Controller on the left-hand side injects the flowspec NLRI, and the client on the right-hand side receives the information, sends it to the flowspec manager, and configures the ePBR (Enhanced Policy-based Routing) infrastructure, which in turn programs the hardware of the underlying platform in use.

BGP Flowspec Controller

The Controller is configured using CLI to provide an entry for NLRI injection.

Configure BGP Flowspec

The following sections show how to configure BGP Flowspec feature.

Figure 15. BGP Flowspec

The controller or the server with IP address 10.2.3.4 sends the Flowspec NLRI to the client with IP address 10.2.3.3. The NLRI consists of matching criteria, and the client processes traffic based on these criteria. Traffic is dropped or accepted based on the configured criteria.

The following section describes how you can configure BGP Flowspec on the client:



/* Define a Virtual Routing and Forwarding (VRF) instance named vrf1 and set up 
import and export route targets for different address families. */

Router(config)# router bgp 140
Router(config-bgp)# vrf vrf1
Router(config-bgp-vrf)# address-family ipv4 unicast
Router(config-bgp-vrf-af)# import route-target
Router(config-bgp-vrf-af)# 101:2000
Router(config-bgp-vrf-af)# 201:2000
Router(config-bgp-vrf-af)# export route-target
Router(config-bgp-vrf-af)# 101:2000
Router(config-bgp-vrf-af)# 201:2000
Router(config-bgp-vrf-af)# exit

Router(config-bgp-vrf)# address-family ipv4 flowspec
Router(config-bgp-vrf-af)# import route-target
Router(config-bgp-vrf-af)# 101:2000
Router(config-bgp-vrf-af)# 201:2000
Router(config-bgp-vrf-af)# export route-target
Router(config-bgp-vrf-af)# 101:2000
Router(config-bgp-vrf-af)# 201:2000
Router(config-bgp-vrf-af)# exit

Router(config-bgp-vrf)# address-family ipv6 unicast
Router(config-bgp-vrf-af)# import route-target
Router(config-bgp-vrf-af)# 101:2000
Router(config-bgp-vrf-af)# 201:2000
Router(config-bgp-vrf-af)# export route-target
Router(config-bgp-vrf-af)# 101:2000
Router(config-bgp-vrf-af)# 201:2000
Router(config-bgp-vrf-af)# exit

Router(config-bgp-vrf)# address-family ipv6 flowspec
Router(config-bgp-vrf-af)# import route-target
Router(config-bgp-vrf-af)# 101:2000
Router(config-bgp-vrf-af)# 201:2000
Router(config-bgp-vrf-af)# export route-target
Router(config-bgp-vrf-af)# 101:2000
Router(config-bgp-vrf-af)# 201:2000
Router(config-bgp-vrf-af)# exit

Router(config-bgp-vrf)# address-family vpnv4 flowspec
Router(config-bgp-vrf-af)# import route-target
Router(config-bgp-vrf-af)# 101:2000
Router(config-bgp-vrf-af)# 201:2000
Router(config-bgp-vrf-af)# export route-target
Router(config-bgp-vrf-af)# 101:2000
Router(config-bgp-vrf-af)# 201:2000
Router(config-bgp-vrf-af)# exit

Router(config-bgp-vrf)# address-family vpnv6 flowspec
Router(config-bgp-vrf-af)# import route-target
Router(config-bgp-vrf-af)# 101:2000
Router(config-bgp-vrf-af)# 201:2000
Router(config-bgp-vrf-af)# export route-target
Router(config-bgp-vrf-af)# 101:2000
Router(config-bgp-vrf-af)# 201:2000
Router(config-bgp-vrf-af)# exit 


/* Configure BGP Flowspec both on the server and client side. */
Router(config)# flowspec
Router(config-flowspec)# address-family ipv4
Router(config-flowspec-af)# local-install interface-all
Router(config-flowspec-af)# exit
Router(config-flowspec)# address-family ipv6
Router(config-flowspec-af)# local-install interface-all
Router(config-flowspec-af)# exit
Router(config-flowspec)# address-family vpnv4
Router(config-flowspec-af)# local-install interface-all
Router(config-flowspec-af)# exit
Router(config-flowspec)# address-family vpnv6
Router(config-flowspec-af)# local-install interface-all
Router(config-flowspec-af)# exit


/* Configure the policy to accept all presented routes without modifying the routes */
Router(config)# route-policy pass-all
Router(config-rpl)# pass
Router(config-rpl)# end-policy

/* Configure the policy to reject all presented routes */
Router(config)# route-policy drop-all
Router(config-rpl)# drop
Router(config-rpl)# end-policy

/* Configure BGP towards flowspec server */
Router(config)# router bgp 1
Router(config-bgp)# nsr
Router(config-bgp)# bgp router-id 10.2.3.3
Router(config-bgp)# address-family ipv4 flowspec
Router(config-bgp-af)# exit
Router(config-bgp)#