This document describes how to migrate two Cisco Aggregation Services Router (ASR) 9000 (9K) single-chassis systems to a Network Virtualization (nV) Edge system.
In order to cluster two routers together, there are several requirements that must be met.
You must have Cisco IOS® XR Release 4.2.1 or later.
- ASR 9006 and 9010 support started in Release 4.2.1
- ASR 9001 support started in Release 4.3.0
- ASR 9001-S and 9922 support started in Release 4.3.1
- ASR 9904 and 9912 support started in Release 5.1.1
Line Card (LC) and Route Switch Processor (RSP):
- Dual RSP440 for 9006/9010/9904
- Dual Route Processor (RP) for 9912/9922
- Single RSP for 9001/9001-S
- Typhoon-based LC or SPA Interface Processor (SIP)-700
Control Links (Ethernet Out of Band Control (EOBC)/Cluster ports) supported optics:
- Small Form-Factor Pluggable (SFP)-GE-S, Release 4.2.1
- GLC-SX-MMD, Release 4.3.0
- GLC-LH-SMD, Release 4.3.0
Data Links / Inter-Rack Link (IRL) supported optics:
- Optics support is as per LC support
- 10G IRL support started in Release 4.2.1
- 40G IRL support started in Release 5.1.1
- 100G IRL support started in Release 5.1.1
The example in this document is based upon two 9006 routers with an RSP440 that run XR Release 4.2.3.
The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, make sure that you understand the potential impact of any command.
The IRLs are the data plane connection between the two routers in the cluster.
The control link or EOBC ports are the control plane connection between the two routers.
- Turboboot or upgrade to the desired XR software release on both routers (minimum of Release 4.2.1).
- Ensure that the XR software is up to date with Software Maintenance Upgrades (SMUs) as well as the Field Programmable Device (FPD) firmware.
- Determine the serial number of each chassis. You need this information in later steps.
RP/0/RSP0/CPU0:ASR9006#admin show inventory chass
NAME: "chassis ASR-9006-AC-E", DESCR: "ASR 9006 AC Chassis with PEM Version 2"
PID: ASR-9006-AC-V2, VID: V01, SN: FOX1613G35U
- On Rack 1 only, configure the router config-register to use rom-monitor boot mode.
admin config-register boot-mode rom-monitor location all
- Power off Rack 1.
- On Rack 0, configure the cluster serial numbers acquired in Step 3 from each router:
nv edge control serial FOX1613G35U rack 0
nv edge control serial FOX1611GQ5H rack 1
- Reload Rack 0.
- Power on Rack 1 and apply these commands to both RSP 0 and RSP 1.
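The commands for this step are not reproduced above. Because the config-register was set to rom-monitor boot mode in the earlier step, Rack 1 boots to ROM Monitor (ROMMON), where the cluster-related boot variables are cleared before the chassis is powered down again. A commonly documented sequence is shown below; treat the variable names as an assumption and confirm them against the nV Edge deployment guide for your XR release:
unset CLUSTER_RACK_ID
unset CLUSTER_NO_BOOT
unset BOOT
sync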
- Power off Rack 1.
- Connect the control link cables as shown in the figure in the Network Diagram section.
- Power on Rack 1.
The RSPs on Rack 1 sync all of the packages and files from Rack 0.
Expected output on Rack 1 during boot up:
Cisco IOS XR Software for the Cisco XR ASR9K, Version 4.2.3
Copyright (c) 2013 by Cisco Systems, Inc.
Aug 16 17:15:16.903 : Install (Node Preparation): Initializing VS Distributor...
Media storage device /harddisk: was repaired. Check fsck log at
Could not connect to /dev/chan/dsc/cluster_inv_chan:
Aug 16 17:15:42.759 : Local port RSP1 / 12 Remote port RSP1 /
Aug 16 17:15:42.794 : Lport 12 on RSP1[Priority 2] is selected active
Aug 16 17:15:42.812 : Local port RSP1 / 13 Remote port RSP0 /
Aug 16 17:15:42.847 : Lport 13 on RSP1[Priority 1] is selected active
Aug 16 17:16:01.787 : Lport 12 on RSP0[Priority 0] is selected active
Aug 16 17:16:20.823 : Install (Node Preparation): Install device root from dSC
Aug 16 17:16:20.830 : Install (Node Preparation): Trying device disk0:
Aug 16 17:16:20.841 : Install (Node Preparation): Checking size of device disk0:
Aug 16 17:16:20.843 : Install (Node Preparation): OK
Aug 16 17:16:20.844 : Install (Node Preparation): Cleaning packages on device disk0:
Aug 16 17:16:20.844 : Install (Node Preparation): Please wait...
Aug 16 17:17:42.839 : Install (Node Preparation): Complete
Aug 16 17:17:42.840 : Install (Node Preparation): Checking free space on disk0:
Aug 16 17:17:42.841 : Install (Node Preparation): OK
Aug 16 17:17:42.842 : Install (Node Preparation): Starting package and meta-data sync
Aug 16 17:17:42.846 : Install (Node Preparation): Syncing package/meta-data contents:
Aug 16 17:17:42.847 : Install (Node Preparation): Please wait...
Aug 16 17:18:42.301 : Install (Node Preparation): Completed syncing:
Aug 16 17:18:42.302 : Install (Node Preparation): Syncing package/meta-data contents:
Aug 16 17:18:42.302 : Install (Node Preparation): Please wait...
Aug 16 17:19:43.340 : Install (Node Preparation): Completed syncing:
Aug 16 17:19:43.341 : Install (Node Preparation): Syncing package/meta-data contents:
Aug 16 17:19:43.341 : Install (Node Preparation): Please wait...
Aug 16 17:20:42.501 : Install (Node Preparation): Completed syncing:
Aug 16 17:20:42.502 : Install (Node Preparation): Syncing package/meta-data contents:
- Configure the data link ports as nV Edge ports from Rack 0 (the dSC):
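The configuration for this step follows the pattern below. The interface names match the first IRL pair in the verification output that follows; the nv edge interface submode command is the documented way to mark a port as an IRL, but confirm the syntax for your release:
interface TenGigE0/0/1/3
 nv edge interface
!
interface TenGigE1/0/0/3
 nv edge interface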
- Verify the data plane:
show nv edge data forwarding location all
nV Edge Data interfaces in forwarding state: 4
TenGigE0_0_1_3 <--> TenGigE1_0_0_3
TenGigE0_1_1_3 <--> TenGigE1_1_0_3
TenGigE0_2_1_3 <--> TenGigE1_2_0_3
TenGigE0_3_1_3 <--> TenGigE1_3_0_3
In this output, the IRLs should be in the Forwarding state.
- Verify the Control Plane:
show nv edge control control-link-protocols location 0/RSP0/CPU0
Port enable administrative configuration setting: Enabled
Port enable operational state: Enabled
Current bidirectional state: Bidirectional
Current operational state: Advertisement - Single neighbor detected
Priority lPort Remote_lPort UDLD STP
======== ===== ============ ==== ========
0 0/RSP0/CPU0/0 1/RSP0/CPU0/0 UP Forwarding
1 0/RSP0/CPU0/1 1/RSP1/CPU0/1 UP Blocking
2 0/RSP1/CPU0/0 1/RSP1/CPU0/0 UP On Partner RSP
3 0/RSP1/CPU0/1 1/RSP0/CPU0/1 UP On Partner RSP
From this output, the Current bidirectional state should be Bidirectional and only one of the ports should be in the Forwarding state.
- Verify the Cluster Status:
RP/0/RSP0/CPU0:ASR9006#admin show dsc
Node ( Seq) Role Serial State
0/RSP0/CPU0 ( 0) ACTIVE FOX1613G35U PRIMARY-DSC
0/RSP1/CPU0 (10610954) STANDBY FOX1613G35U NON-DSC
1/RSP0/CPU0 ( 453339) STANDBY FOX1611GQ5H NON-DSC
1/RSP1/CPU0 (10610865) ACTIVE FOX1611GQ5H BACKUP-DSC
This command displays both the dSC (inter-rack) status and the redundancy role (intra-rack) for all RSPs in the system.
This example has these:
- RSP0 on Rack 0 is the primary-dSC and the active RSP for the rack
- RSP1 on Rack 0 is a non-dSC and the standby RSP for the rack
- RSP0 on Rack 1 is a non-dSC and the standby RSP for the rack
- RSP1 on Rack 1 is the backup-dSC and the active RSP for the rack
Link Aggregation Group (LAG) & Bridge Virtual Interface (BVI) Optimizations
Manually configure the system and interface MAC addresses. This additional step ensures that if there is a rack failure, the LAG bundle continues to communicate with the same MAC address, instead of the local address from the local chassis. This step is also required for shared Layer 2 interfaces, such as a BVI.
- Identify the MAC addresses that are in use:
show lacp system-id
show int bundle-ether 1
show interface BVI 1
- Manually configure the MAC addresses. Use the same MAC addresses captured in the show command output from Step 1.
lacp system mac 8478.ac2c.7805
interface bundle-ether 1
 mac-address <bundle MAC from Step 1>
interface BVI 1
 mac-address <BVI MAC from Step 1>
- Apply a suppress-flaps delay in order to prevent the bundle manager process from flapping the LAG links during failover.
interface bundle-ether 1
 lacp switchover suppress-flaps 15000
Layer 3 Equal-Cost Multi-Path (ECMP) Optimizations
- Bidirectional Forwarding Detection (BFD) and Non-Stop Forwarding (NSF) for Fast Convergence
router isis LAB
 interface <ISIS interface>
  bfd minimum-interval 50
  bfd multiplier 3
  bfd fast-detect ipv4
 interface <ISIS interface>
  bfd minimum-interval 50
  bfd multiplier 3
  bfd fast-detect ipv4
- Loop Free Alternate Fast Reroute (LFA-FRR) for Fast Convergence
In order to change the Cisco Express Forwarding (CEF) tables before the Routing Information Base (RIB) is able to reconverge, you can use LFA-FRR in order to further reduce any traffic loss in a failover situation.
router isis Cluster-L3VPN
 address-family ipv4 unicast
 interface <ISIS interface>
  address-family ipv4 unicast
   fast-reroute per-prefix
nV IRL Threshold Monitor
If the number of IRL links available for forwarding drops below a certain threshold, then the IRLs that remain might become congested and cause inter-rack traffic to be dropped.
In order to prevent traffic drops or traffic blackholes, one of three preventative actions should be taken.
- Shut down all interfaces on the backup-dSC.
- Shut down selected interfaces.
- Shut down all interfaces on a specific rack.
RP/0/RSP0/CPU0:ios(admin-config)#nv edge data minimum <minimum threshold> ?
backup-rack-interfaces Disable ALL interfaces on backup-DSC rack
selected-interfaces Disable only interfaces with nv edge min-disable config
specific-rack-interfaces Disable ALL interfaces on a specific rack
With the backup-rack-interfaces option, if the number of IRLs drops below the configured minimum threshold, all of the interfaces on whichever chassis hosts the backup-DSC RSP are shut down.
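For example, to shut down the backup-DSC rack interfaces when fewer than two IRLs are forwarding (the threshold value here is illustrative), enter this in admin configuration mode, as shown in the CLI help output above:
nv edge data minimum 2 backup-rack-interfaces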
With the selected-interfaces option, if the number of IRLs drops below the configured minimum threshold, only the interfaces (on either rack) that are explicitly configured to be brought down are shut down.
The interfaces chosen for such an event can be explicitly configured via this configuration:
interface gigabitEthernet 0/1/1/0
nv edge min-disable
With the specific-rack-interfaces option, if the number of IRLs drops below the configured minimum threshold, all of the interfaces on the specified rack (0 or 1) are shut down.
The default behavior is the equivalent of nv edge data minimum 1 backup-rack-interfaces. This means that if the number of IRLs in the Forwarding state drops below one (that is, no IRL is forwarding), all of the interfaces on whichever rack hosts the backup-DSC are shut down, and all traffic on that rack stops being forwarded.
This section covers common error messages encountered when nV Edge is deployed.
PLATFORM-DSC_CTRL-3-MULTIPLE_PRIMARY_DSC_NODES : Primary DSC state declared
by 2 nodes: 0/RSP1/CPU0 1/RSP0/CPU0 . Local state is BACKUP-DSC
This message is caused by unsupported SFPs on the EOBC ports. This can also be triggered by mismatched FPD firmware versions on the two routers. Make sure that FPDs are upgraded prior to the migration.
PLATFORM-CE_SWITCH-6-BADSFP : Front panel nV Edge Control Port 0 has unsupported
SFP plugged in. Port is disabled, please plug in Cisco support 1Gig SFP for port
to be enabled
This message appears if an unsupported optic is inserted. The optic should be replaced with a supported EOBC Cisco optic.
Front Panel port 0 error disabled because of UDLD uni directional forwarding.
If the cause of the underlying media error has been corrected, issue this CLI
to bring it up again: clear nv edge control switch error 0 <location>
<location> is the location (RSP) where this error originated.
This message appears if a particular control Ethernet link has a fault and is flapping too frequently. If this happens, then this port is disabled and will not be used for control link packet forwarding.
PLATFORM-CE_SWITCH-6-UPDN : Interface 12 (SFP+_00_10GE) is up
PLATFORM-CE_SWITCH-6-UPDN : Interface 12 (SFP+_00_10GE) is down
These messages appear whenever the Control Plane link physical state changes, similar to a data port up/down notification, and also anytime an RSP reloads or boots. Outside of those events, these messages are not expected during normal operation.
PLATFORM-NVEDGE_DATA-3-ERROR_DISABLE : Interface 0x40001c0 has been uni
directional for 10 seconds, this might be a transient condition if a card
bootup / oir etc.. is happening and will get corrected automatically without
any action. If it's a real error, then the IRL will not be available for forwarding
inter-rack data and will be missing in the output of show nv edge data
On bootup, this message might be seen. In regular production, this means that the IRL will be unavailable for forwarding inter-rack data. In order to determine the interface, enter the show im database ifhandle <interface handle> command. The link will restart Unidirectional Link Detection (UDLD) every 10 seconds until it comes up.
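For the example message above, the interface handle can be mapped back to an interface name like this (the handle value is taken from the message; output format varies by release):
show im database ifhandle 0x40001c0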
PLATFORM-NVEDGE_DATA-6-IRL_1SLOT : 3 Inter Rack Links configured all on one slot.
Recommended to spread across at least two slots for better resiliency
All of the IRL links are present on the same LC. For resiliency, IRLs should be configured on at least two LCs.
INFO: %d Inter Rack Links configured on %d slots. Recommended to spread across maximum 5 slots for better manageability and troubleshooting
The total number of IRLs in the system (maximum 16) is recommended to be spread across two to five LCs.
PLATFORM-NVEDGE_DATA-6-ONE_IRL : Only one Inter Rack Link is configured. For
Inter Rack Link resiliency, recommendation is to have at least two links spread
across at least two slots
It is recommended to have at least two IRL links configured for resiliency reasons.