Disaster recovery for a multitenant Cisco SD-WAN Manager cluster
Summary
If a multitenant Cisco SD-WAN Manager cluster, or the data center hosting the SD-WAN Manager nodes in the cluster, fails, you can recover from the failure by activating a standby SD-WAN Manager cluster. You can perform disaster recovery as follows:
Workflow
- Deploy and configure a standby SD-WAN Manager cluster. The standby SD-WAN Manager cluster is not part of the overlay network and is not active.
- Back up the configuration database of the active SD-WAN Manager cluster periodically. Choose an SD-WAN Manager node in the cluster that hosts the configuration database service and back up the configuration database.
- If the active SD-WAN Manager cluster fails, restore the most recent configuration database on the standby SD-WAN Manager cluster, activate the standby SD-WAN Manager cluster, and remove the previously active SD-WAN Manager cluster from the overlay network.
- To restore, choose an SD-WAN Manager node in the standby cluster that hosts the configuration database service and restore the configuration database backed up from the previously active SD-WAN Manager cluster.
What’s next
To test disaster recovery, you can simulate a scenario in which the active SD-WAN Manager cluster fails. One way to simulate such a failure is to disable the tunnel interface, as described in this document.
Prerequisites for a multitenant disaster recovery
- The number of SD-WAN Manager nodes in the active and standby clusters must match.
- Each SD-WAN Manager node in the active and standby clusters must run the same SD-WAN Manager software release.
- Each SD-WAN Manager node in the active and standby clusters must connect to the WAN transport IP address of the SD-WAN Validator in the overlay network.
- Initially, disable the tunnel interfaces of the SD-WAN Manager nodes in the standby cluster.
- Certify the SD-WAN Manager nodes in the standby cluster.
- Synchronize the clock of every SD-WAN Manager node in the standby cluster with the clocks of the SD-WAN Controller and WAN edge devices in the overlay network. If NTP is configured on the overlay, configure the same on the standby SD-WAN Manager nodes.
- Use identical Neo4j credentials on the SD-WAN Manager nodes in the active and standby clusters.
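Several of these prerequisites can be checked mechanically before a failure occurs. The following Python sketch is only an illustration: the ClusterInventory record, its fields, and the release strings used with it are hypothetical and would have to be filled in by hand or from your own inventory system. It does not query SD-WAN Manager.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterInventory:
    """Hypothetical, hand-maintained description of one SD-WAN Manager cluster."""
    software_releases: tuple  # one release string per node in the cluster
    tunnel_interfaces_enabled: bool


def check_standby_prerequisites(active, standby):
    """Return a list of prerequisite violations; an empty list means none found."""
    problems = []
    # The number of nodes in the active and standby clusters must match.
    if len(active.software_releases) != len(standby.software_releases):
        problems.append("node count differs between active and standby clusters")
    # Every node in both clusters must run the same software release.
    releases = set(active.software_releases) | set(standby.software_releases)
    if len(releases) > 1:
        problems.append("nodes run different releases: " + ", ".join(sorted(releases)))
    # Standby tunnel interfaces start out disabled.
    if standby.tunnel_interfaces_enabled:
        problems.append("standby tunnel interfaces must initially be disabled")
    return problems
```

Certificates, clock synchronization, and Neo4j credentials are not covered by this sketch and still need to be verified on the nodes themselves.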
Restrictions for a multitenant disaster recovery
The following restrictions apply to the backup and restore process during disaster recovery of an SD-WAN Manager cluster.
- Do not interrupt any active processes while backing up the configuration database.
- Enable SD-AVC before restoring the configuration database on the standby SD-WAN Manager node.
Configure a standby SD-WAN Manager cluster
Use this procedure to prepare standby SD-WAN Manager nodes with a unique yet synchronized configuration for disaster recovery, without impacting the active overlay network.
Procedure
Step 1: Configure the standby SD-WAN Manager nodes with a running configuration similar to that of the active SD-WAN Manager nodes and install local certificates. The running configuration on a standby node is usually identical to that of an active node, but ensure that settings such as the system IP address and the tunnel interface IP address are unique.
Step 2: On the standby nodes, shut down the transport interface in VPN 0 using the shutdown CLI command in the transport interface configuration.
Step 3: Create a standby cluster using the configured standby SD-WAN Manager nodes.
With this configuration, the overlay network remains unaware of the standby SD-WAN Manager cluster.
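The interface change in this procedure (and its reversal when the standby cluster is activated) can be prepared as text ahead of time. The Python sketch below renders only the VPN 0 interface hierarchy with shutdown or no shutdown. The function name and the interface name eth1 in the usage note are hypothetical, the exact hierarchy should be confirmed against your release, and the commands for entering configuration mode and committing the change are deliberately left out.

```python
def transport_interface_lines(interface_name, enabled):
    """Render the VPN 0 lines that enable or shut down one transport interface.

    interface_name is site-specific; nothing here is read from a device.
    """
    if not interface_name or any(ch.isspace() for ch in interface_name):
        raise ValueError("interface name must be a non-empty token without whitespace")
    state = "no shutdown" if enabled else "shutdown"
    # Indentation mirrors the nesting: VPN, then interface, then interface state.
    return ["vpn 0", f" interface {interface_name}", f"  {state}"]
```

For instance, transport_interface_lines("eth1", False) produces the three lines that keep a standby node's transport interface down, and passing True produces the lines used later to bring it up.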
Back up the active SD-WAN Manager cluster configuration
Back up the full configuration database of the active SD-WAN Manager cluster periodically. Additionally, take snapshots of the active SD-WAN Manager virtual machines.
Procedure
Step 1: Choose an active SD-WAN Manager node that hosts the configuration database service.
Step 2: On the CLI of the selected node, run the request nms configuration-db backup path <file-path> command to back up the configuration database. The command saves the configuration database as a .tar.gz file. For example, the database can be backed up to a file named db_backup.tar.gz in the /home/admin/ directory.
Step 3: Choose a standby SD-WAN Manager node that hosts the configuration database service and copy the configuration database backup to this node. For example, copy db_backup.tar.gz from the active SD-WAN Manager node to the /home/admin/ directory of a standby SD-WAN Manager node.
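Because backups are taken periodically, it helps to give each one a distinct name. The following Python sketch builds a timestamped path and the corresponding backup command string. The helper names and the db_backup_<timestamp> naming scheme are assumptions made for illustration. The sketch passes the path to the command verbatim and makes no claim about how the CLI treats a file suffix; it only produces text and does not run anything on a node.

```python
from datetime import datetime, timezone

# Command form as given in this document; {path} is filled in by the caller.
BACKUP_COMMAND = "request nms configuration-db backup path {path}"


def timestamped_backup_path(directory, when):
    """Build a backup path such as /home/admin/db_backup_20240102T030405Z."""
    if when.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{directory.rstrip('/')}/db_backup_{stamp}"


def backup_command(path):
    """Return the CLI command string for an absolute path without whitespace."""
    if not path.startswith("/") or any(ch.isspace() for ch in path):
        raise ValueError("path must be absolute and contain no whitespace")
    return BACKUP_COMMAND.format(path=path)
```

A UTC timestamp in the name keeps backups from different nodes or time zones sortable by name as well as by modification time.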
Restore SD-WAN Manager cluster using the configuration database backup
Restore the most recent backup of the configuration database from the active SD-WAN Manager cluster on the standby SD-WAN Manager node to which the backup was copied.
- The restore operation does not restore all configuration details. Settings such as users and repositories must be configured on the standby SD-WAN Manager node after restoring the backup.
- When you complete the steps that follow, the previously active SD-WAN Manager nodes cannot be reused. To reuse the nodes, you must perform additional steps that are beyond the scope of this document.
Procedure
Step 1: On the CLI of the standby SD-WAN Manager node, run the request nms configuration-db restore path <file-path> command. For example, restore the configuration database using the backup file db_backup.tar.gz.
Step 2: Verify the standby SD-WAN Manager nodes.
Step 3: On the standby SD-WAN Manager nodes, enable the transport interface on VPN 0 using one of these two methods:
Step 4: Add each standby SD-WAN Manager node to the overlay network.
Step 5: Disconnect the active SD-WAN Manager nodes from the overlay network using one of these methods. In a lab environment, where you are simulating a disaster scenario, you can perform this step. However, if you cannot reach the SD-WAN Manager instances in an actual disaster scenario, you might not be able to perform this step and can omit it.
Step 6: From the standby SD-WAN Manager, send the updated controller and device list to the SD-WAN Validator, including the list of controllers:
Step 7: Verify configuration and connectivity.
Step 8: Invalidate the previously active SD-WAN Manager nodes. After you invalidate the SD-WAN Manager nodes, they cannot be reused without performing additional steps that are beyond the scope of this document.
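Step 1 depends on restoring the most recent backup. When several backup files have accumulated in one directory, a small helper can choose the newest one and build the restore command for it. The Python sketch below is an assumption-laden illustration: the helper names and the *.tar.gz pattern are hypothetical, "most recent" is judged by file modification time, and the command string is only produced as text, not executed.

```python
from pathlib import Path


def latest_backup(directory, pattern="*.tar.gz"):
    """Return the most recently modified backup file in a directory."""
    candidates = [p for p in Path(directory).glob(pattern) if p.is_file()]
    if not candidates:
        raise FileNotFoundError(f"no files matching {pattern} in {directory}")
    # Modification time stands in for "most recent"; adjust if you track backups differently.
    return max(candidates, key=lambda p: p.stat().st_mtime)


def restore_command(backup_file):
    """Return the restore command string, in the form given in this document."""
    return f"request nms configuration-db restore path {backup_file}"
```

If copying files between nodes resets their modification times, a name-based ordering (for example, a timestamp embedded in the file name) may be a more reliable way to select the backup.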
Verify the valid SD-WAN Manager nodes
Procedure
Step 1: Log in to the CLI of each SD-WAN Validator and run the show orchestrator valid-vmanage-id command. In the command output, verify that the chassis numbers of only the active SD-WAN Manager nodes are listed.
Step 2: Log in to the CLI of a WAN edge device and run the show control valid-vmanage-id command. In the command output, verify that the chassis numbers of only the active SD-WAN Manager nodes are listed. Also, check whether the device is connected to the active SD-WAN Manager nodes and the Cisco Catalyst SD-WAN Controller.
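The comparison in both steps can be scripted once the command output has been captured as text. The Python sketch below assumes that chassis numbers appear in the output as UUID-formatted strings. That assumption and the function names are not taken from this document, so adjust the pattern to match what your devices actually print.

```python
import re

# Assumption: chassis numbers are UUID-formatted (8-4-4-4-12 hexadecimal digits).
UUID_RE = re.compile(r"\b[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\b")


def listed_chassis_numbers(command_output):
    """Extract every UUID-like token from captured show-command output."""
    return {match.lower() for match in UUID_RE.findall(command_output)}


def compare_valid_managers(command_output, expected_chassis_numbers):
    """Report chassis numbers that are missing from, or unexpected in, the output."""
    listed = listed_chassis_numbers(command_output)
    wanted = {chassis.lower() for chassis in expected_chassis_numbers}
    return {"missing": wanted - listed, "unexpected": listed - wanted}
```

An empty "unexpected" set indicates that no chassis number other than those of the currently active nodes is listed, and an empty "missing" set indicates that every active node is present.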