Introduction
This document describes the supported network connectivity options for the various network links involved in a Cisco DNA Center 3-node cluster deployment.
Prerequisites
Please familiarize yourself with basic information about the 3-node Cisco DNA Center cluster and high availability by reading the following documents:
Cisco DNA Center Install Guide - This guide describes step by step how to bring up a 3-node cluster.
Administrator Guide for Cisco DNA Center 1.2.x
Administrator Guide for Cisco DNA Center 1.2.10
Description
Starting with Cisco DNA Center release 1.2.8, a 3-node High Availability cluster is supported for Base Automation and SD-Access Automation. In 1.2.8/1.2.10, High Availability for Assurance is still a Beta release.
Cisco DNA Center’s high availability (HA) offers more resiliency and reduces downtime when a node, a service, or a network link goes down. When a failure occurs, this framework helps restore your network to its previous operational state. If this is not possible, Cisco DNA Center will indicate that there is an issue requiring your attention.
Any time Cisco DNA Center’s HA framework determines that a change has taken place on a cluster node, it synchronizes that change with the other nodes. The supported synchronization types include:
- Database changes, such as updates related to configuration, performance, and monitoring data.
- File changes, such as report configurations, configuration templates, the TFTP-root directory, administration settings, licensing files, and the key store.
Current Cisco DNA Center software requires a minimum 3-node cluster for High Availability to work. Once the cluster is set up, it can survive a single node failure. A minimum of 2 nodes is required to establish quorum; without a 2-node quorum, the cluster is declared down. If you are using an SD-Access fabric, a cluster failure only results in automation provisioning failure; your SD-Access fabric user network traffic will continue to forward, because Cisco DNA Center is not responsible for any control or data traffic.
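As a concrete illustration (not Cisco code), the following minimal Python sketch captures the majority rule described above: quorum is floor(n/2) + 1, so a 3-node cluster needs 2 nodes up and can therefore tolerate exactly one node failure.

```python
# Minimal sketch of the quorum math behind the 3-node requirement.
# Quorum is a strict majority of the configured nodes; with 3 nodes the
# quorum size is 2, so the cluster stays up with any single node down.

def quorum_size(total_nodes: int) -> int:
    """Strict majority: floor(n/2) + 1."""
    return total_nodes // 2 + 1

def cluster_has_quorum(total_nodes: int, nodes_up: int) -> bool:
    return nodes_up >= quorum_size(total_nodes)

if __name__ == "__main__":
    for up in range(3, -1, -1):
        state = "operational" if cluster_has_quorum(3, up) else "down (no quorum)"
        print(f"3-node cluster, {up} node(s) up -> {state}")
```

Running this prints "operational" for 3 or 2 nodes up and "down (no quorum)" for 1 or 0, matching the single-node-failure tolerance described above.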
In this document we will look at various failure points and how the cluster mitigates downtime to keep Cisco DNA Center operational at all times. We will focus mainly on the network connectivity aspects of the 3-node cluster. For services and all other information, please refer to the install and administrator guides.
Network Connectivity:
Cisco DNA Center uses the following types of network connectivity:
1. 10 Gbps Cluster Link
2. 1 Gbps GUI/Management Link
3. 1 Gbps Cloud Link (Optional)
4. 10 Gbps Enterprise Link
5. 1 Gbps CIMC Link
It is assumed that intra-cluster IP ARP resolution succeeds and that connectivity is ensured between all 3 nodes. In addition, an RTT of less than 10 ms between cluster links is recommended for all scenarios.
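To spot-check the 10 ms guideline, the following hedged Python sketch times a TCP handshake to each peer's cluster-link IP as a rough RTT estimate. The IP addresses and TCP port below are placeholders, not Cisco defaults; substitute values from your own deployment.

```python
# Illustrative sketch (not a Cisco tool): measure round-trip latency to the
# other cluster-link IPs and flag anything above the recommended 10 ms RTT.

import socket
import time

PEER_CLUSTER_IPS = ["192.0.2.2", "192.0.2.3"]  # placeholder cluster-link IPs
PORT = 22            # placeholder: any TCP port reachable on the peer
RTT_BUDGET_MS = 10   # recommendation from this document

def tcp_rtt_ms(host: str, port: int, timeout: float = 2.0) -> float:
    """Time a TCP three-way handshake as a rough RTT estimate."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000

for ip in PEER_CLUSTER_IPS:
    try:
        rtt = tcp_rtt_ms(ip, PORT)
        verdict = "OK" if rtt < RTT_BUDGET_MS else "EXCEEDS 10 ms guideline"
        print(f"{ip}: {rtt:.1f} ms ({verdict})")
    except OSError as err:
        print(f"{ip}: unreachable ({err})")
```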
Failure scenarios and cluster behavior:
In general, cluster service redistribution happens under the following conditions (a simple reachability-check sketch follows the list):
1. Single node goes down: Services are redistributed to the remaining 2 nodes and the cluster remains operational.
2. Enterprise network link goes down for a single node: No service redistribution; only reachability to the enterprise network from the affected node is lost.
3. Cluster network link goes down for a single node: Services are redistributed to the remaining 2 nodes and the cluster remains operational.
4. All other network links go down except the cluster link for a single node: That node cannot service its expected functions, but all services and the cluster operate normally.
5. Service failure on a single node: The service attempts to restart. In most scenarios it restarts on the same node, but there is currently no node affinity, so it can start on any node.
6. Network switch goes down: Depending on the topology, the cluster operates normally, services are redistributed, or everything goes down.
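The following illustrative Python sketch (hypothetical; not part of Cisco DNA Center) polls each node's management IP from an external host and applies the 2-of-3 quorum rule described earlier. The IP addresses are placeholders, and ICMP reachability is only a coarse external proxy for node health, not the cluster's internal mechanism.

```python
# Poll reachability of each node's management IP and apply the quorum rule
# from this document (2 of 3 nodes required for the cluster to stay up).

import subprocess

NODE_MGMT_IPS = {  # placeholder addresses; adjust for your deployment
    "node-1": "192.0.2.11",
    "node-2": "192.0.2.12",
    "node-3": "192.0.2.13",
}

def is_reachable(ip: str) -> bool:
    """Single ICMP echo with a 2-second deadline (Linux 'ping' syntax)."""
    result = subprocess.run(["ping", "-c", "1", "-W", "2", ip],
                            capture_output=True)
    return result.returncode == 0

up = [name for name, ip in NODE_MGMT_IPS.items() if is_reachable(ip)]
print(f"Nodes reachable: {', '.join(up) if up else 'none'}")
if len(up) >= 2:
    print("Quorum held: cluster should remain operational; services on a "
          "failed node are expected to redistribute to the remaining nodes.")
else:
    print("Quorum lost: cluster is declared down for automation.")
```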
Physical Topology Option-1
Engineering initially suggested the following network connectivity. Both Picture-1 and Picture-2 show connectivity where each type of network link from all nodes is connected to the same physical switch. For example, the enterprise network link from all 3 nodes is connected to the same physical switch.
Picture-1

Picture-2

The above topology keeps the cluster operational through the following failure scenarios:
1. Single Node failure
2. Enterprise network link failure
3. Cluster Link Failure
4. Service Failure
The above topology cannot survive a complete switch failure for any of the network links.
Failure Condition | Impact / Cluster State
Single node down | Cluster remains operational with the remaining 2 nodes.
Single link down for any network link | Cluster continues to operate normally. Services are redistributed only if the cluster link goes down.
Switch goes down | Cluster is unusable for automation.
Physical Topology Option-2 (Most Recommended)
Picture-3 shows connectivity where all network links from a given node are connected to the same physical switch, separated using VLANs; alternatively, each link can be connected to a different switch. For example, the links from Node-1 are connected to Switch-1, the links from Node-2 to Switch-2, and so forth.
Picture-3

The above topology keeps the cluster operational through the following failure scenarios:
1. Single node failure
2. Enterprise network link failure for a single node
3. Cluster link failure for a single node
4. Service failure for a single node
5. Single network switch failure for a single node
Failure Condition | Impact / Cluster State
Single node down | Cluster remains operational with the remaining 2 nodes.
Single link down for any network link | Cluster continues to operate normally. Services are redistributed only if the cluster link goes down.
Single switch goes down | Cluster remains operational with the remaining 2 nodes.
Physical Topology Option-3 (For Data Center type environment)
This topology is similar to Option-2, except that you can have 3 Layer-2 switches connecting to the gateway. All other information is the same as for Option-2.

Physical Topology Option-4 (Not Recommended)
Picture-4 shows connectivity where 2 nodes are connected to the same switch while the third node is connected to a different switch. This topology is the least recommended: a failure of the switch carrying links from multiple nodes can bring down the cluster.

The above topology keeps the cluster operational through the following failure scenarios:
1. Single node failure
2. Enterprise network link failure for a single node
3. Cluster link failure for a single node
4. Service failure for a single node
The above topology cannot survive a complete failure of the switch that carries links from 2 of the nodes.
Failure Condition | Impact / Cluster State
Single node down | Cluster remains operational with the remaining 2 nodes.
Single link down for any network link except the cluster link | Cluster continues to operate normally.
Single cluster link down | Services are redistributed to the other two nodes and operation continues.
Single switch goes down | Cluster can go down if the switch carrying links from multiple nodes goes down.
Additional failure scenarios and cluster states are covered in the Administrator Guide for Cisco DNA Center 1.2.10.