The HA/DR certification test plan validates that the Security Manager application is highly available and can survive various hardware and software failures. The test plan also covers maintenance activities, such as manually switching the application between servers.
Note: Active Security Manager client users must log in again after an application failover. This behavior is equivalent to stopping and starting the Security Manager services running on the server.
The following test case categories are contained in this appendix:
This section covers two different types of manual switches. In a single cluster with two servers, you can switch between the two servers in the cluster (intracluster switch); in a dual cluster configuration with a single server in each cluster, you can switch between clusters (intercluster switch).
This section contains the following topics:
Test Case Title: Manual application switch within a cluster.
Description: The application is manually switched to a different server in the same cluster using VCS.
Test Setup: A dual node cluster (Local Redundancy HA Configuration) in a single cluster configuration.
Step 1 Ensure that the APP service group is running on the primary server. Using the VCS Cluster Explorer, select the APP service group. From the shortcut menu, select Switch To, and choose the secondary server. Alternatively, issue the following command:
Step 2 From the Resource view of the APP service group, observe that the resources in the service group go offline on the primary server and then come online on the secondary server. Or issue the following command to observe the status of the APP service group.
Step 3 From a client machine, launch the Security Manager client, using the virtual hostname or IP address in the Server Name field of the login dialog box. Verify that you can log in to the application successfully.
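Step 2 amounts to watching the APP service group until it reports online on the target server. A minimal Python sketch of that wait loop follows. It assumes the caller supplies `get_state`, a hypothetical function that returns the current state string for the service group on the target server (for example, by wrapping whichever status command is used at your site). The timeout and polling interval are illustrative values, not figures from this test plan.

```python
import time
from typing import Callable


def wait_for_state(
    get_state: Callable[[], str],
    expected: str,
    timeout_s: float = 120.0,   # illustrative default, not from the test plan
    interval_s: float = 5.0,    # illustrative default, not from the test plan
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll get_state() until it returns `expected` or the timeout expires.

    Returns True if the expected state was observed, False on timeout.
    The comparison ignores case and surrounding whitespace.
    """
    deadline = clock() + timeout_s
    while True:
        if get_state().strip().upper() == expected.strip().upper():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated state sequence standing in for a real status query.
    observed = iter(["OFFLINE", "OFFLINE", "ONLINE"])
    ok = wait_for_state(lambda: next(observed), "ONLINE", sleep=lambda s: None)
    print("APP online on secondary:", ok)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real delays. In an actual run, leave them at their defaults.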
Test Case Title: Manual application switch between clusters.
Description: The application is manually switched to a server in a different cluster using VCS.
Test Setup: A dual cluster configuration as shown in Geographic Redundancy (DR) Configuration with a single server in each cluster.
Step 1 Using the VCS Cluster Explorer, select the APP service group. From the shortcut menu, select Switch To, then Remote Switch(...), to open the Switch global dialog box. In the dialog box, specify the remote cluster and, if desired, a specific server in the remote cluster. Alternatively, issue the following command:
Step 2 From the Resource view of the APP service group, observe that the resources in the service group go offline in the primary cluster. Select the root cluster node in the tree and use the Remote Cluster Status view to see that the APP service group goes online on the remote cluster. Or issue the following command to observe the status of the APP service group.
Step 3 From a client machine, launch the Security Manager client by entering the appropriate hostname or application IP address used in the secondary cluster in the Server Name field of the Login dialog box. Verify that you can successfully log in to the application.
Step 4 Log out of the Security Manager client, and then switch the APP service group to the primary cluster using either the VCS Cluster Explorer or the following command:
HA/DR configurations have two types of server Ethernet connections: those used for network communications (public interfaces) and those dedicated to intracluster communications (private interfaces). This section covers failure test cases for each type of Ethernet interface.
This section describes the tests used to verify that VCS can detect failure of the network Ethernet ports used for network communications. This section contains the following topics:
Test Case Title: A failure occurs in the network Ethernet connection on the secondary server in a single cluster configuration.
Description: This test case verifies that VCS can detect a failure on the network Ethernet port on the secondary server and then recover after the failure is repaired.
Test Setup: A dual node cluster (Local Redundancy HA Configuration) in a single cluster configuration with a single network connection per server.
Step 1 Verify that the application is running on the primary server.
Step 2 Log in to the application from a client machine.
Step 3 Remove the Ethernet cable from the network port on the secondary server to isolate the server from communicating with the switch/router network. Wait for at least 60 seconds for VCS to detect the network port failure. Verify that VCS detects a failure of the NIC resource on the secondary server by running the following command:
Step 4 Restore the Ethernet cable to the network port on the secondary server. Verify that VCS detects that the failure was cleared by running the following command:
Test Case Title: A failure occurs in the network Ethernet connection on the primary server in a single cluster configuration.
Description: This test case verifies that VCS can detect a failure on the network Ethernet port of the primary server and automatically switch the application to the secondary server. After the problem is fixed, you can switch the application back to the primary server manually.
Test Setup: A dual node cluster (Ethernet and Storage Connections for a Dual-Node Site) with a single network connection per server.
Step 1 Verify that the application is running on the primary server.
Step 2 Remove the Ethernet cable from the network port on the primary server to isolate the server from communicating with the switch/router network. Verify that VCS detects a failure of the NIC resource and automatically switches the APP service group to the secondary server:
Step 3 Verify that you can log in to the application while it is running on the secondary server.
Step 4 Reconnect the Ethernet cable to the network port of the primary server and manually clear the faulted IP resource on the primary server:
Step 5 Manually switch the APP service group back to the primary server.
Test Case Title: A failure occurs in the network Ethernet connection on the secondary server in a dual cluster configuration.
Description: This test case verifies that VCS can detect a failure on the network Ethernet port and then recover after the failure is repaired.
Test Setup: A dual cluster configuration (Geographic Redundancy (DR) Configuration) with a single node in each cluster and a single Ethernet network connection for each server.
Step 1 Verify that the APP service group is running on the primary cluster/server.
Step 2 Log in to the Security Manager from a client machine.
Step 3 Remove the Ethernet cable from the network port on the server in the secondary cluster. This isolates the server from communicating with the switch/router network and interrupts replication. From the primary server, verify that replication was interrupted (disconnected) by running the following command:
Step 4 Run the following command from the primary server to verify that communication with the secondary cluster was lost:
Step 5 Reattach the network Ethernet cable to the secondary server and verify that replication resumed.
Step 6 Verify that communication with the secondary cluster has been restored.
Step 7 If replication has not recovered, you might need to manually clear the IP resource (if it has faulted) and then start the APPrep service group on the secondary server as follows:
Test Case Title: A failure occurs in the network Ethernet connection on the primary server.
Description: This test case verifies that VCS can detect a failure on the primary server network Ethernet port and can recover by starting the application on the secondary server. After the Ethernet connection is restored, you can manually switch back to the original primary server, retaining any data changes that were made while the application was running on the secondary server.
Test Setup: A dual cluster configuration (Geographic Redundancy (DR) Configuration) with a single node in each cluster.
Step 1 Verify that the APP service group is running on the primary cluster.
Step 2 Remove the network Ethernet cable from the port on the server in the primary cluster to isolate the server from communicating with the switch/router network. VCS should detect this as a failure of the IP and NIC resources. Verify that VCS detected the failure and brought down the APP service group.
Step 3 Start the APP service group on the secondary cluster using the following command on the secondary server:
Step 4 From your client machine, log in to Security Manager to verify that it is operational. Change some data so that you can verify that changes are retained when you switch back to the primary server.
Step 5 Reconnect the network Ethernet cable to the primary cluster server.
Step 6 Clear any faults on the IP resource and turn on the APPrep service from the primary server:
Step 7 Convert the original primary RVG to secondary and synchronize the data volumes in the original primary RVG with the data volumes on the new primary RVG using the fast failback feature. Using the Cluster Explorer for the secondary cluster, right-click the RVGPrimary resource (APP_RVGPrimary), select actions, then select fbsync from the Actions dialog box, and then click OK. Alternatively, you can issue the following command:
Step 8 Using the VCS Cluster Explorer on the secondary cluster, select the APP service group. From the shortcut menu, select Switch To, then Remote Switch(...), to open the Switch global dialog box. In the dialog box, specify the primary cluster and the primary server. Alternatively, issue the following command:
Step 9 Log in to the application to verify that the changes you made on the secondary server were retained.
Test Case Title: Failures occur in the Ethernet used for cluster communication.
Description: The dedicated Ethernet connections used between servers in the cluster for intracluster communication fail. The test verifies that the cluster communications continue to function when up to two of the three redundant communication paths are lost.
Test Setup: A dual-node cluster (Local Redundancy HA Configuration) in a single cluster configuration, with two dedicated cluster communication Ethernet connections and a low-priority cluster communication connection configured on the network Ethernet connection.
Note: In addition to the commands given in this test case, you can monitor the status of the cluster communications from the Cluster Explorer by selecting the root node in the tree and selecting the System Connectivity tab.
Step 1 Issue the following command to verify that all systems are communicating through GAB.
Note: Group Membership Services/Atomic Broadcast (GAB) is a VCS protocol responsible for cluster membership and cluster communications.
Step 2 Remove the Ethernet cable from the first dedicated Ethernet port used for cluster communication on the primary server.
Step 3 Issue the following command to view the detailed status of the links used for cluster communication and verify that the first dedicated cluster communication port is down.
Note: The asterisk (*) in the output indicates the server on which the command is run. That server always shows its own links as up, even if one or more of them are the ports that are physically disconnected.
Step 4 If you configured a low-priority heartbeat link on the network interface, remove the Ethernet cable from the second dedicated Ethernet port used for cluster communication on the primary server.
Step 5 Issue the following command to verify that all systems are communicating through GAB. Also confirm that both servers in the cluster are now in a Jeopardy state, since each server has only one heartbeat working.
Step 6 Issue the following command to view the detailed status of the links used for cluster communication and verify that the second dedicated Ethernet port for cluster communications on the primary server is down.
Step 7 Reconnect the Ethernet cable to the second dedicated Ethernet port for cluster communications on the primary server.
Step 8 Verify that the Jeopardy condition was removed by issuing the following command:
Step 9 Reconnect the Ethernet cable to the first dedicated Ethernet port for cluster communications on the primary server.
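The membership states expected at each step of this test case can be summarized in a short Python sketch. It models only what the test case describes: with two or three working heartbeat links the cluster is in its normal state, and with exactly one the servers are in Jeopardy. The link names and the `NO_HEARTBEAT` label for zero working links are assumptions for illustration. This test case never removes the last link.

```python
from enum import Enum
from typing import Mapping


class Membership(Enum):
    NORMAL = "normal"
    JEOPARDY = "jeopardy"
    NO_HEARTBEAT = "no heartbeat"  # outside the scope of this test case


def classify_membership(links_up: Mapping[str, bool]) -> Membership:
    """Classify cluster membership from the up/down state of heartbeat links."""
    working = sum(1 for up in links_up.values() if up)
    if working >= 2:
        return Membership.NORMAL
    if working == 1:
        return Membership.JEOPARDY
    return Membership.NO_HEARTBEAT


if __name__ == "__main__":
    # Hypothetical link names: two dedicated links plus the low-priority link.
    links = {"private1": True, "private2": True, "lowpri": True}
    print("start:", classify_membership(links).value)

    links["private1"] = False   # Step 2: first dedicated cable removed
    print("after step 2:", classify_membership(links).value)

    links["private2"] = False   # Step 4: second dedicated cable removed
    print("after step 4:", classify_membership(links).value)

    links["private2"] = True    # Step 7: second cable reconnected
    print("after step 7:", classify_membership(links).value)
```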
This section covers server failures caused by removing power from the server. Four cases are covered:
Test Case Title: The standby server in a single cluster configuration fails.
Description: This test case verifies that the application running on the primary server is unaffected and that after the standby server is repaired, the standby server can successfully rejoin the cluster configuration.
Test Setup: A dual node cluster (Ethernet and Storage Connections for a Dual-Node Site) with two dedicated cluster communication Ethernet connections and a low-priority cluster communication connection on the network Ethernet connection.
Step 1 Verify that the application is running on the primary server in the cluster.
Step 2 Remove the power for the secondary server and verify that VCS detected the failure and that the application continues to operate on the primary server.
Step 3 Reapply power and boot the secondary server. After the server recovers, verify that it rejoined the cluster in a healthy state by running the following command. The output should be identical to the output in Step 1.
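The "output should be identical to Step 1" check can be automated by saving the command output at Step 1 and comparing it with the output captured after recovery. The Python sketch below is one way to do that, assuming both outputs have been captured as text. It ignores trailing whitespace and surrounding blank lines, which is an assumption about what counts as identical. The sample lines in the usage section are placeholders, not real VCS output.

```python
import difflib
from typing import List


def normalize(output: str) -> List[str]:
    """Split captured output into lines, dropping trailing whitespace."""
    return [line.rstrip() for line in output.strip().splitlines()]


def diff_outputs(before: str, after: str) -> List[str]:
    """Return unified-diff lines; an empty list means the outputs match."""
    return list(
        difflib.unified_diff(
            normalize(before),
            normalize(after),
            fromfile="step 1",
            tofile="after recovery",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Placeholder text standing in for the saved command output.
    step1 = "server-a RUNNING\nserver-b RUNNING\n"
    recovered = "server-a RUNNING\nserver-b RUNNING"
    differences = diff_outputs(step1, recovered)
    if differences:
        print("\n".join(differences))
    else:
        print("Outputs match; the secondary server rejoined in the same state.")
```

If the command output contains timestamps or counters, those lines would need to be filtered out before comparison.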
Test Case Title: The primary server in a single cluster fails.
Description: This test case verifies that if a primary server fails, the application starts running on the secondary server and that after the primary server is restored, the application can be reestablished on the primary server.
Test Setup: A dual node cluster (Local Redundancy HA Configuration).
Step 1 Verify that the APP service group is running on the primary server in the cluster by examining the output of the following command:
Step 2 Remove the power from the primary server and verify that VCS detected the failure and that the APP service group automatically moved to the secondary server.
Step 3 Verify that you can successfully log in to Security Manager from a client machine.
Step 4 Restore the power to the primary server and verify that the server can rejoin the cluster in a healthy condition. Run the following command. The output should be identical to the output in Step 1.
Step 5 Manually switch the APP service group back to the primary server.
Test Case Title: The standby server in a dual cluster configuration fails.
Description: This test case verifies that an application running in the primary cluster is unaffected by a standby server failure and that after the standby server is repaired, the standby server can successfully rejoin the dual cluster configuration.
Test Setup: A dual cluster configuration, with replication (Geographic Redundancy (DR) Configuration), with a single node in each cluster.
Step 1 Verify that the APP and ClusterService service groups are running in the primary cluster by running the following command on the primary server:
Step 2 Remove the power from the secondary server and verify that the primary cluster detects a loss of communication to the secondary cluster:
Step 3 Restore the power to the secondary server. After the server restarts, verify that the primary cluster reestablished communications with the secondary cluster by running the following command. The output should be identical to the output in Step 1.
Step 4 Verify that the replication is operational and consistent by running the following command:
Test Case Title: The primary server in a dual cluster configuration fails.
Description: This test case verifies that if a primary server fails, the application starts running on the secondary server and that after the primary server is restored, the application can be reestablished on the primary server.
Test Setup: A dual cluster configuration, with replication (Geographic Redundancy (DR) Configuration), with a single node in each cluster.
Step 1 Verify that the APP and ClusterService service groups are running in the primary cluster by running the following command from the secondary server:
Step 2 Remove the power from the primary server to cause a server failure. Verify that the secondary cluster reported a loss of connectivity to the primary cluster.
Step 3 Confirm that the state of the replication is disconnected. You can see this state from the flags parameter in the output of the following command:
Step 4 Start the application on the secondary server by using the following command.
Step 5 Log in to the application and change some data so that you can verify later that changes made while the application was operating on the secondary server are retained when you revert to the primary server.
Step 6 Restore power to the primary server and allow the server to fully start up.
Step 7 Verify that the replication status shows that replication is connected but that the two sides are not yet synchronized.
Step 8 Convert the original primary RVG to secondary and synchronize the data volumes in the original primary RVG with the data volumes on the new primary RVG using the fast failback feature. Using the Cluster Explorer for the secondary cluster, right-click the RVGPrimary resource (APP_RVGPrimary), select actions, then select fbsync from the Actions dialog box, and then click OK. Alternatively, you can issue the following command:
Step 9 Verify that the current secondary (former primary) is synchronized with the current primary (former secondary) by looking for the keyword consistent in the flags parameter of the output of the following command:
Step 10 Using the VCS Cluster Explorer on the secondary cluster, select the APP service group. From the shortcut menu, select Switch To, then Remote Switch(...), to open the Switch global dialog box. In the dialog box, specify the primary cluster and the primary server. Alternatively, issue the following command, where primarycluster is the name of the primary cluster:
Step 11 Log in to the application to verify that the changes you made on the secondary server were retained.
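Steps 3 and 9 both read keywords from the flags parameter of the replication status output. The Python sketch below shows one way to make that check less error-prone. It assumes the output contains a line of the form `flags: word word ...`. That layout is an assumption for illustration, so adjust the parsing to match the actual output on your system. The sample strings are made up.

```python
from typing import Set


def replication_flags(output: str) -> Set[str]:
    """Return the set of words on the first 'flags:' line, or an empty set."""
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "flags":
            return set(value.split())
    return set()


def is_disconnected(output: str) -> bool:
    """Step 3: replication is reported as disconnected."""
    return "disconnected" in replication_flags(output)


def is_synchronized(output: str) -> bool:
    """Step 9: the keyword 'consistent' is present in the flags."""
    return "consistent" in replication_flags(output)


if __name__ == "__main__":
    # Made-up sample text in the assumed 'flags: ...' layout.
    sample = "flags: attached consistent connected"
    print("synchronized:", is_synchronized(sample))
    print("disconnected:", is_disconnected(sample))
```

Matching whole words rather than searching for a substring matters here: a plain substring search for `consistent` would also match a word such as `inconsistent`.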
This section covers test cases where the Security Manager application fails. Two cases are covered: a single cluster configuration and a dual cluster configuration. This section contains the following topics:
Test Case Title: The application fails on the primary server in a single cluster configuration.
Description: This test case verifies that VCS detects an application failure and that VCS automatically moves the application to the secondary server.
Test Setup: A dual node cluster (Local Redundancy HA Configuration) using the default application failover behavior.
Step 1 Verify that the APP service group is running on the primary server in the cluster by running the following command:
Step 2 On the server where Security Manager is running, stop the application by issuing the following command:
Step 3 Verify that VCS detects that Security Manager failed on the primary server and starts the application on the secondary server.
Step 4 Manually clear the fault on the APP service group.
Step 5 Manually switch the APP service group back to the primary server.
Test Case Title: The application fails on the primary server in a dual cluster configuration.
Description: This test case verifies that VCS detects an application failure.
Test Setup: A dual cluster configuration, with replication (Geographic Redundancy (DR) Configuration), with a single node in each cluster. This test case also assumes that the default application failover behavior has not been modified (that is, failover between clusters requires manual intervention).
Step 1 Verify that the APP and ClusterService service groups are running in the primary cluster by running the following command from the primary server:
Step 2 On the server where Security Manager is running, stop the application by issuing the following command:
Step 3 Verify that VCS detects that the application failed and stops the APP service group. Issue the following command and observe the output.
Step 4 Manually clear the fault on the APP service group.
Step 5 Put the APP service group online on the primary server to restart the application.
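Across the primary-side failure cases in this appendix, the expected behavior differs mainly by configuration: within a single cluster VCS moves the application automatically, while in a dual cluster configuration the operator intervenes. The Python sketch below collects those expectations into a lookup table that a test harness could consult. The key names (`"single"`, `"dual"`, `"primary_network"`, and so on) are hypothetical labels, and the table assumes the default failover behavior described in the test setups.

```python
from typing import NamedTuple


class Expectation(NamedTuple):
    automatic_failover: bool  # does VCS move the application by itself?
    operator_action: str      # manual follow-up described in the test case


_SWITCH_BACK = "resynchronize with fast failback, then switch APP back to the primary cluster"

EXPECTED = {
    ("single", "primary_network"): Expectation(
        True, "clear the faulted IP resource, then switch APP back to the primary server"),
    ("single", "primary_power"): Expectation(
        True, "restore power, then switch APP back to the primary server"),
    ("single", "application"): Expectation(
        True, "clear the fault on APP, then switch APP back to the primary server"),
    ("dual", "primary_network"): Expectation(
        False, "start APP on the secondary cluster; later " + _SWITCH_BACK),
    ("dual", "primary_power"): Expectation(
        False, "start APP on the secondary cluster; later " + _SWITCH_BACK),
    ("dual", "application"): Expectation(
        False, "clear the fault on APP, then bring APP online on the primary server"),
}


def expected_outcome(cluster_config: str, failure: str) -> Expectation:
    """Look up the expected behavior for a primary-side failure."""
    try:
        return EXPECTED[(cluster_config, failure)]
    except KeyError:
        raise ValueError(f"no test case for {cluster_config!r} / {failure!r}") from None


if __name__ == "__main__":
    for key in sorted(EXPECTED):
        outcome = EXPECTED[key]
        mode = "automatic" if outcome.automatic_failover else "manual"
        print(f"{key[0]:6} {key[1]:16} {mode:9} {outcome.operator_action}")
```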