Installing VTS in High Availability Mode

This chapter provides detailed information about installing VTS in high availability (HA) mode. It details the procedure to enable VTS L2 HA.

See Enabling VTS L2 High Availability for the detailed procedure to enable VTS L2 HA.

Important Notes regarding updating the cluster.conf file:
  • master_name and slave_name cannot be the same.

  • master_network_interface and slave_network_interface are the interface names of VTC1 and VTC2 on which the real IP address resides. They should be the same.

  • If you are using VTFs, fill in the vip_private and private_network_interface fields (see the example after this list). Otherwise, leave these two fields blank.

  • private_network_interface is the secondary interface name of VTC1 and VTC2 on the private network that the VTF is also on.

  • vip_private is the VIP for the VTS master's private interface.

  • private_gateway is the gateway for the private network of your vip_private.
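
If VTFs are part of your setup, the private VIP fields must also be filled in. The following is a minimal sketch of how those entries might look in cluster.conf; the interface names, addresses, and gateway shown here are placeholder values for illustration only:

  ###Interface fields (placeholder values; use the interface names of your VTC VMs)
  master_network_interface=eth0
  slave_network_interface=eth0
  ###Fill in the following only if you are using VTFs; otherwise leave them blank
  private_network_interface=eth1
  vip_private=10.10.10.200
  private_gateway=10.10.10.1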

This chapter has the following sections.

Enabling VTS L2 High Availability

To enable VTC L2 HA, VTC1 and VTC2 must be on the same subnet.

Spawn two VTC VMs. At a minimum, you need three IP addresses for VTC: one for VTC1, one for VTC2, and one for the public Virtual IP (VIP). If you are using VTFs, you also need one for the private VIP, which other devices on the private network, such as the VTF, can reach.

Note


Cisco VTS supports dual stack clusters for L2 HA. For dual stack to be supported, both VTCs (vts01 and vts02) must be installed and configured with both IPv4 and IPv6 addresses. Both VTCs should be reachable over either the IPv4 or the IPv6 address.



Note


Before enabling HA, make sure that both VTC1 and VTC2 have the same password. If they do not, go to the VTC GUI and change the password on the newly brought up VTC so that it is identical to that of the other VTC. Whenever you upgrade a VTC, bring up a new VTC, or do a hardware upgrade of the VTC host, make sure that the passwords are the same.


Enabling VTS L2 HA involves the tasks described in the following sections.

Setting up the VTC Environment

You need to set up the VTC environment before you run the high availability script.


    Step 1   Create a copy of the cluster.conf file from cluster.conf.tmpl, which is in the /opt/vts/etc directory. For example:
    admin@vts01:~$ cd /opt/vts/etc
    
    admin@vts01:/opt/vts/etc$ cp cluster.conf.tmpl cluster.conf
    Step 2   Specify the VIP address and the details of the two nodes in the cluster.conf file. For example:
    admin@vts01:~$ cd /opt/vts/etc/
    
    admin@vts01:/opt/vts/etc$ vi cluster.conf
    
    ###Virtual Ip of VTC Master on the public interface. Must fill in at least 1
    vip_public=172.23.92.202
    vip_public_ipv6=2001:420:10e:2015:c00::202
    
    ###VTC1 Information. Must fill in at least 1 ip address
    master_name=vts01
    master_ip=172.23.92.200
    master_ipv6=2001:420:10e:2015:c00::200
    
    ###VTC2 Information. Must fill in at least 1 ip address
    slave_name=vts02
    slave_ip=172.23.92.201
    slave_ipv6=2001:420:10e:2015:c00::201
    
    ###In the event that a network failure occurs evenly between the two routers, the cluster needs an outside ip to determine where the failure lies
    ###This can be any external ip such as your vmm ip or a dns but it is recommended to be a stable ip within your environment
    ###Must fill in at least 1 ip address
    external_ip=171.70.168.183
    external_ipv6=2001:420:200:1::a
    Note   

    The two nodes communicate with each other using the VIP address, and you can use the VIP address to log in to the Cisco VTS UI. When you use the VIP address, you are logged in directly to the master node. Make sure that you specify the correct host name, IP address, and interface type.
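
    Before you run the high availability script, you can optionally review the values you just entered. The following is a minimal sanity check using standard shell tools (not a VTS-specific command):

    admin@vts01:/opt/vts/etc$ grep -Ev '^#|^$' cluster.conf   # print only the fields that are set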


    Enabling VTC High Availability

    You must run the cluster_install.sh script on both VTCs to enable high availability.


      Step 1   Run the cluster installer script /opt/vts/bin/cluster_install.sh on both VTC1 and VTC2. For example:
      admin@vts02:/opt/vts/etc$ sudo su  - 
      
      [sudo] password for admin:
      
      root@vts02:/opt/vts/etc$ cd ../bin
      
      root@vts02:/opt/vts/bin# ./cluster_install.sh
      172.23.92.200 vts01
      172.23.92.201 vts02
      2001:420:10e:2015:c00::200 vts01
      2001:420:10e:2015:c00::201 vts02
      
      Change made to ncs.conf file. Need to restart ncs
      
      Created symlink from /etc/systemd/system/multi-user.target.wants/pacemaker.service to /lib/systemd/system/pacemaker.service.
      
      Created symlink from /etc/systemd/system/multi-user.target.wants/corosync.service to /lib/systemd/system/corosync.service.
      
      Both nodes are online. Configuring master
      
      Configuring Pacemaker resources
      
      Master node configuration finished
      
      HA cluster is installed
      
      Step 2   Check the status on both nodes to verify that both nodes are online, that the node that was installed first is the master, and that the other node is the slave. For example:
      admin@vts02:/opt/vts/log/nso$ sudo crm status
      
      [sudo] password for admin:
      
      Last updated: Mon Apr 10 18:43:52 2017          Last change: Mon Apr 10 17:15:21 2017 by root via crm_attribute on vts01
      
      Stack: corosync
      
      Current DC: vts01 (version 1.1.14-70404b0) - partition with quorum
      
      2 nodes and 4 resources configured
      
       
      
      Online: [ vts01 vts02 ]
      
       
      
      Full list of resources:
      
       
      
      Master/Slave Set: ms_vtc_ha [vtc_ha]
      
           Masters: [ vts02 ]
      
           Slaves: [ vts01 ]
      
      ClusterIP      (ocf::heartbeat:IPaddr2):       Started vts02
      
      ClusterIPV6    (ocf::heartbeat:IPaddr2):       Started vts02
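
      Once both resources are started, you can also confirm that the public VIP is live and resides on the master node. This is an optional check using standard Linux tools, with the example addresses from cluster.conf above:

      admin@vts01:~$ ping -c 3 172.23.92.202                   # the public VIP should respond
      admin@vts02:~$ sudo ip addr show | grep 172.23.92.202    # on the master, the VIP appears on the public interface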

      Enabling VTSR High Availability

      You need to enable VTSR HA if you have VTFs in your setup. For information about enabling VTSR HA, see Installing VTSR in High Availability Mode.

      Registering vCenter to VTC

      To do this:


        Step 1   Log in to VCSA.
        Step 2   Go to Networking > Distributed Virtual Switch > Manage > VTS.
        Note    For vCenter 6.5, VTS appears under the Configure tab.
        Step 3   Click System Configuration.
        Step 4   Enter the following:
        • VTS IP—This is the public virtual IP address (VIP).

        • VTS GUI Username

        • VTS GUI Password

        Step 5   Click Update.

        Switching Over Between Master and Slave Nodes

        There are two ways to switch over from the Master to the Slave node.

        • Restart the nso service on the Master. The switchover happens automatically. For example:

          admin@vts02:/opt/vts/log/nso$ sudo service nso restart
           
          
          admin@vts02:/opt/vts/log/nso$ sudo crm status
          
          [sudo] password for admin:
          
          Last updated: Mon Apr 10 18:43:52 2017          Last change: Mon Apr 10 17:15:21 2017 by root via crm_attribute on vts01
          
          Stack: corosync
          
          Current DC: vts01 (version 1.1.14-70404b0) - partition with quorum
          
          2 nodes and 4 resources configured
          
           
          
          Online: [ vts01 vts02 ]
          
           
          
          Full list of resources:
          
           
          
          Master/Slave Set: ms_vtc_ha [vtc_ha]
          
               Masters: [ vts01 ]
          
               Slaves: [ vts02 ]
          
          ClusterIP      (ocf::heartbeat:IPaddr2):       Started vts01
          
          ClusterIPV6    (ocf::heartbeat:IPaddr2):       Started vts01
          
          
          
           
          Or,
        • Set the Master node to standby, and then bring it online.

          In the example below, vts01 is initially the Master. It is set to standby, vts02 takes over as the Master, and after vts01 is brought back online it rejoins the cluster as the Slave.

          admin@vts01:~$ sudo crm node standby
          [sudo] password for admin:
          
           
          
          admin@vts01:/opt/vts/log/nso$ sudo crm status
          
          [sudo] password for admin:
          
          Last updated: Mon Apr 10 18:43:52 2017          Last change: Mon Apr 10 17:15:21 2017 by root via crm_attribute on vts01
          
          Stack: corosync
          
          Current DC: vts01 (version 1.1.14-70404b0) - partition with quorum
          
          2 nodes and 4 resources configured
          
           
          
          Node vts01 standby
          Online: [ vts02 ]
          
           
          
          Full list of resources:
          
           
          
          Master/Slave Set: ms_vtc_ha [vtc_ha]
          
               Masters: [ vts02 ]
          
               Stopped: [ vts01 ]
          
          ClusterIP      (ocf::heartbeat:IPaddr2):       Started vts02
          
          ClusterIPV6    (ocf::heartbeat:IPaddr2):       Started vts02
          
           
          admin@vts01~$ sudo crm node online
          
           
          
          admin@vts02:/opt/vts/log/nso$ sudo crm status
          
          [sudo] password for admin:
          
          Last updated: Mon Apr 10 18:43:52 2017          Last change: Mon Apr 10 17:15:21 2017 by root via crm_attribute on vts01
          
          Stack: corosync
          
          Current DC: vts01 (version 1.1.14-70404b0) - partition with quorum
          
          2 nodes and 4 resources configured
          
           
          
          Online: [ vts01 vts02 ]
          
           
          
          Full list of resources:
          
           
          
          Master/Slave Set: ms_vtc_ha [vtc_ha]
          
               Masters: [ vts02 ]
          
               Slaves: [ vts01 ]
          
          ClusterIP      (ocf::heartbeat:IPaddr2):       Started vts02
          
          ClusterIPV6    (ocf::heartbeat:IPaddr2):       Started vts02

        Uninstalling VTC High Availability

        To move VTC back to its pre-High Availability state, run the following script:

        Note


        Make sure the ncs server is active/running. Then run this script on both the active and standby nodes.


        root@vts02:/opt/vts/bin# ./cluster_uninstall.sh
        This will move HA configuration on this system back to pre-installed state. Proceed?(y/n) y

        Troubleshooting Password Change Issues

        If a password change is performed while both the VTS Active and Standby nodes are up, and the change does not get applied on the Standby, the changed password is not updated in the /opt/vts/etc/credentials file on the Standby. As a result, when the VTS Standby VM is brought up, it cannot connect to NCS. crm_mon shows the Standby state as shutdown, and the node does not come online.

        To troubleshoot this:

          Step 1   Copy the /opt/vts/etc/credentials file from the VTC Active to the same location (/opt/vts/etc/credentials) on the VTC Standby node, for example with scp, as shown in the sketch after these steps.
          Step 2   Run the crm node command on VTC Standby to bring it online.
          crm node online VTC2
          Step 3   Run the crm status command to verify that both VTC1 and VTC2 are online.
          crm status
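
          The following is a minimal sketch of the copy in Step 1 using scp, assuming vts01 is the Active and vts02 the Standby as in the earlier examples; depending on file permissions, you may need to run it as root or adjust ownership on the Standby afterward:

          admin@vts01:~$ sudo scp /opt/vts/etc/credentials admin@vts02:/opt/vts/etc/credentials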

          Installing VTSR in High Availability Mode

          VTSR high availability mode needs to be enabled before you install the VTFs in your setup. The second VTSR will not get registered to the VTC if it starts up after VTF installation.

          When you enable VTSR high availability, the system automatically detects which VM is the Master and which is the Slave, based on the information you provide while generating the ISO files.

          Verifying VTSR HA Setup

          You can check the VTSR HA status using the crm_mon command. For example:
           root@vtsr01:/opt/cisco/package# crm_mon -Afr1
          Last updated: Fri Apr 14 06:04:40 2017          Last change: Thu Apr 13 23:08:25 2017 by hacluster via crmd on vtsr01
          Stack: corosync
          Current DC: vtsr02 (version 1.1.14-70404b0) - partition with quorum
          2 nodes and 11 resources configured
          
          Online: [ vtsr01 vtsr02 ]
          
          Full list of resources:
          
          dl_server      (ocf::heartbeat:anything):      Started vtsr01
          Clone Set: cfg_dl_clone [cfg_dl]
               Started: [ vtsr01 vtsr02 ]
          Clone Set: rc_clone [rc]
               Started: [ vtsr01 vtsr02 ]
          Clone Set: confd_clone [confd]
               Started: [ vtsr01 vtsr02 ]
          Clone Set: mping_clone [mgmt_ping]
               Started: [ vtsr01 vtsr02 ]
          Clone Set: uping_clone [underlay_ping]
               Started: [ vtsr01 vtsr02 ]
          
          Node Attributes:
          * Node vtsr01:
              + mping                             : 100       
              + uping                             : 100       
          * Node vtsr02:
              + mping                             : 100       
              + uping                             : 100       
          
          Migration Summary:
          * Node vtsr02:
          * Node vtsr01:
          

          High Availability Scenarios

          This section describes the various HA scenarios.

          Manual Failover

          To do a manual failover:


            Step 1   Run sudo crm node standby on the current VTC Active to force a failover to the Standby node.
            Step 2   Run sudo crm status on the other VTC to verify that it has taken over the Active role.
            Step 3   On the earlier Active, run sudo crm node online to bring it back into the cluster as the Standby. (See the command sequence sketched after these steps.)
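
            The following is a sketch of the full sequence, assuming vts01 is currently the Active node and vts02 the Standby:

            admin@vts01:~$ sudo crm node standby      # force a failover; vts02 takes over as Active
            admin@vts02:~$ sudo crm status            # confirm that Masters: [ vts02 ] is shown
            admin@vts01:~$ sudo crm node online       # rejoin the cluster; vts01 comes back as Standby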

            VTC Master Reboot

            When the VTC Active reboots, the other VTC takes over as the Active, much like a manual failover. After the reboot completes, the old Active VTC automatically comes up as the Standby.

            Split Brain

            When there is a network break and both VTCs are still up, VTC HA attempts to ascertain where the network break lies. During the network failure, the Active and Standby will lose connectivity with each other. At this point, the Active will attempt to contact the external ip (a parameter set during the initial configuration) to see if it still has outside connectivity.

            If it cannot reach the external ip, VTC cannot know if the Standby node is down or if it has promoted itself to Active. As a result, it will shut itself down to avoid having two Active nodes.

            The Standby, upon sensing the loss of connectivity with the Active, tries to promote itself to the Active mode. But first, it will check if it has external connectivity. If it does, it will become the Active node. However, if it also cannot reach the external ip (for instance if a common router is down), it will shut down.

            At this point, the VTC that had the network break cannot tell if the other VTC is Active and receiving transactions. When the network break is resolved, it will be able to do the comparison and the VTC with the latest database will become Active.

            If the other VTC also has a network break or is not available, the agent still cannot do the comparison, and it will wait. If the other VTC remains unavailable for some time, you can force the available VTC to become the master:
            admin@vtc1:/home/admin# sudo /opt/vts/bin/force_master.py

            Double Failure

            When both VTCs are down at the same time, a double failure has occurred. After a VTC comes up, it does not immediately know the state of the other VTC's database. Consequently, before HA resumes, an agent runs in the background trying to compare the two databases. When both systems have recovered, it is able to do the comparison, and the VTC with the latest database becomes the Active.

            If the other VTC is not available for some time, you can force the available VTC to become the master:
            admin@vtc1:/home/admin# sudo /opt/vts/bin/force_master.py