Introduction

High availability has been a requirement on wireless controllers to minimize downtime in live networks. This document provides information on the theory of operation and configuration for the Catalyst 9800 Wireless Controller as it pertains to supporting stateful switchover of access points and clients (AP and Client SSO). Catalyst 9800 Wireless Controller is the next generation wireless controller that can run on multiple platforms with different scalability goals from low to high scale. AP and Client SSO is supported on the physical appliances and the virtual cloud platforms of the Catalyst 9800 Wireless Controller, namely C9800-40-K9, C9800-80-K9 and C9800-CL-K9 on ESXi and KVM. The underlying SSO functionality is the same on all platforms with some differences in the setup process.

Overview

The high availability SSO capability on the wireless controller allows an access point to establish a CAPWAP tunnel with the Active wireless controller, while the Active wireless controller shares a mirror copy of the AP and client database with the Standby wireless controller. When the Active wireless controller fails and the Standby wireless controller takes over the network as the Active wireless controller, the APs do not go into the Discovery state and clients do not disconnect. Only one CAPWAP tunnel is maintained at a time, between the APs and the wireless controller that is in the Active state.

Release 16.10 supports full access point and Client Stateful Switch Over. Client SSO is supported for clients which have already completed the authentication and DHCP phase and have started passing traffic. With Client SSO, a client's information is synced to the Standby wireless controller when the client associates to the wireless controller or the client’s parameters change. Fully authenticated clients, i.e. the ones in Run state, are synced to the Standby and thus, client re-association is avoided on switchover making the failover seamless for the APs as well as for the clients, resulting in zero client service downtime and zero SSID outage. The overall goal for the addition of AP and client SSO support to the Catalyst 9800 Wireless controller is to reduce major downtime in wireless networks due to failure conditions that may occur due to box failover, network failover or power outage on the primary site.

Feature Description and Functional Behavior

All the control plane activities are centralized and synchronized between the active and standby units. The Active Controller centrally manages all the control and management communication. The network control data traffic is transparently switched from the standby unit to the active unit for centralized processing.

Bulk and Incremental configuration is synced between the two controllers at run-time and both controllers share the same IP address on the management interface. The CAPWAP state of the Access Points that are in Run State is also synched from the active wireless controller to the Hot-Standby wireless controller allowing the Access Points to be state-fully switched over when the Active wireless controller fails. The APs do not go to the Discovery state when Active wireless controller fails, and Standby wireless controller takes over as the Active wireless controller to serve the network.

The two units form a peer connection through a dedicated RP port (this can be a physical copper or fiber port) or a virtual interface for the VM. The Active/Standby election happens at boot time and is based either on the highest priority (priority can be set from 1 to 15) or, if the priority is the same, on the lowest MAC address. By default the C9800 has a priority of 1. Once the HA pair is formed, all the configuration and the AP and client databases are synced between Active and Standby. Any configuration done on the Active is automatically synced to the Standby. The Standby continuously monitors the Active via keepalives over the RP link. If the Active becomes unavailable, the Standby assumes the role of Active. It does so by sending a Gratuitous ARP message advertising to the network that it now owns the wireless management IP address. All the configurations and databases are already in sync, so the Standby can take over without service disruption.

There is no pre-empt functionality with SSO meaning that when the previous Active wireless controller resumes operation, it will not take back the role as an Active wireless controller but will negotiate its state with the current Active wireless controller and transition to Hot-Standby state.
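The non-preemptive behavior can be illustrated with a small role model. This is a minimal Python sketch and not controller code: the class and method names (HaPair, fail_active, rejoin) and the unit names are hypothetical, and the model only mirrors the role transitions described above.

```python
from typing import Optional


class HaPair:
    """Toy model of Active/Standby roles in an SSO pair (hypothetical names)."""

    def __init__(self, active: str, standby: str) -> None:
        self.active: str = active
        self.standby: Optional[str] = standby

    def fail_active(self) -> str:
        """The Active fails and the Standby takes over. Returns the failed unit."""
        if self.standby is None:
            raise RuntimeError("no Standby available to take over")
        failed = self.active
        self.active, self.standby = self.standby, None
        return failed

    def rejoin(self, unit: str) -> None:
        """A recovered unit rejoins as Hot-Standby; it never preempts the Active."""
        if self.standby is not None:
            raise RuntimeError("pair already has a Standby")
        self.standby = unit


pair = HaPair(active="wlc-1", standby="wlc-2")
failed_unit = pair.fail_active()  # wlc-2 becomes the Active
pair.rejoin(failed_unit)          # wlc-1 returns as Hot-Standby, not as Active
```

After the last call, wlc-2 is still the Active and wlc-1 is the Standby, which is the absence of pre-emption described above.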

Platforms supported

  • Cisco Catalyst C9800-40-K9 Wireless Controller

  • Cisco Catalyst C9800-80-K9 Wireless Controller

  • Cisco Catalyst C9800-CL-K9 Wireless Controller on ESXi

  • Cisco Catalyst C9800-CL-K9 Wireless Controller on KVM


Note

  • HA Pair can only be formed between two wireless controllers of the same form factor.

  • Both controllers must be running the same software version in order to form the HA Pair.


SSO Pre-requisites

  • HA Pair can only be formed between two wireless controllers of the same form factor

  • Both controllers must be running the same software version in order to form the HA Pair

  • Maximum RP link latency = 80 ms RTT, minimum bandwidth = 60 Mbps and minimum MTU = 1500
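
The RP link requirements above can be expressed as a simple pre-check. The following Python sketch is illustrative only: the structure and function names are hypothetical, and the measured values are assumed to come from your own measurement tooling.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the prerequisites listed above.
MAX_RTT_MS = 80
MIN_BANDWIDTH_MBPS = 60
MIN_MTU = 1500


@dataclass
class RpLinkMeasurement:
    """Measured properties of the RP link (hypothetical structure)."""
    rtt_ms: float
    bandwidth_mbps: float
    mtu: int


def rp_link_problems(link: RpLinkMeasurement) -> List[str]:
    """Return the prerequisite violations; an empty list means the link qualifies."""
    problems = []
    if link.rtt_ms > MAX_RTT_MS:
        problems.append(f"RTT {link.rtt_ms} ms exceeds {MAX_RTT_MS} ms")
    if link.bandwidth_mbps < MIN_BANDWIDTH_MBPS:
        problems.append(
            f"bandwidth {link.bandwidth_mbps} Mbps is below {MIN_BANDWIDTH_MBPS} Mbps"
        )
    if link.mtu < MIN_MTU:
        problems.append(f"MTU {link.mtu} is below {MIN_MTU}")
    return problems
```

For example, a link measured at 120 ms RTT, 10 Mbps and MTU 1400 would be reported with three violations, while a link at 5 ms, 1000 Mbps and MTU 1500 would return an empty list.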


Note

During HA-SSO mode, the Cisco Catalyst 9800 Series Wireless Controller supports port channel mode=ON.

That is, you need to use the Port Channel mode=ON for the controller and switch ports connected to the controller.


SSO on Cisco Catalyst C9800-40-K9 and C9800-80-K9 wireless controllers

The Cisco C9800-40-K9 wireless controller is an extensible and high performing wireless controller, which can scale up to 2000 access points and 32000 clients. The controller has four 10G data ports and a throughput of 40G.

S.No.   Description
1       RP: RJ-45 1G redundancy Ethernet port.
2       Gigabit SFP RP port.

The Cisco C9800-80-K9 Wireless Controller is a 100G wireless controller that occupies two rack unit space and supports a pluggable Module slot, and eight built-in 10GE/1GE interfaces.

S.No.   Description
1       RP: RJ-45 1G redundancy Ethernet port.
2       Gigabit SFP RP port.

Both C9800-40-K9 and C9800-80-K9 Wireless controllers have two RP Ports as shown in the figures above:

  • RJ-45 Ethernet Redundancy port

  • SFP Gigabit Ethernet Port

If both the Redundancy Ports are connected,

  • SFP Gigabit Ethernet port takes precedence if they are connected at the same time.

  • HA between RJ-45 and SFP Gigabit RP ports is not supported.

  • Only Cisco supported SFPs (GLC-LH-SMD and GLC-SX-MMD) are supported for the RP port

  • When the HA link is up through RJ-45, SFPs should not be inserted into the HA port, even if there is no link between them. Because detection happens at the physical level, inserting an SFP would cause HA to go down, as precedence is given to the SFP
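
The precedence rules above can be summarized in a few lines of Python. This is a simplified sketch under the assumption that SFP presence alone (not link state) decides which RP port is selected, as the last bullet suggests; the function names are hypothetical and do not correspond to any controller API.

```python
from typing import Optional


def selected_rp_port(rj45_connected: bool, sfp_inserted: bool) -> Optional[str]:
    """Return which RP port is selected, based on physical presence only."""
    if sfp_inserted:
        # The SFP port takes precedence as soon as an SFP is physically present.
        return "SFP"
    if rj45_connected:
        return "RJ-45"
    return None


def ha_link_usable(rj45_link_up: bool, sfp_inserted: bool, sfp_link_up: bool) -> bool:
    """Return True if the selected RP port actually has a working link."""
    port = selected_rp_port(rj45_link_up, sfp_inserted)
    if port == "SFP":
        return sfp_link_up
    return port == "RJ-45"
```

In this model, inserting an SFP without a link while HA runs over RJ-45 makes ha_link_usable return False, which corresponds to the HA link going down.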

Physical Connectivity for C9800-40-K9 and C9800-80-K9 wireless controller HA SSO

The HA Pair always has one active controller and one standby controller. The Active wireless controller creates and updates all the wireless information and constantly synchronizes that information with the standby controller. If the active wireless controller fails, the standby wireless controller assumes the role of the active wireless controller and continues to keep the HA Pair operational. Access Points and clients remain connected during an active-to-standby switchover.

Connecting C9800 Wireless Controllers using RJ-45 RP Port for SSO

Connecting C9800 Wireless Controllers using SFP Gigabit RP Port for SSO

Connecting C9800 wireless controller HA pair to upstream switch

Option 1: C9800 controllers with port channel connected to a VSS pair

  • Use an EtherChannel from each wireless controller to the Distribution VSS (Virtual Switching System)

  • Spread the links in each EtherChannel among the two physical switches: this prevents a wireless controller switchover upon a failure of one of the VSS switches

  • Same considerations for connecting to a single Distribution switch apply

Option 2: C9800 with port channel connected to HSRP routers and RP connected to uplink

  • C9800 devices are connected to 2 HSRP routers (Active and Standby). The uplink is a port-channel

  • RP connected to the respective uplink routers

  • Failover of HSRP Active to Standby induces a switchover of the C9800 HA pair, since the RP port is connected through the same HSRP routers as the uplink

  • The AP/Clients are up after an SSO. It is a seamless transition and there are no drops on the AP and client

  • With a static port channel (mode on only) providing link-level redundancy and the RP port via the uplink providing the upstream check, this topology is highly available and is recommended


Note

At FCS, Cisco recommends connecting the RP port through the same switch (or VSS pair) where the uplinks are connected and not directly (back-to-back) between the two controllers.


SSO on Cisco Catalyst C9800-CL-K9 running on ESXi and KVM

The Virtual Catalyst 9800 Wireless controller can be deployed as an HA Pair in a single or dual server setup.

The figure on the left shows Redundant port connected on the same UCS/vSwitch

The figure on the right shows Redundant port L2 connected to a separate UCS server

Configuring High Availability SSO using GUI

Device redundancy can be configured from the Administration > Device > Redundancy page

On the Active controller, the priority is set to a higher value than on the standby controller. The wireless controller with the higher priority value is selected as the active during the active-standby election process. The Remote IP is the IP address of the standby controller’s redundancy port.

On the standby controller, the remote IP is set to the Active controller’s redundancy port IP.

Configuring High Availability SSO using CLI

  • On Virtual Catalyst 9800 Wireless controller, enable High Availability SSO using the following command on each of the two virtual Catalyst 9800 Wireless controller instances

    chassis ha-interface <RP interface> local-ip <local IP> <local IP subnet> remote-ip <remote IP>

    For example, on Virtual Catalyst 9800 Wireless controller instance-1:

    chassis ha-interface Gig 3 local-ip 172.23.174.85 /24 remote-ip 172.23.174.86

    On Virtual Catalyst 9800 Wireless controller instance-2:

    chassis ha-interface Gig 3 local-ip 172.23.174.86 /24 remote-ip 172.23.174.85

  • On C9800-40-K9 and C9800-80-K9 wireless controller, enable High Availability SSO using the following command on each of the two wireless controller units

    chassis ha-interface local-ip <local IP> <local IP subnet> remote-ip <remote IP>

  • Reload both wireless controllers by executing the command reload from the CLI
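
Because the local and remote addresses are mirrored between the two units, the command pair is easy to get wrong by hand. The following Python sketch generates both lines from one set of inputs. It is a hypothetical helper, not a Cisco tool: the function names are assumptions, and the command text simply follows the syntax shown above.

```python
import ipaddress
from typing import Optional, Tuple


def ha_interface_command(local_ip: str, prefix: str, remote_ip: str,
                         rp_interface: Optional[str] = None) -> str:
    """Build one 'chassis ha-interface' line in the form shown above."""
    # Raises ValueError if either address is not valid IPv4.
    ipaddress.IPv4Address(local_ip)
    ipaddress.IPv4Address(remote_ip)
    if local_ip == remote_ip:
        raise ValueError("local and remote RP addresses must differ")
    parts = ["chassis", "ha-interface"]
    if rp_interface:  # only the virtual controller names an RP interface
        parts.append(rp_interface)
    parts += ["local-ip", local_ip, prefix, "remote-ip", remote_ip]
    return " ".join(parts)


def ha_pair_commands(ip_a: str, ip_b: str, prefix: str,
                     rp_interface: Optional[str] = None) -> Tuple[str, str]:
    """Return the mirrored commands for unit A and unit B."""
    return (
        ha_interface_command(ip_a, prefix, ip_b, rp_interface),
        ha_interface_command(ip_b, prefix, ip_a, rp_interface),
    )


cmd_a, cmd_b = ha_pair_commands("172.23.174.85", "172.23.174.86", "/24", "Gig 3")
print(cmd_a)
print(cmd_b)
```

Omitting the rp_interface argument produces the form used on the C9800-40-K9 and C9800-80-K9 appliances.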

Active and Standby Election Process

An active C9800 wireless controller retains its role as the Active Controller unless one of the following events occurs:

  • The wireless controller HA pair is reset.

  • The active wireless controller is removed from the HA pair.

  • The active wireless controller is reset or powered off.

  • The active wireless controller fails.

The active wireless controller is elected or re-elected based on one of these factors and in the order listed below:

  1. The wireless controller that is currently the active wireless controller.

  2. The wireless controller with the highest priority value.


    Note

    We recommend assigning the highest priority value to the wireless controller C9800 you prefer to be the active controller. This ensures that the controller is re-elected as active controller if a re-election occurs.


    Setting the Switch Priority Value

    chassis chassis-number priority new-priority-number

    chassis-number specifies the chassis number and new-priority-number specifies the new priority for the chassis. The chassis number range is 1 to 2.

    The priority value range is 1 to 15.

    Example

    wireless controller#chassis 1 priority 2

    You can display the current priority value by using the show chassis user EXEC command. The new priority value takes effect immediately but does not affect the current Active Controller. The new priority value helps determine which controller is elected as the new Active Controller when the current active wireless controller or HA redundant pair reloads.

  3. The wireless controller with the shortest start-up time.

  4. The wireless controller with the lowest MAC Address.

    The HA LED on the chassis can be used to identify the current Active Controller.
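
The election order listed above can be read as a sequence of tie-breakers. The following Python sketch shows one way to encode that ordering. It is an illustration under stated assumptions, not the controller's implementation: the Controller structure, its field names and the sample values are hypothetical, and details such as how start-up times are compared in practice are not modelled.

```python
from dataclasses import dataclass


@dataclass
class Controller:
    """Election inputs for one unit (hypothetical structure)."""
    name: str
    priority: int            # 1 to 15, higher value wins
    startup_seconds: float   # shorter start-up time wins
    mac: str                 # lowest MAC address wins
    currently_active: bool = False


def mac_to_int(mac: str) -> int:
    """Convert a MAC string with '.', ':' or '-' separators to an integer."""
    return int(mac.replace(".", "").replace(":", "").replace("-", ""), 16)


def elect_active(a: Controller, b: Controller) -> Controller:
    """Apply the four criteria in the listed order and return the winner."""
    def rank(c: Controller):
        # Tuples compare element by element, so earlier fields dominate.
        return (
            not c.currently_active,  # 1. the current Active keeps its role
            -c.priority,             # 2. highest priority value
            c.startup_seconds,       # 3. shortest start-up time
            mac_to_int(c.mac),       # 4. lowest MAC address
        )
    return min((a, b), key=rank)
```

With this ordering, a unit that is already Active wins even against a peer with priority 15, which matches the statement that a new priority value does not affect the current Active Controller.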

State Transition for HA Pair formation

  1. Active wireless controller in Non Redundant mode

  2. Standby Insertion for HA Pairing

  3. HA Sync in Progress

  4. Terminal State for SSO


Note

Breaking the HA Pair: The HA configuration can be disabled by using the chassis clear command followed by a reload.


Monitoring the HA Pair

Both the Active and Standby systems can be monitored from the Management UI of the Active Wireless Controller. This includes information about CPU and memory utilization as well as advanced CPU and memory views.

Navigate to Monitoring > System > Redundancy on the controller Web UI. The Redundancy States page is displayed:

Parameter                       Description
My State                        Shows the state of the active CPU controller module. Values: Active, Standby HOT, Disable.
Peer State                      Displays the state of the peer (or standby) CPU controller module. Values: Standby HOT, Disable.
Mode                            Displays the current state of the redundancy peer. Values: Simplex (single CPU controller module), Duplex (two CPU controller modules).
Unit ID                         Displays the unit ID of the CPU controller module.
Redundancy Mode (Operational)   Displays the current operational redundancy mode supported on the unit.
Redundancy Mode (Configured)    Displays the current configured redundancy mode supported on the unit.
Redundancy State                Displays the current functioning redundancy state of the unit. Values: SSO, Not Redundant.
Manual Swact                    Displays whether manual switchovers have been enabled.
Communications                  Displays whether communications are up or down between the two controllers.

The same page displays the Switchover history. The parameters are described in the table below:

Parameter            Description
Index                Displays the index number of the redundant unit.
Previous Active      Displays the controller that was active prior to switchover.
Current Active       Displays the controller that is currently active.
Switch Over Time     Displays the system time when the switchover occurred.
Switch Over Reason   Displays the cause of the switchover.

Monitoring HA Pair from CLI

The command show chassis displays summary information about the HA Pair, including the MAC address, role, switch priority, and current state of each wireless controller in the redundant HA pair. By default, the Local MAC Address of the HA Pair is the MAC address of the first elected Active Controller.

The show chassis command points to the current C9800 wireless controller on the console using the (*) symbol against the chassis number as shown above.

Verifying Redundancy States

The command show redundancy can be used to monitor the state of the two units

wireless controller#show redundancy ?
  application       box 2 box application information
  clients           Redundancy Facility (RF) client list
  config-sync       Show Redundancy Config Sync status
  counters          Redundancy Facility (RF) operational counters
  domain            Specify the RF domain
  history           Redundancy Facility (RF) history
  idb-sync-history  Redundancy Facility (RF) IDB sync history
  linecard-group    Line card redundancy group information
  rii               Display the redunduncy interface identifier for Box to Box
  states            Redundancy Facility (RF) states
  switchover        Redundancy Facility (RF) switchover
  trace             Redundancy Facility (RF) trace
  |                 Output modifiers
  <cr>              <cr>

The command show redundancy displays the redundant system and the current processor information. The redundant system information includes the system uptime, standby failures, switchover reason, hardware mode, and configured and operating redundancy mode. The current processor information displayed includes the image version, active location, software state, BOOT variable, configuration register value, and uptime in the current state, and so on. The Peer Processor information is only available from the Active Controller.

The command show redundancy states displays all the redundancy states of the active and standby controllers.

Manual Switchover Action (Manual Swact), that is, the command redundancy force-switchover, is enabled only on the Active Controller and cannot be executed on the Standby wireless controller.

Switchover History can be viewed using the following command:

show redundancy switchover history

Accessing standby wireless controller console

The active controller can be accessed through a console connection, Telnet, SSH, or a Web Browser by using the Management IP address. To use the console on the standby wireless controller, execute the following commands from the active Catalyst 9800 Wireless controller.

conf t
redundancy
main-cpu
standby console enable

The prompt on the Standby console is appended with “-stby” to reflect the Standby wireless controller console as shown below.


Note

The show chassis command points to the current C9800 wireless controller on the console using the (*) symbol against the chassis number as shown above. In this case it is the console of the standby Unit.


Switchover Functionality

Process Failure Switchover

This type of switch over occurs when any of the key processes running on the Active unit fails or crashes. Upon such a failure, the Active unit reloads and the hot Standby takes over and becomes the new Active unit. When the failed system boots up, it will transition to Hot-Standby state. If the Standby unit is not yet in Hot Standby State, both units are reloaded and there will be no SSO. A process failure on the standby (hot or not) will cause it to reload.

Power-fail Switchover

This switchover from the Active to the Standby unit is caused by a power failure of the current Active unit. The current Standby unit becomes the new Active unit and when the failed system boots up, it will transition to Hot-Standby state.

Manual Switchover

This is a user-initiated forced switchover between the Active and Standby unit. To perform a manual switchover, execute the redundancy force-switchover command. This command initiates a graceful switchover from the active to the standby controller. The active controller reloads and the standby takes over as the new Active controller. When the previously active system boots up, it will transition to Hot-Standby state.

Failover Process

Standby wireless controller:

An Access Point and client Stateful Switch Over (SSO) implies that all the Access Point and client sessions are switched over state-fully and continue to operate in a network with no loss of sessions, providing improved network availability and reducing service downtime.

Once a redundancy pair is formed, HA is enabled, which means that Access Points and clients continue to remain connected during an active-to-standby switchover.

Verifying AP and Client SSO State Sync

On successful switchover of the standby wireless controller as active, all access points and clients connected to the previously active wireless controller must remain connected to the new Active controller.

This can be verified by executing the commands:

show ap uptime : Verifies that the uptime of the access point after the switchover is not reset.
show wireless client summary: Displays the clients connected to the new Active controller.
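
The uptime check can also be automated once the values have been collected. The following Python sketch assumes you have already extracted per-AP uptimes (in seconds) from the controller by your own means. It does not parse CLI output, and the function name and sample AP names are hypothetical.

```python
from typing import Dict, List


def aps_reset_after_switchover(ap_uptime_seconds: Dict[str, float],
                               seconds_since_switchover: float) -> List[str]:
    """Return the APs whose uptime is shorter than the time since switchover.

    An AP that stayed up through the switchover has an uptime at least as
    long as the time elapsed since the switchover; a shorter uptime suggests
    the AP restarted afterwards.
    """
    return sorted(
        name
        for name, uptime in ap_uptime_seconds.items()
        if uptime < seconds_since_switchover
    )


# Hypothetical sample: the switchover happened 300 seconds ago.
uptimes = {"ap-lobby": 86400.0, "ap-floor2": 120.0}
suspect_aps = aps_reset_after_switchover(uptimes, 300.0)
```

An empty result is consistent with a stateful switchover for the APs; any listed AP is worth investigating.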

SSO Failover Time Metrics

Metrics                                         Time
Failure Detection                               In the order of 50 ms.
Reconciliation Time (Standby becoming Active)   In the order of 1020 ms.

Upgrading an HA Pair

Upgrade of an HA Pair can be achieved by following the steps listed below:

  1. On successful switchover of the standby wireless controller as active, all access points and clients connected to the previously active wireless controller must remain connected to the new Active controller.

  2. Clone the same active image to standby using copy flash:<image name> stby-bootflash:<image name>

  3. Set the boot variable with boot system flash bootflash:<image name> and verify that the entry is present on both the active and the standby

N+1 with SSO Hybrid deployment

A hybrid topology of SSO redundant pair and N+1 primary, secondary and tertiary model is supported as shown above. The secondary controller at the DR site can be a Catalyst C9800-40-K9, C9800-80-K9 or C9800-CL-K9 Wireless controller. Access points failing back from Catalyst 9800 Wireless controller to CUWN controllers will re-download the code before joining the CUWN wireless controller and vice versa.