Configuring high availability requires two identical ASAs connected to each other through a dedicated failover link and, optionally, a Stateful Failover link. The health of the active interfaces and units is monitored to determine if specific failover conditions are met. If those conditions are met, failover occurs.
The ASA supports two failover configurations, Active/Active failover and Active/Standby failover. Each failover configuration has its own method for determining and performing failover.
With Active/Active failover, both units can pass network traffic. This also lets you configure traffic sharing on your network. Active/Active failover is available only on units running in multiple context mode.
With Active/Standby failover, only one unit passes traffic while the other unit waits in a standby state. Active/Standby failover is available on units running in either single or multiple context mode.
Both failover configurations support stateful or stateless (regular) failover.
Note When the ASA is configured for Active/Active Stateful Failover, you cannot enable IPsec or SSL VPN. VPN failover is available for Active/Standby failover configurations only.
Failover System Requirements
This section describes the hardware, software, and license requirements for ASAs in a failover configuration.
The two units in a failover configuration must be the same model, have the same number and types of interfaces, the same SSMs installed (if any), and the same RAM installed.
If you are using units with different flash memory sizes in your failover configuration, make sure the unit with the smaller flash memory has enough space to accommodate the software image files and the configuration files. If it does not, configuration synchronization from the unit with the larger flash memory to the unit with the smaller flash memory will fail.
The two units in a failover configuration must be in the same operating modes (routed or transparent, single or multiple context). They must have the same major (first number) and minor (second number) software version. However, you can use different versions of the software during an upgrade process; for example, you can upgrade one unit from Version 8.3(1) to Version 8.3(2) and have failover remain active. We recommend upgrading both units to the same version to ensure long-term compatibility.
The two units in a failover configuration do not need to have identical licenses; the licenses combine to make a failover cluster license. See the “Failover or ASA Cluster Licenses” section for more information.
Failover and Stateful Failover Links
This section describes the failover and the Stateful Failover links, which are dedicated connections between the two units in a failover configuration.
The two units in a failover pair constantly communicate over a failover link to determine the operating status of each unit. The following information is communicated over the failover link:
The unit state (active or standby)
Hello messages (keep-alives)
Network link status
MAC address exchange
Configuration replication and synchronization
Caution All information sent over the failover and Stateful Failover links is sent in clear text unless you secure the communication with a failover key. If the ASA is used to terminate VPN tunnels, this information includes any usernames, passwords and preshared keys used for establishing the tunnels. Transmitting this sensitive data in clear text could pose a significant security risk. We recommend securing the failover communication with a failover key if you are using the ASA to terminate VPN tunnels.
You can use any unused interface on the device as the failover link; however, you cannot specify an interface that is currently configured with a name. The failover link interface is not configured as a normal networking interface; it exists for failover communication only. This interface should only be used for the failover link (and optionally for the Stateful Failover link).
Connect the failover link in one of the following two ways:
Using a switch, with no other device on the same network segment (broadcast domain or VLAN) as the failover interfaces of the ASA.
Using a crossover Ethernet cable to connect the appliances directly, without the need for an external switch.
Note When you use a crossover cable for the failover link, if the interface fails, the link is brought down on both peers. This condition may hamper troubleshooting efforts because you cannot easily determine which interface failed and caused the link to come down.
Note The ASA supports Auto-MDI/MDIX on its copper Ethernet ports, so you can either use a crossover cable or a straight-through cable. If you use a straight-through cable, the interface automatically detects the cable and swaps one of the transmit/receive pairs to MDIX.
Although you can configure the failover and failover state links on a port channel link, this port channel cannot be shared with other firewall traffic.
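The failover link setup described above can be sketched as follows for the primary unit. This is a minimal illustration, not a complete configuration: the physical interface, the link name folink, the IP addresses, and the key value are all assumed placeholders.

```text
! Sketch only: interface, link name, addresses, and key are assumed values.
! The chosen interface must not already be configured with a name.
interface GigabitEthernet0/3
 no shutdown
!
failover lan unit primary
failover lan interface folink GigabitEthernet0/3
failover interface ip folink 192.168.253.1 255.255.255.252 standby 192.168.253.2
! Failover key that secures traffic on the failover and Stateful Failover links
failover key Example-Key-123
failover
```

On the secondary unit, the same commands would typically be entered with failover lan unit secondary in place of failover lan unit primary.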
Stateful Failover Link
To use Stateful Failover, you must configure a Stateful Failover link to pass all state information. You have three options for configuring a Stateful Failover link:
You can use a dedicated Ethernet interface for the Stateful Failover link.
You can share the failover link.
You can share a regular data interface, such as the inside interface. However, this option is not recommended.
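As an illustration of the first option, a dedicated Stateful Failover link might be configured along these lines. The interface, the link name statelink, and the addresses are assumptions chosen for the example.

```text
! Sketch only: interface, link name, and addresses are assumed values.
interface GigabitEthernet0/4
 no shutdown
!
failover link statelink GigabitEthernet0/4
failover interface ip statelink 192.168.254.1 255.255.255.252 standby 192.168.254.2
```

To share the failover link instead (the second option), the failover link command would name the existing failover link, for example failover link folink if the failover link is named folink, with no physical interface argument.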
Connect a dedicated state link in one of the following two ways:
Using a switch, with no other device on the same network segment (broadcast domain or VLAN) as the failover interfaces of the ASA.
Using a crossover Ethernet cable to connect the appliances directly, without the need for an external switch.
Note When you use a crossover cable for the state link, if the interface fails, the link is brought down on both peers. This condition may hamper troubleshooting efforts because you cannot easily determine which interface failed and caused the link to come down.
The ASA supports Auto-MDI/MDIX on its copper Ethernet ports, so you can either use a crossover cable or a straight-through cable. If you use a straight-through cable, the interface automatically detects the cable and swaps one of the transmit/receive pairs to MDIX.
Enable the PortFast option on Cisco switch ports that connect directly to the ASA.
If you use a data interface as the Stateful Failover link, you receive the following warning when you specify that interface as the Stateful Failover link:
Sharing a data interface with the Stateful Failover interface can leave you vulnerable to replay attacks. Additionally, large amounts of Stateful Failover traffic may be sent on the interface, causing performance problems on that network segment.
Note Using a data interface as the Stateful Failover interface is supported in single context, routed mode only.
In multiple context mode, the Stateful Failover link resides in the system context. This interface and the failover interface are the only interfaces in the system context. All other interfaces are allocated to and configured from within security contexts.
Note The IP address and MAC address for the Stateful Failover link do not change at failover unless the Stateful Failover link is configured on a regular data interface.
Caution All information sent over the failover and Stateful Failover links is sent in clear text unless you secure the communication with a failover key. If the ASA is used to terminate VPN tunnels, this information includes any usernames, passwords, and preshared keys used for establishing the tunnels. Transmitting this sensitive data in clear text could pose a significant security risk. We recommend securing the failover communication with a failover key if you are using the ASA to terminate VPN tunnels.
Failover Interface Speed for Stateful Links
If you use the failover link as the Stateful Failover link, you should use the fastest Ethernet interface available. If you experience performance problems on that interface, consider dedicating a separate interface for the Stateful Failover interface.
Use the following failover interface speed guidelines for the ASAs:
Cisco ASA 5510
– Because of the CPU speed limitation, the stateful link speed can be 100 Mbps even though the data interface can operate at 1 Gigabit.
Cisco ASA 5520/5540/5550
– Stateful link speed should match the fastest data link.
Cisco ASA 5580/5585-X
– Use only non-management 1 Gigabit ports for the stateful link because management ports have lower performance and cannot meet the performance requirement for Stateful Failover.
For optimum performance when using long distance failover, the latency for the failover link should be less than 10 milliseconds and must not exceed 250 milliseconds. If latency is more than 10 milliseconds, some performance degradation occurs due to retransmission of failover messages.
The ASA supports sharing of failover heartbeat and stateful link, but we recommend using a separate heartbeat link on systems with high Stateful Failover traffic.
Avoiding Interrupted Failover Links
Because the ASA uses failover interfaces to transport messages between primary and secondary units, if a failover interface is down (that is, the physical link is down or the switch used to connect the interface is down), then the ASA failover operation is affected until the health of the failover interface is restored.
In the event that all communication is cut off between the units in a failover pair, both units go into the active state, which is expected behavior. When communication is restored and the two active units resume communication through the failover link or through any monitored interface, the primary unit remains active, and the secondary unit immediately returns to the standby state. This relationship is established regardless of the health of the primary unit.
Because of this behavior, stateful flows that were passed properly by the secondary active unit during the network split are now interrupted. To avoid this interruption, failover links and data interfaces should travel through different paths to decrease the chance that all links fail at the same time. In the event that only one failover link is down, the ASA takes a sample of the interface health, exchanges this information with its peer through the data interface, and performs a switchover if the active unit has a greater number of down interfaces. Subsequently, the failover operation is suspended until the health of the failover link is restored.
Depending upon their network topologies, several primary/secondary failure scenarios exist in ASA failover pairs, as shown in the following scenarios.
Scenario 1—Not Recommended
If a single switch or a set of switches is used to connect both failover and data interfaces between two ASAs, then when a switch or inter-switch link is down, both ASAs become active. Therefore, the two connection methods shown in Figure 1-1 and Figure 1-2 are NOT recommended.
Figure 1-1 Connecting with a Single Switch—Not Recommended
Figure 1-2 Connecting with a Double Switch—Not Recommended
To make the ASA failover pair resistant to failover interface failure, we recommend that failover interfaces NOT use the same switch as the data interfaces, as shown in the preceding connections. Instead, use a different switch or use a direct cable to connect two ASA failover interfaces, as shown in Figure 1-3 and Figure 1-4.
Figure 1-3 Connecting with a Different Switch
Figure 1-4 Connecting with a Cable
If the ASA data interfaces are connected to more than one set of switches, then a failover interface can be connected to one of the switches, preferably the switch on the secure side of the network, as shown in Figure 1-5.
Figure 1-5 Connecting with a Secure Switch
The most reliable failover configurations use a redundant interface on the failover interface, as shown in Figure 1-6 and Figure 1-7.
Figure 1-6 Connecting with Redundant Interfaces
Figure 1-7 Connecting with Inter-switch Links
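A redundant interface used as the failover link might look like the following sketch. The redundant interface number, the member interfaces, and the link name folink are assumed for illustration.

```text
! Sketch only: interface numbers and the link name are assumed values.
! Member interfaces must be enabled and must not have a name configured.
interface GigabitEthernet0/2
 no shutdown
interface GigabitEthernet0/3
 no shutdown
!
interface Redundant1
 member-interface GigabitEthernet0/2
 member-interface GigabitEthernet0/3
!
failover lan interface folink Redundant1
```

Cabling each member interface through a different switch, as in the figures above, keeps the failover link available if one path fails.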
Active/Active and Active/Standby Failover
Two types of failover configurations are supported by the ASA: Active/Standby and Active/Active.
In Active/Standby failover, one unit is the active unit. It passes traffic. The standby unit does not actively pass traffic. When a failover occurs, the active unit fails over to the standby unit, which then becomes active. You can use Active/Standby failover for ASAs in single or multiple context mode, although it is most commonly used for ASAs in single context mode.
Active/Active failover is only available to ASAs in multiple context mode. In an Active/Active failover configuration, both ASAs can pass network traffic. In Active/Active failover, you divide the security contexts on the ASA into failover groups. A failover group is simply a logical group of one or more security contexts. Each group is assigned to be active on a specific ASA in the failover pair. When a failover occurs, it occurs at the failover group level.
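The grouping of contexts into failover groups can be sketched as follows, entered in the system execution space. The context names ctx1 and ctx2 are hypothetical.

```text
! Sketch only: context names are assumed values.
! Group 1 prefers to be active on the primary unit, group 2 on the secondary.
failover group 1
 primary
failover group 2
 secondary
!
context ctx1
 join-failover-group 1
context ctx2
 join-failover-group 2
```

With this layout, each unit is active for one group during normal operation, which is what allows both units to pass traffic.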
With stateless (regular) failover, all active connections are dropped when a failover occurs. Clients need to reestablish connections when the new active unit takes over.
Note Some configuration elements for clientless SSL VPN (such as bookmarks and customization) use the VPN failover subsystem, which is part of Stateful Failover. You must use Stateful Failover to synchronize these elements between the members of the failover pair. Stateless (regular) failover is not recommended for clientless SSL VPN.
When Stateful Failover is enabled, the active unit continually passes per-connection state information to the standby unit. After a failover occurs, the same connection information is available at the new active unit. Supported end-user applications are not required to reconnect to keep the same communication session.
In Version 8.4 and later, Stateful Failover participates in dynamic routing protocols, like OSPF and EIGRP, so routes that are learned through dynamic routing protocols on the active unit are maintained in a Routing Information Base (RIB) table on the standby unit. Upon a failover event, packets travel normally with minimal disruption to traffic because the Active secondary ASA initially has rules that mirror the primary ASA. Immediately after failover, the re-convergence timer starts on the newly Active unit. Then the epoch number for the RIB table increments. During re-convergence, OSPF and EIGRP routes become updated with a new epoch number. Once the timer is expired, stale route entries (determined by the epoch number) are removed from the table. The RIB then contains the newest routing protocol forwarding information on the newly Active unit.
The following state information is passed to the standby ASA when Stateful Failover is enabled:
NAT translation table
TCP connection states
UDP connection states
The ARP table
The Layer 2 bridge table (when running in transparent firewall mode)
The HTTP connection states (if HTTP replication is enabled)
The ISAKMP and IPsec SA table
GTP PDP connection database
SIP signalling sessions
ICMP connection state. ICMP connection replication is enabled only if the respective interface is assigned to an asymmetric routing group.
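HTTP connection state appears in the list above only when HTTP replication is enabled. A minimal sketch of enabling it for an Active/Standby pair, assuming Stateful Failover is already configured:

```text
! Sketch only: assumes a Stateful Failover link is already configured.
! Replicates HTTP connection state to the standby unit.
failover replication http
```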
The following state information is not passed to the standby ASA when Stateful Failover is enabled:
The HTTP connection table (unless HTTP replication is enabled).
The user authentication (uauth) table.
Inspected protocols are subject to advanced TCP-state tracking, and the TCP state of these connections is not automatically replicated. While these connections are replicated to the standby unit, there is a best-effort attempt to re-establish a TCP state.
DHCP server address leases.
State information for modules.
Stateful Failover for phone proxy. When the active unit goes down, the call fails, media stops flowing, and the phone should unregister from the failed unit and reregister with the active unit. The call must be re-established.
The following clientless SSL VPN features are not supported with Stateful Failover:
IPv6 clientless or AnyConnect sessions
Citrix authentication (Citrix users must reauthenticate after failover)
Note If failover occurs during an active Cisco IP SoftPhone session, the call remains active because the call session state information is replicated to the standby unit. When the call is terminated, the IP SoftPhone client loses connection with the Cisco CallManager. This occurs because there is no session information for the CTIQBE hangup message on the standby unit. When the IP SoftPhone client does not receive a response back from the CallManager within a certain time period, it considers the CallManager unreachable and unregisters itself.
For VPN failover, VPN end-users should not have to reauthenticate or reconnect the VPN session in the event of a failover. However, applications operating over the VPN connection could lose packets during the failover process and not recover from the packet loss.
Intra- and Inter-Chassis Module Placement for the ASA Services Module
You can place the primary and secondary ASASMs within the same switch or in two separate switches.
If you install the secondary ASASM in the same switch as the primary ASASM, you protect against module-level failure. To protect against switch-level failure, as well as module-level failure, see the “Inter-Chassis Failover” section.
Even though both ASASMs are assigned the same VLANs, only the active module takes part in networking. The standby module does not pass any traffic.
Figure 1-8 shows a typical intra-switch configuration.
Figure 1-8 Intra-Switch Failover
To protect against switch-level failure, you can install the secondary ASASM in a separate switch. The ASASM does not coordinate failover directly with the switch, but it works harmoniously with the switch failover operation. See the switch documentation to configure failover for the switch.
To accommodate the failover communications between ASASMs, we recommend that you configure a trunk port between the two switches that carries the failover and state VLANs. The trunk ensures that failover communication between the two units is subject to minimal failure risk.
For other VLANs, you must ensure that both switches have access to all firewall VLANs, and that monitored VLANs can successfully pass hello packets between both switches.
Figure 1-9 shows a typical switch and ASASM redundancy configuration. The trunk between the two switches carries the failover ASASM VLANs (VLANs 10 and 11).
Note ASASM failover is independent of the switch failover operation; however, ASASM works in any switch failover scenario.
Figure 1-9 Normal Operation
If the primary ASASM fails, then the secondary ASASM becomes active and successfully passes the firewall VLANs (Figure 1-10).
Figure 1-10 ASASM Failure
If the entire switch fails, as well as the ASASM (such as in a power failure), then both the switch and the ASASM fail over to their secondary units (Figure 1-11).
Figure 1-11 Switch Failure
Transparent Firewall Mode Requirements
When the active unit fails over to the standby unit, the connected switch port running Spanning Tree Protocol (STP) can go into a blocking state for 30 to 50 seconds when it senses the topology change. To avoid traffic loss while the port is in a blocking state, you can configure one of the following workarounds depending on the switch port mode:
Access mode—Enable the STP PortFast feature on the switch:
The PortFast feature immediately transitions the port into STP forwarding mode upon linkup. The port still participates in STP. So if the port is to be a part of the loop, the port eventually transitions into STP blocking mode.
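On a Cisco IOS switch, enabling PortFast on an access port facing the ASA might look like the following sketch. The switch interface name is an assumption, and the exact syntax can vary by switch platform and software release.

```text
! Sketch only (Cisco IOS switch, not the ASA): interface name is an assumed value.
interface GigabitEthernet1/0/10
 switchport mode access
 spanning-tree portfast
```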
Trunk mode—Block BPDUs on the ASA on both the inside and outside interfaces:
access-list id ethertype deny bpdu
access-group id in interface inside_name
access-group id in interface outside_name
Blocking BPDUs disables STP on the switch. Be sure not to have any loops involving the ASA in your network layout.
If neither of the above options is possible, then you can use one of the following less desirable workarounds that impact failover functionality or STP stability:
Disable failover interface monitoring.
Increase failover interface holdtime to a high value that will allow STP to converge before the ASAs fail over.
Decrease STP timers to allow STP to converge faster than the failover interface holdtime.
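The first two workarounds can be sketched on the ASA as follows. The interface name inside and the timer values are assumptions; the hold time shown is simply chosen to exceed the 30 to 50 second STP blocking period mentioned above.

```text
! Sketch only: interface name and timer values are assumed.
! Workaround 1: stop monitoring an interface for failover purposes.
no monitor-interface inside
!
! Workaround 2: lengthen the interface hold time so that STP can converge
! before the ASAs fail over (hold time must be at least 5 times the poll time).
failover polltime interface 15 holdtime 75
```

Both workarounds slow or suppress detection of genuine interface failures, which is why they are listed as less desirable.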
Auto Update Server Support in Failover Configurations
You can use the Auto Update Server to deploy software images and configuration files to ASAs in an Active/Standby failover configuration. To enable Auto Update on an Active/Standby failover configuration, enter the Auto Update Server configuration on the primary unit in the failover pair.
The following restrictions and behaviors apply to Auto Update Server support in failover configurations:
Only single mode, Active/Standby configurations are supported.
When loading a new platform software image, the failover pair stops passing traffic.
When using LAN-based failover, new configurations must not change the failover link configuration. If they do, communication between the units will fail.
Only the primary unit will perform the call home to the Auto Update Server. The primary unit must be in the active state to call home. If it is not, the ASA automatically fails over to the primary unit.
Only the primary unit downloads the software image or configuration file. The software image or configuration is then copied to the secondary unit.
The interface MAC address and hardware-serial ID are taken from the primary unit.
The configuration file stored on the Auto Update Server or HTTP server is for the primary unit only.
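A minimal sketch of the Auto Update Server configuration entered on the primary unit is shown here. The server URL is a placeholder, and the device ID choice and poll period are assumptions for illustration.

```text
! Sketch only: URL, device ID type, and poll period are assumed values.
! Enter on the primary unit of the Active/Standby pair.
auto-update server https://aus.example.com/autoupdate/AutoUpdateServlet
auto-update device-id hostname
! Poll period in minutes
auto-update poll-period 720
```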
Auto Update Process Overview
The following is an overview of the Auto Update process in failover configurations. This process assumes that failover is enabled and operational. The Auto Update process cannot occur if the units are synchronizing configurations, if the standby unit is in the failed state for any reason other than SSM card failure, or if the failover link is down.
1. Both units exchange the platform and ASDM software checksum and version information.
2. The primary unit contacts the Auto Update Server. If the primary unit is not in the active state, the ASA first fails over to the primary unit and then contacts the Auto Update Server.
3. The Auto Update Server replies with software checksum and URL information.
4. If the primary unit determines that the platform image file needs to be updated for either the active or standby unit, the following occurs:
a. The primary unit retrieves the appropriate files from the HTTP server using the URL from the Auto Update Server.
b. The primary unit copies the image to the standby unit and then updates the image on itself.
c. If both units have a new image, the secondary (standby) unit is reloaded first.
– If hitless upgrade can be performed when the secondary unit boots, then the secondary unit becomes the active unit and the primary unit reloads. The primary unit becomes the active unit when it has finished loading.
– If hitless upgrade cannot be performed when the standby unit boots, then both units reload at the same time.
d. If only the secondary (standby) unit has a new image, then only the secondary unit reloads. The primary unit waits until the secondary unit finishes reloading.
e. If only the primary (active) unit has a new image, the secondary unit becomes the active unit, and the primary unit reloads.
f. The update process starts again at Step 1.
5. If the ASA determines that the ASDM file needs to be updated for either the primary or secondary unit, the following occurs:
a. The primary unit retrieves the ASDM image file from the HTTP server using the URL provided by the Auto Update Server.
b. The primary unit copies the ASDM image to the standby unit, if needed.
c. The primary unit updates the ASDM image on itself.
d. The update process starts again at Step 1.
6. If the primary unit determines that the configuration needs to be updated, the following occurs:
a. The primary unit retrieves the configuration file using the specified URL.
b. The new configuration replaces the old configuration on both units simultaneously.
c. The update process begins again at Step 1.
7. If the checksums match for all image and configuration files, no updates are required. The process ends until the next poll time.
Monitoring the Auto Update Process
You can use the debug auto-update client or debug fover cmd-exe commands to display the actions performed during the Auto Update process. The following is sample output from the debug auto-update client command from a terminal session.
Auto-update client: Sent DeviceDetails to /cgi-bin/dda.pl of server 192.168.0.21
Auto-update client: Processing UpdateInfo from server 192.168.0.21
The ASA determines the health of the other unit by monitoring the failover link. When a unit does not receive three consecutive hello messages on the failover link, the unit sends interface hello messages on each interface, including the failover interface, to validate whether or not the peer interface is responsive. The action that the ASA takes depends upon the response from the other unit. See the following possible actions:
If the ASA receives a response on the failover interface, then it does not fail over.
If the ASA does not receive a response on the failover link, but it does receive a response on another interface, then the unit does not fail over. The failover link is marked as failed. You should restore the failover link as soon as possible because the unit cannot fail over to the standby while the failover link is down.
If the ASA does not receive a response on any interface, then the standby unit switches to active mode and classifies the other unit as failed.
You can configure the frequency of the hello messages and the hold time before failover occurs. A faster poll time and shorter hold time speed the detection of unit failures and make failover occur more quickly, but they can also cause “false” failures when network congestion delays the keepalive packets.
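As a sketch of tuning these timers for faster detection, the unit poll time and hold time might be set as follows. The values are illustrative assumptions, and aggressive values like these increase the risk of false failures on congested networks.

```text
! Sketch only: timer values are assumed for illustration.
! Send unit hello messages every 200 ms; declare the peer failed after 800 ms.
failover polltime unit msec 200 holdtime msec 800
```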
You can monitor up to 250 interfaces divided between all contexts. You should monitor important interfaces. For example, you might configure one context to monitor a shared interface. (Because the interface is shared, all contexts benefit from the monitoring.)
When a unit does not receive hello messages on a monitored interface for half of the configured hold time, it runs the following tests:
1. Link Up/Down test—A test of the interface status. If the Link Up/Down test indicates that the interface is operational, then the ASA performs network tests. The purpose of these tests is to generate network traffic to determine which (if either) unit has failed. At the start of each test, each unit clears its received packet count for its interfaces. At the conclusion of each test, each unit looks to see if it has received any traffic. If it has, the interface is considered operational. If one unit receives traffic for a test and the other unit does not, the unit that received no traffic is considered failed. If neither unit has received traffic, then the next test is used.
2. Network Activity test—A received network activity test. The unit counts all received packets for up to 5 seconds. If any packets are received at any time during this interval, the interface is considered operational and testing stops. If no traffic is received, the ARP test begins.
3. ARP test—A reading of the unit ARP cache for the 2 most recently acquired entries. One at a time, the unit sends ARP requests to these machines, attempting to stimulate network traffic. After each request, the unit counts all received traffic for up to 5 seconds. If traffic is received, the interface is considered operational. If no traffic is received, an ARP request is sent to the next machine. If at the end of the list no traffic has been received, the ping test begins.
4. Broadcast Ping test—A ping test that consists of sending out a broadcast ping request. The unit then counts all received packets for up to 5 seconds. If any packets are received at any time during this interval, the interface is considered operational and testing stops.
If an interface has IPv4 and IPv6 addresses configured on it, the ASA uses the IPv4 addresses to perform the health monitoring.
If an interface has only IPv6 addresses configured on it, then the ASA uses IPv6 neighbor discovery instead of ARP to perform the health monitoring tests. For the broadcast ping test, the ASA uses the IPv6 all nodes address (FF02::1).
If all network tests fail for an interface, but this interface on the other unit continues to successfully pass traffic, then the interface is considered to be failed. If the threshold for failed interfaces is met, then a failover occurs. If the other unit interface also fails all the network tests, then both interfaces go into the “Unknown” state and do not count towards the failover limit.
An interface becomes operational again if it receives any traffic. A failed ASA returns to standby mode if the interface failure threshold is no longer met.
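Interface monitoring, the interface hold time, and the failed-interface threshold discussed above might be configured as in this sketch for single context mode. The interface names and values are assumptions.

```text
! Sketch only: interface names and values are assumed.
! Monitor the interfaces that matter for failover decisions.
monitor-interface inside
monitor-interface outside
!
! Interface hellos every 5 seconds; testing starts after half the hold time.
failover polltime interface 5 holdtime 25
!
! Fail over when one monitored interface has failed.
failover interface-policy 1
```

The interface policy can also be expressed as a percentage of monitored interfaces rather than an absolute count.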
Note If a failed unit does not recover and you believe it should not be failed, you can reset the state by entering the failover reset command. If the failover condition persists, however, the unit will fail again.
Table 1-2 shows the minimum, default, and maximum failover times.
Table 1-2 Cisco ASA 5500 Series ASA Failover Times
Active unit loses power or stops normal operation.
Active unit main board interface link down.
Active unit 4GE module interface link down.
Active unit IPS or CSC module fails.
Active unit interface up, but connection problem causes interface testing.
When a failover occurs, both ASAs send out system messages.
The ASA issues a number of system messages related to failover at priority level 2, which indicates a critical condition. To view these messages, see the syslog messages guide. To enable logging, see Chapter 1, “Configuring Logging.”
Note During switchover, failover logically shuts down and then brings up interfaces, generating syslog messages 411001 and 411002. This is normal activity.
To see debug messages, enter the debug fover command. See the command reference for more information.
Note Because debugging output is assigned high priority in the CPU process, it can drastically affect system performance. For this reason, use the debug fover commands only to troubleshoot specific problems or during troubleshooting sessions with Cisco TAC.
To receive SNMP syslog traps for failover, configure the SNMP agent to send SNMP traps to SNMP management stations, define a syslog host, and compile the Cisco syslog MIB into your SNMP management station. See Chapter 1, “Configuring SNMP” for more information.
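The logging and SNMP pieces mentioned above could be combined roughly as follows. The interface name, management station address, and community string are placeholders, and the severity level is chosen here only to match the priority level 2 (critical) failover messages.

```text
! Sketch only: interface, address, and community string are assumed values.
logging enable
! Define a syslog host
logging host inside 10.1.1.50
! Send syslog messages at critical (level 2) and more severe as SNMP traps
logging history critical
snmp-server host inside 10.1.1.50 community Example-Community
snmp-server enable traps syslog
```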