Deploy a Cluster for the ASA Virtual in a Public Cloud
Clustering lets you group multiple ASA virtuals together as a single logical device. A cluster provides all the convenience of a single device (management, integration into a network) while achieving the increased throughput and redundancy of multiple devices. You can deploy ASA virtual clusters in a public cloud using the following:
-
Amazon Web Services (AWS) with ASA 9.19+
-
Microsoft Azure with ASA 9.20(2)+
Currently, only routed firewall mode is supported.
Note |
About ASA Virtual Clustering in the Public Cloud
This section describes the clustering architecture and how it works.
How the Cluster Fits into Your Network
The cluster consists of multiple firewalls acting as a single device. To act as a cluster, the firewalls need the following infrastructure:
-
Isolated network for intra-cluster communication, known as the cluster control link, using VXLAN interfaces. VXLANs, which act as Layer 2 virtual networks over Layer 3 physical networks, let the ASA virtual send broadcast/multicast messages over the cluster control link.
-
Load Balancer(s)—For external load balancing, you have the following options:
-
AWS Gateway Load Balancer
The AWS Gateway Load Balancer combines a transparent network gateway and a load balancer that distributes traffic and scales virtual appliances on demand. The ASA virtual supports the Gateway Load Balancer centralized control plane with a distributed data plane (Gateway Load Balancer endpoint) using a Geneve interface single-arm proxy.
-
Equal-Cost Multi-Path Routing (ECMP) using inside and outside routers such as Cisco Cloud Services Router
ECMP routing can forward packets over multiple “best paths” that tie for top place in the routing metric. Like EtherChannel, a hash of source and destination IP addresses and/or source and destination ports can be used to send a packet to one of the next hops. If you use static routes for ECMP routing, then the ASA virtual failure can cause problems; the route continues to be used, and traffic to the failed ASA virtual will be lost. If you use static routes, be sure to use a static route monitoring feature such as Object Tracking. We recommend using dynamic routing protocols to add and remove routes, in which case, you must configure each ASA virtual to participate in dynamic routing.
Note
Layer 2 Spanned EtherChannels are not supported for load balancing.
-
Cluster Nodes
Cluster nodes work together to accomplish the sharing of the security policy and traffic flows. This section describes the nature of each node role.
Bootstrap Configuration
On each device, you configure a minimal bootstrap configuration including the cluster name, cluster control link interface, and other cluster settings. The first node on which you enable clustering typically becomes the control node. When you enable clustering on subsequent nodes, they join the cluster as data nodes.
Control and Data Node Roles
One member of the cluster is the control node. If multiple cluster nodes come online at the same time, the control node is determined by the priority setting in the bootstrap configuration; the priority is set between 1 and 100, where 1 is the highest priority. All other members are data nodes. Typically, when you first create a cluster, the first node you add becomes the control node simply because it is the only node in the cluster so far.
You must perform all configuration (aside from the bootstrap configuration) on the control node only; the configuration is then replicated to the data nodes. In the case of physical assets, such as interfaces, the configuration of the control node is mirrored on all data nodes. For example, if you configure Ethernet 1/2 as the inside interface and Ethernet 1/1 as the outside interface, then these interfaces are also used on the data nodes as inside and outside interfaces.
Some features do not scale in a cluster, and the control node handles all traffic for those features.
Individual Interfaces
You can configure cluster interfaces as Individual interfaces.
Individual interfaces are normal routed interfaces, each with their own local IP address. Interface configuration must be configured only on the control node, and each interface uses DHCP.
Note |
Layer 2 Spanned EtherChannels are not supported. |
Cluster Control Link
Each node must dedicate one interface as a VXLAN (VTEP) interface for the cluster control link.
VXLAN Tunnel Endpoint
VXLAN tunnel endpoint (VTEP) devices perform VXLAN encapsulation and decapsulation. Each VTEP has two interface types: one or more virtual interfaces called VXLAN Network Identifier (VNI) interfaces, and a regular interface called the VTEP source interface that tunnels the VNI interfaces between VTEPs. The VTEP source interface is attached to the transport IP network for VTEP-to-VTEP communication.
VTEP Source Interface
The VTEP source interface is a regular ASA virtual interface with which you plan to associate the VNI interface. You can configure one VTEP source interface to act as the cluster control link. The source interface is reserved for cluster control link use only. Each VTEP source interface has an IP address on the same subnet. This subnet should be isolated from all other traffic, and should include only the cluster control link interfaces.
VNI Interface
A VNI interface is similar to a VLAN interface: it is a virtual interface that keeps network traffic separated on a given physical interface by using tagging. You can only configure one VNI interface. Each VNI interface has an IP address on the same subnet.
Peer VTEPs
Unlike regular VXLAN for data interfaces, which allows a single VTEP peer, The ASA virtual clustering allows you to configure multiple peers.
Cluster Control Link Traffic Overview
Cluster control link traffic includes both control and data traffic.
Control traffic includes:
-
Control node election.
-
Configuration replication.
-
Health monitoring.
Data traffic includes:
-
State replication.
-
Connection ownership queries and data packet forwarding.
Cluster Control Link Failure
If the cluster control link line protocol goes down for a unit, then clustering is disabled; data interfaces are shut down. After you fix the cluster control link, you must manually rejoin the cluster by re-enabling clustering.
Note |
When the ASA virtual becomes inactive, all data interfaces are shut down; only the management-only interface can send and receive traffic. The management interface remains up using the IP address the unit received from DHCP or the cluster IP pool. If you use a cluster IP pool, if you reload and the unit is still inactive in the cluster, then the management interface is not accessible (because it then uses the Main IP address, which is the same as the control node). You must use the console port (if available) for any further configuration. |
Configuration Replication
All nodes in the cluster share a single configuration. You can only make configuration changes on the control node (with the exception of the bootstrap configuration), and changes are automatically synced to all other nodes in the cluster.
ASA Virtual Cluster Management
One of the benefits of using ASA virtual clustering is the ease of management. This section describes how to manage the cluster.
Management Network
We recommend connecting all nodes to a single management network. This network is separate from the cluster control link.
Management Interface
Use the Management 0/0 interface for management.
Note |
You cannot enable dynamic routing for the management interface. You must use a static route. |
You can use either static addressing or DHCP for the management IP address.
If you use static addressing, you can use a Main cluster IP address that is a fixed address for the cluster that always belongs to the current control node. For each interface, you also configure a range of addresses so that each node, including the current control node, can use a Local address from the range. The Main cluster IP address provides consistent management access to an address; when a control node changes, the Main cluster IP address moves to the new control node, so management of the cluster continues seamlessly. The Local IP address is used for routing, and is also useful for troubleshooting. For example, you can manage the cluster by connecting to the Main cluster IP address, which is always attached to the current control node. To manage an individual member, you can connect to the Local IP address. For outbound management traffic such as TFTP or syslog, each node, including the control node, uses the Local IP address to connect to the server.
If you use DHCP, you do not use a pool of Local addresses or have a Main cluster IP address.
Note |
To-the-box traffic needs to be directed to the node's management IP address; to-the-box traffic is not forwarded over the cluster control link to any other node. |
Control Node Management Vs. Data Node Management
All management and monitoring can take place on the control node. From the control node, you can check runtime statistics, resource usage, or other monitoring information of all nodes. You can also issue a command to all nodes in the cluster, and replicate the console messages from data nodes to the control node.
You can monitor data nodes directly if desired. Although also available from the control node, you can perform file management on data nodes (including backing up the configuration and updating images). The following functions are not available from the control node:
-
Monitoring per-node cluster-specific statistics.
-
Syslog monitoring per node (except for syslogs sent to the console when console replication is enabled).
-
SNMP
-
NetFlow
Crypto Key Replication
When you create a crypto key on the control node, the key is replicated to all data nodes. If you have an SSH session to the Main cluster IP address, you will be disconnected if the control node fails. The new control node uses the same key for SSH connections, so that you do not need to update the cached SSH host key when you reconnect to the new control node.
ASDM Connection Certificate IP Address Mismatch
By default, a self-signed certificate is used for the ASDM connection based on the Local IP address. If you connect to the Main cluster IP address using ASDM, then a warning message about a mismatched IP address might appear because the certificate uses the Local IP address, and not the Main cluster IP address. You can ignore the message and establish the ASDM connection. However, to avoid this type of warning, you can enroll a certificate that contains the Main cluster IP address and all the Local IP addresses from the IP address pool. You can then use this certificate for each cluster member. See https://www.cisco.com/c/en/us/td/docs/security/asdm/identity-cert/cert-install.html for more information.
Licenses for ASA Virtual Clustering
Each cluster node requires the same model license. We recommend using the same number of CPUs and memory for all nodes, or else peformance will be limited on all nodes to match the least capable member. The throughput level will be replicated from the control node to each data node so they match.
Note |
If you deregister the ASA virtual so that it is unlicensed, then it will revert to a severely rate-limited state if you reload the ASA virtual. An unlicensed, low performing cluster node will impact the performance of the entire cluster negatively. Be sure to keep all cluster nodes licensed, or remove any unlicensed nodes. |
Requirements and Prerequisites for ASA Virtual Clustering
Model Requirements
-
ASAv30, ASAv50, ASAv100
-
The following public cloud services:
-
Amazon Web Services (AWS) with ASA 9.19+
-
Microsoft Azure with ASA 9.20(2)+
-
-
Maximum 16 nodes
See also the general requirements for the ASA virtual in the ASA Virtual Getting Started Guide.
Hardware and Software Requirements
All nodes in a cluster:
-
Must be the same performance tier. We recommend using the same number of CPUs and memory for all nodes, or else peformance will be limited on all nodes to match the least capable node.
-
Must run the identical software except at the time of an image upgrade. Hitless upgrade is supported. Mismatched software versions can lead to poor performance, so be sure to upgrade all nodes in the same maintenance window.
-
Single Availability Zone deployment supported.
-
Cluster control link interfaces must be in the same subnet, so the cluster should be deployed in the same subnet.
MTU
Make sure the ports connected to the cluster control link have the correct (higher) MTU configured. If there is an MTU mismatch, the cluster formation will fail. The cluster control link MTU should be 154 bytes higher than the data interfaces. Because the cluster control link traffic includes data packet forwarding, the cluster control link needs to accommodate the entire size of a data packet plus cluster traffic overhead (100 bytes) plus VXLAN overhead (54 bytes).
For AWS with GWLB, the data interface uses Geneve encapsulation. In this case, the entire Ethernet datagram is being encapsulated, so the new packet is larger and requires a larger MTU. You should set the source interface MTU to be the network MTU + 306 bytes. So for the standard 1500 MTU network path, the source interface MTU should be 1806, and the cluster control link MTU should be +154, 1960.
For Azure with GWLB, the data interface uses VXLAN encapsulation. In this case, the entire Ethernet datagram is being encapsulated, so the new packet is larger and requires a larger MTU. You should set the source interface MTU to be the network MTU + 54 bytes.
The following table shows the suggested cluster control link MTU and data interface MTU.
Note |
We do not recommend setting the cluster control link MTU between 2561 and 8362; due to block pool handling, this MTU size is not optimal for system operation. |
Public Cloud |
Cluster Control Link MTU |
Data Interface MTU |
---|---|---|
AWS with GWLB |
1960 |
1806 |
AWS |
1654 |
1500 |
Azure with GWLB |
1554 |
1454 |
Azure |
1554 |
1400 |
Guidelines for ASA Virtual Clustering
High Availability
High Availability is not supported with clustering.
IPv6
The cluster control link is only supported using IPv4.
Additional Guidelines
-
When significant topology changes occur (such as adding or removing an EtherChannel interface, enabling or disabling an interface on the ASA virtual or the switch, adding an additional switch to form a redundant switch system) you should disable the health check feature and also disable interface monitoring for the disabled interfaces. When the topology change is complete, and the configuration change is synced to all units, you can re-enable the interface health check feature.
-
When adding a node to an existing cluster, or when reloading a node, there will be a temporary, limited packet/connection drop; this is expected behavior. In some cases, the dropped packets can hang your connection; for example, dropping a FIN/ACK packet for an FTP connection will make the FTP client hang. In this case, you need to reestablish the FTP connection.
-
For decrypted TLS/SSL connections, the decryption states are not synchronized, and if the connection owner fails, then decrypted connections will be reset. New connections will need to be established to a new node. Connections that are not decrypted (they match a do-not-decrypt rule) are not affected and are replicated correctly.
-
Dynamic scaling is not supported.
-
Perform a global deployment after the completion of each maintenance window.
-
Ensure that you do not remove more than one device at a time from the auto scale group (AWS) or scale set (Azure). We also recommend that you run the cluster disable command on the device before removing the device from the auto scale group (AWS) or scale set (Azure).
-
If you want to disable data nodes and the control node in a cluster, we recommend that you disable the data nodes before disabling the control node. If a control node is disabled while there are other data nodes in the cluster, one of the data nodes has to be promoted to be the control node. Note that the role change could disturb the cluster.
-
In the day 0 configuration scripts given in this guide, you can change the IP addresses as per your requirement, provide custom interface names, and change the sequence of the CCL-Link interface.
Defaults for Clustering
-
The cLACP system ID is auto-generated, and the system priority is 1 by default.
-
The cluster health check feature is enabled by default with the holdtime of 3 seconds. Interface health monitoring is enabled on all interfaces by default.
-
The cluster auto-rejoin feature for a failed cluster control link is unlimited attempts every 5 minutes.
-
The cluster auto-rejoin feature for a failed data interface is 3 attempts every 5 minutes, with the increasing interval set to 2.
-
Connection replication delay of 5 seconds is enabled by default for HTTP traffic.
Deploy the Cluster in AWS
To deploy a cluster in AWS, you can either manually deploy or use CloudFormation templates to deploy a stack. You can use the cluster with AWS Gateway Load Balancer, or with a non-native load-balancer such as the Cisco Cloud Services Router.
AWS Gateway Load Balancer and Geneve Single-Arm Proxy
Note |
This use case is the only currently supported use case for Geneve interfaces. |
The AWS Gateway Load Balancer combines a transparent network gateway and a load balancer that distributes traffic and scales virtual appliances on demand. The ASA Virtual supports the Gateway Load Balancer centralized control plane with a distributed data plane (Gateway Load Balancer endpoint). The following figure shows traffic forwarded to the Gateway Load Balancer from the Gateway Load Balancer endpoint. The Gateway Load Balancer balances traffic among multiple ASA Virtuals, which inspect the traffic before either dropping it or sending it back to the Gateway Load Balancer (U-turn traffic). The Gateway Load Balancer then sends the traffic back to the Gateway Load Balancer endpoint and to the destination.
Sample Topologies
ASA Virtual Clustering with Autoscale in Single and Multiple Availability Zones of an AWS Region
An availability zone is a standalone data center or a set of independent data centers within an AWS region that operate independently. Each zone has its own networking infrastructure, connectivity, and power source ensuring a failure in one zone does not affect others. To improve redundancy and reliability, companies use multiple availability zones in their disaster recovery plans.
Deploying ASA Virtual across multiple availability zones and configuring clustering with dynamic scaling can significantly enhance the availability and scalability of your infrastructure. In addition, utilizing multiple availability zones in the same region can offer extra redundancy and guarantee high availability in the event of a failure.
You can modify the IP allocation mechanism of Cluster Control Link (CCL) to support both single and multiavailability zone deployments of ASA Virtual clusters on AWS. The topologies given below depict both inbound and outbound traffic flow in a single and multiple availability zones with autoscaling ability.
ASA Virtual Clustering with Autoscale in Single Availability Zone
There are two ASA Virtual instances in the cluster that are connected to a GWLB.
Inbound traffic from the internet goes to the GWLB endpoint, which is then transmits the traffic to the GWLB. Traffic is then forwarded to the ASA Virtual cluster. After the traffic is inspected by an ASA Virtual instance in the cluster, it is forwarded to the application VM, App1.
Outbound traffic from App1 is transmitted to the GWLB endpoint > GWLB > ASAv > GWLB > GWLB Endpoint, which then sends it out to the internet.
ASA Virtual Clustering with Autoscale in Multiple Availability Zones with Autoscale Solution
There are two ASA Virtual instances in the cluster in different availability zones that are connected to a GWLB.
Inbound traffic from the internet goes to the GWLB endpoint, which then transmits the traffic to the GWLB. Based on the availability zone, the traffic is then routed to the ASA Virtual cluster. After the traffic is inspected by an ASA Virtual instance in the cluster, it is forwarded to the application VM, App1.
End-to-End Process for Deploying ASA Virtual Cluster on AWS
Template-based Deployment
The following flowchart illustrates the workflow for template-based deployment of the ASA Virtual cluster on AWS.
Manual Deployment
The following flowchart illustrates the workflow for manual deployment of the ASA Virtual cluster on AWS.
Workspace |
Steps |
|
---|---|---|
Local Host | Create day 0 configuration script. | |
AWS Console | Deploy ASA Virtual instance. | |
AWS Console | Attach interfaces to instance. | |
AWS Console | Verify if nodes have joined cluster. | |
AWS Console | Create target group and GWLB; attach target group to the GWLB. | |
AWS Console | Register instances with the target group using data interface IP. |
Templates
The templates given below are available in GitHub. The parameter values are self-explanatory with the parameter names, default values, allowed values, and description, given in the template.
-
infrastructure.yaml – Template for infrastructure deployment.
-
deploy_asav_clustering.yaml – Template for cluster deployment.
Note |
Ensure that you check the list of supported AWS instance types before deploying cluster nodes. This list is found in the deploy_asav_clustering.yaml template, under allowed values for the parameter InstanceType. |
Configure Target Failover for ASA Virtual Clustering with GWLB in AWS
ASA virtual clustering in AWS utilizes the Gateway Load Balancer (GWLB) to balance and forward network packets for inspection to a designated ASA virtual node. The GWLB is designed to continue sending network packets to the target node in the event of a failover or deregistration of that node.
The Target Failover feature in AWS enables GWLB to redirect network packets to a healthy target node in the event of node deregistration during planned maintenance or a target node failure. It takes advantage of the cluster's stateful failover.
Note |
If a target node fails while the GWLB routes traffic using SSH, SCP, CURL, or other protocols, then there may be a delay in redirecting traffic to a healthy target. This delay is caused due to rebalancing and rerouting of traffic flow. |
In AWS, you can configure Target Failover through the AWS ELB API or AWS console.
-
AWS API - In the AWS Elastic Load Balancing (ELB) API - modify-target-group-attributes you can define the flow handling behavior by modifying the following two new parameters.
-
target_failover.on_unhealthy - It defines how the GWLB handles the network flow when the target becomes unhealthy.
-
target_failover.on_deregistration - It defines how the GWLB handles the network flow when the target is deregistered.
The following command shows the sample API parameter configuration of defining these two parameters.
aws elbv2 modify-target-group-attributes \ --target-group-arn arn:aws:elasticloadbalancing:…/my-targets/73e2d6bc24d8a067 \ --attributes \ Key=target_failover.on_unhealthy, Value=rebalance[no_rebalance] \ Key=target_failover.on_deregistration, Value=rebalance[no_rebalance]
For more information, refer TargetGroupAttribute in AWS documentation.
-
-
AWS Console – In the EC2 console, you can enable the Target Failover option on the Target Group page by configuring the following options.
-
Edit Target Groups Attributes
-
Enable Target Failover
-
Verify Rebalance Flows
For more information about how to enable Target Failover, see Enable Target Failover Feature.
-
Deploy Stack in AWS Using a CloudFormation Template
Deploy the stack in AWS using the customized CloudFormation template.
Before you begin
-
You need a Linux computer with Python 3.
Procedure
Step 1 |
Prepare the template. |
Step 2 |
Deploy infrastructure.yaml and note the output values for the cluster deployment. Before deploying the infrastructure stack, it is important to identify the AWS region and the availability zones that will be used. Each AWS region has a different set of availability zones and VPC infrastructure, therefore it is essential to select the correct region and availability zones for your deployment. |
Step 3 |
Upload cluster_layer.zip, configure_asav_cluster.zip, and lifecycle_asav_cluster.zip to the S3 bucket created by infrastructure.yaml. |
Step 4 |
Deploy deploy_asav_clustering.yaml: The status changes from CREATE_IN_PROGRESS to CREATE COMPLETE indicating successful deployment. |
Step 5 |
Verify the cluster deployment by logging in to any one of the nodes and entering the show cluster info command:
|
Autoscale Parameter Configuration
After the deployment is completed, you must specify Minimum, Maximum, and Desired capacity of the ASAv Autoscale group. You must verify the Autoscale functionality.
Procedure
Step 1 |
From the AWS console, choose . |
Step 2 |
Configure Desired capacity, and then set the Scaling limits capacity. |
Step 3 |
Check if the CPU metric data is available and whether scaling is occurring as expected in AWS Cloudwatch alarms. |
Configure IMDSv2 Required Mode in ASA Virtual Clustering by Updating Stack
You can configure the IMDSv2 Required mode for the ASA Virtual Auto Scale group instances that are already deployed on the AWS.
Before you begin
The IMDSv2 Required mode is only supported in ASA Virtual Version 9.20.3 and later. Ensure that your existing instances' version is compatible (upgraded to Version 9.20.3 and later) with IMDSv2 APIs before configuring the IMDSv2 Required mode for your deployment.
Procedure
Step 1 |
From the AWS console, go to CloudFormation and click Stacks. |
Step 2 |
Select the stack of the intially deployed clustering instances. |
Step 3 |
Click Update. |
Step 4 |
In the Update stack page, click Replace existing template. |
Step 5 |
Under Specify template section, click Upload a template file. |
Step 6 |
Choose and upload the template that supports IMDSv2. |
Step 7 |
Provide values for the input parameters in the template. |
Step 8 |
Update the stack. |
Deploy the Cluster in AWS Manually
To deploy the cluster manually, prepare the Day-0 configuration, and deploy each node.
Create Day-0 Configuration for AWS
Provide the bootstrap configuration for each cluster node using the following commands:
Gateway Load Balancer Example
The following running configuration example creates a configuration for a Gateway Load Balancer with one Geneve interface for U-turn traffic and one VXLAN interface for the cluster control link.
cluster interface-mode individual force
policy-map global_policy
class inspection_default
no inspect h323 h225
no inspect h323 ras
no inspect rtsp
no inspect skinny
int m0/0
management-only
nameif management
security-level 100
ip address dhcp setroute
no shut
interface TenGigabitEthernet0/0
nameif geneve-vtep-ifc
security-level 0
ip address dhcp
no shutdown
interface TenGigabitEthernet0/1
nve-only cluster
nameif ccl_link
security-level 0
ip address dhcp
no shutdown
interface vni1
description Clustering Interface
segment-id 1
vtep-nve 1
interface vni2
proxy single-arm
nameif ge
security-level 0
vtep-nve 2
object network ccl_link
range 10.1.90.4 10.1.90.254 //Mandatory user input, use same range on all nodes
object-group network cluster_group
network-object object ccl_link
nve 2
encapsulation geneve
source-interface geneve-vtep-ifc
nve 1
encapsulation vxlan
source-interface ccl_link
peer-group cluster_group
cluster group asav-cluster // Mandatory user input, use same cluster name on all nodes
local-unit 1 //Value in bold here must be unique to each node
cluster-interface vni1 ip 1.1.1.1 255.255.255.0 //Value in bold here must be unique to each node
priority 1
enable noconfirm
mtu geneve-vtep-ifc 1806
mtu ccl_link 1960
aaa authentication listener http geneve-vtep-ifc port 7575 //Use same port number on all nodes
jumbo-frame reservation
wr mem
Note |
For the AWS health check settings, be sure to specify the aaa authentication listener http port you set here. |
Non-Native Load Balancer Example
The following example creates a configuration for use with non-native load balancers with management, inside, and outside interfaces, and a VXLAN interface for the cluster control link.
cluster interface-mode individual force
interface Management0/0
management-only
nameif management
ip address dhcp
interface GigabitEthernet0/0
no shutdown
nameif outside
ip address dhcp
interface GigabitEthernet0/1
no shutdown
nameif inside
ip address dhcp
interface GigabitEthernet0/2
nve-only cluster
nameif ccl_link
ip address dhcp
no shutdown
interface vni1
description Clustering Interface
segment-id 1
vtep-nve 1
jumbo-frame reservation
mtu ccl_link 1654
object network ccl_link
range 10.1.90.4 10.1.90.254 //mandatory user input
object-group network cluster_group
network-object object ccl_link
nve 1
encapsulation vxlan
source-interface ccl_link
peer-group cluster_group
cluster group asav-cluster //mandatory user input
local-unit 1 //mandatory user input
cluster-interface vni1 ip 10.1.1.1 255.255.255.0 //mandatory user input
priority 1
enable
Note |
If you are copying and pasting the configuration given above, ensure that you remove //mandatory user input from the configuration. |
Deploy Cluster Nodes
Deploy the cluster nodes to form a cluster.
Procedure
Step 1 |
Deploy the ASA Virtual instance by using the cluster day 0 configuration with the required number of interfaces - three interfaces if you are using Gateway Load Balancer (GWLB), or four interfaces if you are using non-native load balancer. To do this, in the Configure Instance Details > Advanced Details section, paste the cluster day 0 configuration.
For more information on deploying ASA Virtual on AWS, see Deploy the ASA Virtual on AWS. |
||
Step 2 |
Repeat Step 1 to deploy the required number of additional nodes. |
||
Step 3 |
Use the show cluster info command on the ASA Virtual console to verify if all nodes have successfully joined the cluster. |
||
Step 4 |
Configure the AWS Gateway Load Balancer. |
Enable Target Failover for ASA virtual in AWS
The data interface of ASA virtual is registered to a target group of GWLB in AWS. In the ASA virtual clustering, each instance is associated with a Target Group. The GWLB load balances and sends the traffic to this healthy instance identified or registered as a target node in the target group.
Before you begin
You must have deployed the ASA virtual stack in AWS either by manual method or using CloudFormation templates.
If you are deploying a cluster using a CloudFormation template, you can also enable the Target Failover parameter by assigning the rebalance attribute that is available under the GWLB Configuration section of the cluster deployment file deploy_asav_clustering.yaml. In the template, by default, the value is set to rebalance for this parameter. However, the default value for this parameter is set to no_rebalance on the AWS console.
Where,
-
no_rebalance - GWLB continues to send the network flow to the failed or deregistered target.
-
rebalance - GWLB sends the network flow to another healthy target when the existing target is failed or deregistered.
For information on deploying a stack in AWS, see:
Procedure
Step 1 |
On the AWS console, go to Services > EC2 |
Step 2 |
Click Target Groups to view the target groups page. |
Step 3 |
Choose and open the target group to which the ASA virtual instances IP addresses are registered. The target group details page is displayed. |
Step 4 |
Go to the Attributes menu. |
Step 5 |
Click Edit to edit the attributes. |
Step 6 |
Toggle the Rebalance flows slider button to the right. This enables the target failover to configure GWLB to rebalance and forward the existing network packets to a healthy target node in the event of target failover or deregisteration. |
Deploy the Cluster in Azure
In an Azure service chain, ASA virtual acts as a transparent gateway that can intercept packets between the internet and customer service. The clustering of ASA virtual instances on Azure helps to scale up the throughput of multi-node ASAv’s by abstracting them as a single device.
The ASAv consists of two logical interfaces - an External Interface facing the Internet and an Internal Interface facing customer service. These interfaces are defined on a single Network Interface Card (NIC) of the ASAv by utilizing VXLAN segments in a paired proxy.
About Azure Gateway Load Balancer
Azure Gateway Load Balancer (GWLB), help you balance and manage inbound and outbound traffic by routing through the VXLAN segments to the ASAv for traffic inspection. In an ASAv cluster environment, the Azure GWLB automatically scale up the throughput level of the ASAv nodes depending on the traffic load. The GWLB can ensure symmetrical flows or a consistent route to the network virtual appliance without having to update routes manually. With this capability, the packets can traverse the same network path in both directions.
The following figure shows traffic forwarded to the Azure GWLB from the Public Gateway Load Balancer on the external VXLAN segment. The Gateway Load Balancer primarily balances traffic across among multiple ASAv, which inspects the traffic before either dropping it or sending it back to the GWLB on the internal VXLAN segment. The Azure GWLB then sends the traffic back to the Public Gateway Load Balancer and the destination.
The following figure illustrated the network flow between GWLB and ASAv in Azure.
About Cluster Deployment in Azure
You can use the customized Azure Resource Manager (ARM) template to deploy the Virtual Machine Scale Set for Azure GWLB .
After the cluster deployment, you can configure each node on the cluster either manually by using the day0 configuration or through the Function app on the Azure portal.
Deploy the Cluster Using an Azure Resource Manager Template
Deploy the cluster nodes (virtual machine scale set) so they form a cluster using Azure Resource Manager (ARM) template.
Before you begin
-
To manually create the Azure cluster, you must prepare the configuration text file with the day0 configuration. See Prepare Configuration File for Cluster deployment on Azure.
Procedure
Step 1 |
Prepare the template.
|
Step 2 |
Log into the Azure portal: https://portal.azure.com. |
Step 3 |
Create a Resource group. |
Step 4 |
Create a virtual network with three subnets: Management, Outside, and Cluster Control Link (CCL) for your ASAv cluster.
|
Step 5 |
Add the subnets. |
Step 6 |
Deploy the Custom template.
|
Step 7 |
Configure the Instance details. |
Step 8 |
Enter the required values and then click Review + create. |
Step 9 |
Click Create after the validation is passed. |
Step 10 |
After the instance runs, verify the cluster deployment by logging into any of the nodes and entering the show cluster info command. |
What to do next
Configure the Cluster in Azure
To configure cluster on ASAv nodes in Azure, you can either manually configure using a configuration file or using the Azure Function App. You can use the cluster with native GWLB .
Prepare the Configuration File for Creating Cluster on Azure
You can manually configure a cluster on ASA virtual nodes using the configuration file or the Function App on the Azure portal.
For manual configuration of the cluster on an ASA virtual node, you must have configured the asav-gwlb-cluster-config.txt . In this file, you must define the parameters such as range objects, day0, cluster group name, licensing type and so on that is configured in the ASA virtual node of cluster.
This section explains about creating a cluster configuration file for configuring ASA virtual nodes in Azure with GWLB .
Procedure
Step 1 |
Download the asav-gwlb-cluster-config.txt from the Cisco GitHub repository directory asav-cluster/sample-config-file. |
||
Step 2 |
You can prepare the day0 configuration for cluster creation. The following sample day0 configuration helps you understand the parameters required for cluster creation in Azure with GWLB.
The following is the the sample configuration required for the vni interface.
|
||
Step 3 |
Upload the configuration file to the Azure storage and note the path (URL) of this location. This URL path is required for the manual configuration of the cluster on ASA virtual nodes. |
Configure Cluster using Configuration File Manually
To configure cluster on ASAv nodes in Azure manually using a configuration file.
Before you begin
You must have prepared the configuration file and noted the Azure storage location where it is uploaded. See Prepare Cluster Configuration File for Azure.
Procedure
Step 1 |
Log in to the Azure portal. |
Step 2 |
Open an ASAv instance deployed on Azure. |
Step 3 |
Run the following command to copy the cluster configuration file to the ASAv node by providing the URL of the file that you have uploaded to the Azure storage container. copy <Config File URL> running-config |
Step 4 |
Run the following command to configure the cluster on the ASAv instances
|
Step 5 |
Repeat steps 2 through 4 to configure the cluster on all the ASAv nodes. |
Configure Cluster using Azure Function App
To configure cluster on ASAv nodes in Azure using Azure Function App service.
Procedure
Step 1 |
Log in to the Azure portal. |
Step 2 |
Click the Function App. |
Step 3 |
Create FTPS Credentials by clicking Save. |
Step 4 |
Upload the Cluster_Function.zip file to the function app by executing the following command in the local terminal. curl -X POST -u <Userscope_Username> --data-binary @"Cluster_Function.zip" https://<Function_App_Name>.scm.azurewebsites.net/api/zipdeploy
The function will be uploaded to the Function app. The function will start, and you can see the logs in the storage account’s outqueue. The cluster will be enabled on all the ASAv nodes after the function execution.
|
Troubleshooting ASA Virtual Cluster in Azure
Traffic Issues
If the traffic is not working, then verify the following:
-
Verify the health probe status of the ASA virtual instances with the Loadbalancer is healthy.
If the ASA virtual instance's health probe status is unhealthy, then perform the following:
-
Verify the Static route configured in ASA virtual.
-
Verify default gateway is data subnet's gateway IP.
-
Ensure that the ASA virtual instance receives the health probe traffic.
-
Verify the Access policy configured in the ASA virtual is allowing the health probe traffic.
-
Cluster Issues
If the Cluster is not formed, then verify the following:
-
IP address of the Network Virtualization Endpoint (NVE-only) cluster interface. Ensure that you can ping the NVE-only cluster interface of other nodes.
-
IP address of the NVE-only cluster interfaces are part of the object group. Ensure the NVE is configure with the object group.
-
The cluster interface in the cluster group has the correct VNI interface. This VNI interface has the NVE of the corresponding object group.
-
Each node has its own IP interface, verify that the nodes should be able to ping each other to ensure connectivity between the nodes in a cluster.
-
Verify the CCL subnet's Start and End Addresses mentioned during the template deployment is correct. The starting address must begin with the first available IP address in the subnet. For example, if the subnet is 192.168.1.0/24. The start address should be 192.168.1.4 (The first three IP addresses are reserved by azure)
Role Related Issues
If there is any role-related error while deploying resources again in the same resource group, then perform the following:
When there is any issue related a specific roles, an error message is displayed.
The following is a sample error message.
"error": {
"code": "RoleAssignmentUpdateNotPermitted",
"message": "Tenant ID, application ID, principal ID, and scope are not allowed to be updated.”}
Remove the following roles by executing the following commands from the terminal.
-
Command to remove Storage Queue Data Contributor role:
az role assignment delete --resource-group <Resource Group Name> --role "Storage Queue Data Contributor"
-
Command to remove Contributor role:
az role assignment delete --resource-group <Resource Group Name> --role "Contributor"
Customize the Clustering Operation
You can customize clustering health monitoring, TCP connection replication delay, flow mobility and other optimizations, either as part of the Day 0 configuration or after you deploy the cluster.
Perform these procedures on the control node.
Configure Basic ASA Cluster Parameters
You can customize cluster settings on the control node.
Procedure
Step 1 |
Enter cluster configuration mode: cluster group name |
Step 2 |
(Optional) Enable console replication from data nodes to the control node: console-replicate This feature is disabled by default. The ASA prints out some messages directly to the console for certain critical events. If you enable console replication, data nodes send the console messages to the control node so that you only need to monitor one console port for the cluster. |
Step 3 |
Set the minimum trace level for clustering events: trace-level level Set the minimum level as desired:
|
Step 4 |
Set the keepalive interval for flow state refresh messages (clu_keepalive and clu_update messages) from the flow owner to the director and backup owner. clu-keepalive-interval seconds
You may want to set the interval to be longer than the default to reduce the amount of traffic on the cluster control link. |
Configure Health Monitoring and Auto-Rejoin Settings
This procedure configures node and interface health monitoring.
You might want to disable health monitoring of non-essential interfaces, for example, the management interface. Health monitoring is not performed on VLAN subinterfaces. You cannot configure monitoring for the cluster control link; it is always monitored.
Procedure
Step 1 |
Enter cluster configuration mode. cluster group name Example:
|
Step 2 |
Customize the cluster node health check feature. health-check [holdtime timeout] To determine node health, the ASA cluster nodes send heartbeat messages on the cluster control link to other nodes. If a node does not receive any heartbeat messages from a peer node within the holdtime period, the peer node is considered unresponsive or dead.
When any topology changes occur (such as adding or removing a data interface, enabling or disabling an interface on the ASA or the switch) you should disable the health check feature and also disable interface monitoring for the disabled interfaces (no health-check monitor-interface ). When the topology change is complete, and the configuration change is synced to all nodes, you can re-enable the health check feature. Example:
|
Step 3 |
Disable the interface health check on an interface. no health-check monitor-interface interface_id The interface health check monitors for link failures. The amount of time before the ASA removes a member from the cluster depends on whether the node is an established member or is joining the cluster. Health check is enabled by default for all interfaces. You can disable it per interface using the no form of this command. You might want to disable health monitoring of non-essential interfaces, for example, the management interface.
When any topology changes occur (such as adding or removing a data interface, enabling or disabling an interface on the ASA or the switch) you should disable the health check feature (no health-check ) and also disable interface monitoring for the disabled interfaces. When the topology change is complete, and the configuration change is synced to all nodes, you can re-enable the health check feature. Example:
|
Step 4 |
Customize the auto-rejoin cluster settings after a health check failure. health-check {data-interface | cluster-interface | system} auto-rejoin [unlimited | auto_rejoin_max] auto_rejoin_interval auto_rejoin_interval_variation
Example:
|
Step 5 |
Configure the debounce time before the ASA considers an interface to be failed and the node is removed from the cluster. health-check monitor-interface debounce-time ms Example:
Set the debounce time between 300 and 9000 ms. The default is 500 ms. Lower values allow for faster detection of interface failures. Note that configuring a lower debounce time increases the chances of false-positives. When an interface status update occurs, the ASA waits the number of milliseconds specified before marking the interface as failed and the node is removed from the cluster. |
Step 6 |
(Optional) Configure traffic load monitoring. load-monitor [frequency seconds] [intervals intervals]
You can monitor the traffic load for cluster members, including total connection count, CPU and memory usage, and buffer drops. If the load is too high, you can choose to manually disable clustering on the node if the remaining nodes can handle the load, or adjust the load balancing on the external switch. This feature is enabled by default. You can periodically monitor the traffic load. If the load is too high, you can choose to manually disable clustering on the node. Use the show cluster info load-monitor command to view the traffic load. Example:
|
Example
The following example configures the health-check holdtime to .3 seconds; disables monitoring on the Management 0/0 interface; sets the auto-rejoin for data interfaces to 4 attempts starting at 2 minutes, increasing the duration by 3 x the previous interval; and sets the auto-rejoin for the cluster control link to 6 attempts every 2 minutes.
ciscoasa(config)# cluster group test
ciscoasa(cfg-cluster)# health-check holdtime .3
ciscoasa(cfg-cluster)# no health-check monitor-interface management0/0
ciscoasa(cfg-cluster)# health-check data-interface auto-rejoin 4 2 3
ciscoasa(cfg-cluster)# health-check cluster-interface auto-rejoin 6 2 1
Manage Cluster Nodes
After you deploy the cluster, you can change the configuration and manage cluster nodes.
Become an Inactive Node
To become an inactive member of the cluster, disable clustering on the node while leaving the clustering configuration intact.
Note |
When an ASA becomes inactive (either manually or through a health check failure), all data interfaces are shut down; only the management-only interface can send and receive traffic. To resume traffic flow, re-enable clustering; or you can remove the node altogether from the cluster. The management interface remains up using the IP address the node received from the cluster IP pool. However if you reload, and the node is still inactive in the cluster (for example, you saved the configuration with clustering disabled), then the management interface is disabled. You must use the console port for any further configuration. |
Procedure
Step 1 |
Enter cluster configuration mode: cluster group name Example:
|
Step 2 |
Disable clustering: no enable If this node was the control node, a new control election takes place, and a different member becomes the control node. The cluster configuration is maintained, so that you can enable clustering again later. |
Deactivate a Data Node from the Control Node
To deactivate a member other than the node you are logged into, perform the following steps.
Note |
When an ASA becomes inactive, all data interfaces are shut down; only the management-only interface can send and receive traffic. To resume traffic flow, re-enable clustering. The management interface remains up using the IP address the node received from the cluster IP pool. However if you reload, and the node is still inactive in the cluster (for example, if you saved the configuration with clustering disabled), the management interface is disabled. You must use the console port for any further configuration. |
Procedure
Remove the node from the cluster. cluster remove unit node_name The bootstrap configuration remains intact, as well as the last configuration synched from the control node, so that you can later re-add the node without losing your configuration. If you enter this command on a data node to remove the control node, a new control node is elected. To view member names, enter cluster remove unit ?, or enter the show cluster info command. Example:
|
Rejoin the Cluster
If a node was removed from the cluster, for example for a failed interface or if you manually deactivated a member, you must manually rejoin the cluster.
Procedure
Step 1 |
At the console, enter cluster configuration mode: cluster group name Example:
|
Step 2 |
Enable clustering. enable |
Leave the Cluster
If you want to leave the cluster altogether, you need to remove the entire cluster bootstrap configuration. Because the current configuration on each node is the same (synced from the active unit), leaving the cluster also means either restoring a pre-clustering configuration from backup, or clearing your configuration and starting over to avoid IP address conflicts.
Procedure
Step 1 |
For a data node, disable clustering:
Example:
You cannot make configuration changes while clustering is enabled on a data node. |
Step 2 |
Clear the cluster configuration: clear configure cluster The ASA shuts down all interfaces including the management interface and cluster control link. |
Step 3 |
Disable cluster interface mode: no cluster interface-mode The mode is not stored in the configuration and must be reset manually. |
Step 4 |
If you have a backup configuration, copy the backup configuration to the running configuration: copy backup_cfg running-config Example:
|
Step 5 |
Save the configuration to startup: write memory |
Step 6 |
If you do not have a backup configuration, reconfigure management access. Be sure to change the interface IP addresses, and restore the correct hostname, for example. |
Change the Control Node
Caution |
The best method to change the control node is to disable clustering on the control node, wait for a new control election, and then re-enable clustering. If you must specify the exact node you want to become the control node, use the procedure in this section. Note, however, that for centralized features, if you force a control node change using this procedure, then all connections are dropped, and you have to re-establish the connections on the new control node. |
To change the control node, perform the following steps.
Procedure
Set a new node as the control node: cluster control-node unitnode_name Example:
You will need to reconnect to the Main cluster IP address. To view member names, enter cluster control-node unit ? (to see all names except the current node), or enter the show cluster info command. |
Execute a Command Cluster-Wide
To send a command to all nodes in the cluster, or to a specific node, perform the following steps. Sending a show command to all nodes collects all output and displays it on the console of the current node. Other commands, such as capture and copy, can also take advantage of cluster-wide execution.
Procedure
Send a command to all nodes, or if you specify the node name, a specific node: cluster exec [unit node_name] command Example:
To view node names, enter cluster exec unit ? (to see all names except the current node), or enter the show cluster info command. |
Examples
To copy the same capture file from all nodes in the cluster at the same time to a TFTP server, enter the following command on the control node:
ciscoasa# cluster exec copy /pcap capture: tftp://10.1.1.56/capture1.pcap
Multiple PCAP files, one from each node, are copied to the TFTP server. The destination capture file name is automatically attached with the node name, such as capture1_asa1.pcap, capture1_asa2.pcap, and so on. In this example, asa1 and asa2 are cluster node names.
Monitoring the Cluster
You can monitor and troubleshoot cluster status and connections.
Monitoring Cluster Status
See the following commands for monitoring cluster status:
-
show cluster info [health [details]]
With no keywords, the show cluster info command shows the status of all members of the cluster.
The show cluster info health command shows the current health of interfaces, nodes, and the cluster overall. The details keyword shows the number heartbeat message failures.
See the following output for the show cluster info command:
ciscoasa# show cluster info Cluster stbu: On This is "C" in state DATA_NODE ID : 0 Site ID : 1 Version : 9.4(1) Serial No.: P3000000025 CCL IP : 10.0.0.3 CCL MAC : 000b.fcf8.c192 Last join : 17:08:59 UTC Sep 26 2011 Last leave: N/A Other members in the cluster: Unit "D" in state DATA_NODE ID : 1 Site ID : 1 Version : 9.4(1) Serial No.: P3000000001 CCL IP : 10.0.0.4 CCL MAC : 000b.fcf8.c162 Last join : 19:13:11 UTC Sep 23 2011 Last leave: N/A Unit "A" in state CONTROL_NODE ID : 2 Site ID : 2 Version : 9.4(1) Serial No.: JAB0815R0JY CCL IP : 10.0.0.1 CCL MAC : 000f.f775.541e Last join : 19:13:20 UTC Sep 23 2011 Last leave: N/A Unit "B" in state DATA_NODE ID : 3 Site ID : 2 Version : 9.4(1) Serial No.: P3000000191 CCL IP : 10.0.0.2 CCL MAC : 000b.fcf8.c61e Last join : 19:13:50 UTC Sep 23 2011 Last leave: 19:13:36 UTC Sep 23 2011
-
show cluster info auto-join
Shows whether the cluster node will automatically rejoin the cluster after a time delay and if the failure conditions (such as waiting for the license, chassis health check failure, and so on) are cleared. If the node is permanently disabled, or if the node is already in the cluster, then this command will not show any output.
See the following outputs for the show cluster info auto-join command:
ciscoasa(cfg-cluster)# show cluster info auto-join Unit will try to join cluster in 253 seconds. Quit reason: Received control message DISABLE ciscoasa(cfg-cluster)# show cluster info auto-join Unit will try to join cluster when quit reason is cleared. Quit reason: Control node has application down that data node has up. ciscoasa(cfg-cluster)# show cluster info auto-join Unit will try to join cluster when quit reason is cleared. Quit reason: Chassis-blade health check failed. ciscoasa(cfg-cluster)# show cluster info auto-join Unit will try to join cluster when quit reason is cleared. Quit reason: Service chain application became down. ciscoasa(cfg-cluster)# show cluster info auto-join Unit will try to join cluster when quit reason is cleared. Quit reason: Unit is kicked out from cluster because of Application health check failure. ciscoasa(cfg-cluster)# show cluster info auto-join Unit join is pending (waiting for the smart license entitlement: ent1) ciscoasa(cfg-cluster)# show cluster info auto-join Unit join is pending (waiting for the smart license export control flag)
-
show cluster info transport {asp | cp [detail]}
Shows transport related statistics for the following:
-
asp —Data plane transport statistics.
-
cp —Control plane transport statistics.
If you enter the detail keyword, you can view cluster reliable transport protocol usage so you can identify packet drop issues when the buffer is full in the control plane. See the following output for the show cluster info transport cp detail command:
ciscoasa# show cluster info transport cp detail Member ID to name mapping: 0 - unit-1-1 2 - unit-4-1 3 - unit-2-1 Legend: U - unreliable messages UE - unreliable messages error SN - sequence number ESN - expecting sequence number R - reliable messages RE - reliable messages error RDC - reliable message deliveries confirmed RA - reliable ack packets received RFR - reliable fast retransmits RTR - reliable timer-based retransmits RDP - reliable message dropped RDPR - reliable message drops reported RI - reliable message with old sequence number RO - reliable message with out of order sequence number ROW - reliable message with out of window sequence number ROB - out of order reliable messages buffered RAS - reliable ack packets sent This unit as a sender -------------------------- all 0 2 3 U 123301 3867966 3230662 3850381 UE 0 0 0 0 SN 1656a4ce acb26fe 5f839f76 7b680831 R 733840 1042168 852285 867311 RE 0 0 0 0 RDC 699789 934969 740874 756490 RA 385525 281198 204021 205384 RFR 27626 56397 0 0 RTR 34051 107199 111411 110821 RDP 0 0 0 0 RDPR 0 0 0 0 This unit as a receiver of broadcast messages --------------------------------------------- 0 2 3 U 111847 121862 120029 R 7503 665700 749288 ESN 5d75b4b3 6d81d23 365ddd50 RI 630 34278 40291 RO 0 582 850 ROW 0 566 850 ROB 0 16 0 RAS 1571 123289 142256 This unit as a receiver of unicast messages --------------------------------------------- 0 2 3 U 1 3308122 4370233 R 513846 879979 1009492 ESN 4458903a 6d841a84 7b4e7fa7 RI 66024 108924 102114 RO 0 0 0 ROW 0 0 0 ROB 0 0 0 RAS 130258 218924 228303 Gated Tx Buffered Message Statistics ------------------------------------- current sequence number: 0 total: 0 current: 0 high watermark: 0 delivered: 0 deliver failures: 0 buffer full drops: 0 message truncate drops: 0 gate close ref count: 0 num of supported clients:45 MRT Tx of broadcast messages ============================= Message high watermark: 3% Total messages buffered at high watermark: 5677 [Per-client message usage at high watermark] --------------------------------------------------------------- Client name Total messages Percentage Cluster Redirect Client 4153 73% Route Cluster Client 419 7% RRI Cluster Client 1105 19% Current MRT buffer usage: 0% Total messages buffered in real-time: 1 [Per-client message usage in real-time] Legend: F - MRT messages sending when buffer is full L - MRT messages sending when cluster node leave R - MRT messages sending in Rx thread ---------------------------------------------------------------------------- Client name Total messages Percentage F L R VPN Clustering HA Client 1 100% 0 0 0 MRT Tx of unitcast messages(to member_id:0) ============================================ Message high watermark: 31% Total messages buffered at high watermark: 4059 [Per-client message usage at high watermark] --------------------------------------------------------------- Client name Total messages Percentage Cluster Redirect Client 3731 91% RRI Cluster Client 328 8% Current MRT buffer usage: 29% Total messages buffered in real-time: 3924 [Per-client message usage in real-time] Legend: F - MRT messages sending when buffer is full L - MRT messages sending when cluster node leave R - MRT messages sending in Rx thread ---------------------------------------------------------------------------- Client name Total messages Percentage F L R Cluster Redirect Client 3607 91% 0 0 0 RRI Cluster Client 317 8% 0 0 0 MRT Tx of unitcast messages(to member_id:2) ============================================ Message high watermark: 14% Total messages buffered at high watermark: 578 [Per-client message usage at high watermark] --------------------------------------------------------------- Client name Total messages Percentage VPN Clustering HA Client 578 100% Current MRT buffer usage: 0% Total messages buffered in real-time: 0 MRT Tx of unitcast messages(to member_id:3) ============================================ Message high watermark: 12% Total messages buffered at high watermark: 573 [Per-client message usage at high watermark] --------------------------------------------------------------- Client name Total messages Percentage VPN Clustering HA Client 572 99% Cluster VPN Unique ID Client 1 0% Current MRT buffer usage: 0% Total messages buffered in real-time: 0
-
-
show cluster history
Shows the cluster history, as well as error messages about why a cluster node failed to join or why a node left the cluster.
Capturing Packets Cluster-Wide
See the following command for capturing packets in a cluster:
cluster exec capture
To support cluster-wide troubleshooting, you can enable capture of cluster-specific traffic on the control node using the cluster exec capture command, which is then automatically enabled on all of the data nodes in the cluster.
Monitoring Cluster Resources
See the following command for monitoring cluster resources:
show cluster {cpu | memory | resource} [options]
Displays aggregated data for the entire cluster. The options available depends on the data type.
Monitoring Cluster Traffic
See the following commands for monitoring cluster traffic:
-
show conn [detail], cluster exec show conn
The show conn command shows whether a flow is a director, backup, or forwarder flow. Use the cluster exec show conn command on any node to view all connections. This command can show how traffic for a single flow arrives at different ASAs in the cluster. The throughput of the cluster is dependent on the efficiency and configuration of load balancing. This command provides an easy way to view how traffic for a connection is flowing through the cluster, and can help you understand how a load balancer might affect the performance of a flow.
The show conn detail command also shows which flows are subject to flow mobility.
The following is sample output for the show conn detail command:
ciscoasa/ASA2/data node# show conn detail 12 in use, 13 most used Cluster stub connections: 0 in use, 46 most used Flags: A - awaiting inside ACK to SYN, a - awaiting outside ACK to SYN, B - initial SYN from outside, b - TCP state-bypass or nailed, C - CTIQBE media, c - cluster centralized, D - DNS, d - dump, E - outside back connection, e - semi-distributed, F - outside FIN, f - inside FIN, G - group, g - MGCP, H - H.323, h - H.225.0, I - inbound data, i - incomplete, J - GTP, j - GTP data, K - GTP t3-response k - Skinny media, L - LISP triggered flow owner mobility, M - SMTP data, m - SIP media, n - GUP O - outbound data, o - offloaded, P - inside back connection, Q - Diameter, q - SQL*Net data, R - outside acknowledged FIN, R - UDP SUNRPC, r - inside acknowledged FIN, S - awaiting inside SYN, s - awaiting outside SYN, T - SIP, t - SIP transient, U - up, V - VPN orphan, W - WAAS, w - secondary domain backup, X - inspected by service module, x - per session, Y - director stub flow, y - backup stub flow, Z - Scansafe redirection, z - forwarding stub flow ESP outside: 10.1.227.1/53744 NP Identity Ifc: 10.1.226.1/30604, , flags c, idle 0s, uptime 1m21s, timeout 30s, bytes 7544, cluster sent/rcvd bytes 0/0, owners (0,255) Traffic received at interface outside Locally received: 7544 (93 byte/s) Traffic received at interface NP Identity Ifc Locally received: 0 (0 byte/s) UDP outside: 10.1.227.1/500 NP Identity Ifc: 10.1.226.1/500, flags -c, idle 1m22s, uptime 1m22s, timeout 2m0s, bytes 1580, cluster sent/rcvd bytes 0/0, cluster sent/rcvd total bytes 0/0, owners (0,255) Traffic received at interface outside Locally received: 864 (10 byte/s) Traffic received at interface NP Identity Ifc Locally received: 716 (8 byte/s)
To troubleshoot the connection flow, first see connections on all nodes by entering the cluster exec show conn command on any node. Look for flows that have the following flags: director (Y), backup (y), and forwarder (z). The following example shows an SSH connection from 172.18.124.187:22 to 192.168.103.131:44727 on all three ASAs; ASA 1 has the z flag showing it is a forwarder for the connection, ASA3 has the Y flag showing it is the director for the connection, and ASA2 has no special flags showing it is the owner. In the outbound direction, the packets for this connection enter the inside interface on ASA2 and exit the outside interface. In the inbound direction, the packets for this connection enter the outside interface on ASA 1 and ASA3, are forwarded over the cluster control link to ASA2, and then exit the inside interface on ASA2.
ciscoasa/ASA1/control node# cluster exec show conn ASA1(LOCAL):********************************************************** 18 in use, 22 most used Cluster stub connections: 0 in use, 5 most used TCP outside 172.18.124.187:22 inside 192.168.103.131:44727, idle 0:00:00, bytes 37240828, flags z ASA2:***************************************************************** 12 in use, 13 most used Cluster stub connections: 0 in use, 46 most used TCP outside 172.18.124.187:22 inside 192.168.103.131:44727, idle 0:00:00, bytes 37240828, flags UIO ASA3:***************************************************************** 10 in use, 12 most used Cluster stub connections: 2 in use, 29 most used TCP outside 172.18.124.187:22 inside 192.168.103.131:44727, idle 0:00:03, bytes 0, flags Y
-
show cluster info [conn-distribution | packet-distribution | loadbalance | flow-mobility counters]
The show cluster info conn-distribution and show cluster info packet-distribution commands show traffic distribution across all cluster nodes. These commands can help you to evaluate and adjust the external load balancer.
The show cluster info loadbalance command shows connection rebalance statistics.
The show cluster info flow-mobility counters command shows EID movement and flow owner movement information. See the following output for show cluster info flow-mobility counters:
ciscoasa# show cluster info flow-mobility counters EID movement notification received : 4 EID movement notification processed : 4 Flow owner moving requested : 2
-
show cluster info load-monitor [details]
The show cluster info load-monitor command shows the traffic load for cluster members for the last interval and also the average over total number of intervals configured (30 by default). Use the details keyword to view the value for each measure at each interval.
ciscoasa(cfg-cluster)# show cluster info load-monitor ID Unit Name 0 B 1 A_1 Information from all units with 20 second interval: Unit Connections Buffer Drops Memory Used CPU Used Average from last 1 interval: 0 0 0 14 25 1 0 0 16 20 Average from last 30 interval: 0 0 0 12 28 1 0 0 13 27 ciscoasa(cfg-cluster)# show cluster info load-monitor details ID Unit Name 0 B 1 A_1 Information from all units with 20 second interval Connection count captured over 30 intervals: Unit ID 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Unit ID 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Buffer drops captured over 30 intervals: Unit ID 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Unit ID 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Memory usage(%) captured over 30 intervals: Unit ID 0 25 25 30 30 30 35 25 25 35 30 30 30 25 25 30 25 25 35 30 30 30 25 25 25 25 20 30 30 30 30 Unit ID 1 30 25 35 25 30 30 25 25 35 25 30 35 30 30 35 30 30 30 25 20 30 25 25 30 20 30 35 30 30 35 CPU usage(%) captured over 30 intervals: Unit ID 0 25 25 30 30 30 35 25 25 35 30 30 30 25 25 30 25 25 35 30 30 30 25 25 25 25 20 30 30 30 30 Unit ID 1 30 25 35 25 30 30 25 25 35 25 30 35 30 30 35 30 30 30 25 20 30 25 25 30 20 30 35 30 30 35
-
show cluster {access-list | conn | traffic | user-identity | xlate} [options]
Displays aggregated data for the entire cluster. The options available depends on the data type.
See the following output for the show cluster access-list command:
ciscoasa# show cluster access-list hitcnt display order: cluster-wide aggregated result, unit-A, unit-B, unit-C, unit-D access-list cached ACL log flows: total 0, denied 0 (deny-flow-max 4096) alert-interval 300 access-list 101; 122 elements; name hash: 0xe7d586b5 access-list 101 line 1 extended permit tcp 192.168.143.0 255.255.255.0 any eq www (hitcnt=0, 0, 0, 0, 0) 0x207a2b7d access-list 101 line 2 extended permit tcp any 192.168.143.0 255.255.255.0 (hitcnt=0, 0, 0, 0, 0) 0xfe4f4947 access-list 101 line 3 extended permit tcp host 192.168.1.183 host 192.168.43.238 (hitcnt=1, 0, 0, 0, 1) 0x7b521307 access-list 101 line 4 extended permit tcp host 192.168.1.116 host 192.168.43.238 (hitcnt=0, 0, 0, 0, 0) 0x5795c069 access-list 101 line 5 extended permit tcp host 192.168.1.177 host 192.168.43.238 (hitcnt=1, 0, 0, 1, 0) 0x51bde7ee access list 101 line 6 extended permit tcp host 192.168.1.177 host 192.168.43.13 (hitcnt=0, 0, 0, 0, 0) 0x1e68697c access-list 101 line 7 extended permit tcp host 192.168.1.177 host 192.168.43.132 (hitcnt=2, 0, 0, 1, 1) 0xc1ce5c49 access-list 101 line 8 extended permit tcp host 192.168.1.177 host 192.168.43.192 (hitcnt=3, 0, 1, 1, 1) 0xb6f59512 access-list 101 line 9 extended permit tcp host 192.168.1.177 host 192.168.43.44 (hitcnt=0, 0, 0, 0, 0) 0xdc104200 access-list 101 line 10 extended permit tcp host 192.168.1.112 host 192.168.43.44 (hitcnt=429, 109, 107, 109, 104) 0xce4f281d access-list 101 line 11 extended permit tcp host 192.168.1.170 host 192.168.43.238 (hitcnt=3, 1, 0, 0, 2) 0x4143a818 access-list 101 line 12 extended permit tcp host 192.168.1.170 host 192.168.43.169 (hitcnt=2, 0, 1, 0, 1) 0xb18dfea4 access-list 101 line 13 extended permit tcp host 192.168.1.170 host 192.168.43.229 (hitcnt=1, 1, 0, 0, 0) 0x21557d71 access-list 101 line 14 extended permit tcp host 192.168.1.170 host 192.168.43.106 (hitcnt=0, 0, 0, 0, 0) 0x7316e016 access-list 101 line 15 extended permit tcp host 192.168.1.170 host 192.168.43.196 (hitcnt=0, 0, 0, 0, 0) 0x013fd5b8 access-list 101 line 16 extended permit tcp host 192.168.1.170 host 192.168.43.75 (hitcnt=0, 0, 0, 0, 0) 0x2c7dba0d
To display the aggregated count of in-use connections for all nodes, enter:
ciscoasa# show cluster conn count Usage Summary In Cluster:********************************************* 200 in use (cluster-wide aggregated) cl2(LOCAL):*********************************************************** 100 in use, 100 most used cl1:****************************************************************** 100 in use, 100 most used
-
show asp cluster counter
This command is useful for datapath troubleshooting.
Monitoring Cluster Routing
See the following commands for cluster routing:
-
show route cluster
-
debug route cluster
Shows cluster information for routing.
-
show lisp eid
Shows the ASA EID table showing EIDs and site IDs.
See the following output from the cluster exec show lisp eid command.
ciscoasa# cluster exec show lisp eid L1(LOCAL):************************************************************ LISP EID Site ID 33.44.33.105 2 33.44.33.201 2 11.22.11.1 4 11.22.11.2 4 L2:******************************************************************* LISP EID Site ID 33.44.33.105 2 33.44.33.201 2 11.22.11.1 4 11.22.11.2 4
-
show asp table classify domain inspect-lisp
This command is useful for troubleshooting.
Configuring Logging for Clustering
See the following command for configuring logging for clustering:
logging device-id
Each node in the cluster generates syslog messages independently. You can use the logging device-id command to generate syslog messages with identical or different device IDs to make messages appear to come from the same or different nodes in the cluster.
Monitoring Cluster Interfaces
See the following commands for monitoring cluster interfaces:
-
show cluster interface-mode
Shows the cluster interface mode.
Debugging Clustering
See the following commands for debugging clustering:
-
debug cluster [ccp | datapath | fsm | general | hc | license | rpc | transport]
Shows debug messages for clustering.
-
debug cluster flow-mobility
Shows events related to clustering flow mobility.
-
debug lisp eid-notify-intercept
Shows events when the eid-notify message is intercepted.
-
show cluster info trace
The show cluster info trace command shows the debug information for further troubleshooting.
See the following output for the show cluster info trace command:
ciscoasa# show cluster info trace Feb 02 14:19:47.456 [DBUG]Receive CCP message: CCP_MSG_LOAD_BALANCE Feb 02 14:19:47.456 [DBUG]Receive CCP message: CCP_MSG_LOAD_BALANCE Feb 02 14:19:47.456 [DBUG]Send CCP message to all: CCP_MSG_KEEPALIVE from 80-1 at CONTROL_NODE
For example, if you see the following messages showing that two nodes with the same local-unit name are acting as the control node, it could mean that either two nodes have the same local-unit name (check your configuration), or a node is receiving its own broadcast messages (check your network).
ciscoasa# show cluster info trace May 23 07:27:23.113 [CRIT]Received datapath event 'multi control_nodes' with parameter 1. May 23 07:27:23.113 [CRIT]Found both unit-9-1 and unit-9-1 as control_node units. Control_node role retained by unit-9-1, unit-9-1 will leave then join as a Data_node May 23 07:27:23.113 [DBUG]Send event (DISABLE, RESTART | INTERNAL-EVENT, 5000 msecs, Detected another Control_node, leave and re-join as Data_node) to FSM. Current state CONTROL_NODE May 23 07:27:23.113 [INFO]State machine changed from state CONTROL_NODE to DISABLED
Reference for Clustering
This section includes more information about how clustering operates.
ASA Features and Clustering
Some ASA features are not supported with ASA clustering, and some are only supported on the control node. Other features might have caveats for proper usage.
Unsupported Features with Clustering
These features cannot be configured with clustering enabled, and the commands will be rejected.
-
Unified Communication features that rely on TLS Proxy
-
Remote access VPN (SSL VPN and IPsec VPN)
-
Virtual Tunnel Interfaces (VTIs)
-
The following application inspections:
-
CTIQBE
-
H323, H225, and RAS
-
IPsec passthrough
-
MGCP
-
MMP
-
RTSP
-
SCCP (Skinny)
-
WAAS
-
WCCP
-
-
Botnet Traffic Filter
-
Auto Update Server
-
DHCP client, server, and proxy. DHCP relay is supported.
-
VPN load balancing
-
Failover on Azure
-
Integrated Routing and Bridging
-
FIPS mode
Centralized Features for Clustering
The following features are only supported on the control node, and are not scaled for the cluster.
Note |
Traffic for centralized features is forwarded from member nodes to the control node over the cluster control link. If you use the rebalancing feature, traffic for centralized features may be rebalanced to non-control nodes before the traffic is classified as a centralized feature; if this occurs, the traffic is then sent back to the control node. For centralized features, if the control node fails, all connections are dropped, and you have to re-establish the connections on the new control node. |
-
The following application inspections:
-
DCERPC
-
ESMTP
-
IM
-
NetBIOS
-
PPTP
-
RADIUS
-
RSH
-
SNMP
-
SQLNET
-
SUNRPC
-
TFTP
-
XDMCP
-
-
Static route monitoring
-
Authentication and Authorization for network access. Accounting is decentralized.
-
Filtering Services
-
Site-to-site VPN
-
Multicast routing
Features Applied to Individual Nodes
These features are applied to each ASA node, instead of the cluster as a whole or to the control node.
-
QoS—The QoS policy is synced across the cluster as part of configuration replication. However, the policy is enforced on each node independently. For example, if you configure policing on output, then the conform rate and conform burst values are enforced on traffic exiting a particular ASA. In a cluster with 3 nodes and with traffic evenly distributed, the conform rate actually becomes 3 times the rate for the cluster.
-
Threat detection—Threat detection works on each node independently; for example, the top statistics is node-specific. Port scanning detection, for example, does not work because scanning traffic will be load-balanced between all nodes, and one node will not see all traffic.
AAA for Network Access and Clustering
AAA for network access consists of three components: authentication, authorization, and accounting. Authentication and authorization are implemented as centralized features on the clustering control node with replication of the data structures to the cluster data nodes. If a control node is elected, the new control node will have all the information it needs to continue uninterrupted operation of the established authenticated users and their associated authorizations. Idle and absolute timeouts for user authentications are preserved when a control node change occurs.
Accounting is implemented as a distributed feature in a cluster. Accounting is done on a per-flow basis, so the cluster node owning a flow will send accounting start and stop messages to the AAA server when accounting is configured for a flow.
Connection Settings and Clustering
Connection limits are enforced cluster-wide (see the set connection conn-max , set connection embryonic-conn-max , set connection per-client-embryonic-max , and set connection per-client-max commands). Each node has an estimate of the cluster-wide counter values based on broadcast messages. Due to efficiency considerations, the configured connection limit across the cluster might not be enforced exactly at the limit number. Each node may overestimate or underestimate the cluster-wide counter value at any given time. However, the information will get updated over time in a load-balanced cluster.
Dynamic Routing and Clustering
In Individual interface mode, each node runs the routing protocol as a standalone router, and routes are learned by each node independently.
In the above diagram, Router A learns that there are 4 equal-cost paths to Router B, each through a node. ECMP is used to load balance traffic between the 4 paths. Each node picks a different router ID when talking to external routers.
You must configure a cluster pool for the router ID so that each node has a separate router ID.
EIGRP does not form neighbor relationships with cluster peers in individual interface mode.
Note |
If the cluster has multiple adjacencies to the same router for redundancy purposes, asymmetric routing can lead to unacceptable traffic loss. To avoid asymmetric routing, group all of these node interfaces into the same traffic zone. |
FTP and Clustering
-
If FTP data channel and control channel flows are owned by different cluster members, then the data channel owner will periodically send idle timeout updates to the control channel owner and update the idle timeout value. However, if the control flow owner is reloaded, and the control flow is re-hosted, the parent/child flow relationship will not longer be maintained; the control flow idle timeout will not be updated.
-
If you use AAA for FTP access, then the control channel flow is centralized on the control node.
ICMP Inspection and Clustering
The flow of ICMP and ICMP error packets through the cluster varies depending on whether ICMP/ICMP error inspection is enabled. Without ICMP inspection, ICMP is a one-direction flow, and there is no director flow support. With ICMP inspection, the ICMP flow becomes two-directional and is backed up by a director/backup flow. One difference for an inspected ICMP flow is in the director handling of a forwarded packet: the director will forward the ICMP echo reply packet to the flow owner instead of returning the packet to the forwarder.
Multicast Routing and Clustering
In Individual interface mode, units do not act independently with multicast. All data and routing packets are processed and forwarded by the control unit, thus avoiding packet replication.
NAT and Clustering
NAT can affect the overall throughput of the cluster. Inbound and outbound NAT packets can be sent to different ASAs in the cluster, because the load balancing algorithm relies on IP addresses and ports, and NAT causes inbound and outbound packets to have different IP addresses and/or ports. When a packet arrives at the ASA that is not the NAT owner, it is forwarded over the cluster control link to the owner, causing large amounts of traffic on the cluster control link. Note that the receiving node does not create a forwarding flow to the owner, because the NAT owner may not end up creating a connection for the packet depending on the results of security and policy checks.
If you still want to use NAT in clustering, then consider the following guidelines:
-
No Proxy ARP—For Individual interfaces, a proxy ARP reply is never sent for mapped addresses. This prevents the adjacent router from maintaining a peer relationship with an ASA that may no longer be in the cluster. The upstream router needs a static route or PBR with Object Tracking for the mapped addresses that points to the Main cluster IP address. This is not an issue for a Spanned EtherChannel, because there is only one IP address associated with the cluster interface.
-
PAT with Port Block Allocation—See the following guidelines for this feature:
-
Maximum-per-host limit is not a cluster-wide limit, and is enforced on each node individually. Thus, in a 3-node cluster with the maximum-per-host limit configured as 1, if the traffic from a host is load-balanced across all 3 nodes, then it can get allocated 3 blocks with 1 in each node.
-
Port blocks created on the backup node from the backup pools are not accounted for when enforcing the maximum-per-host limit.
-
On-the-fly PAT rule modifications, where the PAT pool is modified with a completely new range of IP addresses, will result in xlate backup creation failures for the xlate backup requests that were still in transit while the new pool became effective. This behavior is not specific to the port block allocation feature, and is a transient PAT pool issue seen only in cluster deployments where the pool is distributed and traffic is load-balanced across the cluster nodes.
-
When operating in a cluster, you cannot simply change the block allocation size. The new size is effective only after you reload each device in the cluster. To avoid having to reload each device, we recommend that you delete all block allocation rules and clear all xlates related to those rules. You can then change the block size and recreate the block allocation rules.
-
-
NAT pool address distribution for dynamic PAT—When you configure a PAT pool, the cluster divides each IP address in the pool into port blocks. By default, each block is 512 ports, but if you configure port block allocation rules, your block setting is used instead. These blocks are distributed evenly among the nodes in the cluster, so that each node has one or more blocks for each IP address in the PAT pool. Thus, you could have as few as one IP address in a PAT pool for a cluster, if that is sufficient for the number of PAT’ed connections you expect. Port blocks cover the 1024-65535 port range, unless you configure the option to include the reserved ports, 1-1023, on the PAT pool NAT rule.
-
Reusing a PAT pool in multiple rules—To use the same PAT pool in multiple rules, you must be careful about the interface selection in the rules. You must either use specific interfaces in all rules, or "any" in all rules. You cannot mix specific interfaces and "any" across the rules, or the system might not be able to match return traffic to the right node in the cluster. Using unique PAT pools per rule is the most reliable option.
-
No round-robin—Round-robin for a PAT pool is not supported with clustering.
-
No extended PAT—Extended PAT is not supported with clustering.
-
Dynamic NAT xlates managed by the control node—The control node maintains and replicates the xlate table to data nodes. When a data node receives a connection that requires dynamic NAT, and the xlate is not in the table, it requests the xlate from the control node. The data node owns the connection.
-
Stale xlates—The xlate idle time on the connection owner does not get updated. Thus, the idle time might exceed the idle timeout. An idle timer value higher than the configured timeout with a refcnt of 0 is an indication of a stale xlate.
-
Per-session PAT feature—Although not exclusive to clustering, the per-session PAT feature improves the scalability of PAT and, for clustering, allows each data node to own PAT connections; by contrast, multi-session PAT connections have to be forwarded to and owned by the control node. By default, all TCP traffic and UDP DNS traffic use a per-session PAT xlate, whereas ICMP and all other UDP traffic uses multi-session. You can configure per-session NAT rules to change these defaults for TCP and UDP, but you cannot configure per-session PAT for ICMP. For traffic that benefits from multi-session PAT, such as H.323, SIP, or Skinny, you can disable per-session PAT for the associated TCP ports (the UDP ports for those H.323 and SIP are already multi-session by default). For more information about per-session PAT, see the firewall configuration guide.
-
No static PAT for the following inspections—
-
FTP
-
PPTP
-
RSH
-
SQLNET
-
TFTP
-
XDMCP
-
SIP
-
-
If you have an extremely large number of NAT rules, over ten thousand, you should enable the transactional commit model using the asp rule-engine transactional-commit nat command in the device CLI. Otherwise, the node might not be able to join the cluster.
SCTP and Clustering
An SCTP association can be created on any node (due to load balancing); its multi-homing connections must reside on the same node.
SIP Inspection and Clustering
A control flow can be created on any node (due to load balancing); its child data flows must reside on the same node.
TLS Proxy configuration is not supported.
SNMP and Clustering
An SNMP agent polls each individual ASA by its Local IP address. You cannot poll consolidated data for the cluster.
You should always use the Local address, and not the Main cluster IP address for SNMP polling. If the SNMP agent polls the Main cluster IP address, if a new control node is elected, the poll to the new control node will fail.
When using SNMPv3 with clustering, if you add a new cluster node after the initial cluster formation, then SNMPv3 users are not replicated to the new node.You must re-add them on the control node to force the users to replicate to the new node, or directly on the data node.
STUN and Clustering
STUN inspection is supported in failover and cluster modes, as pinholes are replicated. However, the transaction ID is not replicated among nodes. In the case where a node fails after receiving a STUN Request and another node received the STUN Response, the STUN Response will be dropped.
Syslog and Clustering
-
Syslog—Each node in the cluster generates its own syslog messages. You can configure logging so that each node uses either the same or a different device ID in the syslog message header field. For example, the hostname configuration is replicated and shared by all nodes in the cluster. If you configure logging to use the hostname as the device ID, syslog messages generated by all nodes look as if they come from a single node. If you configure logging to use the local-node name that is assigned in the cluster bootstrap configuration as the device ID, syslog messages look as if they come from different nodes.
-
NetFlow—Each node in the cluster generates its own NetFlow stream. The NetFlow collector can only treat each ASA as a separate NetFlow exporter.
Cisco Trustsec and Clustering
Only the control node learns security group tag (SGT) information. The control node then populates the SGT to data nodes, and data nodes can make a match decision for SGT based on the security policy.
VPN and Clustering
Site-to-site VPN is a centralized feature; only the control node supports VPN connections.
Note |
Remote access VPN is not supported with clustering. |
VPN functionality is limited to the control node and does not take advantage of the cluster high availability capabilities. If the control node fails, all existing VPN connections are lost, and VPN users will see a disruption in service. When a new control node is elected, you must reestablish the VPN connections.
For connections to an Individual interface when using PBR or ECMP, you must always connect to the Main cluster IP address, not a Local address.
VPN-related keys and certificates are replicated to all nodes.
Performance Scaling Factor
When you combine multiple units into a cluster, you can expect the total cluster performance to be approximately 80% of the maximum combined throughput.
For example, if your model can handle approximately 10 Gbps of traffic when running alone, then for a cluster of 8 units, the maximum combined throughput will be approximately 80% of 80 Gbps (8 units x 10 Gbps): 64 Gbps.
Control Node Election
Nodes of the cluster communicate over the cluster control link to elect a control node as follows:
-
When you enable clustering for a node (or when it first starts up with clustering already enabled), it broadcasts an election request every 3 seconds.
-
Any other nodes with a higher priority respond to the election request; the priority is set between 1 and 100, where 1 is the highest priority.
-
If after 45 seconds, a node does not receive a response from another node with a higher priority, then it becomes the control node.
Note
If multiple nodes tie for the highest priority, the cluster node name and then the serial number is used to determine the control node.
-
If a node later joins the cluster with a higher priority, it does not automatically become the control node; the existing control node always remains as the control node unless it stops responding, at which point a new control node is elected.
-
In a "split brain" scenario when there are temporarily multiple control nodes, then the node with highest priority retains the role while the other nodes return to data node roles.
Note |
You can manually force a node to become the control node. For centralized features, if you force a control node change, then all connections are dropped, and you have to re-establish the connections on the new control node. |
High Availability within the Cluster
Clustering provides high availability by monitoring node and interface health and by replicating connection states between nodes.
Node Health Monitoring
Each node periodically sends a broadcast heartbeat packet over the cluster control link. If the control node does not receive any heartbeat packets or other packets from a data node within the configurable timeout period, then the control node removes the data node from the cluster. If the data nodes do not receive packets from the control node, then a new control node is elected from the remaining nodes.
If nodes cannot reach each other over the cluster control link because of a network failure and not because a node has actually failed, then the cluster may go into a "split brain" scenario where isolated data nodes will elect their own control nodes. For example, if a router fails between two cluster locations, then the original control node at location 1 will remove the location 2 data nodes from the cluster. Meanwhile, the nodes at location 2 will elect their own control node and form their own cluster. Note that asymmetric traffic may fail in this scenario. After the cluster control link is restored, then the control node that has the higher priority will keep the control node’s role.
Interface Monitoring
Each node monitors the link status of all named hardware interfaces in use, and reports status changes to the control node.
When you enable health monitoring, all physical interfaces are monitored by default; you can optionally disable monitoring per interface. Only named interfaces can be monitored.
A node is removed from the cluster if its monitored interfaces fail. The amount of time before the ASA removes a member from the cluster depends on whether the node is an established member or is joining the cluster. The ASA does not monitor interfaces for the first 90 seconds that a node joins the cluster. Interface status changes during this time will not cause the ASA to be removed from the cluster. The node is removed after 500 ms, regardless of the node state.
Status After Failure
If the control node fails, then another member of the cluster with the highest priority (lowest number) becomes the control node.
The ASA automatically tries to rejoin the cluster, depending on the failure event.
Note |
When the ASA becomes inactive and fails to automatically rejoin the cluster, all data interfaces are shut down; only the management-only interface can send and receive traffic. The management interface remains up using the IP address the node received from the cluster IP pool. However if you reload, and the node is still inactive in the cluster, the management interface is disabled. You must use the console port for any further configuration. |
Rejoining the Cluster
After a cluster node is removed from the cluster, how it can rejoin the cluster depends on why it was removed:
-
Failed cluster control link when initially joining—After you resolve the problem with the cluster control link, you must manually rejoin the cluster by re-enabling clustering at the CLI by entering cluster group name , and then enable.
-
Failed cluster control link after joining the cluster—The ASA automatically tries to rejoin every 5 minutes, indefinitely. This behavior is configurable.
-
Failed data interface—The ASA automatically tries to rejoin at 5 minutes, then at 10 minutes, and finally at 20 minutes. If the join is not successful after 20 minutes, then the ASA disables clustering. After you resolve the problem with the data interface, you have to manually enable clustering at the CLI by entering cluster group name , and then enable. This behavior is configurable.
-
Failed node—If the node was removed from the cluster because of a node health check failure, then rejoining the cluster depends on the source of the failure. For example, a temporary power failure means the node will rejoin the cluster when it starts up again as long as the cluster control link is up and clustering is still enabled with the enable command. The ASA attempts to rejoin the cluster every 5 seconds.
-
Internal error—Internal failures include: application sync timeout; inconsistent application statuses; and so on. A node will attempt to rejoin the cluster automatically at the following intervals: 5 minutes, 10 minutes, and then 20 minutes. This behavior is configurable.
Data Path Connection State Replication
Every connection has one owner and at least one backup owner in the cluster. The backup owner does not take over the connection in the event of a failure; instead, it stores TCP/UDP state information, so that the connection can be seamlessly transferred to a new owner in case of a failure. The backup owner is usually also the director.
Some traffic requires state information above the TCP or UDP layer. See the following table for clustering support or lack of support for this kind of traffic.
Traffic |
State Support |
Notes |
---|---|---|
Up time |
Yes |
Keeps track of the system up time. |
ARP Table |
Yes |
— |
MAC address table |
Yes |
— |
User Identity |
Yes |
Includes AAA rules (uauth). |
IPv6 Neighbor database |
Yes |
— |
Dynamic routing |
Yes |
— |
SNMP Engine ID |
No |
— |
Distributed VPN (Site-to-Site) for Firepower 4100/9300 |
Yes |
Backup session becomes the active session, then a new backup session is created. |
How the Cluster Manages Connections
Connections can be load-balanced to multiple nodes of the cluster. Connection roles determine how connections are handled in both normal operation and in a high availability situation.
Connection Roles
See the following roles defined for each connection:
-
Owner—Usually, the node that initially receives the connection. The owner maintains the TCP state and processes packets. A connection has only one owner. If the original owner fails, then when new nodes receive packets from the connection, the director chooses a new owner from those nodes.
-
Backup owner—The node that stores TCP/UDP state information received from the owner, so that the connection can be seamlessly transferred to a new owner in case of a failure. The backup owner does not take over the connection in the event of a failure. If the owner becomes unavailable, then the first node to receive packets from the connection (based on load balancing) contacts the backup owner for the relevant state information so it can become the new owner.
As long as the director (see below) is not the same node as the owner, then the director is also the backup owner. If the owner chooses itself as the director, then a separate backup owner is chosen.
For clustering on the Firepower 9300, which can include up to 3 cluster nodes in one chassis, if the backup owner is on the same chassis as the owner, then an additional backup owner will be chosen from another chassis to protect flows from a chassis failure.
If you enable director localization for inter-site clustering, then there are two backup owner roles: the local backup and the global backup. The owner always chooses a local backup at the same site as itself (based on site ID). The global backup can be at any site, and might even be the same node as the local backup. The owner sends connection state information to both backups.
If you enable site redundancy, and the backup owner is at the same site as the owner, then an additional backup owner will be chosen from another site to protect flows from a site failure. Chassis backup and site backup are independent, so in some cases a flow will have both a chassis backup and a site backup.
-
Director—The node that handles owner lookup requests from forwarders. When the owner receives a new connection, it chooses a director based on a hash of the source/destination IP address and ports (see below for ICMP hash details), and sends a message to the director to register the new connection. If packets arrive at any node other than the owner, the node queries the director about which node is the owner so it can forward the packets. A connection has only one director. If a director fails, the owner chooses a new director.
As long as the director is not the same node as the owner, then the director is also the backup owner (see above). If the owner chooses itself as the director, then a separate backup owner is chosen.
If you enable director localization for inter-site clustering, then there are two director roles: the local director and the global director. The owner always chooses a local director at the same site as itself (based on site ID). The global director can be at any site, and might even be the same node as the local director. If the original owner fails, then the local director chooses a new connection owner at the same site.
ICMP/ICMPv6 hash details:
-
For Echo packets, the source port is the ICMP identifier, and the destination port is 0.
-
For Reply packets, the source port is 0, and the destination port is the ICMP identifier.
-
For other packets, both source and destination ports are 0.
-
-
Forwarder—A node that forwards packets to the owner. If a forwarder receives a packet for a connection it does not own, it queries the director for the owner, and then establishes a flow to the owner for any other packets it receives for this connection. The director can also be a forwarder. If you enable director localization, then the forwarder always queries the local director. The forwarder only queries the global director if the local director does not know the owner, for example, if a cluster member receives packets for a connection that is owned on a different site. Note that if a forwarder receives the SYN-ACK packet, it can derive the owner directly from a SYN cookie in the packet, so it does not need to query the director. (If you disable TCP sequence randomization, the SYN cookie is not used; a query to the director is required.) For short-lived flows such as DNS and ICMP, instead of querying, the forwarder immediately sends the packet to the director, which then sends them to the owner. A connection can have multiple forwarders; the most efficient throughput is achieved by a good load-balancing method where there are no forwarders and all packets of a connection are received by the owner.
Note
We do not recommend disabling TCP sequence randomization when using clustering. There is a small chance that some TCP sessions won't be established, because the SYN/ACK packet might be dropped.
-
Fragment Owner—For fragmented packets, cluster nodes that receive a fragment determine a fragment owner using a hash of the fragment source IP address, destination IP address, and the packet ID. All fragments are then forwarded to the fragment owner over the cluster control link. Fragments may be load-balanced to different cluster nodes, because only the first fragment includes the 5-tuple used in the switch load balance hash. Other fragments do not contain the source and destination ports and may be load-balanced to other cluster nodes. The fragment owner temporarily reassembles the packet so it can determine the director based on a hash of the source/destination IP address and ports. If it is a new connection, the fragment owner will register to be the connection owner. If it is an existing connection, the fragment owner forwards all fragments to the provided connection owner over the cluster control link. The connection owner will then reassemble all fragments.
When a connection uses Port Address Translation (PAT), then the PAT type (per-session or multi-session) influences which member of the cluster becomes the owner of a new connection:
-
Per-session PAT—The owner is the node that receives the initial packet in the connection.
By default, TCP and DNS UDP traffic use per-session PAT.
-
Multi-session PAT—The owner is always the control node. If a multi-session PAT connection is initially received by a data node, then the data node forwards the connection to the control node.
By default, UDP (except for DNS UDP) and ICMP traffic use multi-session PAT, so these connections are always owned by the control node.
You can change the per-session PAT defaults for TCP and UDP so connections for these protocols are handled per-session or multi-session depending on the configuration. For ICMP, you cannot change from the default multi-session PAT. For more information about per-session PAT, see the firewall configuration guide.
New Connection Ownership
When a new connection is directed to a node of the cluster via load balancing, that node owns both directions of the connection. If any connection packets arrive at a different node, they are forwarded to the owner node over the cluster control link. For best performance, proper external load balancing is required for both directions of a flow to arrive at the same node, and for flows to be distributed evenly between nodes. If a reverse flow arrives at a different node, it is redirected back to the original node.
Sample Data Flow for TCP
The following example shows the establishment of a new connection.
-
The SYN packet originates from the client and is delivered to one ASA (based on the load balancing method), which becomes the owner. The owner creates a flow, encodes owner information into a SYN cookie, and forwards the packet to the server.
-
The SYN-ACK packet originates from the server and is delivered to a different ASA (based on the load balancing method). This ASA is the forwarder.
-
Because the forwarder does not own the connection, it decodes owner information from the SYN cookie, creates a forwarding flow to the owner, and forwards the SYN-ACK to the owner.
-
The owner sends a state update to the director, and forwards the SYN-ACK to the client.
-
The director receives the state update from the owner, creates a flow to the owner, and records the TCP state information as well as the owner. The director acts as the backup owner for the connection.
-
Any subsequent packets delivered to the forwarder will be forwarded to the owner.
-
If packets are delivered to any additional nodes, it will query the director for the owner and establish a flow.
-
Any state change for the flow results in a state update from the owner to the director.
Sample Data Flow for ICMP and UDP
The following example shows the establishment of a new connection.
-
The first UDP packet originates from the client and is delivered to one ASA (based on the load balancing method).
-
The node that received the first packet queries the director node that is chosen based on a hash of the source/destination IP address and ports.
-
The director finds no existing flow, creates a director flow and forwards the packet back to the previous node. In other words, the director has elected an owner for this flow.
-
The owner creates the flow, sends a state update to the director, and forwards the packet to the server.
-
The second UDP packet originates from the server and is delivered to the forwarder.
-
The forwarder queries the director for ownership information. For short-lived flows such as DNS, instead of querying, the forwarder immediately sends the packet to the director, which then sends it to the owner.
-
The director replies to the forwarder with ownership information.
-
The forwarder creates a forwarding flow to record owner information and forwards the packet to the owner.
-
The owner forwards the packet to the client.
History for ASA Virtual Clustering in the Public Cloud
Feature |
Version |
Details |
---|---|---|
ASA Virtual on Azure: Clustering with Gateway Load Balancing |
9.20(2) |
We now support the ASA Virtual clustering deployment on Azure using the Azure Resource Manager (ARM) template and then configure the ASAv clusters to use the Gateway Load Balancer (GWLB) for load balancing the network traffic. |
Configure Target Failover for ASA Virtual Clustering with GWLB in AWS | 9.20(2) | The Target Failover feature in AWS enables GWLB to redirect network packets to a healthy target node in the event of node deregistration during planned maintenance or a target node failure. It takes advantage of the cluster's stateful failover. |
Configurable cluster keepalive interval for flow status |
9.20(2) |
The flow owner sends keepalives (clu_keepalive messages) and updates (clu_update messages) to the director and backup owner to refresh the flow state. You can now set the keepalive interval. The default is 15 seconds, and you can set the interval between 15 and 55 seconds. You may want to set the interval to be longer to reduce the amount of traffic on the cluster control link. New/Modified commands: clu-keepalive-interval |
ASA Virtual Amazon Web Services (AWS) clustering |
9.19(1) |
The ASA Virtual supports Individual interface clustering for up to 16 nodes on AWS. You can use clustering with or without the AWS Gateway Load Balancer. |
ASA virtual IMDSv2 support in Amazon Web Services (AWS) clustering |
9.22 |
The ASA Virtual supports IMDSv2 on AWS. You can enable IMDSv2 Required mode by updating the stack. |