Interchassis Session Recovery


 
 
This chapter provides information on configuring interchassis session recovery (ICSR). The product Administration Guides provide examples and procedures for configuring basic services on the system. Before using the procedures in this chapter, select the configuration example that best meets your service model and configure the required elements for that model, as described in the respective product Administration Guide.
 
This chapter discusses the following topics: an overview of ICSR, ICSR operation, and ICSR configuration.
 
Overview
The interchassis session recovery feature provides the highest possible availability for continuous call processing without interrupting subscriber services. This is accomplished through the use of redundant chassis. The chassis are configured as primary and backup, with one active and one standby. Both chassis are connected to the same AAA server. A checkpoint duration timer controls when subscriber data is sent from the active chassis to the standby chassis. If the active chassis handling the call traffic goes out of service, the standby chassis transitions to the active state and continues processing the call traffic without interrupting the subscriber session. The chassis determine which is active through a proprietary TCP-based connection called a redundancy link. This link is used to exchange Hello messages between the primary and backup chassis and must be maintained for proper system operation.
 
Important: ICSR is supported on chassis configured for GGSN or HA services only.
 
Interchassis Communication
Chassis configured to support interchassis session recovery communicate using periodic Hello messages. Each chassis sends these messages to notify its peer of its current state. The Hello message contains information about the chassis, such as its configuration and priority. A dead interval sets the time limit within which a Hello message must be received from the peer chassis. If the standby chassis does not receive a Hello message from the active chassis within the dead interval, the standby chassis transitions to the active state. In situations where the redundancy link goes out of service, a priority scheme is used to determine which chassis processes the session. The following priority scheme is used:
 
 
Checkpoint Messages
Checkpoint messages are sent from the active chassis to the standby chassis at specific intervals. They contain all the information needed to recreate the sessions on the standby chassis if that chassis becomes active. Once a session exceeds the checkpoint duration, checkpoint data is collected on the session. The checkpoint parameter determines how long a session must be active before it is included in the checkpoint message.
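The selection rule described above can be sketched in a few lines of Python. This is only an illustration of the rule "a session is checkpointed once it has been active longer than the checkpoint duration"; the Session class, its field names, and the sample values are hypothetical and do not represent the system's internal data structures.

```python
from dataclasses import dataclass


@dataclass
class Session:
    session_id: str   # hypothetical identifier for a subscriber session
    age_sec: float    # how long the session has been active, in seconds


def sessions_to_checkpoint(sessions, checkpoint_duration_sec):
    """Return the sessions that have exceeded the checkpoint duration."""
    return [s for s in sessions if s.age_sec > checkpoint_duration_sec]


# Illustrative values only: with a 60-second checkpoint duration,
# only session "b" has been active long enough to be checkpointed.
sessions = [Session("a", 10), Session("b", 75), Session("c", 60)]
eligible = sessions_to_checkpoint(sessions, checkpoint_duration_sec=60)
print([s.session_id for s in eligible])  # ['b']
```

Short-lived sessions never reach the standby chassis under this rule, which keeps checkpoint traffic proportional to the number of longer-lived sessions.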
 
AAA Monitor
AAA servers are monitored using the authentication probe mechanism. An AAA server is considered up if the authentication probe receives a valid response. An AAA server is considered down when the max-retries count specified in the configuration of the AAA server has been reached. The service-redundancy protocol initiates a switchover when none of the configured AAA servers responds to an authentication probe. AAA probing is performed only on the active chassis.
Important: A switchover event caused by an AAA monitoring failure is non-revertible. If the newly active chassis fails to monitor the configured AAA servers, it remains the active chassis until either a manual switchover or another non-AAA failure event causes the system to switch over.
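The AAA monitoring rule can be summarized as: a server is down once its failed probes reach max-retries, and a switchover is initiated only when every configured server is down. The following Python sketch illustrates that logic under stated assumptions; the function names, the dictionary of failure counts, and the treatment of an empty server list are hypothetical and not taken from the product.

```python
def server_is_down(failed_probes, max_retries):
    """A server is considered down once the max-retries count has been reached."""
    return failed_probes >= max_retries


def should_switch_over(failed_probes_by_server, max_retries):
    """Return True only when none of the configured AAA servers is up."""
    if not failed_probes_by_server:
        # Assumption: with no AAA servers configured, AAA monitoring
        # never triggers a switchover.
        return False
    return all(
        server_is_down(count, max_retries)
        for count in failed_probes_by_server.values()
    )


# Illustrative values: one server still answers probes, so no switchover.
print(should_switch_over({"aaa-1": 3, "aaa-2": 1}, max_retries=3))  # False
# Every server has reached max-retries, so a switchover is initiated.
print(should_switch_over({"aaa-1": 3, "aaa-2": 3}, max_retries=3))  # True
```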
 
BGP Interaction
The service-redundancy protocol implements non-revertible switchover behavior by adjusting the route modifier value for the advertised loopback and IP pool routes. The initial route modifier value is determined by the configured role of the chassis and is initialized to a value that is higher than a normal operational value. This ensures that, in the event of an SRP link failure and an SRP task failure, the correct chassis is still preferred in the routing domain. The active and standby chassis share the route modifier values they are currently using. When BGP advertises the loopback and IP pool routes, it converts the route modifier into an autonomous system (AS) path prepend count. The active chassis always has a lower route modifier, and thus prepends less to the AS-path attribute. This causes its routes to be preferred in the routing domain. If communication on the redundancy link is lost and both chassis in the redundant pair claim to be active, the previously active chassis is still preferred because it advertises a shorter AS path into the BGP routing domain. The route modifier is incremented as switchover events occur. A threshold will be implemented to determine when the route modifier should be reset to its initial value to avoid rollover.
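To make the route-preference reasoning concrete, the following Python sketch models two chassis that both claim to be active. It assumes, purely for illustration, that the AS-path prepend count equals the route modifier; the text states only that the modifier is converted into a prepend count, not how. The chassis names and modifier values are hypothetical.

```python
def as_path_prepend_count(route_modifier):
    # Assumption: a one-to-one mapping from route modifier to prepend count.
    # The actual conversion used by the system is not described here.
    return route_modifier


def preferred_chassis(route_modifiers):
    """Return the chassis whose advertised routes have the shortest AS path.

    route_modifiers maps a chassis name to its current route modifier.
    With all other BGP attributes equal, the shorter AS path is preferred.
    """
    return min(
        route_modifiers,
        key=lambda name: as_path_prepend_count(route_modifiers[name]),
    )


# Redundancy link lost and both chassis claim to be active: the previously
# active chassis holds the lower modifier and remains preferred.
print(preferred_chassis({"previously-active": 3, "previously-standby": 4}))
```

Because the modifier only ever increases on switchover events, the chassis that most recently held the active role legitimately is the one the routing domain keeps sending traffic to.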
 
Requirements
ICSR configurations require the following:
 
Important: ICSR is a licensed feature. Be sure that each chassis has the appropriate license before using the procedures in this chapter. To do this, log in to both chassis and execute the show license information command. The ICSR feature is listed as Inter-Chassis Session Recovery. If the chassis is not licensed, contact your local sales representative.
The following figure shows an ICSR network.
 
Interchassis Session Recovery Network Diagram
 
ICSR Operation
 
This section provides an operational flow for ICSR. The following figure shows an ICSR process flow.
 
ICSR Flow Diagram
 
Chassis Initialization
When the chassis are simultaneously initialized, they send Hello messages to their configured peer. The peer sends a response, establishes communication between the chassis, and messages are sent that contain configuration information.
Important: If the chassis are GGSNs, the messages include the APN tables.
During initialization, if both chassis are misconfigured in the same mode, both active (primary) or both standby (backup), the chassis with the highest priority (highest number set with SRP priority) becomes active and the other chassis becomes the standby.
If the chassis priorities are the same, the system compares the two SPIO MAC addresses, and the chassis with the higher SPIO MAC address becomes active. For example, if the chassis have MAC addresses of 00-02-43-03-1C-2B and 00-02-43-03-01-3B, the last three octets are compared (the first three octets are the vendor code). In this example, 03-1C-2B and 03-01-3B are compared from left to right. The first octet (03) is the same in both addresses, so the next octets are compared. Since 01 is lower than 1C, the chassis with the SPIO MAC address of 00-02-43-03-1C-2B becomes active and the other chassis becomes the standby.
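The tie-breaking rules above can be expressed as a short Python sketch. It follows the text as written: the higher priority number wins, and on a tie the last three octets of the SPIO MAC address are compared from left to right. The function names, tuple layout, and chassis names are hypothetical; confirm the priority semantics for your release in the Command Line Interface Reference.

```python
def mac_tiebreaker(mac):
    """Return the last three octets of a MAC address as comparable integers.

    The first three octets are the vendor code and are ignored.
    """
    octets = [int(part, 16) for part in mac.split("-")]
    return tuple(octets[3:])


def elect_active(chassis_a, chassis_b):
    """Pick the active chassis when both are configured in the same mode.

    Each chassis is a (name, priority, spio_mac) tuple. Priority is compared
    first; the MAC tiebreaker is used only when priorities are equal.
    """
    def rank(chassis):
        _, priority, mac = chassis
        return (priority, mac_tiebreaker(mac))

    return max(chassis_a, chassis_b, key=rank)[0]


# The example from the text: equal priorities, so 03-1C-2B beats 03-01-3B.
winner = elect_active(
    ("chassis-1", 125, "00-02-43-03-1C-2B"),
    ("chassis-2", 125, "00-02-43-03-01-3B"),
)
print(winner)  # chassis-1
```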
 
Chassis Operation
This section describes how the chassis communicate, maintain subscriber sessions, and perform chassis switchover.
 
Chassis Communication
There is one chassis in the active state and one in the standby state. They both send Hello messages at each hello interval. Subscriber sessions that exceed the checkpoint session duration are included in checkpoint messages that are sent to the standby chassis. The checkpoint message contains subscriber session information so if the active chassis goes out of service, the backup chassis becomes active and is able to continue processing the subscriber sessions. Additional checkpoint messages occur at various intervals where subscriber session information is updated on the standby chassis.
 
Chassis Switchover
If the active chassis goes out of service, the standby chassis continues to send Hello messages. If the standby chassis does not receive a response to the Hello messages within the dead interval, the standby chassis initiates a switchover. During the switchover, the standby chassis begins advertising the srp-activated loopback and pool routes into the routing domain. Once the chassis becomes active, it continues to process existing AAA services and subscriber sessions that had checkpoint information, and it is able to establish new subscriber sessions as well.
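The dead-interval check that triggers a switchover amounts to comparing the time since the last Hello message with the configured dead interval. The Python sketch below illustrates only that comparison; the function and parameter names are hypothetical, and the timestamps are plain numbers in seconds rather than values read from a real clock.

```python
def standby_should_take_over(last_hello_at, now, dead_interval_sec):
    """Return True when no Hello has been received within the dead interval."""
    return (now - last_hello_at) > dead_interval_sec


# Hypothetical timeline in seconds; a real monitor could use time.monotonic().
# 5 seconds of silence with a 10-second dead interval: keep waiting.
print(standby_should_take_over(last_hello_at=100.0, now=105.0, dead_interval_sec=10))  # False
# 11 seconds of silence: the dead interval has expired, initiate switchover.
print(standby_should_take_over(last_hello_at=100.0, now=111.0, dead_interval_sec=10))  # True
```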
When the primary chassis is back in service, it sends Hello messages to the configured peer. The peer sends a response, establishes communication between the chassis, and Hello messages are sent that contain configuration information. The primary chassis receives a Hello message that shows the backup chassis state as active, and the primary chassis becomes standby. Hello messages continue to be sent to each peer, and checkpoint information is now sent from the active chassis to the standby chassis at regular intervals.
When chassis switchover occurs, the session timers are recovered. The MIP HA session is recreated with the full lifetime to avoid potential loss of the session in case a renewal update was lost in the transient checkpoint update process.
 
Configuring Interchassis Session Recovery (ICSR)
 
Important: The ICSR configuration must be the same on the primary and backup chassis. If each chassis has a different SRP configuration, the session recovery feature does not function and sessions cannot be recovered when the active chassis goes out of service.
Important: This section provides the minimum instruction set for configuring ICSR on the system. For more information on commands that configure additional parameters and options, refer to the Command Line Interface Reference.
Procedures described here assume the following:
 
For more configuration information and instructions on configuring services, refer to the respective product Administration Guide.
For more configuration information and instructions on configuring the AAA server, refer to the AAA Interface Administration and Reference.
A BGP router is installed and configured. See Routing for more information on configuring BGP services.
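To configure interchassis session recovery on a primary and/or backup chassis:
Step 1
Configure the SRP context by applying the example configurations in Configuring the Service Redundancy Protocol (SRP) Context.
Step 2
Modify the source context of the core service by applying the example configurations in Modifying the Source Context for ICSR.
Step 3
Modify the destination context of the core service by applying the example configurations in Modifying the Destination Context for ICSR.
Step 4
Optional: Disable bulk statistics collection on the standby system by applying the example configuration in Disabling Bulk Statistics Collection on a Standby System.
Step 5
Verify the configuration of both chassis as described in Verifying the Primary and Backup Chassis Configuration.
Step 6
Save your configuration as described in the Verifying and Saving Your Configuration chapter.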
 
Configuring the Service Redundancy Protocol (SRP) Context
To configure the system to work for interchassis session recovery:
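Step 1
Create the SRP context and bind it to the chassis IP address by applying the example configuration in Creating and Binding the SRP Context.
Step 2
Configure the SRP context parameters by applying the example configuration in Configuring the SRP Context Parameters.
Step 3
Configure the SRP context interface by applying the example configuration in Configuring the SRP Context Interface Parameters.
Step 4
Verify your configuration as described in Verifying SRP Configuration.
Step 5
Save your configuration as described in the Verifying and Saving Your Configuration chapter.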
 
Creating and Binding the SRP Context
Use the following example to create the SRP context and bind it to the primary chassis IP address:
Important: ICSR is configured using two systems. Be sure to create the redundancy context on both systems. CLI commands must be executed on both systems. Always make configuration changes on the primary system first. It is recommended that you log in to both chassis before continuing. Before starting this configuration, determine which system to configure as the primary and use that login session.
configure
  context <srp_ctxt_name> [ -noconfirm ]
     service-redundancy-protocol
        bind address <ip_address>
        end
Notes:
 
 
Configuring the SRP Context Parameters
Use the following example to assign a chassis mode and priority, and to configure the redundancy link between the primary and backup systems:
Important: CLI commands must be executed on both systems. Always make configuration changes on the primary system first. It is recommended that you log in to both chassis before continuing.
configure
  context <srp_ctxt_name>
     service-redundancy-protocol
        chassis-mode { primary | backup }
        priority <priority>
        peer-ip-address <ip_address>
        hello-interval <dur_sec>
        dead-interval <dead_dur_sec>
        end
Notes:
 
 
Configuring the SRP Context Interface Parameters
Use the following example to configure the communication interface, with its IP address and port number, that the SRP context uses to communicate with the peer chassis:
Important: CLI commands must be executed on both systems. Always make configuration changes on the primary system first. It is recommended that you log in to both chassis before continuing.
configure
  context <vpn_ctxt_name> [ -noconfirm ]
     interface <srp_if_name>
        ip-address { <ip_address> | <ip_address>/<mask> }
        exit
     exit
  port ethernet <slot_num>/<port_num>
     description <des_string>
     medium { auto | speed { 10 | 100 | 1000 } duplex { full | half } }
     no shutdown
     bind interface <srp_if_name> <srp_ctxt_name>
     end
Notes:
 
Verifying SRP Configuration
Step 1
Verify the SRP configuration by entering the following command:

show srp info
The following is sample output of this command. In this example, an SRP context called srp1 was configured, and some parameters are shown with their default values.
 
Service Redundancy Protocol:
----------------------------------------------------------------------
Context: srp1
Local Address: 0.0.0.0
Chassis State: Init
Chassis Mode: Backup
Chassis Priority: 125
Local Tiebreaker: 00-00-00-00-00-00
Route-Modifier: 34
Peer Remote Address: 0.0.0.0
Peer State: Init
Peer Mode: Init
Peer Priority: 0
Peer Tiebreaker: 00-00-00-00-00-00
Peer Route-Modifier: 0
Last Hello Message received: -
Peer Configuration Validation: Initial
Last Peer Configuration Error: None
Last Peer Configuration Event: -
Connection State: None
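When checking this output on both chassis, it can help to turn the "Field: Value" lines into a dictionary so that individual fields can be compared programmatically. The following Python sketch assumes the output has already been captured as text (how it is retrieved from the chassis is outside the scope of this sketch); the function name is hypothetical, and the sample text is abbreviated from the output above.

```python
def parse_srp_info(text):
    """Parse 'Field: Value' lines from show srp info output into a dict.

    Lines without a colon (such as the dashed separator) and lines with an
    empty value (such as the section title) are skipped.
    """
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key and value:
            info[key] = value
    return info


sample = """Service Redundancy Protocol:
----------------------------------------------------------------------
Context: srp1
Chassis State: Init
Chassis Mode: Backup
Chassis Priority: 125
"""

info = parse_srp_info(sample)
print(info["Chassis State"], info["Chassis Mode"], info["Chassis Priority"])
```

All values are kept as strings; convert fields such as Chassis Priority to integers only where a numeric comparison is needed.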
 
Modifying the Source Context for ICSR
To modify the source context of core service:
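Step 1
Add the BGP router and configure the gateway IP address by applying the example configuration in Configuring BGP Router and HA Address.
Step 2
Configure the SRP context to monitor BGP by applying the example configuration in Configuring SRP Context for BGP.
Step 3
Verify your configuration as described in Verifying BGP Configuration.
Step 4
Save your configuration as described in the Verifying and Saving Your Configuration chapter.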
 
Configuring BGP Router and HA Address
Use the following example to create the BGP context and network addresses.
configure
  context <source_ctxt_name>
     router bgp <AS_num>
        network <ha_ip_address>
        neighbor <neighbor_ip_address> remote-as <AS_num>
        end
Notes:
 
 
Configuring SRP Context for BGP
Use the following example to configure the BGP context and IP addresses in the SRP context.
configure
  context <srp_ctxt_name>
     service-redundancy-protocol
        monitor bgp context <source_ctxt_name> <neighbor_ip_address>
        end
Notes:
 
Verifying BGP Configuration
Step 1
Verify the BGP configuration by entering the following command:

show srp monitor bgp
 
Modifying the Destination Context for ICSR
To modify the destination context of core service:
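Step 1
Add the BGP router and configure the gateway IP address by applying the example configuration in Configuring BGP Router and HA Address in Destination Context.
Step 2
Configure the SRP context to monitor BGP by applying the example configuration in Configuring SRP Context for BGP for Destination Context.
Step 3
Set the subscriber mode to default by applying the example configuration in Setting Subscriber to Default Mode.
Step 4
Verify your configuration as described in Verifying BGP Configuration in Destination Context.
Step 5
Save your configuration as described in the Verifying and Saving Your Configuration chapter.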
 
Configuring BGP Router and HA Address in Destination Context
Use the following example to create the BGP context and network addresses.
configure
  context <dest_ctxt_name>
     router bgp <AS_num>
        network <ha_ip_address>
        neighbor <neighbor_ip_address> remote-as <AS_num>
        end
Notes:
 
 
Configuring SRP Context for BGP for Destination Context
Use the following example to configure the BGP context and IP addresses in the SRP context.
configure
  context <srp_ctxt_name>
     service-redundancy-protocol
        monitor bgp context <dest_ctxt_name> <neighbor_ip_address>
        end
Notes:
 
Setting Subscriber to Default Mode
Use the following example to set the subscriber mode to Default.
configure
  context <dest_ctxt_name>
     subscriber default
     end
Notes:
 
Verifying BGP Configuration in Destination Context
Step 1
Verify the BGP configuration by entering the following command:

show srp monitor bgp
 
Disabling Bulk Statistics Collection on a Standby System
You can optionally configure bulk statistics not to be collected from a system when it is in the standby mode of operation.
Important: When this feature is enabled and a system transitions to the standby state, any pending accumulated statistics data is transferred at the first opportunity. After that, no additional statistics gathering takes place until the system comes out of the standby state.
Use the following example to disable the bulk statistics collection on a standby system.
configure
  bulkstat mode
     no gather-on-standby
     end
Notes:
 
 
Verifying the Primary and Backup Chassis Configuration
These instructions are used to compare the ICSR configuration on both chassis.
Step 1
Verify that both chassis have the same SRP configuration information. The output looks similar to the following:
 
config
context source
interface haservice loopback
ip address 172.17.1.1 255.255.255.255 srp-activate
#exit
radius attribute nas-ip-address address 172.17.1.1
radius server 192.168.83.2 encrypted key 01abd002c82b4a2c port 1812
radius accounting server 192.168.83.2 encrypted key 01abd002c82b4a2c port 1813
ha-service ha-pdsn
mn-ha-spi spi-number 256 encrypted secret 6c93f7960b726b6f6c93f7960b726b6f hash-algorithm md5
fa-ha-spi remote-address 192.168.82.0/24 spi-number 256 encrypted secret 1088bdd6817f64df
bind address 172.17.1.1
#exit
#exit
context destination
ip pool dynamic 172.18.0.0 255.255.0.0 public 0 srp-activate
ip pool static 172.19.0.0 255.255.240.0 static srp-activate
#exit
context srp
service-redundancy-protocol
#exit
#exit
end
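One way to carry out this comparison offline is to save the relevant configuration text from each chassis and diff the two. The following Python sketch uses the standard difflib module and assumes the two configurations have already been captured as strings; the function name and the "primary" and "backup" labels are illustrative only. Leading and trailing whitespace is ignored so that indentation differences are not reported.

```python
import difflib


def srp_config_diff(primary_text, backup_text):
    """Return unified-diff lines between two configuration texts.

    An empty list means the two configurations match after whitespace
    normalization.
    """
    primary = [line.strip() for line in primary_text.splitlines() if line.strip()]
    backup = [line.strip() for line in backup_text.splitlines() if line.strip()]
    return list(
        difflib.unified_diff(
            primary, backup, fromfile="primary", tofile="backup", lineterm=""
        )
    )


# Abbreviated from the sample output above; the backup text omits one line.
primary_cfg = """context destination
ip pool dynamic 172.18.0.0 255.255.0.0 public 0 srp-activate
ip pool static 172.19.0.0 255.255.240.0 static srp-activate
"""
backup_cfg = """context destination
ip pool dynamic 172.18.0.0 255.255.0.0 public 0 srp-activate
"""

for line in srp_config_diff(primary_cfg, backup_cfg):
    print(line)
```

Any line prefixed with "-" is present only on the primary chassis and any line prefixed with "+" only on the backup, which points directly at the configuration that needs to be reconciled.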
 

Cisco Systems Inc.
Tel: 408-526-4000
Fax: 408-527-0883