Session Recovery


In the telecommunications industry, over 90 percent of all equipment failures are software-related. With robust hardware failover and redundancy protection, any card-level hardware failure on the system can be corrected quickly. Software failures, however, can occur for numerous reasons, often without prior indication. For this reason, we have introduced a solution to recover subscriber sessions in the event of failure.
The Session Recovery feature provides seamless failover and reconstruction of subscriber session information in the event of a hardware or software fault within the system, preventing a fully connected user session from being disconnected.
This feature is available for the following functions:
Session recovery is not supported for the following functions:
Important: Session Recovery can only be enabled through a feature use license key. If you have not previously purchased this enhanced feature, contact your sales representative for more information.
When session recovery occurs, the system reconstructs the following subscriber information:
Session Recovery is also useful during software patch upgrades. If the session recovery feature is enabled during a software patch upgrade, it helps preserve existing sessions on the active PAC/PSC/PSC2 during the upgrade process. For more details, refer to Software Patch Upgrade in the System Administration Guide.
Important: Any partially connected calls (e.g., a session where HA authentication is pending and has not yet been acknowledged by the AAA server) are not recovered when a failure occurs.
 
How Session Recovery Works
This section provides an overview of how this feature is implemented and the recovery process.
Session recovery is performed by mirroring key software processes (e.g., session manager and AAA manager) within the system. These mirrored processes remain idle (in standby mode), performing no processing, until they are needed in the case of a software failure (e.g., a session manager task aborts). The system spawns new instances of “standby mode” session and AAA managers for each active control processor (CP) being used. Naturally, these mirrored processes require both memory and processing resources, which means that additional hardware may be required to enable this feature (see the Additional Hardware Requirements section).
Additionally, other key system-level software tasks, such as VPN manager, are performed on a physically separate Packet Accelerator Card (PAC)/Packet Services Card (PSC/PSC2) to ensure that a double software fault (e.g., session manager and VPN manager failing at the same time on the same card) cannot occur. The PAC/PSC/PSC2 used to host the VPN manager process is in active mode and is reserved by the operating system for this sole use when session recovery is enabled.
There are two modes of session recovery.
Task recovery mode: Used when one or more session manager failures occur and can be recovered without using resources on a standby PAC/PSC/PSC2. In this mode, recovery is performed by using the mirrored “standby-mode” session manager task(s) running on active PACs/PSCs/PSC2s. The “standby-mode” task is renamed, made active, and then populated using information from other tasks such as AAA manager. When a task fails, only a limited number of subscribers are affected, and they experience an outage only until the task restarts.
Full PAC/PSC/PSC2 recovery mode: Used when a PAC/PSC/PSC2 hardware failure occurs, or when a planned PAC/PSC/PSC2 migration fails. In this mode, the standby PAC/PSC/PSC2 is made active and the “standby-mode” session manager and AAA manager tasks on the newly activated PAC/PSC/PSC2 perform session recovery.
Session/Call state information is saved in the peer AAA manager task because each AAA manager and session manager task is paired together. These pairs are started on physically different application cards to ensure task recovery.
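The choice between the two recovery modes described above can be sketched in Python. This is only an illustration of the decision as the text describes it, not the system's implementation; the Failure class, its fields, and the select_recovery_mode function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RecoveryMode(Enum):
    TASK = "task recovery"
    FULL_CARD = "full PAC/PSC/PSC2 recovery"


@dataclass(frozen=True)
class Failure:
    """Hypothetical description of a fault; the field names are illustrative."""
    card_hardware_failed: bool = False
    migration_failed: bool = False
    failed_sessmgr_count: int = 0


def select_recovery_mode(failure: Failure) -> Optional[RecoveryMode]:
    """Return the recovery mode the text associates with a given fault."""
    if failure.card_hardware_failed or failure.migration_failed:
        # The standby card is made active and its standby-mode session
        # manager and AAA manager tasks perform the recovery.
        return RecoveryMode.FULL_CARD
    if failure.failed_sessmgr_count >= 1:
        # A standby-mode task on an active card is renamed, made active,
        # and repopulated from its peer tasks (such as AAA manager).
        return RecoveryMode.TASK
    return None  # nothing to recover


if __name__ == "__main__":
    print(select_recovery_mode(Failure(failed_sessmgr_count=1)))
    print(select_recovery_mode(Failure(card_hardware_failed=True)))
```

A card-level fault takes precedence in this sketch because the text reserves task recovery for failures that do not need resources on a standby card.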
There are some situations wherein session recovery may not operate properly. These include:
Important: After a session recovery operation, some statistics, such as those collected and maintained on a per-manager basis (AAA Manager, Session Manager, etc.), are in general not recovered. Only accounting/billing-related information is checkpointed and recovered.
 
Additional Hardware Requirements
Because session recovery requires numerous hardware resources, such as memory, control processors, NPU processing capacity, etc., some additional hardware may be required to ensure that enough resources are available to fully support this feature.
Important: A minimum of four PACs/PSCs/PSC2s (three active and one standby) per individual chassis is required to use this feature.
To allow for complete session recovery in the event of a hardware failure during a PAC/PSC/PSC2 migration, a minimum of three active PACs/PSCs/PSC2s and two standby PACs/PSCs/PSC2s should be deployed.
To assist you in your network design and capacity planning, the following list provides information that should be considered.
A PAC/PSC/PSC2 migration in progress may temporarily impair the ability to perform session recovery, because hardware resources that may be needed (e.g., memory and processors) are not available during the operation. To avoid this condition, configure a minimum of two standby PACs/PSCs/PSC2s.
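For capacity planning, the card counts stated in this section can be expressed as a small check. The following Python sketch encodes only the numbers given above (three active plus one standby as the minimum, and two standby cards to cover a failure during a migration); the function name and the returned labels are hypothetical.

```python
MIN_ACTIVE = 3                 # minimum active PACs/PSCs/PSC2s per chassis
MIN_STANDBY = 1                # minimum standby cards to use the feature
MIN_STANDBY_FOR_MIGRATION = 2  # standby cards needed to cover a migration


def check_card_plan(active: int, standby: int) -> str:
    """Classify a planned card layout against the stated minimums."""
    if active < MIN_ACTIVE or standby < MIN_STANDBY:
        return "insufficient"
    if standby < MIN_STANDBY_FOR_MIGRATION:
        # Meets the minimum, but recovery may be impaired while a
        # card migration is in progress.
        return "minimum"
    return "migration-safe"


if __name__ == "__main__":
    for plan in [(2, 1), (3, 1), (3, 2)]:
        print(plan, check_card_plan(*plan))
```

The check does not account for memory, control processor, or NPU capacity, which this section also lists as planning considerations.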
 
Configuring the System to Support Session Recovery
The following configuration procedures allow you to configure the session recovery feature for either an operational system that is currently in-service (able to accept incoming calls) or a system that is out-of-service (not part of your production network and therefore not processing any live subscriber/customer data).
Important: Session recovery can only be enabled through a feature use license key. If you have not previously purchased this enhanced feature, contact your sales representative for more information.
The session recovery feature, even when the feature use key is present, is disabled by default on the system.
 
Enabling Session Recovery
As noted earlier, session recovery can be enabled on a system that is out-of-service (OoS) and does not yet have any contexts configured, or on an in-service system that is currently capable of processing calls. However, if the system is in-service, it must be restarted before the session recovery feature takes effect. Each procedure is shown below.
 
Enabling Session Recovery on an Out-of-Service System
The following procedure is for a system that does not have any contexts configured.
To enable the session recovery feature on an out-of-service (OoS) system, follow the procedure below. This procedure assumes that you begin at the Exec mode prompt.
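Step 1
Verify that the session recovery feature is enabled by the installed license key by entering the following command:
 
show license info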
The output of this command appears similar to the example shown below. Note that the session recovery feature is bold-faced in this example.
 
Key Information (installed key):
   Comment                <Host Name>
   CF Device 1            Model: "SanDiskSDCFB-512"
                          Serial Number: "115212D1904T0314"
   CF Device 2            Model: "SanDiskSDCFB-512"
                          Serial Number: "115206D1904S5951"
   Date of Issue          Thursday May 12 14:35:50 EDT 2005
   Issued By              <Vendor Name>
   Key Number             17120
Enabled Features:
   Part Number  Quantity  Feature
   -----------  --------  -----------------------
   xxx-xx-xxxx        15  PDSN/GGSN/SGSN (10K)
        [none]        -   FA         
        [none]        -   IPv4 Routing Protocols
   xxx-xx-xxxx        -   IPSec
   xxx-xx-xxxx        -   L2TP LAC (PDSN/GGSN/SGSN)
   xxx-xx-xxxx        1   L2TP LNS (10K)
   xxx-xx-xxxx        6   L2TP LNS (1K)
   xxx-xx-xxxx        -   Session Recovery (PDIF/PDSN/GGSN/SGSN)
        [none]        -   Session Recovery (HA)
   xxx-xx-xxxx        -   PCF Monitoring
   xxx-xx-xxxx        -   Layer 2 Traffic Management
 Session Limits:
                Sessions  Session Type
                --------  -----------------------
                  150000  PDSN/GGSN/SGSN
                   16000  L2TP LNS
 Status:
   CF Device 1            Does not match either SPC
   CF Device 2            Does not match either SPC
   License Status         Good (Not Redundant)
Important: If the Session Recovery feature appears as Disabled, then you cannot enable this feature until a new license key is installed in the system.
Step 2
configure
   require session recovery
   end
Step 3
Save your configuration as described in the Saving Your Configuration section in the System Administration Guide.
The system, when started, enables session recovery, creates all mirrored “standby-mode” tasks, and performs PAC/PSC/PSC2 reservations and other operations automatically.
Step 4
 
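The license check at the start of this procedure can be scripted once the output of show license info has been captured as text. The following Python sketch only looks for Session Recovery entries in the captured output; how the text is captured is left open, the function name is hypothetical, and because this document does not show how a disabled feature is rendered, the sketch does not attempt to detect that case.

```python
from typing import List

# Abbreviated sample based on the example output shown in this procedure.
SAMPLE_LICENSE_OUTPUT = """\
Enabled Features:
   Part Number  Quantity  Feature
   -----------  --------  -----------------------
   xxx-xx-xxxx        -   IPSec
   xxx-xx-xxxx        -   Session Recovery (PDIF/PDSN/GGSN/SGSN)
        [none]        -   Session Recovery (HA)
   xxx-xx-xxxx        -   PCF Monitoring
"""


def session_recovery_features(license_output: str) -> List[str]:
    """Return the Session Recovery feature entries found in the output."""
    features = []
    for line in license_output.splitlines():
        index = line.find("Session Recovery")
        if index != -1:
            # Keep the feature name through the end of the line.
            features.append(line[index:].strip())
    return features


if __name__ == "__main__":
    for feature in session_recovery_features(SAMPLE_LICENSE_OUTPUT):
        print(feature)
```

An empty result suggests that no session recovery license is listed and that a new license key is needed before the feature can be enabled.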
Enabling Session Recovery on an In-Service System
When enabling session recovery on a system that already has a saved configuration, the session recovery commands are automatically placed before any service configuration commands in the configuration file.
To enable the session recovery feature on an in-service system, follow the procedure below. This procedure assumes that you begin at the Exec mode prompt.
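Step 1
Verify that the session recovery feature is enabled by the installed license key by entering the following command:
 
show license info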
The output of this command appears similar to the example shown below. Note that the session recovery feature is bold-faced in this example.
 
Key Information (installed key):
   Comment                <Host Name>
   CF Device 1            Model: "SanDiskSDCFB-512"
                          Serial Number: "115212D1904T0314"
   CF Device 2            Model: "SanDiskSDCFB-512"
                          Serial Number: "115206D1904S5951"
   Date of Issue          Thursday May 12 14:35:50 EDT 2005
   Issued By              <Vendor Name>
   Key Number             17120
Enabled Features:
   Part Number  Quantity  Feature
   -----------  --------  -----------------------
   xxx-xx-xxxx        15  PDSN/GGSN/SGSN (10K)
        [none]        -   FA         
        [none]        -   IPv4 Routing Protocols
   xxx-xx-xxxx        -   IPSec
   xxx-xx-xxxx        -   L2TP LAC (PDSN/GGSN/SGSN)
   xxx-xx-xxxx        1   L2TP LNS (10K)
   xxx-xx-xxxx        6   L2TP LNS (1K)
   xxx-xx-xxxx        -   Session Recovery (PDIF/PDSN/GGSN/SGSN)
        [none]        -   Session Recovery (HA)
   xxx-xx-xxxx        -   PCF Monitoring
   xxx-xx-xxxx        -   Layer 2 Traffic Management
 Session Limits:
                Sessions  Session Type
                --------  -----------------------
                  150000  PDSN/GGSN/SGSN
                   16000  L2TP LNS
 Status:
   CF Device 1            Does not match either SPC
   CF Device 2            Does not match either SPC
   License Status         Good (Not Redundant)
Important: If the Session Recovery feature for HA appears as Disabled, then you cannot enable this feature until a new license key is installed in the system.
Step 2
configure
   require session recovery
   end
Important: This feature does not take effect until after the system has been restarted.
Step 3
Step 4
reload
The following prompt appears:
Are you sure? [Yes|No]:
Confirm your desire to perform a system restart by entering the following:
yes
The system, when restarted, enables session recovery, creates all mirrored “standby-mode” tasks, and performs PAC/PSC/PSC2 reservations and other operations automatically.
Step 5
Important: More advanced users may opt to simply insert the require session recovery command syntax into an existing configuration file using a text editor or other means, and then apply the configuration file manually. When doing this, take care to place the command among the first few lines of the existing configuration file so that it appears before the creation of any non-local context.
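When a configuration file is edited by hand, the placement rule in the note above can be checked before the file is applied. The following Python sketch assumes that contexts are created by lines of the form "context <name>" and that the local context is named "local"; both are assumptions about the file layout made for this example, and the function name is hypothetical.

```python
def require_precedes_contexts(config_text: str) -> bool:
    """Return True if 'require session recovery' appears before the
    first non-local context line, False otherwise (or if it is absent)."""
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if line == "require session recovery":
            return True
        words = line.split()
        # Assumed form of a context-creation line: "context <name>".
        if len(words) >= 2 and words[0] == "context" and words[1] != "local":
            return False
    return False


if __name__ == "__main__":
    good = "configure\n  require session recovery\n  context isp1\n  exit\nend\n"
    bad = "configure\n  context isp1\n  exit\n  require session recovery\nend\n"
    print(require_precedes_contexts(good))
    print(require_precedes_contexts(bad))
```

The comparison is an exact match on the stripped line, so a "no require session recovery" line is not mistaken for the enabling command.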
 
Disabling the Session Recovery Feature
To disable the session recovery feature on a system, enter the following command from the Global Configuration mode prompt:
 
no require session recovery
Important: If this command is issued on an in-service system, then the system must be restarted by issuing the reload command.
 
Viewing Session Recovery Status
To determine whether the system, with session recovery enabled, is capable of performing session recovery, enter the following command from the Exec mode prompt:
 
show session recovery status [verbose]
The output of this command should be similar to the examples shown below.
 
[local]host_name# show session recovery status
Session Recovery Status:
  Overall Status         : SESSMGR Not Ready For Recovery
  Last Status Update     : 1 second ago
 
[local]host_name# show session recovery status
Session Recovery Status:
  Overall Status         : Ready For Recovery
  Last Status Update     : 8 seconds ago
 
[local]host_name# show session recovery status verbose
Session Recovery Status:
  Overall Status         : Ready For Recovery
  Last Status Update     : 2 seconds ago
 
              ----sessmgr---     ----aaamgr----     demux
 cpu state    active  standby    active  standby    active  status
---- -------  ------  -------    ------  -------    ------  ------------
 1/1 Active   2       1          1       1          0       Good
 1/2 Active   1       1          0       0          0       Good
 1/3 Active   1       1          3       1          0       Good
 2/1 Active   1       1          1       1          0       Good
 2/2 Active   1       1          0       0          0       Good
 2/3 Active   2       1          3       1          0       Good
 3/0 Active   0       0          0       0          1       Good (Demux)
 3/2 Active   0       0          0       0          1       Good (Demux)
 4/1 Standby  0       2          0       1          0       Good
 4/2 Standby  0       1          0       0          0       Good
 4/3 Standby  0       2          0       3          0       Good
 
[local]host_name#
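The status output shown above lends itself to automated monitoring. The following Python sketch parses captured output of show session recovery status [verbose] and reports whether the overall status is "Ready For Recovery" and every listed CPU reports a status beginning with "Good". The function names are hypothetical, and the column layout is assumed to match the example above.

```python
import re
from typing import Dict, Optional, Tuple

OVERALL_RE = re.compile(r"\s*Overall Status\s*:\s*(.+?)\s*$")
# cpu, state, five task counters, then the free-text status column.
CPU_ROW_RE = re.compile(
    r"\s*(\d+/\d+)\s+(Active|Standby)\s+\d+\s+\d+\s+\d+\s+\d+\s+\d+\s+(.+?)\s*$"
)

SAMPLE_STATUS_OUTPUT = """\
Session Recovery Status:
  Overall Status         : Ready For Recovery
  Last Status Update     : 2 seconds ago

              ----sessmgr---     ----aaamgr----     demux
 cpu state    active  standby    active  standby    active  status
---- -------  ------  -------    ------  -------    ------  ------------
 1/1 Active   2       1          1       1          0       Good
 3/0 Active   0       0          0       0          1       Good (Demux)
 4/1 Standby  0       2          0       1          0       Good
"""


def parse_recovery_status(output: str) -> Tuple[Optional[str], Dict[str, dict]]:
    """Return (overall status, {cpu: {"state": ..., "status": ...}})."""
    overall = None
    cpus: Dict[str, dict] = {}
    for line in output.splitlines():
        match = OVERALL_RE.match(line)
        if match:
            overall = match.group(1)
            continue
        match = CPU_ROW_RE.match(line)
        if match:
            cpus[match.group(1)] = {"state": match.group(2), "status": match.group(3)}
    return overall, cpus


def ready_for_recovery(output: str) -> bool:
    """True if the overall status is ready and no listed CPU is degraded."""
    overall, cpus = parse_recovery_status(output)
    return overall == "Ready For Recovery" and all(
        cpu["status"].startswith("Good") for cpu in cpus.values()
    )


if __name__ == "__main__":
    print(ready_for_recovery(SAMPLE_STATUS_OUTPUT))
```

Non-verbose output contains no per-CPU rows, so in that case the result depends on the overall status line alone.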
 
 
Viewing Recovered Session Information
Per-subscriber session information is available to show any changes in session recovery status. The output of the show subscriber debug-info command includes a field named “Redundancy Status”. This field shows whether the session information is original or has been recovered. There are two valid outputs for this field:
 
Original Session - indicating that this is the original session information, containing all event states and time information.
Recreated Session - indicating that this session was reconstructed during a session recovery operation.
This command can be executed before or after a session recovery operation has been performed, and shows information for the specified session.
To view session state information and any session recovery status, enter the following command:
 
show subscriber debug-info {callid id | msid id | username name}
callid id: Displays subscriber information for the call specified by id. The call ID must be specified as an 8-digit (4-byte) hexadecimal number.
msid id: Displays information for the mobile user identified by id. id must be from 7 to 16 digits specified as an IMSI, MIN, or RMI. Wildcard characters $ and * are allowed. The * wildcard matches multiple characters and the $ wildcard matches a single character. If you do not want the wildcard characters interpreted as wildcards, enclose them in single quotes ('). For example: '$'.
username name: Displays information for connections for the subscriber identified by name. The user must have been previously configured. name must be a sequence of characters and/or wildcard characters ('$' and '*') from 1 to 127 characters in length. The * wildcard matches multiple characters and the $ wildcard matches a single character. If you do not want the wildcard characters interpreted as wildcards, enclose them in single quotes ('). For example: '$'.
The following example shows the output of this command both before and after a session recovery operation has been performed. The “Redundancy Status” fields in this example have been bold-faced for clarity.
 
username: user1       callid: 01ca11b1         msid: 0000100003
  Card/Cpu: 4/2
  Sessmgr Instance: 7
  Primary callline:
  Redundancy Status: Original Session
  Checkpoints    Attempts    Success    Last-Attempt    Last-Success
     Full:             69         68         29800ms         29800ms
     Micro:           206        206         20100ms         20100ms
   Current state: SMGR_STATE_CONNECTED
   FSM Event trace:
         State                            Event
         SMGR_STATE_OPEN                  SMGR_EVT_NEWCALL
         SMGR_STATE_NEWCALL_ARRIVED       SMGR_EVT_ANSWER_CALL
         SMGR_STATE_NEWCALL_ANSWERED      SMGR_EVT_LINE_CONNECTED
         SMGR_STATE_LINE_CONNECTED        SMGR_EVT_LINK_CONTROL_UP
         SMGR_STATE_LINE_CONNECTED        SMGR_EVT_AUTH_REQ
         SMGR_STATE_LINE_CONNECTED        SMGR_EVT_IPADDR_ALLOC_SUCCESS
         SMGR_STATE_LINE_CONNECTED        SMGR_EVT_AUTH_SUCCESS
         SMGR_STATE_LINE_CONNECTED        SMGR_EVT_UPDATE_SESS_CONFIG
         SMGR_STATE_LINE_CONNECTED        SMGR_EVT_LOWER_LAYER_UP
  Data Reorder statistics
         Total timer expiry:       0         Total flush (tmr expiry): 0
         Total no buffers:         0         Total flush (no buffers): 0
         Total flush (queue full): 0         Total flush (out of range):0
         Total flush (svc change): 0         Total out-of-seq pkt drop: 0
         Total out-of-seq arrived: 0
  IPv4 Reassembly Statistics:
         Success:                  0         In Progress:               0
         Failure (timeout):        0         Failure (no buffers):      0
         Failure (other reasons):  0
  Redirected Session Entries:
         Allowed:               2000         Current:                  0
         Added:                    0         Deleted:                  0
         Revoked for use by different subscriber: 0
  Peer callline:
  Redundancy Status: Original Session
  Checkpoints    Attempts    Success    Last-Attempt    Last-Success
     Full:              0          0             0ms             0ms
     Micro:             0          0             0ms             0ms
   Current state: SMGR_STATE_CONNECTED
   FSM Event trace:
         State                            Event
         SMGR_STATE_LINE_CONNECTED        SMGR_EVT_LOWER_LAYER_UP
         SMGR_STATE_CONNECTED             SMGR_EVT_AUTH_REQ
         SMGR_STATE_CONNECTED             SMGR_EVT_AUTH_SUCCESS
         SMGR_STATE_CONNECTED             SMGR_EVT_REQ_SUB_SESSION
         SMGR_STATE_CONNECTED             SMGR_EVT_RSP_SUB_SESSION
         SMGR_STATE_CONNECTED             SMGR_EVT_ADD_SUB_SESSION
         SMGR_STATE_CONNECTED             SMGR_EVT_AUTH_REQ
         SMGR_STATE_CONNECTED             SMGR_EVT_AUTH_SUCCESS
         SMGR_STATE_CONNECTED             SMGR_EVT_AUTH_REQ
         SMGR_STATE_CONNECTED             SMGR_EVT_AUTH_SUCCESS
         SMGR_STATE_CONNECTED             SMGR_EVT_AUTH_REQ
         SMGR_STATE_CONNECTED             SMGR_EVT_AUTH_SUCCESS
         SMGR_STATE_CONNECTED             SMGR_EVT_AUTH_REQ
         SMGR_STATE_CONNECTED             SMGR_EVT_AUTH_SUCCESS
         SMGR_STATE_CONNECTED             SMGR_EVT_AUTH_REQ
         SMGR_STATE_CONNECTED             SMGR_EVT_AUTH_SUCCESS
  Data Reorder statistics
         Total timer expiry:       0         Total flush (tmr expiry): 0
         Total no buffers:         0         Total flush (no buffers): 0
         Total flush (queue full): 0         Total flush (out of range):0
         Total flush (svc change): 0         Total out-of-seq pkt drop: 0
         Total out-of-seq arrived: 0
  IPv4 Reassembly Statistics:
         Success:                  0         In Progress:               0
         Failure (timeout):        0         Failure (no buffers):      0
         Failure (other reasons):  0
  Redirected Session Entries:
         Allowed:               2000         Current:                  0
         Added:                    0         Deleted:                  0
         Revoked for use by different subscriber: 0
 
Notice that in the example above, where the session has been recovered/recreated, the state events (FSM Event State field) no longer exist. This field is re-populated as new state changes occur.
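To check many subscribers after a recovery operation, the “Redundancy Status” field can be extracted from captured show subscriber debug-info output. The following Python sketch reads the field for the primary and peer calllines; the function name is hypothetical, and the sample text is a shortened, made-up illustration of the two documented field values rather than a capture from a real system.

```python
import re
from typing import Dict

CALLLINE_RE = re.compile(r"\s*(Primary|Peer) callline:")
REDUNDANCY_RE = re.compile(r"\s*Redundancy Status:\s*(.+?)\s*$")

# Illustrative sample only: shortened and not taken from a real system.
SAMPLE_DEBUG_INFO = """\
username: user1       callid: 01ca11b1         msid: 0000100003
  Primary callline:
  Redundancy Status: Recreated Session
   Current state: SMGR_STATE_CONNECTED
  Peer callline:
  Redundancy Status: Original Session
   Current state: SMGR_STATE_CONNECTED
"""


def redundancy_by_callline(output: str) -> Dict[str, str]:
    """Map each callline ("Primary", "Peer") to its Redundancy Status."""
    statuses: Dict[str, str] = {}
    current = None
    for line in output.splitlines():
        match = CALLLINE_RE.match(line)
        if match:
            current = match.group(1)
            continue
        match = REDUNDANCY_RE.match(line)
        if match and current is not None:
            statuses[current] = match.group(1)
    return statuses


if __name__ == "__main__":
    for callline, status in redundancy_by_callline(SAMPLE_DEBUG_INFO).items():
        print(callline, "->", status)
```

A value of "Recreated Session" identifies a callline that was reconstructed during session recovery.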
 
