1
|
Up
|
P-Reachable
|
G-Reachable
|
G-Reachable
|
No SSO
|
No Action
|
|
2
|
Up
|
P-Reachable
|
G-Reachable
|
G-Unreachable
|
No SSO
|
No action is required. The standby unit is not ready for SSO in this state because it does not have gateway reachability.
In this scenario, the standby unit appears in standby-recovery mode.
Spring Back:
If the gateway reachability is restored (G_Reachable), the controller returns to Standby state (no reboot is necessary).
Note
|
RP resources and gateway resources each trigger distinct actions.
|
|
Spring Back: If the gateway reachability is restored (G_Reachable), the controller transitions to Standby state. A reboot is not required.
|
3
|
Up
|
P-Reachable
|
G-Unreachable
|
G-Reachable
|
SSO
|
The system exchanges gateway reachability messages over the RMI and RP links. When the active controller reboots, the standby
controller takes over as the active controller. The RP goes down during the reboot process.
|
The Stack Manager sends a message to the standby controller to initiate a role change. The standby controller consults the
active controller.
-
If the active controller responds, the standby controller determines that the active controller does not have all the required
resources and allows the role change.
-
If the active controller does not respond, or if the RMI link is down, the standby controller proceeds with the role change
because it has all the resources required to become active.
|
4
|
Up
|
P-Reachable
|
G-Unreachable
|
G-Unreachable
|
No SSO
|
The standby controller is not ready for SSO in this state because it does not have gateway reachability. The standby controller
appears in Standby-Recovery mode.
|
Spring Back:
If the gateway reachability is restored on the Standby-Recovery controller (G_Reachable), the controller transitions to the
standby state.
|
5
|
Up
|
P-Unreachable
|
G-Reachable
|
G-Reachable
|
No SSO
|
No action is taken when the RMI link goes down, and DAD does not run while the RMI link is down.
|
If gateway reachability (G_Reachable) is lost, the controller transitions to Standby. This situation is managed as case (3).
The active controller maintains its state when gateway reachability is lost.
No action is taken when the RMI link goes down.
Dual-Active Detection (DAD) does not occur when the RMI link is down.
|
6
|
Up
|
P-Unreachable
|
G-Reachable
|
G-Unreachable
|
No SSO
|
No action is required. The standby is not ready for SSO in this state because it does not have gateway reachability. The standby
appears in standby-recovery mode.
|
Spring Back:
If the gateway reachability is restored (G_Reachable), the controller goes to Standby mode without a reload. No action is taken if the RMI comes up.
|
7
|
Up
|
P-Unreachable
|
G-Unreachable
|
G-Reachable
|
SSO
|
A gateway reachability message is also exchanged over the RP link. The Active device reboots so that the Standby device becomes
the new Active. The RP link goes down when the Active device reboots. The Stack Manager sends a message over the RP and RMI
links to the Standby Controller to initiate the role change. The Standby controller consults the Active device.
-
If the Active responds, the Standby determines that the Active does not have all resources and allows the role change.
-
If the Active does not respond (possibly because the RP link is already down), the Standby allows the role change regardless
of resource status.
|
When the active controller reboots, the RP goes down. The Stack Manager sends a message over the RP and RMI link to the standby
controller to initiate a role change. The standby controller consults the active controller.
|
8
|
Up
|
P-Unreachable
|
G-Unreachable
|
G-Unreachable
|
No SSO
|
The standby controller is not ready for SSO in this state because it does not have gateway reachability. The standby controller
appears in standby-recovery mode.
Spring Back:
If gateway reachability is restored on the standby-recovery controller (G_Reachable), the controller transitions to standby.
Refer to step 7 for more details.
The active controller does not change its state when gateway reachability is lost.
No action occurs if the RMI comes up.
|
When the Active device reboots, the RP goes down. The Stack Manager sends a message over the RP and RMI links to the Standby
Controller to initiate a role change.
The Standby Controller consults the Active Controller.
-
If the Active responds, the Standby deduces that the Active does not have all the required resources and proceeds with the
role change.
-
If the Active does not respond (for example, if the RP is already down), the Standby allows the role change regardless of
the resource status.
|
9
|
Down
|
P-Reachable
|
G-Reachable
|
G-Reachable
|
No SSO
|
When the RP is not available, the standby transitions to Standby-Recovery mode. The stack manager requests a role change
when the RP goes down. If the RMI is up, the RIF manager sends a message to the active unit to check its status. If a response
is received, the standby does not allow the role change and transitions to Standby-Recovery. If there is no response, such
as when the active unit is down due to a crash, the role change is allowed.
This scenario works differently if the RP link goes down before the standby reaches the Standby-Hot state. In that case,
the RIF manager sends a positive response to the stack manager, resulting in a controller reload.
|
Spring Back:
If gateway reachability is restored on the Standby-Recovery controller (G_Reachable), the controller transitions to Standby. In this
case, refer to state (7). The Active controller does not change its state when gateway reachability is lost. No action is
taken if the RMI comes up.
|
10
|
Down
|
P-Reachable
|
G-Reachable
|
G-Unreachable
|
No SSO
|
The standby is not ready for SSO in this state because it does not have gateway reachability. The standby will appear in standby-recovery
mode.
There are two possible scenarios:
-
The RP goes down first, followed by the standby gateway.
-
The standby gateway goes down first, followed by the RP.
Consider the case where the RP goes down first. In this situation, the stack manager requests a role change. However, because
the standby does not have gateway reachability, it cannot allow the role change. The system starts a 30-minute timer when
the RMI goes down (meaning both the RP and RMI are down).
If the RP goes down before the standby is in standby-hot state, the system reloads. There are several sub-cases:
-
If the active unit crashes and returns within 30 minutes, the timer stops. The standby remains in recovery and reboots when
the RP is up.
-
If the RP stays down, no action is taken when the timer expires, provided the RMI is up.
-
If the active unit continuously crashes, the timer expires with the RMI down, and the standby-recovery unit reboots as the
active unit.
Spring Back:
-
If the gateway returns first, the standby-recovery unit remains in recovery.
-
If the RP returns first, the system reboots to standby-recovery or standby, depending on whether the gateway is reachable.
|
When the RP goes down, the stack manager requests a role change. While the RMI is operational, the RIF manager sends a message
to the active controller to verify its status. If a response is received, the standby controller prevents the role change
and transitions to standby-recovery. If there is no response, such as when the active controller is down due to a crash, the
role change is permitted.
However, if the RP goes down before the standby controller reaches the standby-hot state, the RIF manager sends a positive
response to the stack manager, which results in a controller reload.
|
11
|
Down
|
P-Reachable
|
G-Unreachable
|
G-Reachable
|
SSO
|
The system exchanges gateway reachability messages over RP and RMI links.
When the Old-Active controller transitions to Active-Recovery mode, configuration mode is disabled. All interfaces are set
to ADMIN DOWN, except for the wireless management interface with the RMI IP address.
When the RP link comes up, the controller in Active-Recovery reloads to become standby. If the gateway remains unreachable,
it reloads to become standby-recovery.
If the gateway (GW) is lost and the RP link goes down less than eight seconds after the gateway loss, the following actions
occur. The stack manager requests a role change on the standby controller. If the standby controller has not reached Standby-Hot
state, it allows a reload. Otherwise, it queries the active controller.
If the active controller lacks resources and responds affirmatively, the standby controller becomes active, DAD runs, and
the old active controller transitions to Active-Recovery.
|
Assume that the active device has already lost the gateway (GW), and then the RP goes down. If the gateway is lost for less
than eight seconds, the system triggers a stateful switchover (SSO) that is initiated by the gateway. This scenario describes
when the gateway has been lost for less than eight seconds before the RP goes down.
In this case, the Stack Manager requests a role change from the standby device. If the standby has not yet reached the Standby-Hot
state, the system sends a positive response to the Stack Manager. Since the standby has all resources available except for
the RP, it sends a query to the active device to request a role change. The active device responds affirmatively because it
does not have all the necessary resources.
The standby then becomes active. DAD must run to ensure that the new active device maintains its status. The former active
device enters the Active-Recovery state.
Spring Back:
The controller in Active-Recovery reboots after the RP link is restored. If the gateway is still down, the controller transitions
to standby-recovery. If the gateway is restored, the controller transitions to standby.
|
12
|
Down
|
P-Reachable
|
G-Unreachable
|
G-Unreachable
|
No SSO
|
Standby transitions to Standby-Recovery. Assume both controllers lose gateway, then RP goes down. Stack manager requests a
role change. Because Standby lacks resources, it starts a 30-minute timer when RMI goes down (that is, RP and RMI are both
down).
There are three possible outcomes:
-
If Active recovers within 30 minutes, the timer stops. Standby remains in recovery and may reboot when RP returns.
-
If RP stays down, no action occurs when the timer expires, provided RMI is up.
-
If Active never recovers, the timer expires with RMI down, and Standby-Recovery reboots as Active.
Note
|
If gateway reachability is not enabled, the gateway is not considered a resource: SSO is not allowed while Active is up, and
SSO is allowed if Active is down and Standby is standby-hot. If the RP link goes down before the standby reaches standby-hot,
the standby reloads. Recovery from Standby-Recovery to Standby without a reload is possible only if the recovery was due solely
to a gateway event.
|
Spring Back:
-
If gateway returns first, the system remains in Standby-Recovery.
-
If RP returns first, the system reboots to Standby-Recovery and then to Standby if gateway is up.
|
Assume that both controllers lose their gateway (GW) and then the RP goes down.
The stack manager requests a role change when the RP goes down. Because the standby does not have all resources (currently,
gateway reachability), it does not allow the role change. It starts the 30-minute timer when the RMI goes down (that is, when
both the RP and the RMI are down). There are three possibilities:
-
The active suffers a software glitch (for example, a crash), in which case it comes up within 30 minutes and the timer
stops. The standby remains in standby-recovery. If the RP comes up while the timer is running, the Standby-Recovery controller
reboots and may come up as Standby or Standby-Recovery.
-
The physical RP connection goes down and remains down. When the timer expires, no action is taken if the RMI is up.
-
The active crashes continuously, that is, it does not come up within 30 minutes. In this case, the RMI is down when the timer
expires. The standby-recovery controller reboots when the timer expires and may come up as Active.
When the RP down event is received, if gateway reachability is not enabled, the gateway is not considered a resource.
In this case, SSO is not allowed if the Active is up. SSO is allowed if the Active is down, provided the Standby is in
Standby-Hot state.
If the RP link goes down before the standby becomes standby-hot, the standby reloads.
Note
|
A Standby-Recovery controller that has lost the RP is no longer Standby-Hot. This implies that recovery from Standby-Recovery to Standby
without a reboot (as was the case earlier in 17.2) is not possible for RP events. It is, however, possible for gateway events.
|
Spring Back:
-
When the Standby-Recovery controller finds that the gateway is up, it remains in Standby-Recovery if the RP is still down.
-
When the Standby-Recovery controller finds that its RP is up, it reboots and comes up as Standby-Recovery.
|
13
|
Down
|
P-Unreachable
|
G-Reachable
|
G-Reachable
|
SSO
|
A double fault may result in two active controllers. When this occurs, the Standby controller becomes active, but the original
Active controller may still exist. Once connectivity is restored, role negotiation ensures that the most recent Active controller
is retained.
If RMI goes down and then RP also goes down, the stack manager requests a role change. Because RMI is unavailable,
Standby grants the role change only if it is in standby-hot mode; otherwise, it denies the request. If the RP link goes down
before the standby reaches standby-hot mode, the standby reloads.
Spring Back:
If RMI returns, the previous Active controller enters Active-Recovery mode. When RP returns, the controller reboots and transitions
to Standby. If RP goes down, RMI goes down, and the timer expires, Standby reboots as Active. The timer may be skipped in
cases of a pure double fault.
Note
|
You may skip the timer for pure double-fault cases.
|
|
Let us assume that the RMI goes DOWN first and then the RP goes DOWN. When the RP goes DOWN, the stack manager requests a
role change. Since the RMI is DOWN, the standby cannot consult with the Active. The standby allows a role change to become
Active, regardless of its resource state, provided the standby is in Standby-Hot. If the standby is not in Standby-Hot, a
role change is not allowed. If the RP link goes down before the standby becomes Standby-Hot, the standby reloads.
Spring Back:
If the RMI comes UP at any time, Old Active transitions to Active-Recovery. Active-Recovery reboots when the RP comes up,
after which it will become Standby.
If the RP goes DOWN first, refer to case (9). If RP_DOWN and RMI_DOWN occur in that sequence and the 30-minute timer expires,
the standby shall reboot. It will come up as Active if RP and RMI continue to be DOWN. Alternatively, the 30-minute timer
may not be started in this case.
The timer can be used when the standby does not have all required resources, such as gateway reachability at present or port
status and gateway reachability in the future, to take over as Active.
Note
|
Another option is to not start the 30-minute timer in this situation. Use the timer only if the standby does not have all
the required resources to take over as active. Currently, this refers to gateway reachability; in the future, it may also
include port status and gateway reachability.
|
|
14
|
Down
|
P-Unreachable
|
G-Reachable
|
G-Unreachable
|
No SSO
|
A double fault can result in two active controllers. The old Active stays Active, and the Standby may become Active if connectivity
is not restored within a set time. Assume that the Standby is in standby-recovery because of GW loss, that the RMI then goes down,
and that the RP goes down after it. The stack manager requests a role change. Because the RMI is down, the Standby cannot consult
the Active, so it allows the role change. If the Active crashed, it restarts as Standby; if both controllers come up, a split-brain
conflict may occur.
|
Assume that the Standby is in Standby-Recovery mode because it has lost the GW.
Assume also that the RMI goes down first and then the RP goes down.
The stack manager requests a role change when the RP goes down. Because the RMI is down, the standby cannot consult the
Active, so the standby allows the role change.
Spring Back:
If the RMI returns, the old Active enters Active-Recovery. It reboots when the RP returns and becomes Standby.
|
15
|
Down
|
P-Unreachable
|
G-Unreachable
|
G-Reachable
|
SSO
|
A double fault can result in two active controllers. The Standby becomes active, but the old Active may still exist. Role negotiation
occurs once connectivity is restored. Assume that the Active loses the GW, then the RMI goes down, and then the RP goes down. The
stack manager requests a role change. Because the RMI is down, the standby allows the role change if it is in standby-hot; otherwise,
it reloads. If the RP link goes down before the standby reaches standby-hot, the standby reloads.
Spring Back:
If RMI returns, old Active goes to Active-Recovery and reboots on RP return to become Standby.
|
Suppose the Standby is in Standby-Recovery mode after losing GW. Assume the RMI goes down first, then the RP goes down. The
stack manager requests a role change when the RP goes down. Because the RMI is down, the Standby cannot consult with the Active,
so it allows the role change. If the Active went down due to a software glitch, it will come up and become Standby. If no
communication is established between the two controllers, both may become active, causing a network conflict.
Spring Back:
If the RMI comes up at any point, the old Active goes to Active-Recovery. Active-Recovery reboots when the RP
comes up and becomes Standby.
|
16
|
Down
|
P-Unreachable
|
G-Unreachable
|
G-Unreachable
|
No SSO
|
A double fault can result in two active controllers. The old Active remains Active, and the Standby may become Active if connectivity
is not restored within a stipulated time.
Assume that both controllers lose the GW and the Standby is in standby-recovery, and that the RMI then goes down, followed by the RP.
The stack manager requests a role change. Because the RMI is down, the Standby allows the change, which can cause a conflict.
Spring Back:
If RMI returns, the old Active enters Active-Recovery and, when RP returns, reboots to become Standby.
|
Assume that both Active and Standby lose GW, and Standby enters Standby-Recovery. If RMI goes DOWN first, followed by RP going
DOWN, the stack manager requests a role change when RP goes DOWN. Since RMI is DOWN, Standby cannot consult with Active and
allows the role change. This situation can cause a network conflict.
Spring Back:
If RMI comes UP at any point, the old Active transitions to Active-Recovery. Active-Recovery reboots when RP comes UP and
then becomes Standby.
|
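The switchover column of the table above can be condensed into a small lookup, which may help when checking a given combination of conditions against the expected outcome. The following Python sketch is illustrative only and is not part of the controller software. It assumes the four condition columns are, in order, RP link status, peer reachability over RMI, gateway reachability from the active, and gateway reachability from the standby (the order is inferred from the result text of rows 2 and 3). The names `SSO_ROWS`, `row_number`, and `switchover_expected` are hypothetical.

```python
# Row numbers whose switchover column reads "SSO" in the table above.
SSO_ROWS = frozenset({3, 7, 11, 13, 15})


def row_number(rp_up: bool, peer_reachable: bool,
               active_gw: bool, standby_gw: bool) -> int:
    """Return the table row (1-16) for a combination of the four conditions.

    Assumption: the columns are RP link, peer reachability over RMI,
    gateway from the active, and gateway from the standby, in that order.
    """
    # Each "down/unreachable" condition contributes one bit, RP link highest,
    # which matches the order in which the table enumerates its 16 rows.
    failed = (not rp_up, not peer_reachable, not active_gw, not standby_gw)
    index = 0
    for bit in failed:
        index = (index << 1) | int(bit)
    return index + 1


def switchover_expected(rp_up: bool, peer_reachable: bool,
                        active_gw: bool, standby_gw: bool) -> bool:
    """Return True if the table lists SSO for this combination."""
    return row_number(rp_up, peer_reachable, active_gw, standby_gw) in SSO_ROWS
```

For example, `switchover_expected(True, True, False, True)` maps to row 3 (RP up, peer reachable, active has lost the gateway, standby still has it) and returns `True`. The lookup captures only the switchover column; the timers, Standby-Hot checks, and Spring Back behavior described in the result cells are not modeled.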