provides a solution to avoid GTPU Path Failure when a burst of GTPU Error
Indications occurs. This enhancement is applicable only to the SGSN.
Following a kernel crash and hardware failure (fabric corruption) in a Demux
card, the SGSN is unable to respond to Echo Requests from the GGSN. The GGSN
therefore detects a Path Failure and cleans up a large number of sessions.
However, these sessions are still active at the SGSN, on the PSC3 cards where
the Session Manager is running. The SGSN continues to send uplink data for
these sessions, which triggers a flood of GTPU Error Indications (~6 to ~9
million) from the GGSN to the SGSN.
Demux card migration is triggered in the SGSN to recover from the kernel crash
and hardware failure. After the migration is completed, the SGSN restarts the
Path Management Echo Requests. The GGSN, however, had already started sending
Echo Requests as soon as the new sessions were set up at the GGSN. Because the
two ends of the path restart their Echo Requests at different times, path
failure detection between the SGSN and GGSN is delayed if Echo Responses are
not received for any reason.
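The effect of the two ends restarting their Echo Requests at different times
can be illustrated with a small worked example. This is a simplified model, not
the SGSN or GGSN timer implementation: it assumes each end sends an Echo
Request at a fixed interval and declares path failure after a fixed number of
consecutive unanswered requests. The function name and all parameter values are
hypothetical.

```python
import math


def path_failure_detection_time(echo_start, interval, timeout,
                                max_missed, failure_time):
    """Time at which one end declares path failure (simplified model).

    echo_start   -- time this end sent its first Echo Request (assumed)
    interval     -- seconds between Echo Requests (assumed)
    timeout      -- seconds to wait for each Echo Response (assumed)
    max_missed   -- consecutive unanswered requests before failure (assumed)
    failure_time -- time at which the path stops carrying Echo messages
    """
    if failure_time <= echo_start:
        first_unanswered = echo_start
    else:
        # First Echo Request sent at or after the moment the path broke.
        cycles = math.ceil((failure_time - echo_start) / interval)
        first_unanswered = echo_start + cycles * interval
    # The last of the max_missed requests must also time out.
    return first_unanswered + (max_missed - 1) * interval + timeout


# Hypothetical values: the GGSN restarted its echoes at t=0, while the SGSN
# restarted 45 seconds later, after Demux card migration completed.
ggsn_detects = path_failure_detection_time(0, 60, 5, 3, failure_time=50)
sgsn_detects = path_failure_detection_time(45, 60, 5, 3, failure_time=50)
print(ggsn_detects, sgsn_detects, sgsn_detects - ggsn_detects)
```

With these assumed values, the end that restarted its Echo Requests later also
detects the failure later, and that gap is the window in which uplink data
keeps flowing for sessions the peer has already cleaned up.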
Once the Demux card has recovered at the SGSN, the following are observed:
A flood of GTPU Error Indication messages results in further packet drops at
the SGSN.
Dropped Echo Requests cause another path failure at the GGSN.
Dropped Echo Responses cause a delayed path failure on the SGSN, as well as
loss of GTPU Error Indications at the SGSN.
This delay in Path Failure detection results in another flood of GTPU Error
Indications, sent in response to SGSN uplink data for active sessions that were
already cleaned up at the GGSN (those created after the first path failure).
This flood of GTPU Error
Indications results in additional packet drops at the SGSN. The cycle of
cleaning up sessions and setting up new sessions continues until the SGSN is
The issue is resolved by creating an additional midplane socket, with its own
flows, dedicated to GTPU Error Indications, so that a flood of GTPU Error
Indication packets does not affect Path Management. GTPU Echo
Requests/Responses continue to be received at the existing midplane sockets.
The separate path for GTPU Error Indications prevents Path Management issues
towards the GGSN or the RNC and avoids unwanted detection of path failures.
This enhancement requires new flows to be installed at the NPU.
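The benefit of a dedicated receive path can be sketched as follows. This is an
illustrative model only, not the SGSN implementation: bounded in-memory queues
stand in for midplane sockets, and the class and function names are
hypothetical. The message type values (Echo Request 1, Echo Response 2, Error
Indication 26) are those defined for GTPv1-U; the packets built here are bare
8-octet headers with the sequence number omitted for brevity.

```python
import struct
from collections import deque

GTPU_ECHO_REQUEST = 1
GTPU_ECHO_RESPONSE = 2
GTPU_ERROR_INDICATION = 26


def gtpu_message_type(packet: bytes) -> int:
    """Return the message type (second octet) of a GTPv1-U header."""
    if len(packet) < 8:
        raise ValueError("GTP-U header is at least 8 octets")
    flags, msg_type, _length, _teid = struct.unpack("!BBHI", packet[:8])
    if flags >> 5 != 1:
        raise ValueError("not GTP version 1")
    return msg_type


class BoundedQueue:
    """Stand-in for a socket receive buffer that drops packets when full."""

    def __init__(self, capacity: int):
        self.items = deque()
        self.capacity = capacity
        self.dropped = 0

    def offer(self, packet: bytes) -> bool:
        if len(self.items) >= self.capacity:
            self.dropped += 1
            return False
        self.items.append(packet)
        return True


class Demux:
    """Routes GTP-U control messages to one shared or two separate queues."""

    def __init__(self, capacity: int, separate_error_path: bool):
        self.path_mgmt = BoundedQueue(capacity)
        # Without the enhancement, Error Indications share the same queue.
        self.error_ind = (BoundedQueue(capacity) if separate_error_path
                          else self.path_mgmt)

    def receive(self, packet: bytes) -> bool:
        msg_type = gtpu_message_type(packet)
        if msg_type == GTPU_ERROR_INDICATION:
            return self.error_ind.offer(packet)
        if msg_type in (GTPU_ECHO_REQUEST, GTPU_ECHO_RESPONSE):
            return self.path_mgmt.offer(packet)
        return False  # other message types are outside this sketch


def make_packet(msg_type: int) -> bytes:
    # Flags 0x30: version 1, protocol type GTP, no optional fields.
    return struct.pack("!BBHI", 0x30, msg_type, 0, 0)


def echo_survives_flood(separate_error_path: bool) -> bool:
    demux = Demux(capacity=100, separate_error_path=separate_error_path)
    for _ in range(1000):
        demux.receive(make_packet(GTPU_ERROR_INDICATION))
    return demux.receive(make_packet(GTPU_ECHO_REQUEST))


print(echo_survives_flood(False))  # shared queue: Echo Request is dropped
print(echo_survives_flood(True))   # separate queue: Echo Request is accepted
```

In the shared-queue case the Error Indication flood fills the buffer and the
Echo Request is dropped, which is the condition that leads to a path failure.
With a separate queue, the flood only costs Error Indications.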
The following existing statistics are helpful in observing packet loss and
drops of GTPU Error Indication packets:
[local]asr5000# show sgtpu statistics
Total Error Ind Rcvd: 0
Rcvd from GGSN: 0
Rcvd from RNC: 0
Rcvd from GGSN through RNC: 0
Rcvd from RNC through GGSN: 0
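For monitoring, these counters can be extracted from captured CLI output and
compared between two snapshots. The following is a minimal sketch, assuming
each counter appears on its own line in the `Name: value` form shown above; the
function names are hypothetical, and real output may contain additional
sections that this parser simply ignores.

```python
def parse_counters(output: str) -> dict:
    """Collect 'Name: <integer>' lines from captured CLI output."""
    counters = {}
    for line in output.splitlines():
        name, sep, value = line.rpartition(":")
        value = value.strip()
        # Skip prompts, headings, and anything without a numeric value.
        if sep and value.isdigit():
            counters[name.strip()] = int(value)
    return counters


def counter_growth(before: dict, after: dict) -> dict:
    """Per-counter increase between two snapshots of the same command."""
    return {name: after[name] - before.get(name, 0) for name in after}


SAMPLE = """\
[local]asr5000# show sgtpu statistics
Total Error Ind Rcvd: 0
Rcvd from GGSN: 0
Rcvd from RNC: 0
Rcvd from GGSN through RNC: 0
Rcvd from RNC through GGSN: 0
"""

snapshot = parse_counters(SAMPLE)
print(snapshot["Total Error Ind Rcvd"])
```

A rapidly growing `Total Error Ind Rcvd` value between snapshots would point to
an Error Indication burst of the kind described in this section.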
The following show commands are useful for verifying the NPU-related
statistics:
To check the flow ID range associated with sgtpcmgr, use the following command:
show npumgr flow range
To check whether the flow corresponding to GTPU Error Indication is installed,
use the following command:
[local]asr5000# show npu flow record min-flowid id max-flowid id slot no verbose