provides a solution to avoid GTPU Path Failure when a burst of GTPU Error
Indication occurs. This enhancement is applicable only for SGSN.
Following a kernel crash and hardware failure (fabric corruption) in a Demux
card, the SGSN is unable to respond to Echo Requests from the GGSN. As a
result, the GGSN detects a path failure and cleans up a large number of
sessions.
However, the sessions remain active at the SGSN on the PSC3 cards where the
Session Manager is running. The SGSN sends uplink data for these sessions,
which triggers a flood of GTPU Error Indications (~6 to ~9 million) from the
GGSN to the SGSN.
At the same time, a Demux card migration is triggered in the SGSN to recover
from the kernel crash and hardware failure. After the migration is completed,
the SGSN restarts the Path Management Echo Requests. The GGSN, however, had
already started sending Echo Requests as soon as the new sessions were set up
at the GGSN. Because the two ends of the path restart Echo Requests at
different times, detection of a path failure between the SGSN and GGSN is
delayed if Echo Responses are not received for any reason.
Once the Demux card has recovered at the SGSN, the following are observed:
- A flood of GTPU Error Indication messages results in further packet drops at
  the SGSN.
- The Echo Requests cause another path failure at the GGSN.
- The Echo Responses cause a delayed path failure on the SGSN, along with loss
  of GTPU Error Indications at the SGSN.
This delay in path failure detection results in another flood of GTPU Error
Indications in response to SGSN uplink data for the active sessions that were
already cleaned up at the GGSN (those created after the first path failure).
This flood of GTPU Error Indications results in additional packet drops at the
SGSN. The cycle of cleaning up sessions and setting up new sessions continues
until the SGSN is restarted.
The issue is resolved by creating an additional midplane socket for GTPU Error
Indications, so that a flood of GTPU Error Indications does not affect Path
Management. A new midplane socket and new flows have been introduced for this
purpose. GTPU Echo Requests and Responses continue to be received at the
existing midplane sockets. The separate path for GTPU Error Indications
prevents Path Management issues towards the GGSN or towards the RNC and avoids
unwanted detection of path failures. This enhancement requires new flows to be
installed at the NPU.
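The effect of separating the two traffic types can be illustrated with a minimal Python sketch. It is not the SGSN implementation: the `BoundedQueue` class, the queue capacities, and the `dispatch` function are hypothetical stand-ins for the midplane sockets and NPU flows, and the packets use a simplified 8-octet header. Only the message type values (1 for Echo Request, 2 for Echo Response, 26 for Error Indication) are taken from the GTP-U specification, 3GPP TS 29.281.

```python
from collections import deque

# GTP-U message type values as defined in 3GPP TS 29.281.
ECHO_REQUEST = 1
ECHO_RESPONSE = 2
ERROR_INDICATION = 26


def gtpu_message_type(packet: bytes) -> int:
    """Return the message type, which is the second octet of the GTP-U header."""
    if len(packet) < 8:
        raise ValueError("GTP-U header is at least 8 octets")
    return packet[1]


class BoundedQueue:
    """Hypothetical stand-in for a receive buffer that drops on overflow."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.items = deque()
        self.dropped = 0

    def push(self, packet: bytes) -> None:
        if len(self.items) >= self.capacity:
            self.dropped += 1
        else:
            self.items.append(packet)


def dispatch(packets, path_mgmt: BoundedQueue, error_ind: BoundedQueue) -> None:
    """Steer Error Indications away from the path management queue."""
    for packet in packets:
        if gtpu_message_type(packet) == ERROR_INDICATION:
            error_ind.push(packet)
        else:
            path_mgmt.push(packet)


def make_packet(msg_type: int) -> bytes:
    # Simplified header for illustration: flags 0x30 (version 1, GTP),
    # message type, zero length, zero TEID. Real Echo messages also
    # carry a sequence number.
    return bytes([0x30, msg_type, 0, 0, 0, 0, 0, 0])


if __name__ == "__main__":
    # A burst of Error Indications arrives ahead of a single Echo Request.
    flood = [make_packet(ERROR_INDICATION)] * 1000 + [make_packet(ECHO_REQUEST)]
    path_mgmt, error_ind = BoundedQueue(100), BoundedQueue(100)
    dispatch(flood, path_mgmt, error_ind)
    print("echo queued:", len(path_mgmt.items))
    print("echo dropped:", path_mgmt.dropped)
    print("error indications dropped:", error_ind.dropped)
```

In this sketch the Error Indication queue overflows and drops packets, but the Echo Request still reaches the path management queue. With a single shared queue of the same capacity, the Echo Request would have been dropped behind the burst.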
The following existing statistics help in observing packet loss and dropped
GTPU Error Indication packets:
show sgtpu statistics
Total Error Ind Rcvd: 0
Rcvd from GGSN: 0
Rcvd from RNC: 0
Rcvd from GGSN through RNC: 0
Rcvd from RNC through GGSN: 0
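When these counters are collected repeatedly, it can help to compare two snapshots and see how quickly Error Indications are arriving. The following Python sketch parses counter lines in the `name: value` form shown above. It assumes one counter per line, which matches this excerpt but may not hold for the full command output; the function names `parse_counters` and `delta` are hypothetical helpers, not part of any product tooling.

```python
import re

# Sample text copied from the excerpt above.
SAMPLE = """\
Total Error Ind Rcvd: 0
Rcvd from GGSN: 0
Rcvd from RNC: 0
Rcvd from GGSN through RNC: 0
Rcvd from RNC through GGSN: 0
"""

# Matches "<counter name>: <integer>" on a single line.
COUNTER = re.compile(r"^\s*(?P<name>[^:]+?)\s*:\s*(?P<value>\d+)\s*$")


def parse_counters(text: str) -> dict:
    """Return a mapping of counter name to integer value."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER.match(line)
        if match:
            counters[match.group("name")] = int(match.group("value"))
    return counters


def delta(before: dict, after: dict) -> dict:
    """Return the per-counter increase between two snapshots."""
    return {name: after[name] - before.get(name, 0) for name in after}


if __name__ == "__main__":
    first = parse_counters(SAMPLE)
    # Second snapshot with made-up values, for illustration only.
    second = parse_counters(SAMPLE.replace("Rcvd from GGSN: 0", "Rcvd from GGSN: 250"))
    for name, increase in delta(first, second).items():
        print(f"{name}: +{increase}")
```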
The following show commands are useful to verify the NPU related statistics:
To check the flow id range associated with sgtpcmgr, use the following command:
For ASR 5500:
show npumgr flow range
To check whether the flow corresponding to GTPU Error Indication is installed,
use the