Cisco BPX/IGX/IPX WAN Software

Field Notice: Slow Connection Reroute Times in BPX8600 and IGX8400 Networks


June 7, 2002



Products Affected

Product

BPX8600 Networks

IGX8400 Networks

Problem Description

After a trunk or node failure event, connections may reroute slowly or fail to reroute at all. Affected connections can appear healthy even though they do not pass traffic.

Background

This anomaly has been observed only in networks running a mix of release 9.2 and 9.3 switch software.

Problem Symptoms

Broken user connections do not appear in dspcons -f output, yet they are not routed and do not pass data. Affected network users lose access to remote resources.

Workaround/Solution

The anomaly discussed in this field notice may be avoided by not maintaining a mixed switch software environment (9.2 and 9.3) in the same network for extended periods of time.

Solution: Upgrade the entire network to release 9.3 and monitor bug ID CSCdx55308 (registered customers only) for further solutions.

There are two methods for recovery should connections fail during the upgrade:

  1. Recommended for small networks. For purposes of this field notice, "small network" means 20 or fewer IGX8400 and BPX8600 nodes. Connections that are not listed as failed but are also not passing traffic can be recovered in most cases by using the rrtcon -pfail command, which reroutes all permanently failed connections. In the context of this field notice, "permanently failed" refers to connections that failed because they did not reroute. The -pfail option exists in release 9.3 and later releases; in previous releases, the rrtcon command did not have it. rrtcon is a SuperUser level command.

  2. Recommended for larger networks. For purposes of this field notice, "larger networks" means more than 20 IGX8400 and BPX8600 nodes. The following algorithm has been implemented and is being tested in Cisco's Large Scale Network Test. It has been designed with the primary goal of reducing the cluster lock effect.

How To Upgrade Software

Use the steps below:

  1. Collect dsprrgps (display reroute groups) for the entire network.

  2. Sort dsprrgps output in decreasing order of connections requiring reroute.

  3. If there are still connections requiring reroute:

    • If this is the first time dsprrgps is run, turn off routing on all nodes in the network. * (Use Service Level command off1 Cm_Rerouting option.)

    • For each node in the node list compiled in step 2:

      • Wait based on the quantity of conns that need to be rerouted (as determined by dsprrgps collection).

      • Disable routing on the node. ** (Use Service Level command option.)

    • Collect dsprrgps and regenerate the list of nodes with unrouted connections.

  4. Enable routing on all nodes in the network. (Use Service Level command on1 Cm_Rerouting option.)

  5. End

Note: (*) This should be optimized to only turn off routing on those nodes with conns to reroute.

Note: (**) While this step will guarantee that "cluster lock" cannot occur, it may be too aggressive and reduce overall performance. Under investigation.
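
Steps 1 and 2 above amount to building a per-node list ordered by how many connections still need to be rerouted. The following Python fragment is only an illustrative sketch of that ordering, not a Cisco tool. It assumes the per-node counts have already been transcribed by hand from dsprrgps output; the record type, function name, node names, and counts are all hypothetical, and the fragment does not parse dsprrgps output or issue any switch commands. Following the first note above, nodes with no connections to reroute are dropped from the list.

```python
from typing import List, NamedTuple


class NodeRerouteCount(NamedTuple):
    """One hand-transcribed summary row per node (hypothetical structure)."""
    node: str               # node name
    conns_to_reroute: int   # connections on this node still requiring reroute


def order_nodes_for_reroute(rows: List[NodeRerouteCount]) -> List[NodeRerouteCount]:
    """Keep only nodes with connections left to reroute, largest count first."""
    pending = [row for row in rows if row.conns_to_reroute > 0]
    return sorted(pending, key=lambda row: row.conns_to_reroute, reverse=True)


# Hypothetical sample values; real counts would come from dsprrgps on each node.
collected = [
    NodeRerouteCount("bpx1", 120),
    NodeRerouteCount("igx3", 0),
    NodeRerouteCount("igx2", 450),
    NodeRerouteCount("bpx4", 35),
]

node_list = order_nodes_for_reroute(collected)
for row in node_list:
    print(f"{row.node}: {row.conns_to_reroute}")
```

An empty result corresponds to the case in step 3 where no connections remain to be rerouted, so the procedure would move on to step 4.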

DDTS

To follow the bug ID link below and see detailed bug information, you must be a registered user and you must be logged in.

DDTS: CSCdx55308 (registered customers only)

Description: Slow rerouting time after trunk event

For More Information

If you require further assistance, or if you have any further questions regarding this field notice, please contact the Cisco Systems Technical Assistance Center (TAC) by one of the following methods: