Cisco IOS XR Troubleshooting Guide for the Cisco ASR 9000 Aggregation Services Router
Troubleshooting Fabric

Troubleshooting Router Switch Fabric and Data Path

Table Of Contents

Troubleshooting Router Switch Fabric and Data Path

Understanding Switch Fabric Architecture

Getting Started with Fabric Troubleshooting

Troubleshooting Packet Drops

Displaying Traffic Status in Line Cards and RSP Cards

Locating Packet Drops by Examining Counters

Locating Drops of Punted Packets

Packet Drop from LC to LC

Packet Drop Between RSP and LC

Packet Drop After Certain Actions

Packet Drop After a Redundancy Switchover

Packet Drop with Unknown Reason

Troubleshooting RSP and LC Crashes

Active RSP Is Crashing

Standby RSP Is Crashing

LC Is Crashing

Troubleshooting Complete Loss of Traffic

No Traffic from LC to LC

No Traffic Between RSP and LC

Gathering Fabric Information Before Calling TAC


Troubleshooting Router Switch Fabric and Data Path


This chapter describes techniques to troubleshoot router switch fabric and data path. It includes the following sections:

Understanding Switch Fabric Architecture

Getting Started with Fabric Troubleshooting

Troubleshooting Packet Drops

Troubleshooting RSP and LC Crashes

Troubleshooting Complete Loss of Traffic

Gathering Fabric Information Before Calling TAC

Understanding Switch Fabric Architecture

Figure 7-1 provides an overview of the switch fabric architecture.

Figure 7-1 Switch Fabric Architecture

As shown in Figure 7-1, there are two fabric interface ASICs on each RSP. Each fabric interface ASIC provides 40 GB of throughput. If one RSP is lost, the shelf can still operate at full capacity without loss of bandwidth.

Each line card (LC) has four 23 GB fabric channels on which to send traffic to the fabric ASICs. The switch fabric is in an active/active relationship. All four fabric ASICs are active, even though the RSP cards are in an active/standby relationship. The system performs load balancing on unicast traffic across these four channels.

The arbiters are in an active/standby relationship (the arbiter on the active RSP card is the active arbiter). Both the active and standby arbiters receive requests for switch fabric access from the LCs. If there is a switchover of the active RSP, the standby RSP arbiter has a current copy of switch fabric requests, which helps to speed up the switchover.

Figure 7-2 shows the data path from ingress to egress. (Several types of LCs are shown in this example.)

Figure 7-2 Data Path

As shown in the drawing, the path travelled by each data packet is:

Incoming interface on LC --> NP mapped to incoming interface on LC --> Bridge3 on LC --> FIA on LC --> Crossbar switch on RSP --> FIA on LC --> Bridge3 on LC --> NP mapped to outgoing interface --> Outgoing interface
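
The path above can be modeled as an ordered list of hops, which is the idea behind the counter-based procedures later in this chapter: compare the packet count at each hop with the count at the previous hop and note where the count first falls. The following Python sketch is illustrative only. The hop labels, the function name first_lossy_hop, and the counts are hypothetical, and the sketch assumes that you collected comparable per-hop packet counts for one flow after clearing the counters. Real counters at different hops can use different units (for example, the FIA reports SuperFrame counters), so treat this as a conceptual aid rather than a tool.

```python
# Hop labels follow the data path listed above. The counts used below are
# made-up illustrative values, not output from a real router.
DATA_PATH = [
    "ingress interface",
    "ingress NP",
    "ingress Bridge3",
    "ingress FIA",
    "RSP crossbar",
    "egress FIA",
    "egress Bridge3",
    "egress NP",
    "egress interface",
]


def first_lossy_hop(counts):
    """Return (previous_hop, hop, lost) for the first hop whose packet
    count is lower than the count at the hop before it, or None."""
    for prev, cur in zip(DATA_PATH, DATA_PATH[1:]):
        lost = counts[prev] - counts[cur]
        if lost > 0:
            return prev, cur, lost
    return None


# Hypothetical flow: 1000 packets enter, 10 are lost before the egress FIA.
example_counts = {hop: 1000 for hop in DATA_PATH}
for hop in DATA_PATH[DATA_PATH.index("egress FIA"):]:
    example_counts[hop] = 990

print(first_lossy_hop(example_counts))
```

The function reports only the first point of loss. After you correct that problem, repeat the comparison to find any later drop points.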


Note In this document, the network processor ASICs are referred to either as network processors (NPs) or network processor units (NPUs).


Getting Started with Fabric Troubleshooting

To begin troubleshooting problems with the fabric, perform the following steps.


Step 1 Look for active platform fault manager (PFM) alarms on the LCs and RSPs.

Step 2 Check that you have the appropriate version of the bridge field-programmable gate arrays (FPGAs) in your RSP card.

Step 3 Check that you have the correct software version, board, and FPGA and ASIC versions.

RP/0/RSP0/CPU0:router# show version 
RP/0/RSP0/CPU0:router# show inventory raw 
RP/0/RSP0/CPU0:router# show hw-module fpd location all
 
   

Step 4 Check if there are any errors detected by the system diagnostics.

RP/0/RSP0/CPU0:router# show diag 
 
   

Step 5 Check that you have the appropriate version of the NPs in your LCs.

RP/0/RSP0/CPU0:router# show controllers np summary all 
 
   
Node: 0/1/CPU0:
----------------------------------------------------------------
 [total 4 NP]  Driver - Version 10.26a Build 9 ( Dec 13 2008, 20:47:03 )
 NP 0 :  Hardware rev v2 A1
      :  Ucode - Version: 255.255 Build Date: ( Dec 12 2008, 2:13:00 ) 
 NP 1 :  Hardware rev v2 A1
      :  Ucode - Version: 255.255 Build Date: ( Dec 12 2008, 2:13:00 ) 
 NP 2 :  Hardware rev v2 A1
      :  Ucode - Version: 255.255 Build Date: ( Dec 12 2008, 2:13:00 ) 
 NP 3 :  Hardware rev v2 A1
      :  Ucode - Version: 255.255 Build Date: ( Dec 12 2008, 2:13:00 ) 
 
                Node: 0/2/CPU0:  <-- [ LC built with A0 NPU that has known issue ]
----------------------------------------------------------------
 [total 4 NP]  Driver - Version 10.26a Build 9 ( Dec 13 2008, 20:47:03 )
 NP 0 :  Hardware rev v2 A0
      :  Ucode - Version: 255.255 Build Date: ( Dec 12 2008, 2:13:00 ) 
 NP 1 :  Hardware rev v2 A0
      :  Ucode - Version: 255.255 Build Date: ( Dec 12 2008, 2:13:00 ) 
 NP 2 :  Hardware rev v2 A0
      :  Ucode - Version: 255.255 Build Date: ( Dec 12 2008, 2:13:00 ) 
 NP 3 :  Hardware rev v2 A0
      :  Ucode - Version: 255.255 Build Date: ( Dec 12 2008, 2:13:00 ) 
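
If you save the output of show controllers np summary all to a text file, a short script can list the nodes whose NPs report a given hardware revision, such as the A0 revision flagged in the example above. The following Python sketch is a minimal illustration that assumes the output layout shown in this step. The function names np_revisions and nodes_with_rev are hypothetical, and the sample string is a shortened copy of the example output.

```python
import re


def np_revisions(output):
    """Map each node name to a list of (np_index, hardware_rev) tuples."""
    nodes = {}
    current = None
    for line in output.splitlines():
        node = re.search(r"Node:\s*(\S+?):", line)
        if node:
            current = node.group(1)
            nodes[current] = []
            continue
        rev = re.match(r"\s*NP\s+(\d+)\s*:\s*Hardware rev\s+(.+?)\s*$", line)
        if rev and current is not None:
            nodes[current].append((int(rev.group(1)), rev.group(2)))
    return nodes


def nodes_with_rev(nodes, suffix="A0"):
    """Return the sorted node names that have at least one NP whose
    hardware revision string ends with the given suffix."""
    return sorted(
        node for node, nps in nodes.items()
        if any(rev.endswith(suffix) for _, rev in nps)
    )


# Shortened copy of the example output in this step.
sample = """\
Node: 0/1/CPU0:
 NP 0 :  Hardware rev v2 A1
 NP 1 :  Hardware rev v2 A1
Node: 0/2/CPU0:
 NP 0 :  Hardware rev v2 A0
 NP 1 :  Hardware rev v2 A0
"""

print(nodes_with_rev(np_revisions(sample)))
```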

Troubleshooting Packet Drops

This section explains how to track packets through the system from ingress to egress, and how to troubleshoot packet drops. It includes the following sections:

Displaying Traffic Status in Line Cards and RSP Cards

Locating Packet Drops by Examining Counters

Locating Drops of Punted Packets

Packet Drop from LC to LC

Packet Drop Between RSP and LC

Packet Drop After Certain Actions

Packet Drop After a Redundancy Switchover

Packet Drop with Unknown Reason

Displaying Traffic Status in Line Cards and RSP Cards

Figure 7-3 shows the traffic path on the LC and the corresponding CLI commands you use to display the status at each point in the path.

Figure 7-3 LC Traffic Path and Corresponding CLI Commands

Figure 7-4 shows the traffic path on the RSP and the corresponding CLI commands you use to display information at each point in the path.

Figure 7-4 RSP Traffic Path and Corresponding CLI Commands

Locating Packet Drops by Examining Counters

To locate the source of packet drops, perform the following procedure.

SUMMARY STEPS

1. Clear the interface counters.

2. Clear the NP counters.

3. Clear the fabric counters.

4. Start the traffic pattern that caused the packet drop.

5. Display the NP-to-interface mapping.

6. Check the counters at the input interface.

7. Check the NP counters.

8. Check the NP Bridge3 counters.

9. Check the bridge counters.

10. Check the fabric interface ASIC (FIA) counters.

11. Check the crossbar counters.


Note For the procedure to troubleshoot drops of punted packets, see the "Locating Drops of Punted Packets" section.


DETAILED STEPS


Step 1 Clear the interface counters.

RP/0/RSP0/CPU0:router# clear counters all
 
   
Clear "show interface" counters on all interfaces [confirm]
 
   

Step 2 Clear the NP counters.

RP/0/RSP0/CPU0:router# clear controller np counters all
 
   

Step 3 Clear fabric counters.

a. Clear FIA and bridge counters on the LC and RSP.

RP/0/RSP0/CPU0:router# clear controller fabric fia location
 
   

b. Clear fabric crossbar counters.

RP/0/RSP0/CPU0:router# clear controller fabric crossbar-counters location 
 
   

Step 4 Start the traffic pattern that caused the packet drop.

Step 5 Run the following command to display the NP-to-interface mapping.

RP/0/RSP0/CPU0:router# show controllers np ports all 
 
   

Step 6 Check the counters at the input interface.

RP/0/RSP0/CPU0:router# show interfaces type location
 
   

Step 7 Check the NP counters to verify that traffic is flowing in NP counters along the data path.

RP/0/RSP0/CPU0:router# show controllers np counters {np0|np1|np2|np3|all} location node-id [| include DROP]
 
   
RP/0/RSP0/CPU0:router# show controllers np counters np3 location 0/0/CPU0
 
   
RP/0/RSP0/CPU0:router# show controllers np counters np3 location 0/0/CPU0  | include DROP 
 
   

The show controllers np command displays information about counters that helps you troubleshoot drops in the LCs. The names of the internal NP counters have the general format STAGE_DIRECTION_ACTION, for example, PARSE_FABRIC_RECEIVE_CNT, RESOLVE_EGRESS_DROP_CNT, and MODIFY_FRAMES_PADDED_CNT.

The values of stage, direction, and action are as follows:

There are five stages in the NP:

Parse

Search-I

Modify

Search-II

Resolve

Examples of the direction are:

Ingress

Egress

Next_hop

Examples of the action are:

Drop_count

Down

There are additional counters, such as DROP, PUNT, and DIAGS, that provide important information but are not associated with a specific internal NP stage. Drop and punt counters are kept as an aggregate total per stage.
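
The naming convention described above lends itself to simple classification when you review a long counter list. The following Python sketch is illustrative only. It recognizes just the stage prefixes that appear in this chapter's examples (PARSE, RESOLVE, and MODIFY) and the INGRESS and EGRESS directions. The function name classify and the returned dictionary layout are hypothetical, and how other stages appear in counter names is not covered here.

```python
# Only prefixes that appear in this chapter's examples are listed here.
STAGE_PREFIXES = ("PARSE", "RESOLVE", "MODIFY")
DIRECTIONS = ("INGRESS", "EGRESS")


def classify(counter):
    """Split an NP counter name on underscores and report its stage,
    its direction, and whether it is a drop counter."""
    parts = counter.split("_")
    stage = parts[0] if parts[0] in STAGE_PREFIXES else None
    direction = next((p for p in parts[1:] if p in DIRECTIONS), None)
    return {"stage": stage, "direction": direction, "drop": "DROP" in parts}


for name in (
    "PARSE_FABRIC_RECEIVE_CNT",
    "RESOLVE_EGRESS_DROP_CNT",
    "DROP_IPV4_NEXT_HOP_DOWN",
):
    print(name, classify(name))
```

A counter such as DROP_IPV4_NEXT_HOP_DOWN is reported with no stage, which matches the description of counters that are not tied to a specific internal NP stage.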

Example

RP/0/RSP0/CPU0:router# show controllers np ports all 
 
   
Thu Jan  1 02:18:48.264 UTC                Node: 0/0/CPU0:
----------------------------------------------------------------
NP Bridge Fia                       Ports                      
-- ------ --- ---------------------------------------------------
0  1      0   GigabitEthernet0/0/0/30 - GigabitEthernet0/0/0/39 
1  1      0   GigabitEthernet0/0/0/20 - GigabitEthernet0/0/0/29 
2  0      0   GigabitEthernet0/0/0/10 - GigabitEthernet0/0/0/19 
3  0      0   GigabitEthernet0/0/0/0 - GigabitEthernet0/0/0/9 
 
   
 
   
RP/0/RSP0/CPU0:router# show interfaces tenGigE 0/1/0/0 
 
   
Thu Jan  1 01:10:01.908 UTC
TenGigE0/1/0/0 is up, line protocol is up 
  Interface state transitions: 1
  Hardware is TenGigE, address is 001e.bdfd.1736 (bia 001e.bdfd.1736)
  Layer 2 Transport Mode
  MTU 1514 bytes, BW 10000000 Kbit
     reliability 255/255, txload 0/255, rxload 0/255
  Encapsulation ARPA,
  Full-duplex, 10000Mb/s, LR, link type is force-up
  output flow control is off, input flow control is off
  loopback not set,
  Maintenance is enabled,
  ARP type ARPA, ARP timeout 04:00:00
  Last clearing of "show interface" counters never
  5 minute input rate 0 bits/sec, 0 packets/sec
  5 minute output rate 0 bits/sec, 0 packets/sec
     0 packets input, 0 bytes, 0 total input drops
     0 drops for unrecognized upper-level protocol
     Received 0 broadcast packets, 0 multicast packets
              0 runts, 0 giants, 0 throttles, 0 parity
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
     0 packets output, 0 bytes, 0 total output drops
     Output 0 broadcast packets, 0 multicast packets
     0 output errors, 0 underruns, 0 applique, 0 resets
     0 output buffer failures, 0 output buffers swapped out
     1 carrier transitions
 
   

In the following example, there were some ingress and egress drops in the RESOLVE stage. All of these drops in the ingress (9 drops) and egress (6 drops) were caused by the next hop being unreachable (a total of 15 drops for IPv4 next hop down).

RP/0/RSP0/CPU0:router# show controllers np counters np3 location 0/0/CPU0  | include DROP 
Mon Nov 15 12:18:35.289 EST
 
   
  30  RESOLVE_INGRESS_DROP_CNT                                   9           0
  31  RESOLVE_EGRESS_DROP_CNT                                    6           0
 295  DROP_IPV4_NEXT_HOP_DOWN                                   15           0
 
   

The following example shows a typical output from the same command, but without the modifier | include DROP.

RP/0/RSP0/CPU0:router# show controllers np counters np3 
Mon Nov 15 12:20:35.289 EST
 
   
                Node: 0/0/CPU0:
----------------------------------------------------------------
Show global stats counters for NP3, revision v3
 
   
Read 20 non-zero NP counters:
Offset  Counter                                         FrameValue   Rate (pps)
-------------------------------------------------------------------------------
  23  PARSE_FABRIC_RECEIVE_CNT                               417           0
  30  RESOLVE_INGRESS_DROP_CNT                                 9           0
  31  RESOLVE_EGRESS_DROP_CNT                                  6           0
  53  MODIFY_FRAMES_PADDED_CNT                              3230           0
  67  PARSE_MOFRR_SWITCH_MSG_RCVD_FROM_FAB                   920           0
  70  RESOLVE_INGRESS_L2_PUNT_CNT                           1081           0
  71  RESOLVE_EGRESS_L3_PUNT_CNT                            4613           0
  74  RESOLVE_LEARN_FROM_NOTIFY_CNT                         3484           0
  75  RESOLVE_BD_FLUSH_DELETE_CNT                            104           0
  83  RESOLVE_MOFRR_HASH_UPDATE_CNT                          463           0
  87  RESOLVE_MOFRR_SWITCH_MSG_INGNORED                      407           0
 111  DIAGS                                                  536           0
 295  DROP_IPV4_NEXT_HOP_DOWN                                 15           0
.
.
.
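
In the examples in this step, the per-stage drop totals (9 ingress plus 6 egress) add up to the single drop-reason counter (15 for DROP_IPV4_NEXT_HOP_DOWN). The following Python sketch shows one way to check that relationship on saved output. It is a minimal illustration that assumes the four-column layout shown above (offset, counter name, frame value, rate). The function name parse_np_counters is hypothetical, and whether the totals match in every case on a live system is not something this chapter states.

```python
import re

# offset, counter name, frame value, rate (pps)
LINE = re.compile(r"^\s*(\d+)\s+([A-Z0-9_]+)\s+(\d+)\s+(\d+)\s*$")


def parse_np_counters(output):
    """Return {counter_name: frame_value} for each counter line."""
    counters = {}
    for line in output.splitlines():
        match = LINE.match(line)
        if match:
            counters[match.group(2)] = int(match.group(3))
    return counters


# Copied from the "| include DROP" example in this step.
sample = """\
  30  RESOLVE_INGRESS_DROP_CNT                                   9           0
  31  RESOLVE_EGRESS_DROP_CNT                                    6           0
 295  DROP_IPV4_NEXT_HOP_DOWN                                   15           0
"""

counters = parse_np_counters(sample)
stage_drops = sum(v for k, v in counters.items() if k.endswith("_DROP_CNT"))
reason_drops = sum(v for k, v in counters.items() if k.startswith("DROP_"))
print(stage_drops, reason_drops)
```

If the two sums differ on your system, review each drop-reason counter individually instead of relying on the totals.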
 
   

Step 8 Check the NP Bridge3 counters.

RP/0/RSP0/CPU0:router# show controllers np fabric-counters all ?
 
   
  all  All NP instances
  np0  NP0 instance
  np1  NP1 instance
  np2  NP2 instance
  np3  NP3 instance
 
   
RP/0/RSP0/CPU0:router# show controllers np fabric-counters all <np instance or all> location <location>
 
   
RP/0/RSP0/CPU0:router# show controllers np fabric-counters all np3 location 0/5/CPU0
 
   

Check the NP-bridge rx/tx counters for each NP on the LC. View the packet sent and received counts, bytes transferred, packet counters categorized by packet size, and so forth. The fields of interest are:

xaui_a_t_transmited_packets_cnt: The number of packets sent by the NP to the bridge

xaui_a_r_received_packets_cnt: The number of packets sent by the bridge to the NP

Step 9 Check the bridge counters.

RP/0/RSP0/CPU0:router# show controllers fabric fia bridge stats location node-id
 
   

Examples

RP/0/RSP0/CPU0:router# show controllers fabric fia bridge stats location 0/RSP0/CPU0 
Mon Nov 22 14:14:48.010 PST
Device  Rx Interface      Packet              Error            Threshold        
                          Count               Drops            Drops            
--------------------------------------------------------------------------------
Bridge0 From-Fabric(DDR)  492283              0                0               
        From CPU          492283              0                0 
 
   
RP/0/RSP0/CPU0:router# show controllers fabric fia bridge stats location 0/1/CPU0
Mon Nov 22 14:18:54.834 PST
 
   
UC - Unicast , MC - Multicast
LP - LowPriority , HP - HighPriority
 
   
--------------------------------------------------------------------------------
                                 FIA  0
                                 ******
Cast/   Packet            Packet              Error            Threshold        
Prio    Direction         Count               Drops            Drops            
--------------------------------------------------------------------------------
 
   
Unicast Egress Stats
********************
UC HP   Fabric to NP-0    70329               0                0               
UC LP   Fabric to NP-0    0                   0                0               
UC HP   Fabric to NP-1    70329               0                0               
UC LP   Fabric to NP-1    0                   0                0               
UC HP   Fabric to NP-2    70329               0                0               
UC LP   Fabric to NP-2    0                   0                0               
UC HP   Fabric to NP-3    70329               0                0               
UC LP   Fabric to NP-3    0                   0                0               
----------------------------------------------------------------
UC      Total Egress      281316              0                0               
 
   
Multicast Egress Stats
*********************
MC HP   Fabric to NP-0    0                   0                0               
MC LP   Fabric to NP-0    0                   0                0               
MC HP   Fabric to NP-1    0                   0                0               
MC LP   Fabric to NP-1    0                   0                0               
MC HP   Fabric to NP-2    0                   0                0               
MC LP   Fabric to NP-2    0                   0                0               
MC HP   Fabric to NP-3    0                   0                0               
MC LP   Fabric to NP-3    0                   0                0               
---------------------------------------------------------------
MC      Total Egress      0                   0                0               
 
   
 
   
Cast/   Packet            Packet              
Prio    Direction         Count               
--------------------------------------------------
Unicast Ingress Stats
*********************
UC HP   NP-0 to Fabric    70329               
UC LP   NP-0 to Fabric    0                   
UC HP   NP-1 to Fabric    70329               
UC LP   NP-1 to Fabric    0                   
UC HP   NP-2 to Fabric    70329               
UC LP   NP-2 to Fabric    0                   
UC HP   NP-3 to Fabric    70329               
UC LP   NP-3 to Fabric    0                   
--------------------------------------------------
UC      Total Ingress     281316              
 
   
Multicast Ingress Stats
***********************
MC HP   NP-0 to Fabric    0                   
MC LP   NP-0 to Fabric    0                   
MC HP   NP-1 to Fabric    0                   
MC LP   NP-1 to Fabric    0                   
MC HP   NP-2 to Fabric    0                   
MC LP   NP-2 to Fabric    0                   
MC HP   NP-3 to Fabric    0                   
MC LP   NP-3 to Fabric    0                   
--------------------------------------------------
MC      Total Ingress     0                   
 
   
 
   
Ingress Drop Stats (MC & UC combined)
**************************************
PriorityPacket            Error            Threshold        
        Direction         Drops            Drops            
--------------------------------------------------
LP      NP-0 to Fabric    0                   0                   
HP      NP-0 to Fabric    0                   0                   
LP      NP-1 to Fabric    0                   0                   
HP      NP-1 to Fabric    0                   0                   
LP      NP-2 to Fabric    0                   0                   
HP      NP-2 to Fabric    0                   0                   
LP      NP-3 to Fabric    0                   0                   
HP      NP-3 to Fabric    0                   0                   
--------------------------------------------------
      Total IngressDrops  0                   0 
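
When the bridge statistics are long, it can help to total the per-NP rows and compare the result with the Total Egress line that the router prints. The following Python sketch is a minimal illustration that assumes the egress row layout shown above (cast, priority, direction, packet count, error drops, threshold drops). The function name egress_totals and the dictionary keys are hypothetical, and the sample is a shortened copy of the unicast egress rows in the example.

```python
import re
from collections import defaultdict

# cast, priority, NP index, packet count, error drops, threshold drops
ROW = re.compile(
    r"^(UC|MC)\s+(HP|LP)\s+Fabric to NP-(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$"
)


def egress_totals(output):
    """Sum the per-NP egress rows for each cast type (UC or MC)."""
    totals = defaultdict(
        lambda: {"packets": 0, "error_drops": 0, "threshold_drops": 0}
    )
    for line in output.splitlines():
        match = ROW.match(line)
        if match:
            cast = match.group(1)
            totals[cast]["packets"] += int(match.group(4))
            totals[cast]["error_drops"] += int(match.group(5))
            totals[cast]["threshold_drops"] += int(match.group(6))
    return dict(totals)


# Shortened copy of the unicast egress rows from the example above.
sample = """\
UC HP   Fabric to NP-0    70329               0                0
UC LP   Fabric to NP-0    0                   0                0
UC HP   Fabric to NP-1    70329               0                0
UC HP   Fabric to NP-2    70329               0                0
UC HP   Fabric to NP-3    70329               0                0
UC      Total Egress      281316              0                0
"""

print(egress_totals(sample))
```

The Total Egress line does not match the row pattern, so it is not counted twice. In the example, the summed packet count equals the 281316 reported by the router.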
 
   
 
   

Step 10 Check the FIA counters.

RP/0/RSP0/CPU0:router# show controllers fabric fia stats location location
Examples:
RP/0/RSP0/CPU0:router# show controllers fabric fia stats location 0/RSP0/CPU0
 
   
Wed Aug 25 12:36:43.151 DST
 
   
FIA:0 DDR Packet counters:
=========================
From Punt  686545              
To Punt    582387              
 
   
FIA:0 SuperFrame counters:
=========================
To Unicast Xbar[0]     821335              
To Unicast Xbar[1]     0                   
To Unicast Xbar[2]     0                   
To Unicast Xbar[3]     0                   
To MultiCast Xbar[0]   7758                
To MultiCast Xbar[1]   0                   
To MultiCast Xbar[2]   15807               
To MultiCast Xbar[3]   0                   
 
   
From Unicast Xbar[0]   629854              
From Unicast Xbar[1]   0                   
From Unicast Xbar[2]   1                   
From Unicast Xbar[3]   0                   
From MultiCast Xbar[0] 2589                
From MultiCast Xbar[1] 0                   
From MultiCast Xbar[2] 2588                
From MultiCast Xbar[3] 0                   
 
   
FIA:0 Total Drop counters:
=========================
Ingress drop: 0                         
Egress drop: 2                        
Total drop: 2 
 
   
RP/0/RSP0/CPU0:router# show controllers fabric fia stats location 0/2/CPU0 
 
   
FIA:0 DDR Packet counters:
=========================
From Bridge#[0]  510                 
To Bridge  #[0]  510                 
From Bridge#[1]  510                 
To Bridge  #[1]  510                 
 
   
FIA:0 SuperFrame counters:
=========================
To Unicast Xbar[0]     19                  
To Unicast Xbar[1]     20                  
To Unicast Xbar[2]     0                   
To Unicast Xbar[3]     0                   
To MultiCast Xbar[0]   0                   
To MultiCast Xbar[1]   0                   
To MultiCast Xbar[2]   0                   
To MultiCast Xbar[3]   0                   
 
   
From Unicast Xbar[0]   19                  
From Unicast Xbar[1]   20                  
From Unicast Xbar[2]   0                   
From Unicast Xbar[3]   0                   
From MultiCast Xbar[0] 0                   
From MultiCast Xbar[1] 0                   
From MultiCast Xbar[2] 0                   
From MultiCast Xbar[3] 0                   
 
   
FIA:0 Total Drop counters:
=========================
Ingress drop: 0                         
Egress drop: 0                        
Total drop: 0                        
 
   
RP/0/RSP0/CPU0:router# show controllers fabric fia q-depth [location location] 
 
   
Thu Jan  1 02:16:37.227 UTC
FIA  0
------
Total Pkt queue depth count = 0
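
The SuperFrame counters in the FIA statistics show which crossbar links carried traffic in each direction. The following Python sketch extracts those counters from saved output and lists the links with a nonzero count. It is a minimal illustration that assumes the line layout shown in the examples above and output for a single FIA. The function names xbar_counters and active_links are hypothetical. If the output covers several FIAs, the keys collide, so split the output per FIA first.

```python
import re

XBAR = re.compile(
    r"^(To|From) (Unicast|MultiCast) Xbar\[(\d+)\]\s+(\d+)", re.MULTILINE
)


def xbar_counters(output):
    """Return {(direction, cast, xbar_index): superframe_count}."""
    return {
        (direction, cast, int(index)): int(count)
        for direction, cast, index, count in XBAR.findall(output)
    }


def active_links(counters):
    """Return the sorted keys whose SuperFrame count is nonzero."""
    return sorted(key for key, count in counters.items() if count > 0)


# Shortened copy of the LC example output (location 0/2/CPU0).
sample = """\
To Unicast Xbar[0]     19
To Unicast Xbar[1]     20
To Unicast Xbar[2]     0
From Unicast Xbar[0]   19
From Unicast Xbar[1]   20
From MultiCast Xbar[0] 0
"""

print(active_links(xbar_counters(sample)))
```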
 
   

Step 11 Check the crossbar counters to make sure there are no dropped packets.

RP/0/RSP0/CPU0:router# show controllers fabric crossbar statistics instance [0|1] location location
 
   
Example:
RP/0/RSP0/CPU0:router# show controllers fabric crossbar statistics instance 0 location 0/RSP0/CPU0
 
   
Location: 0/RSP0/CPU0 (physical slot 4)
  Asic Instance: 0
  Fabric info for node 0/RSP0/CPU0 (physical slot: 4)
 
   
  Dropped packets             :            mcast            unicast
  +---------------------------------------------------------------+
  Input buf bp pkts           :                0                  0
  Output buf bp pkts          :                0                  0
  Xbar timeout buf bp pkts    :                0                  0
  HOL drop pkts               :                0                  0
  Null POE drop pkts          :                0                  0
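
To confirm quickly that the crossbar reports no dropped packets, you can scan the Dropped packets table for any nonzero value. The following Python sketch is a minimal illustration that assumes the table layout shown above (row name, colon, mcast value, unicast value). The function names crossbar_drops and nonzero are hypothetical, and the nonzero HOL value in the sample is made up to show the result.

```python
import re

# row name, colon, mcast count, unicast count
ROW = re.compile(r"^\s*(.+?)\s*:\s+(\d+)\s+(\d+)\s*$")


def crossbar_drops(output):
    """Return {row_name: {"mcast": n, "unicast": n}} for each table row."""
    drops = {}
    for line in output.splitlines():
        match = ROW.match(line)
        if match:
            drops[match.group(1)] = {
                "mcast": int(match.group(2)),
                "unicast": int(match.group(3)),
            }
    return drops


def nonzero(drops):
    """Keep only the rows that report at least one dropped packet."""
    return {
        name: value for name, value in drops.items()
        if value["mcast"] or value["unicast"]
    }


# Layout copied from the example above; the value 3 is illustrative only.
sample = """\
  Dropped packets             :            mcast            unicast
  Input buf bp pkts           :                0                  0
  HOL drop pkts               :                0                  3
  Null POE drop pkts          :                0                  0
"""

print(nonzero(crossbar_drops(sample)))
```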
 
   

Locating Drops of Punted Packets

To locate drops of punted packets, perform the following procedure.

SUMMARY STEPS

1. Clear all packet counters

2. Start traffic

3. Check traffic counters at each component

4. Check NP counters for NP mapping to interface, and check NP0 for inject packet count

5. Check fabric-related counters

6. Check punt FPGA counters

DETAILED STEPS


Step 1 Clear all packet counters as described in the "Locating Packet Drops by Examining Counters" section.

Step 2 Start traffic.

Step 3 Check traffic counters at each component in the punted packet path. Use a procedure similar to the one described in the "Locating Packet Drops by Examining Counters" section. However, for punted packets, the data path is:

Incoming Interface --> NP --> LC CPU --> NP --> Bridge3 --> LC FIA --> RSP Crossbar --> Punt FPGA on RSP --> RSP CPU --> RSP FIA --> RSP Crossbar --> LC FIA --> LC CPU --> NP0 --> LC FIA --> Crossbar --> RSP FIA --> RSP CPU

Step 4 Check the NP counters for NP mapping to interface, and check NP0 for the inject packet count. The following fields provide information on the NP counters:

801 PARSE_FABRIC_RECEIVE_CNT

820 PARSE_LC_INJECT_TO_FAB_CNT

872 RESOLVE_INGRESS_L2_PUNT_CNT

970 MODIFY_FABRIC_TRANSMIT_CNT

822 PARSE_FAB_INJECT_IPV4_CNT

Step 5 Check the fabric-related counters for any packet drops.

RP/0/RSP0/CPU0:router# show controllers fabric crossbar statistics instance 0 location 0/RSP0/CPU0
 
   
RP/0/RSP0/CPU0:router# show controllers fabric fia stats [location location]
 
   
Example: 
RP/0/RSP0/CPU0:router# show controllers fabric fia stats location 0/5/CPU0
 
   
RP/0/RSP0/CPU0:router# show controllers fabric fia bridge stats [location location]
 
   
Examples:
RP/0/RSP0/CPU0:router# show controllers fabric fia bridge stats location 0/RSP0/CPU0
 
   
Wed Aug 25 14:12:03.916 DST
 
   
Device  Rx Interface      Packet              Error            Threshold        
                          Count               Drops            Drops            
--------------------------------------------------------------------------------
Bridge0 From-Fabric(DDR)  603698              0                0               
        From CPU          711734              0                0 
 
   
RP/0/RSP0/CPU0:router# show controllers fabric fia bridge stats location 0/5/CPU0
 
   
Wed Aug 25 14:12:20.867 DST
 
   
UC - Unicast , MC - Multicast
LP - LowPriority , HP - HighPriority
 
   
--------------------------------------------------------------------------------
                                 FIA  0
                                 ******
Cast/   Packet            Packet              Error            Threshold        
Prio    Direction         Count               Drops            Drops            
--------------------------------------------------------------------------------
 
   
Unicast Egress Stats
********************
UC HP   Fabric to NP-0    28                  0                0               
UC LP   Fabric to NP-0    0                   0                0               
UC HP   Fabric to NP-1    28                  0                0               
UC LP   Fabric to NP-1    0                   0                0               
UC HP   Fabric to NP-2    28                  0                0               
UC LP   Fabric to NP-2    0                   0                0               
UC HP   Fabric to NP-3    28                  0                0               
UC LP   Fabric to NP-3    0                   0                0               
----------------------------------------------------------------
UC      Total Egress      112                 0                0               
 
   
Multicast Egress Stats
*********************
MC HP   Fabric to NP-0    205                 0                0               
MC LP   Fabric to NP-0    2                   0                0               
MC HP   Fabric to NP-1    205                 0                0               
MC LP   Fabric to NP-1    2                   0                0               
MC HP   Fabric to NP-2    205                 0                0               
MC LP   Fabric to NP-2    2                   0                0               
MC HP   Fabric to NP-3    205                 0                0               
MC LP   Fabric to NP-3    2                   0                0               
---------------------------------------------------------------
MC      Total Egress      828                 0                0               
 
   
--More--
 
   

Step 6 To check for packets punted to and injected from the LC or RSP CPU, run the following commands.

RP/0/RSP0/CPU0:router# show spp interface location node-id
 
   
RP/0/RSP0/CPU0:router# show spp node-counters location node-id
 
   
RP/0/RSP0/CPU0:router# show spp node location node-id
 
   
RP/0/RSP0/CPU0:router# show spp sid stats location node-id
 
   
RP/0/RSP0/CPU0:router# show spp client location node-id
 
   

Note To clear the spp counters, run the command clear spp {client | interface | node-counters} location node-id. This command clears client statistics, interface statistics, and per-node counters, depending on the keyword you use.


Step 7 To query the punt switch for the statistics on the LC CPU, run the following command.

RP/0/RSP0/CPU0:router# show controllers punt-switch switch-stats location node-id
 
   

Packet Drop from LC to LC

In this scenario, you have configured the system, the RSP and LC have come up and are stable, and LC-to-LC traffic is passing, but some packets are dropped.

The possible causes are:

Traffic dropped at interface

Traffic dropped at NP3

Traffic dropped at bridge

Traffic dropped at the fabric I/O

Synchronization between the fabric I/O and the fabric NP or fabric arbiter NP has a problem

Traffic has wrong vqi

Oversubscribed traffic

Unknown failures

Locate the Problem and Take Corrective Action

Follow this procedure to locate the problem. After you locate the problem, take corrective action based on your findings. Corrective action might include, for example, configuration updates or hardware/software version upgrades.


Step 1 If not already done, perform the procedures in the "Getting Started with Fabric Troubleshooting" section to verify that you have the correct versions of the hardware and software.

Step 2 Collect the sync status of the fabric on the LC.

show controllers fabric fia link-status location <0/1/CPU0> 
 
   
show controllers fabric fia bridge ddr-status location <0/1/cpu0> 
 
   
show controllers fabric fia bridge sync-status location 0/1/cpu0 
 
   

Step 3 Collect configuration information.

show run 
 
   

Step 4 Dump the PFM errors on both the source and destination LCs.

show pfm location <0/1/cpu0>
 
   

Step 5 Collect the fabric I/O and bridge counters on both the source and destination cards.

show interfaces 
 
   
show controllers np counters all 
 
   
show controllers fabric fia stats location 0/1/CPU0 
 
   
show controllers fabric fia bridge stats location 0/1/CPU0 
 
   

Step 6 Collect redundancy information.

show redundancy
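
Because most of the commands in Steps 2 through 6 must be run once per card, it can help to generate the full list for the source and destination LCs before you start collecting. The following Python sketch is a convenience illustration only. The command strings are taken from the steps above, while the function name lc_to_lc_collection and the choice to run the Step 2 sync-status commands on both cards are assumptions. The sketch only builds the list; it does not connect to a router.

```python
# Commands from Steps 2, 4, and 5 that take a card location.
PER_CARD = [
    "show controllers fabric fia link-status location {loc}",
    "show controllers fabric fia bridge ddr-status location {loc}",
    "show controllers fabric fia bridge sync-status location {loc}",
    "show pfm location {loc}",
    "show controllers fabric fia stats location {loc}",
    "show controllers fabric fia bridge stats location {loc}",
]

# Commands from Steps 3, 5, and 6 that are run once.
GLOBAL = [
    "show run",
    "show interfaces",
    "show controllers np counters all",
    "show redundancy",
]


def lc_to_lc_collection(src, dst):
    """Return the ordered list of show commands for both LC locations."""
    commands = list(GLOBAL)
    # dict.fromkeys removes a duplicate location while keeping the order.
    for loc in dict.fromkeys([src, dst]):
        commands.extend(cmd.format(loc=loc) for cmd in PER_CARD)
    return commands


for command in lc_to_lc_collection("0/1/CPU0", "0/2/CPU0"):
    print(command)
```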
 
   

Where to Go Next

If you have not been able to locate or correct the problem, you might be able to clear it by performing the following steps. However, these steps might delete information that would help you perform additional troubleshooting with Cisco Technical Support. Some of the steps involve stopping or reducing traffic streams, which might not be appropriate on a deployed system. Consult with your network administrator before you perform any of these steps.


Caution Before you follow these next steps, consider contacting Cisco Technical Support. Some of these steps can cause loss of data that would be useful for future analysis and troubleshooting, or could cause loss of traffic.


Step 1 Perform `reset -h' at LC ROMMON and reboot the LC again to see if this clears the problem.

Step 2 Pull out the LC and reinsert it to see if it can boot up.

Step 3 Stop other streams of traffic to see if this failed stream can go through.

Step 4 Reduce the rate of the traffic to see if the drop continues.


Packet Drop Between RSP and LC

In this scenario, you have configured the system, and the RSP and LC have come up and are stable, but one of the following problems occurred:

Protocol or ping traffic (punt path traffic) has some drops.

Initially, the ping or protocol packets do not go through, but they later recover.

The possible causes are:

Traffic dropped at interface

Traffic dropped at NP3

Traffic dropped at bridge

Traffic dropped at the fabric I/O

Sync between the fabric I/O and the fabric NP or fabric arbiter NP has a problem

Traffic has wrong vqi

Traffic drop at Punt FPGA

sn database sync issue

Unknown failures

Locate the Problem and Take Corrective Action

Follow this procedure to locate the problem. After you locate the problem, take corrective action based on your findings. Corrective action might include, for example, configuration updates or hardware/software version upgrades.


Step 1 If not already done, perform the procedures in the "Getting Started with Fabric Troubleshooting" section to verify that you have the correct versions of the hardware and software.

Step 2 Collect the sync status of the fabric on the LC.

show controllers fabric fia link-status location <0/1/CPU0> 
 
   
show controllers fabric fia bridge ddr-status location <0/1/cpu0> 
 
   
show controllers fabric fia bridge sync-status location 0/1/cpu0 
 
   

Step 3 Collect configuration information.

show run 
 
   

Step 4 Dump the PFM errors for the card.

show pfm location <0/1/cpu0> 
 
   
show pfm location <0/rsp0/cpu0> 
 
   

Step 5 Collect the fabric I/O/bridge counters on both RSP and LC.

show interfaces 
 
   
show controllers np counters all 
 
   
show controllers fabric fia stats location 0/1/CPU0 
 
   
show controllers fabric fia bridge stats location 0/1/CPU0 
 
   
show controllers fabric fia stats location 0/rsp0/CPU0 
 
   

Where to Go Next

If you have not been able to locate or correct the problem, you might be able to clear it by performing the following steps. However, these steps might delete information that would help you perform additional troubleshooting with Cisco Technical Support. Some of the steps involve stopping or reducing traffic streams, which might not be appropriate on a deployed system. Consult with your network administrator before you perform any of these steps.


Caution Before you follow these next steps, consider contacting Cisco Technical Support. Some of these steps can cause loss of data that would be useful for future analysis and troubleshooting, or could cause loss of traffic.


Step 1 Perform 'reset -h' at the LC ROMMON and reboot the LC again to see if this clears the problem.

Step 2 Pull out the LC and reinsert it to see if it can boot up.

Step 3 Stop other streams of traffic to see if this failed stream can go through.

Step 4 Determine whether the drop is a single burst in the beginning or is continuous.

Step 5 Determine if the drop is associated with a particular packet size.
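To decide whether a drop is a single burst at the beginning or is continuous (Step 4), you can sample a cumulative drop counter at regular intervals and look at where the increases occur. The following Python sketch illustrates the idea. The function name, the sample values, and the rule that "increases confined to the first half of the observation window count as an initial burst" are all assumptions made for illustration, and the sketch assumes the counter is not cleared while sampling.

```python
def classify_drop_pattern(samples):
    """Classify cumulative drop-counter samples, oldest first.

    Returns "none", "initial burst", or "continuous".
    """
    if len(samples) < 3:
        raise ValueError("need at least three samples")
    # Increase of the counter over each sampling interval.
    deltas = [later - earlier for earlier, later in zip(samples, samples[1:])]
    if not any(delta > 0 for delta in deltas):
        return "none"
    # Index of the last interval in which the counter increased.
    last_increase = max(i for i, delta in enumerate(deltas) if delta > 0)
    # Assumption: drops that stop within the first half of the window are a burst.
    if last_increase < len(deltas) / 2:
        return "initial burst"
    return "continuous"


print(classify_drop_pattern([0, 50, 50, 50, 50]))
print(classify_drop_pattern([0, 10, 20, 30, 40]))
```

The same sampling can be repeated with different packet sizes to see whether the pattern changes with size (Step 5).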


Packet Drop After Certain Actions

In this scenario, the system is configured, the RSP and LC have come up, and traffic has been flowing properly for some time. However, after a certain action, such as a configuration change, online insertion and removal (OIR) of an LC or RSP, an LC reload, or a software upgrade, some traffic drop or complete traffic loss is observed.

The possible causes are:

Traffic dropped at interface

Traffic dropped at NP3

Traffic dropped at bridge

Traffic dropped at the fabric I/O

Sync between the fabric I/O and the fabric NP or fabric arbiter NP has a problem

Traffic has wrong vqi

Traffic dropped at Punt FPGA

sn database sync issue

Unknown failures

Locate the Problem and Take Corrective Action

Follow this procedure to locate the problem. After you locate the problem, take corrective action based on your findings. Corrective action might include, for example, configuration updates or hardware/software version upgrades.


Step 1 Perform the procedures in the "Getting Started with Fabric Troubleshooting" section to verify that you have the correct versions of the hardware and software.

Step 2 Collect the sync status of fabric on the linecard.

show controllers fabric fia link-status location <0/1/CPU0> 
 
   
show controllers fabric fia bridge ddr-status location <0/1/cpu0> 
 
   
show controllers fabric fia bridge sync-status location 0/1/cpu0 
 
   

Step 3 Collect configuration information.

show run 
 
   

Step 4 Dump the PFM errors for the card.

show pfm location <0/1/cpu0> 
 
   
show pfm location <0/rsp0/cpu0> 
 
   

Step 5 Collect the fabric I/O/bridge counters on both the RSP and LC.

show interfaces 
 
   
show controllers np counters all 
 
   
show controllers fabric fia stats location 0/1/CPU0 
 
   
show controllers fabric fia bridge stats location 0/1/CPU0 
 
   
show controllers fabric fia stats location 0/rsp0/CPU0 
 
   

Step 6 Collect redundancy information.

show redundancy
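When several actions were performed in sequence, it can help to line up the time of each action against periodic samples of a drop counter to see which action most closely preceded the first drop. The following Python sketch is one hypothetical way to do that. The function name, the timestamps, and the action descriptions are invented for illustration, and both input lists are assumed to be sorted by time.

```python
from bisect import bisect_right


def action_before_first_drop(actions, samples):
    """Return the last action at or before the first sample that shows new drops.

    actions: list of (timestamp, description), sorted by timestamp.
    samples: list of (timestamp, cumulative_drop_count), sorted by timestamp.
    Returns None if no increase is seen or no action precedes it.
    """
    action_times = [timestamp for timestamp, _ in actions]
    for (_, previous), (when, current) in zip(samples, samples[1:]):
        if current > previous:
            # Number of actions with a timestamp <= the sample time.
            index = bisect_right(action_times, when)
            return actions[index - 1][1] if index else None
    return None


actions = [(100, "configuration change"), (200, "LC reload"), (300, "software upgrade")]
samples = [(90, 0), (150, 0), (210, 0), (250, 40), (310, 40)]
print(action_before_first_drop(actions, samples))
```

A match found this way is only a starting point. Repeating the action and confirming that the drop reappears remains the stronger evidence.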
 
   

Where to Go Next

If you have not been able to locate or correct the problem, you might be able to clear it by performing the following steps. However, these steps might delete information that would help you perform additional troubleshooting with Cisco Technical Support. Some of the steps involve stopping or reducing traffic streams, which might not be appropriate on a deployed system. Consult with your network administrator before you perform any of these steps.


Caution Before you follow these next steps, consider contacting Cisco Technical Support. Some of these steps can cause loss of data that would be useful for future analysis and troubleshooting, or could cause loss of traffic.


Step 1 Perform 'reset -h' at the LC ROMMON and reboot the LC again to see if this clears the problem.

Step 2 Pull out the LC and reinsert it to see if it can boot up.

Step 3 Stop other streams of traffic to see if this failed stream can go through.

Step 4 Repeat Step 1 through Step 3 to determine whether the results are reproducible.


Packet Drop After a Redundancy Switchover

In this scenario, you have configured the system, RSP and LC have come up, and traffic is flowing properly for some time. However, after a switchover (by a command or OIR), you see some traffic drop or complete traffic loss.

The possible causes are:

Traffic dropped at interface

Traffic dropped at NP3

Traffic dropped at bridge

Traffic dropped at the fabric I/O

Sync between the fabric I/O and the fabric NP or fabric arbiter NP has a problem

Traffic has wrong vqi

Traffic dropped at Punt FPGA

sn database sync issue

Fabric is stuck

Unknown failures

Locate the Problem and Take Corrective Action

Follow this procedure to locate the problem. After you locate the problem, take corrective action based on your findings. Corrective action might include, for example, configuration updates or hardware/software version upgrades.


Step 1 Perform the procedures in the "Getting Started with Fabric Troubleshooting" section to verify that you have the correct versions of the hardware and software.

Step 2 Collect the sync status of fabric on the linecard before and after the switchover.

show controllers fabric fia link-status location <0/1/CPU0> 
 
   
show controllers fabric fia bridge ddr-status location <0/1/cpu0> 
 
   
show controllers fabric fia bridge sync-status location 0/1/cpu0 
 
   

Step 3 Collect configuration information.

show run 
 
   

Step 4 Dump the PFM errors for the card.

show pfm location <0/1/cpu0> 
 
   
show pfm location <0/rsp0/cpu0> 
 
   

Step 5 Collect the fabric I/O/bridge counters on both the RSP and LC.

show interfaces 
 
   
show controllers np counters all 
 
   
show controllers fabric fia stats location 0/1/CPU0 
 
   
show controllers fabric fia bridge stats location 0/1/CPU0 
 
   
show controllers fabric fia stats location 0/rsp0/CPU0 
 
   

Step 6 Collect redundancy information.

show redundancy
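Step 2 asks for the sync status before and after the switchover. If both outputs are saved as text, a line-by-line comparison shows what changed. The following Python sketch uses the standard `difflib` module for this. The function name and the sample text are placeholders, not real command output.

```python
import difflib


def changed_lines(before_text, after_text):
    """Return only the lines that were removed ("- ") or added ("+ ")."""
    diff = difflib.ndiff(before_text.splitlines(), after_text.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Placeholder text standing in for two saved captures of the same command.
before = "link 0 up\nlink 1 up\n"
after = "link 0 up\nlink 1 down\n"
for line in changed_lines(before, after):
    print(line)
```

Counters that increment normally also show up as changes, so this comparison is most useful on status output such as link or sync state.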
 
   

Where to Go Next

If you have not been able to locate or correct the problem, you might be able to clear it by performing the following steps. However, these steps might delete information that would help you perform additional troubleshooting with Cisco Technical Support. Some of the steps involve stopping or reducing traffic streams, which might not be appropriate on a deployed system. Consult with your network administrator before you perform any of these steps.


Caution Before you follow these next steps, consider contacting Cisco Technical Support. Some of these steps can cause loss of data that would be useful for future analysis and troubleshooting, or could cause loss of traffic.


Step 1 Stop other streams of traffic to see if this failed stream can go through again.

Step 2 Repeat Step 1 several times to determine if the result is reproducible.

Step 3 Perform a switchover back to the other side to determine whether both directions have the same traffic problems.

Step 4 After obtaining the necessary approvals from your network and system administrators (because this step will stop all traffic on this unit), reboot the entire system and check to see if it recovers.


Packet Drop with Unknown Reason

In this scenario, you have configured the system, RSP and LC have come up, and traffic is flowing properly for a significant time (at least several days). However, for an unknown reason, the system experiences traffic drops or complete traffic loss.

The possible causes are:

Traffic dropped at interface

Traffic dropped at NP3

Traffic dropped at bridge

Traffic dropped at the fabric I/O

Sync between the fabric I/O and the fabric NP or fabric arbiter NP has a problem

Traffic has wrong vqi

Traffic dropped at Punt FPGA

Fabric is stuck

Unknown failures

Locate the Problem and Take Corrective Action

Follow this procedure to locate the problem. After you locate the problem, take corrective action based on your findings. Corrective action might include, for example, configuration updates or hardware/software version upgrades.


Step 1 Perform the procedures in the "Getting Started with Fabric Troubleshooting" section to verify that you have the correct versions of the hardware and software.

Step 2 Collect the sync status of fabric on the linecard.

show controllers fabric fia link-status location <0/1/CPU0> 
 
   
show controllers fabric fia bridge ddr-status location <0/1/cpu0> 
 
   
show controllers fabric fia bridge sync-status location 0/1/cpu0 
 
   

Step 3 Dump the PFM errors for the card.

show pfm location <0/1/cpu0> 
 
   
show pfm location <0/rsp0/cpu0> 
 
   

Step 4 Collect the fabric I/O/bridge counters on both the RSP and LC.

show interfaces 
 
   
show controllers np counters all 
 
   
show controllers fabric fia stats location 0/1/CPU0 
 
   
show controllers fabric fia bridge stats location 0/1/CPU0 
 
   
show controllers fabric fia stats location 0/rsp0/CPU0 
 
   

Step 5 Collect redundancy information.

show redundancy
 
   

Step 6 Check for drops on the fabric I/O interface (FIA drop counters) on the LC in both the ingress (to fabric) and egress (from fabric) directions.

show controllers fabric fia drops egress location 
show controllers fabric fia drops ingress location 
show controllers fabric fia error egress location 
show controllers fabric fia error ingress location 
 
   

Step 7 Check for drops on the bridge. The counters cover a combination of high-priority (HP), low-priority (LP), unicast, multicast, DDR, and DDR-threshold packets. They are further segregated into critical and informational counters based on their severity. All Ethernet linecards have two bridges. Use the following command to obtain this information.

show controllers fabric fia bridge stats location <linecard location>
 
   

Step 8 Check if there are any drops on Punt FPGA on RSP.

show controllers fabric fia bridge stats location 0/RSP0/CPU0
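Because the FIA, bridge, and Punt FPGA counters are cumulative, a single capture does not show whether drops are still occurring. Taking two captures some time apart and comparing them isolates the counters that are actively incrementing. The following Python sketch assumes that each capture has already been reduced to a mapping of counter name to value. The function name and the counter names are hypothetical and do not reflect actual output fields.

```python
def incrementing_counters(before, after):
    """Return {name: increase} for every counter that grew between two snapshots."""
    increases = {}
    for name, new_value in after.items():
        # A counter missing from the first snapshot is treated as starting at 0.
        old_value = before.get(name, 0)
        if new_value > old_value:
            increases[name] = new_value - old_value
    return increases


# Hypothetical counter names, used only to show the comparison.
first = {"hp_uc_drop": 0, "lp_mc_drop": 12}
second = {"hp_uc_drop": 0, "lp_mc_drop": 30, "ddr_threshold_drop": 4}
print(incrementing_counters(first, second))
```

Counters that appear in the result are the ones worth reporting first. Counters with a nonzero but unchanged value may reflect an earlier event.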
 
   

Where to Go Next

If you have not been able to locate or correct the problem, you might be able to clear it by performing the following steps. However, these steps might delete information that would help you perform additional troubleshooting with Cisco Technical Support. Some of the steps involve stopping or reducing traffic streams, which might not be appropriate on a deployed system. Consult with your network administrator before you perform any of these steps.


Caution Before you follow these next steps, consider contacting Cisco Technical Support. Some of these steps can cause loss of data that would be useful for future analysis and troubleshooting, or could cause loss of traffic.


Step 1 Stop other streams of traffic to see if this failed stream can go through again.

Step 2 Reboot the LCs one at a time and check if the traffic recovers.

Step 3 After obtaining the necessary approvals from your network and system administrators (because this step will stop all traffic on this unit), reboot the entire system and check to see if it recovers.

Step 4 Reconfigure the system to see if it recovers.


Troubleshooting RSP and LC Crashes

This section explains how to troubleshoot the following problems:

Active RSP Is Crashing

Standby RSP Is Crashing

LC Is Crashing

Active RSP Is Crashing

In this scenario, the active RSP keeps crashing and the RSP console shows that the active fabric manager or fia_rsp (the fabric I/O process) terminates repeatedly.

The possible causes are:

Initialization of the fabric I/O fails for some reason

Fabric self-test fails

The synchronization between the fabric I/O and the fabric NP or fabric arbiter NP has a problem

Unknown failures

Locate the Problem and Take Corrective Action

Follow this procedure to locate the problem. After you locate the problem, take corrective action based on your findings. Corrective action might include, for example, configuration updates or hardware/software version upgrades.


Step 1 Perform the procedures in the "Getting Started with Fabric Troubleshooting" section to verify that you have the correct versions of the hardware and software.

Step 2 Collect the sync status of fabric on the RSP card.

show controllers fabric fia link-status location <0/RSP0/CPU0> 
 
   
show controllers fabric fia bridge sync-status location
 
   
show controllers fabric fia bridge sync-status location 0/1/cpu0 
 
   

Step 3 Dump the PFM errors for the card.

show pfm location <0/rsp0/cpu0> 
 
   

Step 4 Collect the fabric I/O/Punt counters.

show controllers fabric fia stats location <0/rsp0/CPU0> 
 
   

Where to Go Next

If you have not been able to locate or correct the problem, you might be able to clear it by performing the following steps. However, these steps might delete information that would help you perform additional troubleshooting with Cisco Technical Support. Some of the steps involve stopping or reducing traffic streams, which might not be appropriate on a deployed system. Consult with your network administrator before you perform any of these steps.


Caution Before you follow these next steps, consider contacting Cisco Technical Support. Some of these steps can cause loss of data that would be useful for future analysis and troubleshooting, or could cause loss of traffic.


Step 1 Perform 'reset -h' at the ROMMON and reboot the RSP again to see if this clears the problem.

Step 2 Pull out the RSP and reinsert it to see if it can boot up.

Step 3 Swap the slot (put the RSP card into the other RSP slot) and see if it can boot up properly.


Standby RSP Is Crashing

In this scenario, the active RSP is up and running, but the standby RSP keeps crashing. The RSP console shows that the standby fabric manager or fia_rsp (the fabric I/O process) terminates repeatedly.

The possible causes are:

Initialization of the standby fabric I/O fails for some reason

Fabric self-test on the standby card fails

The sync between the fabric I/O and the fabric NP or fabric arbiter NP has a problem

Communication between the active and standby card is not working

Unknown failures

Locate the Problem and Take Corrective Action

Follow this procedure to locate the problem. After you locate the problem, take corrective action based on your findings. Corrective action might include, for example, configuration updates or hardware/software version upgrades.


Step 1 If not already done, perform the procedures in the "Getting Started with Fabric Troubleshooting" section to verify that you have the correct versions of the hardware and software.

Step 2 Collect the sync status of fabric on the RSP card.

show controllers fabric fia link-status location <0/RSP0/CPU0> 
 
   

Step 3 Dump the PFM errors for the card.

show pfm location <0/rsp0/cpu0> 
 
   

Step 4 Dump the redundancy status.

show redundancy
 
   

Step 5 Collect the fabric I/O/punt counters.

show controllers fabric fia stats location <0/1/CPU0> 
 
   

Where to Go Next

If you have not been able to locate or correct the problem, you might be able to clear it by performing the following steps. However, these steps might delete information that would help you perform additional troubleshooting with Cisco Technical Support. Some of the steps involve stopping or reducing traffic streams, which might not be appropriate on a deployed system. Consult with your network administrator before you perform any of these steps.


Caution Before you follow these next steps, consider contacting Cisco Technical Support. Some of these steps can cause loss of data that would be useful for future analysis and troubleshooting, or could cause loss of traffic.


Step 1 Perform 'reset -h' at the ROMMON and reboot the standby RSP again to see if this clears the problem.

Step 2 Pull out the RSP and reinsert it to see if it can boot up.

Step 3 Swap the slot (put the RSP card into the other RSP slot) and see if it can boot up properly.


LC Is Crashing

In this scenario, an LC keeps crashing and the RSP console shows that fia_lc (the fabric I/O process) terminates repeatedly.

The possible causes are:

Initialization of the LC fabric I/O fails for some reason

Fabric self-test on the LC fails

The synchronization between the fabric I/O and the fabric NP or fabric arbiter NP has a problem

Communication between the LC and the RSP is not working properly

There is a sync problem between the fabric I/O and the bridge

Unknown failures

Locate the Problem and Take Corrective Action

Follow this procedure to locate the problem. After you locate the problem, take corrective action based on your findings. Corrective action might include, for example, configuration updates or hardware/software version upgrades.


Step 1 If not already done, perform the procedures in the "Getting Started with Fabric Troubleshooting" section to verify that you have the correct versions of the hardware and software.

Step 2 Collect the sync status of the fabric on the LC.

show controllers fabric fia link-status location <0/1/CPU0> 
 
   
show controllers fabric fia bridge ddr-status location <0/1/cpu0> 
 
   
show controllers fabric fia bridge sync-status location 0/1/cpu0 
 
   

Step 3 Dump the PFM errors for the card.

show pfm location <0/1/cpu0> 
 
   

Step 4 Collect the fabric I/O/bridge counters.

show controllers fabric fia stats location <0/1/CPU0> 
 
   
show controllers fabric fia bridge stats location <0/1/CPU0> 

Where to Go Next

If you have not been able to locate or correct the problem, you might be able to clear it by performing the following steps. However, these steps might delete information that would help you perform additional troubleshooting with Cisco Technical Support. Some of the steps involve stopping or reducing traffic streams, which might not be appropriate on a deployed system. Consult with your network administrator before you perform any of these steps.


Caution Before you follow these next steps, consider contacting Cisco Technical Support. Some of these steps can cause loss of data that would be useful for future analysis and troubleshooting, or could cause loss of traffic.


Step 1 Perform 'reset -h' at the LC ROMMON and reboot the LC again to see if this clears the problem.

Step 2 Pull out the LC and reinsert it to see if it can boot up.

Step 3 Swap the slot (pull out the LC and insert it into another LC slot) and see if it can boot up properly.

Step 4 Insert a different LC of the same type to see if that card can boot up properly.


Troubleshooting Complete Loss of Traffic

This section explains how to troubleshoot scenarios in which the system is active but traffic does not go through. It includes the following topics:

No Traffic from LC to LC

No Traffic Between RSP and LC

No Traffic from LC to LC

In this scenario, you have configured the system and the RSP and LC have come up and are stable, but no LC-to-LC traffic is going through.

The possible causes are:

Traffic dropped at the interface

Traffic dropped at NP3

Traffic dropped at the bridge

Traffic dropped at the fabric I/O

Sync between the fabric I/O and the fabric NP or fabric arbiter NP has a problem

Traffic has wrong vqi

Unknown failures

Locate the Problem and Take Corrective Action

Follow this procedure to locate the problem. After you locate the problem, take corrective action based on your findings. Corrective action might include, for example, configuration updates or hardware/software version upgrades.


Step 1 Perform the procedures in the "Getting Started with Fabric Troubleshooting" section to verify that you have the correct versions of the hardware and software.

Step 2 Collect the sync status of fabric on the LC.

show controllers fabric fia link-status location <0/1/CPU0> 
 
   
show controllers fabric fia bridge ddr-status location <0/1/cpu0> 
 
   
show controllers fabric fia bridge sync-status location 0/1/cpu0 
 
   

Step 3 Collect configuration information.

show run 
 
   

Step 4 Dump the PFM errors for the card.

show pfm location <0/1/cpu0> 
 
   
show pfm location <0/rsp0/cpu0> 
 
   

Step 5 Collect the fabric I/O/bridge counters on both the source and destination cards.

show interfaces 
 
   
show controllers np counters all 
 
   
show controllers fabric fia stats location 0/1/CPU0 
 
   
show controllers fabric fia bridge stats location 0/1/CPU0 
 
   

Step 6 Collect redundancy information.

show redundancy
 
   

Where to Go Next

If you have not been able to locate or correct the problem, you might be able to clear it by performing the following steps. However, these steps might delete information that would help you perform additional troubleshooting with Cisco Technical Support. Some of the steps involve stopping or reducing traffic streams, which might not be appropriate on a deployed system. Consult with your network administrator before you perform any of these steps.


Caution Before you follow these next steps, consider contacting Cisco Technical Support. Some of these steps can cause loss of data that would be useful for future analysis and troubleshooting, or could cause loss of traffic.


Step 1 Perform 'reset -h' at the LC ROMMON and reboot the LC again to see if this clears the problem.

Step 2 Pull out the LC and reinsert it to see if it can boot up and carry traffic.

Step 3 Stop other streams of traffic to see if this failed stream can go through.

Step 4 Run online diagnostics to locate errors in the system. For additional information on diagnostics, see the "Using Diagnostic Commands" section on page 1-59.


No Traffic Between RSP and LC

In this scenario, you have configured the system and the RSP and LC have come up and are stable, but no protocol or ping traffic (punt path traffic) is going through.

The possible causes are:

Traffic dropped at the interface

Traffic dropped at NP3

Traffic dropped at the bridge

Traffic dropped at the fabric I/O

Sync between the fabric I/O and the fabric NP or fabric arbiter NP has a problem

Traffic has wrong vqi

Traffic dropped at the punt FPGA

Traffic dropped at the protocol level

Unknown failures

Locate the Problem and Take Corrective Action

Follow this procedure to locate the problem. After you locate the problem, take corrective action based on your findings. Corrective action might include, for example, configuration updates or hardware/software version upgrades.


Step 1 If not already done, perform the procedures in the "Getting Started with Fabric Troubleshooting" section to verify that you have the correct versions of the hardware and software.

Step 2 Collect the sync status of fabric on the LC.

show controllers fabric fia link-status location <0/1/CPU0> 
 
   
show controllers fabric fia bridge ddr-status location <0/1/cpu0> 
 
   
show controllers fabric fia bridge sync-status location 0/1/cpu0 
 
   

Step 3 Collect configuration information.

show run 
 
   

Step 4 Dump the PFM errors for the card.

show pfm location <0/1/cpu0> 
 
   
show pfm location <0/rsp0/cpu0> 
 
   

Step 5 Collect the fabric I/O/bridge counters on both the RSP and LC.

show interfaces 
 
   
show controllers np counters all 
 
   
show controllers fabric fia stats location 0/1/CPU0 
 
   
show controllers fabric fia bridge stats location 0/1/CPU0 
 
   
show controllers fabric fia stats location 0/rsp0/CPU0 
 
   

Step 6 Collect redundancy information.

show redundancy
 
   

Where to Go Next

If you have not been able to locate or correct the problem, you might be able to clear it by performing the following steps. However, these steps might delete information that would help you perform additional troubleshooting with Cisco Technical Support. Some of the steps involve stopping or reducing traffic streams, which might not be appropriate on a deployed system. Consult with your network administrator before you perform any of these steps.


Caution Before you follow these next steps, consider contacting Cisco Technical Support. Some of these steps can cause loss of data that would be useful for future analysis and troubleshooting, or could cause loss of traffic.


Step 1 Perform 'reset -h' at the LC ROMMON and reboot the LC again to see if this clears the problem.

Step 2 Pull out the LC and reinsert it to see if it can boot up and carry traffic.

Step 3 Pull out the RSP card and reinsert it to see if it can boot up and carry traffic.

Step 4 Stop other streams of traffic to see if this failed stream can go through.

Step 5 Run online diagnostics to locate errors in the system. For additional information on diagnostics, see the "Using Diagnostic Commands" section on page 1-59.


Gathering Fabric Information Before Calling TAC

If you need support from Cisco to troubleshoot the fabric, we recommend that you gather the following information if time permits:

Output of the following commands (these display the software version and the line card, fabric card, FPGA, and ASIC versions)

show version 
show inventory raw 
show diag 
show hw-module fpd location 
 
   

Information on chassis type

(admin) show inventory 
 
   

Platform-related information

show platform
 
   

Ingress interface(s), egress interface(s), and expected packet path

Drop counters

Logs (capture all logs on the RSP console port)
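Collecting the command output into a single file, with each section labeled by the command that produced it, makes the information easier to hand over. The following Python sketch shows one possible way to assemble such a bundle. The `run_command` callable is a hypothetical stand-in for whatever method you use to execute a command on the router and return its output as text. It is not an existing API. Commands that need a location argument or admin mode are omitted here and would have to be added for your system.

```python
def collect_outputs(commands, run_command):
    """Run each command through run_command and return one labeled text bundle."""
    sections = []
    for command in commands:
        output = run_command(command)
        # Label each section with the command that produced it.
        sections.append(f"===== {command} =====\n{output.rstrip()}\n")
    return "\n".join(sections)


# Commands from the list above that take no further arguments.
TAC_COMMANDS = ["show version", "show inventory raw", "show diag", "show platform"]


def fake_run_command(command):
    """Stand-in for a real session; returns placeholder text only."""
    return f"(output of {command})\n"


print(collect_outputs(TAC_COMMANDS, fake_run_command))
```

Passing the executor in as a parameter keeps the sketch independent of any particular connection method and makes it easy to test offline.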

Table 7-1