Cisco MDS 9100 Series Multilayer Fabric Switches

MDS 9148 Slow Drain Counters and Commands

Document ID: 116401

Updated: Oct 03, 2013

Contributed by Ed Mazurek, Cisco TAC Engineer.

Introduction

This document describes the commands and counters that increment on a Cisco MDS 9148 Multilayer Fabric Switch with a device that withholds R_RDY signals from the switch. This is typically called a slow drain device. The MDS 9148 is also known as Sabre.

Two tests were run:

  1. Slow port emulation with R_RDY delay of 1500000us (1.5 seconds)
  2. Port-monitor - slow port emulation with R_RDY delay of 1500000us (1.5 seconds)

Notes:

Use the Command Lookup Tool (registered customers only) in order to obtain more information on the commands used in this document.

The Output Interpreter Tool (registered customers only) supports certain show commands. Use the Output Interpreter Tool in order to view an analysis of show command output.

Topology

All ports are 4Gbps.

Single MDS 9148 switch running NX-OS 5.2(8)
                                       172.18.121.30
Agilent 103/3--fc1/13 rtp-san-23-02-9148  fc1/25--Agilent 103/2
fcid 0xe20200               NX-OS 5.2(8)                           fcid 0xe20300
Traffic------------------------------------------------------> slow drain device

rtp-san-23-02-9148# show version
Cisco Nexus Operating System (NX-OS) Software
TAC support: http://www.cisco.com/tac
Documents: http://www.cisco.com/en/US/products/ps9372/
tsd_products_support_series_home.html
Copyright (c) 2002-2012, Cisco Systems, Inc. All rights reserved.
The copyrights to certain works contained herein are owned by
other third parties and are used and distributed under license.
Some parts of this software are covered under the GNU Public
License. A copy of the license is available at
http://www.gnu.org/licenses/gpl.html.

Software
  BIOS:      version 1.0.19
  loader:    version N/A
  kickstart: version 5.2(8)
  system:    version 5.2(8)
  BIOS compile time:       02/01/10
  kickstart image file is: bootflash:///m9100-s3ek9-kickstart-mz.5.2.8.bin
  kickstart compile time:  12/25/2020 12:00:00 [12/07/2012 19:48:00]
  system image file is:    bootflash:///m9100-s3ek9-mz.5.2.8.bin
  system compile time:     11/9/2012 11:00:00 [12/07/2012 20:47:26]

Hardware
  cisco MDS 9148 FC (1 Slot) Chassis ("1/2/4/8 Gbps FC/Supervisor-3")
  Motorola, e500v2  with 1036300 kB of memory.
  Processor Board ID JAF1406ASTK

  Device name: rtp-san-23-02-9148
  bootflash:    1023120 kB
Kernel uptime is 4 day(s), 23 hour(s), 10 minute(s), 33 second(s)

Last reset at 26277 usecs after  Fri Jan  4 20:08:48 2013

  Reason: Reset due to upgrade
System version: 5.2(1)
Service:
rtp-san-23-02-9148#

Restrictions in Cisco NX-OS Software Releases

Cisco NX-OS Software Release 5.2(8)

These commands do not work. See Cisco Bug ID CSCud98114, "MDS9148 -show logging onboard  flow-control request-timeout - syntax err." This bug was fixed in Cisco NX-OS Software Release 6.2(1) and later.

  • show logging onboard flow-control request-timeout
  • show logging onboard flow-control pause-count
  • show logging onboard flow-control pause-events
  • show logging onboard flow-control timeout-drops - This command returns output for module 1 but also reports a syntax error.

These counters are listed in the fc-mac counters, but do not show up in the onboard failure logging (OBFL) error-stats. See Cisco Bug ID CSCud93587, "MDS9148 OBFL doesn't contain FCP_CNTR_TX_WT_AVG_B2B_ZERO." This bug is not resolved yet.

  • FCP_CNTR_TX_WT_AVG_B2B_ZERO
  • FCP_CNTR_RX_WT_AVG_B2B_ZERO

The slow drain port-monitor policy does not contain tx-credit-not-available. If you attempt to configure this counter, the error message "This counter is not supported on this platform" appears. No Simple Network Management Protocol (SNMP) traps are sent, and the show system internal snmp credit-not-available command does not return anything.

Cisco NX-OS Software Releases Earlier than 5.2(6)

These counters are not being generated. See Cisco Bug ID CSCts04123, "Slow drain support for atlantis/sabre." This bug was fixed in Cisco NX-OS Software Release 5.2(6) and later.

  • FCP_CNTR_TX_WT_AVG_B2B_ZERO
  • FCP_CNTR_RX_WT_AVG_B2B_ZERO

Test 1: Slow Port Emulation with R_RDY Delay of 1500000us (1.5 Seconds)

This is the procedure for a slow port emulation test with an R_RDY delay of 1500000us (1.5 seconds).

fc1/13 is the port connected to the sender, and fc1/25 is the port connected to the slow drain device.

Only a single test was run.

  1. Issue initial set of commands.
  2. Start Agilent traffic 103/3 > 103/2.
  3. Let it run for 30 seconds or so.
  4. Issue set of commands on rtp-san-23-02-9148.
  5. Wait 30 seconds.
  6. Issue set of commands on rtp-san-23-02-9148.
  7. Stop test.
  8. Gather show tech-support details.

rtp-san-23-02-9148 fc1/13 - Port Connected to Sender

Interface Counters - fc1/13

These commands were issued:

show interface fc1/13
show interface fc1/13 counters

These are the changes, if any:

input discards - 0
input OLS - 0
input LRR - 0
input NOS - 0

output discards - 0
output OLS - 0
output LRR - 0
output NOS - 0

transmit B2B credit transitions from zero - 0 - No change from previous value
receive B2B credit transitions from zero  - +7408
receive B2B credit remaining - 32 - No change from previous value
transmit B2B credit remaining- 128 - No change from previous value

Note: 'receive B2B credit transitions from zero' indicates that the MDS withheld B2B credits from the device connected to fc1/13. This allows the receive B2B credits to transition to zero, which prevents the attached device from sending while the count is at zero. There is no indication of time in this counter. In effect, the MDS applies back-pressure to the sender so that it sends fewer packets into the MDS.

show hardware internal errors - fc1/13

This command gives this example output:

show hardware internal fc-mac port 13 error-statistics

* -----------------------------------------------------------------------------
* Port Error Statistics for device Sabre-fcp
* dev inst: 0, dev intf: 44, port(s): 13
*
ADDRESS     STAT                                                   COUNT
__________ ________                                           __________________
0xffffffff FCP_CNTR_RX_WT_AVG_B2B_ZERO                                     0x1c

Note: This indicates that the MDS withheld B2B credits from the device connected to fc1/13 for at least 100ms. In effect, the MDS applies back-pressure to the sender so that it sends fewer packets into the MDS.

show hardware internal packet-flow dropped - fc1/13

There are no results applicable to port fc1/13.

show hardware internal packet-dropped-reason - fc1/13

There are no results applicable to port fc1/13.

show hardware internal statistics - fc1/13

This command gives this example output:

rtp-san-23-02-9148# show hardware internal statistics module 1

----------------------------------------
Hardware stats as reported in module 1
----------------------------------------
...
show hardware internal fc-mac port 13 statistics

* -----------------------------------------------------------------------------
* Port Statistics for device Sabre-fcp
* dev inst: 0, dev intf: 44, port(s): 13
*

ADDRESS     STAT                                         COUNT    60 sec Delta
__________ ________                            ___________  ____________
0x00000042 FCP_CNTR_MAC_CREDIT_IG_XG_MUX_SEND_RRDY_REQ   0x2b61       +0x2b61
0x00000061 FCP_CNTR_MAC_DATA_RX_CLASS3_FRAMES            0x2b61       +0x2b61
0x00000069 FCP_CNTR_MAC_DATA_RX_CLASS3_WORDS            0x16a9edc    +0x16a9edc
0x0000041d FCP_CNTR_RCM_RBBZ_CH0                         0x1cf0       +0x1cf0
0x0000041f FCP_CNTR_RCM_FRAME_CNT_CH0                    0x2b61       +0x2b61
0x0000031b FCP_CNTR_RHP_FRM                              0x2b61       +0x2b61
0xffffffff FCP_CNTR_RX_WT_AVG_B2B_ZERO                    0x1c2        +0x1c2
0x00000533 FCP_CNTR_TMM_CH0                               0x1f         +0x18
0x00000536 FCP_CNTR_TMM_LB                                0x1f         +0x18

Note: FCP_CNTR_RCM_RBBZ_CH0 is the same as 'receive B2B credit transitions from zero.'
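
The hex value in the fc-mac statistics can be cross-checked against the decimal interface counter. This is a minimal Python sketch that uses values copied from the fc1/13 output above; the variable names are illustrative only.

```python
# Values copied from the fc1/13 output above.
rbbz_ch0_hex = "0x1cf0"          # FCP_CNTR_RCM_RBBZ_CH0
interface_counter_delta = 7408   # receive B2B credit transitions from zero

# int() with base 16 accepts the "0x" prefix.
rbbz_ch0 = int(rbbz_ch0_hex, 16)

print(rbbz_ch0)                             # 7408
print(rbbz_ch0 == interface_counter_delta)  # True
```

The two values agree, which is consistent with the note that both counters report the same event.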

show logging onboard error-stats - fc1/13

There are no results applicable to port fc1/13. 

show logging onboard flow-control timeout-drops - fc1/13

There are no results applicable to port fc1/13.

show process creditmon credit-loss-events - fc1/13

There are no results applicable to port fc1/13.

show system internal snmp credit-not-available - fc1/13

There are no results applicable to port fc1/13. See the note on the slow drain port-monitor policy.

slot 1 show hardware internal fc-mac port 13 statistics

See show hardware internal statistics - fc1/13.

slot 1 show hardware internal fc-mac port 13 error-statistics

This command gives this example output:

rtp-san-23-02-9148# slot 1 show hardware internal fc-mac port 13 error-statistics

* -----------------------------------------------------------------------------
* Port Error Statistics for device Sabre-fcp
* dev inst: 0, dev intf: 44, port(s): 13

ADDRESS     STAT                                                   COUNT
__________ ________                                           __________________
0xffffffff FCP_CNTR_RX_WT_AVG_B2B_ZERO                                     0x1c2

slot 1 show hard internal credit-info port 13

This command gives this example output:

rtp-san-23-02-9148# slot 1 show hard internal credit-info port 13

    ======== Device Credit Information - RX ========
+------+------+----------------------+------------+---------+--------+
| PORT | SI/  |     DEVICE NAME      |  CREDITS   | CREDITS |   BW   |
|  NO  | PRIO |                      | CONFIGURED |  USED   |  MODE  |
+------+------+----------------------+------------+---------+--------+
|  13  |  0/0 |            Sabre-fcp |       0x20 |     0x0 |   Full |
+------+------+----------------------+------------+---------+--------+

    ======== Device Credit Information - TX ========
+------+------+----------------------+------------+---------+--------+
| PORT | SI/  |     DEVICE NAME      |  CREDITS   | CREDITS |   BW   |
|  NO  | PRIO |                      | CONFIGURED |  USED   |  MODE  |
+------+------+----------------------+------------+---------+--------+
|  13  |  0/0 |            Sabre-fcp |       0x80 |     0x0 |   Full |
+------+------+----------------------+------------+---------+--------+

slot 1 show port-config internal link-events

There are no results applicable to port fc1/13 since nothing went up or down.

rtp-san-23-02-9148 fc1/25 - Port Connected to Slow Drain Device

Interface Counters - fc1/25

These commands were issued:

show interface fc1/25
show interface fc1/25 counters

These are the changes, if any:

input discards - 0
input OLS - 0
input LRR - +57
input NOS - 0

output discards - 3808
output OLS - 0
output LRR - 0
output NOS - 0

transmit B2B credit transitions from zero +224
receive B2B credit transitions from zero  +57
receive B2B credit remaining - 32 - No change from previous value
transmit B2B credit remaining- 127 - -1

Note: 'transmit B2B credit transitions from zero' indicates that the device attached to fc1/25 withheld B2B credits from the MDS. This allows the MDS transmit B2B credits to transition to zero, which prevents the MDS from sending on this port while the count is at zero. There is no indication of time in this counter. In effect, the device applies back-pressure to the MDS so that it sends fewer packets to the attached device. This in turn causes back-pressure on the sending port, fc1/13.

show hardware internal errors - fc1/25

This command gives this example output:

show hardware internal fc-mac port 25 interrupt-counts

* -----------------------------------------------------------------------------
* Port Interrupt Counts for device Sabre-fcp
* dev inst: 0, dev intf: 10, port(s): 25
*

INTERRUPT                                                       COUNT    THRESH
_________                                                      ________  ______
IP_FCMAC_INTR_PRIM_RX_SEQ_LRR                                       114       0
IP_FCMAC_INTR_PRIM_RX_SIG_IDLE                                       57       0

show hardware internal fc-mac port 25 error-statistics

* -----------------------------------------------------------------------------
* Port Error Statistics for device Sabre-fcp
* dev inst: 0, dev intf: 10, port(s): 25
*
ADDRESS     STAT                                                   COUNT
__________ ________                                           __________________
0x0000052d FCP_CNTR_TMM_NORMAL_DROP                                        0xee0
0x00000539 FCP_CNTR_TMM_TIMEOUT                                            0xee0
0x00000540 FCP_CNTR_TMM_TIMEOUT_DROP                                       0xee0
0xffffffff FCP_CNTR_CREDIT_LOSS                                             0x39
0xffffffff FCP_CNTR_TX_WT_AVG_B2B_ZERO                                     0x23a

Note: Because the attached device delays R_RDY for 1.5 seconds, the MDS initiates credit loss recovery at 1 second. This involves sending a Link Reset (LR) and receiving a Link Reset Response (LRR). While the port is at 0 Tx credits, the MDS drops packets for this interface, as shown by the three DROP and TIMEOUT counters.

show hardware internal packet-flow dropped - fc1/25

This command gives this example output:

show hardware internal packet-flow dropped

        Module: 01      Dropped Packets: YES

        -------- Dropped Packet Flow Details --------

+------------------+------------------+-------------------------------------+
|    DEVICE NAME   |       PORTS      |            DROPPED COUNT            |
|                  |                  |      RX (Hex)    |      TX (Hex)    |
+------------------+------------------+-------------------------------------+
|        Sabre-fcp |               25 |                0 |              ee0 |
+------------------+------------------+-------------------------------------+

show hardware internal packet-dropped-reason - fc1/25

This command gives this example output:

rtp-san-23-02-9148# show hardware internal packet-dropped-reason

show hardware internal packet-dropped-reason

        Module: 01       Dropped Packets: YES

+-----------+---------------+-------------------+------------------------------+
|           |               |      DROPS        |                              |
|   PORTS   |  DEVICE NAME  |-------------------|          COUNTER NAME        |
|           |               | Rx(Hex) | Tx(Hex) |                              |
+-----------+---------------+---------+---------+------------------------------+
|25         |Sabre-fcp      |    -    |EE0      |FCP_CNTR_TMM_NORMAL_DROP      |
|           |               |    -    |EE0      |FCP_CNTR_TMM_TIMEOUT_DROP     |
|           |               |    -    |1dc0     |TOTAL                         |
+-----------+---------------+---------+---------+------------------------------+

show hardware internal statistics - fc1/25

This command gives this example output:

rtp-san-23-02-9148# show hardware internal statistics module 1

----------------------------------------
Hardware stats as reported in module 1
----------------------------------------
...
show hardware internal fc-mac port 25 statistics

* -----------------------------------------------------------------------------
* Port Statistics for device Sabre-fcp
* dev inst: 0, dev intf: 10, port(s): 25
*

ADDRESS     STAT                                          COUNT    60 sec Delta
__________ ________                            ___________  ____________
0x00000042 FCP_CNTR_MAC_CREDIT_IG_XG_MUX_SEND_RRDY_REQ       0x39         +0x39
0x00000043 FCP_CNTR_MAC_CREDIT_EG_DEC_RRDY                   0x39         +0x39
0x00000061 FCP_CNTR_MAC_DATA_RX_CLASS3_FRAMES                0x39         +0x39
0x00000069 FCP_CNTR_MAC_DATA_RX_CLASS3_WORDS               0x2010       +0x2010
0x0000041d FCP_CNTR_RCM_RBBZ_CH0                             0x39         +0x39
0x0000041f FCP_CNTR_RCM_FRAME_CNT_CH0                        0x39         +0x39
0x0000031b FCP_CNTR_RHP_FRM                                  0x39         +0x39
0x00000065 FCP_CNTR_MAC_DATA_TX_CLASS3_FRAMES              0x1cba       +0x1cba
0x0000006d FCP_CNTR_MAC_DATA_TX_CLASS3_WORDS             0xee666c     +0xee666c
0x00000514 FCP_CNTR_TMM_TBBZ_CH0                             0x70         +0x70
0x00000515 FCP_CNTR_TMM_TBBZ_CH1                             0x70         +0x70
0x0000052d FCP_CNTR_TMM_NORMAL_DROP                         0xee0        +0xee0
0x00000539 FCP_CNTR_TMM_TIMEOUT                             0xee0        +0xee0
0x00000540 FCP_CNTR_TMM_TIMEOUT_DROP                        0xee0        +0xee0
0x00000533 FCP_CNTR_TMM_CH0                                  0x58         +0x51
0x00000534 FCP_CNTR_TMM_CH1                                0x2b61       +0x2b61
0x00000536 FCP_CNTR_TMM_LB                                   0x1f         +0x18
0xffffffff FCP_CNTR_CREDIT_LOSS                              0x39         +0x39
0xffffffff FCP_CNTR_TX_WT_AVG_B2B_ZERO                      0x23a        +0x23a
0xffffffff FCP_CNTR_LRR_IN                                   0x39         +0x39
0xffffffff FCP_CNTR_LINK_RESET_OUT                           0x39         +0x39

Note: FCP_CNTR_TMM_TBBZ_CHx is the same as 'transmit B2B credit transitions from zero.'
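
The fc1/25 hex statistics can be compared with the decimal interface counters shown earlier. This is a minimal Python sketch with values copied from the output above. Summing the two TBBZ channels is an observation that matches this capture (+224), not a documented rule.

```python
# Values copied from the fc1/25 statistics output above.
stats = {
    "FCP_CNTR_TMM_TBBZ_CH0": "0x70",
    "FCP_CNTR_TMM_TBBZ_CH1": "0x70",
    "FCP_CNTR_TMM_TIMEOUT_DROP": "0xee0",
    "FCP_CNTR_LRR_IN": "0x39",
}

# Convert every hex string to an integer.
as_decimal = {name: int(value, 16) for name, value in stats.items()}

# Assumption: the interface counter is the sum of both channels.
tbbz_total = as_decimal["FCP_CNTR_TMM_TBBZ_CH0"] + as_decimal["FCP_CNTR_TMM_TBBZ_CH1"]

print(tbbz_total)                               # 224 (transmit B2B credit transitions from zero)
print(as_decimal["FCP_CNTR_TMM_TIMEOUT_DROP"])  # 3808 (output discards)
print(as_decimal["FCP_CNTR_LRR_IN"])            # 57 (input LRR)
```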

show logging onboard error-stats - fc1/25

This command gives this example output:

rtp-san-23-02-9148# show logging onboard starttime 01/10/13-00:00:00 error-stats

----------------------------
    Supervisor Module:
----------------------------
----------------------------
    Module:  1
----------------------------
--------------------------------------------------------------------------------
 ERROR STATISTICS INFORMATION FOR DEVICE ID 127 DEVICE Sabre-fcp
--------------------------------------------------------------------------------
    Interface      |                                |         |    Time Stamp  
      Range        |    Error Stat Counter Name     |  Count  |MM/DD/YY HH:MM:SS
                   |                                |         |                
--------------------------------------------------------------------------------
fc1/25             |FCP_CNTR_CREDIT_LOSS            |57       |01/10/13 20:36:21
fc1/25             |FCP_CNTR_TMM_TIMEOUT_DROP       |3808     |01/10/13 20:36:21
fc1/25             |FCP_CNTR_TMM_TIMEOUT            |3808     |01/10/13 20:36:21
fc1/25             |FCP_CNTR_TMM_NORMAL_DROP        |3808     |01/10/13 20:36:21
fc1/25             |FCP_CNTR_CREDIT_LOSS            |47       |01/10/13 20:36:11
fc1/25             |FCP_CNTR_TMM_TIMEOUT_DROP       |3196     |01/10/13 20:36:11
fc1/25             |FCP_CNTR_TMM_TIMEOUT            |3196     |01/10/13 20:36:11
fc1/25             |FCP_CNTR_TMM_NORMAL_DROP        |3196     |01/10/13 20:36:11
fc1/25             |FCP_CNTR_CREDIT_LOSS            |38       |01/10/13 20:36:01
fc1/25             |FCP_CNTR_TMM_TIMEOUT_DROP       |2584     |01/10/13 20:36:01
fc1/25             |FCP_CNTR_TMM_TIMEOUT            |2584     |01/10/13 20:36:01
fc1/25             |FCP_CNTR_TMM_NORMAL_DROP        |2584     |01/10/13 20:36:01
fc1/25             |FCP_CNTR_CREDIT_LOSS            |29       |01/10/13 20:35:51
fc1/25             |FCP_CNTR_TMM_TIMEOUT_DROP       |1972     |01/10/13 20:35:51
fc1/25             |FCP_CNTR_TMM_TIMEOUT            |1972     |01/10/13 20:35:51
fc1/25             |FCP_CNTR_TMM_NORMAL_DROP        |1972     |01/10/13 20:35:51

...and so on...

Note: OBFL is updated on this platform every ten seconds. In each interval, any counters that have incremented are captured and their current values are shown. For example, FCP_CNTR_CREDIT_LOSS (credit loss recovery) increased from 47 to 57 in 10 seconds. This is consistent with expected behavior, because credit loss recovery is initiated at most once per second while the MDS is at 0 Tx credits.
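
Because OBFL records running totals, the per-interval change must be computed by subtraction. This is a minimal Python sketch that does so for a few rows copied from the output above. The function name and the parsing approach are illustrative assumptions, not part of any Cisco tool.

```python
from collections import defaultdict

# Rows copied from the OBFL output above (newest first).
obfl_text = """\
fc1/25             |FCP_CNTR_CREDIT_LOSS            |57       |01/10/13 20:36:21
fc1/25             |FCP_CNTR_TMM_TIMEOUT_DROP       |3808     |01/10/13 20:36:21
fc1/25             |FCP_CNTR_CREDIT_LOSS            |47       |01/10/13 20:36:11
fc1/25             |FCP_CNTR_TMM_TIMEOUT_DROP       |3196     |01/10/13 20:36:11
fc1/25             |FCP_CNTR_CREDIT_LOSS            |38       |01/10/13 20:36:01
fc1/25             |FCP_CNTR_TMM_TIMEOUT_DROP       |2584     |01/10/13 20:36:01
"""

def interval_deltas(text):
    """Return {counter: [delta, ...]} with the newest interval first."""
    samples = defaultdict(list)
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        # Skip anything that is not a data row with a numeric count.
        if len(fields) != 4 or not fields[2].isdigit():
            continue
        _interface, counter, count, _timestamp = fields
        samples[counter].append(int(count))
    # Samples are newest first, so subtract each older neighbor.
    return {
        name: [newer - older for newer, older in zip(values, values[1:])]
        for name, values in samples.items()
    }

deltas = interval_deltas(obfl_text)
print(deltas["FCP_CNTR_CREDIT_LOSS"])       # [10, 9]
print(deltas["FCP_CNTR_TMM_TIMEOUT_DROP"])  # [612, 612]
```

In this capture, each ten-second interval shows about ten credit loss recoveries and 612 timeout drops.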

show logging onboard flow-control timeout-drops - fc1/25

This command gives this example output:

rtp-san-23-02-9148# show logging onboard flow-control timeout-drops

----------------------------
   Supervisor Module:
----------------------------
Syntax error while parsing show logging onboard module 1 flow-control timeout-drops

Cmd exec error.

----------------------------
    Module:  1
----------------------------

--------------------------------------------------------------------------------
 ERROR STATISTICS INFORMATION FOR DEVICE ID 127 DEVICE Sabre-fcp
--------------------------------------------------------------------------------
    Interface      |                                |         |    Time Stamp  
      Range        |    Error Stat Counter Name     |  Count  |MM/DD/YY HH:MM:SS
                   |                                |         |                
--------------------------------------------------------------------------------
fc1/25             |FCP_CNTR_TMM_TIMEOUT            |3808     |01/10/13 20:36:21
fc1/25             |FCP_CNTR_TMM_TIMEOUT            |3196     |01/10/13 20:36:11
fc1/25             |FCP_CNTR_TMM_TIMEOUT            |2584     |01/10/13 20:36:01
fc1/25             |FCP_CNTR_TMM_TIMEOUT            |1972     |01/10/13 20:35:51
fc1/25             |FCP_CNTR_TMM_TIMEOUT            |1360     |01/10/13 20:35:41
fc1/25             |FCP_CNTR_TMM_TIMEOUT            |748      |01/10/13 20:35:31
fc1/25             |FCP_CNTR_TMM_TIMEOUT            |136      |01/10/13 20:35:21
fc1/25             |FCP_CNTR_TMM_TIMEOUT            |3910     |01/10/13 20:11:51
fc1/25             |FCP_CNTR_TMM_TIMEOUT            |3638     |01/10/13 20:11:41
fc1/25             |FCP_CNTR_TMM_TIMEOUT            |3026     |01/10/13 20:11:31
fc1/25             |FCP_CNTR_TMM_TIMEOUT            |2414     |01/10/13 20:11:21
fc1/25             |FCP_CNTR_TMM_TIMEOUT            |1802     |01/10/13 20:11:11
fc1/25             |FCP_CNTR_TMM_TIMEOUT            |1156     |01/10/13 20:11:01
fc1/25             |FCP_CNTR_TMM_TIMEOUT            |544      |01/10/13 20:10:51

show process creditmon credit-loss-events - fc1/25

This command gives this example output:

rtp-san-23-02-9148# show process creditmon credit-loss-events

show process creditmon credit-loss-events

        Module: 01      Credit Loss Events: YES

----------------------------------------------------
| Interface |  Total |          Timestamp          |
|           | Events |                             |
----------------------------------------------------
| fc1/25    |    512 | 1. Thu Jan 10 20:36:21 2013 |
|           |        | 2. Thu Jan 10 20:36:19 2013 |
|           |        | 3. Thu Jan 10 20:36:18 2013 |
|           |        | 4. Thu Jan 10 20:36:17 2013 |
|           |        | 5. Thu Jan 10 20:36:16 2013 |
|           |        | 6. Thu Jan 10 20:36:15 2013 |
|           |        | 7. Thu Jan 10 20:36:14 2013 |
|           |        | 8. Thu Jan 10 20:36:13 2013 |
|           |        | 9. Thu Jan 10 20:36:12 2013 |
|           |        |10. Thu Jan 10 20:36:11 2013 |
----------------------------------------------------

Note: This shows a more detailed time-stamped version of when the switch invokes credit loss recovery. 
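
The spacing between credit loss recovery events can be computed from these timestamps. This is a minimal Python sketch that uses the first four timestamps copied from the output above. The format string is an assumption based on the layout shown here, and it relies on an English locale for the day and month names.

```python
from datetime import datetime

# Timestamps copied from the output above, newest first.
timestamps = [
    "Thu Jan 10 20:36:21 2013",
    "Thu Jan 10 20:36:19 2013",
    "Thu Jan 10 20:36:18 2013",
    "Thu Jan 10 20:36:17 2013",
]

events = [datetime.strptime(t, "%a %b %d %H:%M:%S %Y") for t in timestamps]

# Seconds between each event and the one before it.
gaps = [
    int((newer - older).total_seconds())
    for newer, older in zip(events, events[1:])
]
print(gaps)  # [2, 1, 1]
```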

show system internal snmp credit-not-available - fc1/25

There are no results applicable to port fc1/25. See the note on the slow drain port-monitor policy.

slot 1 show hardware internal fc-mac port 25 statistics

See show hardware internal statistics - fc1/25.

slot 1 show hardware internal fc-mac port 25 error-statistics

This command gives this example output:

rtp-san-23-02-9148# slot 1 show hardware internal fc-mac port 25 error-statistics

* -----------------------------------------------------------------------------
* Port Error Statistics for device Sabre-fcp
* dev inst: 0, dev intf: 10, port(s): 25
*
ADDRESS     STAT                                                   COUNT
__________ ________                                           __________________
0x0000052d FCP_CNTR_TMM_NORMAL_DROP                                        0xee0
0x00000539 FCP_CNTR_TMM_TIMEOUT                                            0xee0
0x00000540 FCP_CNTR_TMM_TIMEOUT_DROP                                       0xee0
0xffffffff FCP_CNTR_CREDIT_LOSS                                             0x39
0xffffffff FCP_CNTR_TX_WT_AVG_B2B_ZERO                                     0x23a

Note: This is a good initial command for display of the most important counters for slow drain. It does not include FCP_CNTR_RCM_RBBZ_CHx and FCP_CNTR_TMM_TBBZ_CHx, but those are not considered errors.
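
A rough lower bound on the time spent at zero Tx credits can be estimated from FCP_CNTR_TX_WT_AVG_B2B_ZERO. This is a minimal Python sketch that assumes each increment of the counter represents at least 100ms at zero credits. That interpretation is inferred from the notes in this document and is not confirmed behavior.

```python
# Values copied from the fc1/25 error-statistics output above.
tx_wt_avg_b2b_zero = int("0x23a", 16)  # FCP_CNTR_TX_WT_AVG_B2B_ZERO
credit_loss = int("0x39", 16)          # FCP_CNTR_CREDIT_LOSS

# Assumption: one increment means at least 100 ms at zero Tx credits.
INTERVAL_MS = 100
min_ms_at_zero = tx_wt_avg_b2b_zero * INTERVAL_MS

print(tx_wt_avg_b2b_zero)     # 570
print(min_ms_at_zero / 1000)  # 57.0 (seconds, lower bound)
print(credit_loss)            # 57
```

Under that assumption, the estimate of about 57 seconds lines up with the 57 credit loss recoveries, each of which requires one second at zero Tx credits.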

slot 1 show hard internal credit-info port 25

This command gives this example output:

rtp-san-23-02-9148# slot 1 show hard internal credit-info port 25

    ======== Device Credit Information - RX ========
+------+------+----------------------+------------+---------+--------+
| PORT | SI/  |     DEVICE NAME      |  CREDITS   | CREDITS |   BW   |
|  NO  | PRIO |                      | CONFIGURED |  USED   |  MODE  |
+------+------+----------------------+------------+---------+--------+
|  25  |  0/0 |            Sabre-fcp |       0x20 |     0x0 |   Full |
+------+------+----------------------+------------+---------+--------+

    ======== Device Credit Information - TX ========
+------+------+----------------------+------------+---------+--------+
| PORT | SI/  |     DEVICE NAME      |  CREDITS   | CREDITS |   BW   |
|  NO  | PRIO |                      | CONFIGURED |  USED   |  MODE  |
+------+------+----------------------+------------+---------+--------+
|  25  |  0/0 |            Sabre-fcp |       0x80 |     0x1 |   Full |
+------+------+----------------------+------------+---------+--------+

slot 1 show port-config internal link-events

There are no results applicable to port fc1/25 since nothing went up or down.

Test 2: Port-monitor - Slow Port Emulation with R_RDY Delay of 1500000us (1.5 Seconds)

This is the procedure for a port-monitor, slow port emulation test with an R_RDY delay of 1500000us (1.5 seconds).

Default Slow Drain Policy

By default, the slow drain policy is active. See the note on the slow drain port-monitor policy.

This is the default slow drain policy:

rtp-san-23-02-9148# show port-monitor active

Policy Name  : slowdrain
Admin status : Active
Oper status  : Active
Port type    : All Access Ports
---------------------------------------------------------------------------------------------------------
Counter                  Threshold  Interval Rising Threshold event Falling Threshold  event PMON Portguard
-------                  ---------  -------- ---------------- ----- ------------------ ----- --------------
Credit Loss Reco         Delta      1        1                4     0                  4     Not enabled
----------------------------------------------------------------------------------------------------------
rtp-san-23-02-9148#

Create Policy

Create and activate a policy named edm. Include all counters in order to see which ones are generated:

rtp-san-23-02-9148# show port-monitor active

Policy Name  : edm
Admin status : Active
Oper status  : Active
Port type    : All Ports
---------------------------------------------------------------------------------------------------------
Counter                  Threshold  Interval Rising Threshold event Falling Threshold  event PMON Portguard
-------                  ---------  -------- ---------------- ----- ------------------ ----- --------------
Link Loss                Delta      60       5                4     1                  4     Not enabled
Sync Loss                Delta      60       5                4     1                  4     Not enabled
Signal Loss              Delta      60       5                4     1                  4     Not enabled
Invalid Words            Delta      60       1                4     0                  4     Not enabled
Invalid CRC's            Delta      60       5                4     1                  4     Not enabled
TX Discards              Delta      60       200              4     10                 4     Not enabled
LR RX                    Delta      60       5                4     1                  4     Not enabled
LR TX                    Delta      60       5                4     1                  4     Not enabled
Timeout Discards         Delta      60       200              4     10                 4     Not enabled
Credit Loss Reco         Delta      1        1                4     0                  4     Not enabled
RX Datarate              Delta      60       80%              4     20%                4     Not enabled
TX Datarate              Delta      60       80%              4     20%                4     Not enabled
----------------------------------------------------------------------------------------------------------
rtp-san-23-02-9148#

Rerun Test

Start the Agilent traffic again for approximately 60 seconds, with fc1/25 connected to the slow drain device and an R_RDY delay of 1500000us (1.5 seconds).

View Threshold Manager Log

Navigate to Device Manager > Logs > Switch Resident > Threshold Manager in order to see the Threshold Manager Log.

[Figure: Threshold Manager Log in Device Manager (116401-trouble-mds9148-01.jpg)]

This is the Threshold Manager Log in text format:

4, 121    2013/01/12-11:49:56    fcIfCreditLoss.16875520=1 >= 1:65500, 4 WARNING(4)Rising
4, 122    2013/01/12-11:50:03    fcIfCreditLoss.16875520=0 <= 0:65500, 4 WARNING(4)Falling
4, 123    2013/01/12-11:50:04    fcIfCreditLoss.16875520=1 >= 1:65500, 4 WARNING(4)Rising
4, 124    2013/01/12-11:50:14    fcIfCreditLoss.16875520=0 <= 0:65500, 4 WARNING(4)Falling
4, 125    2013/01/12-11:50:15    fcIfCreditLoss.16875520=1 >= 1:65500, 4 WARNING(4)Rising
4, 126    2013/01/12-11:50:25    fcIfCreditLoss.16875520=0 <= 0:65500, 4 WARNING(4)Falling
4, 127    2013/01/12-11:50:26    fcIfCreditLoss.16875520=1 >= 1:65500, 4 WARNING(4)Rising
4, 128    2013/01/12-11:50:36    fcIfCreditLoss.16875520=0 <= 0:65500, 4 WARNING(4)Falling
4, 129    2013/01/12-11:50:37    fcIfCreditLoss.16875520=1 >= 1:65500, 4 WARNING(4)Rising
4, 130    2013/01/12-11:50:47    fcIfCreditLoss.16875520=0 <= 0:65500, 4 WARNING(4)Falling
4, 131    2013/01/12-11:50:48    fcIfCreditLoss.16875520=1 >= 1:65500, 4 WARNING(4)Rising
4, 132    2013/01/12-11:50:50    fcIfCreditLoss.16875520=0 <= 0:65500, 4 WARNING(4)Falling
4, 133    2013/01/12-11:50:55    fcIfOutDiscards.16875520=3197 >= 200:65500, 4 WARNING(4)Rising
4, 134    2013/01/12-11:50:55    fcIfLinkResetOuts.16875520=49 >= 5:65500, 4 WARNING(4)Rising
4, 135    2013/01/12-11:50:55    fcIfTimeOutDiscards.16875520=3197 >= 200:65500, 4 WARNING(4)Rising

Note: 16875520 is the ifindex, which is 0x01018000 in hex and corresponds to fc1/25.

rtp-san-23-02-9148# show port internal info interface-id 0x01018000
fc1/25 - if_index: 0x01018000, phy_port_index: 0xa
     local_index: 0x18
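
The ifindex in a Threshold Manager Log entry can be converted to the hex form that the show port internal info command expects. This is a minimal Python sketch that uses one log line copied from above. The regular expression is an assumption based on the layout of the lines shown here.

```python
import re

# One line copied from the Threshold Manager Log above.
log_line = (
    "4, 133    2013/01/12-11:50:55    "
    "fcIfOutDiscards.16875520=3197 >= 200:65500, 4 WARNING(4)Rising"
)

# Capture: counter name, ifindex, value, comparison, threshold.
match = re.search(r"(\w+)\.(\d+)=(\d+) ([<>]=) (\d+)", log_line)
counter, ifindex, value, comparison, threshold = match.groups()

# Format the decimal ifindex as an 8-digit hex interface ID.
ifindex_hex = f"0x{int(ifindex):08x}"

print(counter, ifindex_hex, value, comparison, threshold)
# fcIfOutDiscards 0x01018000 3197 >= 200
```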

Appendix

Counter Definitions

FCP_CNTR_CREDIT_LOSS

Explanation:

This counter indicates that one full second has elapsed with the transmit buffer-to-buffer (Tx B2B) credit counter at zero. The switch has initiated credit loss recovery by transmitting a Link Reset (LR). If a Link Reset Response (LRR) is received, the full allocation of Tx B2B credits is restored, and the port can once again resume transmitting. If an LRR is not received in 90ms, an 'LR Rcvd B2B' condition is raised, and the port is brought down.

Reference: 

  • FCP_CNTR_LINK_RESET_OUT
  • IP_FCMAC_INTR_PRIM_RX_SEQ_LRR
  • FCP_CNTR_LRR_IN
  • show process creditmon credit-loss-events

Commands:

  • show hardware internal errors all
  • show hardware internal fc-mac port x error-statistics
  • show hardware internal statistics
  • show hardware internal fc-mac port x statistics
  • show logging onboard error-stats
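
The credit loss recovery behavior described for FCP_CNTR_CREDIT_LOSS can be sketched as a simple decision function. This is a minimal Python illustration of the explanation above, not the switch implementation. The function name, the return strings, and the handling of values exactly at each threshold are assumptions.

```python
CREDIT_LOSS_MS = 1000  # time at zero Tx credits before an LR is sent
LRR_TIMEOUT_MS = 90    # time allowed for the LRR to arrive

def credit_loss_outcome(ms_at_zero, lrr_delay_ms=None):
    """Classify one stall on a port, following the explanation above.

    ms_at_zero:   how long the Tx B2B credit counter stayed at zero
    lrr_delay_ms: delay of the LRR after the LR, or None if no LRR arrives
    """
    if ms_at_zero < CREDIT_LOSS_MS:
        return "no recovery"
    if lrr_delay_ms is not None and lrr_delay_ms <= LRR_TIMEOUT_MS:
        return "credits restored"
    return "port brought down"

# The 5 ms LRR delay is a hypothetical value for illustration.
print(credit_loss_outcome(1500, lrr_delay_ms=5))  # credits restored
print(credit_loss_outcome(400))                   # no recovery
print(credit_loss_outcome(1500))                  # port brought down
```

In Test 1, the device delayed R_RDY for 1.5 seconds but answered each LR with an LRR, which corresponds to the first case: the counter incremented and the port stayed up.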

FCP_CNTR_TMM_TIMEOUT_DROP

Explanation:

A packet destined for this port has timed out in the switch. By default, packets time out after 500ms. If a packet cannot be transmitted out of its egress port within that time, it is discarded, and this counter is incremented. The timeout is adjustable with the system timeout congestion-drop number mode {E|F} command.

Commands:

  • show hardware internal errors all
  • show hardware internal fc-mac port x error-statistics
  • show hardware internal packet-dropped-reason
  • show hardware internal statistics
  • show hardware internal fc-mac port x statistics
  • show logging onboard error-stats

FCP_CNTR_TMM_TIMEOUT

Explanation:

See FCP_CNTR_TMM_TIMEOUT_DROP.

Commands:

  • show hardware internal errors all
  • show hardware internal fc-mac port x error-statistics
  • show hardware internal packet-dropped-reason
  • show hardware internal statistics
  • show hardware internal fc-mac port x statistics
  • show logging onboard module 1 flow-control timeout-drops
  • show logging onboard error-stats

FCP_CNTR_TMM_NORMAL_DROP

Explanation:

This is an aggregate counter that includes other counters such as FCP_CNTR_TMM_TIMEOUT_DROP.

Commands:

  • show hardware internal errors all
  • show hardware internal fc-mac port x error-statistics
  • show hardware internal statistics
  • show hardware internal fc-mac port x statistics
  • show logging onboard error-stats

transmit B2B credit transitions from zero

Explanation:

This counter increments when the remaining Tx B2B value has transitioned from zero to a non-zero value.

This is the FCP_CNTR_TMM_TBBZ_CHx statistic. While this can happen normally, large numbers typically indicate a problem with the attached device. If the remaining Tx B2B value stays at zero for 100ms or more, the FCP_CNTR_TX_WT_AVG_B2B_ZERO counter is also incremented.

Commands:

  • show interface fcx/y counters and aggregate-counters
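
The difference between the transition counter and the 100ms counter can be shown with a simplified model. This is only a sketch: the function name and the fixed-interval sampling are assumptions, and a plain run of consecutive zero samples stands in for whatever measurement the hardware actually uses for the 100ms condition.

```python
def summarize_credit_samples(samples, sample_ms=1, threshold_ms=100):
    """Return (transitions_from_zero, zero_periods_of_threshold_or_more).

    samples -- remaining Tx B2B credits, one value per sample_ms interval
    """
    transitions = 0
    long_zero_periods = 0
    zero_run_ms = 0
    for remaining in samples:
        if remaining == 0:
            zero_run_ms += sample_ms
            continue
        if zero_run_ms > 0:
            transitions += 1  # credits went from zero to a non-zero value
            if zero_run_ms >= threshold_ms:
                long_zero_periods += 1
        zero_run_ms = 0
    # A run that is still at zero when sampling ends has not transitioned,
    # but it still counts as a long zero period if it reached the threshold.
    if zero_run_ms >= threshold_ms:
        long_zero_periods += 1
    return transitions, long_zero_periods


# 5 ms at zero (a normal, brief stall), then 150 ms at zero (a slow drain symptom).
samples = [4] * 10 + [0] * 5 + [3] * 10 + [0] * 150 + [2] * 5
print(summarize_credit_samples(samples))  # (2, 1)
```

Both stalls add to the transition count, but only the 150ms stall reaches the 100ms threshold, which is why a high transition count alone is a weaker indicator than the 100ms counter.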

receive B2B credit transitions from zero

Explanation:

This counter increments when the remaining receive (Rx) B2B value has transitioned from zero to a non-zero value.

This is the FCP_CNTR_TMM_RBBZ_CHx statistic. While this can happen normally, large numbers typically indicate that the switch is congested in the direction away from this port and is back pressuring the port in order to prevent it from sending additional packets into the storage area network (SAN). If the remaining Rx B2B value stays at zero for 100ms or more, the FCP_CNTR_RX_WT_AVG_B2B_ZERO counter is also incremented.

Commands:

  • show interface fcx/y counters and aggregate-counters

IP_FCMAC_INTR_PRIM_RX_SEQ_LRR

Explanation:

This counter increments each time an LRR is received. This is typically caused by the switch when it initiates credit loss recovery. 

Reference:

  • FCP_CNTR_CREDIT_LOSS

Commands:

  • show hardware internal errors all
  • show hardware internal fc-mac port x interrupt-counts

FCP_CNTR_TX_WT_AVG_B2B_ZERO

Explanation:

This counter increments when the remaining Tx B2B value is at zero for 100ms or more. This typically indicates that the attached device is congested (slow drain).

This should generate a fcIfTxWtAvgBBCreditTransitionToZero SNMP trap and put an event in the output from the show system internal snmp credit-not-available command. However, this part of the counter is not supported. See the note on the slow drain port-monitor policy.

Commands:

  • show hardware internal errors all
  • show hardware internal fc-mac port x error-statistics
  • show hardware internal statistics
  • show hardware internal fc-mac port x statistics

FCP_CNTR_RX_WT_AVG_B2B_ZERO

Explanation:

This counter increments when the remaining Rx B2B value is at zero for 100ms or more. This typically indicates the switch is withholding R_RDYs (B2B credits) from the attached device due to upstream congestion (congestion away from this port).

Commands:

  • show hardware internal errors all
  • show hardware internal fc-mac port x error-statistics
  • show hardware internal statistics
  • show hardware internal fc-mac port x statistics

FCP_CNTR_RCM_RBBZ_CH0

Explanation:

This counter increments when the remaining Rx B2B value has transitioned from zero to a non-zero value.

This is the receive B2B credit transitions from zero counter under the show interface counters and aggregate counters command. While this can happen normally, large numbers typically indicate that the switch is congested in the direction away from this port and is back pressuring the port in order to prevent it from sending additional packets into the SAN. If the remaining Rx B2B value stays at zero for 100ms or more, the FCP_CNTR_RX_WT_AVG_B2B_ZERO counter is also incremented.

Commands:

  • show hardware internal statistics
  • show hardware internal fc-mac port x statistics

FCP_CNTR_TMM_TBBZ_CHx - x is 0 or 1

Explanation:

This counter increments when the remaining Tx B2B value has transitioned from zero to a non-zero value.

This is the transmit B2B credit transitions from zero counter under the show interface counters and aggregate counters command. While this can happen normally, large numbers typically indicate a problem with the attached device. If the remaining Tx B2B value stays at zero for 100ms or more, the FCP_CNTR_TX_WT_AVG_B2B_ZERO counter is also incremented.

Commands:

  • show hardware internal statistics
  • show hardware internal fc-mac port x statistics

FCP_CNTR_LRR_IN

Explanation:

This counter increments each time an LRR is received. This is typically due to the switch initiating credit loss recovery. 

Reference:

  • FCP_CNTR_CREDIT_LOSS
  • FCP_CNTR_LINK_RESET_OUT
  • IP_FCMAC_INTR_PRIM_RX_SEQ_LRR

Commands:

  • show hardware internal statistics
  • show hardware internal fc-mac port x statistics

FCP_CNTR_LINK_RESET_OUT

Explanation:

This counter increments each time an LR is transmitted. This is typically caused by the switch when it initiates credit loss recovery. 

Reference:

  • FCP_CNTR_CREDIT_LOSS
  • FCP_CNTR_LRR_IN
  • IP_FCMAC_INTR_PRIM_RX_SEQ_LRR

Commands:

  • show hardware internal statistics
  • show hardware internal fc-mac port x statistics

MDS 9148 Arbiter Information

The MDS 9148 has two central arbiters and 12 port-groups of four ports each. Each arbiter handles half of the egress port groups. As a packet is received on an ingress port, the Ingress Credit Buffer (ICB) requests a grant to send a received packet to a specific Destination Index (DI). The ICB sends a grant request to arbiter 0 for port-groups 0-5 and to arbiter 1 for port-groups 6-11. If there is space in the transmit buffers of the DI, the arbiter returns a grant to the requesting ingress port, and the frame can be transmitted.

Arbiter requests and grants can be seen in this command-line interface (CLI) example:

MDS9148# slot 1 show hardware internal icb 0 statistics | i ARB
0x00000d14 PG0_ICB_ARB0_REQ_CNT 0xf8e
0x00000d18 PG0_ICB_ARB1_REQ_CNT 0x2e93
0x00000d1c PG0_ICB_ARB0_GNT_CNT 0xf8e
0x00000d20 PG0_ICB_ARB1_GNT_CNT 0x2e93
0x00000d14 PG1_ICB_ARB0_REQ_CNT 0x3e1c
0x00000d1c PG1_ICB_ARB0_GNT_CNT 0x3e1c
...snip
0x00000d14 PG10_ICB_ARB0_REQ_CNT 0x3e1c
0x00000d1c PG10_ICB_ARB0_GNT_CNT 0x3e1c
0x00000d14 PG11_ICB_ARB0_REQ_CNT 0x3e1c
0x00000d1c PG11_ICB_ARB0_GNT_CNT 0x3e1c
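
The port-group and arbiter arithmetic, and a comparison of request and grant counts, can be sketched in Python. This is a rough illustration with hypothetical helper names. It assumes that ports are numbered sequentially into port-groups (ports 1-4 in port-group 0, and so on), which matches fc1/13 being in port-group 3 later in this document, and it assumes that a request count higher than the grant count means grants are outstanding.

```python
import re

PORTS_PER_GROUP = 4
COUNTER = re.compile(r"PG(\d+)_ICB_ARB(\d)_(REQ|GNT)_CNT\s+(0x[0-9a-fA-F]+)")


def port_group(port: int) -> int:
    """Zero-based port-group for a 1-based port number (assumed sequential layout)."""
    return (port - 1) // PORTS_PER_GROUP


def egress_arbiter(egress_port: int) -> int:
    """Arbiter 0 serves egress port-groups 0-5, arbiter 1 serves port-groups 6-11."""
    return 0 if port_group(egress_port) <= 5 else 1


def outstanding_requests(output: str) -> dict:
    """Map (port_group, arbiter) to REQ minus GNT for each complete counter pair."""
    counts = {}
    for pg, arb, kind, value in COUNTER.findall(output):
        counts.setdefault((int(pg), int(arb)), {})[kind] = int(value, 16)
    return {
        key: pair["REQ"] - pair["GNT"]
        for key, pair in counts.items()
        if "REQ" in pair and "GNT" in pair
    }


sample = """\
0x00000d14 PG0_ICB_ARB0_REQ_CNT 0xf8e
0x00000d18 PG0_ICB_ARB1_REQ_CNT 0x2e93
0x00000d1c PG0_ICB_ARB0_GNT_CNT 0xf8e
0x00000d20 PG0_ICB_ARB1_GNT_CNT 0x2e93
"""
print(port_group(13), egress_arbiter(25))  # 3 1
print(outstanding_requests(sample))        # {(0, 0): 0, (0, 1): 0}
```

In the sample output above, every request count equals its grant count, so the difference is zero for each port-group and arbiter pair.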

MDS 9148 Commands for Queued Packets

The MDS 9148 (Sabre) has specific commands to check for queued packets. These commands are similar to, but not nearly as useful as, the show hardware internal up-xbar 0 queued-packet-info command that is available in the Cisco MDS 9500 Series Multilayer Directors.

If the available credits are less than the configured credits, there are frames pending for that destination index (DI). In this example, fc1/13 is sending to the slow drain device that is connected on fc1/25. fc1/25 shows two packets queued:

module-1# show hardware internal arb 0 cell-frame-credits
CCC = Cell Credits Configured.
CCA = Cell Credits Available - Live from hardware.
FCC = Frame Credits Configured.
FCA = Frame Credits Available- Live from hardware.
STA = Cell/Frame Credit status reported by hardware.
+----+---+----+-------------------------+--------------------------+
|    |   |Port|       PRIORITY 0        |        PRIORITY 1        |
|Port| DI|Mode| CCC|CCA|STA| FCC|FCA|STA|  CCC|CCA|STA| FCC|FCA|STA|
+----+---+----+----+---+---+----+---+---+-----+---+---+----+---+---+
|   1| 35|   E|  36| 36|  Y|  36| 36|  Y|   36| 36|  Y|  36| 36|  Y|
+----+---+----+----+---+---+----+---+---+-----+---+---+----+---+---+
|   2| 34|   E|  36| 36|  Y|  36| 36|  Y|   36| 36|  Y|  36| 36|  Y|
+----+---+----+----+---+---+----+---+---+-----+---+---+----+---+---+
...
+----+---+----+----+---+---+----+---+---+-----+---+---+----+---+---+
|  13| 44|   E|  36| 36|  Y|  36| 36|  Y|   36| 36|  Y|  36| 36|  Y|
+----+---+----+----+---+---+----+---+---+-----+---+---+----+---+---+
...
+----+---+----+----+---+---+----+---+---+-----+---+---+----+---+---+
|  25| 10|   E|  36| 34|  Y|  36| 35|  Y|   36|  2|  Y|  36| 34|  Y|  << 36 - 34 = 2 packets queued
+----+---+----+----+---+---+----+---+---+-----+---+---+----+---+---+  << 36 - 2 = 34 packets queued
|  26| 11|   E|  36| 36|  Y|  36| 36|  Y|   36| 36|  Y|  36| 36|  Y|
+----+---+----+----+---+---+----+---+---+-----+---+---+----+---+---+
...
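
The subtraction shown in the annotations (configured minus available) can be applied to every row of the table with a short parser. This is a minimal sketch: the function name and the dictionary keys are hypothetical, and the column order is assumed to be exactly the one in the header above.

```python
def credits_in_use(table_text: str) -> dict:
    """Map port number to configured-minus-available credits per priority.

    Expected columns per row: Port, DI, Mode, then for priority 0 and priority 1:
    CCC, CCA, STA, FCC, FCA, STA.
    """
    rows = {}
    for line in table_text.splitlines():
        cells = [cell.strip() for cell in line.split("|")[1:16]]
        if len(cells) < 15 or not cells[0].isdigit():
            continue  # separator, header, or "..." line
        usage = {}
        for priority, base in ((0, 3), (1, 9)):
            ccc, cca, _status, fcc, fca = cells[base:base + 5]
            usage[f"p{priority}_cells"] = int(ccc) - int(cca)
            usage[f"p{priority}_frames"] = int(fcc) - int(fca)
        rows[int(cells[0])] = usage
    return rows


sample = """\
|Port| DI|Mode| CCC|CCA|STA| FCC|FCA|STA|  CCC|CCA|STA| FCC|FCA|STA|
+----+---+----+----+---+---+----+---+---+-----+---+---+----+---+---+
|  25| 10|   E|  36| 34|  Y|  36| 35|  Y|   36|  2|  Y|  36| 34|  Y|
+----+---+----+----+---+---+----+---+---+-----+---+---+----+---+---+
|  26| 11|   E|  36| 36|  Y|  36| 36|  Y|   36| 36|  Y|  36| 36|  Y|
"""
print(credits_in_use(sample)[25])
# {'p0_cells': 2, 'p0_frames': 1, 'p1_cells': 34, 'p1_frames': 2}
```

Any port with a non-zero value has credits in use at that moment; because the command reads live hardware values, repeated runs are needed to tell a persistent backlog from normal traffic.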

Packet headers for packets currently queued can be viewed with the slot 1 show hardware internal icb 0 port-grp 3 pkt-hdr 0 linecard command. Each port-group comprises four ports, so the proper port-group of the ingress port must be chosen. The packet header is displayed in real time.

In this example, packets are being received on an ISL port fc1/13 (port-group 3) and egressing to port fc1/1, which is slow. Destination FCID 0xcd0000 exists on fc1/1.

MDS9148# slot 1 show hardware internal icb 0 port-grp 3 pkt-hdr 0

==== PACKET (Sabre & FC) HEADER in PG 3 BUFFER NUMBER : 0 ====

+---------------------+---------------------+---------------------+
| SS : 0x1 | VER : 0 | AT : 0 |
| BC : 0 | GA : 0 | SOF : 0x6 |
| HL : 0 | PLEN : 0 | TTL : 0xff |
| UP : 0 | DI : 0 | SI : 0x2c |
| CTL : 0 | TSTMP : 0xbd48 | STA : 0 |
| SP : 0 | VSAN : 0xed | CSUM : 0x59 |
+---------------------+---------------------+---------------------+
| R_CTL : 0 | D_ID : 0xcd0000 | CS_CTL : 0 |
| S_ID : 0x960280 | TYPE : 0 | F_CTL : 0x280000 |
| SEQ_ID : 0 | DF_CTL : 0 | SEQ_CNT: 0 |
| OX_ID : 0x8000 | RX_ID : 0 | PARAM : 0 |
+---------------------+---------------------+---------------------+
MDS9148#
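
The header fields in this output can be pulled into a dictionary for easier inspection. This is a sketch with hypothetical function names; it only assumes the "| NAME : value |" layout shown above and the standard Fibre Channel split of a 24-bit FCID into domain, area, and port bytes.

```python
import re

FIELD = re.compile(r"(\w+)\s*:\s*(0x[0-9a-fA-F]+|\d+)")


def parse_packet_header(text: str) -> dict:
    """Collect NAME : value pairs from the table rows (lines that start with '|')."""
    fields = {}
    for line in text.splitlines():
        if not line.startswith("|"):
            continue  # skip the prompt, title, and separator lines
        for name, value in FIELD.findall(line):
            fields[name] = int(value, 16) if value.lower().startswith("0x") else int(value)
    return fields


def split_fcid(fcid: int) -> tuple:
    """Split a 24-bit FCID into (domain, area, port)."""
    return ((fcid >> 16) & 0xFF, (fcid >> 8) & 0xFF, fcid & 0xFF)


sample = """\
| UP : 0 | DI : 0 | SI : 0x2c |
| SP : 0 | VSAN : 0xed | CSUM : 0x59 |
| R_CTL : 0 | D_ID : 0xcd0000 | CS_CTL : 0 |
| S_ID : 0x960280 | TYPE : 0 | F_CTL : 0x280000 |
"""
header = parse_packet_header(sample)
print(header["VSAN"], header["SI"])  # 237 44
print(split_fcid(header["D_ID"]))    # (205, 0, 0)
```

The SI value 0x2c is 44 in decimal, which is the same number listed in the DI column for port 13 in the cell-frame-credits table; this suggests, but does not confirm, that SI identifies the ingress port by its index.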

Command Set Issued

  • show clock
  • show interface fc1/13
  • show interface fc1/25
  • show interface fc1/13 counters
  • show interface fc1/25 counters
  • show hardware internal errors all
  • show hardware internal packet-flow dropped
  • show hardware internal packet-dropped-reason
  • show hardware internal statistics module 1
  • show logging onboard starttime 01/10/13-00:00:00 error-stats
  • show logging onboard flow-control timeout-drops
  • show process creditmon credit-loss-events
  • show system internal snmp credit-not-available
  • slot 1 show hardware internal fc-mac port 13 statistics
  • slot 1 show hardware internal fc-mac port 13 error-statistics
  • slot 1 show hardware internal fc-mac port 25 statistics
  • slot 1 show hardware internal fc-mac port 25 error-statistics
  • slot 1 show hard internal credit-info port 13
  • slot 1 show hard internal credit-info port 25
  • slot 1 show port-config internal link-events
  • **end