Fibre Channel Slow Drain Device Detection and Congestion Avoidance: An Overview
All data traffic between end devices in the SAN fabric is carried by Fibre
Channel Class 3 and, in some cases, Class 2 services, which use link-level,
per-hop, buffer-to-buffer flow control. These classes of service do not support
end-to-end flow control. When slow devices are attached to the fabric, they do
not accept frames at the configured or negotiated rate. Traffic that is
destined for these devices then causes a credit shortage on the Inter-Switch
Links (ISLs) that carry it, and the links become congested. The credit shortage
also affects unrelated flows in the fabric that use the same ISL, even though
their destination devices are not slow drain devices.
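The credit mechanism described above can be illustrated with a minimal sketch. The class and method names are hypothetical and exist only for illustration. The model assumes one credit per frame and one returned credit per R_RDY, and it ignores timeouts entirely.

```python
class B2BPort:
    """Toy model of a transmitter using buffer-to-buffer credits (hypothetical helper)."""

    def __init__(self, credits: int) -> None:
        self.credits = credits  # credits granted by the receiver
        self.queued = 0         # frames waiting because no credit is available

    def send(self) -> bool:
        """Try to transmit one frame; queue it if no credit remains."""
        if self.credits > 0:
            self.credits -= 1
            return True
        self.queued += 1
        return False

    def receive_r_rdy(self) -> None:
        """Receiver freed a buffer: release a queued frame or bank the credit."""
        if self.queued > 0:
            self.queued -= 1    # the returned credit is used at once by a waiting frame
        else:
            self.credits += 1


port = B2BPort(credits=2)
# A slow drain device returns no R_RDY, so the third and fourth frames are held.
results = [port.send() for _ in range(4)]
print(results, port.queued)
```

In this sketch the held frames stand in for the frames that occupy ISL buffers and delay unrelated flows.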
This feature provides various enhancements that enable you to detect slow drain
devices that cause congestion in the network and to avoid that congestion.
The enhancements apply mainly to the edge ports that connect to slow drain
devices. They minimize the time that frames remain stuck in the edge ports
because of slow drain devices, a condition that causes ISL blockage. To avoid
or minimize the stuck condition, configure a lower frame timeout for the ports.
You can also use the no-credit timeout to drop all packets after the slow drain
is detected using the configured thresholds. A smaller frame timeout value
helps to alleviate the slow drain condition that affects the fabric by dropping
the packets on the edge ports sooner than the default timeout (358 ms). This
action frees buffer space in the ISL, which can then be used by unrelated flows
that do not experience the slow drain condition.
Note | This feature supports edge ports that are connected to slow edge
devices. Even though you can also apply this feature to ISLs, we recommend that
you apply it only to edge F ports and retain the default configuration for ISLs
(E and TE ports). This feature is not supported on Generation 1 modules. |
How to Configure a Stuck Frame Timeout Value
Configuring a Stuck Frame Timeout Value
The default stuck frame timeout value is 358 ms. The timeout value can be
incremented in steps of 10 ms. We recommend that you retain the default
configuration for ISLs and, for fabric F ports, configure a value that does not
exceed 500 ms (preferably 100 to 200 ms).
SUMMARY STEPS
1. switch# configure terminal
2. switch(config)# system timeout congestion-drop seconds mode {E | F}
3. switch(config)# system timeout congestion-drop default mode {E | F}
DETAILED STEPS | Command or Action | Purpose |
---|---|---|
Step 1 | switch# configure terminal | Enters global configuration mode. |
Step 2 | switch(config)# system timeout congestion-drop seconds mode {E \| F} | Specifies the stuck frame timeout value in milliseconds and the port mode for the switch. |
Step 3 | switch(config)# system timeout congestion-drop default mode {E \| F} | Sets the stuck frame timeout to its default value for the specified port mode. |
This example shows how to configure a stuck frame timeout value of 100 ms:
switch# configure terminal
switch(config)# system timeout congestion-drop 100 mode F
switch(config)# system timeout congestion-drop default mode F
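The constraints on the stuck frame timeout can be checked before a command is sent to the switch. The following is a minimal sketch, not a Cisco tool. The function names are hypothetical. It assumes that explicitly configured values are multiples of 10 ms (the 358 ms default is restored with the default keyword instead), and it does not check any platform minimum or maximum because this section does not state one.

```python
from typing import List


def check_congestion_drop(timeout_ms: int, mode: str) -> List[str]:
    """Return a list of problems; an empty list means the value looks acceptable."""
    problems = []
    if mode not in ("E", "F"):
        problems.append("mode must be E or F")
    # Assumption: explicit values move in steps of 10 ms.
    if timeout_ms <= 0 or timeout_ms % 10 != 0:
        problems.append("timeout must be a positive multiple of 10 ms")
    # Recommendation from the text: F ports should not exceed 500 ms.
    if mode == "F" and timeout_ms > 500:
        problems.append("value exceeds the 500 ms recommended maximum for F ports")
    return problems


def congestion_drop_command(timeout_ms: int, mode: str) -> str:
    """Build the CLI line in the form used by the example above."""
    problems = check_congestion_drop(timeout_ms, mode)
    if problems:
        raise ValueError("; ".join(problems))
    return f"system timeout congestion-drop {timeout_ms} mode {mode}"


print(congestion_drop_command(100, "F"))
```

Returning a list of problems rather than raising immediately lets a script report every issue with a value at once.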
How to Configure a No-Credit Timeout Value
Configuring a No-Credit Timeout Value
You can enable a no-credit timeout on a port so that, when the port has had no
credits for the configured period, all frames that are destined for that port
are dropped at egress. This action frees buffer space in the ISL, which helps
to reduce fabric slowdown and congestion on unrelated flows that use the same
link.
The dropped frames include both frames that have just entered the switch and
frames that have stayed in the switch for the configured timeout value. These
drops are preemptive and clear the congestion completely.
The no-credit timeout feature is disabled by default. We recommend that you
retain the default configuration for ISLs and, for fabric F ports, configure a
value that does not exceed 358 ms (preferably 200 to 300 ms).
You can disable this feature by entering the no system timeout no-credit-drop
mode F command.
Note | The no-credit timeout value and stuck frame timeout value are
interlinked. The no-credit timeout value must always be greater than the stuck
frame timeout value. |
SUMMARY STEPS
1. switch# configure terminal
2. switch(config)# system timeout no-credit-drop seconds mode F
3. switch(config)# system timeout no-credit-drop default mode F
DETAILED STEPS | Command or Action | Purpose |
---|---|---|
Step 1 | switch# configure terminal | Enters global configuration mode. |
Step 2 | switch(config)# system timeout no-credit-drop seconds mode F | Specifies the no-credit timeout value and port mode for the switch. The seconds value is 500 ms by default and can be incremented in steps of 100. |
Step 3 | switch(config)# system timeout no-credit-drop default mode F | Sets the no-credit timeout to its default value for the specified port mode. |
This example shows how to configure a no-credit timeout value:
switch# configure terminal
switch(config)# system timeout no-credit-drop 100 mode F
switch(config)# system timeout no-credit-drop default mode F
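The rule that the no-credit timeout must be greater than the stuck frame timeout can be sketched as a small check. The function name is hypothetical. Treating configured values as multiples of 100 ms is an assumption drawn from the "steps of 100" wording, and no platform range is enforced.

```python
from typing import List

DEFAULT_STUCK_FRAME_MS = 358  # default stuck frame timeout stated in this document


def check_no_credit_timeout(no_credit_ms: int,
                            stuck_frame_ms: int = DEFAULT_STUCK_FRAME_MS) -> List[str]:
    """Return a list of problems with a proposed no-credit timeout (empty if none)."""
    problems = []
    # Assumption: explicit values move in steps of 100 ms.
    if no_credit_ms <= 0 or no_credit_ms % 100 != 0:
        problems.append("no-credit timeout must be a positive multiple of 100 ms")
    # Rule from the note: the two timers are interlinked.
    if no_credit_ms <= stuck_frame_ms:
        problems.append("no-credit timeout must be greater than the stuck frame timeout")
    return problems


print(check_no_credit_timeout(300, stuck_frame_ms=200))
```

Passing the currently configured stuck frame timeout explicitly, instead of relying on the default, keeps the comparison meaningful after that timer has been changed.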
Displaying Credit Loss Counters
Use the following commands to display the credit loss counters per module per
interface for the last specified minutes, hours, and days:
Command | Purpose |
---|---|
show process creditmon {credit-loss-event-history \| credit-loss-events \| force-timeout-events \| timeout-discards-events} | Displays Onboard Failure Logging (OBFL) credit loss logs. |
Displaying Credit Loss Events
Use one of the following commands to display the total number of credit loss
events per interface with the latest three credit loss time stamps:
Command | Purpose |
---|---|
show process creditmon credit-loss-events [module module number] | Displays the credit loss event information for a module. |
show process creditmon credit-loss-event-history [module module number] | Displays the credit loss event history information. |
Displaying Timeout Drops
Use the following command to display the timeout drops per module per interface
for the last specified minutes, hours, and days:
Command | Purpose |
---|---|
show logging onboard flow-control timeout-drops [last mm minutes] [last hh hours] [last dd days] [module module number] | Displays the Onboard Failure Logging (OBFL) timeout drops log. |
Displaying the Average Credit Not Available Status
When the average credit-not-available duration exceeds the set threshold, you
can error-disable the port, send a trap with interface details, and generate a
syslog with interface details. In addition, you can combine two or more actions
or turn an action on or off.
The port monitor feature provides the command-line interface to configure the
thresholds and actions. The threshold is configured as a percentage of the
credit-not-available duration in an interval.
The threshold for the credit-not-available duration can be 0 percent to 100
percent in multiples of 10, and the interval can be from 1 second to 1 hour.
The default is 10 percent in 1 second, and the default action is to generate a
syslog.
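The percentage threshold can be made concrete with a short calculation. This is a minimal sketch with hypothetical function names. It assumes the credit-not-available time is measured in milliseconds within one polling interval, and that "exceeds" means strictly greater than the threshold.

```python
def credit_unavailable_percent(unavailable_ms: float, interval_s: float) -> float:
    """Share of the interval, in percent, during which no credit was available."""
    return 100.0 * unavailable_ms / (interval_s * 1000.0)


def exceeds_threshold(unavailable_ms: float,
                      interval_s: float = 1,
                      threshold_percent: int = 10) -> bool:
    """Apply the limits described above, then compare against the threshold."""
    if threshold_percent not in range(0, 101, 10):
        raise ValueError("threshold must be 0 to 100 percent in multiples of 10")
    if not 1 <= interval_s <= 3600:
        raise ValueError("interval must be between 1 second and 1 hour")
    return credit_unavailable_percent(unavailable_ms, interval_s) > threshold_percent


# With the defaults (10 percent in 1 second), 150 ms without credit is 15 percent.
print(exceeds_threshold(150))
```

With the default settings, a port that is without credit for more than 100 ms in any 1-second interval would cross the threshold in this model.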
Use the following command to display the average credit-not-available status:
Command | Purpose |
---|---|
show system internal snmp credit-not-available {module \| module-id} | Displays the port monitor credit-not-available counter logs. |
How to Configure a Port Monitor
Port Monitoring
You can use port monitoring to monitor the performance of fabric devices and to
detect slow drain devices. You can monitor counters and take the necessary
action depending on whether the portguard is enabled or disabled. You can
configure the thresholds for various counters and trigger an event when the
values cross the threshold settings. Port monitoring provides a user interface
that you can use to configure the thresholds and action. By default, portguard
is disabled in the port monitoring policy.
Two default policies, default and default slowdrain, are created during snmpd
initialization. The default slowdrain policy is activated when the switch comes
online if no other policies are active at that time. The default slowdrain
policy monitors only the credit-loss-reco and tx-credit-not-available counters.
When you create a policy, it is created for both access and trunk links. The
access link has a value of F and the trunk link has a value of E.
Enabling Port Monitor
SUMMARY STEPS
1. switch# configure terminal
2. switch(config)# [no] port-monitor enable
DETAILED STEPS | Command or Action | Purpose |
---|---|---|
Step 1 | switch# configure terminal | Enters global configuration mode. |
Step 2 | switch(config)# [no] port-monitor enable | Enables (default) the port monitoring feature. The no version of this command disables the port monitoring feature. |
Configuring a Port Monitor Policy
SUMMARY STEPS
1. switch# configure terminal
2. switch(config)# port-monitor name policyname
3. switch(config-port-monitor)# port-type all
4. switch(config-port-monitor)# counter {credit-loss-reco | timeout-discards | tx-credit-not-available} poll-interval seconds {absolute | delta} rising-threshold value1 event event-id1 falling-threshold value2 event event-id2
5. switch(config-port-monitor)# [no] counter {credit-loss-reco | timeout-discards | tx-credit-not-available} poll-interval seconds {absolute | delta} rising-threshold value1 event event-id1 falling-threshold value2 event event-id2
DETAILED STEPS | Command or Action | Purpose |
---|---|---|
Step 1 | switch# configure terminal | Enters global configuration mode. |
Step 2 | switch(config)# port-monitor name policyname | Specifies the policy name and enters the port monitor policy configuration mode. |
Step 3 | switch(config-port-monitor)# port-type all | Applies the policy to all ports. |
Step 4 | switch(config-port-monitor)# counter {credit-loss-reco \| timeout-discards \| tx-credit-not-available} poll-interval seconds {absolute \| delta} rising-threshold value1 event event-id1 falling-threshold value2 event event-id2 | Specifies the poll interval in seconds, the thresholds in absolute numbers, and the event IDs of events to be triggered for the following reasons: |
Step 5 | switch(config-port-monitor)# [no] counter {credit-loss-reco \| timeout-discards \| tx-credit-not-available} poll-interval seconds {absolute \| delta} rising-threshold value1 event event-id1 falling-threshold value2 event event-id2 | Turns on monitoring for the specified counter. The no form of this command turns off monitoring for the specified counter. |
This example shows how to specify the poll interval and threshold for timeout
discards:
switch# configure terminal
switch(config)# port-monitor name cisco
switch(config-port-monitor)# counter timeout-discards poll-interval 10
This example shows how to specify the poll interval and threshold for credit
loss recovery:
switch# configure terminal
switch(config)# port-monitor name cisco
switch(config-port-monitor)# counter credit-loss-reco poll-interval 20 delta rising-threshold 10 event 4 falling-threshold 3 event 4
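How a rising and falling threshold pair might behave in delta mode can be sketched as follows. This is an illustration only and not the switch's documented algorithm. It assumes RMON-style hysteresis: a rising event fires once when the per-interval delta reaches the rising threshold, and it cannot fire again until the delta has dropped to the falling threshold. The function name and the sample values are hypothetical.

```python
from typing import List, Tuple


def threshold_events(samples: List[int], rising: int, falling: int,
                     rising_event: int, falling_event: int) -> List[Tuple[int, str, int]]:
    """Return (poll index, kind, event id) for each threshold crossing in delta mode."""
    events = []
    armed_rising = True          # a rising event may fire only when armed
    previous = samples[0]
    for i, current in enumerate(samples[1:], start=1):
        delta = current - previous   # change in the counter over one poll interval
        previous = current
        if armed_rising and delta >= rising:
            events.append((i, "rising", rising_event))
            armed_rising = False
        elif not armed_rising and delta <= falling:
            events.append((i, "falling", falling_event))
            armed_rising = True
    return events


# Cumulative counter values read at each poll; thresholds mirror the example above.
counter_samples = [0, 2, 14, 30, 31, 45]
print(threshold_events(counter_samples, rising=10, falling=3,
                       rising_event=4, falling_event=4))
```

In absolute mode the comparison would use the counter value itself rather than the difference between consecutive polls.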
Activating a Port Monitor Policy
SUMMARY STEPS
1. switch# configure terminal
2. switch(config)# port-monitor activate policyname
3. (Optional) switch(config)# port-monitor activate
4. (Optional) switch(config)# no port-monitor activate policyname
DETAILED STEPS | Command or Action | Purpose |
---|---|---|
Step 1 | switch# configure terminal | Enters global configuration mode. |
Step 2 | switch(config)# port-monitor activate policyname | Activates the specified port monitor policy. |
Step 3 | switch(config)# port-monitor activate | (Optional) Activates the default port monitor policy. |
Step 4 | switch(config)# no port-monitor activate policyname | (Optional) Deactivates the specified port monitor policy. |
This example shows how to activate a specific port monitor policy:
switch# configure terminal
switch(config)# port-monitor activate cisco
Displaying Port Monitor Policies
Use the following command to display port monitor policies:
Command | Purpose |
---|---|
switch# show port-monitor policyname | Displays details of the specified port monitor policy. |
This example shows how to display a specific port monitor policy: