
IBM Networking

Troubleshooting DLSw+ Circuit Connectivity

Document ID: 17564

Updated: Jan 28, 2008


Introduction

This document explains the process to troubleshoot data-link switching plus (DLSw+) circuit connectivity.

Prerequisites

Requirements

There are no specific requirements for this document.

Components Used

This document is not restricted to specific software or hardware versions.

The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, make sure that you understand the potential impact of any command.

Conventions

For more information on document conventions, refer to the Cisco Technical Tips Conventions.

Check Circuit Status

This section explains DLSw circuit status, possible reasons why a DLSw circuit gets stuck at a particular state, and some troubleshooting steps that can be taken to achieve circuit connectivity. This section also explains, in graphical format, the circuit establishment states and the output of the show dlsw circuit command. Finally, this section discusses some of the most common DLSw issues, such as:

  • Causes for BADSSPHDR error messages.

  • Why DLSw version 2 circuits may fail to connect when passed through a firewall.

  • Issues that arise when you run DLSw on Multilayer Switch Feature Card (MSFC) or Multilayer Switch Feature Card 2 (MSFC2).

  • Direct LAN connections of 802.1q trunks into DLSw+.

DLSw+ Circuit Establishment States

[Figure: DLSw+ circuit establishment states (dlswts4_a.gif)]

Note: The most common cause for circuits to become stuck in the CKT_ESTABLISHED state is an inactive host Virtual Telecommunications Access Method (VTAM) Switched Major Node.

Circuit Start

Circuit start (CKT_START) is a transient state. It indicates that a CANUREACH_CS message (sent for a null Exchange Identification [XID]) is outstanding and has not yet been answered by an ICANREACH_CS message. A circuit that is stuck in the CKT_START state indicates an internal problem with the DLSw peer routers: a MAC or Service Access Point (SAP) pair is not being cleaned up, or a resource that is needed to complete the state transition (for example, memory) is not available.

To troubleshoot a CKT_START problem, verify that the test poll and the null XID have both reached the peer partners, and verify that the peer partners have responded successfully. You should understand the network topology to the host; the host is typically reached either through a Front End Processor (FEP) or through a channel attachment on a Channel Interface Processor (CIP) card in a 7xxx router.

For FEP connections, verify that the router's interface to the FEP is up and is working correctly. Ask the network operator to display (or display for yourself) the relevant LINE and Physical Unit (PU) definitions on the FEP, and verify that they are active. Verify that the Switched Major Node, for which the PU acts as a placeholder, is active.

If you are using a CIP card and you have verified connectivity to the host, then there could be a problem with the VTAM External Communications Adapter (XCA) Major Node. These are the most typical problems:

  • The XCA Major Node is not in an active state.

  • The path outwards from VTAM (called the Channel Unit Address) is not online or is boxed within the channel subsystem.

Verify that you have free logical lines available underneath the XCA Major Node, for which VTAM CONNECT-IN can allocate a PU. In later versions of CIP microcode (CIP22.38, CIP24.15, CIP25.14, CIP26.10, and CIP27.4), the CIP adapter does not respond to test polls if no more logical lines are available.

Issue the show extended channel x/2 max-llc2-sessions command to verify that the maximum number of Logical Link Control (LLC) sessions has not been reached. The default is 256.

There could also be a problem with the SAP values in use. The CIP adapter listens to unique SAPs. All of the internal CIP adapters must be defined to VTAM in XCA Major Node definitions. The Adapter Number (ADAPNO) value on the XCA Major Node is used by VTAM as a reference to an internal adapter in the router. Each internal adapter configured on a CIP must have a unique ADAPNO for each media type. The XCA Major Node definition is where you configure which SAPs to open for each internal adapter.

The test poll and null XID verify that the XCA Major Node and the CIP adapter are listening to the correct SAP. If the CIP MAC adapter is open and has at least one SAP open, then it responds to tests without forwarding them to VTAM. Test frames are sent with DSAP 04 and SSAP 00. Verify the SAP values used between the end station, the CIP router, and the XCA Major Node with these commands:

NCCF     TME 10 NetView   CNM01 OPER6   03/31/00 13:56:01
C CNM01  DISPLAY NET,ID=DKAPPN,SCOPE=ALL
  CNM01  IST097I  DISPLAY  ACCEPTED
' CNM01
IST075I  NAME= DKAPPN , TYPE= XCA MAJOR NODE
IST486I  STATUS= ACTIV , DESIRED STATE= ACTIV
IST1021I MEDIUM=RING , ADAPTNO=1 , CUA=0401 , SNA SAP=4
IST654I  I/O TRACE= OFF, BUFFER TRACE= OFF
IST1656I VTAMTOPO= REPORT, NODE REPORTED= YES
IST170I  LINES:
IST232I  L0401000 ACTIV
IST232I  L0401001 ACTIV
IST232I  L0401002 ACTIV
IST232I  L0401003 ACTIV
IST232I  L0401004 ACTIV
IST232I  L0401005 ACTIV
IST232I  L0401006 ACTIV
IST232I  L0401007 ACTIV
IST232I  L0401008 ACTIV
IST232I  L0401009 ACTIV
IST232I  L040100A ACTIV
IST232I  L040100B ACTIV
IST232I  L040100C ACTIV
IST232I  L040100D ACTIV
IST232I  L040100E ACTIV
IST232I  L040100F ACTIV
IST314I  END

router# show dlsw circuits detail

Index  local addr (lsap)    remote addr (dsap)    state    uptime
194    0800.5a9b.b3b2 (04)  0800.5ac1.302d (04) CONNECTED  00:00:13
       PCEP: 995AA4         UCEP: A52274
       Port: To0/0          peer  172.18.15.166 (2065)
       Flow-Control-Tx SQ CW: 20, permitted: 28; Rx CW: 22, Granted: 25 Op: IWO
       Congestion: LOW(02) , Flow OP: Half: 12/5 Reset 1/0
       RIF = 0680.0011.0640

Use these output examples and notes to help verify the XCA Major Node definitions:

NCCF     TME 10 NetView   CNM01 OPER6   03/31/00 13:56:01
C CNM01  DISPLAY NET,ID=DKAPPN,SCOPE=ALL

!--- NetView takes the DIS DKAPPN short form and converts
!--- it into the full D NET,ID=DKAPPN,SCOPE=ALL command.

  CNM01  IST097I  DISPLAY  ACCEPTED
' CNM01
IST075I  NAME= DKAPPN , TYPE= XCA MAJOR NODE

!--- Check that the XCA Major Node name is correct and that
!--- it is, in fact, an XCA MAJOR NODE.

IST486I  STATUS= ACTIV , DESIRED STATE= ACTIV

!--- Verify that the XCA Major Node is in an ACTIV status.
!--- Any other status is an error condition (see the comment after
!--- the Local Line for information about how to correct this error).

IST1021I MEDIUM=RING , ADAPTNO=1 , CUA=0401 , SNA SAP=4

!--- Verify that the Adapter Number is correct and matches the
!--- number used in the CIP definitions on the router.

!--- Also, verify that the Channel Unit Address (CUA) is correct.
!--- Issue the next command (below) to verify that it is either
!--- in status online (O) or, if in use, in status allocated (A).

!--- Finally, verify that the SAP number that is configured on
!--- the XCA Major Node matches the SAP number that is configured
!--- in the ADAPTER statement in the CIP router definition.

IST654I  I/O TRACE= OFF, BUFFER TRACE= OFF
IST1656I VTAMTOPO= REPORT, NODE REPORTED= YES
IST170I  LINES:
IST232I  L0401000 ACTIV

!--- Verify that the Logical Line is in an ACTIV status.
!--- Any other status is an error condition.
!--- Contact either the System Programmer or Network Operator to
!--- CYCLE, INACT then ACT, or take other action to get both the
!--- Local Line and the XCA Major Node into ACTIV status.

IST232I  L0401001 ACTIV
IST232I  L0401002 ACTIV
IST232I  L0401003 ACTIV
IST232I  L0401004 ACTIV
IST232I  L0401005 ACTIV
IST232I  L0401006 ACTIV
IST232I  L0401007 ACTIV
IST232I  L0401008 ACTIV
IST232I  L0401009 ACTIV
IST232I  L040100A ACTIV
IST232I  L040100B ACTIV
IST232I  L040100C ACTIV
IST232I  L040100D ACTIV
IST232I  L040100E ACTIV
IST232I  L040100F ACTIV

!--- Verify that you have free Logical Lines left for the VTAM
!--- CONNECTIN to allocate a PU.

IST314I  END
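
If you capture the DISPLAY NET output to a text file, the checks in the notes above can be automated. The following Python sketch is illustrative only: the function name is hypothetical, and the patterns are based solely on the IST486I, IST1021I, and IST232I lines in the sample output, so other VTAM releases might format these messages differently.

```python
import re


def summarize_xca_display(text):
    """Extract the XCA Major Node status, adapter number, SAP, and line states."""
    status = None
    adapter = None
    sap = None
    lines = {}
    for raw in text.splitlines():
        m = re.search(r"IST486I\s+STATUS\s*=\s*([A-Z/]+)", raw)
        if m:
            status = m.group(1)
        m = re.search(r"IST1021I.*?ADAPTNO\s*=\s*(\d+).*?SNA SAP\s*=\s*(\d+)", raw)
        if m:
            adapter, sap = int(m.group(1)), int(m.group(2))
        m = re.search(r"IST232I\s+(\S+)\s+([A-Z/]+)", raw)
        if m:
            lines[m.group(1)] = m.group(2)
    return {
        "status": status,
        "adapter": adapter,
        "sap": sap,
        "line_count": len(lines),
        # Any logical line that is not ACTIV is an error condition.
        "not_active": sorted(n for n, s in lines.items() if s != "ACTIV"),
    }
```

A result with status ACTIV and an empty not_active list corresponds to the healthy display shown above. The sketch does not determine how many logical lines are still free; that check still requires operator review.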

From the NetView prompt, issue the mvs d u,,,xxx,2 command, where xxx is the Channel Unit Address. This confirms that the CUA is in either online (O) or allocated (A) status:

NCCF     TME 10 NetView   CNM01 OPER6   03/31/00 16:08:27
* CNM01  MVS D U,,,401,2
" CNM01
IEE457I 16.07.29 UNIT STATUS 076
UNIT TYPE STATUS     VOLSER    VOLSTATE
0401 CTC  A
0402 CTC  A-BSY

This is a sample CIP configuration that shows the Virtual Interface, CIP VLAN, source-bridge statements, and the internal adapter number that matches the ADAPNO on the XCA Major Node; CIP assumes LSAP=04 from the XCA Major Node:


!--- Sample CIP configuration.

interface Channel4/2
 lan TokenRing 0
 source-bridge 88 1 100
 adapter 1 4000.7507.ffff

!--- Sample XCA Major Node configuration.

   VBUILD TYPE=XCA
*
APPNPRT PORT ADAPNO=1,
        CUADDR=401,        DEFAULT TABLE ENTRY
        MEDIUM=RING,       MODE TABLE FOR MODEL 3
        SAPADDR=4,         3270 DISPLAY TERMINAL

!--- This is the SAP number to which the XCA Major Node listens.
!--- If this value does not match with your end stations, then
!--- their XIDs will not receive responses.

        TIMER=20
*
APPNGRP GROUP DIAL=YES,    CU ADDRESS  PORT A01
        ANSWER=ON,         DEFAULT TABLE ENTRY
        DYNPU=YES,         MODE TABLE FOR MODEL 4
        AUTOGEN=(16,L,P),  INITIAL ACTIVE

!--- This automatically generates 16 Logical Lines, starting
!--- with the letter L, and generates 16 PUs, starting with
!--- the letter P.
!--- This can be seen in the previous DISPLAY NET output.

        CALL=INOUT         3270 DISPLAY TERMINAL

Circuit Established

A CKT_ESTABLISHED state indicates that the routers have set up the circuit successfully, but the end stations have not yet initiated their session across that circuit. Examine the Logical Link Control, type 2 (LLC2) session that has been established, to verify that this is the case.

router# show llc2

LLC2 Connections: total of 3 connections
Virtual-TokenRing0 DTE: 4000.7507.ffff 4000.7507.0099 04 04 state NORMAL

!--- Virtual-TokenRing0 is the name of the interface on which the session
!--- is established.
!--- 4000.7507.ffff and 4000.7507.0099 are the source and destination MAC
!--- addresses. This is the address of the interface on which the connection
!--- is established.
!--- NORMAL indicates that the current state of the LLC2 session is fully
!--- established and that normal communication is occurring.

 V(S)=15, V(R)=15, Last N(R)=15, Local window=7, Remote Window=127
 akmax=3, n2=10,
 xid-retry timer  0/0     ack timer   0/1000
 p timer          0/1000  idle timer  1220/10000
 rej timer        0/3200  busy timer  0/9600
 akdelay timer    0/100   txQ count   0/200
 RIF: 0830.0141.0641.0580

Circuits in this state can indicate a number of problems, such as problems with XID exchanges or devices not being varied on in VTAM. In Fast Sequenced Transport (FST) peers (or direct encapsulation peers that are not using local acknowledgement), the session is not locally terminated. The Routing Information Field (RIF), for Token Ring, is terminated, but the session is completely pass-through. As such, you do not see circuits established for sessions across DLSw+ FST or direct peers (other than Frame Relay local-ack). Another common problem with XID exchange is having the wrong IDBLK/IDNUM or CPNAME values.

NCCF     TME 10 NetView   CNM01 OPER6   03/31/00 13:59:43
C CNM01  DISPLAY NET,ID=DKTN3270,SCOPE=ALL

!--- NetView takes the DIS DKTN3270 short form and converts
!--- it into the full D NET,ID=DKTN3270,SCOPE=ALL command.

  CNM01  IST097I DISPLAY ACCEPTED
' CNM01
IST075I NAME = DKTN3270    , TYPE = SW SNA MAJOR NODE
IST486I STATUS = ACTIV     , DESIRED STATE = ACTIV
IST1656I VTAMTOPO = REPORT , NODE REPORTED - YES
IST084I NETWORK RESOURCES:
IST089I DK3270DY TYPE = PU_T2.1       , ACTIV

!--- Verify that the PU is in ACTIV state.
!--- If the PU is in INACT or INOP status, then ask the System Programmer or
!--- Network Operator to activate it.
!--- If the PU is in CONNECT status, then you could have a definition error.
!--- Ask the System Programmer to verify the Switched Major Node definition.
!--- If the PU is in ACTIV status and you still cannot establish a session,
!--- then verify that another end station is not using the same PU.

IST089I DKDYLU0A TYPE = LOGICAL UNIT  , ACTIV---X-
IST089I DKDYLU0B TYPE = LOGICAL UNIT  , ACT/S---X-
IST089I DKDYLU1A TYPE = LOGICAL UNIT  , ACTIV---X-
IST089I DKDYLU19 TYPE = LOGICAL UNIT  , ACT/S---X-
IST089I DKDYLU18 TYPE = LOGICAL UNIT  , ACT/S---X-
IST089I DKDYLU17 TYPE = LOGICAL UNIT  , ACT/S---X-
IST089I DKDYLU16 TYPE = LOGICAL UNIT  , ACT/S---X-
IST089I DKDYLU15 TYPE = LOGICAL UNIT  , ACT/S---X-
IST089I DKDYLU09 TYPE = LOGICAL UNIT  , ACTIV---X-
IST089I DKDYLU08 TYPE = LOGICAL UNIT  , ACTIV---X-
IST089I DKDYLU07 TYPE = LOGICAL UNIT  , ACTIV---X-
IST089I DKDYLU06 TYPE = LOGICAL UNIT  , ACTIV---X-
IST089I DKDYLU05 TYPE = LOGICAL UNIT  , ACTIV---X-
IST089I DKDYLU04 TYPE = LOGICAL UNIT  , ACTIV---X-
IST089I DKDYLU03 TYPE = LOGICAL UNIT  , ACTIV---X-
IST089I DKDYLU02 TYPE = LOGICAL UNIT  , ACTIV---X-
IST089I DKDYLU01 TYPE = LOGICAL UNIT  , ACTIV---X-
IST089I DK3270ST TYPE = PU_T2         , CONCT
IST089I DKSTLU01 TYPE = LOGICAL UNIT  , CONCT
IST089I DKSTLU02 TYPE = LOGICAL UNIT  , CONCT
IST089I DKSTLU03 TYPE = LOGICAL UNIT  , CONCT
IST089I DKSTLU04 TYPE = LOGICAL UNIT  , CONCT
IST089I DKSTLU05 TYPE = LOGICAL UNIT  , CONCT
IST089I DKSTLU06 TYPE = LOGICAL UNIT  , CONCT
IST089I DKSTLU07 TYPE = LOGICAL UNIT  , CONCT
IST089I DKSTLU08 TYPE = LOGICAL UNIT  , CONCT
IST089I DKSTLU09 TYPE = LOGICAL UNIT  , CONCT
IST089I DKDLUR32 TYPE = PU_T2.1       , ACTIV--L--
IST089I DKDLDYPU TYPE = PU_T2.1       , ACTIV
IST089I DKDLSTPU TYPE = PU_T2.1       , ACTIV
IST089I DKDLST01 TYPE = LOGICAL UNIT  , ACTIV
IST089I DKDLST02 TYPE = LOGICAL UNIT  , ACTIV

!--- Sample Switched Major Node definition.

   VBUILD TYPE=SWNET
*
* TN3270 DYNAMIC LU BUILD
*
DK3270DY PU  ADDR=01,
         IDBLK=05D,
         IDNUM=03270,

!--- Verify that the end station is using the correct IDBLK and IDNUM values.

         PUTYPE=2,
         LUGROUP=BXLLUGRP,LUSEED=DKDYLU##
*        LUGROUP=BXLLUGRP,LUSEED=DKDYLU##
*
*
* TN3270 CP DEF FOR DLUR EN ON CIP
*
DKDLUR32 PU  ADDR=01,
         CPNAME=DK3270CP,

!--- Verify that the end station is using the correct CPNAME value.

         ISTATUS=ACTIVE,
         PUTYPE=2,
         CPCP=YES,
         NETID=NETA
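
The IDBLK and IDNUM values together form the 8-hexadecimal-digit node ID that the end station sends in its XID: the 3-digit block number followed by the 5-digit ID number. The following Python sketch, with hypothetical function names, builds that value from the Switched Major Node definition so that you can compare it with the XID configured on the end station.

```python
import string


def xid_node_id(idblk, idnum):
    """Combine IDBLK (3 hex digits) and IDNUM (5 hex digits) into one node ID."""
    idblk = idblk.strip().upper()
    idnum = idnum.strip().upper()
    if len(idblk) != 3 or len(idnum) != 5:
        raise ValueError("IDBLK must be 3 hex digits and IDNUM 5 hex digits")
    if not all(ch in string.hexdigits for ch in idblk + idnum):
        raise ValueError("IDBLK and IDNUM must contain only hex digits")
    return idblk + idnum


def station_matches(station_xid, idblk, idnum):
    """True if the end station XID equals the IDBLK/IDNUM defined in VTAM."""
    return station_xid.strip().upper() == xid_node_id(idblk, idnum)
```

With the DK3270DY definition above (IDBLK=05D, IDNUM=03270), the end station is expected to send 05D03270.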

Connected

The CONNECTED state is the normal condition when a DLSw circuit is successfully connected.
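
The checks for the three states covered in this section can be summarized as a simple lookup. The Python sketch below is only a restatement of the steps in this document; the dictionary and function names are hypothetical, and states that are not discussed here are reported as not covered.

```python
# Checks taken from the Circuit Start, Circuit Established, and Connected
# sections of this document, keyed by the state shown in show dlsw circuits.
NEXT_CHECKS = {
    "CKT_START": [
        "Verify that the test poll and null XID reached the peer partner and were answered",
        "FEP: verify that the router interface, LINE, and PU definitions are active",
        "CIP: verify that the XCA Major Node is ACTIV and the CUA is online or allocated",
        "CIP: verify free logical lines and the max-llc2-sessions limit",
        "Verify that the SAP values match between end station, CIP, and XCA Major Node",
    ],
    "CKT_ESTABLISHED": [
        "Examine the LLC2 session with show llc2",
        "Verify that the Switched Major Node and PU are ACTIV in VTAM",
        "Verify the IDBLK/IDNUM or CPNAME values used in the XID exchange",
    ],
    "CONNECTED": [],  # normal condition, nothing to check
}


def next_checks(state):
    """Return the list of checks for a circuit state (case-insensitive)."""
    return NEXT_CHECKS.get(state.strip().upper(), ["State not covered in this document"])
```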

When you are troubleshooting DLSw circuit status problems, issue the show dlsw circuits privileged EXEC command:

show dlsw circuits [detail] [mac-address address | sap-value value | circuit id]

  • detail: (Optional) Displays circuit state information in expanded format.

  • mac-address address: (Optional) Specifies the MAC address to be used in the circuit search.

  • sap-value value: (Optional) Specifies the SAP to be used in the circuit search.

  • circuit id: (Optional) Specifies the circuit ID of the circuit index.

Refer to DLSw+ Configuration Commands and to the next diagram, to understand the output from this command.

[Figure: fields in the show dlsw circuits command output (dlswts4_b.gif)]

Common DLSw Issues

BADSSPHDR Error Messages

This error message may appear on some DLSw routers:

%DLSWC-3-BADSSPHDR: bad ssp hdr in proc ssp - received remote correlator from
different peer  = 0x200004B

-Traceback= 606FCD68 606FD008 606ED364 606F2B2C 6026B118 601F6438 601CAA10
6020F6B0 6020E350 6020E484 601B3048 601B3034
Nov 23 06:10:33: %DLSWC-3-RECVSSP: SSP OP = 4( ICR ) received from peer x.x.x.x(2065)
Nov 23 06:10:33: %DLSWC-3-RECVSSP: SSP OP = 4( ICR ) expected from peer y.y.y.y(2065)

!--- Where x.x.x.x and y.y.y.y are two different remote DLSw peers.

These messages are informational; this section explains why they can appear.

During address resolution (CANUREACH_EX), a router can get multiple responses back (ICANREACH_EX). The router that initiated the address resolution will cache all of the responses at the time of circuit bring-up. The originating router will send a directed CANUREACH message to one of the remote routers that responded during address resolution. The originating router runs a timer, to wait for an ICANREACH. If the ICANREACH is not received before the timeout, then the originating router sends another directed CANUREACH to one of the other remote routers that responded during address resolution. If, for some reason such as congestion or slow links, the ICANREACH from the first remote router arrives after the ICANREACH from the second remote router, you get the aforementioned error messages. The router receives an ICANREACH from IP address x.x.x.x, but it expected the ICANREACH from IP address y.y.y.y. If there are no connectivity problems, then these messages are displayed for informational purposes only; DLSw is considered to be working as designed. Refer to Cisco bug ID CSCdp50163 (registered customers only) for more information.
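
The timing race described above can be illustrated with a small Python simulation. This is a simplified model under stated assumptions, not the Cisco IOS implementation: the function name, the peer labels, the delay values, and the timeout are all hypothetical, and the model assumes that the originating router always expects the ICANREACH from the peer to which it most recently sent a directed CANUREACH.

```python
def simulate_icanreach_race(peers, delay, timeout):
    """Model directed CANUREACH retries and classify each ICANREACH arrival.

    peers   -- cached responders, in the order in which they are tried
    delay   -- dict: peer -> seconds between CANUREACH sent and ICANREACH received
    timeout -- seconds the originator waits before it tries the next peer
    Returns a list of (arrival_time, "accepted" | "mismatch", from_peer, expected_peer).
    """
    sends = []  # (send_time, peer)
    t = 0.0
    for peer in peers:
        sends.append((t, peer))
        if delay[peer] <= timeout:
            break  # answered in time, so no further CANUREACH is sent
        t += timeout

    arrivals = sorted((sent + delay[peer], peer) for sent, peer in sends)
    events = []
    for when, peer in arrivals:
        # The expected peer is the last one that was sent a CANUREACH by now.
        expected = [p for sent, p in sends if sent <= when][-1]
        kind = "accepted" if peer == expected else "mismatch"
        events.append((when, kind, peer, expected))
    return events
```

For example, with peers x.x.x.x and y.y.y.y, a 3-second timeout, a 5-second delay for x.x.x.x, and a 1-second delay for y.y.y.y, the model accepts the ICANREACH from y.y.y.y at 4 seconds and then flags the late ICANREACH from x.x.x.x at 5 seconds as a mismatch. That mismatch corresponds to the RECVSSP "received from peer x.x.x.x" and "expected from peer y.y.y.y" messages.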

If, however, the DLSw network is experiencing connectivity problems, then the messages should be taken seriously and further investigation is required. Look for significant WAN delays, periodic DLSw peer timeouts in the network, or both. Additionally, determine if Network Address Translation (NAT) is used between the peers, because that might cause the connectivity problem. It might be worthwhile to turn off User Datagram Protocol (UDP) explorers, to see if these error messages cease: issue the dlsw udp-disable command, first introduced in Cisco IOS Software Release 11.2 F. If the messages do not cease, then a WAN trace of the Transmission Control Protocol (TCP) flows between the peers would be most helpful.

Note: The aforementioned error messages were also improperly reported in Cisco IOS Software Releases earlier than 11.2. Therefore, it is important that you run a release later than 11.2.

DLSw Version 2 and Firewalls

With the introduction of the Cisco DLSw UDP unicast feature in Cisco IOS Software Release 11.2(6)F, explorer frames and unnumbered information frames are sent via UDP unicast rather than TCP. Before DLSw version 2, this unicast feature required that a TCP connection existed before packets were sent via UDP. DLSw version 2, however, sends UDP/IP multicast and unicast before the TCP connection exists. Address resolution packets (such as CANUREACH_EX, NETBIOS_NQ_ex, and so forth) use multicast service, but the responses (ICANREACH_ex and NAME_RECOGNIZED_ex) are sent back via UDP unicast.

In a typical scenario, a firewall is set up between the DLSw peers, so the DLSw circuits have to be established through the firewall. RFC 2166 (DLSw v2.0 Enhancements) states that the UDP source port can be any value. Cisco DLSw routers use source port 0. This presents a problem when DLSw circuits are passed through firewalls, which are typically set up to filter out port 0, and the DLSw circuits fail to connect. The workaround is to configure the dlsw udp-disable global configuration command. If the dlsw udp-disable command is configured, then DLSw does not send packets via UDP unicast, and it does not advertise UDP unicast support in its capabilities exchange message.
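
When you audit several routers that peer through a firewall, you can confirm from a saved configuration that the workaround is in place. The Python sketch below is a minimal example with a hypothetical function name; it assumes that the configuration text is a saved copy of the running configuration and only looks for the exact dlsw udp-disable line.

```python
def udp_unicast_disabled(running_config):
    """True if the saved configuration contains the dlsw udp-disable command."""
    for line in running_config.splitlines():
        # An exact match avoids a false positive on "no dlsw udp-disable".
        if line.strip().lower() == "dlsw udp-disable":
            return True
    return False
```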

For more information, refer to UDP/IP Multicast Service and Understanding the DLSw+ Introduction of the UDP Unicast Feature.

MSFC and DLSw Issues

There can be numerous issues when you run DLSw on a Multilayer Switch Feature Card (MSFC) or a Multilayer Switch Feature Card 2 (MSFC2). For comprehensive information about DLSw and MSFC, refer to DLSw+ and MSFC Frequently Asked Questions.

802.1q Trunks into DLSw+

The LLC2 from 802.1q encapsulated trunks into DLSw is first supported with DLSw TCP peers and transparent bridging by means of Cisco bug ID CSCdv26715 (registered customers only). With Cisco IOS Software Release 12.2(6) and later, 802.1q and DLSw work together.

Additionally, by means of these DDTS reports, support for DLSw Ethernet redundancy and dot1Q encapsulation with native VLAN is made available. Refer to the Release-notes and the First Fixed-in Version fields of these DDTS reports:

  • Cisco bug ID CSCdv26715 (registered customers only): Brings the support for 802.1q into DLSw with TCP encapsulation only.

  • Cisco bug ID CSCdy09469 (registered customers only): Corrects the defect where DLSw does not work when the LAN interface is a FastEthernet interface that is configured for 802.1q encapsulation and native VLAN:

    interface FastEthernet0/0.500
         encapsulation dot1Q 500 native
         bridge-group 1

  • Cisco bug ID CSCdw65810 (registered customers only): Fixes the usage of DLSw Ethernet redundancy and 802.1q encapsulated trunks. There is still no support for DLSw FST with 802.1q.

If you run Cisco IOS Software Release 12.2(13.4) or later and use DLSw with TCP encapsulation, then DLSw Ethernet redundancy supports the LLC2 from 802.1q encapsulated trunks with or without the native keyword.
