
IBM Networking

Troubleshooting SNA Switching Services: A Guide To Problem Resolution

WHITE PAPER

SNASw TROUBLESHOOTING OVERVIEW

SNA Switching Services (SNASw), like many network technologies that perform a protocol integration function, has characteristics that can make problem resolution difficult. Because the product operates in both an SNA environment and an IP environment, it is sometimes necessary to gather information from several sources to debug a particular problem. A number of good tools are available to help diagnose and resolve problems you might encounter. However, it is important to understand the specific nature and scope of the problem so that you can determine where to do your analysis.
This guide is designed to provide a strategy for dealing with problems related to SNASw, to explain the tools available to help diagnose problems, and to provide specific actions to take for various types of problems you may encounter. It is in no way meant to replace the IOS Command Reference and Configuration Guides, which should be used to find out more about the snasw commands referenced in this document.

OVERVIEW OF SNASw ENVIRONMENT

SNASw is the Cisco® recommended solution for supporting SNA-related devices and traffic in an IP-based network. SNASw is available as a component of the Cisco IOS® Software and is supported on a wide variety of router platforms.
SNASw implements an Advanced Peer-to-Peer Networking (APPN) branch network node (BrNN). As such, it appears as a network node (NN) to the downstream devices that connect through the node and as an end node (EN) to other upstream APPN nodes. SNASw also supports the Enterprise Extender (EE) transport option that transmits SNA traffic natively using IP/User Datagram Protocol (UDP). With EE, flow control and Layer 4 connection functions are handled by a protocol known as High Performance Routing (HPR)/IP.
The very nature of the software and its ability to support both SNA and IP transport protocols facilitates network designs that adhere to one of two main topologies: SNASw remote (see Figure 1) and SNASw with Data-Link Switching Plus (DLSw+) (see Figure 2).

Figure 1: SNASw Remote

Figure 2: SNASw with DLSw+

There are various connection segments within the network, and this is where problems are likely to be seen and where diagnosis takes place. The type of problem dictates the information that needs to be gathered and from which points in the network information should be obtained. For example, Figure 1 shows a network with HPR/IP as the upstream connection from SNASw. In connection segment 1, a downstream physical unit (PU type 2 or 2.1) connects to an interface on the router using Logical Link Control, Type 2 (LLC2) or Synchronous Data Link Control (SDLC) protocol. Figure 2 shows a network that uses DLSw+ to transport SNA traffic over an IP network. SNASw is used at the data center to provide enhanced redundancy to the mainframe Parallel Sysplex environment. In connection segment 3, the upstream connection from SNASw can be HPR/IP or SNA HPR using LLC2. In connection segment 2, DLSw+ peering connects data center and branch routers. As before, downstream devices connect using LLC2 or SDLC in connection segment 1.
A number of general actions take place in the network to provide data transport services for downstream devices. Figure 3 depicts a typical network with SNASw deployed at the branch using Enterprise Extender (HPR/IP) as the transport protocol for SNA traffic. In this environment, several actions permit connectivity and subsequent data transfer to occur. First, it is common to define the primary NN server and backup NN server as upstream links from SNASw. One link must be in an Active state with a control point (CP)-to-CP session for subsequent connectivity to occur. If the links are not active, then follow the troubleshooting steps outlined in the section Uplink Fails to Connect. Downstream devices also connect to SNASw.

Figure 3: High-Level Data Flows for SNASw

For example, Figure 3 shows the high-level data flows that occur in SNASw as a series of steps. In step 1, a downstream PU 2 establishes an LLC2 connection by initiating a TEST request to a Media Access Control (MAC) address locally defined to SNASw, followed by an eXchange IDentifier (XID). In step 2, SNASw, acting as the Dependent Logical Unit Requestor (DLUR), establishes a DLUR/Dependent LU Server (DLUS) path with the DLUS. SNASw passes a REQACTPU (containing the XID from the downstream PU) over the DLUR/DLUS pipe and the downstream PU and LUs are then activated (System Services Control Point [SSCP]-to-PU/LU sessions via ACTPU/LU from the DLUS). An end user (LU) then requests a session with an application LU, named APPLA, which resides on the APPN EN host. In step 3, SNASw passes the INITSELF or USSLOGON to the DLUS over the DLUR/DLUS pipe. In step 4, the NN server informs the application of the session request and provides a route to the SNASw BrNN that handles the LU. Because a connection network has been defined, in step 5, the EN host is able to establish a direct link to the SNASw router over IP. The LU-to-LU session is then activated and data transfer begins.
This is a simplified description of the connectivity actions required to establish data transfer in this environment. When troubleshooting, try to determine the specific sequence and point in the flow where your problem occurs, and then start your debug activities targeting that particular point.

TROUBLESHOOTING BASICS

Internetworking of IP and SNA can be complex and this particular environment imposes some difficulty because of the historical differences inherent in TCP/IP and SNA networks and the skills required in each area. When faced with a problem that requires detailed analysis, using a standard troubleshooting methodology reduces the time it takes to isolate and resolve the problem.

Troubleshooting Methodology Overview

It is important that you approach any internetworking problem using a problem-solving model. First, establish a clear understanding and definition of the problem. Next, gather all relevant information using the tools and techniques described in this document. After analyzing the information, create an action plan to address the likely cause of the problem. If the symptoms are not resolved, try another action plan or gather additional information that might lead to another conclusion.
The troubleshooting methodology adopted in this document follows these general steps:

Diagram the problem: Begin with a detailed diagram of the network. Maintaining accurate diagrams of the physical and logical components and their relationships is important to ensure continued operation and availability. It helps to further illustrate the components involved in the data path specific to the problem at hand.

Isolate the problem: Gather detailed information about the problem. This includes configurations, protocols, data paths, and historical performance data. Determine the starting point as well as fault isolation procedures.

Correct the problem: Make appropriate hardware, software, or configuration changes to correct the problem.

Verify that the trouble is corrected: Perform operational tests to verify that the trouble is corrected.

The troubleshooting steps presented in Troubleshooting SNASw Operational Problems and the example scenarios presented in the appendix, Diagnostic Output Examples, generally follow this methodology by listing typical symptoms and providing associated diagnostic measures.

Tools

SNASw was designed with support in mind. The product includes trace analysis and debugging tools to help you and Cisco diagnose any problem that might be encountered. These tools, in conjunction with various show snasw commands, enable problem diagnosis, isolation, and resolution of most problems. At times, additional trace and debug information, such as that available from IBM hosts or LAN analyzers, is required.

snasw pdlog

SNASw contains its own problem determination logging facility known as the pdlog. This is a cyclic buffer that provides detailed information on recent state transitions, traffic, and events for SNASw. The pdlog is always enabled (it cannot be turned off), but the size of the buffer and the type of abbreviated pdlog messages written to the router log are controlled with the snasw pdlog command:
snasw pdlog [problem | exception | info] [buffer-size buffer-size-value] [file filename timestamp]
All detailed pdlog records (problem, exception, and informational) are written to the internal pdlog buffer whether snasw pdlog is configured or not. However, the pdlog configuration command determines which level of associated pdlog messages are written to the router log. If not configured, the default is exception, which means that only problem and exception pdlog messages will be seen in the router log.
The buffer-size keyword determines the size of the pdlog buffer (in processor memory). With Cisco IOS Releases 12.1 and 12.2, the maximum buffer-size is 16,000 KB. With Cisco IOS Release 12.3 and above, the maximum buffer-size is 64,000 KB. If not coded, the default is a 500-KB buffer.
You can display information from the pdlog buffer at the router console using the show snasw pdlog command:
show snasw pdlog [brief | detail] [all] [last] [next] [filter filterstring] [id recordid]
You can also copy the entire pdlog file to a file server or flash using the snasw dump command (which is explained later in this document).

Note: The snasw pdlog is a very useful tool, and should be one of the first places you look when diagnosing a problem. The pdlog messages you see in the router log have an identifier that can be used to examine the detailed entry in the pdlog cyclic buffer. This detailed entry often contains resource names, session identifiers, sense codes, and so on that can lead you directly to the next step in resolving the problem.

snasw dlctrace

The snasw dlctrace command traces frames arriving and leaving the SNASw stack within IOS:
snasw dlctrace [buffer-size buffer-size-value] [file filename [timestamp]] [frame-size frame-size-value | auto-terse] [format [brief | detail | analyzer]] [nostart]
This trace facility is designed for use by network support personnel to troubleshoot connectivity problems. The trace can be stopped and started using the snasw stop|start dlctrace command.
The buffer-size keyword determines the size of the dlctrace buffer (in processor memory). With Cisco IOS Releases 12.1 and 12.2, the maximum buffer-size is 16,000 KB. With Cisco IOS Release 12.3 and above, the maximum buffer-size is 64,000 KB. The larger the buffer you configure, the better the chance that important trace records will not be lost. If not coded, the default is a 500-KB buffer.
Unless a problem requires you to see application data in the trace, it is recommended that you configure frame-size auto-terse. This trims all application data from the trace records, allowing more trace records to fit within the cyclic buffer. You can also filter which records are written to the buffer (thus allowing a longer duration of trace) by using the snasw dlcfilter configuration command.
You can use the show snasw dlctrace command to examine the dlctrace records (they are printed to the router log), but it is often easier to copy the dlctrace file to a file server (see the snasw dump command later in this document).
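As a minimal sketch of the recommendations above, the following configuration enables the trace with a larger buffer, a dump destination, and application data trimmed from the records. The buffer size of 10000 KB and the TFTP host, path, and file name are placeholder values chosen for illustration; adjust them for your environment.

```
! Global configuration (illustrative values only)
! myhost/path/dlctracefilename is a placeholder destination
snasw dlctrace buffer-size 10000 file tftp://myhost/path/dlctracefilename timestamp frame-size auto-terse
```

With this in place, the trace can be paused and resumed with snasw stop dlctrace and snasw start dlctrace, and copied to the configured destination with snasw dump dlctrace.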

Note: The snasw dlctrace is a very powerful tool. Because it has very little overhead (between 2 and 8 percent), Cisco recommends that it be enabled when testing new implementations. Some customers also choose to leave it enabled in production to facilitate collecting documentation in the event that a problem is encountered.

snasw ipstrace

The snasw ipstrace command, or interprocess signal trace, is used for debugging internal SNASw software problems. It copies internal signal flows into a cyclic memory buffer, which can affect router performance by as much as 20 percent. Therefore, you should use this command only as directed by Cisco service personnel:
snasw ipstrace [buffer-size buffer-size-value] [file filename timestamp]
The impact on performance can be reduced by configuring the snasw ipsfilter command prior to enabling the snasw ipstrace command. This allows you to specify only those internal components identified by Cisco personnel as being related to the problem at hand.
The buffer-size keyword determines the size of the ipstrace buffer (in processor memory). With Cisco IOS Releases 12.1 and 12.2, the maximum buffer-size is 16,000 KB. With Cisco IOS Release 12.3 and above, the maximum buffer-size is 64,000 KB. The larger the buffer you configure, the better the chance that important trace records will not be lost. If not coded, the default is a 500-KB buffer.
After the trace has been captured, it can be copied to a server using the snasw dump command covered in the next section of this document. This enables the Cisco engineer to perform additional processing of the data contained in the file. There is also a show snasw ipstrace command, but its use is not recommended except as advised by a Cisco engineer.

snasw summary-ipstrace

snasw summary-ipstrace [buffer-size buffer-size-value] [file filename timestamp]
There is an abbreviated version of the ipstrace called a summary-ipstrace. This trace is always enabled (it cannot be turned off), but the size of the buffer and the file URL are specified with the snasw summary-ipstrace command. Because of its limited information, the summary-ipstrace is rarely used.

snasw dump

snasw dump all | dlctrace | ipstrace | summary-ipstrace | pdlog
The snasw dump command copies the pdlog and dlc, ips, and summary-ips traces from their internal buffers to an external file server or to Flash memory. The command uses the file destinations previously specified using the file keyword on the related snasw dlctrace, snasw ipstrace, and snasw pdlog commands. For example, to copy the pdlog to a Trivial File Transfer Protocol (TFTP) server, you would use the following configuration:
snasw pdlog problem buffer-size 10000 file tftp://myhost/path/pdlogfilename
If no file was specified on the configuration and you issue snasw dump dlctrace or ipstrace or summary-ipstrace or pdlog, then you will be prompted for the file name. However, snasw dump all does not prompt for file names (they must have been previously configured for the command to succeed).
The files that are generated for the pdlog and dlctrace (in format detail) may be up to twice as large as the configured buffer-size. This is because the binary data in the buffer is converted to ASCII text and then written to the file. In Cisco IOS Releases 12.1 and 12.2, buffer-size was limited to 16 MB to keep the dumped ASCII file from exceeding the TFTP maximum size of 32 MB. Beginning with Cisco IOS Release 12.3, SNASw allows buffer-size to be specified to a maximum value of 64 MB. If TFTP is the transport protocol being used, SNASw copies the first 32 MB of ASCII text to the configured file name and then copies subsequent 32-MB files with .01, .02, and so on appended to the name.
The files that are generated for the dlctrace (in format analyzer which is SnifferPro™-compatible) and ipstrace consist of binary data, so their size is the same or less than the configured buffer-size.
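To illustrate the sizing rules above, consider a dlctrace configured at the Release 12.3 maximum and dumped in detail format over TFTP. The host, path, and file name are placeholders, and the file count shown is an upper bound derived from the "up to twice as large" rule, not a measured result.

```
! Illustrative configuration (Cisco IOS Release 12.3 or later)
snasw dlctrace buffer-size 64000 file tftp://myhost/path/dlctracefilename format detail
!
! Worst case after snasw dump dlctrace:
!   64-MB binary buffer x 2 = up to 128 MB of ASCII text
!   128 MB / 32 MB per TFTP file = up to 4 files:
!     dlctracefilename
!     dlctracefilename.01
!     dlctracefilename.02
!     dlctracefilename.03
```

The same buffer dumped in format analyzer would produce binary data no larger than the configured buffer-size.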

snasw msgdump

The snasw msgdump command can be used to enable automatic dumping of the dlctrace, ipstrace, and pdlog files (and optionally to execute a write core command) when a specified SNASw pdlog message is written to the router log:
snasw msgdump pdlog_message_id [writecore]
This can be very helpful to trigger trace information capture following a particular event, and is usually configured at the direction of Cisco support when trying to collect documentation for a service request. This is a one-time-only trigger: you must remove snasw msgdump from the configuration (using the no form of the command) and add it in again to re-enable the automatic dumping.
When using snasw msgdump, it is important that you correctly configure the file keyword on the respective snasw pdlog, snasw dlctrace, and snasw ipstrace commands; otherwise, when the msgdump is triggered, the files will be lost. Also, you may find it helpful to configure the timestamp keyword, which appends the time the file was dumped to the end of the file name. This lets you know when the file was written and prevents the file copy from failing because of a duplicate file name.
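The prerequisites described above can be sketched as a single configuration. This is an illustrative outline only: the buffer sizes, TFTP host, path, and file names are placeholders, and pdlog_message_id stands for whichever pdlog message identifier you have been asked to monitor.

```
! Illustrative values only; myhost/path and the file names are placeholders
snasw pdlog exception buffer-size 10000 file tftp://myhost/path/pdlogfilename timestamp
snasw dlctrace buffer-size 10000 file tftp://myhost/path/dlctracefilename timestamp frame-size auto-terse
! Configure ipstrace only when directed by Cisco service personnel
snasw ipstrace buffer-size 10000 file tftp://myhost/path/ipstracefilename timestamp
!
! One-time trigger: replace pdlog_message_id with the message to monitor
snasw msgdump pdlog_message_id
```

After the trigger fires, remove the snasw msgdump statement with the no form of the command and add it again to re-arm it.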
An enhancement was made to the snasw msgdump processing in Cisco IOS Release 12.3 to add SNA alert support. In order to take advantage of this, SNASw must have an Alert focal point. Use the show snasw node command to see if there is a cpname in the Alert focal point field. If not, and if SNASw has an active Network Node Server (NNS), on the NetView host you can issue the command:
FOCALPT CHANGE,FPCAT=ALERT,TARGET=snasw-cpname | snasw-nns-cpname
When the SNASw router (or its NNS) is in NetView's alert sphere of control, then when a msgdump event is triggered, SNASw will send an MDS-MU alert. The alert will have an identifier of x'DAED5B0B', and will contain the pdlog entry which triggered the msgdump. This informs the network operator that a monitored event has occurred (so they know to retrieve the pdlog/trace files dumped by SNASw), and can trigger host automation to collect VTAM traces or take other appropriate action. This is especially useful when tracking down DLUR/DLUS-related issues.
If writecore is specified, a write core command is attempted whenever the msgdump condition is triggered (in addition to the dumping of the pdlog, dlc, and ips traces). The write core command is issued using the existing configuration parameters: server host, transfer protocol, user name, and password. For the write core command to be successful, the exception dump statement must be configured to specify the destination server. Cisco also recommends that the compress option be used for the core file name in the exception core-file command to save space on the server.
exception dump <host name or address>
exception core-file <core file name> compress
If no exception protocol is configured, the write core operation is attempted using TFTP, and the core file is written under the /tftpboot directory. If FTP is specified as the exception protocol, then the user name and password information must be configured:
ip ftp username <userid>
ip ftp password <password>
exception protocol ftp

Note: Be aware that the write core operation puts a load on the router and may momentarily cause some network disruption. Therefore, the writecore option should be used only at the explicit request of Cisco TAC.

snasw arbdata

The HPR protocol in SNASw utilizes the Adaptive Rate Based (ARB) algorithm to monitor the available bandwidth of the network and obtain the best throughput for HPR connections. When performance problems are detected with HPR connections, it is sometimes necessary to gather real-time values of ARB algorithm variables as the data flows through the HPR connection. This can be accomplished by issuing:
snasw start | stop arbdata local-tcid
Output from the start form of this command is written to the router log and can consist of many lines of text per second, so it is best to make sure that you have configured logging buffered and no logging console before issuing this command. You issue the stop form of the command to stop the messages from being written to the router log.
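The sequence described above might look like the following sketch. The logging statements are entered in global configuration mode, and local-tcid is a placeholder for the local TCID of the HPR connection being examined; the mode in which the start and stop commands are entered is assumed here, so confirm it against the command reference for your release.

```
! Global configuration: keep the high-volume output off the console
logging buffered
no logging console
!
! Start collecting ARB data for one HPR connection
! (local-tcid is a placeholder)
snasw start arbdata local-tcid
!
! Stop the messages once enough data has been captured
snasw stop arbdata local-tcid
```

The collected messages can then be reviewed in the router log buffer, for example with the standard show logging command.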

Note: Interpreting output from the snasw arbdata command requires a detailed understanding of the HPR Architecture and ARB algorithm. Also, this command can result in a high volume of messages being written to the router log. For these reasons, Cisco recommends this command only be used under the direction of Cisco service personnel.

snasw event

By default, only defined links and DLUS events are sent to the pdlog console. To get more information for debug purposes, use the snasw event global configuration command:
snasw event [cpcp] [dlc] [implicit-ls] [port]

SNASw Debug Commands

Output from Cisco IOS debug commands provides a valuable source of information and feedback concerning state transitions and functions when assessing problems. However, the snasw dlctrace and snasw ipstrace commands should be favored over debug snasw dlc and debug snasw ips commands in an SNASw environment. The snasw trace commands write directly to a cyclic buffer rather than to the router log, thereby avoiding flooding and providing flexibility in accessing the trace records.
Other debug commands, such as debug dlsw or debug llc, may be useful in environments where DLSw+ is used in conjunction with SNASw or to trace activity at the interface level.

Core Files

If your router crashes, it is sometimes useful to obtain a full copy of the memory image (called a core dump) to identify the cause of the crash. Not all crash types produce a core dump. The following example configures a router to use FTP to dump a core file named dumpfile to the FTP server at 172.17.92.2 when it crashes:
ip ftp username red
ip ftp password blue
exception protocol ftp
exception dump 172.17.92.2
exception core-file dumpfile
Details covering the procedure for obtaining a core dump can be found at: http://www.cisco.com/univercd/cc/td/doc/product/software/ios122/122cgcr/ffun_c/fcfprt3/fcf013.htm.
You can initiate transfer of a core dump manually by entering:
write core
cheney#wr core
Remote host [172.18.60.179]?
Base name of core files to write [cheney-core]?
writing compressed ftp://172.18.60.179/cheney-coreiomem.Z
!!!!!!!!!!!!!!
Writing cheney-coreiomem.Z
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
EOF-inputbuf 4000000! [OK]
5242880 bytes copied in 41.4 secs (127875 bytes/sec)
writing compressed ftp://172.18.60.179/cheney-core.Z
!
Writing cheney-core.Z
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
EOF-inputbuf 83B00000 [OK]
61865984 bytes copied in 321.672 secs (192728 bytes/sec)
cheney#
You might be requested to perform this action by the Cisco Technical Assistance Center (TAC) for debug purposes even if your router did not crash.

Note: The snasw ipstrace and snasw dlctrace commands provide additional important debug information for the development engineers. If these trace facilities are activated, stop them by issuing snasw stop dlctrace and snasw stop ipstrace before forcing wr core. If the traces are not stopped prior to the wr core command, some corruption of trace records can occur.
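Following the note above, a manual core dump on a router with both traces active could be sequenced as in this sketch. Restarting the dlctrace afterward is an optional step added here for illustration.

```
! Stop the traces first to avoid corrupting trace records
snasw stop dlctrace
snasw stop ipstrace
!
! Force the core dump
write core
!
! Optionally resume tracing once the core files have been written
snasw start dlctrace
```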

Host Commands and Traces

In some cases, it may be necessary or helpful to perform traces and displays from the host in addition to traces at the SNASw router. This section provides examples of various host-based trace facilities and associated job control information.

VTAM Internal Trace

Because VTAM is an NNS and Dependent Logical Unit Server (DLUS) for SNASw, it is often necessary to collect a VTAM Internal Trace (VIT) to diagnose problems. The VIT has many options and can be collected in several ways (GTF or Data Space), so it is best to refer to the VTAM Diagnosis Guide (a licensed manual) for details.

Collecting Buffer and CCW Traces Using the Generalized Trace Facility

Use JCL similar to the following statements to invoke the generalized trace facility (GTF):
//GTFNEW PROC MEMBER=GTFPARM
//IEFPROC EXEC PGM=AHLGTF,PARM='MODE=EXT,DEBUG=NO,TIME=YES', *
// TIME=1440,REGION=2880K
//IEFRDER DD DSNAME=SYS4.TRACE,DISP=SHR
//SYSLIB DD DSNAME=SYS1.PARMLIB(&MEMBER),DISP=SHR
Follow these steps to perform a buffer trace. First, start the GTF:
s gtf
AHL103I TRACE OPTIONS SELECTED --USR,RNIO
*10 AHL125A RESPECIFY TRACE OPTIONS OR REPLY U
r 10,u
Then start the buffer trace by entering the statement:
f net,trace,type=buf,id=<resource>
Take action to recreate the problem you are tracing. When complete, stop the buffer trace with the statement:
f net,notrace,type=buf,id=<resource>
Then, stop the GTF by displaying the job and using the purge command:
d a,gtf
RESPONSE=DEREK
IEE115I 12.02.03 2001.303 ACTIVITY 203
JOBS M/S TS USERS SYSAS INITS ACTIVE/MAX VTAM OAS
00000 00011 00002 00024 00004 00002/00010 00001
GTF 0342 IEFPROC NSW S A=002C PER=NO SMC=000
PGN=001 DMN=005 AFF=NONE
CT=000.770S ET=076.525S
WUID=STC00229 USERID=++++++++
ADDR SPACE ASTE=05323B00
4020000 DEREK 01303 12:01:22.91 STC00229 00000090 AHL031I GTF INITIALIZATION COMPLETE
p gtf.342
If you need to get a channel command word (CCW) trace, start the GTF and enter these statements where addr is the address to trace, len is how much of each datastream to collect (the default is 256), and num ccws is the number of CCWs to collect:
s gtf
r nn,trace=siop,iop,ccwp
r nn,io=sio=<addr>,ccw=(data=<len>,ccwn=<num ccws>),end
r nn,u
S GTF
IRR813I NO PROFILE WAS FOUND IN THE STARTED CLASS FOR 214
GTF WITH JOBNAME GTF. RACF WILL USE ICHRIN03.
$HASP100 GTF ON STCINRDR
IEF695I START GTF WITH JOBNAME GTF IS ASSIGNED TO USER
++++++++
$HASP373 GTF STARTED
IEF403I GTF - STARTED - TIME=13.32.44
AHL121I TRACE OPTION INPUT INDICATED FROM MEMBER GTFPARM OF PDS
SYS1.PARMLIB
TRACE=RNIO,USR
00010000
AHL103I TRACE OPTIONS SELECTED --USR,RNIO
*11 AHL125A RESPECIFY TRACE OPTIONS OR REPLY U
R 11,TRACE=SIOP,IOP,CCWP
IEE600I REPLY TO 11 IS;TRACE=SIOP,IOP,CCWP
TRACE=SIOP,IOP,CCWP
AHL138I SIO TRACE OPTION REPLACED BY SSCH TRACE OPTION
*12 AHL101A SPECIFY TRACE EVENT KEYWORDS --IO=,SSCH=,CCW=,IO=SSCH=
R 12,IO=SIO=400,CCW=(DATA=1024,CCWN=50),END
IEE600I REPLY TO 12 IS;IO=SIO=400,CCW=(DATA=1024,CCWN=50),END
IO=SIO=400,CCW=(DATA=1024,CCWN=50),END
AHL103I TRACE OPTIONS SELECTED --IO=SSCH=(0400)
AHL103I CCW=(SI,CCWN=50,DATA=1024)
13 AHL125A RESPECIFY TRACE OPTIONS OR REPLY U
R 13,U
IEE600I REPLY TO 13 IS;U
Recreate your problem and then stop the trace facility. After you have trace data, you can format the trace information using the Interactive Problem Control System (IPCS). The following example shows sample JCL to format a buffer trace with IPCS:
//IPCSRUN JOB CLASS=A,MSGCLASS=X,NOTIFY=WINNETT
//IPCS EXEC PGM=IKJEFT01,DYNAMNBR=20,REGION=1500K
//SYSPROC DD DSN=SYS1.SBLSCLI0,DISP=SHR
//IPCSPRNT DD SYSOUT=*
//IPCSTOC DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//SYSTSPRT DD SYSOUT=*
//SYSTSIN DD *
DELETE 'SYS4.IPCS.DDIR.CLUSTER' PURGE CLUSTER
BLSCDDIR DSNAME('SYS4.IPCS.DDIR.CLUSTER') VOLUME(OPWK01)
IPCS NOPARM
SETDEF DSNAME('SYS4.TRACE') LIST NOCONFIRM NOPRINT
GTF USR(FEF)
END
This example shows how to format a CCW trace with IPCS:
//IPCSRUN JOB CLASS=A,MSGCLASS=X,NOTIFY=WINNETT
//IPCS EXEC PGM=IKJEFT01,DYNAMNBR=20,REGION=1500K
//SYSPROC DD DSN=SYS1.SBLSCLI0,DISP=SHR
//IPCSPRNT DD SYSOUT=*
//IPCSTOC DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//SYSTSPRT DD SYSOUT=*
//SYSTSIN DD *
DELETE 'SYS4.IPCS.DDIR.CLUSTER' PURGE CLUSTER
BLSCDDIR DSNAME('SYS4.IPCS.DDIR.CLUSTER') VOLUME(OPWK01)
IPCS NOPARM
SETDEF DSNAME('SYS4.TRACE') LIST NOCONFIRM NOPRINT
GTF USR(ALL) CCW(SI) SSCHIO(400)
END

TROUBLESHOOTING SNASw OPERATIONAL PROBLEMS

Problems related to SNASw can be categorized into several basic areas:

• Upstream link activation problems

• Downstream device activation problems

• DLUR/DLUS problems

• Session failures

• Performance-related problems

In addition, problems can be categorized as being associated with particular types of equipment or issues with particular data paths. Because of similarities between symptoms and problems for these different situations, this document addresses these diagnostic topics collectively.
Begin troubleshooting by following the process suggested in the section Troubleshooting Methodology Overview. The diagnostic summaries in this section address the troubleshooting process using three basic stages:
1. Identifying symptoms
2. Isolating problems
3. Resolving problems
Each diagnostic section includes suggestions for identifying and isolating problems. It is assumed that relevant network topology diagrams have been obtained for reference prior to troubleshooting. Specific diagnostic output is included to illustrate how network entities react to failures and how to discern specific failures. Sample output for some of the commands is shown in Appendix B, Diagnostic Output Examples. If you need additional help in debugging or analyzing a particular problem, please contact the Cisco TAC at www.cisco.com.

Uplink Fails to Connect

If the uplink fails to connect, follow these troubleshooting steps:
Step 1. Verify the status of the defined uplinks using the sh snasw link command.
Step 2. Use show snasw pdlog detail all | include linkname to see if there are any entries specific to this link name in the pdlog. If so, you will need to show or dump the entire pdlog and find out which records apply.
Step 3. A defined uplink will attempt to establish connection when SNASw is started if the nostart parameter is not used. Issue the sh run | include snasw command to see the configuration for SNASw.
Step 4. If the link definition uses ip-dest ip-address, then the link is defined for HPR/IP EE. Verify IP connectivity to the host by issuing an extended ping originating from the interface associated with the hpr-ip port.
Step 5. If the ping fails, continue troubleshooting the IP connectivity issue using commands such as trace and sh ip ospf. Check neighbor connectivity to the host, Open Shortest Path First (OSPF) definitions on the host, and routing to the Virtual IP Addressing (VIPA) address.
Step 6. If the ping succeeds, then check the interaction between Virtual Telecommunications Access Method (VTAM) and the TCP/IP stack. Verify that the VTAM TCPNAME start option points to the correct stack (see d net,vtamopts) and that the VTAM external communication adapter (XCA) major node is set for medium=HPRIP.
Step 7. If the upstream link is SNA LLC2, then begin troubleshooting connectivity at Layer 2.
Step 8. Verify that the remote MAC and remote service access point (SAP) are defined on the snasw link.
Step 9. Trace the LLC2 layer using debug llc2 state and debug llc2 packet. Be sure to set access-list 1100 to limit debug output.
Step 10. You can also use the SNASw trace facility to trace the SNASw DLC layer. Issue snasw dlctrace. Try the link again with the trace on. Display the trace using the sh snasw dlctrace or snasw dump dlctrace command.
Step 11. Examine and collect system or netlog information from the host network node.
Step 12. Use the information gathered to diagnose the cause of the failure.
Step 13. Check the VTAM definitions, node type, and CPNAME.
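The router-side commands from the steps above can be collected into a rough checklist. This is a sketch that uses only commands named in this section, written in their unabbreviated form; linkname is a placeholder for the uplink being investigated, and the extended ping prompts for its source interface interactively.

```
! Steps 1-3: link status, pdlog entries, and SNASw configuration
show snasw link
show snasw pdlog detail all | include linkname
show running-config | include snasw
!
! Step 4 (HPR/IP uplink): extended ping sourced from the
! interface associated with the hpr-ip port
ping
!
! Step 9 (LLC2 uplink): Layer 2 debugging
debug llc2 state
debug llc2 packet
!
! Step 10: with snasw dlctrace configured, retry the link
! and then examine or dump the trace
show snasw dlctrace
snasw dump dlctrace
```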

Downstream PU Does Not Activate

The downstream PU may be having trouble trying to connect to SNASw or the problem may be upstream of SNASw for DLUR PUs. Follow these troubleshooting steps:
Step 1. Issue the sh snasw link command.
Step 2. Examine the pdlog for error or exception data related to this link or PU.
Step 3. Issue the sh snasw dlus command. The DLUS will be active if, and only if, there is DLUR traffic.
Step 4. Issue the sh snasw pu command.
Step 5. Check to see if other PUs have established connectivity through this node.
Step 6. Gather host information, including D NET,ID=puname; D NET,DLURS.
Step 7. Issue the sh snasw port command. Determine the port through which the downstream PU connects and verify that it is active.
Step 8. If the port is not active, troubleshoot as a configuration problem or interface problem.
Step 9. Check that the MAC/SAP on the downstream device matches that defined for SNASw on the interface.
Step 10. Ensure that you have enabled port event notifications using the snasw event dlc implicit-ls port command. This causes messages to be written to the router log for certain events.
Step 11. Examine snasw dlctrace data.
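As a sketch, the displays and settings named in the steps above can be grouped by where they are entered. The router commands are shown unabbreviated, and puname is a placeholder for the name of the PU that fails to activate.

```
! Router displays (Steps 1, 3, 4, 7, and 11)
show snasw link
show snasw dlus
show snasw pu
show snasw port
show snasw dlctrace
!
! Router global configuration (Step 10): log DLC, implicit link,
! and port events to the router log
snasw event dlc implicit-ls port
!
! Host (VTAM) displays (Step 6), entered at the host console
D NET,ID=puname
D NET,DLURS
```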

Problem Establishing Connection for Downstream End Node

You may encounter a problem with an APPN device downstream from SNASw. After you have established that no physical or lower layer problem exists, begin troubleshooting the connection establishment sequence using these steps:
Step 1. Issue the sh snasw link command.
Step 2. Try starting the link on the downstream device and observe pdlog messages. If there is a problem with the XID between the downstream device and SNASw, the detailed pdlog message will have additional information, including the last sent and received XIDs.
Step 3. Check that the MAC/SAP on the downstream device matches that defined for SNASw on the interface.
Step 4. Verify the node name and node type.
Step 5. Trace the link activation using the snasw dlctrace command.
Step 6. Examine the snasw dlctrace data.

User Cannot Connect to Application

An end user may be unable to connect to the application because of a network problem in establishing a dynamic link to the EN host. In this case, use the following troubleshooting steps:
Step 1. Issue the sh snasw link command. Determine whether an active link has been established to the EN data host.
Step 2. If no link exists, perform snasw dlctrace on the upstream port.
Step 3. Verify the connection network definition (that is, the virtual routing node [VRN] name) between the SNASw and host definitions.
Step 4. Observe any error messages in