Cisco Transport Manager User's Guide, 9.0
Appendix K: Troubleshooting

Table Of Contents

Troubleshooting

K.1  Overview

K.2  CTM Troubleshooting Tools

K.3  Installation Problems

K.3.1  CTM Installation Fails Before the Database Is Created

K.3.2  CTM Installer Hangs or Quits

K.3.3  Default Group Name Does Not Appear During New Installation

K.3.4  Insufficient RAM when Specifying the Destination Directory

K.3.5  SQL Errors

K.3.6  Uninstaller Quits While Uninstalling the CTM Server

K.3.7  Things to Keep in Mind when Installing or Uninstalling CTM

K.4  CTM Database Problems

K.4.1  PM Data, Audit Log Data, and Error Log Data Are Not Collected Correctly—CTM R7.0 and Earlier

K.4.2  Database Crashes

K.4.3  Database Troubleshooting Tools

K.5  Server Problems

K.5.1  Conditions that Affect CTM Server Performance

K.5.2  CTM Server Does Not Respond

K.5.3  Cannot Connect to the CTM Server

K.5.4  How Do I Restart the CTM Server when the Network Contains Many NEs?

K.5.5  NE Connection State Is Listed as Unavailable

K.5.6  Launching Tables Results in Database Errors

K.5.7  SNMP Traps Are Not Forwarded from NEs

K.5.8  Trap Port Is Unavailable

K.5.9  Problems with ONS 15530 and ONS 15540 Configuration

K.5.10  Performance Monitoring Data Is Not Displayed

K.5.11  CTC-Based NE Is Not Discovered

K.5.12  CTC-Based NE Is Not Reachable

K.5.13  ONS 1530x Communication Problems

K.5.14  ONS 15501, ONS 15540, or ONS 15530 Is Not Reachable

K.5.15  ONS 1580x NE Is Not Discovered

K.5.16  Changing the NE Operational State from the UNIX CLI

K.5.17  Circuits Are Not Displayed

K.5.18  Circuit State for Monitor Circuits Reads "Duplicate ID"

K.5.19  NE Model Type Appears as Unknown

K.5.20  Cannot Copy the CTC Binary to the CTM Server

K.5.21  Memory Backup, Memory Restore, or Software Download Fails

K.5.22  Memory Autobackup, Software Commit, or Software Revert Fails

K.5.23  The getinfo.sh Script Fails

K.6  Client Connectivity Problems

K.6.1  Database Is Not Available

K.6.2  Database Timeout Occurs

K.6.3  Are the CTM Client and the CTM Server Connected?

K.6.4  Cannot Log In as Provisioner or Operator

K.6.5  Cannot Authenticate User Message Appears

K.6.6  Socket Over TL1 Issues

K.7  Client Operational Problems

K.7.1  Model Index Is Unknown

K.7.2  Cannot Delete an NE

K.7.3  CTC Fails to Start

K.7.4  Added a New Software Version to the Wrong NE

K.7.5  Cannot Delete a Subnetwork

K.7.6  Cannot Move an NE Between Subnetworks

K.7.7  NEs Change Subnetworks Rapidly

K.7.8  Cannot Schedule Jobs

K.7.9  CTM Does Not Receive Autonomous Alarms from ONS 1530x NEs

K.7.10  ONS 15800, ONS 15801, or ONS 15808 NE Generates Too Many Alarms

K.7.11  Bandwidth-Intensive Operations Are Blocked

K.7.12  NE Displays Incorrect Configuration Management Data

K.7.13  Cannot Customize the Network Map

K.7.14  CTM Client Machine Displays Incorrect Colors

K.7.15  Common Topology Problems

K.7.16  Common VLAN Problems

K.7.17  PM Data Collection Fails

K.7.18  Common L1 Circuit Provisioning Problems

K.7.19  Common Equipment Provisioning Problems

K.7.20  How Do I Collect Solaris Client Thread Dumps?

K.7.21  How Do I Collect Windows Client Thread Dumps?

K.7.22  How Do I Collect Server Thread Dumps?

K.7.23  How Do I Enable or Disable the Automatic Refresh Data Feature?

K.7.24  How Do I Replace the Alarm Interface Panel?

K.8  Client Debug Messages

K.9  CTM GateWay/TL1 Problems

K.10  CTM GateWay/CORBA Problems

K.10.1  CTM GateWay/CORBA Is Installed After Installing the CTM Server

K.10.2  Testing the CTM GateWay/CORBA

K.11  Problems with MGX Voice Gateway Devices

K.11.1  Discovery Mechanism

K.11.2  Discovery Issues at Startup

K.11.3  Discovery Issues at Runtime

K.11.4  Equipment Management Problems

K.11.5  Configuration Center, Chassis View, Diagnostics Center, and Statistics Report Problems

K.11.6  Chassis View Problems

K.11.7  Configuration Center Management

K.11.8  Connection Management Problems

K.11.9  Diagnostics Center Problems

K.11.10  Performance Management Collection and Parsing Problems

K.11.11  Statistics Report Problems

K.11.12  Service Agent Problems

K.11.13  Miscellaneous Problems


Troubleshooting


This appendix offers troubleshooting steps to help solve high-level problems while using CTM or CTM GateWay. Refer to the troubleshooting procedures in this appendix before contacting the Cisco Technical Assistance Center (TAC) at http://www.cisco.com/tac.

This appendix includes the following troubleshooting information:

Overview

CTM Troubleshooting Tools

Installation Problems

CTM Database Problems

Server Problems

Client Connectivity Problems

Client Operational Problems

Client Debug Messages

CTM GateWay/TL1 Problems

CTM GateWay/CORBA Problems

Problems with MGX Voice Gateway Devices


Note For information about troubleshooting CiscoView problems, see Appendix J, "Using CiscoView to Configure and Monitor ONS 15501, ONS 15530, and ONS 15540 NEs."


K.1  Overview

Troubleshooting involves:

1. Identifying the source of the problem—Which devices, links, interfaces, hosts, or applications have the problem?

2. Locating the problem on the network—On what VLAN, subnet, or segment is the problem occurring?

3. Comparing current network performance against an established baseline—Is the performance better or worse?

4. Finding out when the problem started—When did you first see the problem? Is it recurring?

5. Determining the extent of the problem—How widespread is the problem? Is it getting worse?

This appendix assumes that the server is installed under the default /opt/CiscoTransportManagerServer directory and the client is installed under the default /opt/CiscoTransportManagerClient directory (Solaris) or C:\Cisco\TransportManagerClient directory (Windows). If a directory other than the default installation directory is specified, replace the default path with the installed path.

K.2  CTM Troubleshooting Tools

You can use all of the following tools to troubleshoot your system:

Error and Audit Logs—Most NE communication problems that occur when adding a new NE to the CTM domain can be diagnosed by looking at the Error and Audit Logs. See 8.5.1  Viewing the Audit Log, page 8-82 and 9.6.2  Viewing the Error Log, page 9-62.

EMS alarms—Using the Dashboard, launch the Alarm Browser window to display only the EMS alarms. See 1.3.1  Dashboard, page 1-5.

Self Monitor table—Launch the Self Monitor table to view server resource historical data. See 10.3.13  Using the Self Monitor Table, page 10-20.

Debug options—Use this tool only when recommended by the Cisco TAC. See 9.6.5  Setting Debug Options, page 9-66. In some cases, you may need to run the CTM client in debug mode. To run the CTM client in debug mode, close the existing session and execute one of the following commands:

In Windows, enter:

\Cisco\TransportManagerClient8_5\ctm-<config>-debug.exe

In Solaris, enter:

/opt/CiscoTransportManagerClient/ctmcdebug-start -<config>

where <config> is the network size (small, medium, large, or highend).

Getinfo.sh script—Provides UNIX and CTM configuration information. The following is an example of how to execute the getinfo.sh script:

# cd /opt/Cis*r/bin
# ./getinfo.sh

The getinfo.sh script generates two files, logtext.tar.Z and techinfo.txt.Z. These files are created under the /opt/CiscoTransportManagerServer directory. If one of these files is not created, run the getinfo.sh script in debug mode to see where the script is failing. The following is an example of how to run the script in debug mode:

# cd /opt/Cis*r/bin
# sh -x ./getinfo.sh

If the script fails with the error "Permission denied while running rsh," the server is not configured to run the rsh command without authentication, and remote users are required to enter a password after issuing the rsh command.

System parameter information—Output of commands such as prstat and top provides system parameter information that can be used to troubleshoot performance issues.


Caution Avoid changing the system parameters. Changing parameters such as the system timing results in a substantial difference between the poll and current times, which leads to server crashes.

CTM server log files—See 9.6.2  Viewing the Error Log, page 9-62 for a list of server logs.

In general, there are two sets of log files for each CTM service: service log and service error log. The service log is generated when you select trace error level in the Control Panel services. The service error log is generated when CTM detects an exception with minor or higher severity.

The ctmop.log file is used to determine whether the server was stopped by the user or whether it shut down abnormally. If the server shuts down abnormally, check the core_pmon.log file to see if any of the critical services went down. Services that are considered critical are Oracle, SMService, OSAgent, and GateWay/CORBA.

CTM server log files are located in the /opt/CiscoTransportManagerServer/log directory.

In the Process Monitoring tab of the Recovery Properties pane of the Control Panel, if a process is not running and is marked as critical, an alarm is generated and CTM shuts down. The core_pmon.log contains the reason for the shutdown, while ctmop.log contains the ctms-abort command entry. Oracle, OSAgent, and SMService are mandatory critical processes while CTM GateWay/CORBA is not considered a mandatory critical process.

Performance parameters—CTM periodically monitors vital performance parameters and raises EMS alarms when they cross a predefined threshold value. See 9.4.2  Setting Up and Viewing Alarm Configuration Parameters, page 9-17 for a list of performance parameters. All performance data that is collected at each poll cycle is listed in the Self Monitor table. You can export the data from the Self Monitor table to a spreadsheet to produce graphics and identify trends.

UNIX commands:

netstat—Shows the state of all sockets, all routing table entries, and all physical and logical interfaces.

snoop—Captures and inspects network packets.

/usr/platform/sun4u/sbin/prtdiag—Displays system diagnostic information, including the number of CPUs and the RAM.

vmstat—Reports virtual memory statistics.
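The check for the two getinfo.sh output files described earlier in this section can be scripted. The following Python sketch is an illustration only and is not part of CTM. The function name missing_getinfo_outputs is hypothetical, and the sketch assumes that the files are written directly under the server installation directory, as stated above.

```python
from pathlib import Path

# File names taken from the getinfo.sh description in this section.
EXPECTED_OUTPUTS = ("logtext.tar.Z", "techinfo.txt.Z")


def missing_getinfo_outputs(server_dir="/opt/CiscoTransportManagerServer"):
    """Return the expected getinfo.sh output files that do not exist.

    Hypothetical helper: server_dir is assumed to be the CTM server
    installation directory. Pass the installed path if it is not the default.
    """
    base = Path(server_dir)
    return [name for name in EXPECTED_OUTPUTS if not (base / name).is_file()]
```

If the returned list is not empty, rerun the script in debug mode as shown earlier in this section.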

K.3  Installation Problems

When the CTM installation runs in debug mode, useful information is dumped to the console. Complete the following steps to run the CTM server installation in debug mode:


Step 1 Enter the following command in the shell where you launch the installation:

setenv LAX_DEBUG true

Step 2 Enter the following command to start the CTM server installation:

ctmsetup.sh -dev


K.3.1  CTM Installation Fails Before the Database Is Created

If the CTM installation fails before the database is created, do the following before reinstalling:

Use the ps -ef | grep setup command to verify that the previous installation is not running. The installation script is ctmsetup.sh or setup.sh. If the installation process is still running even after exiting it from InstallAnywhere, terminate the installation with the kill -9 <pid> command, where <pid> is the process identifier that is returned by the ps -ef | grep setup command.

Verify that the /tmp directory is not full. During the CTM installation, the /tmp directory is used to store a temporary copy of the installation scripts and the JRE used for installation. Use the df -k command to check if there is enough space on the other file systems (for example, in the /opt directory).

Make sure that you follow the installation procedures explicitly. Even minor steps, such as creating the /tftpboot directory, must be completed. If you deviate from the documented installation procedures, you will encounter problems during installation.

If you are not installing CTM from the CD—for example, you are installing CTM from your hard drive—make sure that all of the required scripts and files are available and have the correct ownership and permissions.


Note It is not recommended that you install CTM from CD files copied to your hard drive.


You can use the /temp/dbinit.log file for debugging when a new CTM installation fails.
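The search for a leftover installation process can be illustrated with a short Python sketch that scans captured ps -ef output. This is a sketch under stated assumptions, not a CTM tool: the function name find_installer_pids is hypothetical, and it assumes the standard ps -ef layout in which the process ID is the second column.

```python
# Installer script names taken from the text above.
INSTALLER_SCRIPTS = ("ctmsetup.sh", "setup.sh")


def find_installer_pids(ps_output):
    """Return PIDs from `ps -ef` output whose line mentions an installer script.

    Assumes the PID is the second whitespace-separated column. The header
    line and the line for the grep command itself are skipped.
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[1].isdigit():
            continue  # header or malformed line
        if any(field.rsplit("/", 1)[-1] == "grep" for field in fields):
            continue  # "grep setup" matches its own search pattern
        if any(script in line for script in INSTALLER_SCRIPTS):
            pids.append(int(fields[1]))
    return pids
```

Each returned value is a candidate <pid> for the kill -9 <pid> command. Review the matching lines before terminating anything, because other processes can also contain "setup.sh" in their command line.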

K.3.2  CTM Installer Hangs or Quits

Check the disk space in the root and user installation directories. Also, check the disk space in the /tmp and /temp directories. If disk space is not a problem, check for available RAM.
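As an alternative to reading df -k output by eye, the free space in several directories can be collected in one pass. The following Python sketch is illustrative only. The function name free_kilobytes is hypothetical, and the amount of space that the installer actually needs is not stated here, so the sketch reports values without judging them.

```python
import shutil


def free_kilobytes(paths):
    """Map each path to its free space in kilobytes, or None if unavailable.

    The value is the space available to the current user, which is close to
    (but not guaranteed to equal) the "avail" column of df -k.
    """
    report = {}
    for path in paths:
        try:
            report[path] = shutil.disk_usage(path).free // 1024
        except OSError:
            report[path] = None  # path does not exist or cannot be read
    return report
```

For example, free_kilobytes(["/", "/tmp", "/temp", "/opt"]) returns one entry per directory. A value of None indicates that the directory does not exist on this workstation.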

K.3.3  Default Group Name Does Not Appear During New Installation

Verify the following if the default group name (root/other) does not appear in the Group Info panel during a new installation, or if an "Invalid Groupname" error message appears after you supply a group name:

The /tmp folder is created in the correct directory.

The installation scripts are executable. If they are not, use the chmod a+r * and chmod a+x * commands.

K.3.4  Insufficient RAM when Specifying the Destination Directory

Use the prtconf command to check your RAM and validate it against the specifications listed in the Cisco Transport Manager Release 9.0 Installation Guide.

K.3.5  SQL Errors

If you encounter SQL errors, check the log or err files in the /temp or /tmp directories and do one of the following:

Check whether the /opt/Cisco*Server/cfg/CTMServer.cfg file exists.

Make sure that the database password is correct. Use the password to connect to Oracle.

SQL problems occur when there is no connection to Oracle or there are Oracle command syntax violations in SQL.

K.3.6  Uninstaller Quits While Uninstalling the CTM Server

While uninstalling the CTM server, the uninstallation process might quit with the following error message: "CTM server is running." Use the showctm command to check whether the CTM server is running. If it is running, use the ctms-stop command to stop the server and relaunch the uninstaller.

K.3.7  Things to Keep in Mind when Installing or Uninstalling CTM

After the main installation phase in the GUI, wait for several minutes while certain files and scripts are copied to the installation directory. Before proceeding, wait for the message "Please reboot your system."

Stop the CTM server before starting the migration process.

Stop the CTM server before uninstalling it.

Upgrading the network size configuration is allowed only from medium to large, from medium to high end, and from large to high end.

The CTM GateWay/CORBA option is disabled during migration. It will be reinstalled if it was previously installed.

Use the /opt/Cisco*Server/UninstallerData/installvariables.properties file to make sure that all of your specifications during the preinstallation process are correct.

All installation files are available in the /tmp or /temp directories.

K.4  CTM Database Problems

K.4.1  PM Data, Audit Log Data, and Error Log Data Are Not Collected Correctly—CTM R7.0 and Earlier

Collect the following information:

Error logs in the /opt/CiscoTransportManagerServer/log directory

CTM database size

The following cases indicate that the problem is related to the partitioning of the database tables:

The size of the CTM database is medium, large, or high end.

The log files contain ORA-14400 errors (inserted partition key does not map to any partition).

If the problem is related to the Audit Log or Error Log, collect the output of the following query:

select partition_name from user_segments where segment_name='ERROR_LOG_TABLE'

If PM data is not collected correctly, collect the output of the following query:

select partition_name from user_segments where segment_name=<PM_table>

An example of the last line of the query result is P01012007, which means that the latest partition was added on January 1, 2007. If the date on which you run the query does not match the date of the latest partition, you must add the missing partitions manually.
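The date comparison described above can be sketched in Python. This is an illustration under an explicit assumption: the partition name is taken to be the letter P followed by the date as MMDDYYYY. The example P01012007 reads the same as DDMMYYYY, so confirm the field order against your own query results. The function names are hypothetical.

```python
import re
from datetime import date


def partition_date(partition_name):
    """Parse a partition name such as 'P01012007' into a date.

    Assumption: the name is 'P' followed by MMDDYYYY. The field order is not
    confirmed by the example in the text, which is ambiguous.
    """
    match = re.fullmatch(r"P(\d{2})(\d{2})(\d{4})", partition_name.strip())
    if match is None:
        raise ValueError(f"unexpected partition name: {partition_name!r}")
    month, day, year = (int(part) for part in match.groups())
    return date(year, month, day)


def partitions_missing(latest_partition_name, query_date):
    """Return True when the latest partition is older than the query date."""
    return partition_date(latest_partition_name) < query_date
```

For example, partitions_missing("P01012007", date.today()) returns True on any day after January 1, 2007, which corresponds to the case in which partitions must be added manually.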

K.4.2  Database Crashes

If the database crashes, obtain a copy of all of the log files from the /oracle/admin/CTM/bdump and /oracle/admin/CTM/udump directories. The .trc file located in the udump directory contains the session status logs. The alert_CTM.log file located in the bdump directory contains the issues related to the different database processes and gives the process description with a time stamp.

K.4.3  Database Troubleshooting Tools

K.4.3.1  Checkdbinfo.sh Script

To debug Oracle performance issues, use the checkdbinfo.sh script located in the /opt/CiscoTransportManagerServer/bin directory. The /oracle/dbcheck.log file contains the output of the script.

K.4.3.2  Toad

Use Toad to run a procedure and check the values of the procedure variables.

K.4.3.3  CTM Database Functions

The displayphyloc and displayip functions that are defined in the CTM database can be used to view data in a readable format. For example, the following query displays the list of installed nodes and the related IP address in readable format:

SELECT nesysid, displayip(neipaddr) FROM ne_info_table

See Chapter 3 of the Cisco Transport Manager Release 9.0 Database Schema for more information.

K.4.3.4  Tnsping Utility

For remote databases, performance issues might cause process timeouts. Use the tnsping utility, which reports whether the Oracle listener is reachable and how long it takes to respond, to confirm whether timeouts are occurring.

K.5  Server Problems

This section describes troubleshooting procedures for CTM server-related problems.


Note Log in as the root user on the Solaris workstation where the CTM server is installed to perform any operations on the Solaris workstation.


K.5.1  Conditions that Affect CTM Server Performance

The "System Requirements" chapter in the Cisco Transport Manager Release 9.0 Installation Guide lists server, CPU, CPU speed, disk space, and RAM requirements for different types of CTM installations. These requirements are derived from guidelines based on scalability simulation testing.

Actual CTM server performance varies for each customer and is affected by:

The total number of NEs actively managed by CTM

The number of NE-related services (including northbound services) running on the server

The rate of circuit provisioning and the methods used for provisioning

The rate of network growth (for example, adding 50 NEs to CTM at once)

The rate of configuration change updates received from the NEs

The number of circuits provisioned with no signals that generate threshold crossing alerts (TCAs)

The number of unacknowledged alarms in the CTM database

The rate of alarm bursts received from the NEs

The actual number of circuits provisioned in the CTM database

The following conditions might indicate that your CTM installation is near or at capacity:

Your daily CPU utilization is consistently above 80%

You experience loss of connectivity (LOC) to 1% to 5% of nodes at random

Your RAM usage is 100% or your system swaps frequently

You receive many configuration updates from the NEs

Your system experiences high Java garbage collection time


Note The Java garbage collection time is determined by engineering after analyzing your CPU utilization data.


If you are experiencing any of these conditions, contact your account representative, who can engage Cisco's Advanced Services team to audit your CTM server and recommend a higher-capacity server and resources.
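The CPU utilization condition listed above can be checked against collected data, for example values exported from the Self Monitor table. The following Python sketch is illustrative only. The function name is hypothetical, and "consistently" is interpreted here as "on every sampled day," which is an assumption rather than a Cisco definition.

```python
def cpu_consistently_high(daily_cpu_percent, threshold=80.0):
    """Return True when every daily CPU utilization sample exceeds threshold.

    Assumption: 'consistently above 80%' means all samples are above the
    threshold. An empty list of samples returns False.
    """
    samples = list(daily_cpu_percent)
    return bool(samples) and all(value > threshold for value in samples)
```

A result of True is only one indicator. Consider it together with the other conditions in the list before contacting your account representative.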

K.5.2  CTM Server Does Not Respond


Step 1 Log in as the root user on the Solaris workstation where the CTM server is installed.

Step 2 Enter the following command to view the status of the CTM server processes:

showctm

If you do not have root user privileges but you belong to the UNIX group that can use sudo functionality to run commands as nonroot, enter the following command:

sudo showctm

If there is a line containing /CTMServer, the CTM server is running.

If there is no line containing /CTMServer, the CTM server is not running. Proceed to Step 3.


Note You can also check the ctmop.log file in /opt/CiscoTransportManagerServer/log to determine whether the server was stopped by another user or if it stopped abnormally. If it stopped abnormally, proceed to Step 3.


Step 3 Run the getinfo.sh CTM server tool and send the data to the Cisco TAC for analysis.

If you do not have root user privileges but you belong to the UNIX group that can use sudo functionality to run commands as nonroot, enter the following command:

sudo getinfo.sh

Step 4 Start the CTM server by using the ctms-start script in the /opt/CiscoTransportManagerServer/bin directory.

a. Log in as the root user.

b. Change the directory to /opt/CiscoTransportManagerServer/bin and enter the following command:

ctms-start

If you do not have root user privileges but you belong to the UNIX group that can use sudo functionality to run commands as nonroot, enter the following command:

sudo ctms-start


If the preceding procedure does not solve the problem, complete the following steps:


Step 1 Verify whether the /opt/CiscoTransportManagerServer/cfg/CTMServer.cfg file is corrupt. The file should contain the db-config-mode = auto parameter in the [database] section. If the entry is missing, the CTM server configuration file is corrupt. Reinstall the CTM server.

Step 2 Verify whether the first entry in the /var/opt/oracle/oratab file looks similar to CTM8_5:/oracle/product/10.2.0:Y. If this entry is missing, the Oracle database might not be installed. The Oracle database is a prerequisite for installing the CTM server.

Step 3 To improve system performance, regenerate the database statistics, either before or after a significant system workload. See 4.2.11  Regenerating Statistics in the Database, page 4-25. Internal Oracle statistics allow Oracle to work efficiently, especially during data query operations. If you experience CTM system performance degradation during normal database activities, the database might be using stale statistics.
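The configuration file check in Step 1 of the preceding procedure can be sketched in Python. This sketch assumes that CTMServer.cfg uses an INI-style layout that the standard configparser module can read. That assumption is based only on the [database] section and the db-config-mode = auto entry described in Step 1, and the function name is hypothetical.

```python
import configparser


def db_config_mode_is_auto(cfg_text):
    """Return True if the [database] section sets db-config-mode to auto.

    Assumption: the file content is INI-style. Text that configparser
    cannot parse is treated as a failed check rather than raising an error.
    """
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    try:
        parser.read_string(cfg_text)
    except configparser.Error:
        return False
    value = parser.get("database", "db-config-mode", fallback="")
    return value.strip() == "auto"
```

To use the sketch, read the contents of /opt/CiscoTransportManagerServer/cfg/CTMServer.cfg into a string and pass it to the function. A result of False means that the entry was not found in the expected form, so inspect the file manually before deciding to reinstall.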


K.5.3  Cannot Connect to the CTM Server


Step 1 Ping the server's IP address from the client PC or workstation.

Step 2 If the ping fails, resolve the IP connectivity problem and try again.

Step 3 Telnet to the server and log in as the root user.

Step 4 Enter the following command to verify that CTM is running:

showctm

If you do not have root user privileges but you belong to the UNIX group that can use sudo functionality to run commands as nonroot, enter the following command:

sudo showctm

Step 5 The server should have at least the following processes running:

root 520 12.0 0.02855219536 Dec_19 14:18 CTM Server 
root 489 0.0 0.117384 Dec_19 0:00 CTM Server
root 749 0.6 5.7255104112848 Dec_19 346:27 SnmpTrapService
root 541 0.1 5.8284232115256 Dec_19 15:02 SMService
root 507 0.0 0.2 5512 3496 ? Dec 19 Apache Web Server

Step 6 If you see fewer than four processes running, enter the following command to stop the server manually:

ctms-stop

Step 7 If you changed the server IP address, verify that the configuration files shown in Table 4-14 on page 4-49 have been updated. See 4.4.2  Changing the IP Address when CTM and Oracle Are on the Same Server, page 4-47.

Step 8 Verify that the Oracle database is accepting connections:

a. Enter the following command to log in as the Oracle user:

su - oracle

b. Enter the following command to open an SQL*Plus session:

omu-u60-3% sqlplus ctmanager/ctm123!

Step 9 If the SQL prompt appears, the Oracle database is running and accepting connections. If you receive an error message and the SQL prompt does not appear, reboot the server, wait for it to boot up, and try to run the client.

Step 10 Enter the following commands to restart CTM manually:

SQL> exit
omu-u60-3% exit
omu-u60-3% logout
ctms-start

If you do not have root user privileges but you belong to the UNIX group that can use sudo functionality to run commands as nonroot, enter the following command:

sudo ctms-start

Step 11 Wait for 5 minutes and run the client.


K.5.4  How Do I Restart the CTM Server when the Network Contains Many NEs?

If your network contains a high number of NEs, it often takes a long time for all of the NEs to synchronize after a CTM server restart. To avoid this NE synchronization delay, complete the following steps:


Step 1 In the Domain Explorer, choose Administration > Control Panel.

Step 2 In the Control Panel window, expand the NE Service node and click any NE with a green arrow (which indicates that there are services running).

Step 3 In the Status tab, click the Deactivate button to deactivate the NE service. This ensures that when the CTM server is restarted, no service consumes too much CPU and the NE resynchronization goes smoothly.

Step 4 For CTC-based NEs, expand the NE Service node and click CTC-Based SDH NEs or CTC-Based SONET NEs. In the Status tab, click the Deactivate button to deactivate the network service.

Step 5 Expand the PM Service node and click any NE with a green arrow. In the Status tab, click the Deactivate button to deactivate the PM service.

Step 6 Enter the showctm command. The command output should not show any active NE or PM services. The output should look similar to the following sample:

CTM Processes for Cisco Transport Manager Server Version: 9.0 Build: 831
-------------------------------------------------------------------------------------
USER   PID    %CPU   %MEM               START         TIME   PROCESS
-------------------------------------------------------------------------------------
root   14023  0.0    0.117360           11:00:13      0:00   CTM Server
root   14101  0.0    0.82607214968      11:00:18      0:01   CTM Server
root   14351  0.2    5.7263488113176    11:01:31      1:05   SnmpTrapService
root   14123  0.0    4.728914494384     11:00:20      0:57   SMService
-------------------------------------------------------------------------------------

Step 7 Enter the ctms-stop command.

Step 8 After the ctms-stop command is complete, re-enter the showctm command to verify that there are no CTM processes running. The output should look similar to the following sample:

CTM Processes for Cisco Transport Manager Server Version: 9.0 Build: 831
-------------------------------------------------------------------------------------
USER   PID    %CPU   %MEM               START         TIME   PROCESS
-------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------

Step 9 If any process is still running, identify the process ID and enter the following command to stop it:

kill -9 <process_ID>

Step 10 After all processes have been stopped, enter the ctms-start command at the command prompt. This command starts the CTM server but not the NE or PM services.

Step 11 Activate the NE service instances individually after most of the NEs in a partition have started synchronizing. That is, wait until the NE service instance displays an hourglass icon in the Control Panel tree before you activate the next NE service instance. To do this, expand the NE Service node; then, expand the appropriate NE node and in the NE Service Instance properties pane, click the Start Service Instance button.

Step 12 For CTC-based NEs, activate the network service instances individually after most of the NEs in a partition have started synchronizing. To do this, expand the NE Service node; then, expand the CTC-Based SDH NEs or CTC-Based SONET NEs node. In the Network Service Instance properties pane, click the Start Service Instance button.

Step 13 Activate the PM service instances. To do this, expand the PM Service node; then, expand the appropriate NE node and in the PM Service Instance properties pane, click the Start Service Instance button.
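The output checks in Step 6, Step 8, and Step 9 of the preceding procedure can be sketched in Python by parsing captured showctm output. This is an illustration based only on the sample output shown in those steps. The function name list_ctm_processes is hypothetical, and the sketch assumes that each process row has the PID in the second column and ends with a TIME value followed by the process name.

```python
import re

# A process row ends with a TIME value (for example 0:57) and the process name.
_ROW_TAIL = re.compile(r"\s(\d+:\d{2})\s+(\S.*)$")


def list_ctm_processes(showctm_output):
    """Return (pid, process_name) pairs parsed from showctm output.

    Title, header, and separator lines are skipped because their second
    column is not a number.
    """
    processes = []
    for line in showctm_output.splitlines():
        fields = line.split()
        if len(fields) < 3 or not fields[1].isdigit():
            continue
        tail = _ROW_TAIL.search(line)
        if tail:
            processes.append((int(fields[1]), tail.group(2).strip()))
    return processes
```

An empty list after ctms-stop corresponds to the sample output in Step 8. Any remaining pair gives the process ID to use with the kill command in Step 9.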


K.5.5  NE Connection State Is Listed as Unavailable

If the connection state of an NE is listed as Unavailable in the Domain Explorer window, a connectivity or configuration problem exists. Wait 5 to 10 minutes after adding the NE to the CTM domain; then, complete the following steps:


Step 1 To see the NE IP address, select the NE in the Domain Explorer window. The Address tab of the Network Element Properties pane lists the IP address of the selected NE.

Step 2 From the CTM server, enter the following command to verify connectivity between the CTM server and the NE:

ping <IP_address>

Step 3 If the ping fails, a physical or configuration problem exists in the data communications network (DCN).

a. If connectivity existed earlier:

If there were no configuration changes made to any routers in the DCN or to the CTM server, the problem is a physical problem.

If the configuration was changed, the change might have introduced problems. Verify the changed configuration.

b. If connectivity was never established:

Verify that there are no problems on the physical level.

Verify the DCN configuration.

Step 4 If the ping succeeds, verify that the NE software version is listed in the Supported NE table (Administration > Supported NE Table).

Step 5 For ONS 15501, ONS 15530, and ONS 15540 NEs, SNMP read and write community strings must be set up correctly on each NE. The running configuration on each NE should contain entries similar to the following example:

snmp-server community public RO
snmp-server community private RW

The community strings configured on the server should match the strings configured on the NE. For information on configuring community strings in CTM for ONS 155xx NEs, see 3.5.1.4  Prerequisites for Adding ONS 15501, ONS 15530, and ONS 15540 NEs, page 3-10.
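The community string check in Step 5 can be sketched in Python against a saved copy of the NE running configuration. This is an illustration only. The function names are hypothetical, and the sketch assumes the simple form shown in the example above, in which the access level (RO or RW) immediately follows the community string.

```python
def snmp_communities(running_config):
    """Map each community string to its access level ('RO' or 'RW').

    Assumes lines of the form: snmp-server community <string> <RO|RW>
    """
    communities = {}
    for line in running_config.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[:2] == ["snmp-server", "community"]:
            communities[fields[2]] = fields[3].upper()
    return communities


def has_read_and_write(running_config):
    """Return True if both a read (RO) and a write (RW) community exist."""
    levels = set(snmp_communities(running_config).values())
    return {"RO", "RW"} <= levels
```

Compare the returned community strings with the strings configured for the NE in CTM. The strings must match for the NE to become available.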


K.5.6  Launching Tables Results in Database Errors

If the Oracle database or Oracle listener that CTM is using is down, launching tables will generate database errors. To troubleshoot table launching errors:


Step 1 Log in as the Oracle user and enter the following command:

sqlplus ctmanager/ctm123!

If the login succeeds, an SQL> prompt appears, which indicates that the Oracle database has been installed and the database server is up and running. If the login fails, either the Oracle database has not been installed or the database server is not running.

Step 2 To start the Oracle database, log into the Solaris workstation as the Oracle software user and enter the following command at the shell prompt:

dbstart

Step 3 Enter the following command at the shell prompt to start the Oracle listener:

lsnrctl start

If there are still problems with starting the Oracle database or the Oracle listener, refer to the Oracle documentation or contact Oracle support.


K.5.7  SNMP Traps Are Not Forwarded from NEs

SNMP traps might not be forwarded, either because the trap port is already in use, or because the NE is not configured correctly.

K.5.8  Trap Port Is Unavailable

The CTM server requires exclusive access to the SNMP trap port to receive SNMP traps from the NE.


Step 1 Enter the following command to verify that the standard SNMP trap port (port 162) is not in use by another application running on the same Solaris workstation:

netstat -a | grep 162

If the following line is present, the SNMP trap port is in use by another application:

*.162 Idle

Step 2 If the trap port is in use by another application, stop the other application.
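Because grep 162 also matches other text that contains those digits (for example, port 1620), a stricter check of the netstat output can be useful. The following Python sketch is illustrative only. The function name trap_port_in_use is hypothetical, and the sketch assumes that the local address is the first column of each netstat -a line and ends with a period followed by the port number, as in the example line above.

```python
def trap_port_in_use(netstat_output, port=162):
    """Return True if a local address in netstat -a output ends in .<port>.

    Assumes the local address is the first whitespace-separated column,
    in the form <address>.<port> (for example *.162).
    """
    suffix = "." + str(port)
    for line in netstat_output.splitlines():
        fields = line.split()
        if fields and fields[0].endswith(suffix):
            return True
    return False
```

Run the check while the CTM server is stopped. While the CTM server is running, the server itself is expected to hold the trap port.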


K.5.9  Problems with ONS 15530 and ONS 15540 Configuration

ONS 15530 and ONS 15540 NEs must be configured correctly to send traps to the server.


Step 1 The running configuration on the NE should contain entries similar to the following example:

snmp-server enable traps snmp authentication warmstart
snmp-server enable traps bgp
snmp-server enable traps oscp
snmp-server enable traps config
snmp-server enable traps entity
snmp-server enable traps fru-ctrl
snmp-server enable traps topology throttle-interval 60
snmp-server enable traps optical monitor min-severity not-alarmed
snmp-server enable traps rf
snmp-server enable traps aps
snmp-server enable traps patch
snmp-server enable traps cdl all
snmp-server enable traps alarms
snmp-server enable traps threshold min-severity degrade

snmp-server host 172.20.126.214 version 2c public snmp bgp oscp config entity fru-ctrl topology optical rf aps patch cdl alarms threshold

In this example, the host 172.20.126.214 receives the specified traps from the NE.

Step 2 You can also run the following commands on the server and device:

a. As the root user, enter the following command on the server:

# snoop udp port 162

b. On the device, enter the following command to generate a configuration change:

wr mem

If the device is configured correctly, the snoop command reports a configuration change trap.
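The check in Step 1 can also be run against a saved copy of the running configuration. The sketch below assumes the configuration has been saved to a text file and that the trap destination appears in the simple "snmp-server host <address> ..." form shown in the example; the function name is hypothetical.

```shell
# Hypothetical helper: reads a saved running configuration on stdin and
# succeeds if it contains an "snmp-server host" entry for the given address.
# $1: IP address of the server that should receive the traps
has_trap_host() {
    awk -v ip="$1" '
        $1 == "snmp-server" && $2 == "host" && $3 == ip { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```

For example, with the configuration saved in a file named ne1.cfg (a hypothetical name): has_trap_host 172.20.126.214 < ne1.cfg || echo "no trap host entry found"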


K.5.10  Performance Monitoring Data Is Not Displayed


Step 1 Use the Domain Explorer > Network Element Properties pane to verify whether PM collection is disabled for the selected NE.

Step 2 Choose Administration > Control Panel and expand PM Service. Select the NE type to verify whether the service status is Active or Not Active.

Step 3 Choose Administration > Audit Log to verify whether PM was ignored for the selected NE.

Step 4 Choose Administration > Control Panel and click PM Service. Look at the PM Data Storage field. Choose Optimized to save only nonzero values in the database. Choosing Optimized requires less space than saving all values, including zero values. If you choose Normal in the PM Data Storage field, the database saves all PM values, including zero values.


Note The Optimized PM data storage feature is available only for CTC-based and ONS 155xx NEs.


Step 5 Choose Administration > Error Log to verify whether there are any critical, major, or minor errors against the PM service.

Step 6 Verify that the NE is reachable while PM data is being collected. If the NE is not reachable, there is an entry in the Audit Log table (Administration > Audit Log).


K.5.11  CTC-Based NE Is Not Discovered


Step 1 Verify that the NE is up and running.

Step 2 Verify that the NE IP address and default route are configured correctly.

Step 3 To verify that the NE is running the correct version of system software, open a CTC session to the NE and check the NE software version.

Step 4 Verify that the NE that acts as the gateway NE (GNE) (through which this node is discovered) is added to CTM and is available.

Step 5 Open CTC from the GNE and verify that CTC can discover the NE.


K.5.12  CTC-Based NE Is Not Reachable


Step 1 Verify that the NE is up and running.

Step 2 Verify that the NE IP address and default route are correctly configured.

Step 3 Wait for five poll cycles while CTM re-establishes connectivity with the NE.

Step 4 To test IP connectivity to the NE from the CTM server, enter the following command from the Solaris workstation where CTM is running:

ping <NE_IP_address>

Step 5 To verify that the NE is running the version of system software supported by CTM, open a CTC session to the NE and check the NE software version.

Step 6 Verify that the username and password that CTM uses to reach the NE exist on the NE.


K.5.13  ONS 1530x Communication Problems


Step 1 Make sure that there are no other ONS 1530x NEs with the same name.

Step 2 Test the IP connectivity from the server to the NE.

Step 3 Telnet to the NE from the CTM server. If you are prompted for a password, the NE is running.

Step 4 Connect to the NE using Cisco Edge Craft to verify that SNMP is enabled.


Note Only users connecting from servers that are listed in the SNMP Community table are allowed to log into the NE.


Step 5 Verify that the NE you are trying to connect to is running a software version supported by CTM.

Step 6 Check whether the product name is one of the following:

ONS 15302

ONS 15305

AXXEDGE

If the product name does not match any of the recognized names, enter the change-system-name command.

Step 7 For ONS 15305 CTC R3.x NEs, note the following possible connectivity issues and limitations:

The username and password that CTM uses to reach the ONS 15305 CTC must exist on the NE.

ONS 15305 CTC NEs support up to two simultaneous client connections, as follows. Other configurations are not supported and might cause connectivity problems:

One CTM plus one CTC connection

Two CTM connections

Two CTC connections

Do not launch CTC on the same workstation where the CTM server is running and managing ONS 15305 CTC NEs; doing so might cause connectivity problems.

ONS 15305 CTC NEs do not support the ONS 15454 SDH proxy server implementation. If ONS 15305 CTC NEs are connected to ONS 15454 SDH NEs through optical links, enabling the proxy server on ONS 15454 NEs (and then the GNE/ENE configuration) causes connectivity problems on the ONS 15305 CTC NEs. Verify that proxy server is not enabled on any ONS 15454 SDH NEs that are optically connected to ONS 15305 CTC NEs.


K.5.14  ONS 15501, ONS 15540, or ONS 15530 Is Not Reachable


Step 1 Enter the following command on the server to verify IP connectivity between the server and the NE:

ping <NE_IP_address>

Step 2 If the NE is reachable through IP, verify that the correct read community string is configured on the server.

On the server, choose Administration > ONS 155XX > ONS 155XX SNMP Settings Table to view the community strings set in CTM.

On the NE, enter the following command to view the read community string set on the NE:

show run | begin snmp

In the running configuration, the community strings are shown in entries similar to the following:

snmp-server community public RO
snmp-server community private RW

In this example, the read string is public, and the read/write string is private.

Step 3 If IP connectivity and community strings are verified and the device is still unavailable, check the Error Log for any errors reported by ONS 155xx NE Service.
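To compare the community strings in Step 2 against the values set in CTM, the strings can be pulled out of a saved running configuration. This is a minimal sketch that assumes the simple "snmp-server community <string> <RO|RW>" form shown in the example (no view or access-list arguments); the function name is hypothetical.

```shell
# Hypothetical helper: reads a running configuration on stdin and prints
# each community string with its access mode, one "<mode> <string>" per line.
# Assumes the access mode is the fourth word on the line.
list_communities() {
    awk '$1 == "snmp-server" && $2 == "community" { print $4, $3 }'
}
```

The RO entry in the output is the read community string that must match the value configured on the server.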


K.5.15  ONS 1580x NE Is Not Discovered

If CTM cannot discover a newly added ONS 1580x NE, complete the following steps:


Step 1 Make sure that there are no other ONS 1580x NEs with the same name.

Step 2 Connect to the NE using the local craft tool and check the current running level. The running level must be six to enable CTM to connect to the NE. If the running level is three, only the local craft is able to connect.

Step 3 Complete the following substeps to check all available connections on the TL1 socket (port 1000):


Note The TL1 Agent supports up to five simultaneous connections. CTM uses one TL1 connection for the NE service, one for the PM service, and another for CTM GateWay/TL1, if enabled.


a. Open a Telnet connection.

b. At the pSOS prompt, enter the netstat command.

Step 4 Make sure that the board name and slot position inside the shelf match those reported by the TL1 Agent.


K.5.16  Changing the NE Operational State from the UNIX CLI

This procedure is useful when you are troubleshooting NE connectivity problems or alarm or configuration synchronization issues and you do not have access to the CTM client GUI.


Step 1 Use the vwdata.sh script to identify the NE database ID number.

Step 2 Run the UNIX CLI tool and connect to the server. The tool name is ctm and is located in the /opt/CiscoTransportManagerServer/bin directory.

Step 3 Use the command set nestate to change the NE operational state.


K.5.17  Circuits Are Not Displayed

If CTM fails to display some circuits or if the information displayed by CTC and CTM differs, complete the following steps:


Step 1 Wait for 2 minutes while the CTM server synchronizes with the NE. If the Circuit table is not in autorefresh mode, click the Refresh Data tool when the tool flashes.

Step 2 Verify that the Circuit table is opened from either the source or destination NE of the circuit to be viewed.

Step 3 Verify that the NEs that are part of the circuit (source, destination, or drop) are available in CTM.

Step 4 Open another CTC session and compare the circuit information in the newly opened CTC session with the circuit information displayed by CTM.


K.5.18  Circuit State for Monitor Circuits Reads "Duplicate ID"

CTM generates a unique number that is appended to the circuit name to make the monitor circuit name unique. On rare occasions, CTM might create two or more monitor circuits with the same name. The circuit state for these circuits reads "Duplicate ID." This is a known issue that has been tracked as DDTS number CSCdz87566.

When the circuit state reads "Duplicate ID," you cannot see the actual circuit state. In this case, you must change the duplicate name to a unique name, so that you can see the correct circuit state. Use the Modify Circuit wizard to enter a unique name for each monitor circuit.

K.5.19  NE Model Type Appears as Unknown

If an in-service NE is added to the Domain Explorer but the model type appears as unknown, the software version of the NE might not be prepopulated in the database. In other words, CTM cannot match the NE with a recognizable version.

Use the NE Software table to add the NE software version string to the CTM database. See 4.3.5  Viewing Software Versions and Restarting the NE with a New Software Image, page 4-37.

K.5.20  Cannot Copy the CTC Binary to the CTM Server

If a CTC binary fails to copy to the CTM server, the <CTM_install_directory>/cms/ directory might be missing or write-protected. If the directory is missing, create a cms directory with write-access permissions. Otherwise, change the permissions of the existing cms directory to allow write access.
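The directory check described above can be sketched as a small shell function. The function name is hypothetical, and granting write access only to the directory owner is an assumption; widen the permissions if the copy runs as a different user.

```shell
# Hypothetical helper: makes sure <install_dir>/cms exists and is writable
# by its owner. $1 is the CTM installation directory.
ensure_cms_dir() {
    cms_dir="$1/cms"
    if [ ! -d "$cms_dir" ]; then
        mkdir -p "$cms_dir" || return 1
    fi
    # Owner-only write access is an assumption; the guide does not say
    # which user performs the copy.
    chmod u+w "$cms_dir"
}
```

Call the function with the installation directory as its only argument, in place of <CTM_install_directory>.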

K.5.21  Memory Backup, Memory Restore, or Software Download Fails


Step 1 In the Domain Explorer window, choose Administration > Job Monitor. The Job Monitor table shows the status of the operation. The reason for the failure is shown in the Additional Information column.

Step 2 Return to the Domain Explorer window and choose Administration > Error Log. The Error Log table shows information about the memory backup, memory restore, or software download failure.

Step 3 See Appendix I, "Error Messages" for the correct action to take in response to the error.


K.5.22  Memory Autobackup, Software Commit, or Software Revert Fails


Step 1 In the Domain Explorer window, choose Administration > Audit Log. The Audit Log table shows the status of the operation.

Step 2 Return to the Domain Explorer window and choose Administration > Error Log. The Error Log table shows the reason for the failure.

Step 3 See I.2  CTM Server Error Messages, page I-99 for the correct action to take in response to the error.


K.5.23  The getinfo.sh Script Fails

You might experience a problem where the getinfo.sh script fails with the following error:

Error: permission denied while running `rsh.' Check the environment and make sure the root 
is permitted to run .rsh on host <IP_address>.

This problem occurs because the getinfo.sh script does not run unless an entry is made in the /.rhosts file. This problem occurs even on a local server and database configuration. This is a known issue that has been tracked as DDTS number CSCin66495.

The solution is to add the hostname to the /.rhosts file. For example:

(omu-e450) #more /.rhosts
omu-e450 +
(omu-e450) #
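The entry can be added without creating duplicates. The sketch below is one way to do that, assuming a grep that supports the POSIX -F and -x options (on Solaris this may be the XPG4 version); the function name is hypothetical.

```shell
# Hypothetical helper: appends "<hostname> +" to an rhosts-style file
# unless a line with exactly that content is already present.
# $1: path to the file (for example /.rhosts), $2: hostname
add_rhosts_entry() {
    entry="$2 +"
    if [ -f "$1" ] && grep -Fx -- "$entry" "$1" >/dev/null; then
        return 0
    fi
    printf '%s\n' "$entry" >> "$1"
}
```

For the example host shown above, the call would be: add_rhosts_entry /.rhosts omu-e450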

K.6  Client Connectivity Problems

The CTM client might not be able to connect to the CTM server for various reasons. Complete the following procedures in the order listed until the problem is resolved.

K.6.1  Database Is Not Available

If the CTM client cannot connect to the CTM server, verify that the database is available.


Step 1 Log into the CTM server as the Oracle user.

Step 2 Enter the following command to connect to the database:

sqlplus ctmanager/ctm123!

Step 3 If the error message "maximum processes exceeded" is received, the maximum number of database connections has been reached. Close several clients or ask the database administrator to increase the maximum number of processes for the database.
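The check in Step 3 can be scripted by scanning the sqlplus output for the Oracle error code. This sketch assumes that the message corresponds to Oracle error ORA-00020 (maximum number of processes exceeded); the function name is hypothetical.

```shell
# Hypothetical helper: reads sqlplus output on stdin and succeeds if it
# contains Oracle error ORA-00020, assumed here to be the error behind
# the "maximum processes exceeded" message.
hit_process_limit() {
    grep 'ORA-00020' >/dev/null
}
```

Pipe the output of the sqlplus command from Step 2 into the function; a zero exit status indicates that the process limit was reached.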


K.6.2  Database Timeout Occurs


Step 1 Reduce the scope of the query by selecting a group or an NE and not the entire domain before opening tables such as the Alarm Browser, Alarm Log, Audit Log, and PM tables.

Step 2 Increase the client database query timeout period by editing the ems-client.cfg file in the C:\Cisco\TransportManagerClient\config or /opt/CiscoTransportManagerClient/config directory.

Step 3 Add hardware resources to the Oracle database server.

Step 4 Ping the Oracle database server from the client to verify the response time. Increase the bandwidth if the round-trip response time is inadequate.


K.6.3  Are the CTM Client and the CTM Server Connected?

If the database is available, check connectivity between the client and the server.


Step 1 To see the CTM server IP address, enter the following command on the Solaris workstation where the CTM server is running:

ifconfig -a

The command output looks similar to the following example:

hme0: flags=863<UP,BROADCAST,NOTRAILERS,RUNNING,MULTICAST> mtu 1500
        inet 192.168.120.93 netmask ffffff00 broadcast 192.168.120.255

The IP address is the address following the inet field.

Step 2 To verify network connectivity between the CTM client and the CTM server, enter the following command from the CTM client:

ping <IP_address>

where the IP address belongs to the Solaris workstation where the CTM server is running.

Step 3 If the ping command fails, fix the network connectivity between the client and the server; then, log into the CTM client.
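On a server with several interfaces, the address lookup in Step 1 can be reduced to a list of candidate addresses. The following sketch assumes Solaris-style ifconfig output, where each address follows the keyword inet; the function name is hypothetical.

```shell
# Hypothetical helper: reads `ifconfig -a` output on stdin and prints the
# address that follows each "inet" keyword, skipping the loopback address.
list_inet_addresses() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "inet" && $(i + 1) != "127.0.0.1")
                print $(i + 1)
    }'
}
```

For example: ifconfig -a | list_inet_addresses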


K.6.4  Cannot Log In as Provisioner or Operator


Step 1 Use the default username and password to log into the CTM client:

Username: SysAdmin

Password: Ctm123!

Step 2 If the SysAdmin login succeeds but the provisioner or operator login fails, verify that the provisioner or operator exists and is not disabled. In the Domain Explorer window, choose Administration > CTM Users to view a table of all configured CTM users.

Step 3 If the provisioner or operator is not in the CTM Users table, the user is not configured. Configure the provisioner or operator; then, log in as that user.

Step 4 If the provisioner or operator is in the CTM Users table, select the row corresponding to that user and click the Modify User Properties tool to bring up the Modify User Properties wizard. Verify that the Login State field is set to Enabled. If the login state is disabled, enable it and log in as the provisioner or operator. The user might have been disabled after attempting to log in with an incorrect password.

Step 5 If the password is not correct, set a new password for the user and log in again.


K.6.5  Cannot Authenticate User Message Appears

If the "Cannot authenticate user" error message is received when logging into the CTM client, the CTM server might be initializing. Wait for 5 minutes while the CTM server finishes initializing; then, try to log in again. Alternatively, check your username and password and enter them again. The username and password are case-sensitive.

K.6.6  Socket Over TL1 Issues

CTM supports NEs that are connected in non-IP networks such as Open System Interconnection (OSI) networks. The Cisco NEs are connected through non-Cisco gateway NEs (NGNEs), and CTM opens a tunnel to connect to the NGNEs. If connectivity problems occur in a tunnel setup, complete the following steps:


Step 1 Ping the NGNE. The NGNE must be connected to the IP network.

Step 2 Check whether the NGNE requires login credentials to connect to the network. If it requires login credentials, its TID is listed in the GNE TID text box in the Domain Explorer NE Properties pane. The username and password of the NGNE are specified in the Domain Explorer NE Properties pane and Control Panel.

Step 3 Check that the NGNE encoding mode that is used to connect to the Cisco NEs is correct.

Step 4 (Optional) If you have a very large network (which can slow down the creation of tunnels), do the following to increase the default timeout value of 20 seconds:

a. In the Control Panel window, expand the NE Service and click CTC-Based SONET NEs or CTC-Based SDH NEs to open the NE Service pane.

b. In the Status tab, enter the new timeout value in the TL1 Tunnel Connection Timeout text box.

c. Click Save.


K.7  Client Operational Problems

This section describes troubleshooting procedures for CTM client operational problems.

K.7.1  Model Index Is Unknown

If a new NE is added to the CTM domain and marked as In Service, but the model index is unknown, complete the following steps:


Step 1 Verify that the IP address and other parameters are entered correctly.

Step 2 The CTM server might not be able to establish connectivity to the GNE for the NE. Verify that the GNE for the NE is marked as In Service, and that the GNE is correctly populated in the CTM domain.

Step 3 Verify that the GNE is DCC-connected to the NE.

Step 4 The CTM server might not recognize the NE's software version. In the Domain Explorer tree, select the NE and choose Fault > Test NE Connectivity. If the NE is available, contact the Cisco TAC and verify that the version of the software on the NE is compatible with the CTM server. Use the Supported NE table to provide the new NE software version that the CTM server will recognize.

Step 5 The CTM server might be configured to attempt communication with new NEs after a certain delay. To change the health poll frequency:

a. In the Domain Explorer window, choose Administration > Control Panel and click NE Service.

b. In the NE Poller tab, look at the NE Health Poll Interval field. The default is 240 seconds. If the value is too large, there might be a communication delay between CTM and the NEs. Under normal conditions, the CTM server attempts communication with new NEs for no more than twice the number of seconds specified in the NE Health Poll Interval field.


K.7.2  Cannot Delete an NE

If an NE cannot be deleted, verify that you logged in as a SuperUser or NetworkAdmin, and not as a Provisioner or Operator. Provisioners and Operators cannot delete NEs. If you logged in as a Provisioner or Operator, restart the CTM client session and log in as a SuperUser.

K.7.3  CTC Fails to Start

If CTC cannot be started for a CTC-based NE, complete the following steps:


Step 1 If an error occurs while opening the NE Explorer but the CTC progress screen and login dialog boxes appear, the CTC username or password might be incorrect. Do the following:

a. Choose Administration > CTM Users.

b. In the CTM Users table, choose Edit > View/Modify. The Modify User Properties wizard opens.

c. Click Next to move to the CTC/Craft User Properties pane and change the CTC username and password to match those configured on the device. Click Finish.

Step 2 If the CTC progress screen idles for a long time before the login dialog box appears, there might be problems with network connectivity to the device.

a. From a DOS command window or a Solaris workstation, ping the device.

b. If no response is received, set up routes so that the device is available; then, restart CTC. If CTC does not start, the workstation might have resource constraints.

c. Close some open applications and try again.

d. Close some instances of CTC that are managing another group of CTC-based NEs and try again.

Step 3 If the GNE for the selected NE was not found, there might be a mismatch between the GNE specified for the device and the available GNEs. This means that the data in the database is corrupt. Contact the Cisco TAC for assistance.


K.7.4  Added a New Software Version to the Wrong NE


Step 1 In the Domain Explorer window, choose Administration > Supported NE Table.

Step 2 In the Supported NE table, select the incorrect entry and choose Edit > Delete.

Step 3 Click OK in the confirmation dialog box.

Step 4 Select the correct NE row from the Supported NE table.

Step 5 Add the correct software version.

Step 6 In the Domain Explorer > Network Element Properties pane, set the operational state of all NEs that are behaving erroneously to Out of Service.

Step 7 Click Save.

Step 8 In the Network Element Properties pane, set the operational state of all the NEs back to In Service.

Step 9 Click Save.


K.7.5  Cannot Delete a Subnetwork

If a subnetwork cannot be deleted from the Subnetwork Explorer, complete the following steps:


Step 1 Verify that the user logged in as a SuperUser—not as a Provisioner or an Operator. Provisioners and Operators cannot delete subnetworks. If the user logged in as a Provisioner or Operator, restart the CTM client session and log in as a SuperUser.

Step 2 Verify that the target subnetwork does not contain NEs. Move all NEs to another subnetwork before deleting the empty target subnetwork.


K.7.6  Cannot Move an NE Between Subnetworks

When automatic grouping of NEs in subnetworks is enabled, you cannot move NEs between subnetworks. To move NEs from one subnetwork to another, first disable automatic grouping:


Step 1 In the Domain Explorer window, choose Administration > Control Panel.

Step 2 In the Control Panel, click UI Properties.

Step 3 Under Subnetwork Grouping, uncheck the Automatically Group NEs in Subnetworks check box.

Step 4 Click Save.


K.7.7  NEs Change Subnetworks Rapidly

When automatic grouping of subnetworks is enabled in CTM, NEs might change subnetworks rapidly if the status of one or more links that connect the subnetworks fluctuates rapidly. Disabling automatic grouping might prevent NEs from changing subnetworks when links go up and down. To disable the automatic grouping of subnetworks:


Step 1 In the Domain Explorer window, choose Administration > Control Panel.

Step 2 In the Control Panel, click UI Properties.

Step 3 Under Subnetwork Grouping, uncheck the Automatically Group NEs in Subnetworks check box.

Step 4 Click Save.


K.7.8  Cannot Schedule Jobs

The CTM client can schedule the following types of administrative tasks:

Software download

Memory backup

Memory restore

CTM maintains an Error Log and Audit Log to track potential problems. To view the Error Log or Audit Log:


Step 1 In the Domain Explorer window, choose Administration > Audit Log or Error Log.

Step 2 Look for errors related to software download, memory backup, or memory restore.


K.7.9  CTM Does Not Receive Autonomous Alarms from ONS 1530x NEs

CTM can collect alarms or receive autonomous notifications from an ONS 1530x NE. Alarm collection occurs when the NE is first discovered and does not rely on spontaneous events sent from the NE.


Step 1 Verify that the CTM server is present in the SNMP Community table and that traps are enabled for it. If not, you can still connect to the NE, but CTM will not receive traps.

Step 2 If the CTM server and the NEs are in different subnetworks, make sure that there is no firewall filtering SNMP packets between the subnetworks.

Step 3 Verify that Cisco Edge Craft is receiving autonomous alarms.

Step 4 Verify the following:

SNMP traps reach the CTM server machine.

The command snoop udp and port 162, run on the CTM server, shows the incoming traps.

There are no network problems.

Step 5 Check whether the CTM server is receiving traps.

Step 6 Check whether the ONS 1530x NE service is receiving SNMP traps.


K.7.10  ONS 15800, ONS 15801, or ONS 15808 NE Generates Too Many Alarms

Occasionally, ONS 1580x NEs generate so many alarms that the CTM client cannot process all of them. Set the operational state of the NE to Under Maintenance while debugging the cause of the alarms:


Step 1 In the CTM Domain Explorer window, right-click the ONS 15800, ONS 15801, or ONS 15808 that is generating the large number of alarms and choose Mark Under Maintenance.

Step 2 Debug the ONS 15800, ONS 15801, or ONS 15808 NE to see which card is in error. Find the card that is generating the alarms and mark it as Under Maintenance.

Step 3 While the card is under maintenance for debugging, set the NE back to In Service.


K.7.11  Bandwidth-Intensive Operations Are Blocked

Bottlenecks in the DCN affect bandwidth-intensive operations, such as software download. Tune the timeout and retry values to match DCN performance. Log into the NE and enter the following commands:

ip tftp min-timeout—Sets the minimum wait time, in seconds, before retrying the read/write request.

ip tftp max-timeout—Sets the maximum wait time, in seconds, before retrying the read/write request.

ip tftp backoff-delay—Sets the number of seconds to extend the wait time if the read/write request times out.

ip tftp write-retries—Sets the maximum number of times to retry the TFTP read/write request before declaring a failure.

K.7.12  NE Displays Incorrect Configuration Management Data


Step 1 In the Domain Explorer window, choose Administration > Service Monitor and verify that the NE Service is running. If it is not running, start the NE Service.

Step 2 Test NE connectivity:

a. In the Domain Explorer tree, select the NE and choose Fault > Test NE Connectivity. The message "<NE_name> (<IP_address>) is available" is generated.

b. If the NE is unavailable, establish connectivity to the NE before retrieving configuration data.


K.7.13  Cannot Customize the Network Map

If an image file is not displayed while changing the Network Map background or while changing a node icon, complete the following steps:


Step 1 Choose another image file. The file might be corrupt.

Step 2 Check the size of the image file. The image file might be larger than 100 KB, which is too big to load. If the file is too big, use a smaller image file.

Step 3 Verify that the image file exists in the \images\mapbkgnds\shapefiles directory or the \images\mapicons directory. If the file is missing, it has been deleted. Reinstall the CTM client.


Note The client is bundled primarily with shape files (*.shp) and only a few map background GIF files.
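The size check in Step 2 can be done from the command line before loading the file. This sketch assumes that 100 KB means 102400 bytes (1 KB = 1024 bytes), which the guide does not specify; the function name is hypothetical.

```shell
# Hypothetical helper: succeeds if the file is larger than 100 KB.
# Assumes 1 KB = 1024 bytes. Returns 2 if the file does not exist.
image_too_large() {
    [ -f "$1" ] || return 2
    # tr strips the padding that some wc implementations add.
    size=$(wc -c < "$1" | tr -d ' ')
    [ "$size" -gt 102400 ]
}
```

A zero exit status means that the file exceeds the limit and a smaller image should be used.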



K.7.14  CTM Client Machine Displays Incorrect Colors

When using a UNIX machine as the CTM client host, the machine's display might show incorrect colors. This problem occurs when UNIX systems use a 256-color palette and the graphics require colors that are not available in the 256-color scheme. Although some UNIX systems can work around the missing colors, there are still some that have problems making adjustments. For these systems, the only solution is to disable any other graphical application (for example, Netscape) that is running and upgrade the system graphics capabilities to a 24-bit color display (for example, upgrade the video card or monitor).

K.7.15  Common Topology Problems


Note After marking NEs as Out of Service, wait until the circuits traversing over these NEs (and not involving any other NE) clear from the CTM Circuit table before marking the NEs as In Service again.


K.7.15.1  Cannot Load the Barebone Configuration on ML Cards

To load a fresh barebone configuration on an ML card, complete the following steps:


Step 1 In the Domain Explorer tree, select the NE that contains an ML card and choose Configuration > NE Explorer (or click the Open NE Explorer tool).

Step 2 In the NE Explorer tree, select the ML card (or double-click the ML card that is shown in the Shelf View).

Step 3 In the Slot Properties pane, click the Configuration tab.

Step 4 Click the File >> TCC button, specify the configuration file to load, and click OK.

Step 5 Reset the card and wait for it to come up.

Step 6 After the card comes up, enter one of the following commands to connect to it, depending on the node mode:

If the node is configured in SSH mode, enter:

ssh <NE_IP_address> 40<card_slot_number>

If the node is not configured in SSH mode, enter:

telnet <NE_IP_address> 20<card_slot_number>

Step 7 Enter the username and password to log into the ML card. The default values are:

Username: CISCO15

Password: CTM123+

Step 8 After logging in, enter the following command:

wr mem
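The port argument in Step 6 is built from a prefix and the card slot number. The sketch below composes it, assuming that the slot number is written with two digits (for example, slot 5 becomes 2005 for Telnet); the guide gives only the patterns 40<card_slot_number> and 20<card_slot_number>, so the zero padding and the function name are assumptions.

```shell
# Hypothetical helper: prints the port for an ML card session.
# $1: "ssh" or "telnet"; $2: card slot number.
# Assumes the slot number is zero-padded to two digits (slot 5 -> 2005).
ml_card_port() {
    case "$1" in
        ssh)    prefix=40 ;;
        telnet) prefix=20 ;;
        *)      echo "usage: ml_card_port ssh|telnet <slot>" >&2; return 1 ;;
    esac
    printf '%s%02d\n' "$prefix" "$2"
}
```

For example: telnet <NE_IP_address> "$(ml_card_port telnet 5)"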


K.7.15.2  Topology Goes into Incomplete State

A topology goes into Incomplete state when one or more Layer 1 circuits related to the topology are missing.


Step 1 Launch the Circuit table and check whether the circuits related to the topology are discovered.

Step 2 If one or more Layer 1 circuits are missing, use the Create Circuit wizard to recreate the circuits. Once the Layer 1 circuits are created and CTM discovers them, the topology goes into Complete state.

Step 3 Check the Circuit table to verify whether the newly created Layer 1 circuits are discovered.

Step 4 If the topology still fails to go into Complete state, the topology discovery logic has failed. Complete the following substeps to restart the discovery process:

a. Mark the NE service related to the NE as Out of Service and then In Service.

b. Mark the NEs related to the topology as Out of Service and then In Service.

Step 5 If the problem persists, contact the Cisco TAC.


K.7.15.3  Topology Goes into L2ServiceNotReady State

A topology goes into L2ServiceNotReady state when there is a mismatch between data in the CTM database and data on the ML-series cards.


Step 1 Launch the Circuit table and check whether all circuits related to the topology are discovered.

Step 2 Launch the Equipment Inventory table and check whether the NE inventory is collected correctly.

Step 3 Complete one of the following options:

Launch the L2 Topology table and enable the L2 service. The CTM database and ML-series cards synchronize. During the synchronization process, the topology goes into In Progress state, and then into Complete state.

Do a forced resync of the database by launching the NE Explorer for the NEs in the topology and refreshing data from the NEs. This restarts the synchronization process. Once the NEs are up, the topology goes into Complete state.

Step 4 If the topology still fails to go into Complete state, try the following options to restart the topology discovery process:

Telnet to the cards and check whether the configuration parameters are corrupt.

Mark the NE service related to the NE as Out of Service and then In Service.

Mark the NEs related to the topology as Out of Service and then In Service.

Delete the topology, reload the barebone config file on the ML-series cards, and reset the cards. To make sure that the barebone config file is reloaded, telnet to the cards and enter the write memory command. Then, recreate the topology.

Step 5 Try one of the following options, depending on when the topology goes into L2ServiceNotReady state:

If the topology goes into L2ServiceNotReady state immediately after topology creation, choose Administration > Error Log and search for the BaseCardConfigWork file. Check the message associated with that file. If the message says "Base Card Config Work Failed," open the IosTransportModule.log server file and search from the bottom for the "INVALID" parameter. The command that failed is displayed with an INVALID error. Then, launch the CLI and check whether the card is configured correctly.

If the topology goes into Complete state but then changes to L2ServiceNotReady after several minutes, choose Administration > Error Log and search for the BaseCardDetectWork file. If the message associated with that file says "BaseCardDetectWork failed," check the DataService.log file (or the NE Service log for CTM R7.x or later) on the server and search from the bottom for the "BaseCardDetectWork" parameter. Several lines from the bottom, the file shows the exact reason for the base card detection failure.

Step 6 If the problem persists, contact the Cisco TAC with the relevant logs, as well as a configuration snapshot of the cards used in the topology.


K.7.15.4  Topology Goes into SyncFailed State

A topology can go into SyncFailed state for any of the following reasons:

A preprovisioned card was used to create the topology. There are two ways to determine whether a card is preprovisioned:

Launch the NE Explorer and check whether the card is preprovisioned.

Telnet to the card to verify that it is present.

The barebone configuration file is not present on all of the cards.

You have not specified the correct CTM-to-ML-series card login details. To verify the login details, launch the CTM Control Panel and click Security Properties. Click the appropriate NE tab and verify the username and password for CTM server connections to ML-series cards.

The database and the ML-series card are not synchronized.


Step 1 Launch the Circuit table and check whether the circuits related to the topology are discovered.

Step 2 Launch the Equipment Inventory table and check whether the NE inventory is collected correctly.

Step 3 Wait for the next synchronization cycle. The database should synchronize and the topology should go into In Progress state. If the topology goes into L2ServiceNotReady state, see K.7.15.3  Topology Goes into L2ServiceNotReady State.

Step 4 If the topology still fails to go into Complete state, try the following options:

Telnet to the cards and check whether the configuration parameters are corrupt.

Mark the NE service related to the NE as Out of Service and then In Service.

Mark the NEs related to the topology as Out of Service and then In Service.

Delete the topology, reload the barebone config file on the ML-series cards, and reset the cards. To make sure that the barebone config file is reloaded, telnet to the cards and enter the write memory command. Then, recreate the topology.

Step 5 If the problem persists, contact the Cisco TAC with the relevant logs, as well as a configuration snapshot of the cards used in the topology. Alternatively, telnet to the NEs and enter the flmDeleteDb command.


K.7.15.5  Topology Goes into Partially Complete State

A topology goes into Partially Complete state when not all of the cards are synchronized with the CTM database.

This problem also occurs when the data service processes a CARD_CONFIG_DATA_FAILED event for any card in the topology. The topology remains in Partially Complete state until a CARD_CONFIG_DATA_FAILED event is received for the failed card. While the topology is in this state, base card configuration provisioning and detection are performed on other cards in the topology.


Step 1 Launch the Circuit table and check whether all circuits related to the topology are discovered.

Step 2 Launch the Equipment Inventory table and check whether the NE inventory is collected correctly.

Step 3 Wait for the next synchronization cycle. The database should synchronize and the topology should go into Complete state. If the topology goes into L2ServiceNotReady state, see Topology Goes into L2ServiceNotReady State.

Step 4 If the topology still fails to go into Complete state, try the following options:

Telnet to the cards and check whether the configuration parameters are corrupt.

Mark the NE service related to the NE as Out of Service and then In Service.

Mark the NEs related to the topology as Out of Service and then In Service.

Launch the NE Explorer for the related NE, click the Refresh button, and refresh data from the NE.

Delete the topology, reload the barebone config file on the cards, and reset the cards. To make sure that the barebone config file is reloaded, telnet to the cards and enter the write memory command. Then, recreate the topology.

Check whether the IOS Users table contains all of the entries for all cards included in a ring.

Step 5 If the problem persists, contact the Cisco TAC with the relevant logs, as well as a configuration snapshot of the cards used in the topology. Alternatively, telnet to the NEs and enter the flmDeleteDb command.


K.7.15.6  CTM Cannot Discover the Topology


Step 1 Try the following options to restart the topology discovery process:

Launch the Circuit table and check whether all circuits related to the topology are discovered.

Mark the NE service related to the NE as Out of Service and then In Service.

Mark the NEs related to the topology as Out of Service and then In Service.

Step 2 If the topology is still not discovered, complete the following substeps to enable the trace level in the data service logs:

a. In the Domain Explorer window, choose Administration > Control Panel.

b. In the Control Panel window, click Logging Properties to open the Logging Properties pane.

c. Click the General tab.

d. In the Preferences area, select Trace from the Logging Level drop-down list.

e. Click Save.

Step 3 On the CTM server, enter the following command to telnet to the local host. The port number depends on the port to which the NE belongs:

telnet localhost {9500 | 9501}

This starts the data log collection on the server. Check whether the data service logs are populating.

Step 4 Check the data service logs to determine whether the circuit addition events are being received correctly. You can do this by searching for the string CIRCUIT_ADDED. If the circuit addition events are not being received correctly, there is a problem with Layer 1 circuit provisioning. If the events are received correctly but the topology is not discovered, the topology discovery logic failed and you must contact the Cisco TAC with the relevant logs.
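The decision in Step 4 can be sketched as a small function. This is an illustration under simplifying assumptions, not part of CTM: it treats the presence of the CIRCUIT_ADDED string as "events received," which is weaker than confirming that the events are correct, and the function name and sample lines are hypothetical.

```python
from typing import Iterable


def diagnose_discovery(log_lines: Iterable[str], topology_discovered: bool) -> str:
    """Apply the Step 4 decision to trace-level data service log lines."""
    # Assumption: any line containing CIRCUIT_ADDED counts as a received event.
    events_received = any("CIRCUIT_ADDED" in line for line in log_lines)
    if not events_received:
        return "check Layer 1 circuit provisioning"
    if not topology_discovered:
        return "contact the Cisco TAC with the relevant logs"
    return "no discovery problem indicated"


# Invented log lines for demonstration.
with_events = ["10:00:00 event CIRCUIT_ADDED (example)", "10:00:01 other"]
without_events = ["10:00:00 other", "10:00:01 other"]

print(diagnose_discovery(without_events, topology_discovered=False))
print(diagnose_discovery(with_events, topology_discovered=False))
```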


K.7.15.7  CTM Cannot Delete the Topology


Step 1 Launch the L2 Topology table and check whether the topology is in Deleting state. If the topology is not in Deleting state, there is a problem with the deletion logic in CTM.

Step 2 Check the state of the related Layer 1 circuits. They should also be in Deleting state. If they are in Deleting state but are not eventually deleted, there is a problem related to Layer 1 circuit provisioning.

Step 3 If not all of the circuits are deleted and they are not in Deleting state, use CTC to delete the circuits.

Step 4 After the circuits are deleted, if the topology is still not deleted, complete the following substeps to enable the data service trace logs and check whether the circuit deletion events are received correctly:

a. In the Domain Explorer window, choose Administration > Control Panel.

b. In the Control Panel window, click Logging Properties to open the Logging Properties pane.

c. Click the General tab.

d. In the Preferences area, select Trace from the Logging Level drop-down list.

e. Click Save.

Step 5 On the CTM server, telnet to the local host. This starts the data log collection on the server. Check whether the circuit deletion events are being received correctly. If they are not being received, there is a problem with the Layer 1 circuit provisioning. If they are being received, there is a problem with the topology deletion logic in CTM. Delete the topology from the L2 Topology table.

Step 6 Open the ML Cards table for the topology. Check the Additional Info column for the reason for the failure.

Step 7 If the problem persists, contact the Cisco TAC with the relevant logs.


K.7.15.8  How Do I Check Whether All Cards in the Topology Are Synchronized?

Use any of the following options to check whether all cards participating in a topology are synchronized:

In the Domain Explorer, choose Administration > CTC-Based NEs > IOS Users Table and check whether an entry for the user exists on the cards that will be used. If an entry exists, the card is synchronized with CTM.

Check the ONS1545_NE_Inventory_table for all of the NEs participating in the ring. The entry for the ML card should be present. This is the same as checking the Equipment Inventory table from the CTM client and verifying whether all cards are In Service.

Verify whether you can launch a CLI session on all of the cards:

1. Open the NE Explorer for the NE.

2. Select the card in the tree view.

3. Click the Configuration tab.

4. Click the Launch CLI button.

5. In the Launch CLI dialog box, enter the username and password.

If the CLI session does not launch, the card does not have the barebone configuration or there is a communication failure and CTM cannot read the configuration.

Verify whether all of the cards have synchronized correctly in the eqpt_info_table. Perform the following database query:

Select * from eqpt_info_table where objectindex = 221

If no rows are found, the card is not synchronized correctly. The card does not have the barebone configuration or there is a communication failure and CTM cannot read the configuration.
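The "no rows found" check can be scripted once you have a database connection. The sketch below uses an in-memory SQLite database purely as a runnable stand-in for the CTM database. The one-column table definition is hypothetical, and only the table name, the objectindex column, and the value 221 come from the query above.

```python
import sqlite3

# Stand-in database so that the example runs anywhere; the schema is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE eqpt_info_table (objectindex INTEGER)")
conn.execute("INSERT INTO eqpt_info_table (objectindex) VALUES (221)")


def has_rows(connection: sqlite3.Connection, objectindex: int) -> bool:
    """Return True if the query from the text returns at least one row."""
    cursor = connection.execute(
        "SELECT * FROM eqpt_info_table WHERE objectindex = ?", (objectindex,)
    )
    return cursor.fetchone() is not None


print(has_rows(conn, 221))  # a row exists in the stand-in table
print(has_rows(conn, 999))  # no rows: treated as "not synchronized correctly"
```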

CTM R8.0 and later supports 802.17 RPR, which requires cards to be configured in 802.17 RPR mode. If you include a card in an 802.17 RPR that was used previously as a Cisco RPR, you must reload a fresh barebone configuration, perform a software reset on the card, and then create the topology.

K.7.16  Common VLAN Problems

K.7.16.1  VLAN Is Not Discovered

This problem occurs when the VLAN discovery process fails. Try the following options to restart the VLAN discovery process:

Telnet to the cards and check whether the VLAN configuration parameters are corrupt.

Mark the NE service related to the NE as Out of Service and then In Service.

Mark the NEs related to the topology as Out of Service and then In Service.

Launch the NE Explorer for the related NE, click the Refresh button, and refresh data from the NE.

K.7.16.2  VLAN Is Not Created

If CTM cannot create a VLAN, try the following options:

If the CTM server is resynchronizing, an error message indicates a VLAN creation failure. Wait for some time before attempting to create the VLAN again.

If you receive an error message that says you cannot create a bridge group, you might have entered invalid AV pair commands in the CLI. Check the IosTransportModule.log for invalid commands. Enable the Trace error level and check the server logs for exceptions.

When creating an IP SLA, verify that the associated topology and managed VLAN are synchronized. IP SLA is supported only on NE releases 6.0, 7.0, 7.2, and 8.0. Verify that you are using a supported NE version.

If cards that belong to the topology are not listed in the Manage VLANs dialog box, complete the following steps:


Step 1 Log into the server and enter the following command:

cd /opt/CiscoTransportManagerServer/bin

Step 2 Run the prune_invalid_l2topo.sh script.

Step 3 Restart the NE service.


K.7.16.3  VLAN Is Not Deleted

If CTM cannot delete a VLAN, try the following options:

If the CTM server is resynchronizing, an error message indicates a VLAN deletion failure. Wait for some time before attempting to delete the VLAN again.

Check the IosTransportModule.log for exceptions or invalid commands. Enable the trace error level and check the server logs for exceptions.

K.7.17  PM Data Collection Fails

If PM data collection fails, use a browser to collect PM data directly from the NE. Enter one of the following URLs:

For real-time PM data:

For 15-minute collection, use http://<NE_IP_address>/pm/15min/1

For 1-day collection, use http://<NE_IP_address>/pm/day/1

For historical PM data:

For 15-minute collection, use http://<NE_IP_address>/pm/15min/<bucket#>

For 1-day collection, use http://<NE_IP_address>/pm/day/<bucket#>

where <bucket#> = (current time - missed historical time) / 900000 + 1
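The bucket formula can be worked through as follows. This is a sketch under two assumptions that the text does not state explicitly: both times are expressed in milliseconds (900000 ms is 15 minutes), and the division is integer division. The NE address 192.0.2.10 and the function names are placeholders for illustration.

```python
FIFTEEN_MIN_MS = 900_000  # divisor from the formula: 15 minutes in milliseconds


def bucket_number(current_ms: int, missed_ms: int) -> int:
    """(current time - missed historical time) / 900000 + 1, using integer division."""
    if missed_ms > current_ms:
        raise ValueError("missed historical time must not be later than current time")
    return (current_ms - missed_ms) // FIFTEEN_MIN_MS + 1


def pm_15min_url(ne_ip: str, bucket: int) -> str:
    """Build the 15-minute historical PM URL in the form shown in the text."""
    return f"http://{ne_ip}/pm/15min/{bucket}"


current_ms = 1_700_000_000_000          # arbitrary example timestamp
missed_ms = current_ms - 45 * 60 * 1000  # interval missed 45 minutes earlier

bucket = bucket_number(current_ms, missed_ms)
print(bucket)                            # 2700000 // 900000 + 1 = 4
print(pm_15min_url("192.0.2.10", bucket))
```

With both times equal, the result is 1, which matches the bucket used in the real-time URLs.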

K.7.18  Common L1 Circuit Provisioning Problems

The following levels of information are used to debug circuit problems:

CTM client level

The debug level in the Debug Options window is set to Info by default. Set the debug level to Debug so you can save all the log information into a file that can be used to debug problems while launching circuit-related windows or wizards; for example, Circuit Creation wizard, BLSR Creation wizard, Modify Circuit dialog box, Circuit Trace window, and J1 path trace.

CTM client level with database interaction

When you perform one of the following operations, the CTM client builds an SQL query and uses the Java Data AcQuisition (JDAQ) system to extract data from the database:

Find circuit—After performing this operation, the resulting circuits do not meet the criteria or the circuits that meet the criteria are not displayed.

Launching circuit, link, BLSR, and rolls tables—There are issues with the data displayed in the Circuit table, Circuit Span table, Link table, VLAN table, Link Utilization table, Rolls table, and BLSR table.

Enable the JDAQ data and JDAQ modules in the Debug Options window to resolve issues resulting from performing the above operations; then, verify that all the following are correct:

The SQL query that was constructed.

The parameter values.

The data set that results from the SQL query.

Logs of the actual data displayed in the tables can be collected by exporting the table data to a text file. Having this data in the form of a text file can be useful when debugging circuit issues.

CTM client level with CTM server interaction

Almost all the circuit provisioning features involve interaction with the CTM server. To debug the server side code, collect the NE service logs from the CTM server. The log files are located at /opt/CiscoTransportManagerServer/log. Identify the NE service for which the issue occurred and set the debug level of the NE Service logs in the Control Panel. If you encounter any of the following, collect the NE service logs from the CTM server:

Issues when launching the Circuit Trace window; or, the circuit trace does not display the correct information.

Issues related to bridge and roll operations.

Issues when creating, modifying, or deleting BLSRs.

Issues when launching the J1 path trace; or, the J1 path trace does not display the correct information.

Issues when creating, modifying, or deleting circuits.

CTM level

Ports are available for debugging. Port numbers 9410 and higher are allocated for debugging CTM-level circuit information. Each NE service or network service has its own port number, starting from port 9410. There is no way to identify in advance which port number is used by which NE service or network service. While debugging, start with port 9410 and list all managed NEs. If the NE you need is listed, continue debugging on that port. Otherwise, try port 9411, and so on.

The information that is displayed in the port is cached at the CTM server and is useful when resolving issues such as the following:

Circuits in CTM and CTC are not synchronized.

Deleted circuits are removed from CTC but not from CTM.

The following commands can be used for debugging:

Bye, Exit, Quit—Closes the socket.

Help—Prints messages.

History [#]—Prints the command history.

!<text>—Executes an item in your command history that matches <text>.

!!—Executes the last item in your command history.

AllmanagedNEs—Prints the details of all managed NEs.

Allcircuits—Prints the details for all circuits.

Logon <filename>—Logs all the information into a file.

Logoff—Stops logging all the information into a file.

Debugon—Enables the debug flag.

Debugoff—Disables the debug flag.
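Because the CTM-level debug ports must be tried one by one starting at 9410, it can save time to first find out which ports in that range accept connections. The following sketch only tests whether a TCP connection can be opened; it does not send any debug commands. The function name, the default timeout, and the choice to probe ten ports are assumptions.

```python
import socket
from typing import List


def listening_ports(host: str, first_port: int, count: int, timeout: float = 0.5) -> List[int]:
    """Return the ports in [first_port, first_port + count) that accept a TCP connection."""
    open_ports = []
    for port in range(first_port, first_port + count):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception.
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports


# Example call on the CTM server (probes ports 9410 through 9419):
# print(listening_ports("localhost", 9410, 10))
```

You still need to telnet to each port that is reported and list the managed NEs to find the service you need. Remember that only one session per debug port is allowed at a time.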

NCP level

Complete the following steps to determine the port number and debug circuit issues in the NCP level:


Step 1 Check the NE service log at /tmp/<Service name>.log and search for "Debug Telnetd: port is <port number>."

Step 2 Note the port number.


Note Each NE service has a different port number.


Step 3 Telnet to the localhost using the debug Telnet port and log in using CISCO15 as the username for CTM R7.2 and earlier and otbu+1 for CTM R8.0 and later.


Note Only one user can use the debug port at a time. If the Telnet command fails and the port closes immediately, check for other open sessions. If there is an open session that is not in use, close it immediately.
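Steps 1 and 2 above can be sketched as a short log scan. The pattern is taken from the message quoted in Step 1. The function name and the sample lines are hypothetical, and the sketch assumes that the most recent occurrence in the log is the one currently in effect.

```python
import re
from typing import Iterable, Optional

# Message text quoted in Step 1, with the port number captured.
PORT_PATTERN = re.compile(r"Debug Telnetd: port is (\d+)")


def find_debug_port(lines: Iterable[str]) -> Optional[int]:
    """Return the port from the last matching log line, or None if there is no match."""
    port = None
    for line in lines:
        match = PORT_PATTERN.search(line)
        if match:
            port = int(match.group(1))  # later occurrences override earlier ones
    return port


# Invented log lines for demonstration.
sample = [
    "service starting\n",
    "Debug Telnetd: port is 12345.\n",
    "service restarted\n",
    "Debug Telnetd: port is 12346.\n",
]

print(find_debug_port(sample))
```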



The debug port contains a lot of information regarding the various cached objects in the NCP. Depending on the faulty feature, you have to decide what logs are required. Use the List command to list all the objects available on a port. When debugging issues related to circuits, the following objects are useful:

CircuitEnd

Circuits

NetCcatCircuit

NetCircuit

NetCircuitEnd

NetCircuitManager

NetCircuitNode

NetCircuitSpan

NetCircuitSplicer

NetCircuitWatchDog

Route and network objects—Used for debugging routing or link issues.

Topology objects—Used for debugging link-related issues.

BlsrRingManager—Used for debugging BLSR-related issues.

Bnrpackage, rollmanager, and roll objects—Used for debugging bridge and roll-related issues.

K.7.18.1  Cannot Route Circuits


Step 1 Make sure that bandwidth is available for the protected circuit.

Step 2 Enter the following commands to retrieve the dumps before and after the problem occurs:

> Net dump
> Topo dump
> Topo topo dump
> Network dump
> Circuits dump
> Splicetree dump

Step 3 In the Network Control Protocol (NCP) debug window, enter the following commands:

> net set
> nl set
> netlink set
> ncpp set
> preset route
> route set

Step 4 Create the circuit and dump all the circuit information into a file.

Step 5 Try manual routing on the circuit that you created and save all information in a file.


K.7.18.2  Circuit Creation Is Successful in CTC but Fails in CTM


Step 1 Enable the debug logs using the remote debugger.

Step 2 In the CTM Control Panel, set the debug level to Debug.

Step 3 Perform the steps listed in Cannot Route Circuits and collect the NE service logs.


K.7.18.3  Circuit Table Does Not Display the Correct Information


Step 1 Select the following debug options before opening the Circuit table:

Circuit Management

JDAQ

JDAQ Data

Step 2 Dump all the log information into a file.

Step 3 Open the Circuit table.

Step 4 Export all data from the Circuit table to a file.

Step 5 Identify the remote debug ports of the NEs on which the problem occurs.

Step 6 Log into the debug port and collect the dumps of all managed circuits.


K.7.18.4  Circuits Show Up in Incomplete State, Circuits Are Not Discovered, and Circuit Information in CTM Is Not Consistent with Circuit Information in CTC

Perform the following steps when you create a circuit on a set of NEs and the circuit shows as Incomplete in CTM and in CTC, which is launched from CTM. The circuit shows as Active if CTC is launched from a browser.


Step 1 In the NCP debug port window, enter the following commands:

>ncp set
>circuit dump
>splicetree dump
>node nmd hoconnections print
>node nmd loconnections print
>node nmd loedits print

Step 2 Telnet to the localhost using the debug Telnet port and log in using CISCO15 as the username for CTM R7.2 and earlier and otbu+1 for CTM R8.0 and later.

Step 3 Enter the following commands to create a debug log:

> log /tmp/debug.log
> ncpp set

Step 4 Create a circuit in CTM and enter the following command to check the circuit name:

> circuit dump

Step 5 Enter the following command if you created an STS circuit:

> node nmd hoconnections print

Step 6 Enter the following command if you created a VT circuit:

> node nmd loconnections print



Note If an NE is moved from one network partition to another, the NCP debug port number changes from one NE service to another.


K.7.18.5  Missing Links

The following commands are useful when resolving a missing link problem:


Note Enter all commands in the NCP debug port window.


TopologyX dump—Displays all the nodes and links in the topology. It also displays the current topology host. The X variable represents the topology identifier. Do not include the X variable if you want to display all topologies.

Network dump—Displays all the nodes and links in the network. It also displays the topology to which the nodes and links belong.

Network dumpwaitQ—Displays the nodes that have been added to the network but do not have a valid node ID.

NettopologyX dump—Displays all the links in the network topology along with the SPT. The SPT determines the nodes and links that a circuit can be routed to or routed through. The X variable represents the network topology identifier. Do not include the X variable if you want to display all the network topologies.

Netelement_xxx-xxx-xxx-xxx dump—Displays the current state of the NE. The xxx-xxx-xxx-xxx variable represents the IP address in reverse order. Do not include the xxx-xxx-xxx-xxx variable if you want to display all NEs.

Nldump—Displays the current state of each net link.

TopologyX dumpareas—(For multiple OSPF areas only.) Lists all the OSPF areas in the topology and indicates whether the OSPF area is actively monitored by a node.

TopologyX dumpagents—(For multiple OSPF areas only.) Lists all the topology agents or topology hosts and the OSPF areas that they are responsible for.

TopologyX dumpareatotopoagentmap—(For multiple OSPF areas only.) Lists all the topology agents or topology hosts that are responsible for maintaining the topology in an OSPF area.

K.7.18.6  Problems with Manual Links

Sometimes when manual link issues occur in CTM, you have to create the same manual link in CTC before resolving the issue in CTM. These links are called phantom links. Complete the following steps to create a phantom link:


Step 1 Create the phantom link in CTM.

Step 2 Access the CTC Telnet port and run the network dumpnls command to dump all the links with termination points listed in hexadecimal form.

Step 3 Convert the hexadecimal numbers to decimal.

Step 4 In CTC, launch the debug window by holding down the Ctrl and Shift keys and clicking the About button.

Step 5 Enter the following command to create the phantom link between ipaddress1 and ipaddress2, and between the termination points in decimal values:

network createPhantom <ipaddress1> <decimal value1> <ipaddress2> <decimal value2>
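The hexadecimal-to-decimal conversion in Step 3 and the command in Step 5 can be sketched together. The IP addresses and termination point values below are made-up examples, and the exact format in which the dump prints termination points is not specified here, so adjust the input handling to what you actually see.

```python
def hex_to_decimal(value: str) -> int:
    """Convert a hexadecimal string, with or without a 0x prefix, to an integer."""
    return int(value, 16)


def phantom_link_command(ip1: str, tp1_hex: str, ip2: str, tp2_hex: str) -> str:
    """Build the createPhantom command line with decimal termination points."""
    return (
        f"network createPhantom {ip1} {hex_to_decimal(tp1_hex)} "
        f"{ip2} {hex_to_decimal(tp2_hex)}"
    )


# Example values only.
print(phantom_link_command("192.0.2.1", "0x1A2B", "192.0.2.2", "ff"))
# network createPhantom 192.0.2.1 6699 192.0.2.2 255
```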


K.7.18.7  Debugging Server Trail Links

When debugging server trail link issues, use the same procedure as for any other link. Only the size and provision type are different.

If you telnet to an NCP debug port, you can extract server trail link information by using the network dump command.

K.7.18.8  Debugging Bridge and Roll-Related Problems

When bridge and roll-related problems occur, telnet to the NCP debug port and enter the rolls dump or rollmanager set command.

If there is a mismatch between the bridge and roll information in CTM and CTC, complete the following steps:


Step 1 On the CTM server, telnet to the local host.

Step 2 Log in using CISCO15 or otbu+1 as the username.

Step 3 Enter the following commands:

> log /tmp/debug.log
> ncpp set

Step 4 Create a new roll in CTM.

Step 5 On the CTM server, enter the following command to check for the new roll:

> rolls dump

Step 6 Enter the following command to check for the new roll on a circuit:

> circuits dump

Step 7 Enter the following command if you created an STS circuit roll:

> node nmd hoconnections print

Step 8 Enter the following command if you created a VT circuit roll:

> node nmd loconnections print

Step 9 Log off.



Note If NEs are moved from one network partition to another, the NCP debug port number changes from one NE service to another.


If there are any problems while creating a roll or performing any bridge and roll operations, complete the following steps:


Step 1 On the CTM server, telnet to the local host.

Step 2 Log in using CISCO15 or otbu+1 as the username.

Step 3 Enter the following commands:

> log /tmp/debug.log
> ncpp set
> rolls set
> rollmanager set
> logoff


K.7.19  Common Equipment Provisioning Problems

You can perform equipment provisioning at the following levels:

Node Level

Card Level

K.7.19.1  Node Level

K.7.19.1.1  Cannot Create DCC Using OSI or OSI and IP (PPP)

If you cannot create a DCC after choosing to use the OSI or OSI and IP (PPP) Layer 3/Layer 2 configuration, go to the NE Explorer > OSI tab > Routers-Setup subtab and enable the router on which you want to create the DCC.

K.7.19.1.2  Cannot Modify DCC

If you cannot modify a DCC, check whether the same DCC is created in the SDCC and LDCC.

K.7.19.2  Card Level

The following are guidelines when configuring pluggable port modules (PPMs), loopback, and alarm profiles on the card level.

K.7.19.2.1  PPM Tab

You should provision PPMs before creating pluggable port rates. The following are guidelines when provisioning PPMs on specific cards:

MRC-12 cards

You can provision up to twelve PPMs.

If the MRC-12 card is provisioned on slots 1 to 4 or slots 14 to 17, PPM1 is provisioned with only the OC48 rate.

If the MRC-12 card is provisioned on slots 5, 6, 12, or 13, you can provision PPM1, PPM4, PPM7, and PPM10 with the OC48 rate.

The total available bandwidth of the ports is OC48 when the card is provisioned on slots 1 to 4 or slots 14 to 17.

The total available bandwidth of the ports is OC192 when the card is provisioned on slots 5, 6, 12, or 13.

The total available bandwidth depends also on the XC card.

MRC-2.5G-12 cards

You can provision up to twelve PPMs.

Only PPM1 allows provisioning of the STM16 rate. If PPM1 is provisioned with STM16, the other PPMs are unprovisionable.

The total bandwidth of the ports is STM16.

PPM1, PPM4, PPM7, and PPM10 only allow provisioning of STM4.

MRC-2.5G-4 cards

You can provision up to four PPMs.

Only PPM1 allows provisioning of the OC48 rate. If PPM1 is provisioned with OC48, the other PPMs are unprovisionable.

The total bandwidth of the ports is OC48.

STM64 XFP-based cards

You can only provision one PPM.

The only allowed port rate is STM64.
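The slot-dependent rules for MRC-12 cards listed above can be written as a small lookup. This sketch encodes only the slot ranges and rates stated in the guidelines. It reads the first slot rule as "only PPM1 can carry OC48 in slots 1 to 4 and 14 to 17," which is an interpretation, and it does not model the dependency on the XC card. The function and constant names are hypothetical.

```python
from typing import List

SLOTS_OC48_TOTAL = set(range(1, 5)) | set(range(14, 18))  # slots 1 to 4 and 14 to 17
SLOTS_OC192_TOTAL = {5, 6, 12, 13}


def mrc12_total_bandwidth(slot: int) -> str:
    """Total available port bandwidth for an MRC-12 card in the given slot."""
    if slot in SLOTS_OC48_TOTAL:
        return "OC48"
    if slot in SLOTS_OC192_TOTAL:
        return "OC192"
    raise ValueError(f"slot {slot} is not covered by the MRC-12 guidelines")


def mrc12_oc48_ppms(slot: int) -> List[int]:
    """PPM numbers that can be provisioned with the OC48 rate in the given slot."""
    if slot in SLOTS_OC48_TOTAL:
        return [1]
    if slot in SLOTS_OC192_TOTAL:
        return [1, 4, 7, 10]
    raise ValueError(f"slot {slot} is not covered by the MRC-12 guidelines")


print(mrc12_total_bandwidth(3), mrc12_oc48_ppms(3))    # OC48 [1]
print(mrc12_total_bandwidth(12), mrc12_oc48_ppms(12))  # OC192 [1, 4, 7, 10]
```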

K.7.19.2.2  Loopback Tab

In the Loopback tab of the NE Explorer, you cannot change the admin state from IS to OOS, DSBLD. You should first change the admin state to OOS, MT or IS, AINS and then OOS, DSBLD.

K.7.19.2.3  Alarm Behavior Tab

You can only create alarm profiles at the node level. The Alarm Behavior tab in the card level of the NE Explorer only allows you to apply existing alarm profiles to the card.

K.7.20  How Do I Collect Solaris Client Thread Dumps?


Step 1 Open a terminal session.

Step 2 Run the client with the ctmc.sh script located in the /opt/CiscoTransportManagerClient directory.

Step 3 Enter the following command to obtain the client Java process ID:

ps -aef|grep java

Step 4 Enter the following command, where <pid> is the client Java process ID:

kill -QUIT <pid>
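Steps 3 and 4 can be sketched in a script that picks Java process IDs out of ps output and sends the QUIT signal. The sample ps output is invented for illustration, and the function names are hypothetical. The sketch assumes the process ID is the second whitespace-separated column, as in ps -aef output.

```python
import os
import signal
from typing import List


def java_pids(ps_output: str) -> List[int]:
    """Return PIDs of lines that mention java, ignoring the grep process itself."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Skip the header row and any malformed line (PID column must be numeric).
        if len(fields) < 2 or not fields[1].isdigit():
            continue
        if "java" in line and "grep" not in line:
            pids.append(int(fields[1]))
    return pids


def request_thread_dump(pid: int) -> None:
    """Equivalent of `kill -QUIT <pid>`; the dump appears on the client's console output."""
    os.kill(pid, signal.SIGQUIT)


# Invented sample of `ps -aef | grep java` style output.
sample = """\
     UID   PID  PPID   C    STIME TTY         TIME CMD
    user  4242     1   0 10:00:00 pts/1       0:15 /usr/bin/java -jar client.jar
    user  4300  4100   0 10:05:00 pts/1       0:00 grep java
"""

print(java_pids(sample))
```

If more than one Java process is listed, confirm which one belongs to the client before sending the signal.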


K.7.21  How Do I Collect Windows Client Thread Dumps?

To collect thread dumps for CTM R5.0.2 and earlier, complete the following steps:


Step 1 Open a DOS window.

Step 2 Run the ctmc.bat file located in the C:\Cisco\TransportManagerClient directory.

Step 3 Generate the thread dump by holding down the Ctrl key and pressing the Pause (Break) key.

Step 4 Copy and paste the output to a text file.


To collect thread dumps for CTM R6.0 and later, complete the following steps:


Step 1 Open a DOS window.

Step 2 Depending on your network size, enter one of the following commands:

ctm-small-debug.exe
ctm-medium-debug.exe
ctm-large-debug.exe
ctm-highend-debug.exe

Step 3 Generate the thread dump by holding down the Ctrl key and pressing the Pause (Break) key.

Step 4 Copy and paste the output to a text file.


K.7.22  How Do I Collect Server Thread Dumps?

Thread dumps are helpful references when debugging the CTM process. To collect thread dumps:


Step 1 Log into the server workstation as the root user.

Step 2 On the command line, enter the following command:

thread_dumper [<Group/Service>]

where:

Group is the group name for which to collect thread dumps. It can be:

SM

SNMPTRAP

NE

PM

GW

Service is the service name for which the thread dump is required. It can be:

SMService

SNMPTrapService

CORBAGWService

ONS15216NEService

ONS15302NEService

ONS15305NEService

ONS15310NEService

ONS15327NEService

ONS15454NEService

ONS15454SDHNEService

ONS15501NEService

ONS15530NEService

ONS15540NEService

ONS15600NEService

ONS15600SDHNEService

ONS15800NEService

ONS15801NEService

ONS15808NEService

UnmanagedNEService

ONS15216PMService

ONS15302PMService

ONS15305PMService

ONS15310PMService

ONS15327PMService

ONS15454PMService

ONS15454SDHPMService

ONS15501PMService

ONS15530PMService

ONS15540PMService

ONS15600PMService

ONS15600SDHPMService

ONS15800PMService

ONS15801PMService

ONS15808PMService


K.7.23  How Do I Enable or Disable the Automatic Refresh Data Feature?

The automatic Refresh Data feature automatically refreshes all data being displayed by CTM. When automatic refresh is enabled, you receive the following prompt:

Refresh Data action suggested. This action will result in closing all windows and might 
take some time. Do you want to continue? {Yes | No}

In an unstable environment where NEs are synchronizing or changing operational states frequently, you might receive the preceding prompt continuously. To disable the prompt:


Step 1 In the Domain Explorer window, choose Edit > User Preferences. The User Preferences dialog box opens.

Step 2 Click the Miscellaneous tab.

Step 3 To disable the prompt, uncheck the Enable Refresh Data Timer check box. (To enable the prompt, leave the check box checked.)

Step 4 Click OK.


K.7.24  How Do I Replace the Alarm Interface Panel?

The Alarm Interface Panel (AIP) provides surge protection for CTC-based NEs. This panel has a nonvolatile memory chip that stores the unique node address known as the MAC address. The MAC address identifies the nodes that support circuits. It allows CTM to determine circuit sources, destinations, and spans.

If an AIP fails, an alarm is generated and the LCD display on the fan-tray assemblies of the NEs becomes blank. To perform an in-service replacement of the AIP, you must contact the Cisco TAC.

You can replace the AIP on an in-service system without affecting traffic by using the circuit repair feature. See 7.2.11  Repairing Circuits, page 7-138.

K.8  Client Debug Messages

On Windows, you can start the CTM client with a console window that displays exceptions and client debug messages. You can use these messages to help troubleshoot any client problems.


Step 1 Depending on your network size, enter one of the following commands to start the client with a console window:

/bin/ctm-small-debug.exe
/bin/ctm-medium-debug.exe
/bin/ctm-large-debug.exe
/bin/ctm-highend-debug.exe

Step 2 When the CTM client exits, the console window closes. If you want to capture and save the messages when the client exits, you must redirect the output to a file. In this case, the console will not be displayed. To save exceptions to a file, open one of the following files that corresponds to your network size:

/bin/ctm-small-debug.lax

/bin/ctm-medium-debug.lax

/bin/ctm-large-debug.lax

/bin/ctm-highend-debug.lax


Note It is strongly recommended that you save a copy of the original .lax file before modifying it.


Step 3 Using a text editor, open the appropriate .lax file and change the following lines by replacing "console" with a filename:

lax.stderr.redirect=console
lax.stdout.redirect=console

When specifying filenames, make sure to use escaped backslashes (such as c:\\myfolder\\client.log). For example, to save messages to the C:\myfolder\client.log file, change the lines to read:

lax.stderr.redirect=c:\\myfolder\\client.log
lax.stdout.redirect=c:\\myfolder\\client.log

Step 4 After modifying the .lax file, enter one of the following commands to launch the CTM client, depending on your network size:

/bin/ctm-small-debug.exe
/bin/ctm-medium-debug.exe
/bin/ctm-large-debug.exe
/bin/ctm-highend-debug.exe
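The .lax edit in Step 3 can also be scripted, which avoids mistakes with the escaped backslashes. The following sketch rewrites only the two redirect lines and leaves every other line unchanged. The function name is hypothetical, and you should still save a copy of the original .lax file first, as recommended in the note above.

```python
def redirect_lax_output(lax_text: str, log_path: str) -> str:
    """Point lax.stderr.redirect and lax.stdout.redirect at log_path."""
    escaped = log_path.replace("\\", "\\\\")  # a .lax file needs doubled backslashes
    keys = ("lax.stderr.redirect", "lax.stdout.redirect")
    out = []
    for line in lax_text.splitlines():
        key, sep, _ = line.partition("=")
        if sep and key.strip() in keys:
            out.append(f"{key.strip()}={escaped}")
        else:
            out.append(line)  # untouched: comments and all other properties
    return "\n".join(out) + "\n"


original = "lax.stderr.redirect=console\nlax.stdout.redirect=console\n"
print(redirect_lax_output(original, r"c:\myfolder\client.log"))
```

For the example path, the two output lines match the ones shown in Step 3.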


On Solaris, you can start the CTM client with a console window that displays exceptions and client debug messages. You can use these messages to help troubleshoot any client problems.

Depending on your network size, enter one of the following commands to start the client with a console window:

ctmcdebug-start -small
ctmcdebug-start -medium
ctmcdebug-start -large
ctmcdebug-start -highend

K.9  CTM GateWay/TL1 Problems

If the OSS cannot connect to CTM GateWay/TL1, or if the OSS does not receive a response, check the cables, address bindings, and configuration.


Step 1 If the connection times out, check the cables and connections.

Step 2 Verify that the address bindings between the OSS and the Solaris workstation running the CTM GateWay/TL1 service are correct.

Step 3 Using the ping command, verify connectivity between the OSS and the Solaris workstation running the CTM GateWay/TL1 service.

Step 4 Verify that the TCP/IP socket connection is still open for the OSS-to-CTM GateWay/TL1 port.

Step 5 Choose Administration > Control Panel. Click GateWay/TL1 Service and confirm that the service status is Active.

Step 6 Verify that the username and password under Control Panel > Security Properties match the user configuration of the NEs that CTM GateWay/TL1 will manage.

Step 7 Verify that the username and password in the Domain Explorer > Network Element Properties > NE Authentication tab match the user configuration of the NEs that CTM GateWay/TL1 will manage.


K.10  CTM GateWay/CORBA Problems

K.10.1  CTM GateWay/CORBA Is Installed After Installing the CTM Server

If the CTM server is installed without the CTM GateWay/CORBA option, and the CTM GateWay/CORBA option is installed later, the Control Panel does not refresh automatically after the CORBA installation to show that CTM GateWay/CORBA is installed.

This problem occurs because the CTM GateWay/CORBA installation is handled by a script that notifies the database that the component is installed, but does not notify the CTM server.

To work around this problem, click the Refresh Data button in the Control Panel. It might take some time for the Control Panel to show that CTM GateWay/CORBA is installed. You might need to close and reopen the Control Panel to see the change.

K.10.2  Testing the CTM GateWay/CORBA

Whenever you encounter CTM GateWay/CORBA problems, complete the following steps to test CTM GateWay/CORBA:


Step 1 In the Control Panel, make sure that the status of CTM GateWay/CORBA is Active and that the service action is set to Stop. See 12.2.3.2  Starting or Stopping CTM GateWay/CORBA, page 12-56 and 12.2.3.3  Viewing the CTM GateWay/CORBA Service Pane, page 12-57 for more information.

Step 2 Check that your profile is created in the CTM GateWay/CORBA Users table. Use that profile and the correct password to connect the OSS client to the CTM server. See 12.2.3.4  Viewing the CTM GateWay/CORBA Users Table, page 12-59 and 12.2.3.5  Adding a CTM GateWay/CORBA User, page 12-59 for more information.

Step 3 If you are still encountering problems after performing steps 1 and 2, log into the server and collect the following logs:

/tmp/CORBAGWService.log

/tmp/0.log

/tmp/jcorbagw/log.bak

/opt/CiscoTransportManagerServer/log/CORBAGWService-*.log

/opt/CiscoTransportManagerServer/openfusion/domains/OpenFusion/localhost/NotificationService/log/NotificationService.log

Step 4 Complete the following steps to change the CTM GateWay/CORBA notification service port to dynamic using the CTM client:

a. In the Domain Explorer window, choose Administration > Control Panel.

b. Click GateWay/CORBA Service to open the GateWay/CORBA Service pane.

c. In the Global tab > Status area, click the Stop button to stop the service.

d. In the GateWay/CORBA Configuration area, set the Notification Service Listening Port Number to zero.

e. Save the changes.
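
The log collection in Step 3 can be scripted so that all matching files are bundled into one archive. The following Python sketch is an illustration under stated assumptions, not a CTM utility. The function name collect_logs and the archive path are hypothetical; pass in the log paths and wildcard patterns listed in Step 3.

```python
import glob
import os
import tarfile


def collect_logs(patterns, archive_path):
    """Archive every regular file that matches the given glob patterns.

    Returns the sorted list of files that were added. Patterns that
    match nothing are skipped silently, so compare the result with the
    list of logs you expected to find.
    """
    found = sorted(
        {path for pattern in patterns
         for path in glob.glob(pattern)
         if os.path.isfile(path)}
    )
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in found:
            tar.add(path)
    return found
```

For example, calling collect_logs with the patterns /tmp/CORBAGWService.log and /opt/CiscoTransportManagerServer/log/CORBAGWService-*.log produces one compressed archive containing whichever of those files exist.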


K.11  Problems with MGX Voice Gateway Devices

This section describes troubleshooting procedures for problems related to MGX Voice Gateway devices.

K.11.1  Discovery Mechanism

CTM manages the Private Network-Network Interface (PNNI). The ILMITopoc process uses the SNMP protocol to discover the physical PNNI network. All of the discovered nodes are displayed in the MGX NE GUIs (Configuration Center, Diagnostics Center, Statistics Report, and Chassis View). CTM is notified of all subsequent changes in the network through traps for MGX nodes.

Once the ILMITopoc process discovers all of the nodes, it sends the nodes to the topod process. Topod then sends these nodes and trunks to other processes in CTM such as ooemc, NM server, nts, and so on.

K.11.2  Discovery Issues at Startup

This section includes the following information:

No Nodes Are Discovered

Node Name Is Incorrect in the Database

Node IP Address Is Incorrect in the Database

Node Alarm Is Shown Incorrectly in the Database

Reachable Node Is Shown as Unreachable

CTM State Is Incorrect

K.11.2.1  No Nodes Are Discovered

If no nodes are in the CTM database, complete the following steps:


Step 1 Verify that the nodes are IP-reachable. Complete the following substeps:

a. Enter the following CLI commands to retrieve the primary IP address of the node:

dspndparms
dspipif

b. Enter the following command to check whether CTM has IP connectivity to the node:

ping <node_IP_address>

Step 2 If CTM has IP connectivity to the nodes, verify that the community strings of the gateway nodes are correct. Complete the following substeps:

a. Enter the following command on the switch CLI of the node that cannot be discovered:

dspsnmp

b. Compare this community string with the community string in the node_info table. The community string in the node_info table is encrypted. You will need to decrypt this string and verify it.

Step 3 For MGX nodes, if persistent topology is enabled on the switch gateway and some nodes in a peer group are not discovered, complete the following substeps:

a. Enter the following CLI command to verify that persistent topology is enabled on the gateway in the peer group:

dsptopogw

b. If the gateway flag is enabled, enter the following CLI command to verify that the node is part of the list of nodes on the gateway:

dsptopondlist 

Step 4 Collect the following defect information for analysis:

Topod.log and ILMITopoc.log

Data in the node_info table

Output of dspndparms, dspsnmp, and dspipif on the MGX node


K.11.2.2  Node Name Is Incorrect in the Database

If the node name of the standalone MGX node is incorrect, complete the following steps:


Step 1 If the node table contains the correct name, complete the following substeps:

a. Enter the following command to dump the cache of topod and ILMITopoc:

kill -USR1 <process>

b. From the data collected, verify which process contains the incorrect information. If the information is correct in all processes, open a new instance of the GUI and verify whether the problem still exists.

Step 2 If the node table contains the incorrect name, check the ILMITopoc.log file to see from which node it collected the wrong information. Complete the following substeps:

a. Enter the following command:

snmpget -c <community> <IP_address> 1.3.6.1.2.1.1.5

b. Verify that the name received as a response is correct. If the name received is incorrect, verify the name on the switch.

Step 3 Collect the following defect information for analysis:

ILMITopoc.log and topod.log

Dump outputs of ILMITopoc and topod, retrieved with the kill -USR1 <process> command

Output of the switch CLI, selnd, and dbnds


K.11.2.3  Node IP Address Is Incorrect in the Database


Step 1 If the node table contains the correct IP address, complete the following substeps:

a. Enter the following command to dump the cache of topod and ILMITopoc:

kill -USR1 <process>

b. From the data collected, verify which process has the incorrect information. If the information is correct in all processes, open a new instance of the GUI and verify whether the problem still exists.

Step 2 If the node table does not contain the correct IP address, verify whether the node is managed by the LAN IP address. In CTM, a node that has no PNNI trunks and does not have the persistent topology feature enabled is managed by its LAN IP address by default.

Step 3 Collect the following defect information for analysis:

ILMITopoc.log and topod.log

Dump outputs of ILMITopoc and topod, retrieved with the kill -USR1 <process> command

Output of the switch CLI, selnd, and dbnds


K.11.2.4  Node Alarm Is Shown Incorrectly in the Database


Step 1 Enter the following command to dump the cache of topod and ILMITopoc:

kill -USR1 <process>

Step 2 From the data collected, verify which process contains the incorrect information.

Step 3 If the node table does not have the node alarm status set correctly, check the ILMITopoc.log file and enter the following command:

snmpget -c <community> <IP_address> 1.3.6.1.4.1.351.110.1.1.14

The snmpget command returns the alarm state of the node, where 1 = Clear, 2 = Minor, 3 = Major, and 4 = Critical.

Step 4 Collect the following defect information for analysis:

ILMITopoc.log and topod.log

Dump outputs of ILMITopoc and topod, retrieved with the kill -USR1 <process> command

Output of the switch CLI, selnd, and dbnds
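
If you script the snmpget check in Step 3, the numeric result can be translated with a small lookup. The following Python sketch uses only the four values documented above; the function name describe_alarm_state is hypothetical, and any other value is reported as unrecognized rather than guessed.

```python
# Alarm state values as documented for the snmpget result in Step 3.
ALARM_STATES = {1: "Clear", 2: "Minor", 3: "Major", 4: "Critical"}


def describe_alarm_state(value) -> str:
    """Translate the numeric alarm state into its documented name."""
    try:
        code = int(str(value).strip())
    except ValueError:
        return f"Unrecognized ({value!r})"
    return ALARM_STATES.get(code, f"Unrecognized ({code})")
```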


K.11.2.5  Reachable Node Is Shown as Unreachable

If a node that is reachable from CTM is shown as unreachable in the Configuration Center, Chassis View, Diagnostics Center, or Statistics Report, complete the following steps:


Step 1 If the node table reports the alarm state of this node as minor, major, critical, or clear, complete the following substeps:

a. Enter the following command to dump the cache of NM server, topod, and ILMITopoc:

kill -USR1 <process>

b. Check the data collected in the /opt/svplus/log directory and verify whether the node alarm status is correct. If the alarm status is correct in the cache dump, open a new instance of the GUI and check whether the node shows the correct alarm status in that window.

Step 2 If the node table shows the alarm state of an MGX node as unreachable, complete the following substeps:

a. Verify that the node is IP-reachable from CTM.

b. Ping the active IP address of the node. The active IP address is the IP address of the node that is populated in the node table.

c. Enter the following CLI command to retrieve the community string of the node:

dspsnmp

d. Verify that the community string in the node_info table is the same as the community string on the node.

e. Check the nts logs to verify that nts has declared the node as reachable. See NTS. If nts has declared the node as reachable, check the topod.log for "nonRoutingNodeMsg" for the node. Verify that the alarm status in this message is the correct alarm status.

Step 3 Collect the following defect information for analysis:

ILMITopoc.log, topod.log, NMServer.*.log, nts.*.log, and ooemc*.log

Dump outputs of ILMITopoc, topod, and NM server, retrieved with the kill -USR1 <process> command

Output of the switch CLI, selnd, and dbnds


K.11.2.6  CTM State Is Incorrect

If the management state of the node in the node table (mgmt_state column) shows as DOWN (2) or UNKNOWN (0) even when the node is reachable by IP, complete the following steps:


Step 1 Check the trap manager registration of the CTM station on the node. Log into the node and enter the following command to check whether the CTM station has registered to the node for traps. This is required for the node to be declared as manageable (mgmt_state = UP):

dsptrapmgr

Step 2 Enter the community strings (SNMP-GET and SNMP-SET) of the node through the Domain Explorer > Network Element Properties pane > NE Authentication tab. The SNMP strings might be incorrect in the node_info table. The strings can be verified if the decrypt tool for the encrypted strings is available in the database.

Step 3 Check the nts.log to verify that trap registration succeeded for the node. See NTS.

Step 4 Save ILMITopoc.log, topod.log, and ooemc*.log for analysis.


K.11.3  Discovery Issues at Runtime

This section includes the following information:

Node Is Not Shown in the Configuration Center, Chassis View, Diagnostics Center, or Statistics Report

Node Name Change Is Not Updated in the Configuration Center, Chassis View, Diagnostics Center, or Statistics Report

Node IP Address Change Is Not Updated in the Configuration Center, Chassis View, Diagnostics Center, or Statistics Report

K.11.3.1  Node Is Not Shown in the Configuration Center, Chassis View, Diagnostics Center, or Statistics Report

If a dynamically added routing node is not shown in the Configuration Center, Chassis View, Diagnostics Center, or Statistics Report, complete the following steps:


Step 1 If the node table contains an entry for the node, complete the following substeps:

a. Enter the following command to dump the cache of NM server, topod, and ILMITopoc:

kill -USR1 <process>

b. Check the data collected in the /opt/svplus/log directory and verify whether the node is present in the cache dumps. If the node is present, open a new instance of the GUI and check whether the node is visible.

Step 2 If the node table does not contain an entry for the node, complete the following substeps:

a. Check the ILMITopoc log and verify whether it received a 70005 and 70201 trap (if the persistent topology feature is enabled).

b. If you see the trap in the log, verify that ILMITopoc's SNMP requests issued after receiving the trap are successful. If an SNMP request does not go through, verify whether CTM has IP and SNMP connectivity to the newly added node.

c. If you do not see the traps in the ILMITopoc.log, see NTS.

Step 3 Complete the following substeps to collect defect information for analysis:

a. Save ILMITopoc.log, topod.log, NMServer.log, and nts.*.log.

b. Enter the following command to collect the dump outputs of ILMITopoc, topod, and NM server:

kill -USR1 <process>

c. Collect the output of the switch CLI, selnd, and dbnds.

Step 4 If the preceding steps do not solve the problem, delete the node from the network and add it again.


K.11.3.2  Node Name Change Is Not Updated in the Configuration Center, Chassis View, Diagnostics Center, or Statistics Report

If changes to the node name are not reflected in the Configuration Center, Chassis View, Diagnostics Center, or Statistics Report, complete the following steps:


Step 1 If the node table contains an entry for the node with the correct name, complete the following substeps:

a. Enter the following command to dump the cache of NM server, topod, and ILMITopoc:

kill -USR1 <process>

b. Check the data collected in the /opt/svplus/log directory and verify whether the node name is correct in the cache dumps. If the node name is correct, open a new instance of the GUI and check whether the node name is correct.

Step 2 If the node table does not contain an entry for the node with the correct name, complete the following substeps:

a. Check the ILMITopoc log and verify whether it received a 60006 and 70202 trap.

b. If you see the trap in the log, verify that ILMITopoc updates its cache based on the trap. If it fails to update the node name, ILMITopoc.log contains the message "%ILMITopoc-3-updateFailed: <Node_c::updateName> Failed to update node name."

c. If you do not see the traps in the ILMITopoc.log, see NTS.

Step 3 Collect the following defect information for analysis:

ILMITopoc.log, topod.log, NMServer.log, and nts.*.log

Dump outputs of ILMITopoc, topod, and NM server, retrieved with the kill -USR1 <process> command

Output of the switch CLI, selnd, and dbnds


K.11.3.3  Node IP Address Change Is Not Updated in the Configuration Center, Chassis View, Diagnostics Center, or Statistics Report

If the node IP address is not updated dynamically in the Configuration Center, Chassis View, Diagnostics Center, or Statistics Report, complete the following steps:


Step 1 If the node table contains an entry for the node with the correct IP address, complete the following substeps:

a. Enter the following command to dump the cache of NM server, topod, and ILMITopoc:

kill -USR1 <process>

b. Check the data collected in the /opt/svplus/log directory and verify whether the IP address is correct in the cache dumps. If the IP address is correct, open a new instance of the GUI and check whether the IP address is correct in that GUI.

Step 2 If the node table does not contain an entry for the node with the correct IP address, complete the following substeps:

a. Check the ILMITopoc log and verify whether it received a 60007 and 70202 trap.

b. If you see the trap in the log, verify that ILMITopoc updates its cache based on the trap.

c. If you do not see the traps in the ILMITopoc.log, see NTS.

Step 3 Collect the following defect information for analysis:

ILMITopoc.log, topod.log, NMServer.log, and nts.*.log

Dump outputs of ILMITopoc, topod, and NM server, retrieved with the kill -USR1 <process> command

Output of the switch CLI, selnd, and dbnds


K.11.4  Equipment Management Problems

This section describes troubleshooting procedures for equipment management problems on Cisco MGX devices.

K.11.4.1  Equipment Management Declares Successful Node Resync when Node Mode Is 3

MGX nodes are managed by the ooemc component in CTM. Equipment management (EM) uses cold start, warm start, and periodic resync to synchronize network data. Each of these resync methods involves configuration file upload, parsing, and database population. The node mode in the node table indicates the state of the synchronization process. The possible node modes are 1, 2, 3, and 5. The expected mode for a successful node resync is 3. If the node mode stays in other modes for a long time, it indicates a resync problem. Any synchronization problem could be caused by the switch, SNMP, FTP, or ooemc. You must investigate different areas to identify the root cause of any synchronization problem.

The following is the flow of events in the ooemc cold start synchronization process:

1. The topo process discovers a node in the network. The process changes the node mode in the node database table to -1. It then notifies the ooemc component of the discovery. The ooemc changes the node mode from -1 to 1 after the component begins managing the node.

2. When the nts component is able to register with the node, it sends a link up message to the ooemc that manages the node. The ooemc component changes the node mode from 1 to 2 to signify the start of the node synchronization.

3. The ooemc component sends an SNMP bulk file creation request to the switch. If the request is allowed, the switch sends a trap 60901 to CTM to signify the start of bulk file creation. When bulk file creation is complete on the switch, the switch sends a trap 60902 to CTM. After CTM receives the trap, it begins to FTP the config upload files.

4. If the upload and parsing of all config files completes without errors, the ooemc changes the node mode from 2 to 3. If the upload or parsing encounters errors for any service module on the switch, the node synchronizes partially and the ooemc changes the node mode from 2 to 4 at the end of the node synchronization process. If the error occurs on the shelf generic card file (CARD_01_CC.CF) or PNNI file (PNNI_01_CC.CF), the ooemc changes the node mode from 2 to 5 to signify node sync-up failure.
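
The cold start flow above can be summarized as a small state table. The following Python sketch is a reading aid only and does not reflect how ooemc is implemented. The mode numbers come from the flow above; the event names and the functions next_mode and run are hypothetical labels chosen for this illustration.

```python
# Node modes as described in the cold start flow.
NODE_MODES = {
    -1: "discovered by the topology process, not yet managed by ooemc",
    1: "managed by ooemc, waiting for the link up message",
    2: "synchronizing",
    3: "synchronized",
    4: "partially synchronized (service module file error)",
    5: "sync-up failed (shelf generic or PNNI file error)",
}

# (current mode, event) -> next mode. Event names are hypothetical.
TRANSITIONS = {
    (-1, "ooemc_manages_node"): 1,
    (1, "link_up"): 2,
    (2, "all_files_parsed"): 3,
    (2, "service_module_file_error"): 4,
    (2, "shelf_or_pnni_file_error"): 5,
}


def next_mode(mode, event):
    """Return the mode after an event; unknown events leave it unchanged."""
    return TRANSITIONS.get((mode, event), mode)


def run(events, start=-1):
    """Apply a sequence of events to a newly discovered node."""
    mode = start
    for event in events:
        mode = next_mode(mode, event)
    return mode
```

Read this way, a node that stays in mode 1 never received the link up event, and a node that stays in mode 2 never reached any of the three outcomes of step 4.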

K.11.4.2  Network Setup and Configuration Required for OOEMC Sync-Up Process

The EM requires network setup and CTM configuration prerequisites before it can start the sync-up process. On the switch, complete the following steps:

Use the dsptrapmgr CLI command to verify that the CTM IP address has been added to the trap manager. If your CTM IP address is not registered with the switch, add the CTM IP address to the trap manager by entering the addtrapmgr <cwm_ip> 2500 CLI command, where 2500 is the port number and cwm_ip is the IP address of your CTM machine. You might need to delete other IP addresses from the trap manager list if the list is full. If the list is not full, the registration takes place automatically if CTM is configured to manage the node.

The trap IP address should have been configured correctly on the switch. Whenever you configure the switch to use any IP address (atm0 IP or lnPci0 IP) of the switch as the primary IP address, the topology component of CTM should use that IP address for node discovery. You can display the primary and secondary IP address of the switch by entering the dspndparms CLI command. To configure the primary and secondary IP addresses, you can enter cnfndparms and the CLI will prompt you for options. After you set up the primary and secondary IP addresses, you should also configure the primary IP address as the trap IP address on the switch by entering the cnftrapip configured_IP CLI command, where configured_IP is the IP address that you have chosen to use as the primary IP address in the CLI command cnfndparms.

On CTM, the following is a list of configurable parameters in the file /opt/svplus/config/emd.conf that you should set up for correct problem logging:

"OODebug"—For debugging purposes, set this parameter to level 6 or above; for example, the log level statement in emd.conf should be "OODebug Level 6."

"OOKeep"—For debugging purposes, set this parameter to a value depending on how many log files you want to keep; for example, the statement in emd.conf should be "OOKeep 100 ooemc log files per oochild."


Note Sometimes, a different network environment requires other parameters in the same files to be tuned for correct CTM functioning.
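
Before you reproduce a problem, you can confirm the logging settings with a quick check of emd.conf. The following Python sketch assumes that the statements appear on their own lines in the forms quoted above (OODebug Level followed by a number, and OOKeep followed by a number); the actual file syntax might differ. The function names read_emd_settings and debug_logging_ok are hypothetical.

```python
import re


def read_emd_settings(text):
    """Extract the OODebug level and OOKeep count from emd.conf text.

    Assumes one statement per line, such as "OODebug Level 6" and
    "OOKeep 100". A setting that is not found is returned as None.
    """
    settings = {"OODebug": None, "OOKeep": None}
    for line in text.splitlines():
        debug = re.match(r"\s*OODebug\s+Level\s+(\d+)", line)
        keep = re.match(r"\s*OOKeep\s+(\d+)", line)
        if debug:
            settings["OODebug"] = int(debug.group(1))
        elif keep:
            settings["OOKeep"] = int(keep.group(1))
    return settings


def debug_logging_ok(text, minimum=6):
    """Return True if OODebug is set to the recommended level or above."""
    level = read_emd_settings(text)["OODebug"]
    return level is not None and level >= minimum
```

To use the sketch, read /opt/svplus/config/emd.conf into a string and pass it to debug_logging_ok.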


K.11.4.3  Node Mode Remains in 1

If the node has been discovered by the CTM network topology process, and its node mode has changed from -1 to 1 in the node table entry, but it stays in mode 1 for a long period of time, complete the following steps:


Step 1 If CTM stays in mode 1 for a long time after the CTM core has been started, check the trap manager on the switch. See Network Setup and Configuration Required for OOEMC Sync-Up Process for the trap manager and trap IP address setup.

Step 2 If Step 1 does not reveal the source of the problem, check for rtm link up messages that nts sends to EMD. The ooemc starts node resynchronization once EMD notifies the ooemc that the node is active. Therefore, the second step for debugging this mode 1 problem is to see whether the rtm link up message has been received by EMD and whether it has been forwarded to the ooemc.

For example, to find the rtm link up messages received by EMD for the node with ID 9, enter the following command:

grep RTM_LINK_UP emd* | grep "Node id 9" 

Once the location of the rtm link up message is found in the log, view the log file for more information on notification to ooemc.

Step 3 If the rtm link up message for the node is found and the log indicates that notification to ooemc has been sent, collect the log files and report the problem.

Step 4 If the rtm link up message cannot be found, search for link down messages by grepping RTM_LINK_DOWN from the emd log files. If an rtm link down message is found, the node is not reachable by the nts process; check with the network administrator. If neither RTM_LINK_UP nor RTM_LINK_DOWN messages are found, see NTS.

Step 5 Collect the ooemc and nts log files from the /opt/svplus/log directory for analysis.
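
The grep checks in Steps 2 through 4 can be combined into one pass over the emd log lines. The following Python sketch assumes, as the grep example implies, that the message name and the text "Node id <n>" appear on the same log line. The function names link_events and diagnose are hypothetical. Unlike a plain grep for "Node id 9", the pattern does not match node IDs such as 90 or 91.

```python
import re


def link_events(lines, node_id):
    """Return the ordered list of "UP"/"DOWN" link messages for a node."""
    # The negative lookahead keeps "Node id 9" from matching "Node id 90".
    pattern = re.compile(rf"Node id {int(node_id)}(?!\d)")
    events = []
    for line in lines:
        if not pattern.search(line):
            continue
        if "RTM_LINK_UP" in line:
            events.append("UP")
        elif "RTM_LINK_DOWN" in line:
            events.append("DOWN")
    return events


def diagnose(events):
    """Map the messages that were found to the next action in the procedure."""
    if "UP" in events:
        return "link up received: check that EMD notified ooemc"
    if "DOWN" in events:
        return "link down only: node is not reachable by nts"
    return "no link messages: see NTS"
```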


K.11.4.4  Node Mode Remains in 2

After EMD receives the rtm link up message and the node active message has been sent to ooemc, ooemc changes the node mode from 1 to 2 and triggers the node resync process. The node resync time varies depending on the switch configuration and network activities. If the node mode stays in mode 2 for longer than the normal time for node resynchronization, there might be a problem with the node resync process.


Step 1 The debugging process for this problem focuses mainly on log file inspection. Make sure that the log level of all related CTM processes is set at the correct level. See Network Setup and Configuration Required for OOEMC Sync-Up Process.

Step 2 Go to the /opt/svplus/tmp directory and check for configuration upload files.

Step 3 Determine which files have been uploaded for a node. For example, enter the following command on a node with a node_id of 9:

ls -ltr *.9  

If files have been uploaded for this node, the node resync process is ongoing. Make sure that the time stamps for these files refer to the current time.

Step 4 If there are no files uploaded for a node, the trap IP address on the switch might not match the IP address used for node discovery. It is also possible that an SNMP request failure occurred. To check the IP addresses, complete the following substeps:

a. Use the dbnds CLI command to display the IP address used for node discovery.

b. Use the dsptrapip CLI command to display the trap IP address.

c. If there is a mismatch, use the cnftrapip CLI command to reconfigure the trap IP address. Use the IP address used by CTM for node discovery. See Network Setup and Configuration Required for OOEMC Sync-Up Process.

Step 5 If the trap IP address is not the issue, check whether there is an SNMP request failure. Normally, after the node resync process is triggered, ooemc sends SNMP requests to the switch to start the configuration upload file creation process. If the request fails, ooemc continues resending the SNMP requests until it exceeds the maximum number of retries and declares node resync failure with node mode equal to 5.


Note It should not take long to declare node resync failure because of the SNMP request failure. To verify that there is an SNMP failure, check the ooemc and snmpcomm log files.



Note Using ooemc10.6568.log as an example of an ooemc log filename, 10 is the child ID, and 6568 is the ooemc process ID. The child ID is calculated from the following formula: Remainder (NEDBACCSSID / number of ooemc child) + 1. To retrieve the process ID, use the psg em CLI command.


Step 6 Grep RESYNC from the log file. The grep result should include the node ID and node mode. Review the log messages in the log file to determine whether an SNMP failure occurred.

Step 7 Complete the following substeps:

a. Check the log file for trap 60903. If you find it, contact the Cisco TAC to investigate the bulk file creation failure on the switch.

Very often, the node mode 2 problem is caused by the switch aborting bulk file creation. If there are no SNMP request failures, the switch should send traps 60901 and 60902 to the ooemc. Trap 60901 indicates that the switch will start bulk file creation, while trap 60902 indicates bulk file creation is complete. When the ooemc receives a trap 60902, it starts to FTP the bulk file, which is a shelf-generic configuration file. This is also the first file to be uploaded in every resync mode (cold start, warm start, or periodic resync). The most common mode 2 problem is that the switch aborts the bulk file creation and sends trap 60903 to CTM. CTM then reschedules the SNMP request unless it has exceeded the maximum number of retries.

b. Check the log file for trap 60902.

CARD_01_CC.CF.10 is a shelf-generic file and is the first file to be uploaded after CTM receives trap 60902 from the switch. The 10 is the node ID. If you find trap 60902 in the log file but no config file has been uploaded after a reasonable amount of time, it is possible that the FTP failed.

Step 8 To find out if an FTP failure occurred, complete the following substeps:

a. Grep "to ftp file" and the config filename, or "FTP" from the log files.

b. Trace the log messages in the log file.

For more detailed information on SNMP request failures and FTP failures, check the snmpcomm and cwmftpd log files and the cwmftp.request_log. Search for the filename at the time the error occurred in both files. The cwmftp.request_log gives the summary or final result of the FTP operation, and any error is reported. The cwmftp.log shows the FTP operation details.

CTM uploads and parses a set of config files from the switch. The following files are uploaded from the MGX NE, which contains VISM, AXSM, VXSM, SRM, and RPM/RPM-PR cards:

1. CARD_01_CC.CF

2. SM_1_slot#.CF

3. SM_1_slot#.CS

4. SM_CARD_01_slot#.CF

5. SM_CONN_01_slot#.CF

6. SM_ALARM_01_slot#.CF

7. SM_CON_UPDATE_01_slot#.CF

8. SM_CARD_01_SRM.CF

9. SM_CARD_01_RPM.CF

10. PNNI_01_CC.CF

Files 1 and 2 are uploaded for each NBSM. Files 4 to 7 are uploaded for each AXSM. File 9 is uploaded for all RPM/RPM-PR cards on the switch.

The following files are uploaded from the MGX NE, which includes VISM, AXSM, VXSM, SRM, RPM/RPM-PR, and RPM-XF cards:

1. CARD_01_CC.CF

2. SM_1_slot#.CF

3. SM_1_slot#.CS

4. SM_CARD_01_slot#.CF

5. SM_CONN_01_slot#.CF

6. SM_ALARM_01_slot#.CF

7. SM_CON_UPDATE_01_slot#.CF

8. SM_SC_slot#_transactionID_date.CF

9. SM_IC_slot#_transactionID_date.CF

10. SM_CARD_01_SRM.CF

11. SM_CARD_01_RPM.CF

12. PNNI_01_CC.CF

The difference between the items in these lists is in the files uploaded for AXSM cards and RPM-XF cards. For switch software release 4.0 or later, the static file (SM_SC_slot#_transactionID_date.CF) and incremental file (SM_IC_slot#_transactionID_date.CF) are uploaded. For switch software release 3.0 and earlier, conn, alarm, and conn update files are uploaded for each AXSM. Files 4 to 7 in the second list are uploaded for each RPM-XF card, and file 11 is uploaded for all RPM/RPM-PR cards on the switch.

Step 9 In another scenario in which the node mode stays in 2 for a very long time, the parsing of one particular config upload file takes a very long time and does not complete. To view the parsing process, enter the following command:

tail -f <ooemc_log_file>

Step 10 Collect the ooemc, nts, snmpcomm, and cwmftpd log files from the /opt/svplus/log directory.
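
To find the ooemc log file for a node, you can apply the child ID formula from the note in Step 5. The following Python sketch reads that formula as the remainder of the division plus 1, and reads the example filename ooemc10.6568.log as ooemc, child ID, process ID. Both readings are assumptions based on the note; the function names are hypothetical.

```python
import re


def ooemc_child_id(db_access_id, child_count):
    """Child ID = remainder(ID / number of ooemc children) + 1."""
    if child_count <= 0:
        raise ValueError("child_count must be a positive integer")
    return db_access_id % child_count + 1


def parse_ooemc_log_name(name):
    """Split a name such as ooemc10.6568.log into (child ID, process ID).

    Returns None if the name does not follow that pattern.
    """
    match = re.fullmatch(r"ooemc(\d+)\.(\d+)\.log", name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

With these two functions you can compute the child ID for a node and then select only the log files whose parsed child ID matches it.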


K.11.4.5  Node Resync Mode Is 5

Node mode 5 indicates that the node resync failed. This could be caused by the config upload or parsing failure of the shelf-generic file (CARD_01_CC.CF.13, for example) or PNNI file (PNNI_01_CC.CF.13, for example). In general, mode 5 signifies that a problem has occurred in one or more stages of the entire resync process. The resync process consists of the following stages and each stage alone can lead to mode 5 problems:

1. The ooemc triggers node resync.

2. The ooemc sends an SNMP request to the switch for bulk file creation.

3. The ooemc receives bulk file creation-related traps: 60901, 60902, and 60903.

4. The ooemc FTPs the config upload files from the switch after it has received 60901 and 60902 from the switch.

5. The ooemc parses the config upload files.

6. The ooemc declares sync-up complete.


Step 1 Determine whether the problem is caused by the SNMP request:

a. Grep RESYNC from the ooemc log files. The ooemc changes the node mode to 2 when it starts the node resync process. The output should look similar to the following:

NOTICE: N17 <EMC_Node_c::InSync> SENDING RESYNC STATUS 2 FOR NODE 17 TO GUI - Node is synchronizing.

This message tells you that ooemc will start the node resync process for the node with ID 17 (N17). If you look further in the log, you should see log messages related to SNMP requests and SNMP responses. If the SNMP request succeeds, the switch responds to the request. The ooemc then processes the SNMP response by invoking the response function, which might do nothing:

<EMC_SnmpFunc_c::ProcFunc_GenNodeBulkFile_1> entering

b. If the log messages indicate an SNMP error, refer to the snmpcomm log files for failure information.

Step 2 Determine whether the problem is caused by bulk creation traps. Grep 60901, 60902, and 60903 from the log files. (See Node Mode Remains in 2.) The output should look similar to the following:

INFO: <EMC_TrapClientImpl::onIncomingTrap> NTS NodeId 17 genericTrap 6 specificTrap 60901
INFO: <EMC_TrapClientImpl::onIncomingTrap> NTS NodeId 17 genericTrap 6 specificTrap 60902

Step 3 Determine whether the problem is caused by the config file FTP. Check the FTP or the FTP plus config upload filename. The output should look similar to the following:

NOTICE: N17 <EMC_NodeFsmHandler_c::LoadShelf> to ftp /opt/svplus/tmp/CARD_01_CC.CF.17
INFO: <ParseFile_c::CheckFile> OOEMC9 CHECKSUM OK FOR FTP FILE /opt/svplus/tmp/CARD_01_CC.CF.17

These messages indicate that FTP is successful and there is no checksum error in the file. From the log file, you should be able to determine whether there is an FTP problem. See the cwmftpd log files for more failure information.

Step 4 Determine whether the problem is caused by parsing of the shelf-generic file or PNNI file. After you locate the checksum message for the shelf-generic file (CARD_01_CC.CF.17, for example) or PNNI file (PNNI_01_CC.CF.17, for example) in the log, continue to trace the log messages to see whether a parsing error occurred. If there are no errors, you should see the following output:

INFO: N17 <EMC_NodeFsmHandler_c::FinishShelf> Parse /opt/svplus/tmp/CARD_01_CC.CF.17 successfully.

Step 5 Collect the following defect information for analysis:

ooemc, nts, snmpcomm, and cwmftpd log files from the /opt/svplus/log directory

Config upload files from the /opt/svplus/tmp directory
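
The checks in Steps 1 through 4 look for a fixed sequence of messages in the ooemc log. The following Python sketch scans log text for those messages for one node and reports the first stage that is missing. It is an illustration only: the patterns are derived from the sample messages shown above and cover only the shelf-generic file, and real log lines might differ. The function names are hypothetical.

```python
import re

# Stages in the order in which the sample log messages appear.
STAGE_ORDER = [
    "resync_started",
    "trap_60901",
    "trap_60902",
    "shelf_file_checksum_ok",
    "shelf_file_parsed",
]


def resync_milestones(log_text, node_id):
    """Return a dict that records which resync messages appear for a node."""
    n = int(node_id)
    checks = {
        "resync_started": rf"SENDING RESYNC STATUS 2 FOR NODE {n}\b",
        "trap_60901": rf"NodeId {n} genericTrap \d+ specificTrap 60901\b",
        "trap_60902": rf"NodeId {n} genericTrap \d+ specificTrap 60902\b",
        "trap_60903": rf"NodeId {n} genericTrap \d+ specificTrap 60903\b",
        # \s+ tolerates a message that is wrapped across two lines.
        "shelf_file_checksum_ok":
            rf"CHECKSUM OK FOR FTP FILE\s+\S*CARD_01_CC\.CF\.{n}\b",
        "shelf_file_parsed":
            rf"Parse\s+\S*CARD_01_CC\.CF\.{n}\s+successfully",
    }
    return {name: re.search(pattern, log_text) is not None
            for name, pattern in checks.items()}


def first_missing_stage(milestones):
    """Return the first stage that needs attention, or None if all are present.

    Trap 60903 is reported first because it indicates that the switch
    quit bulk file creation.
    """
    if milestones["trap_60903"]:
        return "trap_60903"
    for stage in STAGE_ORDER:
        if not milestones[stage]:
            return stage
    return None
```

The returned stage name tells you which step of this procedure to start from. For example, trap_60902 points to Step 2 and shelf_file_checksum_ok points to Step 3.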


K.11.4.6  CTM Database Is Inconsistent with Switch Data After a Successful Cold Start (CTM Server Stop/Start) or Periodic Resync

If the CTM database is inconsistent with the switch data after a successful node resynchronization triggered by the periodic resync, complete the following steps:


Step 1 Collect the ooemc log files and all config upload files for the node from the /opt/svplus/log directory. The ooemc implements a node-based cache. You can dump and save the cache.

Step 2 Get the process ID of the ooemc process that manages the node.

Step 3 Save the config upload files and ooemc log files.

Step 4 As a possible alternative workaround, manually resync the node. If the problem persists, perform a cold start. If the problem is still not resolved, collect the log files and report the problem.


K.11.4.7  CTM Connection Management GUI Shows Incomplete Connections After Successful Node Resync and Dbroker Sync-Up

If CTM shows that some connections are incomplete after the node resync and dbroker synchronization, complete the following steps:


Step 1 Determine which ooemc process manages the node. The ooemc process manages the connection segments in the CTM database that terminate on the MGX. Find the child ID of the ooemc process that manages your node, which reveals the correct ooemc log files for inspection. The debugging process for this issue focuses mainly on log file inspection.
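One possible way to narrow down the log files in Step 1 is sketched below. The function name ooemc_logs_for_node is hypothetical. The sketch relies on two assumptions inferred from the log excerpts in this appendix and not otherwise confirmed: messages are tagged with N<node_id> followed by a space or colon, and log files are named ooemc<child_id>.<pid>.log, optionally followed by .old.<n>.

```shell
# Hypothetical helper: ooemc_logs_for_node NODE_ID LOG_DIR
# Lists the ooemc log files that contain messages tagged with the node
# and prints "<child_id> <pid> <file>" for each one.
# Assumptions (inferred from example log lines, not confirmed):
#   - messages are tagged " N<node_id>" followed by a space or a colon
#   - log files are named ooemc<child_id>.<pid>.log[.old.<n>]
ooemc_logs_for_node() {
    node_id=$1
    log_dir=$2
    grep -l " N${node_id}[ :]" "$log_dir"/ooemc* 2>/dev/null |
    while read -r f; do
        base=${f##*/}
        # Example: ooemc10.24370.log.old.5 -> "10 24370"
        ids=$(echo "$base" |
            sed -n 's/^ooemc\([0-9][0-9]*\)\.\([0-9][0-9]*\)\.log.*$/\1 \2/p')
        if [ -n "$ids" ]; then
            echo "$ids $base"
        fi
    done
}
```

Running ooemc_logs_for_node 17 /opt/svplus/log would list candidate files for node 17. Treat the result as a starting point only, because a node ID can also appear in the log of a process that does not manage the node.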

Step 2 After you have identified the ooemc log files, grep NotifyDataBroker from the files and review a few sample messages before you refine your grep pattern. Each endpoint (primary or secondary) of a connection segment managed by ooemc is logged as a pair of messages: one for the local end and one for the remote end. Each pair corresponds to one atm_connection segment entry in the database. Dbroker uses these messages from ooemc to construct a user_connection database entry, and the CTM Connection Management (CM) GUI displays a connection as complete or incomplete based on the information in that entry. Verify that ooemc sent the correct information to dbroker, and verify that the segment database entries are populated correctly.

Step 3 If the correct number of messages has been forwarded to dbroker and the data in the messages is correct, but the user_connection database entry still contains invalid data, see Data Inconsistency. If the problem is caused by incorrect data in the segment database entries, contact the Cisco TAC with the following defect information:

ooemc and dbroker log files from the /opt/svplus/log directory

Config upload files from the /opt/svplus/tmp directory

Step 4 As a possible alternative workaround, manually resync the node. If the problem persists, perform a cold start. If the problem is still not resolved, collect the log files and report the problem.


K.11.4.8  CTM GUI Shows Mismatched Information Between GUI Views and Database Data

After an initial ooemc node resync, such as a cold start, GUI views such as Network Monitor tree view, Inspector View, and Chassis View display information that does not match newly provisioned or updated database data. The problem persists even after each subsequent manual or periodic resync.


Step 1 Determine which ooemc process manages the node and find the child ID of the ooemc process that manages your node, which reveals the correct ooemc log files for inspection. The debugging process for this issue focuses mainly on log file inspection.

Step 2 Initiate an initial node resync, such as a cold start. After the initial resync completes, ooemc starts sending newly provisioned or updated database data to the network management (NM) server. The database tables currently supported for NM server message forwarding include the following:

node

card

line

ausm_port

cesm_port

frp

rpm_port

svc_port

virtual_port

aps

ima_group

ima_link

linedistribution

au4tug3

controller

license_in_use

mfr_bundle

mfr_link

peripheral

redundantcard

sensor

Step 3 To determine whether ooemc sent newly provisioned or updated database data to the NM server, grep ComposeNMSMsg from the ooemc log files. The keywords in log messages that correspond to the supported database tables in Step 2 are the following:

EMC_DBProperty_Node_c::ComposeNMSMsg

EMC_DBProperty_Card_c::ComposeNMSMsg

EMC_DBProperty_Line_c::ComposeNMSMsg

EMC_DBProperty_AusmPort_c::ComposeNMSMsg

EMC_DBProperty_CesmPort_c::ComposeNMSMsg

EMC_DBProperty_FrPort_c::ComposeNMSMsg

EMC_DBProperty_RpmPort_c::ComposeNMSMsg

EMC_DBProperty_SvcPort_c::ComposeNMSMsg

EMC_DBProperty_VtPort_c::ComposeNMSMsg

EMC_DBProperty_Aps_c::ComposeNMSMsg

EMC_DBProperty_ImaGroup_c::ComposeNMSMsg

EMC_DBProperty_ImaLink_c::ComposeNMSMsg

EMC_DBProperty_LineDist_c::ComposeNMSMsg

EMC_DBProperty_AU4TUG3_c::ComposeNMSMsg

EMC_DBProperty_Ctrlr_c::ComposeNMSMsg

EMC_DBProperty_LicenseInUse_c::ComposeNMSMsg

EMC_DBProperty_MfrBundle_c::ComposeNMSMsg

EMC_DBProperty_MfrLink_c::ComposeNMSMsg

EMC_DBProperty_Peripheral_c::ComposeNMSMsg

EMC_DBProperty_RedCard_c::ComposeNMSMsg

EMC_DBProperty_Sensor_c::ComposeNMSMsg

Database data is not sent to the NM server during the initial node resync. It is after the initial resync that ooemc starts sending updated or newly provisioned database data to the NM server. Each ooemc log message from grep contains detailed information on the identity of the NE object and the required fields from the corresponding database table. The following is an example of NM server messages for line tables:

ooemc10.24370.log.old.5:( 24370: 4) 23:15:17 INFO: N0:C7:B2:L2 <EMC_DBProperty_Line_c::ComposeNMSMsg> (MODIFY::LINE) aNMSEvent.node:0 type:4 subType:1 lpbkType:1 alarmState:1 adminState:1 sectStatus:6 pathState:2

In this example, the identity of the NE object is N0:C7:B2:L2, which is the line with node ID=0, slot=7, bay=2, and line#=2. The database operation is MODIFY (an update). The rest of the message shows the values of the required fields from the line table.
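The identity and operation can also be pulled out of the line-table messages mechanically. The following is a minimal sketch; the function name nms_line_updates is hypothetical, and it assumes that each message is on a single line and uses the N<node>:C<slot>:B<bay>:L<line> identity format of the example.

```shell
# Hypothetical helper: nms_line_updates LOGFILE...
# For every line-table ComposeNMSMsg message, prints
#   node=<n> slot=<n> bay=<n> line=<n> op=<operation>
# Assumption: each message is on one line and uses the
# N<node>:C<slot>:B<bay>:L<line> identity format.
nms_line_updates() {
    grep -h "EMC_DBProperty_Line_c::ComposeNMSMsg" "$@" |
    # In a basic regular expression, "(" is literal and "\(" opens a group
    sed -n 's/.* N\([0-9]*\):C\([0-9]*\):B\([0-9]*\):L\([0-9]*\) .*(\([A-Z]*\)::LINE).*/node=\1 slot=\2 bay=\3 line=\4 op=\5/p'
}
```

Applied to the example message, this would print node=0 slot=7 bay=2 line=2 op=MODIFY. The same approach should carry over to the other tables by changing the class name and the identity pattern, although the identity format of those messages is not shown here.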

Step 4 If grep returns log messages that indicate matched data, continue the investigation on the NM server. See CTM Configuration Center, Chassis View, Diagnostics Center, or Statistics Report Basics for information on nmController.

Step 5 As a possible alternative workaround, manually resync the node. If the problem persists, perform a cold start. If the problem is still not resolved, collect the ooemc and NM server log files from the /opt/svplus/log directory and report the problem.


K.11.4.9  CTM Database and Switch Data Inconsistencies After MGX Node Provisioning

This section includes the following information:

Database Table Population Through Traps and SNMP Upload

Switch Data Does Not Match the CTM Database Table After Node Provisioning

K.11.4.9.1  Database Table Population Through Traps and SNMP Upload

In addition to node and card resync, CTM ooemc populates database table entries through trap processing, followed by an SNMP upload if necessary. You can provision the switch through the CLI, the CTM GUI, or CTM service agents. All three cases involve trap processing and SNMP upload on ooemc. The CTM CM GUI can provision only connections. Other provisioning, such as lines and ports, must go through CTM service agents. The MGX switch CLI can handle all provisioning.

K.11.4.9.2  Switch Data Does Not Match the CTM Database Table After Node Provisioning

Complete the following steps if after provisioning the switch through the CLI, CTM GUI, or CTM service agents, the switch data does not match the CTM database:


Step 1 Study the log files to understand the root cause of the inconsistency. If the issue is caused by a trap (for example, if you provision a connection, but CTM does not populate the database entry or the information in the database entry does not match the switch data), it is possible that CTM did not receive the trap, or the trap is buffered in the ooemc trap queue, and trap processing is delayed. If the data in the CTM database is not correct, it is possible that the SNMP upload did not upload the correct data.

Step 2 To determine whether a trap has been received and processed, you can grep keywords in the ooemc log files. For example, to find the channel traps from node_id=4, slot=6, vpi=1, and vci=326, you can grep "TRAPLIST" as shown in the following example:

cwmult60% grep "TRAPLIST: N4:" ooemc* | grep "Channel Trap" | grep "C6" | grep "vpi 1 vci 326"
ooemc10.5760.log.old.55:( 5760: 10) 23:21:12 INFO: TRAPLIST: N4: Channel Trap 60310 from C6 B2 L1 P20 Ch299 ifIndex 17176597 vpi 1 vci 326 upCntr 0 vpcFlag 2 operS 1 alarm 67
ooemc10.5760.log.old.55:( 5760: 10) 23:21:12 INFO: TRAPLIST: N4: Channel Trap 60310 < PROCESSED > from C6 B2 L1 P20 Ch299 ifIndex 17176597 vpi 1 vci 326 upCntr 0 vpcFlag 2 operS 1 alarm 67
ooemc10.5760.log.old.73:( 5760: 10) 23:23:48 INFO: TRAPLIST: N4: Channel Trap 60310 from C6 B2 L1 P20 Ch299 ifIndex 17176597 vpi 1 vci 326 upCntr 0 vpcFlag 2 operS 1 alarm 66

For other types of traps, you can use the following keywords to supplement "TRAPLIST" in your grep statement:

Port Trap

RscPart Trap

Svc Trap

SonetLn Trap

SctCard Trap

SonetPath Trap

FunMod Trap

LineMod Trap

RedCard Trap

TrapMiss Trap

VsiCtrlr Trap

DS3Line Trap

AtmPhy Trap

Peripheral Trap

CoreSwth Trap

TrapLost Trap

AtmAddr Trap

Restart Trap

Node Trap

SonetAps Trap

LMIPort Trap

Pnni IF Trap

Bulkfile Trap

Vism Trap

Vism Ann Trap

SubIf Trap

NBSMCnfg Trap

NBSMChan Trap

NBSMLine Trap

AusmLine Trap

AusmPort Trap

AusmChan Trap

AusmIma Trap

FrChan Trap

FrPort Trap

CesmPort Trap

CesmChan Trap

HsFrPort Trap

HsFrChan Trap

LineDist Trap

DS3Path Trap

Chan Upload

Party Trap

PrefRoute Trap

CardIma Trap

DS1 Line Trap

SctPort Trap

SvcDerouteGroomTrap

Channel Trap

TUG3Path Trap

Cug Trap

AddrCug Trap

RSC Upload

APS Upload

FrPort State Upload

MPSM Upload

Vism ToneDetect Trap

License Trap

PortAtmIf Trap

VxsmPvcRed Trap

ChanProt Trap

VxsmGwDsp Trap

VxsmGwIp Trap

VxsmGw Trap

VxsmSysRes Trap

VxsmGw1 Trap

VxsmMgc Trap

VxsmMgcIp trap

VxsmMgcGrpParam Trap

VxsmMgcGrpMgc Trap

VxsmMgcGrpProt Trap

VxsmAal2Prof Trap

VxsmCodec Trap

VxsmSvc Trap

VxsmAal2CrossConn Trap

VxsmAal25DataProfileTrap

VxsmSensor Trap

VxsmSensorThrhd Trap

VxsmModule Trap

DS0Grp Trap

VxsmAnnounce Trap

VxsmAudioFile Trap

VxsmDs0XConn Trap

VxsmMegaco Trap

VxsmCrr Trap

VxsmTone Trap

VxsmAs Trap

VxsmAsp Trap

VxsmLapd Trap

VismABCDBitTemplate Trap

Review the log messages returned for the "TRAPLIST" keyword and decide how to refine your grep according to your needs. In the preceding example, the keyword "PROCESSED" indicates that the trap has been processed. This gives you the starting point in the log file to study the log messages. If the database is not populated correctly, the subsequent log messages should provide information about the database inconsistencies. Some trap processing might invoke SNMP data upload. From the log messages, you can verify whether the uploaded data is correct. Other possible reasons for database inconsistency include:

SNMP upload failure and maximum retries exceeded, SNMP timeout occurred, or throttle error occurred. You can verify the problem in the snmpcomm log files.

Database operation error. You can verify the problem in the ooemc log files.

If you do not know exactly what trap sequence the switch sends to CTM, try a working scenario and study the log for the trap sequence, or use HPOV to determine the trap sequence.
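The grep pipeline from Step 2 can be wrapped so that it reports how many matching traps were logged as received and how many were marked as processed. This is a sketch only: the function name trap_summary is hypothetical, and it assumes that each TRAPLIST message is on one line and that processed traps carry the "< PROCESSED >" marker shown in the example.

```shell
# Hypothetical helper: trap_summary NODE_ID "KEYWORD" "FILTER" LOGFILE...
# Counts TRAPLIST messages for a node and trap keyword that also contain
# the fixed string FILTER (for example "vpi 1 vci 326").
# Prints "received=<n> processed=<n>".
# Assumption: each message is on one line and processed traps contain
# the marker "< PROCESSED >".
trap_summary() {
    node_id=$1
    keyword=$2
    filter=$3
    shift 3
    grep -h "TRAPLIST: N${node_id}: ${keyword}" "$@" |
    grep -F "$filter" |
    awk '/< PROCESSED >/ { proc++; next }
         { recv++ }
         END { printf "received=%d processed=%d\n", recv, proc }'
}
```

For the three messages in the Step 2 example, trap_summary 4 "Channel Trap" "vpi 1 vci 326" ooemc* would report received=2 processed=1, which indicates that the later trap had not been marked as processed in the files searched. Note that the fixed-string filter also matches longer values that begin with the same digits, such as vci 3261.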

Step 3 To verify whether a trap has been received from the switch and forwarded to ooemc, refer to the nts log. If the trap has been buffered in the trap queue, the possible reasons are:

The node is synchronizing due to a regular node resync or a -2 trap.

The card is synchronizing and the trap is related to the card. The traps are buffered in the queue until the card resync completes.

A summary alarm trap is processed. The summary alarm is being uploaded or parsed. You can grep the log files for information related to summary alarm traps.

Step 4 To verify that the trap has been put into the queue, enter the following command:

grep EMC_TrapQueue_c::append ooemc_log | grep trap_num

You should see output similar to the following example:

INFO: <EMC_TrapQueue_c::append> entering. ======> append trap# 60303 to trap queue; getHdlrLevel=5, getTrapLevel=5, #=174
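To see at a glance which traps were queued and how often, the append messages can be summarized as sketched below. The function name queued_traps is hypothetical, and the sketch assumes that each append message is on one line in the format of the preceding example.

```shell
# Hypothetical helper: queued_traps LOGFILE...
# Prints "<trap number> <times appended>" for every trap number that
# ooemc appended to its trap queue, sorted by trap number.
# Assumption: each append message is on one line and contains
# "append trap# <number> to trap queue".
queued_traps() {
    grep -h "EMC_TrapQueue_c::append" "$@" |
    sed -n 's/.*append trap# \([0-9][0-9]*\) to trap queue.*/\1/p' |
    sort -n | uniq -c | awk '{ print $2, $1 }'
}
```

A trap number that appears here but has no matching "PROCESSED" message in the TRAPLIST output is a candidate for a trap that is still buffered in the queue.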

Step 5 As a possible alternative workaround, manually resync the node. If the problem persists, perform a cold start. If the problem is still not resolved, collect the ooemc, nts, and snmpcomm log files from the /opt/svplus/log directory and report the problem to the Cisco TAC.


K.11.5  Configuration Center, Chassis View, Diagnostics Center, and Statistics Report Problems

This section includes the following information:

CTM Configuration Center, Chassis View, Diagnostics Center, or Statistics Report Basics

Basic Issues

Topology Discovery Issues

NE Discovery Issues

Configuration Center, Chassis View, Diagnostics Center, or Statistics Report Alarm Issues

K.11.5.1  CTM Configuration Center, Chassis View, Diagnostics Center, or Statistics Report Basics

CTM manages MGX NEs. The topology module discovers the nodes and trunks. Once the nodes are discovered, the EM module synchronizes with the nodes to discover the cards, lines, ports, and other node elements. All of the discovered nodes, trunks, and node elements are published by the NM server and displayed in the Configuration Center, Chassis View, Diagnostics Center, or Statistics Report. The Configuration Center, Chassis View, Diagnostics Center, and Statistics Report have tree and inspector views, which feed off the data published by the NM server.

Traps notify CTM of all subsequent changes in the network.

The following figure shows the end-to-end architecture.

Figure K-1 CTM End-to-End Architecture

The NM server provides inventory and alarm information for the Configuration Center, Chassis View, Diagnostics Center, and Statistics Report. The NM server receives node information from topod. The initial inventory of cards, lines, and ports is read from the database. The NM server receives dynamic updates from topod, ooemc, and sdbroker.

The nmControl and nmClient utilities in the /opt/svplus/util/ directory are shipped with CTM to debug issues in the NM server.

nmControl—This utility provides the means to check the NM server cache and its state. Output is redirected to /opt/svplus/log/nmControl.log. Cache dumps are redirected to /opt/svplus/log/nmControl.dump. The nmControl utility allows the following operations, all of which are nondestructive and perform tasks in a passive mode.


Note Do not use the All Cache operation to dump the cache for a large network that contains more than 1000 nodes.


Resync Node—Resynchronizes the node (specified by the node_id).

All Cache—Dumps all data in its cache to the dump file.

Topology Cache—Dumps topology (node) data to the dump file.

Node Cache—Dumps a node's data (specified by the node_id) to the dump file.

CTM Alarm Cache—Dumps CTM-specific alarms to the dump file.

Client Manager Cache—Dumps information about all clients to the dump file.

Error Statistics—Dumps error information associated with the sync-up to the dump file.

Sync-Up Status—Dumps the sync-up state to the dump file.

Configuration—Dumps configuration data to the dump file.

nmClient—This utility isolates problems between the client and server. It queries the NM server the same way the Configuration Center, Chassis View, Diagnostics Center, and Statistics Report GUIs query the server. Output is redirected to /opt/svplus/log/nmClient.log. Cache dumps are redirected to /opt/svplus/log/nmClient.dump.

You must complete the Register Client operation before performing any other operations. When you are finished using the client, complete the Unregister Client operation.

The nmClient utility allows the following operations:

Register Client—Registers with the server.

Update Filter—Updates the subscription/filter with the server passing the FDN.

Unregister Client—Unregisters with the server.

Get Topology—Retrieves the topology information of networks and nodes.

Get Children—Retrieves the list of children for a particular object in the tree.

Get CwmInfo—Retrieves the CTM sync-up state information.

Get ManagedObject—Retrieves detailed information about an object in the tree.

Subscribe for all Events—Subscribes to all updates on the server.

Unsubscribe for all Events—Unsubscribes from all updates on the server.

The GUI client's log files (CMSCclient.log) are saved under log directories, such as:

Windows: D:\Documents and Settings\<username>\log\

UNIX: /opt/svplus/log

K.11.5.2  Basic Issues

This section includes the following information:

Configuration Center, Chassis View, Diagnostics Center, or Statistics Report Does Not Show Any Nodes

Configuration Center, Chassis View, Diagnostics Center, or Statistics Report Is Unable to Connect to Server

Configuration Center, Chassis View, Diagnostics Center, or Statistics Report Does Not Show a Newly Added Node

K.11.5.2.1  Configuration Center, Chassis View, Diagnostics Center, or Statistics Report Does Not Show Any Nodes


Step 1 Check whether the nodes are discovered. See No Nodes Are Discovered.

Step 2 Use the nmControl CLI command to check whether the NM server has the nodes and trunks in its cache.

Step 3 If the nodes are in the NM server cache, open a new window and verify whether the nodes are shown.

Step 4 If the problem persists, collect the following defect information for analysis:

topod.log, ILMITopoc.log, NMServer.log

nmControl.dump

CMSCclient.log


K.11.5.2.2  Configuration Center, Chassis View, Diagnostics Center, or Statistics Report Is Unable to Connect to Server


Step 1 Check whether the client registers in NMServer.log.

Step 2 Enable orbix logs by creating the orbix directory in the /opt/svplus/log/ directory.

Step 3 If the problem persists, collect the following defect information for analysis:

CMSCclient.log, NMServer.log

Orbix logs


K.11.5.2.3  Configuration Center, Chassis View, Diagnostics Center, or Statistics Report Does Not Show a Newly Added Node


Step 1 Check whether the nodes have been discovered. See No Nodes Are Discovered.

Step 2 Use the nmControl CLI command to check whether the NM server has the nodes in its cache.

Step 3 If the nodes are in the NM server cache, open a new GUI window and verify whether the nodes are shown.

Step 4 If the problem persists, collect the following defect information for analysis:

topod.log, ILMITopoc.log, and NMServer.log

nmControl.dump

CMSCclient.log


K.11.5.3  Topology Discovery Issues

This section includes the following information:

Configuration Center, Chassis View, Diagnostics Center, or Statistics Report Does Not Show a Node

Incorrect Node Information on the Configuration Center, Chassis View, Diagnostics Center, or Statistics Report

Configuration Center, Chassis View, Diagnostics Center, or Statistics Report Shows a Node that Is Not in the Network

Configuration Center, Chassis View, Diagnostics Center, or Statistics Report Shows Incorrect Sync State for a Node

Duplicate Nodes Are Displayed on the Configuration Center, Chassis View, Diagnostics Center, or Statistics Report

K.11.5.3.1  Configuration Center, Chassis View, Diagnostics Center, or Statistics Report Does Not Show a Node


Step 1 Check whether the nodes are discovered. See No Nodes Are Discovered.

Step 2 Use the nmControl CLI command to check whether the NM server has the nodes in its cache.

Step 3 If the nodes or trunks are in the NM server cache, open a new GUI window and a new Configuration Center, Chassis View, Diagnostics Center, or Statistics Report and verify whether the nodes are shown.

Step 4 If the problem persists, collect the following defect information for analysis:

topod.log, ILMITopoc.log, and NMServer.log

nmControl.dump

CMSCclient.log


K.11.5.3.2  Incorrect Node Information on the Configuration Center, Chassis View, Diagnostics Center, or Statistics Report


Step 1 Use the selnd command to check the node information in the database.

Step 2 Use the nmControl CLI command to check the node information in the NM server cache.

Step 3 Check the CMSCclient.log for the node information.

Step 4 From nmClient, use the getTopology option to retrieve the node information and verify whether it matches the database and the GUI.

Step 5 If the node information is correct in the NM server cache, open a new GUI window and a new Configuration Center, Chassis View, Diagnostics Center, or Statistics Report and verify whether the node is shown.

Step 6 If the problem persists, collect the following defect information for analysis:

topod.log, ILMITopoc.log, and NMServer.log

selnd output

nmControl.dump

CMSCclient.log


K.11.5.3.3  Configuration Center, Chassis View, Diagnostics Center, or Statistics Report Shows a Node that Is Not in the Network


Step 1 Check the node information in the database by querying the node and packet_line tables. Verify whether the node is active.

Step 2 If the node is active, see No Nodes Are Discovered.

Step 3 Use the nmControl CLI command to check the node in the NM server cache.

Step 4 Check the CMSCclient.log for the node/trunk. Verify whether a delete message was received for the node.

Step 5 From nmClient, use the getTopology option to retrieve the node information and verify whether it matches the database and GUI.

Step 6 If the node is not present in the NM server cache, open a new GUI window and verify whether the node is shown.

Step 7 If the problem persists, collect the following defect information for analysis:

ILMITopoc.log, topod.log, and NMServer.log

Dump outputs of ILMITopoc, topod, and NM server, retrieved by sending the USR1 signal to each process (kill -USR1 <process_id>)

Output of the switch CLI, selnd, and dbnds


K.11.5.3.4  Configuration Center, Chassis View, Diagnostics Center, or Statistics Report Shows Incorrect Sync State for a Node


Step 1 Use the selnd CLI command to query the node table and check the node information in the database.

Step 2 Use the nmControl CLI command to check the node in the NM server cache.

Step 3 Check the CMSCclient.log for the node. Verify whether the correct message was received for the node.

Step 4 From nmClient, use the getTopology option to retrieve the node information and verify whether it matches the database and the GUI.

Step 5 If the problem persists, collect the following defect information for analysis:

ILMITopoc.log, topod.log, and NMServer.log

Dump outputs of ILMITopoc, topod, and NM server, retrieved by sending the USR1 signal to each process (kill -USR1 <process_id>)

Output of the switch CLI, selnd, and dbnds


K.11.5.3.5  Duplicate Nodes Are Displayed on the Configuration Center, Chassis View, Diagnostics Center, or Statistics Report


Step 1 Check whether the node table has duplicate node IDs. Enter the following command to query the node table:

tballraker18% echo "select * from node where node_id=2 and slot=1" | dbaccess stratacom

Step 2 Check whether both nodes in the database have active=1. If both nodes are active, collect the logs.

Step 3 If duplicate nodes do not exist in the database, use the nmControl command to check the NM server cache.

Step 4 Check the CMSCclient.log for the node.

Step 5 From nmClient, use the getTopology option to retrieve the node information and verify whether it matches the database and the GUI.

Step 6 If duplicate nodes are not present in the NM server cache, open a new GUI window and verify whether the node is shown.

Step 7 If the problem persists, collect the following defect information for analysis:

For Solaris clients, collect the CMSCclient.log file under /opt/svplus/log

For Windows clients, collect the CMSCclient.log file under D:\Documents and Settings\<username>\log

Node data from the node table for that specific node

ILMITopoc.log and topod.log, if there are duplicate nodes in the database


K.11.5.4  NE Discovery Issues

This section includes the following information:

All Cards Under a Node Are Missing

Configuration Center, Chassis View, Diagnostics Center, or Statistics Report Does Not Show a Card, Line, or Port for a Node

Configuration Center, Chassis View, Diagnostics Center, or Statistics Report Shows Incorrect Card, Line, or Port Information

Configuration Center, Chassis View, Diagnostics Center, or Statistics Report Shows a Card, Line, or Port that Is Not Present on a Node

K.11.5.4.1  All Cards Under a Node Are Missing

If no cards are shown under a node in the Configuration Center, Chassis View, Diagnostics Center, or Statistics Report, complete the following steps:


Step 1 Use the selnd CLI command to check whether the node has synchronized.

Step 2 If the node has not synchronized, wait for it to do so.

Step 3 If the node has synchronized, check whether cards are populated in the card table. If the card table is empty, collect the EM logs. See Equipment Management Problems.

Step 4 Use the nmControl CLI command to check whether the NM server has cards in its cache.

Step 5 If cards are not present in the NM server cache, collect the logs.

Step 6 If cards are present in the NM server cache, use nmClient and verify whether getChildren for the node's FDN returns the cards in the nmClient.<pid>.dump file.

Step 7 If cards are present in the NM server cache, open a new GUI window and a new Configuration Center, Chassis View, Diagnostics Center, or Statistics Report and verify whether the cards are shown.

Step 8 As a possible alternate workaround, use nmControl and perform a Resync Node operation, specifying the node ID.

Step 9 If the problem persists, collect the following defect information for analysis:

topod.log, ILMITopoc.log, and NMServer.log

nmControl.dump

CMSCclient.log


K.11.5.4.2  Configuration Center, Chassis View, Diagnostics Center, or Statistics Report Does Not Show a Card, Line, or Port for a Node


Step 1 Use the selnd CLI command to check whether the node has synchronized.

Step 2 If the node has not synchronized, wait for it to do so.

Step 3 If the node has synchronized only partially, check whether the card, line, or port is populated in the database. See the Cisco Transport Manager Release 9.0 Database Schema.

Step 4 If the entry is not populated in the database, see Equipment Management Problems.

Step 5 If the entry is present in the database, use the nmControl CLI command to check the NM server cache. Check the nmControl.<pid>.dump.

Step 6 If the element is present in the NM server cache, use nmClient and verify whether getChildren for the FDN returns the entities in the nmClient.<pid>.dump file.

Step 7 If the element is present in the NM server cache, open a new GUI window and a new Configuration Center, Chassis View, Diagnostics Center, or Statistics Report and verify whether the missing elements are shown.

Step 8 As a possible alternate workaround, use nmControl and perform a Resync Node operation, specifying the node ID.

Step 9 If the problem persists, collect the following defect information for analysis:

topod.log, ILMITopoc.log, and NMServer.log

nmControl.dump

CMSCclient.log


K.11.5.4.3  Configuration Center, Chassis View, Diagnostics Center, or Statistics Report Shows Incorrect Card, Line, or Port Information

If the information for an element is displayed incorrectly in the Configuration Center, Chassis View, Diagnostics Center, or Statistics Report, complete the following steps:


Step 1 Use the selnd CLI command to check whether the node has synchronized.

Step 2 If the node has not synchronized, wait for it to do so.

Step 3 Check whether the card, line, or port is populated in the database. See the Cisco Transport Manager Release 9.0 Database Schema.

Step 4 If the entry in the database matches the GUI, see Equipment Management Problems.

Step 5 If the information in the GUI does not match the database entry, use the nmControl CLI command to check the NM server cache. Check the nmControl.<pid>.dump.

Step 6 Use nmClient and verify whether getChildren for FDN returns the correct information for the element in the nmClient.<pid>.dump file.

Step 7 If the element information is correct in the NM server cache, open a new GUI window and a new Configuration Center, Chassis View, Diagnostics Center, or Statistics Report and verify whether the correct information is shown.

Step 8 As a possible alternate workaround, use nmControl and perform a Resync Node operation, specifying the node ID.

Step 9 If the problem persists, collect the following defect information for analysis:

topod.log, ILMITopoc.log, and NMServer.log

nmControl.dump

CMSCclient.log


K.11.5.4.4  Configuration Center, Chassis View, Diagnostics Center, or Statistics Report Shows a Card, Line, or Port that Is Not Present on a Node

If an extra card, line, or port element is shown incorrectly on the Configuration Center, Chassis View, Diagnostics Center, or Statistics Report, complete the following steps:


Step 1 Use the selnd CLI command to check whether the node has synchronized.

Step 2 If the node has not synchronized, wait for it to do so.

Step 3 Check whether the card, line, or port is populated in the database. See the Cisco Transport Manager Release 9.0 Database Schema.

Step 4 If the element is present in the database but does not match what is shown in the GUI, see Equipment Management Problems.

Step 5 If the element is not in the database, use the nmControl CLI command to check the NM server cache. Check the nmControl.<pid>.dump.

Step 6 Use nmClient and verify whether getChildren for FDN returns the element in the nmClient.<pid>.dump file.

Step 7 If the element is not present in the NM server cache, open a new GUI window and a new Configuration Center, Chassis View, Diagnostics Center, or Statistics Report GUI and verify whether the extra element is shown.

Step 8 As a possible alternate workaround, use nmControl and perform a Resync Node operation, specifying the node ID.

Step 9 If the problem persists, collect the following defect information for analysis:

topod.log, ILMITopoc.log, and NMServer.log

nmControl.dump

CMSCclient.log


K.11.5.5  Configuration Center, Chassis View, Diagnostics Center, or Statistics Report Alarm Issues

This section includes the following information:

Alarm Processing Basics

XML Schema for Alarm Rules

Alarm Severity and Object Severity

Severity Shown on Tree View Does Not Match Severity Shown on Platform

Alarm List Shows Alarm that Does Not Exist on Platform

Transient Event Has Disappeared Unexpectedly

Port in Tree View Displays Aggregate Alarm; However, No Children Exist Under Port

K.11.5.5.1  Alarm Processing Basics

Alarm processing is done within the NM server processes. Note the following important points regarding alarm processing:

Alarm rules are defined in an XML file (/opt/svplus/xml/ruledata.xml). For details on the XML schema for alarm rules, see XML Schema for Alarm Rules.

GUI clients register for alarms by providing the parent FDN.

GUI clients pull initial alarms from the NM server. After the initial pull, the NM server pushes all alarms to registered clients as they occur.

The object severity for all NEs is determined by the XML alarm rule.

Administrative states of NEs are not displayed in the alarm list; only alarm states are displayed.

The alarm list displays mostly active alarms; however, some events are also displayed. For details on events versus alarms, see XML Schema for Alarm Rules.

Some alarms result in the NE having a different alarm severity than the actual alarm. For details on object severity versus alarm severity, see Alarm Severity and Object Severity.

Alarms for various NEs are stored in a memory cache in the NM server object tree. GUI clients register for alarms from the NM server by providing the FDN of the NE for which they want alarms. If a parent FDN is registered, all alarms for the NEs under that FDN are sent. GUI clients might also update alarms; for example, by flagging an alarm as acknowledged or manually clearing an alarm. These updates are done through the Alarm Repository component of the NM server. The following figure illustrates the client/server architecture for alarm components in the NM server.

Figure K-2 Client/Server Architecture for Alarm Components
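The register, pull, and push behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the AlarmRegistry class, its method names, and the slash-separated FDN strings are hypothetical and are not the NM server API.

```python
from collections import defaultdict


class AlarmRegistry:
    """Toy stand-in for the server-side alarm cache (hypothetical, not the NM server API)."""

    def __init__(self):
        self.active = []                    # cached (fdn, description) pairs
        self.listeners = defaultdict(list)  # parent FDN -> client callbacks

    def register(self, parent_fdn, callback):
        """Register a client and return the initial alarms it would pull."""
        self.listeners[parent_fdn].append(callback)
        return [a for a in self.active if self._under(a[0], parent_fdn)]

    def raise_alarm(self, fdn, description):
        """Cache a new alarm and push it to every client registered at or above it."""
        alarm = (fdn, description)
        self.active.append(alarm)
        for parent_fdn, callbacks in self.listeners.items():
            if self._under(fdn, parent_fdn):
                for callback in callbacks:
                    callback(alarm)

    @staticmethod
    def _under(fdn, parent_fdn):
        # Assumed FDN layout: a child FDN extends its parent by "/<segment>".
        return fdn == parent_fdn or fdn.startswith(parent_fdn + "/")


registry = AlarmRegistry()
registry.raise_alarm("/net1/nodeA/card3", "Card removed")

pushed = []                                                # alarms pushed after registration
initial = registry.register("/net1/nodeA", pushed.append)  # initial pull

registry.raise_alarm("/net1/nodeA/card3/line1", "Loss of Signal")  # under nodeA: pushed
registry.raise_alarm("/net1/nodeB", "Sync-Up Failed")              # other node: not pushed
```

In this sketch, a client registered on the node receives the alarms of the cards and lines beneath it, but nothing from a sibling node.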

K.11.5.5.2  XML Schema for Alarm Rules

Alarm rules are defined in XML; specifically, in a file named $HOME/xml/ruledata.xml. These alarm rules are read once when the NM server starts. When events are processed, the rules are queried from memory to determine what alarm-related action should be taken on the event. There are three types of alarm rules defined in the XML schema: Correlated, Correlated Bitmap, and Transient. Specifics on each of these types are as follows.

Correlated Alarm Rule Type

The correlated alarm rule type is the most common in ruledata.xml. This rule is used when an NE can be in only one of many states at any one time. If the entity changes states, the previous alarm state is cleared. Most NEs managed by CTM fall into this category. A simple example is the database sync-up status of a node. The sync-up status can be Partial Sync-Up, Sync-Up Failed, or In Sync. Any one of these states correlates out any other. See the following figure.

Figure K-3 Correlated Alarm Rule Diagram

A CorrelatedRule can have any of the following:

1 AlwaysClear (Used when a given element never has an alarm, such as in a top-level network)

0 or more ClearAlarmCondIds

1 new alarm (which consists of NewAlarmConditionID, NewAlarmServAffect, and so on)
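As a rough illustration of how a correlated rule behaves, the following Python sketch clears the conditions listed in the rule and then raises the new one, so that only one state is active at a time. The function name and the condition IDs are illustrative assumptions; the sketch does not model AlwaysClear and is not the NM server implementation.

```python
def apply_correlated_rule(active, new_cond_id, clear_cond_ids):
    """Return the active condition IDs after one correlated rule fires."""
    remaining = set(active) - set(clear_cond_ids)  # the ClearAlarmCondID entries
    remaining.add(new_cond_id)                     # the new alarm condition
    return remaining


# Illustrative IDs: each rule clears the other states of the same entity.
state = apply_correlated_rule(set(), 10302, [10301, 10303, 10304])
state = apply_correlated_rule(state, 10301, [10302, 10303, 10304])
```

After the second rule fires, the earlier condition has been correlated out and only the latest state remains.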

Correlated Bitmap Alarm Rule Type

The correlated bitmap alarm rule type is different from the correlated alarm rule type because the bitmap rule type can represent the many states an entity can be in at any one time. The correlated bitmap rule type is used primarily to represent line alarms. Lines can have many different alarms associated with them at any given time, such as Loss of Signal and Loss of Frame. See the following figure.

Figure K-4 Correlated Bitmap Alarm Rule Diagram

A CorrelatedBmRule can have any of the following:

0 or more ClearAlarmCondIds

1 new alarm (which consists of NewAlarmConditionID, NewAlarmServAffect, and so on)
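The difference from the correlated rule type can be sketched with a bitmask, where each bit stands for one condition and several bits can be set at once. The bit assignments and names below are hypothetical and serve only to show the idea; they are not taken from ruledata.xml.

```python
# Hypothetical bit assignments for two line conditions.
LOSS_OF_SIGNAL = 0b01
LOSS_OF_FRAME = 0b10

NAMES = {LOSS_OF_SIGNAL: "Loss of Signal", LOSS_OF_FRAME: "Loss of Frame"}


def active_conditions(state_bitmap, names):
    """List every condition whose bit is set in the state bitmap."""
    return [name for bit, name in names.items() if state_bitmap & bit]


both = active_conditions(LOSS_OF_SIGNAL | LOSS_OF_FRAME, NAMES)  # two alarms at once
only_lof = active_conditions(LOSS_OF_FRAME, NAMES)               # one alarm remains
```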

Transient Alarm Rule Type

The transient alarm rule type is used to distinguish events from alarms. The alarm list shows some events. Events are normally associated with the NM server itself. Examples of events are Process Restarted or Primary Gateway Disconnected. These are NM server events that are not correlated, but are displayed in the alarm list. Another example of a transient event that is not related to the NM server is a card switchover. If automatic protection switching (APS) is enabled on a card and there is an APS switchover, an event is displayed in the alarm list. The primary distinction between transient events and correlated alarms is that transient events are not cleared by the network, whereas correlated alarms are cleared. See the following figure.

Figure K-5 Transient Alarm Rule Diagram

A transient alarm rule can have only one new alarm associated with it.


Note There is no ClearAlarmCondID associated with a transient rule. Transient alarms can be manually cleared by the user, cleared by the same transient alarm occurring twice, or purged by the NM server when a high threshold is reached.


K.11.5.5.3  Alarm Severity and Object Severity

Every alarm in an alarm rules XML file is assigned two severities: object and alarm. These severities match for most alarms. There are some cases, however, where these severities differ. The following is an example of an XML alarm rule where object and alarm severity do not match:

<CorrelatedRule State="1 -1">
<ClearAlarmCondID>10302</ClearAlarmCondID>
<ClearAlarmCondID>10303</ClearAlarmCondID>
<ClearAlarmCondID>10304</ClearAlarmCondID>
<NewAlarmCondID>10301</NewAlarmCondID>
<NewAlarmServAffect>0</NewAlarmServAffect>
<NewAlarmTransient>0</NewAlarmTransient>
<NewAlarmDbPersistent>0</NewAlarmDbPersistent>
<NewAlarmSev>6</NewAlarmSev>
<NewAlarmObjSev>7</NewAlarmObjSev>
<NewAlarmDescrSuffix>Sync-Up has not started yet</NewAlarmDescrSuffix>
</CorrelatedRule>

This alarm occurs if the southbound processes (EMs) send a node message with EM sync-up status as 1 or -1. If this occurs, the node has an Unreachable severity (value 7) in the tree view. Note that there is no unreachable severity in the alarm list. This is why there are two severities for each alarm. Unreachable alarms have Critical severity (value 6) in the alarm list.

Another instance when alarm and object severity do not match is for aggregate port alarms. Aggregate port alarms summarize the condition of the connections on the port. Because these alarms should not affect the severity of the port, the object severity of these alarms is Clear (value 3). The following is the XML for an aggregate port alarm:

<CorrelatedBmRule StateBm="1">
<NewAlarmCondID>40801</NewAlarmCondID>
<NewAlarmServAffect>0</NewAlarmServAffect>
<NewAlarmTransient>0</NewAlarmTransient>
<NewAlarmDbPersistent>1</NewAlarmDbPersistent>
<NewAlarmSev>5</NewAlarmSev>
<NewAlarmObjSev>3</NewAlarmObjSev>
<NewAlarmDescrSuffix>Aggregate Port alarm, One or more connections on this port are in primary failure</NewAlarmDescrSuffix>
</CorrelatedBmRule> 


Note The alarm severity (NewAlarmSev tag) is Major (value 5).
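To find rules whose two severities differ, the rule definitions can be scanned with a short script. The following Python sketch parses abbreviated copies of the two rules shown above. The <Rules> wrapper element and the function name are assumptions made for the example (the real root element of ruledata.xml is not shown in this guide), and the severity names are limited to the four values mentioned in this section.

```python
import xml.etree.ElementTree as ET

# Severity values named in this section only.
SEVERITY_NAMES = {3: "Clear", 5: "Major", 6: "Critical", 7: "Unreachable"}

# Abbreviated copies of the two example rules; <Rules> is an assumed wrapper.
RULES_XML = """
<Rules>
  <CorrelatedRule State="1 -1">
    <NewAlarmCondID>10301</NewAlarmCondID>
    <NewAlarmSev>6</NewAlarmSev>
    <NewAlarmObjSev>7</NewAlarmObjSev>
  </CorrelatedRule>
  <CorrelatedBmRule StateBm="1">
    <NewAlarmCondID>40801</NewAlarmCondID>
    <NewAlarmSev>5</NewAlarmSev>
    <NewAlarmObjSev>3</NewAlarmObjSev>
  </CorrelatedBmRule>
</Rules>
"""


def severity_mismatches(xml_text):
    """Return (condition ID, alarm severity, object severity) where the two differ."""
    mismatches = []
    for rule in ET.fromstring(xml_text.strip()):
        alarm_sev = int(rule.findtext("NewAlarmSev"))
        object_sev = int(rule.findtext("NewAlarmObjSev"))
        if alarm_sev != object_sev:
            mismatches.append((
                rule.findtext("NewAlarmCondID"),
                SEVERITY_NAMES.get(alarm_sev, str(alarm_sev)),
                SEVERITY_NAMES.get(object_sev, str(object_sev)),
            ))
    return mismatches


result = severity_mismatches(RULES_XML)
```

For the two sample rules, the result pairs condition 10301 with Critical and Unreachable, and condition 40801 with Major and Clear.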


K.11.5.5.4  Severity Shown on Tree View Does Not Match Severity Shown on Platform

If the severity of an NE in the tree view does not match the severity the switch shows for the same NE, complete the following steps:


Step 1 Verify whether you are comparing the correct severity icon.

a. There are two severities associated with every object in the tree view: aggregate severity and self severity. The aggregate severity is on the left; the self severity is on the right. Because the switch aggregates many faults, you should compare the aggregate severity of the object in the tree view with the severity of the switch.

b. If you are looking at the correct severity icon and there is still a discrepancy between the severities, continue to Step 2.

Step 2 Verify whether CTM has synchronized with the node.

a. Log into the CTM server and enter the selnd command.

b. Verify that the mode of the node in question is 3.

Step 3 If the mode is 3, verify whether the discrepancy is caused by aggregated connection alarms. The NM server aggregates more alarms than does the platform. For instance, the NM server aggregates connection alarms on the port, but the switch does not. Therefore, if the tree view displays a higher severity than the switch, the discrepancy might be caused by one or more aggregated port alarms.

a. Right-click the NE in the tree view and choose Show Alarms. All of the alarms for this NE and its children should appear in the alarm list.

b. Filter on the alarms that have greater severity than what is shown for the platform.

c. Verify whether the alarms are aggregate port alarms. If so, this is expected behavior and there is no defect. If there are alarms that have greater severity than what is shown for the platform, and the alarms are not aggregate port alarms, continue to the next step to see whether the alarm is in the database.

Step 4 Verify whether the database has the correct alarm state. See the Cisco Transport Manager Release 9.0 Database Schema for information on what table to look up for this particular entity type.

Step 5 As a possible alternate workaround, open a new GUI window and a new Configuration Center, Chassis View, Diagnostics Center, or Statistics Report and see if the alarm severities match.

Step 6 If the problem persists, collect the following defect information for analysis:

NMServer.log

nmControl.dump

CMSCclient.log on client machine


K.11.5.5.5  Alarm List Shows Alarm that Does Not Exist on Platform

There are several alarms that are correlated by the NM server and are not handled by the switch. The NM server tracks alarm conditions that might not be relevant to the switch alone, but are relevant to the switch and the NM server. One example of such an alarm is the node sync-up state. The switch itself does not track whether the NM server is synchronized with it, but the NM server tracks this information. Therefore, if the node is still synchronizing with the NM server, an alarm is displayed for that node in the alarm list, but there is no such alarm on the platform. The following procedure includes a summary of alarms that might be seen in the alarm list, but not on the platform:


Step 1 If there is an alarm in the alarm list, and not on the platform, verify whether the alarm is included in any of the following lists:

Node sync-up status alarms

Node database sync-up status alarms

Node management state status alarms

Node aggregate alarm status

Link0/Link1 node alarms

Card sync-up status alarms

Aggregate port (connection) alarms

Step 2 If the alarm is not included in any of the alarm lists shown in Step 1, it might be a defect. Complete the following substeps:

a. Verify whether the database has the correct alarm state. See the Cisco Transport Manager Release 9.0 Database Schema for information on what table to look up for this particular entity type.

b. Collect the following defect information:

topod.log, linktopoc.log, ILMITopoc.log, NMServer.log, and fileTopoc.log

nmControl.dump

CMSCclient.log

Step 3 As a possible alternate workaround, open a new GUI window and a new Configuration Center, Chassis View, Diagnostics Center, or Statistics Report.


K.11.5.5.6  Transient Event Has Disappeared Unexpectedly

Transient alarms behave differently depending on whether the alarm is on a managed NE or on the NM server itself. If the transient alarm is on a managed NE (for example, an alarm generated by an FTP transfer failure), the alarm is self-clearing. This means that if the same transient alarm occurs again, the previous active alarm is cleared by the latest alarm.

If the transient alarm is not on a managed NE, but rather on the NM server itself, the alarm is not self-clearing and there can be multiple occurrences of the same alarm condition. These NM server alarms (or events) are not correlated and are not cleared by the NM server on recurrence; the operator must clear them manually. If these events are not cleared manually, the list grows only to the value of MAX_ACTIVE_NMS_EVENTS that is specified in the NMServer.conf file. Once the list of NM server alarms reaches MAX_ACTIVE_NMS_EVENTS, the NM server automatically clears the oldest alarms in the list to make room for the new events. The number of events that are cleared each time the maximum is reached is specified by the EVENTS_TO_CLEAR_WHEN_MAX_REACHED configuration parameter in NMServer.conf.

If a transient event disappears unexpectedly, it is most likely because the NM server purged the event.
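The purge behavior can be pictured with the following Python sketch. It assumes that the purge runs at the moment a new event arrives and the list is already full; the exact trigger point inside the NM server is not documented here, and the function name and sample values are hypothetical.

```python
from collections import deque


def add_nms_event(events, event, max_active, clear_count):
    """Append an event; if the list is already full, drop the oldest ones first."""
    if len(events) >= max_active:
        for _ in range(min(clear_count, len(events))):
            events.popleft()  # oldest events sit at the left end
    events.append(event)
    return events


# max_active plays the role of MAX_ACTIVE_NMS_EVENTS and clear_count the role of
# EVENTS_TO_CLEAR_WHEN_MAX_REACHED; the values 5 and 2 are made up for the example.
events = deque()
for i in range(1, 8):
    add_nms_event(events, f"event-{i}", max_active=5, clear_count=2)
```

With these sample values, the two oldest events are dropped when the sixth event arrives, which is how an uncleared transient event can vanish without operator action.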


Step 1 Filter the alarm list on CTM alarms.

Step 2 Check the alarm count and compare it with the MAX_ACTIVE_NMS_EVENTS value in NMServer.conf. If the alarm count is close to the MAX_ACTIVE_NMS_EVENTS value, the transient event was most likely purged.

Step 3 If the alarm count for NM server alarms has not reached the MAX_ACTIVE_NMS_EVENTS value, the missing event might have been cleared manually. The NM server log indicates whether the event was cleared manually.

Step 4 As a possible alternate workaround, reopen the Alarm List window.

Step 5 If the problem persists, collect the following defect information for analysis:

CMSCclient.log

NMServer.log

Use nmControl to capture an NM server dump


K.11.5.5.7  Port in Tree View Displays Aggregate Alarm; However, No Children Exist Under Port

The tree view in CTM client GUIs displays NEs from the top-level physical view down to the port. Alarm severities are aggregated from children up to parents. Because the port is at the lowest level of the tree, the question that often arises is how a port can have an aggregate alarm if it has no children. The answer is that connections are virtual entities under ports in the tree view. Connections are not visible in the tree, but connection alarms aggregate up to the port in the tree view.
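A minimal Python sketch of this aggregation follows. It assumes that a larger severity value means a more severe condition (consistent with the values quoted in this appendix, where 3 is Clear and 5 is Major) and uses a hypothetical dictionary layout for tree nodes; it is not the client's actual data model.

```python
def aggregate_severity(node):
    """Highest severity among a node's own severity and all of its descendants."""
    child_severities = [aggregate_severity(c) for c in node.get("children", [])]
    return max([node["self"]] + child_severities)


port = {
    "self": 3,  # Clear: the port itself has no alarm
    # Connections are children of the port even though the tree does not draw them.
    "children": [{"self": 5}, {"self": 3}],
}

port_aggregate = aggregate_severity(port)
```

Here the port shows an aggregate severity of 5 (Major) because of one connection, even though no child is visible beneath it.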


Step 1 Compare the connection alarms with the aggregate port alarms:

a. In the Configuration Center, click the Connections tab.

b. Find the port in the tree view and drag and drop it to the right window pane. The Connection view window appears.

c. Click the Get button. The connections are listed, and the alarm status should match the alarm status in the alarm list.

Step 2 If the connection alarms do not match the aggregate port alarm(s) in the Alarm List window, there might be a defect. Collect the following defect information:

CMSCclient.log, NMServer.log, sdbroker*.log, xdbroker*.log

Use nmControl to capture an NM server dump


K.11.6  Chassis View Problems

This section includes the following information:

Chassis View Basics

Lines Are Not Displayed in the Chassis View

Card Is Not Displayed in the Chassis View

Ethernet Status Does Not Update on PXM Cards

HIST, CBRX, and CBTX Status Does Not Update on MGX Nodes

RPM Card Status Does Not Update

RPM Secondary Card Status Is Blue

Lines Are Not Displayed on Secondary Card

Lines Are Not Selectable

Wrong Tooltip Is Displayed

K.11.6.1  Chassis View Basics

The Chassis View is a display-only application that provides a physical view of WAN devices. For a specific node, the Chassis View displays the node, cards, and lines; it does not display ports. It can handle the following events:

Node status changes

Card status changes

Line status changes

The Chassis View handles card and line alarms.

For unsupported cards, the Chassis View shows an empty slot in the chassis. For reserved cards, it shows the lines, but they are disabled. The Chassis View does not show lines under a card when the card is in a state other than active or standby.

The Chassis View displays the chassis by combining static data from the XML file and dynamic data from the database. The first step in debugging is to determine whether data is populated correctly in the database. The next step is to check that the card or line is defined in the XML file.

K.11.6.2  Lines Are Not Displayed in the Chassis View


Step 1 Determine whether line data is populated for the corresponding lines in the line table in the database.

Step 2 Check whether the lines are defined in the XML file ChassisView.xml.

Step 3 Make sure that the line numbers in the database start from 1 and not from 2 or above.

Step 4 Make sure that the node is in sync (mode is 3).

Step 5 If the details are not found in the database, this is likely an EM issue. If the details are found, proceed as follows:

a. Enter the cwmver command to get the CTM version.

b. Enter the following command to retrieve the node table information for the node; then, save it to a file:

echo "select * from node where node_id=<NodeId>" | dbaccess

c. Enter the following command to retrieve the card table information for the card; then, save it to a file:

echo "select * from card where node_id=<NodeId> and slot=<Slot>" | dbaccess

d. Enter the following command to retrieve the line table information for the line; then, save it to a file:

echo "select * from line where node_id=<NodeId> and slot=<Slot>" | dbaccess

e. Save the log as CMSCclient.log to your local drive.

f. Take a copy of the chassisview.jar from the /opt/svplus/java/jars/cwm/ directory. Confirm that the gif files used to draw the lines are available in the jar.

Step 6 As a possible alternate workaround, select the lines from the tree view to launch other applications.
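For the check in Step 5f, a JAR file is a ZIP archive, so its image entries can be listed without unpacking it. The following Python sketch shows one way to do that on a local copy of the jar; the function name is hypothetical, and the script is only a convenience and is not part of CTM.

```python
import zipfile


def gif_entries(jar_path):
    """List the GIF image entries stored in a JAR (a JAR is a ZIP archive)."""
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(name for name in jar.namelist()
                      if name.lower().endswith(".gif"))
```

For example, calling gif_entries("chassisview.jar") on a copy taken from /opt/svplus/java/jars/cwm/ returns the image paths in the archive, which can then be compared with the images referenced for the affected lines.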


K.11.6.3  Card Is Not Displayed in the Chassis View


Step 1 Check that entries for the card are available in the card table.

Step 2 Check whether the card is defined in the XML file ChassisView.xml.

Step 3 Make sure that the node is in sync (mode is 3).

Step 4 If the details are not found in the database, this is likely an EM issue. If the details are found, proceed as follows:

a. Enter the cwmver command to get the CTM version.

b. Enter the following command to retrieve the node table information for the node; then, save it to a file:

echo "select * from node where node_id=<NodeId>" | dbaccess

c. Enter the following command to retrieve the card table information for the card; then, save it to a file:

echo "select * from card where node_id=<NodeId> and slot=<Slot>" | dbaccess

d. Save the log as CMSCclient.log to your local drive.

e. Take a copy of the chassisview.jar from the /opt/svplus/java/jars/cwm/ directory. Confirm that the gif files used to draw the lines are available in the jar.

Step 5 As a possible alternate workaround, select the cards from the tree view to launch other applications.


K.11.6.4  Ethernet Status Does Not Update on PXM Cards

Ethernet status is received only when the Chassis View is launched. Dynamic changes in states of the Ethernet status are not updated in the Chassis View.

K.11.6.5  HIST, CBRX, and CBTX Status Does Not Update on MGX Nodes

For MGX nodes, only the CPUOK LED status is updated in the Chassis View. The Chassis View does not manage the LEDs for HIST, CBRX, or CBTX.

K.11.6.6  RPM Card Status Does Not Update

Dynamic event updates are not generated for RPM cards on MGX PXM1E-based nodes, so the Chassis View does not receive event updates on hot insertion or removal of RPM cards. The card is identified when a cold start is performed.

K.11.6.7  RPM Secondary Card Status Is Blue

For RPM cards, the Standby state shows the card status in blue because the card has only one LED (CPUOK) to show the status of the card, unlike other cards. For other types of cards, the Standby state is indicated by a yellow LED.

K.11.6.8  Lines Are Not Displayed on Secondary Card

If two cards are in a redundancy relationship, the primary card (that is, the logical slot) is used to display the children and for all provisioning and troubleshooting activities, even if the primary slot becomes a standby. The secondary slot does not show any children under it, even if it becomes active. Hierarchical views in all applications behave in this manner. Similarly, provisioning is allowed only on the working line of an APS pair regardless of whether that line is currently active. However, monitoring occurs on both working and protection lines.

K.11.6.9  Lines Are Not Selectable

Lines must be spaced sufficiently apart. If they are not, they might overlap, causing them to become unselectable. For example, when trying to select line 3, line 1 might get selected or vice versa.


Step 1 Select the line from the tree view.

Step 2 If the problem persists, do the following to get defect information for analysis:

Get a copy of the ChassisView.xml file used.

Take a copy of the chassisview.jar from the /opt/svplus/java/jars/cwm/ directory. Confirm that the gif files used to draw the lines are available in the jar.


K.11.6.10  Wrong Tooltip Is Displayed

Lines must be spaced sufficiently apart. If they are not, they might overlap, causing the wrong tooltip to be displayed. For example, when moving the cursor below the last line in the card, the tooltip of the line is displayed instead of the tooltip of the card.


Step 1 Try selecting from the tree view.

Step 2 If the problem persists, do the following to get defect information for analysis:

Get a copy of the XML file used: ChassisView.xml for MGX nodes and BPXIGX.xml for BPX/IGX nodes.

Take a copy of the chassisview.jar from the /opt/svplus/java/jars/cwm/ directory. Confirm that the gif files used to draw the lines are available in the jar.


K.11.7  Configuration Center Management

The Configuration Center management functions are divided into the following categories:

NEs—Manages the nodes and their components. For NE management, the Configuration Center communicates with the Config Server process.

Connections—Manages the connections between the nodes. For connection management, the Configuration Center communicates with the Connection Management (CM) Server process.

This section describes troubleshooting guidelines for element management (Configuration Center > Elements tab). The Connection Management section describes troubleshooting guidelines for connection management (Configuration Center > Connections tab).

This section includes the following information:

Configuration Center Framework

Configuration Center—Element Management

K.11.7.1  Configuration Center Framework

The Configuration Center uses the CTM framework and workflow mechanism to launch applications and drag and drop objects across applications.

NEs can be selected in the tree view and the Configuration Center (Elements tab) can be launched to view or modify the selected object.

NEs can be dragged and dropped from another application to the Configuration Center's Elements tab for modification.

This section includes the following information:

Cannot Launch the Configuration Center

Cannot Launch Other Applications from the Configuration Center

An Exception Is Raised when the Configuration Center Is Launched

An Exception Is Raised when the Configuration Center Launches Another Application

Element Tab—Internal Frame Does Not Launch

Element Tab—Drag-and-Drop Functionality Does Not Launch an Internal Frame

Element Tab—Create, Details, Modify, and Refresh Button Issues

Element Management—Drag and Drop Within the Configuration Center

Cross Application—Configuration Center as Drag Source

Cross Application—Configuration Center's Element Tab as Drop Target

Element Tab—Internal Frame Displays Incorrect Object or Object Data

Configuration Center's Element Tab Does Not Respond (GUI Is Grayed Out)

K.11.7.1.1  Cannot Launch the Configuration Center

Complete the following steps if you cannot launch the Configuration Center by doing any of the following:

Click the Configuration Center icon from the Launch Center or from any application.

Choose Tools > Configuration Center from any application.

Right-click the selected node from the hierarchical tree, and choose Configuration Center.


Step 1 Check the configcenter.jar file. Make sure that the configcenter.jar file is located in the /opt/svplus/java/jars/cwm directory of the target machine.

Step 2 If the problem persists, collect the following defect information for analysis:

CMSCclient.log file from the D:\Documents and Settings\<username>\log directory

Java console information

Screen snapshots; in particular, error or information messages

Cmsvr.log file from the /opt/svplus/log directory

Configserver.log file from the /opt/svplus/log directory


K.11.7.1.2  Cannot Launch Other Applications from the Configuration Center

Complete the following steps if the Configuration Center does not launch other applications using the following methods:

Choose an application from the Tools menu item.

Right-click the selected object from the hierarchical tree and choose the target application.


Step 1 Check the target application jar file. Make sure the target application jar file is located in the /opt/svplus/java/jars/cwm directory of the target machine.

Step 2 If the problem persists, collect the following defect information for analysis:

CMSCclient.log file from the D:\Documents and Settings\<username>\log directory

Java console information

Screen snapshots; in particular, error or information messages

Cmsvr.log file from the /opt/svplus/log directory

Configserver.log file from the /opt/svplus/log directory


K.11.7.1.3  An Exception Is Raised when the Configuration Center Is Launched

Complete the following step if an exception is raised when the Configuration Center is launched using one of the following methods, and the Java console shows the exception trace information:

Choose an application from the Tools menu item.

Right-click the selected object from the hierarchical tree and choose the target application.


Step 1 Collect the following defect information for analysis:

CMSCclient.log file from the D:\Documents and Settings\<username>\log directory

Java console information

Screen snapshots; in particular, error or information messages

Cmsvr.log file from the /opt/svplus/log directory

Configserver.log file from the /opt/svplus/log directory


K.11.7.1.4  An Exception Is Raised when the Configuration Center Launches Another Application

Complete the following step if an exception is raised when the Configuration Center uses one of the following methods to launch another application, and the Java console shows the exception trace information:

Choose an application from the Tools menu item.

Right-click the selected object from the hierarchical tree and choose the target application.


Step 1 Collect the following defect information for analysis:

CMSCclient.log file from the D:\Documents and Settings\<username>\log directory

Java console information

Screen snapshots; in particular, error or information messages

Cmsvr.log file from the /opt/svplus/log directory

Configserver.log file from the /opt/svplus/log directory


K.11.7.1.5  Element Tab—Internal Frame Does Not Launch

When the Element tab is selected and you double-click a supported NE, if an internal frame is not created or the content of an existing internal frame is not recycled, collect the following defect information for analysis:

CMSCclient.log file from the D:\Documents and Settings\<username>\log directory

Java console information

Screen snapshots; in particular, error or information messages

Cmsvr.log file from the /opt/svplus/log directory

Configserver.log file from the /opt/svplus/log directory

K.11.7.1.6  Element Tab—Drag-and-Drop Functionality Does Not Launch an Internal Frame

When the Element tab is selected and you drag and drop supported NEs to the Element tab's content pane, if an internal frame is not created or the content of an existing internal frame is not recycled, collect the following defect information for analysis:

CMSCclient.log file from the D:\Documents and Settings\<username>\log directory

Java console information

Screen snapshots; in particular, error or information messages

Cmsvr.log file from the /opt/svplus/log directory

Configserver.log file from the /opt/svplus/log directory

K.11.7.1.7  Element Tab—Create, Details, Modify, and Refresh Button Issues

When the Element tab is selected and an object is displayed in an internal frame, if the Create, Details, Modify, and Refresh buttons do not launch other internal frames for further provisioning or they do not update the user interface with data for the selected operation, collect the following defect information for analysis:

CMSCclient.log file from the D:\Documents and Settings\<username>\log directory

Java console information

Screen snapshots; in particular, error or information messages

Cmsvr.log file from the /opt/svplus/log directory

Configserver.log file from the /opt/svplus/log directory

K.11.7.1.8  Element Management—Drag and Drop Within the Configuration Center

Complete the following steps if the drag-and-drop functionality of an NE from the Configuration Center tree view to the Element tab's content pane does one of the following:

Fails to open an internal frame

Fails to recycle the contents of an existing frame to display the object's attributes

Results in an Operation Not Supported message box


Step 1 Determine whether the drag-and-drop functionality is supported for the dropped object.

The following objects can be dropped from the tree view to the content pane:

For the Element tab, the Network, Node, Card, Line, Port, IMA, and IMA link objects are supported and Folder objects are not supported.

For the Connection tab, the Node, Card, Line, and Port objects are supported and Folder, IMA, and IMA link objects are not supported.

Step 2 If the problem persists, collect the following defect information for analysis:

CMSCclient.log file from the D:\Documents and Settings\<username>\log directory

Java console information; specifically, any Java-raised exceptions

Screen snapshots; in particular, error or information messages

Cmsvr.log file from the /opt/svplus/log directory

Configserver.log file from the /opt/svplus/log directory
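The support rules in Step 1 amount to a small lookup table. The Python sketch below encodes them as listed in that step; the dictionary and function names are hypothetical and are meant only as a quick reference when deciding whether an Operation Not Supported message is expected.

```python
# Object types that can be dropped on each tab, as listed in Step 1.
DROPPABLE = {
    "Element": {"Network", "Node", "Card", "Line", "Port", "IMA", "IMA link"},
    "Connection": {"Node", "Card", "Line", "Port"},
}


def can_drop(tab, object_type):
    """Return True if the object type is listed as supported for the given tab."""
    return object_type in DROPPABLE.get(tab, set())
```

For example, can_drop("Connection", "IMA") is False, so an Operation Not Supported message for that drop is expected behavior and not a defect.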


K.11.7.1.9  Cross Application—Configuration Center as Drag Source

Complete the following steps if the drag-and-drop functionality of an object from the Configuration Center's tree view or Element tab to another CTM application fails to display the selected object in the target application.


Step 1 Determine whether the drag-and-drop functionality is supported for the selected object.

Step 2 Determine whether the target application supports the drag-and-drop functionality for the dropped object.

Step 3 If the problem persists, collect the following defect information for analysis:

CMSCclient.log file from the D:\Documents and Settings\<username>\log directory

Java console information; specifically, any Java-raised exceptions

Screen snapshots; in particular, error or information messages

Cmsvr.log file from the /opt/svplus/log directory

Configserver.log file from the /opt/svplus/log directory


K.11.7.1.10  Cross Application—Configuration Center's Element Tab as Drop Target

Complete the following steps if the drag-and-drop functionality of an NE from another CTM application to the Configuration Center's Element tab content pane does one of the following:

Fails to open an internal frame.

Fails to recycle the contents of an existing frame to display the dropped object's attributes.

Results in an Operation Not Supported message.


Step 1 Determine whether the drag-and-drop functionality is supported for the dropped object.

The following objects can be dropped from the tree view to the content pane:

For the Element tab, the Network, Node, Card, Line, Port, IMA, and IMA link objects are supported and Folder objects are not supported.

For the Connection tab, the Node, Card, Line, and Port objects are supported and Folder, IMA, and IMA link objects are not supported.

Step 2 If the problem persists, collect the following defect information for analysis:

CMSCclient.log file from the D:\Documents and Settings\<username>\log directory

Java console information; specifically, any Java-raised exceptions

Screen snapshots; in particular, error or information messages

Cmsvr.log file from the /opt/svplus/log directory

Configserver.log file from the /opt/svplus/log directory


K.11.7.1.11  Element Tab—Internal Frame Displays Incorrect Object or Object Data

If the Element tab successfully creates the internal frame but either it displays information related to another object or the object's attribute values are not valid, collect the following defect information for analysis:

CMSCclient.log file from the D:\Documents and Settings\<username>\log directory

Java console information; specifically, any Java-raised exceptions

Screen snapshots; in particular, error or information messages

Cmsvr.log file from the /opt/svplus/log directory

Configserver.log file from the /opt/svplus/log directory

K.11.7.1.12  Configuration Center's Element Tab Does Not Respond (GUI Is Grayed Out)

If the Configuration Center's Element tab does not respond and the GUI is grayed out, collect the following defect information for analysis:

CMSCclient.log file from the D:\Documents and Settings\<username>\log directory

Java console information

Screen snapshots; in particular, error or information messages

Cmsvr.log file from the /opt/svplus/log directory

Configserver.log file from the /opt/svplus/log directory

K.11.7.2  Configuration Center—Element Management

The Configuration Center can be used to configure different NE objects. The Configuration Center's Element tab is the main window for creating, modifying, and viewing NEs. The Element tab uses the CTM Internal Frame mechanism to display the attributes (groups and categories) associated with a particular NE. The following sections describe guidelines for troubleshooting issues related to creating, modifying, and viewing NEs.

K.11.7.2.1  XML Parsing Error

Complete the following steps if an XML parsing error is seen while launching the Configuration Center (either by the drag-and-drop method or the right-click method) for an NE. The popup window says:

Internal Error: XML Parsing Error

Step 1 Verify whether your CTM version supports the NE on which you want to launch the Configuration Center. If this error occurs for a supported node or card, go to Step 2.

Step 2 Open the /opt/svplus/log/configserver.log file and look for the message information showing when this error occurred. A typical message in the log resembles the following output:

ERR: Fatal Error at file, line 0, char 0, Message: An exception occurred! 
Type:RuntimeException, Message: The primary document entity could not be opened. 
Id=/opt/svplus/xml/configcenter/XXX/XXX-XXX.xml 
( <Number>: <x>) <Time Stamp> ERR: InternalError: XML Parsing Error  

If the .xml filename contains two consecutive hyphens (for example, ABC--XYZ.xml), begins with a hyphen (for example, -ABC.xml), or ends with a hyphen before the file extension (for example, ABC-.xml), proceed to Step 4.

Step 3 Verify that the .xml file mentioned in the log message exists. If it does not exist, contact the Cisco TAC.

Step 4 Complete the following substeps to investigate incorrectly formed XML filename strings. The format of XML filenames is <Platform>-<Card>-<Interface_Type>-<Entity_Name>.xml. This format is generic, with a few exceptions. Note that <Platform> and <Interface_Type> are optional and are absent from many filenames. For example, ABC-Card.xml is a valid XML filename.

a. If the <Platform> part of the XML filename is incorrect or missing, check the Node table to verify that it is populated correctly.

b. If the <Card> part of the XML filename is incorrect or missing, check the Card table to verify that it is populated correctly.

c. If the <Interface_Type> part of the XML filename is incorrect or missing, check the appropriate table to verify that it is populated correctly. This table generally corresponds to the Line or Port table of the card for which the error occurred.

d. If the <Entity_Name> part of the XML filename is incorrect, contact the Cisco TAC with the following defect information:

CMSCclient.log file from the D:\Documents and Settings\<username>\log directory

Java console information

/opt/svplus/log/configserver.log

dspcd command output from the switch for the controller card and the service module where the error is detected

dspln or dspport command output, if this error is seen for a line or port
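
The filename checks in this procedure can also be scripted. The following is a minimal sketch, not part of CTM: it assumes the log message carries the file path after "Id=" as in the sample output in Step 2, and the function names extract_xml_path and hyphen_problems are hypothetical helpers introduced only for illustration.

```python
import re
from typing import List, Optional

# Assumption: the path appears as "Id=<path>.xml", as in the sample log output.
ID_PATTERN = re.compile(r"Id=(\S+\.xml)")


def extract_xml_path(log_text: str) -> Optional[str]:
    """Return the first .xml path that follows 'Id=' in the log text, if any."""
    match = ID_PATTERN.search(log_text)
    return match.group(1) if match else None


def hyphen_problems(xml_path: str) -> List[str]:
    """List the malformed-hyphen conditions found in the .xml filename."""
    name = xml_path.rsplit("/", 1)[-1]  # drop the directory part
    stem = name[:-4] if name.endswith(".xml") else name
    problems = []
    if "--" in stem:
        problems.append("consecutive hyphens")
    if stem.startswith("-"):
        problems.append("leading hyphen")
    if stem.endswith("-"):
        problems.append("trailing hyphen")
    return problems


if __name__ == "__main__":
    sample = "Id=/opt/svplus/xml/configcenter/ABC/ABC--XYZ.xml"
    path = extract_xml_path(sample)
    print(path, hyphen_problems(path))
```

An empty list suggests that the filename is well formed, so the next check is whether the file exists (Step 3). A non-empty list points to the malformed-filename investigation in Step 4.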


K.11.7.2.2  SNMP No Data Error

Complete the following steps if an SNMP NO DATA error occurs while launching the Configuration Center (either by the drag-and-drop method or the right-click method) for an NE:


Step 1 Use the selnd command on the CTM machine to verify that the NE for which this error occurred is active, and that the node is synchronized.

Step 2 Check whether the correct community strings are used. If this error occurs for a correctly synchronized node on an active element, proceed to the next step.

Step 3 Use an SNMP manager tool to determine whether the switch responds to SNMP queries.

The SNMP MIB objects to be queried for a particular dialog box are available in the /opt/svplus/log/configserver.log file. Review the entries that were logged when this error occurred, and then complete the following substeps:

a. Using HP-OV SNMP operations, perform an SNMP walk. For example, to walk the "system" MIB table using the public community string, enter the following command:

/opt/OV/bin/snmpwalk -c public nodeName system

b. If an error occurs during the SNMP operations (from the SNMP manager tool), check the switch to verify whether this information is present. If the information is present on the switch, proceed to the next step.

Step 4 Collect the information that was logged in the /opt/svplus/log/configserver.log file when this error occurred. Complete the following substeps: