Cisco ONS 15800 Series DWDM Platforms

Empty Slot Troubleshooting

Document ID: 18924

Updated: Sep 16, 2005

Introduction

This document describes how to troubleshoot empty slot occurrences in the field.

Prerequisites

Requirements

Readers of this document should be knowledgeable of the following:

  • Cisco ONS 15801 System

  • Using Telnet

  • Cisco Photonics Local Terminal (CPLT)

  • Cisco Photonics Toolkit (CPTK)

Components Used

The information in this document is based on the software and hardware versions below.

  • A laptop PC with Windows 95, Windows 98 or Windows 2000.

    Note: This procedure requires access to the Control and Monitoring Processor (CMP) only through the LAN port, not through the PPP serial connection. If the LAN port is not available and you want to collect the data through the serial port while the CMP is running TL1 Agent version 1.1.2, you need a Windows 95 PC, because the connection does not work from Windows 98 or Windows 2000.

  • Telnet application on the PC (verify that your Telnet application enables you to log your session and commands in a log file).

  • CPLT and CPTK already installed on the PC (compatible with the agent version you are running).

The information presented in this document was created from devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If you are working in a live network, ensure that you understand the potential impact of any command before using it.

Background Information

The troubleshooting process described here applies to any empty slot problem. Depending on the type of failure, some steps may not apply. The main steps are as follows:

  • Site inventory, to collect information on the site hardware and software configuration.

  • Site history, to collect information following a locality principle: first the board itself, then its sub-rack, then the whole site.

  • Slot configuration check, to ensure the empty slot corresponds to a present board.

  • Recovering the board monitoring, which involves one or more aimed board resets or extractions.

Conventions

For more information on document conventions, see the Cisco Technical Tips Conventions.

Site Inventory

Site inventory involves collecting information on the site hardware and software configuration. Information collected during this step includes:

General site information:

  • Site Type (Terminal Site, Optical Line Amplification (OLA), Optical Add/Drop Multiplexing (OADM), Regeneration site).

  • Number of sub-racks managed by the CMP.

  • Number and type of boards per sub-rack.

  • Firmware version, Serial Communication Controller (SCC) version and serial number of all the boards present in the site.

CMP card details:

  • Serial number.

  • Part number.

  • Q3 or TL1 agent version.

  • SCC master version.

Site History

Empty slot issues can happen as a result of a configuration change at the level of the subrack containing the empty slot or at the general site level.

The following information is required:

  • History of the card which corresponds to the empty slot:

    • Remote history: the card was inserted during the site installation or during upgrade.

    • Recent history: the card was recently reset or extracted from the slot.

  • Recent idle time: report whether the board had a significant idle time period (that is, when there was no traffic running)

  • Recent fiber cut: for amplifiers, report whether a fiber cut occurred recently

  • History of the local subrack:

    • Recent history: specify whether any board on the same subrack as the empty slot was recently extracted or upgraded, in particular SCF boards, which are mainly used for communication between subracks.

  • History of the site:

    • Recent history: the same information as for the local subrack, but for the whole site, focusing on the CMP board.

CMP Reset Information

Step 1

From a Telnet session, run telnet <ip address> and provide the username and password (for example, ROOT and WMUX). Then run the date command to see the current time and date.

Step 2 - (Q3 Agent from Blue Devil onwards)

From a Telnet session, run cat Q3INFO. You should see output like this:

INFO SECTION
1.5(1)M
PART_NUMBER SECTION
85-03204-02
TIME_STAMP SECTION
date:[20/08/2001] time:[17:15:12]

The TIME_STAMP SECTION shows the last restart of the CMP.
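
When the Telnet session is logged to a file, the restart time can be pulled out of the Q3INFO output automatically. The following is a minimal sketch, not part of the Cisco tooling. It assumes the output has exactly the layout of the example above (a line ending in SECTION followed by one value line) and that the date is written day/month/year, which the sample value 20/08/2001 suggests. The function names are hypothetical.

```python
import re
from datetime import datetime

# Output copied from the example above.
SAMPLE_Q3INFO = """\
INFO SECTION
1.5(1)M
PART_NUMBER SECTION
85-03204-02
TIME_STAMP SECTION
date:[20/08/2001] time:[17:15:12]
"""


def parse_q3info(text):
    """Map each section name to the first non-empty line that follows it."""
    sections = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.endswith(" SECTION"):
            current = line[: -len(" SECTION")]
        elif current is not None and current not in sections:
            sections[current] = line
    return sections


def last_cmp_restart(text):
    """Return the TIME_STAMP value as a datetime (day/month/year is assumed)."""
    stamp = parse_q3info(text)["TIME_STAMP"]
    match = re.fullmatch(
        r"date:\[(\d{2}/\d{2}/\d{4})\]\s+time:\[(\d{2}:\d{2}:\d{2})\]", stamp
    )
    if match is None:
        raise ValueError(f"unrecognized time stamp: {stamp!r}")
    return datetime.strptime(" ".join(match.groups()), "%d/%m/%Y %H:%M:%S")


print(last_cmp_restart(SAMPLE_Q3INFO))  # 2001-08-20 17:15:12
```

Comparing this value with the output of the date command from Step 1 shows how long the CMP has been running since its last restart.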

Step 3 - (TL1 Agent Only)

  1. Run telnet <ip address> 1000 to connect to the TL1 Agent and then set up a TL1 session with the network element (NE) through the activate user command: ACT-USER::<user_ID>:<ctag>::<password>;

    Ask the network administrator for an NE account with read-only privileges. The default setting has both <user_ID> and <password> equal to USER_3. The correlation tag <ctag> is a string of up to six characters. For instance, following the syntax above, the command for the default account could be: ACT-USER::USER_3:CONNEC::USER_3;

    If the command is successfully executed you get a COMPLD response.

  2. Issue the retrieve equipment command: RTRV-EQPT::ALL:EQTLIS;. If the command is successfully executed you get a COMPLD response and the list of all configured boards on the CMP. For example:

    DEFAULT 01-08-20 09:40:29
    M LLL COMPLD
    "TPA_R-01-01-01:IS-NR"
    "TPA_B-01-01-02:IS-NR"
    "EMPTY_SLOT-01-01-03:OOS-MT-MTCE"
    "24WD_LLR-01-01-04:IS-NR"
    "8WD_B-01-01-07:IS-NR"
    "RBA_10G-01-01-09:IS-NR"
    "BBA_10G-01-01-10:IS-NR"
    "RBU_W-01-01-11:IS-NR"
    "EMPTY_SLOT-01-01-13:OOS-MT-MTCE"
    "CMP_W-01-01-15:IS-NR"
    "IOC_W-01-01-16:OOS-MT"
    "SCF_W-01-01-17:IS-NR" 
    
  3. Select the CMP entry and enter the command RTRV-UPTIME::CMP_W-01-01-15:LLL; If the command is successfully executed, you see a COMPLD response and the time of the last restart. For example:

    00-06-19 14:55:56
    M LLL COMPLD
    CMP_W-01-01-15:00-06-19 14:37:46
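
The two TL1 responses in this step can be processed from a session log with a short script. The sketch below is illustrative only. It assumes the responses look exactly like the examples above, that each quoted line has the form "<AID>:<state>", and that TL1 dates are written YY-MM-DD. All function names are hypothetical.

```python
import re
from datetime import datetime

# Responses copied from the examples above.
SAMPLE_EQPT = '''\
DEFAULT 01-08-20 09:40:29
M LLL COMPLD
"TPA_R-01-01-01:IS-NR"
"TPA_B-01-01-02:IS-NR"
"EMPTY_SLOT-01-01-03:OOS-MT-MTCE"
"24WD_LLR-01-01-04:IS-NR"
"8WD_B-01-01-07:IS-NR"
"RBA_10G-01-01-09:IS-NR"
"BBA_10G-01-01-10:IS-NR"
"RBU_W-01-01-11:IS-NR"
"EMPTY_SLOT-01-01-13:OOS-MT-MTCE"
"CMP_W-01-01-15:IS-NR"
"IOC_W-01-01-16:OOS-MT"
"SCF_W-01-01-17:IS-NR"
'''
SAMPLE_UPTIME = """\
00-06-19 14:55:56
M LLL COMPLD
CMP_W-01-01-15:00-06-19 14:37:46
"""

ENTRY = re.compile(r'"([^":]+):([^"]+)"')
UPTIME = re.compile(
    r"^\s*([A-Z0-9_]+(?:-\d{2}){3}):(\d{2}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s*$",
    re.MULTILINE,
)


def parse_equipment(response):
    """Return (aid, state) pairs from the quoted lines of an RTRV-EQPT response."""
    return ENTRY.findall(response)


def empty_slots(entries):
    """AIDs that the agent reports as empty slots."""
    return [aid for aid, _state in entries if aid.startswith("EMPTY_SLOT-")]


def uptime_command(aid, ctag="LLL"):
    """Build the RTRV-UPTIME command in the form shown in item 3."""
    return f"RTRV-UPTIME::{aid}:{ctag};"


def parse_uptime(response):
    """Return (aid, last restart) from an RTRV-UPTIME response (YY-MM-DD assumed)."""
    match = UPTIME.search(response)
    if match is None:
        raise ValueError("no uptime line found")
    return match.group(1), datetime.strptime(match.group(2), "%y-%m-%d %H:%M:%S")


entries = parse_equipment(SAMPLE_EQPT)
print(empty_slots(entries))
cmp_aid = next(aid for aid, _state in entries if aid.startswith("CMP"))
print(uptime_command(cmp_aid))
print(parse_uptime(SAMPLE_UPTIME))
```

Note that IOC_W-01-01-16 is out of service (OOS-MT) but is not an empty slot, so the filter keys on the EMPTY_SLOT prefix rather than on the state field.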
    

pSOS Resource Usage

Relevant information for troubleshooting includes the volatile memory usage of the pSOS operating system and the current process list.

  • OS memory usage: connect to the CMP through a Telnet session on port 5678, enable the trace and enter the command monon. This enables pSOS object tracing. Every ten seconds a set of values is displayed to the operator, including memory fragmentation, message queue usage and so on. Send the trace file to the Cisco Technical Assistance Center (TAC).

  • Current agent activity: this data is useful in order to find out whether the agent deals with a shortage of OS resources. Connect to the CMP through a Telnet session on port 5678, then enable the trace and enter the command allon. Keep the trace running for about a minute then stop the log and disconnect. Send the trace file to the Cisco TAC.

  • Process list: connect to the CMP by means of a Telnet session. Once connected, enable the trace logging and execute the command ps. Send the output of the command to the Cisco TAC.
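
If the Telnet application cannot log to a file, the trace can be captured with a small script instead. The sketch below is an assumption-laden illustration, not a Cisco tool: it treats the connection on port 5678 as a plain TCP stream (no Telnet option negotiation and no login prompt are handled), and it assumes commands are terminated with CR LF. The function names and the log file name are hypothetical.

```python
import socket
import time


def capture_trace(sock, command, seconds, log):
    """Send one trace command, then copy whatever arrives to `log` for `seconds`.

    Returns the number of bytes written to the log.
    """
    sock.sendall(command.encode("ascii") + b"\r\n")  # CR LF ending is an assumption
    deadline = time.monotonic() + seconds
    received = 0
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        sock.settimeout(remaining)
        try:
            chunk = sock.recv(4096)
        except socket.timeout:
            break  # capture window elapsed with no further data
        if not chunk:
            break  # peer closed the connection
        log.write(chunk)
        received += len(chunk)
    return received


def collect(host, command, seconds, path, port=5678):
    """Open the trace port, run one command and save the output to `path`."""
    with socket.create_connection((host, port), timeout=10) as sock:
        with open(path, "wb") as log:
            return capture_trace(sock, command, seconds, log)


# Example (not executed here): keep the allon trace running for about a minute.
# collect("<ip address>", "allon", 60, "allon_trace.log")
```

For the monon trace, choose a capture window of several multiples of ten seconds so that more than one set of values is recorded.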

NE Configuration - Slot List

Collect information about all configured slots in the site. You can retrieve this information through the CPLT application by selecting the Configuration Management pull down menu and the NE inventory option.

This check shows whether some slots without boards inserted appear as configured, giving rise to undesired empty slots.

If the empty slot problem corresponds to a wrongly configured slot (for instance, because the operator forgot to update the slot configuration after site configuration), troubleshooting ends here.
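
The check amounts to comparing two lists: the slots that CPLT reports as configured and the slots where a board is physically installed. A minimal sketch, assuming both lists have been written down by hand as slot identifiers (the identifiers and the function name below are hypothetical):

```python
def configured_without_board(configured, installed):
    """Slots that appear as configured although no board is inserted."""
    return sorted(set(configured) - set(installed))


# Hypothetical data: slot 01-01-03 is configured, but no board is inserted.
configured = ["01-01-01", "01-01-02", "01-01-03", "01-01-04"]
installed = ["01-01-01", "01-01-02", "01-01-04"]

print(configured_without_board(configured, installed))  # ['01-01-03']
```

Any slot returned by this comparison points to a configuration error rather than a board communication failure.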

Empty Slot List

This phase involves collecting a list of all the empty slots that reflects the current status of the site. Each entry indicates either a real empty slot (no board is present in a configured slot) or a board communication failure. The number and position of the empty slots on the site can provide useful information about a possible common root cause of the problem.

If all the slots of a given subrack appear empty, provide the configuration of its eight DIP switches as further information when seeking support. Obtain the empty slot list through the CPLT application by looking at the status column in the main window. For comparison, retrieve the same list through the available Element Manager application (Cisco Transport Manager (CTM) or Customer Profile Manager (CPM)).

Recovery Procedure

The process of recovering monitored boards also provides information needed to understand the cause of the problem. For instance, whenever all the boards on a given subrack except the SCF are no longer visible, the problem may lie in a lockup of that subrack's communication bus rather than in the individual boards that are missing from the management point of view.
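
The pattern described above can be looked for mechanically once the board list is available. The sketch below groups identifiers by subrack and flags any subrack where every board except the SCF shows as an empty slot. It assumes identifiers of the form TYPE-nn-nn-nn as in the TL1 examples, and it further assumes that the last field is the slot and the two preceding fields together identify the subrack, which this document does not state. The sample data and function names are hypothetical.

```python
from collections import defaultdict


def group_by_subrack(aids):
    """Map (field1, field2) to {slot: board type} for each identifier."""
    subracks = defaultdict(dict)
    for aid in aids:
        board, first, second, slot = aid.rsplit("-", 3)
        subracks[(first, second)][slot] = board
    return dict(subracks)


def suspect_bus_lockup(boards):
    """True when an SCF is visible and every other board is an empty slot."""
    types = list(boards.values())
    others = [t for t in types if not t.startswith("SCF")]
    has_scf = len(others) < len(types)
    return has_scf and bool(others) and all(t == "EMPTY_SLOT" for t in others)


# Hypothetical site: subrack 01-02 has lost every board except its SCF.
aids = [
    "TPA_R-01-01-01", "EMPTY_SLOT-01-01-03", "SCF_W-01-01-17",
    "EMPTY_SLOT-01-02-01", "EMPTY_SLOT-01-02-02", "SCF_W-01-02-17",
]

for subrack, boards in sorted(group_by_subrack(aids).items()):
    print(subrack, "suspect bus lockup:", suspect_bus_lockup(boards))
```

A subrack flagged this way is only a hint about where to look first; the traces described below are still needed to confirm the cause.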

To activate trace logging at the CMP level, connect to the CMP through a Telnet session on port 5678. Type allon to enable all interface traces to and from the CMP. Then select Log to File from the application pull-down menu.

CMP Reset

After an empty slot problem occurs, follow the steps below:

  1. Perform a CMP board reset through the CPTK.

  2. Once the CMP is reachable again (verify this with a ping -t <ip address> command), activate the logging procedure. Name the log file afterCmpReset.

  3. Check through CPLT whether the CMP card is working properly.

  4. Check through CPLT whether the previous missing cards are visible or not.

  5. Close the logging on the file afterCmpReset and send the file to Cisco TAC.
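
Waiting for the CMP to come back after the reset can be scripted. The sketch below swaps the ICMP ping for a TCP connection attempt, which needs no special privileges; it assumes the CMP accepts Telnet connections on the default port 23 once it is up, which this document does not state. The helper that builds the log file name follows the afterCmpReset and afterRxtReset examples. All names are hypothetical.

```python
import socket
import time


def tcp_probe(host, port=23, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds (port 23 assumed)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until_reachable(probe, attempts=60, interval=5.0, sleep=time.sleep):
    """Call probe() until it returns True; return the number of attempts used."""
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        sleep(interval)
    raise TimeoutError(f"not reachable after {attempts} attempts")


def log_file_name(board_type):
    """Build a name such as afterCmpReset from a board type such as CMP."""
    return f"after{board_type.capitalize()}Reset"


# Example (not executed here):
# wait_until_reachable(lambda: tcp_probe("<ip address>"))
print(log_file_name("CMP"))  # afterCmpReset
```

Start the trace logging as soon as the wait returns, so that the log covers the period in which the CMP rediscovers its boards.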

Afterwards, if none of the empty slots has been recovered as a board, proceed with the following steps to physically extract the SCF board.

Extracting the Board from the Empty Slot

Caution: Depending on the board you extract, this operation could be traffic-affecting.

This process is recommended only if you want to recover the normal management condition and are aware of the impact that this operation could have on your network.

  1. Activate the logging procedure. With <boardtype> as the board type of the missing board (for instance RXT, wavelength channel module (WCM), line extender modules (LEM) or PRE-LEM), name the log file after<boardtype>Reset (such as afterRxtReset).

  2. Physically extract the board and wait about one minute before reinserting it.

  3. Check through CPLT whether the board no longer appears as an empty slot.

  4. If the empty slot still exists, stop logging to the file after<boardtype>Reset and send the file to Cisco TAC.
