Alarm Troubleshooting


Note


Certain software releases have reached end-of-life status. For more information, see the End-of-Life and End-of-Sale Notices.


This chapter gives a description, severity, and troubleshooting procedure for each commonly encountered Cisco NCS 1001 alarm and condition. When an alarm is raised, refer to its clearing procedure.

0/PM [0|1] Unit Unsupported

Default Severity: Critical (CR), Service Affecting (SA)

Logical Object: EQUIPMENT

The 0/PM [0|1] Unit Unsupported alarm is raised when an unsupported PSU is plugged into the NCS1001 chassis. The alarm has two forms, one for each of the two PSUs:

  • 0/PM0 unit unsupported
  • 0/PM1 unit unsupported

Clear the 0/PM [0|1] Unit Unsupported Alarm

Procedure


Replace the PSU with one that has the correct part number and hardware revision.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


0/RP0 Unit Unsupported

Default Severity: Critical (CR), Service Affecting (SA)

Logical Object: EQUIPMENT

The 0/RP0 Unit Unsupported alarm is raised when an unsupported CPU board is plugged in the chassis.

Clear the 0/RP0 Unit Unsupported Alarm

Procedure


Replace the control board with one that has the correct part number and hardware revision.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


AMPLI-GAIN-LOW, AMPLI-GAIN-HIGH

Default Severity: Minor (MN), Non-Service-Affecting (NSA)

Logical Object: OTS

The Amplifier Gain Low or High alarm is raised when the EDFA module cannot reach the gain setpoint. This condition occurs if the amplifier reaches its range boundaries.
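
The range-boundary condition can be illustrated with a minimal Python sketch. The function name and the example range values are assumptions made for illustration only; the actual gain range depends on the EDFA module in use.

```python
def apply_gain(setpoint_db, min_gain_db, max_gain_db):
    """Clamp a gain setpoint to a hypothetical amplifier gain range.

    Returns (applied_gain_db, setpoint_reached). When setpoint_reached is
    False, the amplifier is at a range boundary and cannot reach the setpoint.
    """
    applied = min(max(setpoint_db, min_gain_db), max_gain_db)
    return applied, applied == setpoint_db


# Example with assumed range boundaries of 12 dB and 24 dB.
print(apply_gain(20.0, 12.0, 24.0))  # setpoint inside the range
print(apply_gain(30.0, 12.0, 24.0))  # setpoint beyond the upper boundary
```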

Clear the AMPLI-GAIN-LOW, AMPLI-GAIN-HIGH Alarm

Procedure


Step 1

If the Amplifier-Control-Mode is set to "Manual", the applied gain comes from the configuration. Adjust the gain setting to a value within the amplifier range (neither too low nor too high).

Step 2

If the Amplifier-Control-Mode is set to "Automatic", the alarm may be caused by a span that is too long or too short, or by other conditions (for example, the measured channels). Check the overall system settings and performance.

Step 3

If the alarm persists, it may indicate an amplifier hardware failure.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


AUTO-AMPLI-DISABLED

Default Severity: Minor (MN), Not-Service-Affecting (NSA)

Logical Object: OTS

The AUTO-AMPLI-DISABLED alarm is triggered when the amplifier operates in automatic mode and the power level difference between two channels goes beyond the configured Channel Power Max Delta value. When the difference exceeds the configured delta value, the Amplifier control is disabled. Although the output power is still visible, the gain regulation does not occur as intended.
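
The triggering condition can be sketched in Python as a comparison between the channel power spread and the configured delta. This is a minimal illustration only: the function name and the sample power values are assumptions, and the device performs this check internally.

```python
def exceeds_max_delta(channel_powers_dbm, channel_power_max_delta_db):
    """Return True when the spread between the strongest and weakest
    channel is greater than the configured Channel Power Max Delta."""
    if len(channel_powers_dbm) < 2:
        return False  # a spread needs at least two channels
    spread_db = max(channel_powers_dbm) - min(channel_powers_dbm)
    return spread_db > channel_power_max_delta_db


# Hypothetical OTS-OCH power readings in dBm and a delta of 5 dB.
print(exceeds_max_delta([-10.0, -3.5, -8.0], 5.0))
```

With the sample readings, the spread between the weakest and strongest channel is 6.5 dB, which is above the 5 dB delta.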

Clear the AUTO-AMPLI-DISABLED Alarm

The alarm is cleared automatically when the power level difference between two channels is below the configured delta value. To further troubleshoot and clear this alarm, perform the following steps:

Procedure


Step 1

Check the channel plan at the system level and verify that the OTS-OCH power levels of the amplifier meet the expected values. One or more channels may have power issues compared to the others. While the alarm is raised, the difference between the channel with the minimum power and the channel with the maximum power is greater than channel-power-max-delta.

Step 2

Check the patch panel for channels with low power. Clean or replace fibers of affected incoming channels.

Step 3

Check the channel plan at the system level, including the channel launch power versus the channel path. If the cause cannot be determined, use the channel-power-max-delta value command to set the Channel Power Max Delta to a value higher than the difference between the highest and lowest channel power.

Note

 

Make sure to revert the configuration to the default delta value once the power level difference between two channels is below the configured delta value.


If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).

AUTO-AMPLI-MISMATCH

Default Severity: NotAlarmed (NA), Not-Service-Affecting (NSA)

Logical Object: OTS

The Amplifier Automatic Configuration Mismatch alarm is raised when the amplifier-control-mode is configured as auto but no grid-mode configuration is present. Without a grid mode, the automatic amplifier control cannot work.

Clear the AUTO-AMPLI-MISMATCH Alarm

Procedure


Step 1

Determine whether the system is intended to work with automatic control or manual control. The required Cisco NCS1001 configuration depends on this choice.

Step 2

Check the Cisco NCS1001 running configuration using the show running-config command.

Step 3

Enter the grid-mode configuration for the EDFA module, if required.

Step 4

Change the amplifier configuration to manual mode, if required.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


AUTO-AMPLI-RUNNING

Default Severity: NotAlarmed (NA), Not-Service-Affecting (NSA)

Logical Object: OTS

The Automatic Amplifier Running alarm is raised while the internal algorithm performs the calculations needed to reach the target power. The alarm notifies the user that the final power has not yet been reached.

Clear the AUTO-AMPLI-RUNNING Alarm

Procedure


The Automatic Amplifier Running alarm clears automatically when the target power is reached.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


AUTO-LASER-SHUT

Default Severity: NotAlarmed (NA), Not-Service-Affecting (NSA)

Logical Object: OTS

The Amplifier Laser Shutdown alarm is raised for safety reasons. If an OTS port supports an amplifier and the safety-control-mode is set to "auto", the amplifier may shut down its TX power when the corresponding RX port is not receiving power, for example because of a fiber cut.

Clear the AUTO-LASER-SHUT Alarm

Procedure


Step 1

For Controller OTS 1 (LINE), check for the RX-LOC or RX-LOSP alarm. For Controller OTS 0 (COM), check whether an RX-POWER-FAIL-LOW alarm is raised on Controller OTS 3 (COM-CHECK).

Step 2

Check that the fiber is properly plugged in and that there is no fiber cut on the span.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


AUTO-POW-RED

Default Severity: NotAlarmed (NA), Not-Service-Affecting (NSA)

Logical Object: OTS

The Automatic Power Reduction alarm is raised as a temporary condition while the amplifier restarts, pulsing at the APR power levels.

Clear the AUTO-POW-RED Alarm

Procedure


Step 1

Wait for the APR cycles to complete (say, 100 seconds, with 8 seconds at APR levels). The alarm clears once the amplifier ends the APR phase.

Step 2

If the alarm persists, check that the fiber cut is repaired.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


Ctrl-FPGA PCIe Error

Default Severity: Critical (CR), Service Affecting (SA)

Logical Object: EQUIPMENT

The Ctrl-FPGA PCIe Error alarm is raised when the Control FPGA is unreachable due to a PCIe bus error.

Clear the Ctrl-FPGA PCIe Error Alarm

Procedure


Reload the Cisco NCS1001. If the alarm does not clear, it may be due to a hardware failure.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


Daisyduke - FPGA PCIe Error

Default Severity: Critical (CR), Service Affecting (SA)

Logical Object: EQUIPMENT

The Daisyduke - FPGA PCIe error alarm occurs when the Daisy Duke CPU FPGA is unable to communicate with the CPU controller due to a Peripheral Component Interconnect Express (PCIe) error.

Clear the Daisyduke - FPGA PCIe Error Alarm

Procedure


Step 1

Reload the chassis.

Step 2

If the alarm does not clear after the reload, it may be due to a hardware fault.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


Daisyduke Processor Hot Alarm

Default Severity: Minor (MN), Non-Service-Affecting (NSA)

Logical Object: EQUIPMENT

The Daisyduke Processor Hot alarm is raised when the CPU detects a high processor temperature.

Clear the Daisyduke Processor Hot Alarm

Procedure


Verify proper functioning of all the fans in the system.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


EQPT-DEGRADE-<n>

Default Severity: Minor (MN), Not-Service-Affecting (NSA)

Logical Object: EQUIPMENT

The Equipment Degrade alarm is raised when the optical module detects a misbehavior but the module is still considered to be working properly.

Clear the EQPT-DEGRADE-<n> Alarm

Procedure


The module is still working but requires further diagnosis. Plan for a module replacement.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


EQPT-FAIL-<n>

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Object: EQUIPMENT

The Equipment Failure alarm is raised on an optical module that has an internal failure or is not able to communicate with the NCS1001 chassis.

Clear the EQPT-FAIL-<n> Alarm

Procedure


Step 1

From the admin plane, try to trigger a module reset (this operation is traffic affecting).

Step 2

If the problem persists, it typically indicates an internal hardware failure of the optical module.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


Ethernet Switch Communication Error

Default Severity: Major (MJ), Service Affecting (SA)

Logical Object: EQUIPMENT

The Ethernet Switch Communication Error alarm is raised when the interconnected board is unable to communicate with the CPU due to a Peripheral Component Interconnect Express (PCIe) error.

Clear the Ethernet Switch Communication Error Alarm

Procedure


Power cycle the Cisco NCS1001.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


Fan Tray Removal Alarm

Default Severity: Minor (MN), Not-Service-Affecting (NSA)

Logical Object: FT

The Fan Tray Removal alarm is raised when a fan tray is removed from the Cisco NCS1001 chassis.

Clear the Fan Tray Removal Alarm

Procedure


Insert the missing or removed fan tray.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


FPDs are Incompatible - Need to Upgrade all the FPDs

Default Severity: Major (MJ), Service Affecting (SA)

Logical Object: EQUIPMENT

The FPDs are Incompatible - Need to Upgrade all the FPDs alarm is raised when the field-programmable devices (FPDs) are running incompatible firmware versions.

Clear the FPDs are Incompatible - Need to Upgrade all the FPDs Alarm

Procedure


Update the firmware.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


I2C Access Error

Default Severity: Major (MJ), Service Affecting (SA)

Logical Object: EQUIPMENT

The I2C Access Error alarm is raised when the Cisco NCS1001 detects errors on the I2C buses of the interconnected card.

Clear the I2C Access Error Alarm

Procedure


This alarm indicates hardware failure.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


[Low | High] Voltage Alarm

Default Severity: Critical (CR), Service Affecting (SA)

Logical Object: EQUIPMENT

A [Low | High] Voltage alarm is raised when one of the internal voltage measurements is not within the operating range. The format of the voltage alarm is:

  • [sensor name]: high voltage alarm.
  • [sensor name]: low voltage alarm.
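
As an illustration, the following minimal Python sketch maps a voltage reading to the alarm text in the format above. The function name, the sensor name, and the operating range values are hypothetical; the real limits are defined per sensor by the platform.

```python
def voltage_alarm(sensor_name, measured_v, low_limit_v, high_limit_v):
    """Return the alarm text for a reading outside the operating range,
    or None when the reading is within range.

    low_limit_v and high_limit_v are assumed limits for illustration.
    """
    if measured_v > high_limit_v:
        return f"{sensor_name}: high voltage alarm."
    if measured_v < low_limit_v:
        return f"{sensor_name}: low voltage alarm."
    return None


# Hypothetical 3.3 V rail with an assumed operating range of 3.0 V to 3.6 V.
print(voltage_alarm("VP3P3", 3.8, 3.0, 3.6))
```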

Clear the [Low | High] Voltage Alarm

Procedure


Voltage alarms clear when the voltage returns to within the operating range.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


More than One Fan Tray is Removed from Chassis

Default Severity: Critical (CR), Not-Service-Affecting (NSA)

Logical Object: EQUIPMENT

This alarm is raised when more than one fan tray is removed from the Cisco NCS1001 chassis.
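
The relationship between this alarm and the Fan Tray Removal alarm can be sketched in Python as follows. The function name is an assumption made for illustration; the alarm names and severities are the ones listed in this chapter.

```python
def fan_tray_alarm(removed_trays):
    """Return (alarm name, default severity) for a count of removed fan
    trays, or None when no fan tray is removed."""
    if removed_trays <= 0:
        return None
    if removed_trays == 1:
        return ("Fan Tray Removal Alarm", "Minor")
    return ("More than One Fan Tray is Removed from Chassis", "Critical")


print(fan_tray_alarm(2))
```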

Clear the More than One Fan Tray is Removed from Chassis Alarm

Procedure


Step 1

Check if at least three fan trays are inserted.

Step 2

Ensure that there is no fan tray failure.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


Out of Tolerance Fault

Default Severity: Minor (MN), Not-Service-Affecting (NSA)

Logical Object: EQUIPMENT

An Out of Tolerance alarm is raised when a sensor detects an incorrect working condition. The alarm appears in the following format:

  • [sensor name]: out of tolerance fault.

Clear the Out of Tolerance Fault Alarm

Procedure


Check the sensor for hardware failure.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


Power Module Error (PM_I2C_ACCESS_ERROR)

Default Severity: Major (MJ), Service Affecting (SA)

Logical Object: PEM

The Power Module Error (PM_I2C_ACCESS_ERROR) alarm is raised when there is an error on the power module. The detected error is a communication error on the I2C bus.

Clear the Power Module Error (PM_I2C_ACCESS_ERROR) Alarm

Procedure


Check the health status of the PEM module through the admin console.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


Power Module Error (PM_NO_INPUT_DETECTED)

Default Severity: Major (MJ), Service Affecting (SA)

Logical Object: PEM

The Power Module Error (PM_NO_INPUT_DETECTED) alarm is raised when there is an error on the power module. This error is detected when the input power is not available.

Clear the Power Module Error (PM_NO_INPUT_DETECTED) Alarm

Procedure


Check the health status of the PEM module using the admin console.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


Power Module Error (PM_VIN_VOLT_OOR)

Default Severity: Major (MJ), Service Affecting (SA)

Logical Object: PEM

The Power Module Error (PM_VIN_VOLT_OOR) alarm is raised when the input voltage on the power module is out of range.

Clear the Power Module Error (PM_VIN_VOLT_OOR) Alarm

Procedure


If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


Power Module Output Disabled (PM_OUTPUT_EN_PIN_HI)

Default Severity: Major (MJ), Service Affecting (SA)

Logical Object: PEM

The Power Module Output Disabled (PM_OUTPUT_EN_PIN_HI) alarm is raised when the output power is disabled and the PEM module does not work.

Clear the Power Module Output Disabled (PM_OUTPUT_EN_PIN_HI) Alarm

Procedure


Enable the power supply to clear the alarm.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


Power Module Overloaded (PM_POWER_LIMITED)

Default Severity: Major (MJ), Service Affecting (SA)

Logical Object: PEM

The Power Module Overloaded (PM_POWER_LIMITED) alarm is raised when power limitation control is active in the power module.

Clear the Power Module Overloaded (PM_POWER_LIMITED) Alarm

Procedure


Check the health status of the PEM module through the admin console.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


Power Module Redundancy Lost

Default Severity: Major (MJ), Service Affecting (SA)

Logical Object: PEM

The Power Module Redundancy Lost alarm is raised when either of the two active power modules is removed from the chassis.

Clear the Power Module Redundancy Lost Alarm

Procedure


This alarm is cleared when an active power supply module is inserted on Cisco NCS 1001.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


Power Module Warning (Low Input Voltage)

Default Severity: Minor (MN), Not-Service-Affecting (NSA)

Logical Object: PEM

The Power Module Warning (Low Input Voltage) alarm is raised when the input voltage drops to the warning level.

Clear the Power Module Warning (Low Input Voltage) Alarm

Procedure


Check the PEM module power supply.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


Power Module Warning (PM_FAN_OUT_OF_TOLERANCE)

Default Severity: Minor (MN), Not-Service-Affecting (NSA)

Logical Object: PEM

The Power Module Warning (PM_FAN_OUT_OF_TOLERANCE) alarm is raised when there is a warning on the power module. The detected problem is a PSU fan that is out of tolerance during normal working conditions.

Clear the Power Module Warning (PM_FAN_OUT_OF_TOLERANCE) Alarm

Procedure


Check the health status of the PEM module through the admin console.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


Power Module Warning (PM_OT_Warning)

Default Severity: Minor (MN), Not-Service-Affecting (NSA)

Logical Object: PEM

The Power Module Warning (PM_OT_Warning) alarm is raised when the PEM module reaches the warning level.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).

RX-LOC

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Object: OTS

The RX Loss Of Continuity (LOC) alarm is raised when there is an optical power failure on the port receiving from a span. This alarm represents a fiber cut on the span and is a combination of RX-LOS-P and RX-POWER-FAIL-LOW.

Clear the RX-LOC Alarm

Procedure


Check the RX power reading on the ports that receive from the fiber spans. For the EDFA pluggable module, check the RX total power on OTS Port Controller 1 (LINE port) and the TX power on OTS Port Controller 2 (OSC port).

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


RX-LOS-P

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Object: OTS

The RX-LOS-P alarm is raised when there is an optical signal power loss on an OTS port as it transmits the signal.

Clear the RX-LOS-P Alarm

Procedure


Step 1

Check if the threshold settings are as per the expected system performance.

Step 2

Check whether the received power is at the expected level.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


RX-POWER-FAIL-LOW

Default Severity: Minor (MN), Non-Service-Affecting (NSA)

Logical Object: OTS, OTS-OCH, Optics Controller

The RX-POWER-FAIL-LOW alarm is triggered on an OTS-OCH, optics, or OTS controller whenever the optical power of the incoming signal drops below the set RX-low-threshold on the corresponding controllers.
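
The threshold comparison can be illustrated with a minimal Python sketch. The function name, the controller names, and the power and threshold values are assumptions made for illustration; the device evaluates the configured threshold internally.

```python
def controllers_below_threshold(readings):
    """Return the sorted names of controllers whose RX power is below
    their RX low threshold.

    readings maps a controller name to a tuple of
    (rx_power_dbm, rx_low_threshold_dbm).
    """
    return sorted(
        name
        for name, (rx_power_dbm, rx_low_threshold_dbm) in readings.items()
        if rx_power_dbm < rx_low_threshold_dbm
    )


# Hypothetical readings: one controller is below its -30 dBm threshold.
sample = {
    "controller-a": (-12.0, -30.0),
    "controller-b": (-32.0, -30.0),
}
print(controllers_below_threshold(sample))
```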

Clear the RX-POWER-FAIL-LOW Alarm

Procedure


Step 1

Check if the threshold settings are as per the expected system performance.

Step 2

Check whether the received power is correct, or whether it is missing because of a fiber cut or because the port is connected to a removed channel.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


Sensor in a Failure State

Default Severity: Critical (CR), Service Affecting (SA)

Logical Object: EQUIPMENT

The Sensor in a Failure State alarm is raised when there is a failure indication in the sensor.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).

SWITCH-TO-PROTECT

Default Severity: NotAlarmed (NA), Not-Service-Affecting (NSA)

Logical Object: OTS

The Switch to Protect alarm is raised when the status of the "Protect" controller is Active and the status of the "Working" controller is Standby. Only PSM OTS controllers 1 and 2 have these types.
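
The condition can be expressed as a minimal Python sketch. The function name and the way the status values are passed in are assumptions made for illustration; the status strings are the ones used in the description above.

```python
def switch_to_protect_raised(working_status, protect_status):
    """Return True when the Protect controller is Active and the Working
    controller is Standby."""
    return protect_status == "Active" and working_status == "Standby"


print(switch_to_protect_raised(working_status="Standby", protect_status="Active"))
```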

Clear the SWITCH-TO-PROTECT Alarm

Procedure


This alarm is cleared when a switch event happens.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


Temperature Alarm

Default Severity: Critical (CR), Not Service Affecting (NSA)

Logical Object: EQUIPMENT

The Temperature alarm is raised when the temperature of a sensor exceeds the normal operating range. The alarm appears in the following format:

  • [sensor name]: temperature alarm.
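
When alarm text is processed by a script, the sensor-prefixed formats used in this chapter (voltage, out of tolerance, and temperature) can be split into a sensor name and an alarm type. The following Python sketch is a minimal illustration; the function name and the sample sensor name are hypothetical, and the exact text emitted by a given software release may differ.

```python
import re

# Alarm types that this chapter shows in the "[sensor name]: ..." format.
_SENSOR_ALARM = re.compile(
    r"^(?P<sensor>.+?): "
    r"(?P<kind>high voltage alarm|low voltage alarm|"
    r"out of tolerance fault|temperature alarm)\.?$"
)


def parse_sensor_alarm(text):
    """Return (sensor name, alarm type) for a sensor-prefixed alarm
    string, or None when the text does not match a known format."""
    match = _SENSOR_ALARM.match(text.strip())
    if match is None:
        return None
    return match.group("sensor"), match.group("kind")


print(parse_sensor_alarm("TEMP1: temperature alarm."))
```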

Clear the Temperature Alarm

Procedure


Step 1

Verify the temperature of Cisco NCS1001 or the temperature of the optical modules displayed on the chassis.

Step 2

Verify that the environmental temperature of the room is not abnormally high.

Step 3

Verify that the fans are functioning and that the air flow to the fans is not obstructed.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


TX-POWER-FAIL-LOW

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Object: OTS, OTS-OCH

The TX Power Fail Low alarm is raised when the transmit optical power is below the tx-low-threshold.

Clear the TX-POWER-FAIL-LOW Alarm

Procedure


Step 1

Check if the threshold settings are as per the expected system performance.

Step 2

Check if the corresponding receiving power is correct. For example, an OTS Controller 1 TX receives power from Controller 0 RX.

Step 3

Check for any hardware failure.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


CARLOSS (GE)

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Object: GE

The CARLOSS alarm for a Gigabit Ethernet (GE) controller is raised when the GE fiber cable is disconnected at the optical controller port.

Clear the CARLOSS (GE) Alarm

Procedure


Step 1

Check that the OSC SFP RX port fiber is plugged into the EDFA in the slot where the alarm is raised.

Step 2

If the fiber is correctly plugged in, clean the connector or change the fiber to clear the alarm.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).