Cisco ONS 15454 Series Multiservice Provisioning Platforms

Monitor Synchronization Performance and Troubleshoot Timing Alarms on ONS 15454

Document ID: 65121

Updated: Jan 05, 2006

Introduction

This document explains how to monitor synchronization performance and troubleshoot timing alarms on the Cisco ONS 15454.

Prerequisites

Requirements

Cisco recommends that you have knowledge of these topics:

Components Used

The information in this document is based on these software and hardware versions:

  • Cisco ONS 15454 NEBS/ANSI (SW 2.X minimal timing advances, 3.X, 4.X – 5.X latest timing advances)

The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, make sure that you understand the potential impact of any command.

Conventions

Refer to Cisco Technical Tips Conventions for more information on document conventions.

Background Information

This section provides the relevant background information on timing as seen on the ONS 15454.

Node Timing Architecture

The ONS 15454 supports SONET standard-compliant timing and synchronization. Standards with which the ONS 15454 complies include:

  • Telcordia GR-253, SONET Transport Systems, Common Generic Criteria

  • Telcordia GR-436, Digital Network Synchronization Plan

The ONS 15454 platforms implement timing and synchronization functions in the Timing Communications and Control (TCC) card. A redundant architecture protects against failure or removal of one common control card. For timing reliability, the TCC card can synchronize to one of these three timing references:

  • Primary timing reference

  • Secondary timing reference

  • Third synchronization reference

You can select the three timing references from these timing sources:

  • Two Building Integrated Timing Supply (BITS) clock inputs (External mode)

  • All of the synchronous optical interfaces (Line mode)

  • An internal, free-running Stratum 3 enhanced clock

A slow-reference tracking loop allows the common control cards to track the selected timing reference and provide ‘holdover’ timing (or timing reference memory) when all references fail. In a fail-over scenario, availability of the next best timing reference (or clock quality) governs the selection of the next timing reference. The Stratum hierarchy defines the next best timing reference. In summary, here is a list of the timing modes available in the ONS 15454:

  • External (BITS) timing

  • Line (Optical) timing

  • Internal / Holdover (automatically available when all references fail)

  • Internal / Free-running
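
The fail-over order described above can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the `TimingReference` class, its `healthy` flag, and the `select_timing_mode` function are hypothetical names and not part of any ONS 15454 interface. The sketch ranks references only by their provisioned order, whereas the node also takes clock quality into account.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TimingReference:
    """Hypothetical model of one provisioned timing reference."""
    name: str      # for example "BITS-1" or an optical port label
    healthy: bool  # True while the reference is usable


def select_timing_mode(refs: Sequence[TimingReference],
                       holdover_available: bool) -> str:
    """Return the first healthy reference in priority order.

    When every reference has failed, fall back to holdover if reference
    memory is available, otherwise to the free-running internal clock.
    """
    for ref in refs:
        if ref.healthy:
            return ref.name
    return "holdover" if holdover_available else "free-running"


# Primary has failed, so the secondary reference is selected.
refs = [
    TimingReference("BITS-1", healthy=False),
    TimingReference("BITS-2", healthy=True),
    TimingReference("OC-N line", healthy=True),
]
print(select_timing_mode(refs, holdover_available=True))  # BITS-2
```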

Stratum Levels

The American National Standards Institute (ANSI) standard entitled "Synchronization Interface Standards for Digital Networks" released as ANSI/T1.101-1998 defines the stratum levels and minimum performance criteria. This table provides a summary:

Stratum | Accuracy, Adjustment Range | Pull-In Range | Stability | Time To First Frame Slip *
1 | 1 x 10^-11 | N/A | N/A | 72 Days
2 | 1.6 x 10^-8 | Must be able to synchronize to the clock with an accuracy of +/-1.6 x 10^-8 | 1 x 10^-10/day | 7 Days
3E | 4.6 x 10^-6 | Must be able to synchronize to the clock with an accuracy of +/-4.6 x 10^-6 | 1 x 10^-8/day | 17 Hours
3 | 4.6 x 10^-6 | Must be able to synchronize to the clock with an accuracy of +/-4.6 x 10^-6 | 3.7 x 10^-7/day | 23 Minutes
SONET Minimum Clock | 20 x 10^-6 | Must be able to synchronize to the clock with an accuracy of +/-20 x 10^-6 | Not Yet Specified | Not Yet Specified
4E | 32 x 10^-6 | Must be able to synchronize to the clock with an accuracy of +/-32 x 10^-6 | Same as Accuracy | Not Yet Specified
4 | 32 x 10^-6 | Must be able to synchronize to the clock with an accuracy of +/-32 x 10^-6 | Same as Accuracy | N/A

* To calculate the slip rate from drift, assume a frequency offset equal to the drift in 24 hours, and accumulate bit slips until 193 bits (one frame) accumulate. Drift rates for various atomic and crystal oscillators are well known. However, drift is usually neither linear nor monotonically increasing.
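
As a rough illustration of the footnote, the Python sketch below computes how long a constant fractional frequency offset takes to accumulate one 193-bit frame. The DS1 rate of 1.544 Mbit/s and the function name `days_to_first_frame_slip` are assumptions added for the example. The constant-offset model is a simplification and is not claimed to be the method behind every row of the table; it happens to give about 72 days for an offset of 2 x 10^-11, which would correspond to two Stratum 1 clocks that each sit at the accuracy limit in opposite directions (an assumed interpretation).

```python
DS1_BIT_RATE = 1_544_000   # bits per second (assumed DS1 payload rate)
DS1_FRAME_BITS = 193       # bits in one DS1 frame
SECONDS_PER_DAY = 86_400


def days_to_first_frame_slip(fractional_offset: float) -> float:
    """Days until a constant frequency offset accumulates one DS1 frame."""
    if fractional_offset <= 0:
        raise ValueError("fractional_offset must be positive")
    frame_seconds = DS1_FRAME_BITS / DS1_BIT_RATE  # 125 microseconds
    return frame_seconds / fractional_offset / SECONDS_PER_DAY


# Two clocks each off by 1 x 10^-11 in opposite directions (assumption).
print(round(days_to_first_frame_slip(2e-11), 1))  # 72.3
```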

Jitter, Wander and Slips

Jitter and Wander

Jitter is the instantaneous deviation of a digital signal (frequency) from the nominal value (that is, the reference clock). Jitter commonly occurs when digital signals pass through network elements that use stuffing bits in the transmission protocol. The removal of these stuffing bits can cause jitter. Jitter is expressed in Unit Intervals (UI), where one UI is the nominal period of one bit, so a jitter value is a fraction or multiple of one UI. For example, at a data rate of 155.52 Mbit/s, one UI is approximately 6.43 ns.
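
The UI arithmetic can be checked with a few lines of Python. The function names are hypothetical and exist only for this example.

```python
def unit_interval_ns(bit_rate_bps: float) -> float:
    """Return the nominal duration of one bit (one UI) in nanoseconds."""
    return 1e9 / bit_rate_bps


def jitter_in_ui(jitter_ns: float, bit_rate_bps: float) -> float:
    """Express a jitter amplitude given in nanoseconds as a number of UI."""
    return jitter_ns / unit_interval_ns(bit_rate_bps)


print(f"{unit_interval_ns(155.52e6):.2f} ns")  # 6.43 ns
```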

Wander is very slow jitter (frequency less than 10 Hz). When you design the synchronization distribution sub-system for a network, your targets for sync performance must be zero slips and zero pointer adjustments during normal conditions. You can express wander in terms of TIE (Time Interval Error). TIE represents the phase difference between a clock signal under test and a reference source.

Minimize Jitter and Wander

To minimize wander in a line-timed network, reduce the number of nodes that are line-timed in a daisy chain. To distribute timing through a multiple-node SONET ring, distribute the timing from the node that uses BITS timing in both the east and west directions rather than daisy-chaining in a single direction.

By design, SONET equipment works ideally in a synchronous network. When the network is not synchronous, the equipment must rely on mechanisms such as pointer processing and bit-stuffing to compensate, and jitter and wander tend to increase.

Timing Slips

Some DS-1 sources use slip buffers that enable you to perform controlled slips of the DS-1 signal. ONS 15454 does not support controlled slips on synchronization inputs.

Monitor Pointer Justification Count Performance

SONET uses pointers to compensate for frequency and phase variations. Pointer justification counts indicate timing errors on SONET networks. When a network is out of synchronization, jitter and wander occur on the transported signal. Excessive wander can cause terminating equipment to slip.

Slips cause different effects in service. For example, intermittent audible clicks interrupt voice service. Similarly, compressed voice technology faces short transmission errors or dropped calls; fax machines lose scanned lines or experience dropped calls; digital video transmission shows distorted pictures or frozen frames; encryption service loses the encryption key, and causes re-transmission of data.

Pointers provide a way to align the phase variations in STS and VT payloads. You can find the STS payload pointer in the H1 and H2 bytes of the line overhead. You can measure clocking differences by the offset in bytes from the pointer to the first byte of the STS synchronous payload envelope (SPE) called the J1 byte. Clocking differences that exceed the normal range of 0 to 782 can cause data loss.
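
The following Python sketch shows how a pointer value could be read from the H1 and H2 bytes. The bit layout (a 10-bit pointer in the two least significant bits of H1 followed by all eight bits of H2) comes from the SONET standard rather than from this document, and the function names are hypothetical.

```python
MAX_STS_POINTER = 782  # upper end of the normal range quoted above


def decode_sts_pointer(h1: int, h2: int) -> int:
    """Extract the 10-bit pointer value from the H1 and H2 bytes."""
    if not (0 <= h1 <= 0xFF and 0 <= h2 <= 0xFF):
        raise ValueError("h1 and h2 must be single bytes")
    # The upper six bits of H1 carry other fields and are masked off.
    return ((h1 & 0x03) << 8) | h2


def pointer_in_range(value: int) -> bool:
    """Return True when the pointer value lies in the normal range."""
    return 0 <= value <= MAX_STS_POINTER


value = decode_sts_pointer(0x62, 0x0A)
print(value, pointer_in_range(value))  # 522 True
```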

You must understand positive pointer justification count (PPJC) parameters and negative pointer justification count (NPJC) parameters. PPJC is a count of path-detected (PPJC-PDET-P) or path-generated (PPJC-PGEN-P) positive pointer justifications. NPJC is a count of path-detected (NPJC-PDET-P) or path-generated (NPJC-PGEN-P) negative pointer justifications based on the specific PM name. PJCDIFF is the absolute value of the difference between the total number of detected pointer justification counts and the total number of generated pointer justification counts. PJCS-PDET-P is a count of the one-second intervals that contain one or more PPJC-PDET or NPJC-PDET. PJCS-PGEN-P is a count of the one-second intervals that contain one or more PPJC-PGEN or NPJC-PGEN.
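
The relationship between these parameters can be sketched in Python. The class and field names below are hypothetical containers for the PM values and do not represent an ONS 15454 interface. The sample numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class PointerJustificationCounts:
    """Hypothetical container for one interval of path PM counts."""
    ppjc_pdet: int  # positive justifications detected
    npjc_pdet: int  # negative justifications detected
    ppjc_pgen: int  # positive justifications generated
    npjc_pgen: int  # negative justifications generated

    @property
    def pjcdiff(self) -> int:
        """Absolute difference between detected and generated totals."""
        detected = self.ppjc_pdet + self.npjc_pdet
        generated = self.ppjc_pgen + self.npjc_pgen
        return abs(detected - generated)


counts = PointerJustificationCounts(ppjc_pdet=12, npjc_pdet=3,
                                    ppjc_pgen=10, npjc_pgen=1)
print(counts.pjcdiff)  # 4
```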

A consistent pointer justification count indicates clock synchronization problems between nodes. A difference between the counts means the node that transmits the original pointer justification has timing variations with the node that detects and transmits this count. Positive pointer adjustments occur when the frame rate of the SPE is too slow in relation to the rate of the STS-1.

Monitor Synchronization Performance

Pointer Justification Counts (PJCs) record the pointer activity at Synchronous Transport Signal level 1 (STS-1) and Virtual Tributary level 1.5 (VT1.5). You can use PJCs to detect synchronization problems. PJCs also help you to troubleshoot payload jitter and wander degradation. When a network is not synchronized, jitter and wander occur on the transported signal.

ONS 15454 defines these two PJCs:

  • PJC-Det—The number of incoming pointer adjustments.

  • PJC-Gen—The number of outgoing pointer adjustments.

Two numbers are used because of a possible mismatch due to internal buffers. Internal buffers absorb a certain number of pointer adjustments. Buffers attenuate wander in the network.

Here are some guidelines to interpret these numbers:

  • You can infer the occurrence of wander attenuation if PJC-Det is non-zero and PJC-Gen is 0 or lower than PJC-Det.

  • You can identify the presence of a synchronization problem upstream in the network if PJC-Det is non-zero and PJC-Gen is non-zero and roughly equal to PJC-Det. This problem is not local.

  • You can identify the occurrence of a synchronization problem between this node and the node directly upstream if PJC-Gen is significantly greater than PJC-Det.
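
These guidelines can be expressed as a small Python sketch. The document does not quantify "roughly equal" or "significantly greater", so the `rel_tol` and `significant_ratio` parameters are assumed values chosen only for illustration, and the function name is hypothetical.

```python
def interpret_pjc(det: int, gen: int,
                  rel_tol: float = 0.10,
                  significant_ratio: float = 2.0) -> str:
    """Classify a detected/generated pointer justification count pair.

    rel_tol and significant_ratio are assumed thresholds, not values
    taken from the ONS 15454 documentation.
    """
    if det == 0 and gen == 0:
        return "no pointer activity"
    # Both non-zero and roughly equal: the problem is further upstream.
    if det > 0 and gen > 0 and abs(det - gen) <= rel_tol * max(det, gen):
        return "synchronization problem upstream (not local)"
    # Fewer adjustments leave the node than arrive: buffers absorb wander.
    if det > 0 and gen < det:
        return "wander attenuation"
    # Far more adjustments generated than detected: problem is adjacent.
    if gen > det and gen >= significant_ratio * det:
        return "synchronization problem with the directly upstream node"
    return "inconclusive"


print(interpret_pjc(100, 0))
print(interpret_pjc(100, 98))
print(interpret_pjc(10, 500))
```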

Several thresholds are defined for PJCs. When the thresholds are crossed, Threshold Crossing Alarms (TCAs) are generated. This table lists these TCAs:

TCA | Description
T-PJ-DET | Pointer Justification Detected
T-PJ-DIFF | Pointer Justification Difference
T-PJ-GEN | Pointer Justification Generated
T-PJNEG | Negative Pointer Justification
T-PJNEG-GEN | Negative Pointer Justification Generated
T-PJPOS | Positive Pointer Justification
T-PJPOS-GEN | Positive Pointer Justification Generated

Troubleshoot Timing Alarms

The table in this section defines synchronization-related events, alarms, and conditions that help you monitor and troubleshoot synchronization issues. Some alarms are more important than others. Repeated occurrence of alarms or conditions warrants further investigation.

Alarm | Description | Severity | Alarm Information
EQPT FAIL | Equipment Failure | CR, SA | This alarm indicates equipment failure for the indicated slot. See the EQPT FAIL Alarm section for more information.
FRNGSYNC | Free-running Synchronization Mode | NA, NSA | The reference in this alarm is the internal Stratum 3 clock. See the Internal (Free-Running) Synchronization section for more information.
FSTSYNC | Fast-start Synchronization Mode | NA, NSA | TCC chooses a new timing reference to replace the previous failed reference. The FSTSYNC alarm usually clears after approximately 30 seconds. See the Fast-start Sync (FSTSYNC) Alarm section for more information.
HLDOVRSYNC | Holdover Synchronization Mode | MJ, SA for Release 4.5; NA, NSA for Release 4.1 | This alarm indicates a loss of the primary or secondary timing reference. The TCC uses the formerly acquired reference. See the Holdover (HLDOVRSYNC) Alarm section for more information.
LOF (BITS) | Loss of Frame (BITS) | MJ, SA | This alarm indicates that the TCC loses frame delineation in the incoming data from BITS.
LOS (BITS) | Loss of Signal (BITS) | MJ, SA | This alarm occurs when the BITS clock or the connection to the BITS clock fails.
MANSWTOINT | Manual Switch To Internal Clock | NA, NSA | This condition occurs if you manually switch the NE timing source to the internal timing source.
MANSWTOPRI | Manual Switch To Primary Reference | NA, NSA | This condition occurs if you manually switch the NE timing source to the primary timing source.
MANSWTOSEC | Manual Switch To Second Reference | NA, NSA | This condition occurs if you manually switch the NE timing source to the secondary timing source.
MANSWTOTHIRD | Manual Switch To Third Reference | NA, NSA | This condition occurs if you manually switch the NE timing source to the third timing source.
SWTOPRI | Synchronization Switch to Primary Reference | NA, NSA | This condition occurs when the TCC switches to the primary timing source.
SWTOSEC | Synchronization Switch to Secondary Reference | NA, NSA | This condition occurs when the TCC switches to the secondary timing source.
SWTOTHIRD | Synchronization Switch to Third Reference | NA, NSA | This condition occurs when the TCC switches to the third timing source.
SYNC-FREQ | Synchronization Reference Frequency Out Of Bounds | NA, NSA | This condition is reported against any reference that is out of the bounds for valid references.
SYNCPRI | Loss of Timing on Primary Reference | MN, NSA | This alarm occurs when the primary timing source fails, and timing switches to the secondary timing source. The switch to the secondary timing source also triggers the SWTOSEC alarm.
SYNCSEC | Loss of Timing on Secondary Reference | MN, NSA | This alarm occurs when the secondary timing source fails, and timing switches to the third timing source. The switch to the third timing source also triggers the SWTOTHIRD alarm.
SYNCTHIRD | Loss of Timing on Third Reference | MN, NSA | This alarm occurs when the third timing source fails. If SYNCTHIRD occurs when the internal reference is the source, check whether the TCC card has failed. Thereafter either FRNGSYNC or HLDOVRSYNC is reported.

Note: CR = Critical, MJ = Major, MN = Minor, SA = Service Affecting, NA = Not Alarmed, NSA = Not Service Affecting

The next sections describe four of the alarms and conditions listed in the table above in more detail.

EQPT FAIL Alarm

Software releases 3.2 and later contain a new feature to monitor the standby TCC. This feature helps you identify the presence of a hardware problem. The active TCC collects frequency data from the standby TCC, and evaluates the results every 40 seconds. If one TCC reports a synchronized signal, and the other TCC reports an OOS signal, the active TCC interprets this as a TCC hardware failure. In such a situation, the active TCC issues an EQPT FAIL alarm. If the active TCC detects an OOS signal, the TCC is automatically reset.

Holdover (HLDOVRSYNC) Alarm

Holdover occurs when a clock loses external references, but continues to use reference information acquired during normal operation. Holdover refers to a failover state after a system clock continuously locks and synchronizes to a more accurate reference for more than 140 seconds. In other words, the clock “holds” the original operating parameters for a predefined period. The holdover frequency begins to drift over time, particularly when the “holdover period” expires. Holdover occurs when:

  • The external BITS timing reference fails.

  • The optical line timing reference fails.

Holdover frequency refers to a measure of the performance of a clock while in holdover mode. The holdover frequency offset for Stratum 3 is 50 x 10^-9 initially (the first minute), and an additional 40 x 10^-9 for the next 24 hours.

Holdover mode continues indefinitely until a better reference is available again. If the system tracks the active reference for less than 140 seconds before the system loses the reference, the system goes into Free-Running mode. Typically, the TCC, with its Stratum 3 enhanced phase-locked loop circuitry, holds the clock reference for over 17 hours before the first slip occurs. If the holdover frequency value is corrupt, the ONS 15454/327 switches to Free-Running mode.
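
The choice between Holdover and Free-Running mode described above can be sketched in Python. The function and constant names are hypothetical. The document does not state what happens at exactly 140 seconds, so the sketch assumes that case falls back to Free-Running mode.

```python
HOLDOVER_QUALIFY_SECONDS = 140  # tracking time quoted in this section


def fallback_mode(tracked_seconds: float,
                  holdover_value_valid: bool = True) -> str:
    """Return the mode entered when all timing references are lost."""
    # Assumption: exactly 140 seconds is treated as not yet qualified.
    if tracked_seconds > HOLDOVER_QUALIFY_SECONDS and holdover_value_valid:
        return "holdover"
    return "free-running"


print(fallback_mode(3600))                              # holdover
print(fallback_mode(90))                                # free-running
print(fallback_mode(3600, holdover_value_valid=False))  # free-running
```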

Internal (Free-Running) Synchronization

The ONS 15454 has an internal clock in the TCC that tracks a higher quality reference, or in the event of node isolation, provides holdover timing or a free-running clock source. The internal clock is a certified Stratum 3 clock with enhanced capabilities that match the Stratum 3E specifications for:

  • Free-run accuracy

  • Holdover frequency drift

  • Wander tolerance

  • Wander generation

  • Pull-In and Hold-In

  • Reference locking/Settling time

  • Phase transient (tolerance and generation)

Fast-start Sync (FSTSYNC) Alarm

This alarm occurs when the TCC enters into Fast-start Synchronization mode and attempts to lock in with the new reference. This issue often occurs due to the failure of a previous timing reference. The FSTSYNC alarm disappears after approximately 30 seconds. The system clock locks into the new reference. If the alarm does not clear or the alarm recurs continuously, you must check for signal corruption of the incoming reference.

During the manufacturing process, the TCC is calibrated to a Stratum 1 clock source. The calibration information is stored on the TCC flash. When you first power up, the TCC loads the calibration database. The TCC then collects 30 seconds of incoming reference data, and compares the data with the local TCC database. If the difference exceeds 4 ppm, the TCC automatically enters Fast-start Synchronization mode. In Fast-start Synchronization mode, the TCC attempts to quickly synchronize the system clock to the incoming clock.

When the TCC achieves synchronization, the TCC collects 30 seconds of post-qualification data. Synchronization can take a few minutes, based on the extent of the clock variation. The TCC uses the post-qualification data to verify successful synchronization. Thereafter, the TCC proceeds with normal operation. When a distorted input signal is received, the TCC reports continual mismatches in the clock data. These reports result in an infinite cycle within Fast-start Synchronization mode.
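
The 4 ppm qualification check can be sketched in Python. The sketch assumes that both the 30 seconds of reference data and the stored calibration reduce to a single fractional frequency offset in parts per million; the document does not describe the actual comparison, and the function names are hypothetical.

```python
FAST_START_THRESHOLD_PPM = 4.0  # threshold quoted in this section


def offset_ppm(measured_hz: float, nominal_hz: float) -> float:
    """Fractional frequency offset in parts per million."""
    return (measured_hz - nominal_hz) / nominal_hz * 1e6


def needs_fast_start(reference_offset_ppm: float,
                     calibrated_offset_ppm: float,
                     threshold_ppm: float = FAST_START_THRESHOLD_PPM) -> bool:
    """Return True when the reference differs from calibration by more
    than the threshold, which triggers Fast-start Synchronization mode."""
    return abs(reference_offset_ppm - calibrated_offset_ppm) > threshold_ppm


# A 1.544 MHz reference that runs 10 Hz fast is about 6.5 ppm off.
ref_ppm = offset_ppm(1_544_010, 1_544_000)
print(round(ref_ppm, 2), needs_fast_start(ref_ppm, 0.0))  # 6.48 True
```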
