
Synchronous Optical NETwork (SONET)

Troubleshooting Bit Error Rate Errors on SONET Links

Document ID: 16149

Updated: Sep 12, 2005

Introduction

This document explains bit interleaved parity (BIP-8) checks on frames that a packet over SONET (POS) router interface transmits.

Prerequisites

Requirements

Cisco recommends that you have knowledge of these topics:

  • SONET (Synchronous Optical NETwork).

  • GSR (Gigabit Switch Router).

  • ESR (Edge Services Router).

Components Used

This document is not restricted to specific software and hardware versions.

The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, make sure that you understand the potential impact of any command.

Conventions

Refer to Cisco Technical Tips Conventions for more information on document conventions.

Background Information

When the number of BIP errors crosses a threshold that you can configure, the router reports log messages similar to this:

Feb 22 08:47:16.793: %LINEPROTO-5-UPDOWN: Line protocol on Interface POS3/0, 
changed state to down 
Feb 22 08:47:16.793: %OSPF-5-ADJCHG: Process 2, Nbr 12.122.0.32 on POS3/0 
from FULL to DOWN, Neighbor Down 
Feb 22 08:48:50.837: %SONET-4-ALARM:  POS3/0: SLOS 
Feb 22 08:48:52.409: %LINK-3-UPDOWN: Interface POS3/0, changed state to down 
Feb 22 08:50:47.845: %SONET-4-ALARM:  POS3/0: B1 BER exceeds threshold, 
TC alarm declared 
Feb 22 08:50:47.845: %SONET-4-ALARM:  POS3/0: B2 BER exceeds threshold, 
TC alarm declared 
Feb 22 08:50:47.845: %SONET-4-ALARM:  POS3/0: B3 BER exceeds threshold, 
TC alarm declared 
Feb 22 08:50:52.922: %SONET-4-ALARM:  POS3/0: SLOS cleared 
Feb 22 08:50:54.922: %LINK-3-UPDOWN: Interface POS3/0, changed state to up

This document provides tips on how to troubleshoot threshold-crossing (TC) bit error rate (BER) alarms.

BIP-8 Bytes in SONET Overhead

SONET is a protocol that uses an architecture of layers: section, line and path. Each layer adds some number of overhead bytes to the SONET frame, as illustrated here:

Layer            | Column 1                 | Column 2                 | Column 3                   | Path Overhead
Section Overhead | A1 Framing               | A2 Framing               | J0/Z0 Section Trace/Growth | J1 Trace
                 | B1 BIP-8                 | E1 Orderwire             | F1 User                    | B3 BIP-8
                 | D1 Data Com              | D2 Data Com              | D3 Data Com                | C2 Signal Label
Line Overhead    | H1 Pointer               | H2 Pointer               | H3 Pointer Action          | G1 Path Status
                 | B2 BIP-8                 | K1                       | K2                         | F2 User Channel
                 | D4 Data Com              | D5 Data Com              | D6 Data Com                | H4 Indicator
                 | D7 Data Com              | D8 Data Com              | D9 Data Com                | Z3 Growth
                 | D10 Data Com             | D11 Data Com             | D12 Data Com               | Z4 Growth
                 | S1/Z1 Sync Status/Growth | M0 or M1/Z2 REI-L/Growth | E2 Orderwire               | Z5 Tandem Connection

Importantly, each layer uses a single, interleaved parity byte to provide error monitoring across a particular segment, along the end-to-end SONET path. This parity byte is known as BIP-8, which is an abbreviation for bit interleaved parity. BIP-8 performs an even-parity check on the previous Synchronous Transport Signal level 1 (STS-1) frame.

During the parity check, the first bit of the BIP-8 field is set so that the total number of ones in the first bit of all octets of the previously scrambled STS-1 frame is an even number. The second bit of the BIP-8 field is used exactly the same way, except that this bit performs a check on the second bits of each octet, and so on.
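The per-bit even-parity rule described above is equivalent to an XOR of all covered octets. The following Python lines are a minimal sketch of that calculation, not the framer implementation. The function names and the 4-octet sample block are hypothetical, and a real framer computes the value over the frame region defined for B1, B2, or B3.

```python
from functools import reduce
from operator import xor


def bip8(octets: bytes) -> int:
    """Return the even-parity BIP-8 byte for a block of octets.

    Bit n of the result makes the count of ones in bit position n,
    taken across all covered octets plus the BIP-8 byte, an even number.
    XOR-ing all covered octets together produces exactly that byte.
    """
    return reduce(xor, octets, 0)


def bip8_errors(received: bytes, received_bip: int) -> int:
    """Count the bit positions (0 to 8) where the received BIP-8 disagrees."""
    return bin(bip8(received) ^ received_bip).count("1")


# Hypothetical 4-octet block that stands in for the covered part of a frame.
covered = bytes([0xB0, 0x71, 0x01, 0xC4])
parity = bip8(covered)

# Flip the lowest bit of the first octet to imitate a single bit error.
corrupted = bytes([covered[0] ^ 0x01]) + covered[1:]
clean_count = bip8_errors(covered, parity)
error_count = bip8_errors(corrupted, parity)
```

In this sketch, `clean_count` is 0 and `error_count` is 1, because the single flipped bit changes the parity of exactly one bit position.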

The Bellcore GR-253 standard for SONET networks defines the bytes over which a particular parity error is calculated. This table describes the portion of the SONET frame that a particular BIP byte covers:

Byte | Portion of Frame Covered | Span Monitored | Error Indication
B1 | Entire frame, after scrambling. | Monitors bit errors between two adjacent STEs (Section Terminating Equipment), such as a regenerator. | Differences indicate the occurrence of section-level bit errors.
B2 | Line overhead and synchronous payload envelope (SPE) (including path overhead and payload), before scrambling. | Monitors bit errors between two adjacent LTEs (Line Terminating Equipment), such as an Add/Drop Multiplexer (ADM) or DCS. | Differences indicate the occurrence of line-level bit errors.
B3 | SPE (including path overhead and payload), before scrambling. | Monitors bit errors between two adjacent Path Terminating Equipments (PTEs), such as two router POS interfaces. | Differences indicate the occurrence of path-level bit errors.

When Do Particular BIP Errors Occur?

Under some conditions, the output of the show controllers pos command reports only one level of BIP errors. The reason is that the reported BIP errors vary depending on where the code violation or bit flip actually occurs. In other words, parity bytes monitor and detect errors over different parts of a SONET frame. A BIP error can occur anywhere in the frame.

This diagram illustrates a typical SONET network:

biterrorrate_16149.gif

When you connect two router POS interfaces point to point, over a dense wavelength division multiplexing (DWDM) link without intermediate SONET or Synchronous Digital Hierarchy (SDH) equipment, all three BIP mechanisms monitor the same segment, and typically detect the same errors. However, in this configuration, B2 typically provides the most accurate bit error count because B2 offers the finest detection resolution.

An increment in B1 and B2 errors without an increment in B3 errors is statistically improbable. This condition occurs only if the errors affect parts of the frame that the B3 byte does not monitor. Recall that the B3 byte covers the path overhead and payload section.

An increment in B3 errors points to a corrupt SPE or payload portion. The path overhead does not change until a remote PTE terminates the SONET frame. ADMs and regenerators do not terminate the path overhead and should not report B3 errors. Thus, a condition in which only B3 errors increase indicates that either the local or remote router interface corrupts the path overhead or payload.

In addition, because the B3 check covers the longest span, the chance of bit flips is greater. Typically, the end-to-end path spans a few monitored segments between LTEs. The B2 parity check monitors each of these segments.

SONET interfaces should not report an increase in BIP errors during a loss of signal or loss of frame alarm condition. However, a burst of B1 errors can occur during the time the interface takes to declare the alarm. This burst can last for up to 10 seconds, which is the interval at which the line cards in the Cisco 12000 and 7500 router series report statistics to the central route processor.

In addition, you must understand that BIP errors have different error detection resolutions, which are explained here:

  • B1: B1 can detect up to eight parity errors per frame. This level of resolution is not acceptable at OC-192 rates. On links with high error rates, an even number of errors in the same bit position can elude the parity check.

  • B2: B2 can detect a far higher number of errors per frame. The exact number increases as the number of STS-1s (or STM-1s) increases in the SONET frame. For example, an OC-192/STM-64 produces a 192 x 8 = 1536 bit-wide BIP field. In other words, B2 can count up to 1536 bit errors per frame. There is considerably less chance of an even-numbered error that eludes the B2 parity calculation. B2 offers superior resolution when compared to B1 or B3. Therefore, a SONET interface can report only B2 errors for a particular monitored segment.

  • B3: B3 can detect up to eight parity errors in the entire SPE. This number produces acceptable resolution for a channelized interface because, for example, each STS-1 in an STS-3 has its own path overhead and B3 byte. However, this number produces poor resolution over concatenated payloads in which a single set of path overhead must cover a relatively large payload frame.

    Note: When you initiate an IOS reload or a microcode reload, the POS interface is reset, and so is the framer. The reset downloads the microcode on the interface again. In some cases, this process can generate a small burst of bit errors.
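The detection limits listed above can be illustrated with a short Python sketch. It assumes an all-zero 810-octet buffer as a stand-in for one STS-1 frame (9 rows by 90 columns), and it compares the parity of the sent and received data directly, which presumes that the BIP byte itself arrives intact. The helper names are hypothetical.

```python
def bip8(octets):
    """XOR all octets together, which equals even parity per bit position."""
    parity = 0
    for octet in octets:
        parity ^= octet
    return parity


def detected_errors(sent, received):
    """Count the BIP-8 bit positions whose parity differs after transit."""
    return bin(bip8(sent) ^ bip8(received)).count("1")


def b2_field_bits(sts1_count):
    """Width of the B2 BIP field: one 8-bit B2 byte per STS-1 in the frame."""
    return 8 * sts1_count


sent = bytes(810)  # all-zero stand-in for one 810-octet STS-1 frame

one_flip = bytearray(sent)
one_flip[10] ^= 0x01             # single bit error

same_position = bytearray(sent)
same_position[10] ^= 0x01        # two errors in the same bit position
same_position[500] ^= 0x01       # cancel out in the parity calculation

different_positions = bytearray(sent)
different_positions[10] ^= 0x01  # two errors in different bit positions
different_positions[500] ^= 0x02  # are both counted

results = (
    detected_errors(sent, one_flip),
    detected_errors(sent, same_position),
    detected_errors(sent, different_positions),
)
oc192_b2_bits = b2_field_bits(192)
```

Here `results` evaluates to `(1, 0, 2)`: the pair of errors in the same bit position goes undetected. `oc192_b2_bits` is 1536, which matches the OC-192 figure given for B2. A wider BIP field spreads the frame over more parity bits, so two errors are less likely to land on the same one.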

BER

The BER is the ratio of detected bit errors to the total number of bits transmitted. In order to calculate this value, divide the number of BIP errors counted over a period of time by the total number of bits transmitted in that same period.
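As a rough worked example of this ratio, the Python sketch below estimates a BER from a BIP error count, a line rate, and an observation interval. The function name, the 1500-error count, and the 60-second interval are hypothetical. The sketch also assumes that the threshold that the CLI prints as 10e-6 means a rate of 10^-6. The BIP count is only an approximation of the true number of bit errors, because some errors can cancel out in the parity calculation.

```python
# Nominal SONET line rates in bits per second.
LINE_RATE_BPS = {
    "OC-3": 155_520_000,
    "OC-12": 622_080_000,
    "OC-48": 2_488_320_000,
    "OC-192": 9_953_280_000,
}


def estimate_ber(bip_errors, line_rate_bps, interval_seconds):
    """Estimate the BER as counted BIP errors divided by bits transmitted."""
    total_bits = line_rate_bps * interval_seconds
    return bip_errors / total_bits


# Hypothetical sample: 1500 B2 errors counted over 60 seconds on an OC-48 link.
ber = estimate_ber(1500, LINE_RATE_BPS["OC-48"], 60)

# Assumed meaning of the default TCA threshold that the CLI shows as 10e-6.
TCA_THRESHOLD = 1e-6
exceeds_threshold = ber > TCA_THRESHOLD
```

For this sample the estimate is close to 1.0e-8, which is below the assumed threshold, so `exceeds_threshold` is `False`.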

Set BER Thresholds

POS interfaces use the BER to determine whether a link is reliable. The interface changes the state to down if the BER exceeds a threshold that you can configure.

All three SONET layers use a default BER threshold of 10e-6. In this notation, the value denotes a rate of 10^-6, or one errored bit per million bits. The show controllers pos command displays the current values.

RTR12410-2#show controllers pos 6/0
POS6/0 
SECTION 
  LOF = 0    LOS    = 2                         BIP(B1) = 63 
LINE 
 AIS = 0     RDI    = 1          FEBE = 1387    BIP(B2) = 2510 
PATH 
  AIS = 0    RDI    = 1          FEBE = 17      BIP(B3) = 56 
  LOP = 2    NEWPTR = 0          PSE  = 0       NSE     = 0 
Active Defects: None 
Active Alarms:  None 
Alarm reporting enabled for: SF SLOS SLOF B1-TCA B2-TCA PLOP B3-TCA 
Framing: SONET 
APS 
  COAPS = 8          PSBF = 1 
  State: PSBF_state = True 
  ais_shut = FALSE 
  Rx(K1/K2): 00/00  S1S0 = 00, C2 = CF 
  Remote aps status working; Reflected local aps status non-aps 
CLOCK RECOVERY 
  RDOOL = 0 
  State: RDOOL_state = False 
PATH TRACE BUFFER : STABLE 
  Remote hostname : 12406-2 
  Remote interface: POS2/0 
  Remote IP addr  : 48.48.48.6 
  Remote Rx(K1/K2): 00/00  Tx(K1/K2): 00/00 
BER thresholds:  SF = 10e-3  SD = 10e-6 
TCA thresholds:  B1 = 10e-6  B2 = 10e-6  B3 = 10e-6

Use the pos threshold command to adjust the threshold values from the defaults.

router(config-if)#pos threshold ? 
  b1-tca  B1 BER threshold crossing alarm 
  b2-tca  B2 BER threshold crossing alarm 
  b3-tca  B3 BER threshold crossing alarm 
  sd-ber  set Signal Degrade BER threshold 
  sf-ber  set Signal Fail BER threshold

Signal failure (SF) BER and signal degrade (SD) BER are sourced from B2 BIP-8 error counts (as is B2-TCA). However, SF-BER and SD-BER feed into the automatic protection switching (APS) machine, and can lead to a protection switch (if you have configured APS).

B1 BER Threshold Crossing Alert (B1-TCA), B2-TCA, and B3-TCA only print a log message to the console if you have enabled reports for them.

Report BIP Errors

The pos report {b1-tca | b2-tca | b3-tca} command allows you to configure the SONET alarms that you want to report. A router commonly reports TC alarms when the router declares a path-level or line-level alarm.

This sample output shows how a POS interface on a Cisco router reports a high BER.

Aug  7 04:32:41 BST: %SONET-4-ALARM:  POS4/6: B1 BER exceeds threshold, 
TC alarm declared 
Aug  7 04:32:41 BST: %SONET-4-ALARM:  POS4/6: B2 BER exceeds threshold, 
TC alarm declared 
Aug  7 04:32:41 BST: %SONET-4-ALARM:  POS4/6: SD BER exceeds threshold, 
TC alarm declared 
Aug  7 04:32:41 BST: %SONET-4-ALARM:  POS4/6: B3 BER exceeds threshold, 
TC alarm declared 
Aug  7 04:32:44 BST: %SONET-4-ALARM:  POS4/6: SLOF cleared 
Aug  7 04:32:44 BST: %SONET-4-ALARM:  POS4/6: PPLM cleared 
Aug  7 04:32:44 BST: %SONET-4-ALARM:  POS4/6: LRDI cleared 
Aug  7 04:32:44 BST: %SONET-4-ALARM:  POS4/6: PRDI cleared 
Aug  7 04:32:46 BST: %LINK-3-UPDOWN: Interface POS4/6, changed state to up 
Aug  7 04:32:47 BST: %LINEPROTO-5-UPDOWN: Line protocol on Interface POS4/6, 
changed state to up

How Does a Router Respond to BIP Errors?

When a Cisco POS interface detects a BIP error, the interface does not discard the frame. The reason is that the BIP value carried in the current frame is the value calculated on the previous frame. To place a BIP value in the same frame that the value covers, the transmitter would need to build and buffer the entire frame before sending it. At SONET speeds, a frame is quite large and would occupy a large amount of buffer resources. Instead, the transmitter sends the frame without waiting for the parity calculation, which avoids delay and minimizes buffer requirements. The parity calculation completes after the actual transmission of the frame, and the result is carried in the next frame.

For example, the parity value for frame 100 is placed in the BIP field of frame 101.
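The one-frame offset can be sketched in Python as follows. This is a simplified model, not the framer logic: each frame is reduced to a BIP field plus a payload, the parity covers only that payload, and all names and sample bytes are hypothetical.

```python
def bip8(octets):
    """XOR all octets together, which equals even parity per bit position."""
    parity = 0
    for octet in octets:
        parity ^= octet
    return parity


def transmit(payloads):
    """Yield (bip_field, payload) frames; each BIP field covers the previous payload."""
    previous_parity = 0
    for payload in payloads:
        yield previous_parity, payload
        previous_parity = bip8(payload)


def receive(frames):
    """Return {frame index: BIP error count}, known only once the next frame arrives."""
    errors = {}
    previous_payload = None
    for index, (bip_field, payload) in enumerate(frames):
        if previous_payload is not None:
            mismatch = bip_field ^ bip8(previous_payload)
            errors[index - 1] = bin(mismatch).count("1")
        previous_payload = payload  # already handed upward before it is checked
    return errors


frames = list(transmit([b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]))

# Corrupt one bit in the payload of frame 1 while it is in transit.
bip_field, payload = frames[1]
frames[1] = (bip_field, bytes([payload[0] ^ 0x01]) + payload[1:])

errors = receive(frames)
```

In this model `errors` is `{0: 0, 1: 1}`. The error in frame 1 is visible only after frame 2 arrives with the parity value, by which time frame 1 has already been passed on. This is why the interface counts the error and does not discard the frame.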

As long as the SONET framer can maintain frame alignment, the frame is sent to the layer-2 protocol. If the layer-2 data within the frame is corrupt, the frame is dropped as a cyclic redundancy check (CRC) error.

Steps to Troubleshoot

Use these steps to troubleshoot the SONET alarms and defects that this document describes:

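  • Check the optical power levels. Ensure that the received power is within the range specified for the interface, and that short links have sufficient attenuation to avoid overloading the receiver.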

  • Ensure that bad or dirty fiber does not cause the bit errors. Complete these steps:

    1. Clean the physical fiber and the interfaces.

    2. Swap the cables.

    3. Check any patch panels.

  • Ensure proper clock settings.

  • Draw out the topology, and check for any transport devices or signal regenerators in between the two ends. Check and clean these devices also.

  • Perform hard loopback tests. Loop a single strand of fiber between the transmit and receive connectors of the interface. Then ping the IP address of the interface to ensure that the interface is capable of actual data flow. For more information, refer to Understanding Loopback Modes on Cisco Routers.

  • When you contact the Cisco Technical Assistance Center (TAC):

    1. Collect output from the show running-config command.

    2. Collect output from the show controllers pos details command. Determine the number of SONET-level bit errors.

    3. Execute the clear counters command.

    4. Wait a few minutes.

    5. Capture the output of the show controllers pos details command again for the same interface.
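When you compare captures, a small script can extract the BIP counters. The Python sketch below is one possible approach and assumes the counter layout shown in the sample show controllers pos output in this document. The function names and the second capture are hypothetical. If you clear the counters between captures, the second capture already holds the errors for the wait interval, and the delta helper applies to captures taken without a clear.

```python
import re

# Matches counters such as "BIP(B2) = 2510" in show controllers pos output.
BIP_PATTERN = re.compile(r"BIP\((B[123])\)\s*=\s*(\d+)")


def bip_counters(show_output):
    """Return a dict such as {"B1": 63, "B2": 2510, "B3": 56}."""
    return {name: int(value) for name, value in BIP_PATTERN.findall(show_output)}


def bip_deltas(before, after):
    """Return how much each BIP counter grew between two captures."""
    return {name: after[name] - before[name] for name in before if name in after}


# First capture: counter lines taken from the sample output in this document.
first_capture = """
SECTION
  LOF = 0    LOS    = 2                         BIP(B1) = 63
LINE
 AIS = 0     RDI    = 1          FEBE = 1387    BIP(B2) = 2510
PATH
  AIS = 0    RDI    = 1          FEBE = 17      BIP(B3) = 56
"""

# Second capture: hypothetical values for the same interface a few minutes later.
second_capture = """
SECTION
  LOF = 0    LOS    = 2                         BIP(B1) = 63
LINE
 AIS = 0     RDI    = 1          FEBE = 1402    BIP(B2) = 2890
PATH
  AIS = 0    RDI    = 1          FEBE = 17      BIP(B3) = 56
"""

before = bip_counters(first_capture)
after = bip_counters(second_capture)
deltas = bip_deltas(before, after)
```

With these captures, `deltas` is `{"B1": 0, "B2": 380, "B3": 0}`, which shows that only the line-level counter still increments.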

Note: A known issue with Gigabit Switch Router (GSR) POS cards is that a hard loop results in ping loss because the GSR rate-limits packets that are pushed to the Gigabit Route Processor (GRP). For more information, refer to Cisco bug ID CSCea11267 (registered customers only).

Here is a table that appears in the Cisco 10000 Series ESR Troubleshooting Guide. This table provides the steps to troubleshoot BIP TC alarms.

Alarm Type and Severity | Alarm Symptoms | Recommendation
TCA_B1, Threshold crossing alarm - B1, Minor | For alarm types TCA_B1, TCA_B2, and TCA_B3: Alarm messages appear in the CLI and logs. | In all cases, test the quality of the cables and connections.
TCA_B2, Threshold crossing alarm - B2, Minor | - | Same as TCA_B1.
TCA_B3, Threshold crossing alarm - B3, Minor | - | Same as TCA_B1.
BER_SF, Signal Fail condition, Minor | BER_SF and BER_SD alarms result in APS cutovers. | In both cases, test the quality of the cables and connections.
BER_SD, Signal degrade condition, Minor | - | You can specify these BER thresholds.

Bit Errors on ATM Interfaces

Campus ATM switches, for example, the LightStream 1010 and Catalyst 8500, do not support a command to configure the TC alarm value on ATM over SONET interfaces. These switches still log TC alarms, in messages similar to this:

Sep 19 02:21:44: %SONET-4-ALARM:  ATM11/0/0: B1 BER below threshold, 
TC alarm cleared 
Sep 19 02:21:44: %SONET-4-ALARM:  ATM11/0/0: B2 BER below threshold, 
TC alarm cleared 

Troubleshoot TC alarms on ATM switches with the same steps as on POS interfaces. Bit errors point to a physical layer problem between the ATM switch and other devices in the path.
