Channel Redundancy in the Cisco D9036-based Video Compression Platform
Updated: November 20, 2012
This white paper describes the Cisco® reference architecture of the 9036 - DCM - Statmux Compression System. We'll provide an overview of how the system behaves under normal circumstances and how it transitions during failover. We'll also describe the redundancy behavior of the solution, provide details on typical system configurations, and explain how to operate the system with the Cisco ROSA system.
The Cisco D9036® - DCM - Statmux Compression System consists of a number of platforms that need to work together seamlessly in order to have a fully operational system. This paper assumes that the reader is familiar with the following products:
• The Cisco Modular Encoding Platform D9036
• The Cisco Digital Content Manager (DCM) platform
• The Cisco ROSA Video Services Management Suite
For service providers who deliver digital content, digital television service has become the primary revenue source. With the ever-increasing demand for higher density, service uptime, and better coding efficiency and picture quality, Cisco has introduced a new encoding engine, the D9036, with a clear focus on video and audio quality, bandwidth efficiency, and 24/7 system reliability.
The Cisco reference architecture of the 9036 - DCM - Statmux Compression System provides superior video and audio quality by statistically multiplexing both high-definition (HD) and standard-definition (SD) services. In addition, the reference architecture is a fully redundant solution, with state-of-the-art redundancy mechanisms that exceed 99.999 percent ("five-nines") uptime.
Here are some important general trends in video:
• Service providers and broadcasters are looking for dense solutions as the number of channels increases daily. ("Dense" means a minimum number of hardware boxes and minimum energy consumption.)
• It's critical to have a lot of prime channels and HD channels to generate revenue.
• Modern digital TV headends should meet very high channel uptimes (99.999 percent) as a standard, but traditional one-to-many backup device redundancy schemes cannot meet these requirements.
• With greater density you need less hardware, which means fewer failures.
Cisco's solution meets these needs with the following features:
• High availability 1+1 architecture with intelligent, automatic program rerouting and protection mechanisms provided through IP routing.
• With 1:1 redundancy, Cisco can exceed 99.999 percent channel uptime.
• The combination of the Cisco Digital Content Manager (video processor) and the Cisco ROSA control system provides stream-level redundancy. In the event of a failure, only the affected services are restored; other services are not affected.
• The ROSA control system also helps to ensure the video quality and bandwidth of the complete platform, while redundancy is provided through the Digital Content Manager.
• The Cisco solution has very low impact at failover, thanks to the 1:1 operation of devices.
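To put the 99.999 percent figure in perspective, the short calculation below converts an availability target into a yearly downtime budget and shows how pairing two devices 1:1 raises availability. It is a minimal sketch that assumes independent failures and an instantaneous, always-successful switchover. The 99.9 percent single-device figure is an illustrative assumption, not a Cisco specification.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_minutes_per_year(availability: float) -> float:
    """Yearly downtime budget implied by an availability target."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def one_to_one_availability(device_availability: float) -> float:
    """Availability of a 1:1 pair that is down only when both devices are down.

    Assumes independent failures and an ideal switchover (illustrative only).
    """
    return 1.0 - (1.0 - device_availability) ** 2


if __name__ == "__main__":
    print(f"{downtime_minutes_per_year(0.99999):.2f} minutes/year at five nines")
    # Hypothetical single device at 99.9 percent availability, paired 1:1.
    print(f"{one_to_one_availability(0.999):.6f} availability for the pair")
```

Under these assumptions, a five-nines target leaves a budget of roughly five minutes of downtime per year, which is why failover speed matters as much as having a backup device at all.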
Components of the Solution
The Cisco reference architecture of the 9036 - DCM - Statmux Compression System consists of the Cisco Modular Encoding Platform D9036, the Cisco DCM Series D9900 Digital Content Manager MPEG Processor, and the Cisco ROSA System.
The Cisco Modular Encoding Platform D9036
The Cisco D9036 Modular Encoding Platform provides multi-resolution, multi-format encoding for applications requiring high levels of video quality. The modular platform is scalable to support as many as eight standard definition (SD), four high definition (HD), or other combinations of video encoders within a single rack unit, while providing excellent broadcast quality video and consuming as little as 40 watts per service.
The Cisco DCM Series D9900 Digital Content Manager MPEG Processor
The Cisco DCM Series D9900 Digital Content Manager MPEG Processor is a compact 2RU or 1RU platform capable of processing a high number of MPEG video streams. The Cisco DCM Series MPEG Processor is the next generation of intelligent headend processing equipment, a cost-effective solution that combines compactness and flexibility. Based on our experience, the DCM Series D9900 processor brings operational and economic benefits in MPEG processing applications. The optional built-in DVB scrambler allows easy integration with several conditional access (CA) systems.
The Cisco ROSA Video Services Management Suite
The Cisco ROSA Video Services Management Suite provides a complete solution for end-to-end management of digital platforms. It monitors, manages, and controls equipment and services throughout the network.
The ROSA system delivers major benefits to help you:
• Reduce complexity: Intuitive GUI for rapid service configuration, service troubleshooting, and service alarming, and on-screen network topology schematics help decrease configuration errors.
• Increase uptime: Management of service redundancy through automated backup capabilities helps reduce mean time to repair (MTTR) by providing a reroute or reconfigure service in the event of a device failure or service interruption.
• Decrease operating expenses: Extensive fault and alarm management make troubleshooting quick and easy, and GUI helps reduce costly configuration errors.
• Simplify operations: Dynamic reconfiguration keeps the network running as scheduled. The system also allows you to make manual changes to video or audio service profiles.
• Evaluate proactively: Recording and analyzing the historical performance of the network extends the mean time between failure (MTBF).
Reference Architecture for the Solution
To help ensure that you achieve high uptime, flexible management, and platform support, Cisco advises its customers to work with a solution that has a reference architecture. The reference architecture for the Cisco video compression platform is fully tested in the lab and through its use by Cisco's customer base.
Figure 1 presents the solution's fully 1:1 redundant architecture, which gives you both minimized failover times and full protection against failing devices, links, and interfaces.
Figure 1. Cisco Reference Architecture of the 9036 - DCM - Statmux Compression System
As Figure 1 shows, the system consists of 1:1 redundant Cisco D9036 multichannel, modular encoders, 1:1 Digital Content Managers, 1:1 redundant Cisco Catalyst® 4948 Switches, and a redundancy management IP network carried over the Cisco Catalyst 2960 Series Switches. Note that in this example, the Cisco Digital Content Manager also has the statmux controller integrated.
The reference architecture works with an intelligent and prioritized interpretation of various alarms reported by the system components (Cisco D9036, the Cisco IP router, and Cisco DCM). Some of the alarms call only for stream-level redundancy, which leaves the nonimpacted streams untouched and switches over only the failing streams (for example, serial digital interface [SDI] loss on the D9036 and transport stream [TS] loss on the DCM). Other alarms call for a device-level failover because they impact all streams on the device (for example, loss of both video ports on the D9036). ROSA Element Manager (EM) always makes sure the correct decision is taken, no matter which combination of alarms is raised or cleared in the system. Cisco has thoroughly optimized and tested ROSA EM together with the redundancy features on the D9036 and DCM to provide best-in-class redundancy schemes that can recover even if multiple failures occur in the system.
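The prioritized interpretation of alarms described above can be sketched as a small decision function. This is an illustration only, not ROSA EM code: the alarm names are taken from the examples in the text, while the `Scope` type, the `ALARM_SCOPE` mapping, and `plan_failover` are hypothetical names introduced for this sketch.

```python
from enum import Enum


class Scope(Enum):
    STREAM = "stream"
    DEVICE = "device"


# Alarm names follow the examples in the text; the mapping is illustrative.
ALARM_SCOPE = {
    "D9036 SDI Loss": Scope.STREAM,
    "TS loss on DCM": Scope.STREAM,
    "Loss of both video ports on D9036": Scope.DEVICE,
}


def plan_failover(active_alarms, streams_on_device):
    """Return the set of streams to switch to the backup encoder.

    active_alarms: iterable of (alarm_name, stream_id or None) tuples.
    streams_on_device: all stream IDs carried by the main encoder.
    """
    to_switch = set()
    for name, stream_id in active_alarms:
        scope = ALARM_SCOPE.get(name)
        if scope is Scope.DEVICE:
            # A device-level alarm affects every stream, so it takes priority.
            return set(streams_on_device)
        if scope is Scope.STREAM and stream_id is not None:
            to_switch.add(stream_id)
    return to_switch
```

For example, a single SDI loss on stream S2 yields only `{"S2"}`, whereas adding a loss of both video ports yields every stream on the device regardless of which other alarms are active.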
Two Types of Redundancy
The Cisco video compression platform provides two complementary types of redundancy:
• Stream-level redundancy: Only the impacted channels are switched from primary device to the backup device, which avoids the unnecessary switching of non-impacted channels.
• Device-level redundancy: Here the complete device is switched over from primary to backup device.
Under normal operation, the system behaves as shown in Figure 2.
Figure 2. Normal Operation of the Cisco Reference Architecture of the 9036 - DCM - Statmux Compression System
In this example, the video channels are encoded by a single D9036 device that is part of a single statmux pool. The output of the solution is represented in the outgoing service bit rate chart, present on the DCM output.
Stream-Level Redundancy Protection
Stream-level redundancy is based on the fact that the system protects the individual services instead of the complete device. This gives significant additional advantages with respect to outage time of services. Instead of failing over the complete device, Cisco opted to only fail over the failing services and to keep the nonimpacted channels untouched.
Figure 3 shows a typical example of the system output for a single service stream failover. As you can see, only the affected channel (shown in green) is lost, and only for a very short time. The other channels remain streaming from the main encoder. As the outgoing service bit rate chart in the figure shows, the green channel is recovered at the speed of a transport stream backup (TSBU) of the Cisco DCM.
Figure 3. Cisco Reference Architecture of the 9036 - DCM - Statmux Compression System: Stream Level Redundancy Active
How Stream-Level Redundancy Works
The Cisco Digital Content Manager is at the heart of achieving very fast failover timing. Cisco's market proven concept of transport stream backup, combined with the ROSA system, helps to guarantee the system's output signals and protect against various failure conditions.
The ROSA system is able to detect whether an individual stream or service on the D9036 is failing and, if it is, to trigger a failover of that channel or stream. As a result, the DCM executes a transport stream backup while the ROSA system updates the service state in the statmux pools to achieve the best video quality at any time during transition and recovery of the failing streams.
As Figure 4 shows, at failover time the failed channel (in this case, the green one) disappears for a short period. This period is defined by the time it takes to activate the backup (BU) service and include it as part of the statmux pool.
During this period, the ROSA system (in particular the ROSA Element Manager) updates the backup channel table list at the statmux controller, and the channel reappears when the BU service is activated. A list of failover times is included later in this document.
With its complementary redundancy systems, the Cisco video compression platform provides device-level protection by default on top of stream-level redundancy protection that can be configured. The combination of device-level and stream-level redundancy offers the customer the best protection and very fast recovery.
Device-level protection is needed if there is an error that impacts all streams delivered by the Cisco D9036 encoder. In this case, the ROSA platform makes sure that the full device is restored by making use of the backup device, effectively restoring all streams through the backup device.
Figure 5 shows a typical example of the system output for a device failover. As you can see, all the channels are lost for a very short time. The outgoing service bit rate chart shows that all channels are recovered at the speed of the transport stream backup in the DCM.
Figure 5. Cisco Reference Architecture of the 9036 - DCM - Statmux Compression System: Device Redundancy Active
How Device-Level Redundancy Works
As Figure 6 shows, at failover the failed channels (Service 1 [S1], S2, S3, and S4) disappear for a short period. This period is defined by the time it takes to activate the BU service and include it as part of the statmux pool. During this period, the ROSA system updates the backup channel table list at the statmux controller. The channels reappear when the backup (BU) service is activated. A list of failover times is included later in this document.
Using the ROSA System to Operate the Cisco DCM/D9036/Statmux Solution
Now let's turn to using the ROSA system to operate the solution. The ROSA interface and drawings in this section come from the Customer Demo Lab (CDL) at the Cisco Office in Kortrijk-Belgium. While the drawings do show specific examples, you can configure and customize the settings to meet your requirements.
Figure 7 shows the top-level dashboard, which displays the services being encoded at the output of the headend.
Figure 7. Operator Dashboard in the ROSA Network Management System (NMS)
The icons for the various service states appear in different colors, as shown in Figure 8.
Figure 8. ASSR Icon Representing Service Status
Service in Backup
To tune the system's operation, you can customize the ROSA Network Management System (NMS) Aggregated Service Status Reflection (ASSR) alerts feature.
Figure 9 shows an example of using this feature on a topology.
Figure 9. Service Dashboard Using the ROSA NMS ASSR Feature
Services State at Outputs DCMs
Services from main D9036, reflected by the DCM inputs
Services from backup D9036 reflected by the DCM inputs
Performing a Manual Failover of a Service or Device
Using the same ROSA NMS ASSR feature, you can immediately see what state a service is in. Figures 10 and 11 provide an example. In Figure 10, where stream-level backup is active, channels 1, 3, and 4 are still being retrieved from the main D9036, while channel 2 is retrieved from the backup encoder. In Figure 11, where device-level backup is being used, all streams are retrieved from the backup D9036 encoder.
Figure 10. Stream-Level Backup Active
Figure 11. Device-Level Backup Active
Using macros in ROSA, administrators can give extra flexibility to the engineers and operators who use the ROSA NMS as the management layer on top of the solution. For example, with the combination of stream-level backup through ROSA Element Manager and the macros of ROSA NMS, you are able to switch each individual stream with minimal service impact.
As shown in Figure 10 and Figure 11, the operator is able to force an individual stream to its main or backup device without affecting the other channels.
Different macros exist in the ROSA NMS portfolio. The one shown in Figure 10 and Figure 11 forces a single stream to its main or backup encoder. Cisco also has a macro to do this for all streams from a certain D9036.
Displaying a Video Compression System Topology in ROSA NMS MAPS
As Figure 12 shows, you can draw a system topology in ROSA NMS MAPS to represent a typical D9036/DCM/Statmux setup that is compliant with the Cisco reference architecture.
Figure 12. Topology View Represented by ROSA NMS
Using ROSA NMS Settings Management
ROSA NMS Settings Management allows the engineers and operators to store and restore all device settings from devices (e.g. D9036) to and from the ROSA NMS central server system.
In this way, you can quickly copy settings from one master device to another or back up all device settings so that you can restore them later (for instance, when you are replacing devices).
Figure 13 shows how to access the Backup Settings and Restore Settings options.
Figure 13. Using ROSA NMS Settings Management to Store or Restore Encoder Settings
ROSA System Configuration
To maximize operational flexibility and protection, Cisco recommends the following ROSA system configuration.
For ROSA Network Management System (NMS):
• Drivers for Cisco D9036 encoder, Cisco DCM, Cisco ROSA Element Manager (EM), and IP routers
• Aggregated Service Status Reflection (ASSR) feature to reflect service status
• MACRO component + macros for Cisco DCM, Cisco D9036, and Cisco ROSA EM to flexibly operate the solution and simplify operations
• Settings Management component to be able to save settings of all devices to central place (on ROSA NMS server)
• SI Suite (SI Editor/SI Distributor) for typical DVB related settings on the DCM
• ROSA NMS MAPs to create service dashboards, topology maps, and rack views
• ROSA EM Headend version, which is capable of running redundancy schemes for Cisco D9036 encoder, Cisco DCM, and statmux controller
For market leadership and to meet the customer's technical requirements, Cisco recommends 1:1 stream-level and device-level backup mechanisms to provide best-in-class, easy-to-manage solutions that deliver high availability.
Customers who use Cisco reference architectures gain from Cisco's market-leading technology for protecting channels and devices in the dense compression and processing Cisco Digital Content Manager (DCM) base platform. This solution provides the best protection for the customer's channels with:
• Fast failover times
• No additional hardware required for audio-visual switching
• Cost-effective solution with Cisco's
• Reference architectures that make the solution easier to deploy and manage
Although this paper focuses on 1:1 reference architectures and their advantages, Cisco also supports N:M redundancy topologies. Please refer to Appendix C for details.
Appendix A: Triggers for Auto-Failover
ROSA EM Enhanced Redundancy Trigger Logic
Enhanced Redundancy Trigger Logic (ERTL) is a feature on ROSA EM that allows you to combine ROSA EM monitored alarms into new customizable alarms that you can then use as new triggers for redundancy.
As an example, you could combine an alarm called D9036 Link Loss GbE 1 and an alarm called D9036 Link Loss GbE 2 into a new ERTL alarm called D9036 Video Loss at Output. You can then enable the new alarm as a trigger for redundancy.
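The combination in this example can be pictured as a simple logical rule. The sketch below assumes the new alarm should be raised only when both link-loss alarms are active (a logical AND), which matches the "loss of both ports" wording used for the triggers in this appendix. The function name and data layout are hypothetical, and real ERTL rules are configured in ROSA EM rather than written in Python.

```python
def combine_alarms(alarm_states, inputs):
    """Derived alarm that is raised only when every input alarm is raised."""
    return all(alarm_states.get(name, False) for name in inputs)


# Input alarms for the derived "D9036 Video Loss at Output" alarm.
VIDEO_LOSS_INPUTS = ("D9036 Link Loss GbE 1", "D9036 Link Loss GbE 2")

# One link down: the redundant link still carries video, so no trigger.
states = {"D9036 Link Loss GbE 1": True, "D9036 Link Loss GbE 2": False}
one_link_down = combine_alarms(states, VIDEO_LOSS_INPUTS)

# Both links down: the derived alarm is raised and can trigger redundancy.
states["D9036 Link Loss GbE 2"] = True
both_links_down = combine_alarms(states, VIDEO_LOSS_INPUTS)
```

Combining the alarms this way avoids a failover when only one of two redundant links is lost.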
Table 1 provides an overview of the failover triggers you can create using the ERTL feature.
Table 1. Triggers for Failover Detection
Alarm/Trigger for Failover Detection
D9036 SDI Loss
TS loss on DCM
Loss of both statmux ports on D9036: ROSA EM ERTL config required
Loss of both statmux ports on main and backup Cisco Catalyst 4948 Switch: ROSA EM ERTL config required
D9036 Device Operational Failures (DOF): ROSA EM ERTL config required
Loss of both video ports on D9036: ROSA EM ERTL config required
Loss of both video ports on main and backup Cisco Catalyst 4948 Switch: ROSA EM ERTL config required
SDI Emb. audio loss for TV service
SDI Emb. audio loss for radio service
AES loss for TV service
AES loss for Radio service
VBI related loss
Appendix B: Stream-Level Redundancy Example
This appendix details one of the many use cases and failover scenarios handled by the solution. To illustrate how stream-level redundancy works, we will zoom into the use case of stream-level redundancy for one particular failover trigger - namely, SDI Loss at the input of the encoder.
Let's revisit the topology and channel configuration shown in Figure 1. The Cisco reference architecture for this solution assumes the following:
• Layer 3-enabled, multicast-capable routing between the Cisco D9036 Encoder and the DCM, set up for PIM-SM and SSM.
• The Cisco D9036 Encoder outputs its services in SPTSs on different multicast addresses.
• DCM is set up to receive the main SPTSs over the first port-pair and the backup SPTSs over the second port-pair, with corresponding IGMPv3 join-relationships configured to enable the corresponding main and backup SPTSs.
• Transport stream backup enabled on the DCMs between main SPTSs and backup SPTSs.
• The ROSA EM stream table is correctly configured.
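The assumptions above imply a per-service mapping between a main SPTS and a backup SPTS, each identified by a source address and a multicast group. The sketch below shows one way such a mapping could be sanity-checked before deployment. It is not the ROSA EM stream table format: `SptsEntry`, `validate`, and all addresses are hypothetical and chosen only for illustration.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class SptsEntry:
    service: str
    main: tuple    # (source_ip, multicast_group) of the main SPTS
    backup: tuple  # (source_ip, multicast_group) of the backup SPTS


def validate(entries):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    seen = {}  # (source, group) -> label of the first stream using it
    for entry in entries:
        for role, (source, group) in (("main", entry.main), ("backup", entry.backup)):
            label = f"{entry.service} {role}"
            if not ipaddress.ip_address(group).is_multicast:
                problems.append(f"{label}: {group} is not a multicast group")
            if ipaddress.ip_address(source).is_multicast:
                problems.append(f"{label}: source {source} must be unicast")
            # With source-specific multicast, a stream is identified by the
            # (source, group) pair, so each pair may be used only once.
            owner = seen.setdefault((source, group), label)
            if owner != label:
                problems.append(f"{label}: channel already used by {owner}")
    return problems
```

A check like this catches a main and backup SPTS that accidentally share the same source and group, in which case the DCM would have no independent backup input to switch to.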
Under normal operation (no backup active), the streams are retrieved from a single Cisco D9036 encoder.
Figure 14 shows the service flow for a single service. As the drawing shows, the stream is processed by the main D9036, multiplexed in the Cisco DCM as part of a DCM-controlled statmux pool, and delivered at the output of the DCM.
Figure 14. Stream-Level Backup: Normal Operation (No Backup Active)
Now let's introduce a failure in the SDI signal (that is, SDI Loss) at the input of the Cisco D9036 encoder. This will result in an SDI Loss Alarm that will be caught by the ROSA EM and will start the stream-level redundancy execution. The alarm will cause the ROSA EM to stop the main SPTS, update the statmux controller main and backup channel lists, and activate the individual stream of the backup D9036. As a result, the transport stream backup feature of the DCM will make sure that the stream from the backup encoder is selected for further processing on the DCM.
Figure 15 shows the status after the failover - that is, the moment where the system is recovered. The individual failed stream is now retrieved from the backup D9036 and the DCM will be in a transport stream backup state.
ROSA EM has in the meantime updated the statmux controller, with the result that the backup channel participates in the statmux.
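The recovery sequence described in this appendix (stop the main SPTS, update the statmux channel lists, activate the backup stream) can be modeled in a few lines. This is a simplified sketch of the ordering only: the class and function names are hypothetical, and the real actions are carried out by ROSA EM, the statmux controller, and the DCM.

```python
from dataclasses import dataclass, field


@dataclass
class StatmuxPool:
    main_channels: set = field(default_factory=set)
    backup_channels: set = field(default_factory=set)


@dataclass
class StreamState:
    main_active: bool = True
    backup_active: bool = False


def handle_sdi_loss(stream_id, streams, pool, log):
    """Apply the three recovery steps for one failing stream, in order."""
    state = streams[stream_id]
    # 1. Stop the main SPTS so the DCM sees the main input disappear.
    state.main_active = False
    log.append(f"stop main SPTS {stream_id}")
    # 2. Move the channel from the main list to the backup list in the pool.
    pool.main_channels.discard(stream_id)
    pool.backup_channels.add(stream_id)
    log.append(f"update statmux lists {stream_id}")
    # 3. Activate the stream on the backup encoder; the DCM's transport
    #    stream backup then selects it for further processing.
    state.backup_active = True
    log.append(f"activate backup SPTS {stream_id}")
```

Only the failing stream changes state in this model. The other streams stay on the main encoder and in the main channel list, which is the point of stream-level redundancy.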
Figure 15. Backup Operation: Stream Level Redundancy Active
Appendix C: N:M Device Redundancy on the Cisco Modular Encoding Platform D9036
Figure 16 shows the Cisco N:M backup topology for the encoder devices.
Figure 16. Cisco Topology for N:M Device Redundancy
Such a deployment allows for backup of multiple encoders to a backup encoder or multiple backup encoders. The ROSA EM control system transfers the settings from the primary encoders to the backup encoders in case of failover. Table 2 compares the 1:1 and N:M systems. As shown in Table 2, there is a significant difference between the 1:1 and N:M redundancy schemes with regard to failover times of the channels.
Cisco supports both the 1:1 and N:M topologies, but promotes the 1:1 topology with stream-level failover for its outstanding performance in granularity of failover and outstanding failover times. For N:M operations, stream-level redundancy is not supported.
Table 2. Comparison Between 1:1 and N:M Systems
1:1 system | N:M system
Very fast switchover times | Failover times are longer, depending on the configuration
No A/V router required | A/V router required
DCM performs stream-level redundancy between the encoders | Cost-effective solution for large systems
Special pricing for backup licenses makes the configuration commercially attractive | Cost-effective solution for large systems
Maintenance switches are performed with minimal switch time. (Maintenance failover is an action that the operator performs using the ROSA NMS.) | Maintenance switches are performed with minimal switch time
Optimizing failover times for N:M operations depends on the complexity involved in reconfiguring the backup device that gets the settings of the main device at failover.
Cisco provides a best-practice document on how to set up the Cisco Modular Encoding Platform D9036 so it has minimized failover times. The combination of the homogeneous mode and N:M grouped configurations, a ROSA EM feature, keeps the failover times to a minimum. Please refer to the Service Note "Service Note - D9036 Settings File Transfer Consideration" for details on N:M operations of the Cisco D9036 encoder.
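The N:M behavior described in this appendix, where a shared backup encoder receives the settings of a failed primary, can be sketched as a small allocation routine. The names below are hypothetical and the settings are plain dictionaries. The sketch only illustrates why the settings transfer sits on the failover path in an N:M scheme.

```python
def assign_backup(failed_primary, primary_settings, free_backups, assignments):
    """Allocate a free backup encoder and hand it the failed primary's settings.

    Returns the backup name, or None when the backup pool is exhausted.
    """
    if failed_primary in assignments:
        return assignments[failed_primary][0]  # already protected
    if not free_backups:
        return None
    backup = free_backups.pop(0)
    # The backup must be reconfigured with the primary's settings at failover;
    # the text attributes the longer N:M failover times to this step.
    assignments[failed_primary] = (backup, dict(primary_settings[failed_primary]))
    return backup
```

With N primaries sharing M backups, the pool can run out: once every backup is assigned, a further failure returns `None`, which is one of the trade-offs against the 1:1 scheme.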