Cisco CDS Topology
In the Cisco CDS, Service Engines are grouped into locations, and a Location Tree is a set of locations organized in the form of a tree. The Location Tree represents the network topology configuration based on parent-child relationships. Locations are well connected and have similar connectivity properties to the outside world. A location generally implies topological proximity. Each location can have zero or one parent location and zero or more child locations. These relationships guide how content flows among locations but do not restrict content flow in any direction.
Locations are also classified into tiers. All locations with no parent belong to Tier 1. All locations that are children of Tier 1 locations belong to Tier 2, and so on.
The Cisco CDS can consist of one or more topological Location Trees. A Cisco CDS network is limited to a maximum depth of four tiers.
Figure 2-1 illustrates two location trees, with the parent-child relationship of each location indicated by a solid line and each tier indicated by a dotted line.
Figure 2-1 Location Trees Example
The Location Trees define preferred distribution routes. The Tier 1 locations are located closest to the Internet or backbone. Tier 1 locations can communicate with all other Tier 1 locations.
Note The CDS does not support network address translation (NAT) configuration, where one or more CDEs are behind the NAT device or firewall. The workaround for this, if your CDS network is behind a firewall, is to configure each internal and external IP address pair with the same IP address.
The CDS does support clients that are behind a NAT device or firewall that have shared external IP addresses. In other words, there could be a firewall between the CDS network and the client device. However, the NAT device or firewall must support RTP/RTSP.
Device groups offer a way to group similar devices and configure all the devices in a group at one time. Service Engines can be assigned to multiple device groups when the Device Group Overlap feature is enabled.
A device in a device group can have individual settings different from other devices in the group, and its settings can revert back to the group settings. The last configuration submitted for the device, whether group or individual, is the configuration the device uses.
In addition to group configuration and assignment, the CDSM allows the following:
Hiding configuration pages of a device group
Adding all newly activated devices to a device group
Forcing device group settings onto all devices assigned to a group
A device can be assigned to a device group in one of two ways:
From the Device Assignment page
From the Device Group Assignment page
A baseline group is a special type of device group that denotes a group of devices for a particular service. There are three baseline groups:
Web Baseline Group—Used for web-based content
Video Baseline Group—Used for video content
Platform Baseline Group—Used for platform-specific configurations
A device group can be configured as a baseline group. A device can be assigned to a baseline group in the following three ways:
From the Devices home page.
From the Device Assignment page.
From the Device Group Assignment page.
A delivery service is a configuration that defines how content is acquired, distributed, and stored in advance of a client request (prefetch), and after a client request (cached). Content from a single origin server is mapped to a set of Service Engines by a delivery service. Content objects associated with a specific delivery service have a common domain name; in other words, the content in a specified delivery service resides in a single location on an origin server. Each delivery service maps service routing domain names to origin servers one-to-one for Service Router DNS interception.
The CDSM is used to create the topology and configure the delivery services. All Service Engines and Service Routers that register with the CDSM are populated with the topology and the information about the configured delivery services.
The designated Content Acquirer is the only role that is administratively defined in the CDSM; all other roles are assumed by the Service Engines automatically, based on the topology and delivery service subscription.
Both prefetched content and on-demand (dynamic and hybrid) content caching are supported. Different algorithms are used to elect the Service Engines for the various roles based on the type of content being distributed.
For each delivery service, there is only one Content Acquirer but multiple Service Engines. The location that has the Content Acquirer for a delivery service is called the root location. Other Service Engines in the root location that are assigned to the same delivery service can act as backup Content Acquirers if the configured Content Acquirer fails.
Note The locations can be virtual. For example, a location can consist of the enterprise data center and the backup data center. The SEs in both the data center and the backup data center can be backup Content Acquirers for each other.
For Content Acquirer redundancy, a delivery service must have at least two SEs located in the root location. If the primary Content Acquirer fails or becomes overloaded, the SEs in the delivery service use the selected backup Content Acquirer (there could be several SEs assigned to the delivery service that are colocated at the root location).
Content Acquirer Selection for Prefetched Content
For prefetched content, the designated Content Acquirer always performs the content acquisition. Only in the event of a failure does another Service Engine in the same location assume the Content Acquirer role.
The selection algorithm runs in every Service Engine in the root location (also known as the Content Acquirer location). The algorithm always runs in the context of a delivery service; that is, only the Service Engines subscribed to the same delivery service are considered in the selection.
Each Service Engine creates an ordered list of Service Engines belonging to the same location and subscribed to the same delivery service. In the root location, the designated Content Acquirer is always added as the first entry in the list.
At steady state when there are no failures, the designated Content Acquirer performs the content acquisition. Each Service Engine in the delivery service gets the content and metadata from the Content Acquirer by way of forwarder Service Engines and receiver Service Engines. Every Service Engine polls its forwarder Service Engine periodically for content and metadata. For more information, see the “Forwarder and Receiver Service Engines” section.
If the Content Acquirer fails, the periodic polls for metadata fail, causing the Service Engines to run the Content Acquirer election algorithm.
Each Service Engine creates the ordered list again. The list looks the same as the previous list, except that the Content Acquirer that just failed is not considered in the election process. The Service Engine that appears second in the ordered list now assumes the role of the Content Acquirer.
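The election described above can be sketched in a few lines. This is an illustrative model only, not the actual CDS implementation; the function and SE names are assumptions.

```python
# Hypothetical sketch of the Content Acquirer election for prefetched
# content. Names and data structures are illustrative assumptions.

def elect_content_acquirer(designated, peers, failed):
    """Return the acting Content Acquirer for a delivery service.

    designated -- the administratively configured Content Acquirer
    peers      -- other SEs in the root location subscribed to the
                  same delivery service, in their stable list order
    failed     -- set of SEs currently known to be unreachable
    """
    # The designated Content Acquirer is always the first list entry.
    ordered = [designated] + [se for se in peers if se != designated]
    # Failed SEs are not considered; the first survivor wins, so at
    # steady state the designated Content Acquirer is elected, and on
    # its failure the second entry in the list takes over.
    for se in ordered:
        if se not in failed:
            return se
    return None  # no eligible SE left in the root location
```

Because every SE in the root location builds the same ordered list, each one independently arrives at the same winner without any extra coordination.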
Content Acquirer Selection for Dynamic or Hybrid Ingest
For on-demand content, which is dynamic or hybrid ingest, the designated Content Acquirer is used only to determine the location from which content is acquired directly from the origin server. All of the Service Engines in the root location are eligible to acquire the content. The Service Engine selected to acquire the content is based on a URL hash. Content acquisition and storage are spread across multiple Service Engines.
The selection algorithm runs on every Service Engine in the root location (also known as the Content Acquirer location). The algorithm always runs in the context of a delivery service; that is, only Service Engines subscribed to the same delivery service are considered in the selection.
Each Service Engine creates an ordered list of Service Engines belonging to the same location and subscribed to the same delivery service. This ordering is based on an index created by a URL hashing function. At steady state, when there are no failures, the Service Engine that appears first in the list performs the content acquisition.
In addition to the URL-based list ordering, the health and the load of the Service Engines are also considered in the selection. Service Engines that do not have the applicable protocol engine enabled, failed Service Engines, and Service Engines with load thresholds exceeded are eliminated from the selection process. If a Service Engine is eliminated from the list, the next Service Engine in the ordered list is used to acquire the content.
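The hash-based selection with health and load filtering might be modeled as in this sketch. The hashing function and all names here are illustrative assumptions; the document does not specify the actual hash the CDS uses.

```python
# Illustrative sketch (not actual CDS code) of selecting the acquiring
# SE for dynamic or hybrid ingest by URL hash, with ineligible SEs
# removed from consideration.
import hashlib

def select_acquirer(url, ses, eligible):
    """ses: SEs in the root location subscribed to the delivery service.
    eligible(se) -> bool: applicable protocol engine enabled, not
    failed, and load thresholds not exceeded."""
    # Order the SEs by an index derived from a URL hashing function, so
    # acquisition and storage spread across the SEs per URL.
    ordered = sorted(
        ses, key=lambda se: hashlib.md5((url + se).encode()).hexdigest()
    )
    # Eliminate ineligible SEs; the next SE in the ordered list is used.
    for se in ordered:
        if eligible(se):
            return se
    return None
```

Because the ordering depends only on the URL and the SE names, every SE computes the same ordered list for a given URL and agrees on the acquirer.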
All other locations (that is, non-root locations) in the delivery service have an SE designated as the location leader. The location leader is determined automatically by the CDSM. The other SEs act as backup location leaders in case the location leader fails. In the same location, different delivery services may have different SEs as their location leaders. The location leader gets the delivery service content from outside the location, while the other SEs in the location get the content from the location leader. This reduces the distribution traffic on low-bandwidth links, because the SEs in the same location are likely to be on the same LAN.
Use the show distribution forwarder-list and show distribution location location-leader-preference commands to see the location leader for a delivery service.
Location Leader Selection for Prefetched Content
The location leader selection for prefetched content is based on the same algorithm that is used for the Content Acquirer backup selection for prefetched content, except that the Service Engines are ordered based on an internal ID assigned at the time of registering to the CDSM. The first Service Engine in the list is selected. In the root location, the designated Content Acquirer is always the location leader.
Location Leader Selection for Live Streaming
For live streaming, the location leader selection is based on the program URL hash and the service availability. Each program within a delivery service could have different location leaders. Depending on the URL hash and the number of SEs in the location, some SEs could be acting as the location leader for more than one program.
Location Leader Selection for Dynamic or Hybrid Content
For on-demand content, which is dynamic ingest or hybrid ingest, the location leader selection is based on the same algorithm that is used for the Content Acquirer selection for on-demand content, with the algorithm repeated for each location. This mechanism helps distribute the load, improve cache hits, and reduce redundant content (which contributes to storage scalability). The location leader selection is very similar to how a location leader is selected for live streaming content.
Forwarder and Receiver Service Engines
Content distribution flows from the Content Acquirer to the receiver Service Engine (SE) by way of store and forward. A receiver SE does not go directly to the Content Acquirer for content. Rather, it finds out who its upstream SE (the forwarder SE) is and pulls the content from that forwarder. The forwarder SE in turn pulls the content from its own forwarder, which may be the Content Acquirer. All receiver SEs store the content on disk after they get it. Each receiver SE selects a forwarder SE.
The store-and-forward process causes content to flow through a distribution tree constructed specifically for the delivery service, with all receiver SEs in the delivery service as nodes on the tree. If an SE does not belong to the delivery service, it does not appear on the tree.
Both the metadata about the content and content itself flow through the distribution tree. This tree is constructed by using the dynamic routing of the delivery service and is often a subtree of the overall CDS topology.
Although the tree is global, the delivery service routing process is actually a per-SE local function that answers the question "who is my forwarder for this delivery service?”
The following criteria are used to select a forwarder:
An SE is a forwarder for other SEs in its own location if it subscribes to the delivery service and it is the location leader for the delivery service.
An SE in location A can be a forwarder for SEs from location B if it subscribes to the delivery service, location A is "closer" to the root location of the delivery service than location B, and there is no other location between location A and location B that has a receiver SE of the delivery service. When selecting a forwarder from other locations, a receiver SE uses a hash algorithm seeded with its own unique SE ID (assigned by the CDSM), to spread the load of multiple receivers equally to all eligible forwarders.
Note A “location leader” is always a per-delivery service and per-location concept, while a “forwarder” is always a per-delivery service and per-SE concept.
A receiver SE finds its forwarder by examining the series of locations on the topology “toward” the root location, following the parent-child relationship as described in the “Cisco CDS Topology” section.
1. First, find a forwarder within the SE's own location. The location leader should be the forwarder. If the location leader is down, use the backup location leader as the forwarder.
2. If none is found, or if the SE thinks it is the location leader, look for a forwarder in the next location “toward” the root location. If still none is found (for example, there is no SE at that location assigned to the delivery service, or the potential ones are unreachable), then look further “toward” the root location, and so on. The recursion ends when a forwarder is found or the Content Acquirer's location is reached.
3. Multicast forwarder: If the delivery service is marked "multicast enabled," the SE searches for a multicast forwarder. If it fails to find any reachable multicast forwarder, it searches again, this time looking for unicast forwarders.
4. Content Acquirer failover: If the SE is unable to find a live forwarder (for example, there is a network or machine outage), the SE has to retry later, unless it is in the root location for the delivery service and is allowed to failover to the origin server directly and act as a backup Content Acquirer.
Note This process follows the search path provided by the overall topology that was configured for the CDS. Using the combination of the overall topology configuration and the assignment of SEs to delivery services, the CDS gives the administrator a lot of control over the form of the distribution tree, and yet still automates most of the selection and failover process.
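The numbered search steps above can be sketched roughly as follows. The topology model and all names are assumptions for illustration; multicast handling (step 3) is omitted for brevity.

```python
# Hedged sketch of the per-SE forwarder search. Not the actual CDS
# implementation; structures and names are illustrative assumptions.

def find_forwarder(se, location_chain, leaders, reachable, in_root):
    """Walk the locations from the SE's own location toward the root.

    location_chain -- locations from the SE's location toward the root,
                      following the parent-child relationships
    leaders        -- per-location ordered list: location leader first,
                      then backups, all subscribed to the delivery service
    reachable      -- set of currently reachable SEs
    in_root        -- True if this SE is in the root location
    """
    for loc in location_chain:
        # Step 1 (own location) and step 2 (each location toward the
        # root): the leader, or a backup leader, becomes the forwarder.
        for candidate in leaders.get(loc, []):
            if candidate != se and candidate in reachable:
                return candidate
    # Step 4: no live forwarder found. Retry later, unless this SE is
    # in the root location and may fail over to the origin server.
    return "origin" if in_root else None
```

Note that the search never leaves the path toward the root, which is how the configured topology constrains the distribution tree.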
Persistent HTTP Connections
HTTP connections are maintained among the SEs in a delivery service and the origin server as long as the connection's idle period does not exceed the keepalive timeout, which is either 30 seconds or the timeout period set on the origin server, whichever is shorter.
Persistent HTTP connections in a delivery service work in the following way:
Open new HTTP connection. The first time a request for cache-miss content is sent to an upstream device (SE or origin server), which is identified by the IP address of the device, a new HTTP connection is formed.
The Web Engine has 8 working threads, which are computing units. Each thread can have as many connections to as many upstream devices as required.
There are a maximum of 10 connections per upstream device (SE or origin server) that are persisted in the idle queue for reuse for each of the 8 working threads, which gives a total of 80 persistent connections.
Connection moved to idle queue. Once the content download is complete, the connection is moved to the idle queue.
Closing connections in idle queue. A 30-second keepalive timeout period is applied to each connection moved to the idle queue; if the idle time of a connection reaches the keepalive timeout period, the connection is closed. If a new request needs to be sent and there is a connection for the same server (IP address) in the idle queue, the connection is moved to the main connection list and used for that request.
A working thread uses an existing connection if the connection is idle; otherwise, a new connection is opened.
Open and close non-persistent connection. If a request for cache-miss content needs to be sent and there are no idle connections for that upstream device, a new connection is created. If, after the request is served, there are already 10 connections for the upstream device in the idle queue, the connection is terminated.
Close 50 percent of connections in idle queue. If the origin server has a timeout period for HTTP connections, that is taken into consideration. The 30-second keepalive timeout is used for closing old HTTP connections. If the upstream SE or origin server has a shorter keepalive timeout period, that takes precedence over the downstream SE's 30-second keepalive timeout. If there are no keepalive timeout values set on the upstream devices (SEs or origin server), then every 30 seconds 50 percent of the persistent connections (maximum of 80 per origin server) are closed.
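The idle-queue behavior described above might be modeled as in the following sketch (assumed names; this is not the Web Engine implementation). It shows the per-peer limit of 10 idle connections and the 30-second keepalive reaping.

```python
# Illustrative model of one working thread's idle queue for persistent
# HTTP connections. Names and structure are assumptions.
import time

KEEPALIVE = 30          # seconds an idle connection may live
MAX_IDLE_PER_PEER = 10  # per upstream device, per working thread

class IdleQueue:
    def __init__(self):
        self.idle = {}  # peer IP -> list of (connection, time it went idle)

    def checkout(self, peer):
        """Reuse an idle connection to this upstream device, if any."""
        conns = self.idle.get(peer, [])
        return conns.pop()[0] if conns else None  # None: open a new one

    def checkin(self, peer, conn, now=None):
        """Download complete: keep the connection in the idle queue, or
        close it if 10 connections for this peer are already idle."""
        now = now if now is not None else time.monotonic()
        conns = self.idle.setdefault(peer, [])
        if len(conns) >= MAX_IDLE_PER_PEER:
            return False  # the connection is terminated instead
        conns.append((conn, now))
        return True

    def reap(self, now=None):
        """Close idle connections that reached the keepalive timeout."""
        now = now if now is not None else time.monotonic()
        for peer, conns in self.idle.items():
            self.idle[peer] = [(c, t) for (c, t) in conns
                               if now - t < KEEPALIVE]
```

With 8 working threads each keeping up to 10 idle connections per upstream device, this matches the total of 80 persistent connections stated above.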
In the case of network partitions, there can be multiple Content Acquirers for a single delivery service, or multiple location leaders. There can be as many Content Acquirers as there are network partitions (that have backup Content Acquirers) in the root location. Once the partition incident is over in the root location, the system recovers and there is only one Content Acquirer again. There can be as many location leaders as there are partitions (that have subscriber SEs) in any location. Once the partition incident is over, the system recovers from it and there is one location leader again.
Delivery Service Distribution Tree
Delivery services form logical routes for content to travel from an origin server through the Content Acquirer to all the Service Engines in the delivery service. Logical routes for content distribution are based on the device location hierarchy or Location Tree.
The content distribution route follows the general tree structure of the Location Tree, where content is distributed from the root of the tree (Content Acquirer) to the branches (Service Engines associated with the delivery service). A delivery service distribution tree is constructed for each delivery service.
A Service Engine in a delivery service can be configured to only forward content and metadata, and not deliver the content to client devices, by excluding it from the Coverage Zone file.
Figure 2-2 shows an example of a delivery service distribution tree. The Service Engines participating in the delivery service are marked in red. Possible content and metadata routes are indicated by red lines. The actual route may differ among the participating Service Engines as determined by the Service Router routing method.
Figure 2-2 Delivery Service Distribution Tree Example
Types of Delivery Services
The Cisco CDS supports two types of delivery services:
Prefetch/caching delivery services
Live delivery services
For prefetch delivery services, called content delivery services in the CDSM, content is forwarded from Service Engine to Service Engine through the delivery service distribution tree until all Service Engines in the delivery service have received it. The delivery service distribution architecture provides unicast content replication using a hop-by-hop, store-and-forward methodology with the forwarder Service Engines systematically selected on the basis of the manually configured location hierarchy. For caching delivery services, the content need not be fully stored before forwarding.
The live delivery services are only used for managed live stream splitting. The prefetch/caching delivery services are used for prefetch ingest, dynamic ingest, and hybrid ingest.
Methods for Ingesting Content
There are two methods that can be used to specify the content for a delivery service:
Specifying the content by using an externally hosted Manifest file.
Specifying the content by using the Internet Streaming CDSM.
The Internet Streaming CDSM provides a user-friendly interface for adding content and configuring crawl tasks. All entries are validated and a Manifest file is generated. The Internet Streaming CDSM offers the most frequently used parameters, a subset of the Manifest parameters. For a complete set of parameters, use a Manifest file.
The following sections describe the main building blocks of a delivery service:
Content is stored on origin servers. Each delivery service is configured with one content origin. The same origin server can be used by multiple live delivery services. However, only one prefetch/caching delivery service is allowed per content origin. Each Content Origin is defined in the Internet Streaming CDSM by the following:
Service routing domain name
The origin server is defined by the domain name that points to the actual origin server. The origin server domain name is used to fetch content that resides outside the delivery service, and to request redirection in case of a failure. The origin server must support at least one of the following protocols in order for the CDS to be able to ingest content:
Content can also originate from a local file on the CDS.
The service routing domain name is an FQDN and is used for content redirection. All content ingested through the Manifest file is published using the service routing domain name. The service routing domain name configured for the Content Origin must also be configured in the DNS servers, so that client requests can be redirected to a Service Router for request mediation and redirection.
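As an example, delegating the service routing domain to a Service Router in DNS might look like the following zone fragment. The Service Router host name and IP address are hypothetical; only the service routing domain cr-video.videonet.com is taken from the examples later in this section.

```
; Hypothetical zone fragment: delegate the service routing domain to
; the Service Router, which acts as its authoritative DNS server.
cr-video.videonet.com.   IN  NS  sr1.videonet.com.
sr1.videonet.com.        IN  A   192.0.2.10
```

With such a delegation in place, DNS resolution of published URLs lands on the Service Router, which then resolves the name to its own IP address as described in the delivery service workflow.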
When the Content Acquirer cannot directly access the origin server because the origin server is set up to allow access only by a specified proxy server, a proxy server can be configured. The proxy server is configured through the Internet Streaming CDSM for fetching the Manifest file, and through the Manifest file for fetching the content. Proxy configurations made in the Manifest file take precedence over proxy configurations in the CLI.
The Manifest file contains XML tags, subtags, and attributes used to define how content is ingested and delivered. Each delivery service has one Manifest file. The Manifest file can specify attributes for content playback and control. Attributes for specifying metadata only, without fetching the content, are supported. If special attributes are set, only the metadata and control information are propagated to the Service Engines. The control data is used to control the playback of the content when it gets cached by dynamic ingest. The Manifest file format and details are described in
Appendix B, “Creating Manifest Files.”
For HTTP, HTTPS, FTP, SMB, or CIFS, a single item can be fetched by specifying a single URL in the CDSM or Manifest file, or content can be fetched by using the crawler feature. The crawler feature methodically and automatically searches acceptable websites and makes a copy of the visited pages for later processing. The crawler starts with a list of URLs to visit, identifies every web link in the page, and adds every link to the list of URLs to visit. The process ends after one or more of the following conditions are met:
Links have been followed to a specified depth.
Maximum number of objects has been acquired.
Maximum content size has been acquired.
The crawler works as follows:
1. The Content Acquirer requests the starting URL that was configured for the delivery service.
2. The crawler parses the HTML at that URL for links to other files.
3. If links to other files are found, the files are requested.
4. If those files are HTML files, they are also parsed for links to additional files.
In this manner, the Content Acquirer “crawls” through the origin server.
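The crawl loop above can be sketched as a simple breadth-first traversal with the three stop conditions listed earlier (depth, object count, and content size). The fetch helper is a stand-in assumption; a real crawler issues HTTP requests and parses the returned HTML for links.

```python
# Simplified, illustrative sketch of the crawler; not the actual
# Content Acquirer code.
from collections import deque

def crawl(start_url, fetch, max_depth, max_objects, max_bytes):
    """fetch(url) -> (size_in_bytes, list_of_linked_urls)."""
    queue = deque([(start_url, 0)])
    seen = {start_url}
    acquired, total_bytes = [], 0
    while queue:
        url, depth = queue.popleft()
        size, links = fetch(url)              # steps 1-2: request, parse
        if len(acquired) >= max_objects or total_bytes + size > max_bytes:
            break                             # object or size limit hit
        acquired.append(url)
        total_bytes += size
        if depth < max_depth:                 # depth limit
            for link in links:                # steps 3-4: follow links
                if link not in seen:
                    seen.add(link)
                    queue.append((link, depth + 1))
    return acquired
```

The seen set keeps the crawler from requesting the same URL twice when pages link to each other.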
A website that has indexing enabled and the default document feature disabled generates HTML that contains a directory listing whenever a directory URL is given. That HTML contains links to the files in that directory. This indexing feature makes it very easy for the crawler to get a full listing of all the content in that directory. The crawler searches the folders rather than parsing the HTML file; therefore, directory indexing must be enabled and the directory cannot contain index.html, default.html, or home.html files.
In FTP acquisition, the crawler crawls the folder hierarchy rather than parsing the HTML file. Content ingest from an SMB server for crawl jobs is similar to FTP ingest; that is, the crawler crawls the folder hierarchy rather than parsing the HTML file.
The Content Acquirer parses the Manifest file configured for the delivery service and generates the metadata. If the hybrid ingest attributes are not specified, the Content Acquirer ingests the content after generating the metadata. The Content Acquirer can be shared among many delivery services; in other words, the same Service Engine can perform the Content Acquirer role for another delivery service.
The CDS supports file acquisition from Windows file servers with shared folders and UNIX servers running the SMB protocol. The Content Acquirer first mounts the share folder. This mount point then acts as the origin server from which the content is fetched. The Content Acquirer fetches the content and stores it locally.
Note With SMB, files greater than two gigabytes cannot be ingested.
The no-cache directive in an HTTP server response header tells the client that the content requested is not cacheable. When an HTTP server responds with a no-cache directive, the Content Acquirer behaves as follows:
If the content to be ingested is specified in an <item> tag in the Manifest file, the Content Acquirer ignores the no-cache directive and fetches the content anyway.
If the content to be acquired is specified in a <crawler> tag in the Manifest file, the Content Acquirer honors the directive and does not fetch the content.
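The two no-cache rules above reduce to a small decision function (illustrative names only; this is not CDS code):

```python
# Hedged sketch of the Content Acquirer's handling of an HTTP no-cache
# response directive, based on which Manifest tag named the content.

def should_fetch(no_cache_in_response, source_tag):
    """source_tag: the Manifest tag the content came from,
    'item' or 'crawler'."""
    if not no_cache_in_response:
        return True
    # <item> content is fetched anyway; <crawler> content honors the
    # no-cache directive and is not fetched.
    return source_tag == "item"
```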
The Internet Streamer application on the Service Engine participates in the delivery service by distributing content within the CDS and delivering content to the clients. The Service Engines can be shared among other delivery services.
In some instances, for example when there are contractual obligations to prevent clients from downloading content, it may be necessary to disable HTTP downloads on a delivery service. When HTTP download is disabled, the Web Engine returns a 403 forbidden message. For configuration information, see the “Creating Delivery Service” section.
What follows is a description of the workflow of a delivery service.
Table 2-1 shows sample values for the delivery service workflow illustrated in Figure 2-3. The workflow is described in detail following the figure.
Table 2-1 Delivery Service Parameters Example
Table 2-1 columns: Service Routing Domain Name, Delivery Service Contents
Figure 2-3 Delivery Service Workflow Diagram
1. The topology is propagated to all the devices registered and activated in the Internet Streaming CDSM. The delivery service configuration is propagated to all the Service Engines subscribed to the delivery service. The Manifest file information is sent to the Content Acquirer for the delivery service.
2. The Content Acquirer parses the Manifest file and generates the metadata. All content listed in the Manifest file, except for non-cache content types, is fetched.
3. The Content Acquirer propagates the metadata to all other Service Engines.
4. The Service Engines receive the metadata and associated prefetched content. The Service Engines do not prefetch content of the “wmt-live” or “cache” types. The “wmt-live” type corresponds to Windows Media live streaming, and the “cache” type corresponds to hybrid ingest content.
5. The client request for a URL first performs a DNS resolution. The Service Router is configured as the authoritative DNS server for the hosted, or service routing, domain. The URLs that are published to the users have the service routing domain names as the prefix.
6. The Service Router resolves the service routing domain name to its own IP address.
7. The client sends the request to the Service Router and the Service Router uses its routing method to determine the best Service Engine to stream the requested content.
8. The Service Router redirects the client to the best Service Engine.
9. The client sends the request to the Service Engine.
The following are the possible scenarios after the request reaches the Service Engine:
Flow 10, “Pre-ingested response.”
The content is prefetched using the URL: http://www.ivs-internal.com/video/wmv-152
The actual user request is: http://cr-video.videonet.com/video/wmv-152
The Service Engine processes the user request, and based on the metadata, determines the content was prefetched and pinned in its local storage. The Service Engine looks up the policies for the content and streams the content to the user.
Dynamic Ingest/Cached Content
Flows 10, 11, 12, “Non-ingested contents—Hierarchical cache resolution,” “Native Protocol Response,” and “Dynamic ingest response.”
If the request for content is not specified in the Manifest file, dynamic ingest is used.
The user request is: http://cr-video.videonet.com/video/wmv-cached.wmv
The Service Engines in the delivery service form a hierarchy, pull the content into the CDS, and cache it. The Service Engine streams the content to the user.
Hybrid Ingest/Metadata Only Content
(no content flow)
The request for content is specified in the Manifest file as “cache.”
The user request is: http://cr-video.videonet.com/video/wmv-59
The Service Engine fetches the content, similar to the dynamic ingest method, but the metadata attributes (for example, serveStartTime, serveStopTime) are honored by the Service Engines and the content is served only if the request falls within the defined time interval.
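The time-window check might look like this minimal sketch. The attribute names serveStartTime and serveStopTime come from the example above; the function itself is an illustrative assumption.

```python
# Illustrative check of the metadata time window honored for
# hybrid-ingest ("cache" type) content. Times are epoch seconds here
# for simplicity; the actual metadata format is not specified above.

def may_serve(now, serve_start_time, serve_stop_time):
    """Serve the content only if the request falls within the
    defined time interval."""
    return serve_start_time <= now <= serve_stop_time
```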