Table Of Contents
Configuring the ACNS Network for Content Acquisition
Preparing to Acquire Pre-positioned Content
About the Acquirer
Set Up the Origin Server
Using HTTP and HTTPS
Using FTP
Create a Channel
Assign a Root Content Engine
Assigning a Backup Root Content Engine
Choosing a Content Acquisition Method for a Channel
Acquiring Pre-Positioned Content Using the Content Distribution Manager GUI
Adding Content Items to a Channel Using the Quick Crawl Utility
Adding a Crawl Task to a Channel
Configuring Content Acquisition Rules for a Crawl Task
Configuring Advanced Content Serving Settings
Modifying Multiple Content Items
Deleting Multiple Content Items
Acquiring Pre-Positioned Content Using an Externally Hosted Manifest File
Create a Manifest File
Specify Channel Acquisition and Distribution Properties
Distribution Priority
Item Priority
Channel Quota
Manifest File Update Interval
Configure Manifest File and Proxy Information for the Channel
Generate the Publishing URL
Assign a Playserver
Acquiring Content through a Proxy Server
Configuring Host and Proxy Server Settings Using the Content Distribution Manager GUI
Configuring an HTTP Proxy Server Using the CLI
Configuring a Proxy Server Using the Manifest File
About the noProxy Attribute
Configuring a Proxy Server and Port for the Manifest File
Acquiring Content Using Indexed Web Servers
Authentication Support
Acquiring Content that Requires NTLM Authentication
Proxy Authentication Support
Configuring Authentication for a Proxy to Fetch the Manifest File
Configuring Authentication for a Proxy Using the Manifest File
Configuring Authentication for an HTTP Proxy Using the Content Distribution Manager GUI
Configuring Authentication for a Proxy Using the CLI
Configuring Authentication for a WCCP Proxy Using the Content Distribution Manager GUI
Modifying WCCP Proxy Authentication Settings
Configuring Authentication for a WCCP Proxy Using the CLI
Verifying the Results
Troubleshooting Content Acquisition
Retry and Refresh Mechanisms
Updating Channel Content
Bandwidth Control
Configuring Acquisition and Distribution Default Bandwidth Settings
Displaying a Graphical Representation of the Acquisition and Distribution Bandwidth Settings
Configuring Acquisition and Distribution Bandwidth Settings for Scheduled Times
Scheduling Bandwidth for Streaming Acquisition
Error Codes
ACNS Unified Name Space Errors
Acquirer Error Codes
Where to Go Next
Configuring the ACNS Network for Content Acquisition
Content acquisition is an important part of content pre-positioning in the ACNS network. ACNS software uses the concept of a channel to map a set of content objects to a set of Content Engines. Before pre-positioned content can be acquired or distributed in the ACNS network, users must create a new channel, subscribe Content Engines to that channel, and designate one of the Content Engines as the root Content Engine. The source of the content might be stored on various file servers or web servers. The root Content Engine of the channel fetches the content objects from these origin servers and then replicates this content to all the Content Engines in the channel.

Note
Pre-positioned content is served only on ports that are standard for the protocol. If the incoming URL contains a port number other than the protocol's standard port (for example, HTTP uses port 80), then the Content Engine does not attempt to serve the content from the pre-positioned file system (cdnfs). Instead, the Content Engine tries to serve the content from the cache file system (cfs) or tries to fetch the content from the origin server, depending on the existing configuration of the Content Engine. To serve pre-positioned content from a nonstandard port, set the ignoreOriginPort attribute to true in the manifest file. For a description of this attribute, see the "item" section on page A-44.
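The port rule in this note can be sketched in a few lines of Python. This is only an illustration of the decision described above, not ACNS code: the function name eligible_for_cdnfs and the STANDARD_PORTS table are hypothetical, and only the HTTP entry (port 80) comes from this chapter. The HTTPS and FTP entries are the well-known defaults for those protocols.

```python
from urllib.parse import urlsplit

# Assumption: only the HTTP value (80) is stated in this chapter;
# 443 and 21 are the well-known defaults for HTTPS and FTP.
STANDARD_PORTS = {"http": 80, "https": 443, "ftp": 21}


def eligible_for_cdnfs(url: str, ignore_origin_port: bool = False) -> bool:
    """Return True if a request URL could be served from pre-positioned storage.

    Mirrors the note: a nonstandard port rules out cdnfs unless the
    manifest sets ignoreOriginPort to true for the content.
    """
    parts = urlsplit(url)
    standard = STANDARD_PORTS.get(parts.scheme.lower())
    if standard is None:
        # Protocol not covered by this sketch.
        return False
    port = parts.port if parts.port is not None else standard
    return ignore_origin_port or port == standard


print(eligible_for_cdnfs("http://www.example.com:8080/a.html"))
print(eligible_for_cdnfs("http://www.example.com:8080/a.html", ignore_origin_port=True))
```

When the function returns False, the Content Engine would fall back to the cache file system (cfs) or to the origin server, as the note describes.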
This chapter outlines the procedures necessary for acquiring pre-positioned content in the ACNS network and contains information on the following topics:
•
Preparing to Acquire Pre-positioned Content
•
Choosing a Content Acquisition Method for a Channel
•
Acquiring Pre-Positioned Content Using the Content Distribution Manager GUI
•
Acquiring Pre-Positioned Content Using an Externally Hosted Manifest File
•
Acquiring Content through a Proxy Server
•
Acquiring Content Using Indexed Web Servers
•
Authentication Support
•
Verifying the Results
•
Troubleshooting Content Acquisition
•
Retry and Refresh Mechanisms
•
Bandwidth Control
•
Error Codes
•
Where to Go Next
Preparing to Acquire Pre-positioned Content
To prepare your ACNS network to acquire pre-positioned content, you must complete the following tasks:
1.
Set Up the Origin Server
2.
Create a Channel
3.
Assign a Root Content Engine
About the Acquirer
The root Content Engine uses a software agent, referred to as the acquirer, that gathers channel content before it is distributed to the receiver Content Engines in the ACNS network. The acquirer maintains a task list, which it updates after receiving a notification of changes in its channel configuration. The following subsections describe features that the acquirer does or does not support:
File Acquisition from SMB Servers
ACNS 5.5 software supports file acquisition from Windows file servers with shared folders and Unix servers running the SMB protocol. The acquirer first mounts the shared folder on the root Content Engine. This mount point then acts as the origin server from which the content is to be fetched. The acquirer fetches the content and pre-positions it in the cdnfs storage space. (Note that files larger than 2 GB cannot be acquired using SMB.)
Note
Symbolic links within exported filesystems (SMB or NFS) must contain a relative path to the target file, or the target file should be copied into the exported volume. If the symbolic link points to an absolute path outside of the volume that the file server is exporting to the Content Engine, the file path is inaccessible to the Content Engine, SMB acquisition fails, and the ACNS 5.5 software issues a 404 error message.
SMB is supported in the Content Distribution Manager GUI for fetching an externally hosted manifest file. The Content Distribution Manager GUI content import feature also supports SMB file acquisition for content items and crawl tasks.
You can pre-position content items and crawl tasks from SMB servers by using the following URL format: \\SMBserver\sharefolder\filepath. The full URL can be specified in the manifest file <item> or <crawler> tags, as shown in this example:
<CdnManifest>
<server name="myserver1">
<host name="\\<SMB-Server>" />
</server>
<item src = "<Share Folder>\<File>" server="myserver1"/>
<item src="\\<SMB-Server>\<Share Folder>\File" />
</CdnManifest>
File acquisition from an SMB server for crawl jobs is similar to FTP acquisition, in which the crawler crawls the folder hierarchy rather than parsing the HTML file.
You can specify the port to be used for the SMB protocol in the URL or in the port attribute. If the port is not specified, the default port is 139.
Authentication for SMB server file acquisition is supported by means of the user, password, and userDomainName attributes in the <host> and <item> tags. (See the "item" section on page A-44 for a description of these attributes.) These attributes must be specified separately rather than as part of the full URL. Proxy and proxy authentication support are not available for this feature.
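As a rough illustration of keeping the credentials in separate attributes, the following Python sketch generates a small SMB manifest with the standard library. The tag and attribute names (CdnManifest, server, host, item, src, user, password, userDomainName) are the ones used in this section. The function name, the server label smb-origin, and all sample values are hypothetical, and the real manifest schema in Appendix A remains the authority on what is valid.

```python
import xml.etree.ElementTree as ET


def build_smb_manifest(smb_server, item_path, user, password, domain):
    """Build a manifest that fetches one file from an SMB share.

    Credentials go into separate <host> attributes instead of the URL,
    as this section requires.
    """
    root = ET.Element("CdnManifest")
    # "smb-origin" is an arbitrary label linking <item> to <server>.
    server = ET.SubElement(root, "server", {"name": "smb-origin"})
    ET.SubElement(server, "host", {
        "name": "\\\\" + smb_server,   # \\SMBserver
        "user": user,
        "password": password,
        "userDomainName": domain,
    })
    # item_path is the share folder plus file path, relative to the host.
    ET.SubElement(root, "item", {"src": item_path, "server": "smb-origin"})
    return ET.tostring(root, encoding="unicode")


# Hypothetical values for illustration only.
print(build_smb_manifest("fileserver1", "media\\intro.mpg",
                         "acquirer", "secret", "CORP"))
```

Generating the file this way keeps the XML well formed, but it does not check the attributes against the manifest schema.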
Note
If you have a Windows XP server share that is using the anonymous method of authentication, file acquisition fails. (Refer to the Release Notes for ACNS Software, Release 5.5 document for more information.)
EMBED and OBJECT ID Tag Support
If an HTML file includes an EMBED tag or an OBJECT ID tag, the acquirer parses the file, obtains the file link from the src and filename attributes, and fetches these files.
Crawl .asx File
The acquirer can parse an .asx file and fetch its contents.
expires Attribute Specified in the Manifest File
The expires attribute, specified as part of the <item> tag or <crawler> tag in the manifest file, designates a time in yyyy-mm-dd hh:mm:ss format for content to be removed.
If you specify the expires attribute in the manifest file, the root Content Engine deletes the content at the time specified. The root Content Engine then replicates the deletion event to receiver Content Engines, which also delete the content from the channel.
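The yyyy-mm-dd hh:mm:ss format can be produced and checked with the Python standard library, as in the sketch below. The helper names and sample values are hypothetical. The sketch also assumes that the timestamp carries no time zone information, which this section does not specify.

```python
from datetime import datetime
import xml.etree.ElementTree as ET

# strftime/strptime pattern for the yyyy-mm-dd hh:mm:ss format.
EXPIRES_FORMAT = "%Y-%m-%d %H:%M:%S"


def item_with_expiry(src, expires_at):
    """Return an <item> element string whose expires attribute is set."""
    item = ET.Element("item", {
        "src": src,
        "expires": expires_at.strftime(EXPIRES_FORMAT),
    })
    return ET.tostring(item, encoding="unicode")


def is_expired(expires_value, now):
    """Return True once 'now' has reached the expires time."""
    return now >= datetime.strptime(expires_value, EXPIRES_FORMAT)


# Hypothetical item and date for illustration only.
print(item_with_expiry("media/intro.mpg", datetime(2006, 12, 31, 23, 59, 59)))
```

Parsing the value with strptime before publishing a manifest is a cheap way to catch a malformed timestamp.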
Parsing Scripts in an HTML File
If an HTML file contains JavaScript or VBScript (Microsoft's Visual Basic Scripting Edition, a scaled-down version of Visual Basic) and the script code contains links, the acquirer does not parse the script code to extract those links.
Cookie Support
Cookies are a general mechanism that server-side connections (such as CGI scripts) use to both store and retrieve information on the client side of the connection. The addition of a simple, persistent, client-side state significantly extends the capabilities of web-based client/server applications. Although the acquirer is also a client, it does not support cookies.
Set Up the Origin Server
Content is usually stored on file servers or web servers that are not part of the ACNS network of managed devices. To pre-position this content for ACNS network acquisition and distribution, the origin file server or web server must support at least one of the following protocols that are used by ACNS software to acquire content:
•
HTTP
•
HTTPS
•
FTP
Note
Beginning with the ACNS 5.5 release, the software no longer supports the MMS protocol or the MMS-over-HTTP protocol for content acquisition.
Using HTTP and HTTPS
Any standard web server supports the HTTP and HTTPS protocols. You can set up your web server as an origin server for pre-positioned content intended for the ACNS network by moving the content over to the web server or by configuring the web server to access the desired content. The following two web servers are the most popular:
•
Apache—Supported on UNIX, Linux, and Microsoft NT platforms
•
Microsoft IIS—Only supported on Microsoft platforms
For the HTTP and HTTPS protocols, content can be fetched as single content items by using the <item> tag in the manifest file, or content can be fetched by using the crawling feature to crawl web server directories. The crawler crawls the folder hierarchy rather than parsing the HTML file. Therefore, if you want to use the crawl feature, you must enable directory indexing and make sure that the directory does not contain index.html, default.html, or home.html files.
Tip
You might need to install SSL certificates in order to set up the web server for HTTPS content acquisition. If your server is using an expired certificate, or a self-signed certificate, you should set sslAuthType to "weak" in the manifest file <host> tag.
Redirection Support
For a single-item HTTP redirection, the acquirer follows the redirect to the new link to obtain the file. If the redirected request belongs to a crawl job, the acquirer checks the new link against the crawl parameters. For example, if a start-url http://www.ibm.com/ is redirected to http://www.cnn.com/, the acquirer checks whether www.cnn.com matches the crawl parameters. Unless the externalPrefixes attribute is specified, the redirection is usually rejected because it does not match the default prefix.
HTTP Response no-cache Directive
The no-cache directive is related to the HTTP protocol. The HTTP protocol uses headers to communicate between the client and the server. The client sends a request header, and the server responds with a response header. Headers specify the various attributes for the resource being requested. For example, the client sends the request:
The server might send a response, as follows:
The no-cache directive in the response header tells the client that the content being requested is not cacheable. When an HTTP server responds to a request with a no-cache directive in the header, the acquirer behaves as follows:
•
If the content to be acquired is specified in an <item> tag, the acquirer ignores the no-cache directive and fetches the content anyway.
•
If the content to be acquired is specified in a <crawler> tag in the manifest file, and the response sent out by the server contains a no-cache directive in the response header, the acquirer honors the directive and does not fetch the content or pre-position it.
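The two cases above amount to a small decision rule, sketched here in Python. This is an illustration rather than the acquirer's implementation: the function name is hypothetical, and the choice to look for the directive in the Cache-Control and Pragma response headers is an assumption, because this section does not name the header.

```python
def should_fetch(source_tag, response_headers):
    """Decide whether content is fetched, given the server's response headers.

    source_tag is "item" or "crawler", matching the manifest tag that
    named the content.
    """
    directives = set()
    for name, value in response_headers.items():
        # Assumption: no-cache may arrive in Cache-Control or Pragma.
        if name.lower() in ("cache-control", "pragma"):
            directives.update(d.strip().lower() for d in value.split(","))
    has_no_cache = "no-cache" in directives

    if source_tag == "item":
        return True               # no-cache is ignored for single items
    if source_tag == "crawler":
        return not has_no_cache   # no-cache is honored for crawl jobs
    raise ValueError("source_tag must be 'item' or 'crawler'")


print(should_fetch("item", {"Cache-Control": "no-cache"}))
print(should_fetch("crawler", {"Cache-Control": "no-cache"}))
```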
HTTP Response expires Directive
An expires directive in the HTTP response header indicates that the content being served is not valid after the time specified in the response. For example, if the response header contains
the client (the acquirer in this case) is directed not to use the content after the time specified. The expires directive is honored only for content that is specified in the <crawler> tag. Single items configured using the <item> tag in the manifest file do not honor the expires directive sent by the server.
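The same item-versus-crawler distinction can be sketched for the expires directive. The function name and the sample header value below are hypothetical and are not taken from this guide. The sketch only shows how an HTTP date in an Expires header can be parsed with the Python standard library and applied to crawled content alone.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def usable_until(source_tag, response_headers):
    """Return the expiry time honored for this content, or None.

    Only content named in a <crawler> tag honors the server's Expires
    header; content named in an <item> tag does not.
    """
    if source_tag != "crawler":
        return None
    for name, value in response_headers.items():
        if name.lower() == "expires":
            try:
                return parsedate_to_datetime(value)
            except (TypeError, ValueError):
                # Unparseable date: treated here as "no expiry known".
                return None
    return None


# Hypothetical header value for illustration only.
headers = {"Expires": "Thu, 01 Dec 1994 16:00:00 GMT"}
print(usable_until("crawler", headers))
print(usable_until("item", headers))
```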
Using FTP
The root Content Engine acquirer supports acquiring files from FTP servers. When you use FTP, content can be acquired as single content items by using the <item> tag in the manifest file, or content can be fetched by using the crawling feature to crawl the FTP server directories. In FTP acquisition, the crawler crawls the folder hierarchy rather than parsing the HTML file. The following popular FTP servers are supported:
•
Microsoft IIS 4.0, 5.0, 6.0—For Windows platforms
•
Wu-2.6.1-18—For Linux platforms
•
FTP Server—In SunOS (Version 5.6)
•
proFTPd—For Linux platforms
Other supported Windows FTP servers are:
•
WS_FTP server
•
Bulletproof FTP server
•
SurgeFTP
•
SlimFTPd
You can use other FTP servers, as long as the following FTP commands are supported:
•
USER, PASS
•
[SIZE, MDTM] [or] [LIST -a]
•
PASV [or] PORT
•
CWD ~ [or] CWD <SPACE> [or] CWD /
•
RETR
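The command requirements above can be read as groups of alternatives, where at least one alternative in each group must be fully supported. The following Python sketch encodes the list that way and reports which groups a server fails. The function name and data layout are hypothetical, and the sketch works on a set of command strings that you supply. It does not connect to an FTP server.

```python
# Each group lists alternatives; one alternative must be fully supported.
REQUIRED_GROUPS = [
    [{"USER", "PASS"}],
    [{"SIZE", "MDTM"}, {"LIST -a"}],
    [{"PASV"}, {"PORT"}],
    [{"CWD ~"}, {"CWD <SPACE>"}, {"CWD /"}],
    [{"RETR"}],
]


def missing_ftp_capabilities(supported):
    """Return a description of every requirement group that is not met."""
    have = set(supported)
    missing = []
    for alternatives in REQUIRED_GROUPS:
        if not any(alt <= have for alt in alternatives):
            missing.append(
                " or ".join("+".join(sorted(alt)) for alt in alternatives)
            )
    return missing


# Hypothetical capability set for a server that lacks MDTM and LIST -a.
server = {"USER", "PASS", "SIZE", "PASV", "CWD /", "RETR"}
print(missing_ftp_capabilities(server))
```

An empty result means that every group in the list is satisfied by the commands you supplied.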
Secure FTP
The acquirer currently does not support secure FTP.
Create a Channel
Channels map a set of content objects to a set of Content Engines. You must have channels configured before you can acquire pre-positioned content or distribute it. To create channels, see Chapter 5, "Configuring the ACNS Network for Content Distribution." Chapter 5 discusses how to create channels for content acquisition and distribution by configuring the following network elements:
1.
Locations—Creating and Modifying Locations
2.
Content providers—Creating and Modifying Content Providers
3.
Websites—Creating and Modifying Websites
4.
Channels—Creating and Modifying Channels
Assign a Root Content Engine
A root Content Engine is used to acquire content for a channel. A channel can have only one root Content Engine. We recommend that you choose a root Content Engine that has enough bandwidth to access the content at the origin server.
For information on how to assign a root Content Engine, see the "Designating the Root Content Engine" section on page 5-16.
Assigning a Backup Root Content Engine
You do not need to specify a backup root Content Engine; however, for backup purposes, you must have a second Content Engine that is subscribed to the channel and is in the same location as the primary Content Engine. If the designated root Content Engine becomes inactive, the other Content Engine in the location automatically becomes a temporary root Content Engine. When the designated root Content Engine comes back online, it takes over as the root and the temporary root Content Engine becomes a regular Content Engine.
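The failover behavior described above can be modeled as a simple selection rule. The Python sketch below is illustrative only: the function and device names are hypothetical, and picking the first online candidate is an assumption, because this section does not say how a temporary root is chosen when several Content Engines qualify.

```python
def acting_root(designated, online, same_location_subscribers):
    """Return the Content Engine currently acting as root, or None.

    designated: name of the designated root Content Engine
    online: set of Content Engines that are currently active
    same_location_subscribers: channel subscribers in the root's location
    """
    if designated in online:
        # The designated root always takes over when it is online.
        return designated
    for ce in same_location_subscribers:
        # Assumption: the first online candidate becomes the temporary root.
        if ce != designated and ce in online:
            return ce
    return None


# Hypothetical device names for illustration only.
subscribers = ["ce-root", "ce-backup"]
print(acting_root("ce-root", {"ce-backup"}, subscribers))
print(acting_root("ce-root", {"ce-root", "ce-backup"}, subscribers))
```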
Content acquisition is CPU-intensive for the acquirer. We recommend that you set up a dedicated high-end root Content Engine for content acquisition. The root Content Engine can be used for streaming video files or serving pre-positioned content; however, streaming video quality might suffer during periods of heavy content acquisition.
Choosing a Content Acquisition Method for a Channel
When you configure a channel to acquire content, you must choose one of the following content acquisition methods:
•
Specify the content using an externally hosted manifest file.
Manifest files contain the XML tags, subtags, and attributes that you wish to use to define the parameters for content acquisition. You must be familiar with the structure of the XML-based manifest file and be sure that XML tags are properly formatted and syntactically correct before you can effectively create and use manifest files. (See the "Acquiring Pre-Positioned Content Using an Externally Hosted Manifest File" section and Appendix A, "Creating Manifest Files.")
•
Specify the content using the Content Distribution Manager GUI.
The Content Distribution Manager GUI provides a user-friendly interface that you can use to easily add content items and specify crawl tasks without having to create and update a manifest file. The Content Distribution Manager GUI automatically validates all user input and generates an XML-formatted manifest file in the background that is free of syntax errors. (See the "Acquiring Pre-Positioned Content Using the Content Distribution Manager GUI" section.)
Only one manifest file is generated per channel for all content items. You can save your GUI-generated manifest file to any desired location.
To configure the content acquisition method for a channel, follow these steps:
Step 1
From the Content Distribution Manager GUI, choose Services > Web > Channels. The Channels window appears, listing all channels in your ACNS network.
Step 2
Click the Edit icon next to the channel for which you want to configure the content acquisition method. The Modifying Channel window appears.
Note
Channels for pre-positioning content are listed as type "Content" in the Channels window.
Step 3
In the Contents pane, choose Channel Content. The Content Acquisition Method for Channel window appears. By default, the content acquisition method for a newly created channel is set to Use GUI to specify content acquisition. (See Figure 6-1.)
Figure 6-1 Content Acquisition Method for Channel—Use Content Distribution Manager GUI
Use this window to add content items and specify crawl tasks for the channel. (See the "Acquiring Pre-Positioned Content Using the Content Distribution Manager GUI" section.)
Step 4
To use an externally hosted manifest file:
a.
Change the content acquisition method by clicking the Change Method button.
b.
Choose Specify external manifest file from the drop-down list that appears.
Note
When you change the content acquisition method from using the Content Distribution Manager GUI to using an external manifest file, any content items that you added using the GUI are removed from the GUI. You can save the content items in XML format by clicking the Save settings locally icon in the taskbar. A new window displays the GUI-generated manifest file. Save the file to your local disk.
c.
Click the Save button next to the drop-down list. You are prompted to confirm your decision.
d.
Click OK to confirm. The window refreshes and displays fields for defining manifest file settings and manifest proxy information. (See Figure 6-2.)
Figure 6-2 Content Acquisition Method for Channel—Use External Manifest File
Use this window to define basic manifest settings and proxy information. (See the "Configure Manifest File and Proxy Information for the Channel" section.)
Acquiring Pre-Positioned Content Using the Content Distribution Manager GUI
ACNS 5.5 software allows you to pre-position content items and crawl tasks directly in the Content Distribution Manager GUI without having to create a manifest file. Use this method to set up simple demonstrations or systems, or to add a few items that do not require advanced acquisition and distribution features.
To pre-position content using the Content Distribution Manager GUI, follow these steps:
Step 1
In the Content Distribution Manager GUI, choose Services > Web > Channels.
Step 2
Click the Edit icon next to the name of the channel that you want to modify. The Modifying Channel window appears.
Step 3
In the Contents pane, choose Channel Content. The Content Acquisition Method for Channel window appears.
Step 4
Specify the method for content acquisition.
a.
Click the Change Method button, if necessary.
b.
Choose Use GUI to specify content acquisition from the drop-down list.
c.
Click Save.
See the "Choosing a Content Acquisition Method for a Channel" section.
Step 5
To add content items and crawl tasks, click the Add Content icon in the taskbar. The Content Manager window for the channel appears. (See Figure 6-3.)
Figure 6-3 Content Manager Window
Step 6
Specify the content source.
a.
Choose the protocol to be used for content requests from the Source URL drop-down list. Then enter the origin server domain name or IP address in the field that follows.
b.
If you wish to acquire the source URL as a single content item, check the Single Item check box.
c.
If you wish to acquire content by crawling the website or if you wish to add individual content items, leave the Single Item check box unchecked. For crawl tasks, enter the number of levels to which the website is to be crawled in the Link Depth field.
Step 7
Set up the parameters for acquiring the content.
a.
If you click the Define a crawl task radio button, you can further define and apply rules to the crawl task. Click the Show optional content acquisition rules drop-down arrow. (See the "Adding a Crawl Task to a Channel" section and the "Configuring Content Acquisition Rules for a Crawl Task" section.)
b.
If you click the Select individual items radio button, you can further define content item parameters by clicking the Launch Quick Crawl button. This button launches the Quick Crawl Filter window. (See the "Adding Content Items to a Channel Using the Quick Crawl Utility" section.)
Step 8
If you wish to configure advanced settings, such as content serving time, authentication, URL settings and content settings, click the Show advanced settings drop-down arrow. The window expands, allowing you to configure advanced settings. (See the "Configuring Advanced Content Serving Settings" section.)
Step 9
To process the content request, click Submit. When you click Submit, the local manifest file is automatically reparsed, changes are detected, and the corresponding items are acquired or removed. This action, however, does not trigger a recheck of all the content in the channel.
Adding Content Items to a Channel Using the Quick Crawl Utility
Quick Crawl is a utility that automatically crawls websites starting from the specified source URL. You can use this utility when you know only the domain name and not the exact location of the content item. You can set a rule-based quick crawl filter so that only content with predefined characteristics will be matched for crawling. The results of the filtering are displayed, and you can select the content items that you want to be acquired. Quick Crawl supports crawling only for websites that use HTTP and HTTPS.
To add individual content items using the Quick Crawl utility, follow these steps:
Step 1
Choose Services > Web > Channels. The Channels window appears, listing all channels in your ACNS network.
Step 2
Click the Edit icon next to the channel for which you want to add content items. The Modifying Channel window appears.
Step 3
In the Contents pane, choose Channel Content. The Content Acquisition Method for Channel-Use GUI to Specify Content Acquisition window appears. (See Figure 6-1.)
Step 4
Click the Add Content icon in the taskbar. The Content Manager window for the channel appears. (See Figure 6-3.)
Step 5
From the Source URL drop-down list, choose the communication protocol (HTTP or HTTPS) to be used. The default is HTTP.
Step 6
In the Source URL field, enter the URL of the host server (origin server) to specify the location from which to start Quick Crawl. You can copy and paste URLs from a web browser to the Source URL field.
Note
The values specified for Quick Crawl are not directly associated with manifest file attributes. Quick Crawl simply provides a list of content URLs that you can use as a list of individual content items to be added to the channel.
Step 7
Leave the Single Item check box unchecked.
Step 8
In the Link Depth field, specify how many levels of a website are to be crawled or how many directory levels of an FTP server are to be crawled. This is optional. The range is -1 to 2147483636.
If depth = -1, there is no depth constraint.
If depth = 0, acquire only the starting URL.
If depth = 1, acquire the starting URL and all the content it references.
Alternatively, you can choose to specify the link depth after you launch Quick Crawl.
Step 9
Click the Select individual items radio button to select content items that you want to be fetched, and click the Launch Quick Crawl button to start the Quick Crawl utility. The Quick Crawl Filter window pops up.
Note
Items fetched using Quick Crawl are not associated with the <crawler> element of the manifest file because Quick Crawl is not a manifest generator. Quick Crawl merely lists a number of content items or content objects found on the origin server, such as a graphic file, MPEG video, or RealAudio sound file. When added to the channel, these items are associated with the <item> tag in the GUI-generated manifest file.
Step 10
Configure the Quick Crawl filter settings. Note that all fields in this window are optional.
a.
In the MIME type field, specify the MIME type of files to be crawled. An object is acquired only if its MIME type matches this MIME type.
b.
In the Extension field, specify the extension of files to be crawled. An object is acquired only if its extension matches this extension.
c.
In the Modified After field, enter the date that must be matched if content is to be acquired. An object is acquired only if it was modified after this date.
d.
In the Modified Before field, enter the date that must be matched if content is to be acquired. An object is acquired only if it was modified before this date.
Alternatively, click the Calendar icon next to the Modified After and Modified Before fields to display the Date Time Picker popup window. In the Date Time Picker popup window, use the left or right arrow icons to choose previous and following years if required. Choose a month from the drop-down list. Click a day of the month. The chosen date is highlighted in blue. Click Apply. Alternatively, click Set Today to revert to the current day. The chosen date will be displayed in the Modified After or Modified Before field.
e.
In the Minimum Size field, specify the minimum size of content to be acquired. Content equal to or larger than this value is acquired. Choose MB, KB, or Bytes as the unit of measure. The range is 0 to 2147483636.
f.
In the Max Size field, specify the maximum size of content to be acquired. Content equal to or smaller than this value is acquired. Choose MB, KB, or Bytes as the unit of measure. The range is 0 to 2147483636.
g.
In the Link depth field, specify how many levels of a website are to be crawled or how many directory levels of an FTP server are to be crawled. The range is -1 to 2147483636.
If you have already entered the depth level in the Link Depth field of the Content Manager window, it is displayed here.
h.
The Domain field displays only the host.domain portion of the Source URL that you entered in Step 6. Edit this field if you want to limit the search to a specific host on a domain.
i.
In the Username field, enter the name of the user who requires secure login to the host server (for websites requiring authentication).
j.
In the Password field, enter the password for the user account that is required to access the host server (for websites requiring authentication).
Step 11
When you have finished adding Quick Crawl filter rules, click the Start Quick Crawl button. The Searching for Content window appears. A progress bar shows the number of items that are being crawled. Before the specified number of items has been crawled or the progress bar reaches 100 percent, you can click Show Results to display the list of content item URLs that have been crawled so far. You can also click Refresh Results to refresh the progress bar display.
Once the specified number of items has been crawled, the window refreshes itself, and the Content Importer window appears. (See Figure 6-4.)
Figure 6-4 Content Importer Window
This window displays the associated MIME type, size, date modified, and the URL.
Step 12
Check the check box next to the content items that you want to add to the list of URLs to be acquired and distributed.
Step 13
To add all the listed content items in the current window to the Content Manager, click Select All at the bottom of the window. If the number of content items exceeds ten, you might need to select the items from multiple pages. To deselect all selected items and choose different items, click Select None.
Step 14
Click the Add Selected button at the bottom of the Content Importer window to add all selected content items to be imported. The Content Importer window closes, and the Content Acquisition Method for Channel-Use GUI to Specify Content Acquisition window appears with all the content items added.
Alternatively, click the Show Filter button to return to the Quick Crawl Filter window to change the filter settings.
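The filter fields in Step 10 combine as a set of optional conditions that an object must all satisfy. The Python sketch below models that behavior under stated assumptions: the class names, field names, and sample objects are hypothetical, sizes are compared in bytes, and a field left as None is treated as not set. It does not reproduce the Quick Crawl utility itself.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class CrawledObject:
    url: str
    mime_type: str
    size_bytes: int
    modified: datetime


@dataclass
class QuickCrawlFilter:
    # Every field is optional, as in the Quick Crawl Filter window.
    mime_type: Optional[str] = None
    extension: Optional[str] = None
    modified_after: Optional[datetime] = None
    modified_before: Optional[datetime] = None
    min_size_bytes: Optional[int] = None
    max_size_bytes: Optional[int] = None

    def matches(self, obj: CrawledObject) -> bool:
        """Return True only if the object passes every rule that is set."""
        if self.mime_type is not None and obj.mime_type != self.mime_type:
            return False
        if self.extension is not None:
            suffix = "." + self.extension.lower().lstrip(".")
            if not obj.url.lower().endswith(suffix):
                return False
        if self.modified_after is not None and obj.modified <= self.modified_after:
            return False
        if self.modified_before is not None and obj.modified >= self.modified_before:
            return False
        if self.min_size_bytes is not None and obj.size_bytes < self.min_size_bytes:
            return False
        if self.max_size_bytes is not None and obj.size_bytes > self.max_size_bytes:
            return False
        return True


# Hypothetical crawl results for illustration only.
found = [
    CrawledObject("http://www.example.com/a.mpg", "video/mpeg", 5_000_000, datetime(2006, 3, 1)),
    CrawledObject("http://www.example.com/b.gif", "image/gif", 2_000, datetime(2006, 3, 1)),
]
rule = QuickCrawlFilter(extension="mpg", min_size_bytes=1_000_000)
print([o.url for o in found if rule.matches(o)])
```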
Adding a Crawl Task to a Channel
The crawler application methodically and automatically searches websites and FTP sites that meet the specified crawl criteria and makes a copy of the visited pages for later processing. The crawler application starts with a list of URLs to visit, identifies every link on the page associated with each URL, and adds these links to the list of URLs to visit.
The process ends after one or more of the following conditions are met:
•
Links have been followed to a specified depth.
•
The maximum number of objects has been acquired.
•
The maximum content size has been acquired.
If you use a manifest file, you can specify both the max-number and max-size attributes in the <crawler> tag as the criteria to stop a crawler job; whichever condition occurs first stops the job. For example, if the crawler has acquired the maximum number of objects specified in the manifest file, the crawler stops, even if it has not yet acquired the maximum content size.
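The stopping conditions can be illustrated with a small breadth-first crawl over an in-memory link graph. This is a sketch under stated assumptions, not the acquirer's algorithm: the function name, the dictionaries that stand in for a website, and the choice to stop before the size limit would be exceeded are all hypothetical. The depth semantics (-1 for no constraint, 0 for the starting URL only) follow the Link Depth field described in this chapter.

```python
from collections import deque


def crawl(start_url, links, sizes, depth=-1, max_number=None, max_size=None):
    """Return the URLs acquired, in order, until a stop condition is met.

    links: dict mapping a URL to the URLs that it references
    sizes: dict mapping a URL to its size in bytes
    """
    acquired, total = [], 0
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, level = queue.popleft()
        if max_number is not None and len(acquired) >= max_number:
            break   # maximum number of objects reached
        size = sizes.get(url, 0)
        if max_size is not None and total + size > max_size:
            break   # assumption: stop before exceeding the size limit
        acquired.append(url)
        total += size
        if depth != -1 and level >= depth:
            continue   # depth limit reached; do not follow links further
        for nxt in links.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, level + 1))
    return acquired


# Hypothetical site layout for illustration only.
links = {"/": ["/a", "/b"], "/a": ["/a/1"], "/b": []}
sizes = {"/": 10, "/a": 20, "/b": 30, "/a/1": 40}
print(crawl("/", links, sizes, depth=1))
print(crawl("/", links, sizes, max_number=2, max_size=1000))
```

In the second call both limits are set, and the object count is reached first, so the crawl stops even though the size limit is far from exhausted.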
If you use the Content Distribution Manager GUI, you cannot specify both attributes for crawl tasks. However, after you save the GUI-generated manifest file (click the Save settings locally icon), you can modify it and add additional XML tags manually using any text or XML editor.
Note
If you change the manifest file that you saved, and you want to use that manifest file instead of the content that you defined in the Content Distribution Manager GUI, or if you want to use the manifest file for another channel, then you must use the Specify external manifest file setting (see "Choosing a Content Acquisition Method for a Channel" section) and point to the manifest file. When you change the content acquisition method from using the Content Distribution Manager GUI to using an external manifest file, any content items that you added using the GUI are removed from the GUI.
To add a crawl task to a channel using the Content Distribution Manager GUI, follow these steps:
Step 1
Choose Services > Web > Channels. The Channels window appears, listing all channels in your ACNS network.
Step 2
Click the Edit icon next to the channel for which you want to configure crawl jobs. The Modifying Channel window appears.
Step 3
In the Contents pane, choose Channel Content. The Content Acquisition Method for Channel-Use GUI to Specify Content Acquisition window appears.
Step 4
Click the Add Content icon in the taskbar to specify a crawl job. The Content Manager window for the channel appears.
Step 5
From the Source URL drop-down list, choose the communication protocol (HTTP, HTTPS, or FTP) that will be used to crawl the content from the origin server. The default is HTTP.
Step 6
In the Source URL field, enter the URL of the host server (origin server) to specify the location from which to start crawling the website or FTP server. You can right-click to copy and paste URLs from a web browser to the Source URL field.
The value entered here is associated with the full URL of the start-url attribute of the <crawler> element. This value is also associated with the proto attribute of the <host> element. The proto attribute can be empty if the name attribute of the <host> element is a fully qualified domain name (FQDN).
Step 7
In the Link Depth field, specify how many levels of a website are to be crawled or how many directory levels of an FTP server are to be crawled. The default depth is 10 levels. The range is -1 to 2147483636.
If depth = -1, there is no depth constraint.
If depth = 0, acquire only the starting URL.
If depth = 1, acquire the starting URL and all the content it references.
This value is associated with the depth attribute of the <crawler> element.
Step 8
Click the Define a crawl task radio button.
Step 9
To add the crawl task to the channel, click Submit. When you click Submit, the local manifest file is automatically reparsed, changes are detected, and the corresponding items are acquired or removed.
To define optional content acquisition rules for the crawl task, see the "Configuring Content Acquisition Rules for a Crawl Task" section.
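The Link Depth values described in Step 7 can be illustrated with a small Python sketch that walks a toy link graph. The graph, the URLs, and the breadth-first traversal are assumptions for illustration; the sketch does not describe how the acquirer actually crawls.

```python
from collections import deque


def crawl(links, start_url, depth):
    """Return the URLs reachable from start_url within `depth` link levels.

    links -- dict mapping a URL to the list of URLs it references (toy data)
    depth -- -1 for no depth constraint, 0 for the starting URL only,
             1 for the starting URL plus everything it references, and so on
    """
    seen = {start_url}
    order = [start_url]
    queue = deque([(start_url, 0)])
    while queue:
        url, level = queue.popleft()
        # Do not follow links from pages that are already at the depth limit.
        if depth != -1 and level >= depth:
            continue
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append((target, level + 1))
    return order


# Hypothetical site structure used only for this example.
site = {
    "/index.html": ["/a.html", "/b.html"],
    "/a.html": ["/c.html"],
    "/c.html": ["/d.html"],
}
print(crawl(site, "/index.html", 0))
print(crawl(site, "/index.html", 1))
print(crawl(site, "/index.html", -1))
```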
Configuring Content Acquisition Rules for a Crawl Task
To configure content acquisition rules, follow these steps:
Step 1
Navigate to the content acquisition rules configuration area:
a.
Choose Services > Web > Channels. The Channels window appears, listing all channels in your ACNS network.
b.
Click the Edit icon next to the channel for which you want to configure crawl jobs. The Modifying Channel window appears.
c.
In the Contents pane, choose Channel Content. The Content Acquisition Method for Channel-Use GUI to Specify Content Acquisition window appears, listing the content items and crawl tasks defined for this channel.
d.
Click the Edit icon next to the crawl task for which you want to configure acquisition rules. The Content Manager window appears.
e.
Click the Show optional content acquisition rules drop-down arrow. The fields for configuring content acquisition rules appear. (See Figure 6-5.) Table 6-1 describes the fields in this window and provides the corresponding manifest file attributes.
Figure 6-5 Configuring Content Acquisition Rules for a Crawl Task
Step 2
In the MIME type field, specify the MIME type of the content to be acquired.
Step 3
In the Extension field, specify the extension of the files to be acquired.
Step 4
In the Time before field, enter the time that must be matched if content is to be acquired. Files that were modified before this time will be acquired. Use the dd-mm-yyyy hh:mm:ss [TMZ] format, where TMZ (the time zone) is optional. UTC is the default. For example, you can specify 5/27/2004 14:02:07 UTC, which is a valid date format.
Alternatively, click the Calendar icon next to the Time before and Time after fields to display the Date Time Picker popup window.
In the Date Time Picker popup window, use the left or right arrow icons to choose previous and following years, if required. Choose a month from the drop-down list. Click a day of the month. The chosen date is highlighted in blue. Alternatively, click Set Today to revert to the current day. The Time field displays the current system time. Edit the fields to configure a different time. Click Apply.
Step 5
In the Time after field, enter the time that must be matched for content to be acquired. Files that were modified after this time will be acquired. Use the dd-mm-yyyy hh:mm:ss [TMZ] format, where TMZ (the time zone) is optional. UTC is the default.
Step 6
In the Minimum size field, specify the minimum size of content to be acquired. Content equal to or larger than this value is acquired. Choose MB, KB, or Bytes as the unit of measure.
Step 7
In the Max size field, specify the maximum size of content to be acquired. Content equal to or smaller than this value is acquired. Choose MB, KB, or Bytes as the unit of measure.
Step 8
Click Add to add the rule to the list of rules. An entry is added, showing the values under each column head.
Note
A maximum of 10 rules can be configured for each crawl task.
Step 9
To modify a content acquisition rule, click the Edit Rule icon next to the rule whose settings you want to change. Once you have finished, click the small Update button in the content acquisition rules section to save the edits.
Step 10
To delete a content acquisition rule, click the Edit Rule icon next to the rule. Click the Delete button in the content acquisition rules section. The rule is removed from the rules listing.
Step 11
When you have finished adding and modifying content acquisition rules, click Update in the Content Manager window to save your crawl task configurations and return to the list of content items.
Table 6-1 Content Acquisition Rules for a Crawl Task
GUI Parameter | Function | Corresponding Manifest Attribute
MIME Type | MIME type of content to be acquired. The object is acquired only if its MIME type matches the MIME type specified here. | <match> mime-type
Extension | File extension of content to be acquired. The object is acquired only if its extension matches the extension specified here. | <match> extension
Time before | Date and time limit before which modified content is to be acquired. The object is acquired only if its last modified date and time are before the value specified here. The format of this field is dd-mm-yyyy hh:mm:ss [TMZ], where the time zone (TMZ) is optional. UTC is the default. | <match> time-before
Time after | Date and time limit after which modified content is to be acquired. The object is acquired only if its last modified date and time are after the value specified here. The format of this field is dd-mm-yyyy hh:mm:ss [TMZ], where the time zone (TMZ) is optional. UTC is the default. | <match> time-after
Minimum Size | Minimum size for content; acquired content size must be larger than or equal to this number, in bytes, kilobytes, or megabytes. The range for this field is 0 to 2147483636. | <match> minFileSizeInB/KB/MB
Maximum Size | Maximum size for content; acquired content size must be smaller than or equal to this number, in bytes, kilobytes, or megabytes. The range for this field is 0 to 2147483636. | <match> maxFileSizeInB/KB/MB
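The way the criteria in Table 6-1 narrow down which objects are acquired can be sketched in Python as follows. The class and function names are hypothetical, sizes are taken to be in bytes, and the sketch assumes that every criterion set in a rule must hold for an object to match. How the software combines several rules on one crawl task is not covered here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class MatchRule:
    """One content acquisition rule; a field left as None is not checked."""
    mime_type: Optional[str] = None
    extension: Optional[str] = None
    time_before: Optional[datetime] = None
    time_after: Optional[datetime] = None
    min_size: Optional[int] = None  # bytes
    max_size: Optional[int] = None  # bytes


def rule_matches(rule, mime_type, extension, modified, size):
    """Return True if the object satisfies every criterion set in the rule."""
    if rule.mime_type is not None and mime_type != rule.mime_type:
        return False
    if rule.extension is not None and extension != rule.extension:
        return False
    # Time before: last-modified time must be earlier than the limit.
    if rule.time_before is not None and not modified < rule.time_before:
        return False
    # Time after: last-modified time must be later than the limit.
    if rule.time_after is not None and not modified > rule.time_after:
        return False
    # Size limits are inclusive ("larger than or equal to", "smaller than or equal to").
    if rule.min_size is not None and size < rule.min_size:
        return False
    if rule.max_size is not None and size > rule.max_size:
        return False
    return True


rule = MatchRule(extension="jpg", min_size=1024,
                 time_after=datetime(2004, 5, 27, 14, 2, 7))
print(rule_matches(rule, "image/jpeg", "jpg", datetime(2004, 6, 1), 2048))
print(rule_matches(rule, "image/jpeg", "jpg", datetime(2004, 6, 1), 512))
```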
Configuring Advanced Content Serving Settings
In the Content Distribution Manager GUI under the Show advanced settings heading in the Content Manager window, you can configure the following settings:
•
Content Serving Time
•
Authentication
•
URL Settings
•
Content Settings
By configuring these advanced settings, you can control the manner in which content is served by the Content Engines. These settings correspond to attributes in the manifest file and are associated with the <item> and <crawler> tags. These same attributes can also be specified in the <item-group> or <options> tags in a manifest file, so they can be shared by their <item> and <crawler> subtags.
To configure advanced settings for serving content, follow these steps:
Step 1
Navigate to the advanced settings configuration area:
a.
Choose Services > Web > Channels.
b.
Click the Edit icon next to the name of the channel for which you wish to configure content serving settings.
c.
In the Contents pane, choose Channel Content.
d.
Click the Edit icon next to the content item or crawl task that you want to configure.
Alternatively, click the Add Content icon and configure the optional advanced settings along with the required settings for a new content item or crawl task.
e.
In the Content Manager window, click the Show advanced settings drop-down arrow. (See Figure 6-5.) The fields for configuring advanced settings are displayed, and the arrow becomes the Hide advanced settings arrow. (See Figure 6-6 and Figure 6-7.) Table 6-2 describes the fields in these windows and provides the corresponding manifest file attributes.
Figure 6-6 Advanced Settings for Serving Content—Top of Window
Figure 6-7 Advanced Settings for Serving Content—Bottom of Window
Step 2
Under the Content Serving Time section, configure the following:
a.
Check the High priority content check box to specify the order of importance and hence the processing order of the item acquisition or crawl job. The higher the priority, the earlier the content is acquired and distributed.
b.
In the Start serving time field, specify a time in dd-mm-yyyy hh:mm:ss [TMZ] format at which the Content Engine is allowed to start serving the content. The time zone (TMZ) is optional; UTC is the default. For example, you can specify 5/27/2004 14:02:07 UTC, which is a valid date format. If you do not specify a serving start time, content is ready to serve once it is distributed to the Content Engine.
c.
In the Stop serving time field, specify a time in dd-mm-yyyy hh:mm:ss [TMZ] format at which the ACNS network temporarily stops serving the content. The time zone (TMZ) is optional; UTC is the default. If you do not specify a serving stop time, the ACNS network serves the content to the Content Engine until it is removed by modifying the manifest file or renaming the channel.
Alternatively, click the Calendar icon next to the Start serving time and Stop serving time fields to display the Date Time Picker popup window. In the Date Time Picker popup window the current date is highlighted in yellow by default. Use the left or right arrow icons to choose the previous or following years if required. Choose a month from the drop-down list. Click a day of the month. The chosen date is highlighted in blue. The Time field displays the current system time. Edit the fields to configure a different time. Click Apply. Alternatively, click Set Today to revert to the current day. The date and time chosen are displayed in the Start serving time or Stop serving time fields.
Step 3
Under the Authentication section, configure the following:
a.
Check the Use weak SSL certificate check box if you want to allow the HTTP protocol to accept expired or self-signed certificates.
b.
Check the Disable basic authentication check box if you do not want NTLM headers to be stripped off to allow fallback to the basic authentication method while acquiring content.
If you leave this check box unchecked, NTLM authentication headers can be stripped off and fallback to the basic authentication method occurs. The username and password information are passed to the origin server in clear text with a basic authentication header.
c.
In the Username field, specify the name of the user for proxy authentication.
d.
In the Password field, specify the password of the user for proxy authentication.
e.
In the User Domain Name field, specify the NTLM user domain name for the NTLM authentication scheme configured on the proxy.
Step 4
Under the URL Settings section, configure the following:
a.
Check the No redirect to origin server check box if you do not want to redirect content requests to the origin server. If unchecked, the Content Engine is allowed to redirect content requests to the origin server when it does not have the content in its cache.
b.
Check the Ignore query string check box for the ACNS software to ignore any string after the question mark (?) character in the requested URL for playback.
c.
In the Alternate URL field, specify an alternative URL to which the ACNS network can redirect the request if the content requested by the user has not yet been replicated to the Content Engine (that is, it is not ready in the ACNS network).
Step 5
Under the Content Settings section, configure the following:
a.
In the TTL field, specify a time period for revalidation of content, and choose a unit of measure from the drop-down list. If no TTL is specified, the content is fetched only once, and its freshness is never checked again.
b.
In the Retry interval field, specify a time period after which the ACNS software attempts to acquire the content again if acquisition fails, and choose a unit of measure from the drop-down list.
c.
In the Play duration field, specify the playtime duration for a video file, and choose a unit of measure from the drop-down list.
d.
In the Bit rate field, specify the bit rate of the content for download and playback, and choose a unit of measure from the drop-down list.
Table 6-2 Advanced Settings for Serving Content
GUI Parameter | Function | Corresponding Manifest Attribute
Content Serving Time
High priority content | Processing order of the content item or crawl job. | priority
Start serving time | Time at which the Content Engine is allowed to start serving the content. | serveStartTime
Stop serving time | Time at which the ACNS network temporarily stops serving the content. | serveStopTime
Authentication
Use weak SSL certificate | Allows acceptance of expired or self-signed certificates during authentication. | sslAuthtype
Disable basic authentication | When checked, prevents NTLM headers from being stripped off, which disables fallback to the basic authentication method while acquiring content. | disableBasicAuth
User name | Name of the user for proxy authentication. | user
Password | Password of the user for proxy authentication. | password
User Domain Name | NTLM user domain name for the NTLM authentication scheme configured on the proxy. | ntlmUserDomain
URL Settings
No redirect to origin server | Disallows redirecting content requests to the origin server. This attribute is a per content object attribute, meaning that if the content has been removed, the redirection settings do not apply. However, if the software fails to acquire the content, then the settings apply. | noRedirectToOrigin
Ignore query string | Software ignores any string after the question mark (?) character in the requested URL for playback. | ignoreQueryString
Alternate URL | Alternative URL to which the ACNS network can redirect the request if the content requested by the user has not yet been replicated to the Content Engine. This attribute supports only the full URL and applies to the content being requested. If the content is deleted, the attribute does not apply. If content acquisition fails, then this attribute should apply. | alternateUrl
Content Settings
TTL | Time period for revalidation of content. | ttl
Retry interval | Time period after which the ACNS software attempts to acquire the content again if acquisition fails. | failRetryInterval
Play duration | Playtime duration of a video file. | playDuration
Bit rate | Bit rate of the content for download and playback. | bitrate
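Two of the settings in Table 6-2, the serving time window and the Ignore query string option, can be illustrated with a short Python sketch. The function names are hypothetical, and the treatment of the exact start and stop instants (start inclusive, stop exclusive) is an assumption rather than documented behavior.

```python
from datetime import datetime


def is_servable(now, serve_start=None, serve_stop=None):
    """Return True if content may be served at `now`.

    serve_start / serve_stop stand in for serveStartTime / serveStopTime.
    With no start time, content is servable as soon as it is distributed;
    with no stop time, it stays servable until it is removed.
    """
    if serve_start is not None and now < serve_start:
        return False
    if serve_stop is not None and now >= serve_stop:
        return False
    return True


def playback_key(url, ignore_query_string=False):
    """Drop everything after '?' when the Ignore query string option is set."""
    if ignore_query_string:
        return url.split("?", 1)[0]
    return url


start = datetime(2004, 5, 27, 14, 2, 7)
stop = datetime(2004, 6, 27, 14, 2, 7)
print(is_servable(datetime(2004, 6, 1), start, stop))
print(is_servable(datetime(2004, 7, 1), start, stop))
# www.example.com is a placeholder host name.
print(playback_key("http://www.example.com/clip.asf?user=1", ignore_query_string=True))
```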
Modifying Multiple Content Items
If multiple content items (single items and crawl tasks) need to be configured with the same content acquisition and distribution properties, you can select them and modify their link depth and advanced content serving settings in one instance. However, source URLs cannot be modified while you edit multiple content items and crawl tasks. Similarly, content acquisition rules, applicable for crawl tasks, cannot be modified for multiple crawl tasks. If you have selected both single items and crawl tasks for modification, even if you specify a common link depth for all crawl tasks, this value is not applied to single items.
To modify multiple content items, follow these steps:
Step 1
Choose Services > Web > Channels. The Channels window appears, listing all channels in your ACNS network.
Step 2
Click the Edit icon next to the channel that contains the content items that you want to modify. The Modifying Channel window appears.
Step 3
In the Contents pane, choose Channel Content. The Content Acquisition Method for Channel-Use GUI to Specify Content Acquisition window appears.
Step 4
Check the check box next to each content item that you want to modify.
•
To choose all listed content items in the current window, click the All button at the bottom of the window. (You can choose to display 10, 20, 40, or ALL content items from the Rows drop-down list.)
•
To deselect chosen items, click the None button and make a different choice.
Step 5
Click the Edit Selected Items button at the bottom of the window to modify selected content items. The Channel-Content Manager window appears. By default, all fields are deactivated in this window.
Step 6
Click the Click to override individual settings icon next to each field that you want to modify. The fields become available for editing. Alternatively, click the Click to revert initial settings icon next to the field to preserve the original settings set during configuration.
Step 7
Click the Show advanced settings drop-down arrow and configure advanced settings.
Step 8
Click Update to save and apply modified settings to the chosen content items. The Content Acquisition Method for Channel-Use GUI to Specify Content Acquisition window appears.
Deleting Multiple Content Items
To delete multiple content items, follow these steps:
Step 1
Choose Services > Web > Channels. The Channels window appears, listing all channels in your ACNS network.
Step 2
Click the Edit icon next to the channel that contains the content items that you want to delete. The Modifying Channel window appears.
Step 3
In the Contents pane, choose Channel Content. The Content Acquisition Method for Channel-Use GUI to Specify Content Acquisition window appears.
Step 4
Check the check box next to each content item that you want to delete.
•
To choose all listed content items in the current window, click the All button at the bottom of the window. (You can choose to display 10, 20, 40, or ALL content items from the Rows drop-down list.)
•
To deselect chosen items, click the None button, and make a different choice.
Step 5
Click the Delete Selected Items button in the taskbar to remove the selected content items. You are prompted to confirm your decision. The Content Acquisition Method for Channel-Use GUI to Specify Content Acquisition window refreshes itself with an updated list of content items.
Acquiring Pre-Positioned Content Using an Externally Hosted Manifest File
To configure your ACNS network to acquire content using a manifest file, you must complete the following tasks:
1.
Create a Manifest File
2.
Specify Channel Acquisition and Distribution Properties
3.
Configure Manifest File and Proxy Information for the Channel
4.
Generate the Publishing URL
5.
Assign a Playserver

Caution 
If you are using an application that dynamically generates and publishes manifest files, you must take precautions before you upgrade this application software in your ACNS network. During the software upgrade of the manifest publishing software, the manifest file could become invalid, causing the distributed content to be inadvertently deleted.
To avoid losing content in this situation, we recommend that before you upgrade the manifest publishing software, you disable acquisition and distribution processes on the root Content Engine and on any potential temporary root Content Engines in the root location. Any Content Engine that is in the same channel and same location as the root Content Engine is a potential temporary root. After the application software upgrade is complete, you can restart the acquisition and distribution process on each Content Engine.
To disable the acquisition and distribution process, use the acquisition-distribution stop EXEC command. To restart the acquisition and distribution process, use the acquisition-distribution start EXEC command.
Create a Manifest File
See Appendix A, "Creating Manifest Files," for details on creating manifest files.
After you create the manifest file, use the Manifest Validator utility to verify the syntax. (See the "Manifest Validator Utility" section on page A-28.) Next, specify the manifest URL in the Content Distribution Manager GUI. If authentication is required to fetch the manifest file, specify a username and password as well. (See the "Configure Manifest File and Proxy Information for the Channel" section.)
Specify Channel Acquisition and Distribution Properties
This section discusses four important content acquisition and distribution properties that you need to define:
•
Distribution priority
•
Item priority
•
Channel quota
•
Update interval
Some of these properties are configured in the Services > Web > Channels section of the Content Distribution Manager GUI, and some are defined in the manifest file.
Distribution Priority
The distribution priority setting determines the priority of content acquisition and distribution. You configure this setting from the Distribution Priority drop-down list in the Content Distribution Manager GUI. The distribution priority values are High (750), Normal (500), or Low (250). Figure 6-8 shows the acquisition and distribution properties and the manifest properties that you can configure in the Content Distribution Manager GUI. See the "Creating a Channel" section on page 5-11 for channel configuration procedures and field descriptions.
Figure 6-8 Acquisition and Distribution Properties for a Channel
The priority of content acquisition also depends on the origin server. Requests from different origin servers are processed in parallel. Requests from the same origin server are processed sequentially by their overall priority.
Item Priority
The item priority is determined by the manifest file. If the priority attribute is specified in the manifest file under the <item> tag, it is used as the item priority; if it is not specified, the order in which the item is listed in the manifest file (index number) is used as the item priority. (All crawled pages have the same item priority; therefore, you do not need to specify the priority attribute under the <crawler> tag.) See the "Specifying Content Priority" section on page A-13 for more information.
Channel Quota
The channel quota is the disk space allowed for the channel. You configure the channel quota in the Content Distribution Manager GUI. (See Figure 6-8.) See the "Configure Manifest File and Proxy Information for the Channel" section for configuration information.
When configuring the channel quota, keep in mind the following:
•
The total of channel quota in all subscribed channels should not exceed the cdnfs disk space allocation of the Content Engine. See the "Updating Storage Capacity on Your Content Engines" section in the Cisco ACNS Software Update and Maintenance Guide.
•
The total of used disk space in a channel should not exceed the amount of disk space that you allocated for the channel in the Content Distribution Manager GUI Channel Quota field (Services > Web > Channels > Definition).
Because of overhead, the amount of disk space used by a file is always larger than the size of the file itself. To figure the amount of disk space needed for a file, follow these steps:
a.
Divide the actual file size in bytes by the file system block size, which is a fixed 4-KB (4096-byte) unit, and then round up the result to the next integer. This provides the number of filled and partially filled 4-KB blocks used by a file.
(File size in bytes / 4096) rounded up to the next integer value = Total number of blocks per file
b.
Multiply the total number of file system blocks used by 4 KB (4096 bytes) to calculate the actual disk space consumed in bytes.
Total blocks per file * 4096 = Total disk usage in bytes
c.
Multiply 4 KB by 4 and add the product to the total disk space consumed. (The integer 4 represents disk space that is reserved for internal system usage.)
Total disk usage in bytes + (4096 bytes * 4) = Disk usage per file
Also, because the software attempts to reserve enough space for other minor internal system functions, it is helpful to configure your channel quotas (and pre-positioned disk space) with a modest amount (perhaps 10 percent) of extra space beyond the total disk space consumed.
Channel quota in kilobytes = (Total disk usage in kilobytes) + (0.1 * Total disk usage in kilobytes)
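The calculation above can be worked through with a short Python sketch. The function names and the sample file sizes are illustrative assumptions; the 4096-byte block size, the four reserved blocks, and the 10 percent of extra space come from the steps above.

```python
import math

BLOCK_SIZE = 4096      # fixed file system block size, in bytes
RESERVED_BLOCKS = 4    # blocks reserved per file for internal system usage


def disk_usage_bytes(file_size_bytes):
    """Disk space consumed by one file, following steps a through c."""
    blocks = math.ceil(file_size_bytes / BLOCK_SIZE)   # step a
    usage = blocks * BLOCK_SIZE                        # step b
    return usage + RESERVED_BLOCKS * BLOCK_SIZE        # step c


def channel_quota_kb(file_sizes_bytes, headroom=0.10):
    """Suggested channel quota in KB, with about 10 percent of extra space."""
    total = sum(disk_usage_bytes(size) for size in file_sizes_bytes)
    return math.ceil(total * (1 + headroom) / 1024)


# A 10,000-byte file fills 3 blocks (12,288 bytes) plus 16,384 reserved bytes.
print(disk_usage_bytes(10000))            # 28672
# Two files of 10,000 and 4,096 bytes, with 10 percent headroom.
print(channel_quota_kb([10000, 4096]))    # 53
```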
Manifest File Update Interval
The update interval is the interval for the root Content Engine to check the manifest file itself. (This is not the update interval for checking content.) You configure the update interval in the Content Distribution Manager GUI. (See the Check Manifest field in Figure 6-9.) For configuration information, see the next section.
Configure Manifest File and Proxy Information for the Channel
The manifest file provides information about the content to be pre-positioned through the channel or information about the live and video-on-demand (VoD) content served through the channel.
To configure manifest file and proxy information for the channel, follow these steps:
Step 1
In the Content Distribution Manager GUI, choose Services > Web > Channels.
Step 2
Click the Edit icon next to the name of the channel that you want to modify. The Modifying Channel window appears.
Step 3
In the Contents pane, choose Channel Content.
Step 4
In the Content Acquisition Method for Channel title bar, make sure that the content acquisition method for the channel is to specify an external manifest file. If this is not the case, click the Change Method button and choose Specify external manifest file from the drop-down list. Then click Save. The window refreshes, displaying the manifest settings and proxy information fields.
Note
If you are changing the method of content acquisition from using the GUI to using an externally hosted manifest file, all content that was defined in the GUI is removed. If you want to save the content that you defined using the GUI, you must click the Save settings locally icon in the taskbar. A window pops up showing the GUI-generated manifest file. Choose File > Save As to save the file.
Step 5
Use the fields provided under the Define Basic Manifest Settings heading to configure manifest file information for the channel.
Step 6
Use the fields provided under the Define Basic Manifest Proxy Information heading to configure the proxy information for the acquirer to fetch the manifest file. If a proxy server is configured, requests to fetch the manifest file from the origin server will go through the proxy server.
(See Figure 6-9.) Table 6-3 describes the manifest file fields in this window. Required fields are indicated by an asterisk in the GUI and in the table.
(See also the "Configuring a Proxy Server and Port for the Manifest File" section and the "Configuring Authentication for a Proxy to Fetch the Manifest File" section.)
Figure 6-9 Manifest File and Proxy Information for a Channel
Step 7
To save the manifest file settings, click Submit. If you have not specified the manifest file URL, the system prompts you to confirm your decision not to supply one. You can specify the manifest file URL at a later time after configuring other settings. Click OK to confirm.
Table 6-3 Manifest Properties
Property | Description
Define Basic Manifest Settings
Manifest URL | Address of the manifest file for the channel. The manifest URL must be a well-formed URL. If the protocol (FTP, HTTP, or HTTPS) for the URL is not specified, HTTP is used.
Check Manifest Every* | Frequency in minutes (0 to 52560000) with which the Content Engines assigned to the channel check for updates to the manifest file. (This field is required if the manifest file is specified.)
Weak Certificate Verification | When checked, enables weak certificate verification for the manifest file. This is applicable when the manifest file is fetched using the HTTPS protocol. Note: To use weak certification for channel content, you need to specify weak certification within the manifest file.
Disable basic authentication | When checked, NTLM headers cannot be stripped off to allow fallback to the basic authentication method. If you leave this check box unchecked, NTLM authentication headers can be stripped to allow fallback to the basic authentication method, and the username and password information can be passed to the origin server in clear text with a basic authentication header.
Manifest Username | Username to fetch the manifest. The manifest username must be a valid ID. If the server allows anonymous login, the user ID can be null. Note: The Username and Password fields allow you to enter any secure login information needed to access the manifest file at its remote location.
Manifest Password | Password for the user.
Confirm Password | Password confirmation.
NTLM user domain name | NTLM user domain name to be allowed access by the NTLM authentication scheme configured on the origin server.
Define Manifest Proxy Information
Disable All Proxy | Disables the outgoing proxy server for fetching the manifest file. Any outgoing proxy server configured on the root Content Engine will be bypassed, and the acquirer will contact the origin server directly.
Proxy Hostname | Host name or IP address of the proxy server used by the acquirer to retrieve the manifest file.
Proxy Port | Port number of the proxy on which the acquirer fetches the manifest file. The range is from 1 to 65535.
Proxy Username | Name of the user to be authenticated to fetch the manifest file.
Proxy Password | Password of the user to pass authentication on the proxy.
Confirm Password | Reentry of the same password for confirmation to pass authentication on the proxy.
Disable proxy basic authentication | When checked, NTLM headers will not be stripped off to allow fallback to the basic authentication method against Microsoft Internet Information Services (IIS) servers. If you leave this check box unchecked, NTLM authentication headers can be stripped to allow fallback to the basic authentication method; the username and password information can be passed to the proxy server in clear text with a basic authentication header.
Proxy NTLM user domain name | NTLM user domain name to be allowed access by the NTLM authentication scheme configured on the proxy.
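If you prepare these values outside the GUI, for example in a provisioning script, a quick pre-check against the limits in Table 6-3 can catch mistakes early. The following Python sketch is an illustration only: the function names and error messages are hypothetical, and it encodes just the HTTP default for the manifest URL, the Check Manifest Every range, and the Proxy Port range.

```python
def normalize_manifest_url(url):
    """Apply the HTTP default when the manifest URL names no protocol."""
    if "://" not in url:
        return "http://" + url
    scheme = url.split("://", 1)[0].lower()
    if scheme not in ("ftp", "http", "https"):
        raise ValueError("unsupported protocol: " + scheme)
    return url


def check_manifest_settings(check_interval_minutes, proxy_port=None):
    """Return a list of problems found; an empty list means the values are in range."""
    errors = []
    if not 0 <= check_interval_minutes <= 52560000:
        errors.append("Check Manifest Every must be 0 to 52560000 minutes")
    if proxy_port is not None and not 1 <= proxy_port <= 65535:
        errors.append("Proxy Port must be 1 to 65535")
    return errors


# www.example.com is a placeholder host name.
print(normalize_manifest_url("www.example.com/manifest.xml"))
print(check_manifest_settings(60, proxy_port=8080))
print(check_manifest_settings(-1, proxy_port=70000))
```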
Generate the Publishing URL
A publishing URL is the URL that plays back pre-positioned content in the ACNS network. A complete publishing URL consists of three parts:
•
Scheme
•
Domain name
•
Path
The path includes both the file directory path and the file name. The playserver list determines the publishing URL for the ACNS network. The playserver list is generated directly through the manifest file, through the <playServerTable> tag in the manifest file, or through the default playserver table.
Scheme
The scheme of the publishing URL is the protocol used to play the content type. For example, if an .asf video file can be played by both an HTTP and a RealMedia playserver, two URL schemes can be used to access this content: HTTP and RTSP.
The scheme is determined by the type of playserver and has the following direct mapping between playserver and scheme:
Playserver | Scheme
HTTP | HTTP
RealMedia | RTSP
QTSS | RTSP
Domain Name
The domain name of the publishing URL is determined by the configuration of the ACNS network. If WCCP is used to redirect content requests to a Content Engine, the domain name in the request URL is the origin server FQDN (fully qualified domain name) in the website or channel. If content routing is used, the content routing FQDN (the FQDN of the website) becomes the domain name.
All content acquired through the manifest file is published under the domain name that is entered in the Content Distribution Manager GUI (Services > Web > Channels > Edit channel > Edit Website) in the website Origin Server field (in the case of WCCP routing) or in the Request Routed FQDN field (in the case of a Content Router).
The content must be accessible from the website origin server FQDN, even if it is acquired from a different server.
Note
The manifest file contains a <server> tag. The "server" in the case of this manifest tag refers to "acquisition server," and may or may not be the same as the website origin server FQDN that is published during content serving. Note the distinction between the acquisition server and the website origin server FQDN.
Manifest file attributes such as requireAuth and noRedirectToOrigin refer to the website origin server. Attributes such as ttl, user, and password are related to the acquisition server. Thus, if you specify the requireAuth attribute for any content item, make sure that the origin server FQDN you entered in the Content Distribution Manager GUI Create New Web Site window or Modifying Web Site window (accessed through Services > Web > Channels > Edit channel > Edit Website) is accurate and can do the following:
1. Accept requests for the path as stated in the manifest.
2. Accept authentication requests from end users for this URL.
This reminder is applicable even when you have multiple acquisition servers specified in the manifest file. Because the "real" origin server is still the website origin server FQDN, you need to make sure that content is accessible from the website origin server FQDN and that the website origin server can accept authentication requests.
Path
In most cases, the path of the publishing URL is the relative source URL, or the src attribute in the <item> tags. For content crawling, it is a relative URL, relative to the host name of the origin server.
Certain attributes in the manifest file allow you to alter the publishing URL path. These attributes are cdn-url in the <item> tag, and srcPrefix or cdnPrefix in the <crawler> and <item-group> tags. These attributes convert a relative source URL into a completely new relative ACNS network URL.
For the content in the following example, the path uses default.html instead of index.html.
<item src="index.html" cdn-url="default.html" />
The relative URL is always relative to the host name. In the following example, the relative URL is index.html, not sport/index.html.
<host name="http://www.cnn.com/sport/" />
<item src="index.html" />
In the following example, the srcPrefix and cdnPrefix attributes convert the prefix of every crawled content object from NBA/ to ABC/. The relative cdn-url is ABC/*. The path for the start-url attribute is ABC/index.html.
start-url="NBA/index.html"
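The path rewriting described in this section can be sketched in a few lines. This is an illustration under stated assumptions: the function name publishing_path is hypothetical, and giving cdn-url precedence over the prefix attributes is an assumption of the sketch, not documented behavior.

```python
from typing import Optional


def publishing_path(src: str,
                    src_prefix: Optional[str] = None,
                    cdn_prefix: Optional[str] = None,
                    cdn_url: Optional[str] = None) -> str:
    """Compute the relative publishing path for a relative source URL."""
    if cdn_url is not None:
        # cdn-url replaces the whole relative path (precedence is assumed).
        return cdn_url
    if src_prefix is not None and cdn_prefix is not None and src.startswith(src_prefix):
        # srcPrefix/cdnPrefix swap only the leading prefix of the path.
        return cdn_prefix + src[len(src_prefix):]
    # Default: the publishing path is the relative source URL itself.
    return src
```

Calling publishing_path("NBA/index.html", src_prefix="NBA/", cdn_prefix="ABC/") yields "ABC/index.html", which matches the start-url example above.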
Special Characters in URLs
Certain special characters are not allowed in URLs, according to the RFC 2396 standard. If the URL of the content to be acquired contains illegal characters, you must rewrite the URL using the ASCII code equivalent escape characters to represent the illegal special characters.
For example, the escape sequence for a blank space is %20. An illegal URL that uses blank spaces in the filename, such as:
http://www.cnn.com/file with space
can be rewritten as:
http://www.cnn.com/file%20with%20space
You might have to rewrite the URL using the ASCII equivalent escape characters when you specify a URL in the Manifest URL field in the Channel definition window of the Content Distribution Manager GUI (Services > Web > Channels > Definition) or when you specify a URL in the manifest file within the following tags:
<host name=>
<item src=>
<crawler start-url=>
<crawler externalPrefixes=>
<match prefix=>
ACNS software enforces the following rules for escaping a URL:
• The special characters ! # $ & ' ( ) , ; = ? serve special functions in a URL. If your filename or folder name on the origin server uses these special characters, you must rewrite the URL.
In addition to the above special characters, you must also avoid using the following characters in filenames and folder names and substitute the ASCII equivalent shown:
space ==> %20
" ==> %22
% ==> %25
< ==> %3c
> ==> %3e
[ ==> %5b
\ ==> %5c
] ==> %5d
^ ==> %5e
{ ==> %7b
| ==> %7c
} ==> %7d
~ ==> %7e
• If a URL in the manifest file or in an HTML file uses a "?" character, the URL is rejected. If a URL uses a "#" character, the portion of the URL after the "#" is discarded.
• If a URL in a manifest file or in an HTML file uses other special characters, follow the requirements in RFC 2396.
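The substitution list and the "?" and "#" rules above can be sketched as a small helper for preparing URLs before you enter them in a manifest file. The function name is hypothetical. The sketch assumes the input is a raw, not yet escaped URL; because "%" is itself escaped, running it on an already escaped URL would escape it twice.

```python
# Substitutions copied from the list above.
ESCAPES = {
    " ": "%20", '"': "%22", "%": "%25", "<": "%3c", ">": "%3e",
    "[": "%5b", "\\": "%5c", "]": "%5d", "^": "%5e",
    "{": "%7b", "|": "%7c", "}": "%7d", "~": "%7e",
}
_TABLE = str.maketrans(ESCAPES)


def escape_for_manifest(raw_url: str) -> str:
    """Escape a raw URL using the substitutions listed above (hypothetical helper)."""
    if "?" in raw_url:
        # The text states that URLs containing "?" are rejected.
        raise ValueError("URLs containing '?' are rejected")
    # Everything after "#" is discarded.
    raw_url = raw_url.split("#", 1)[0]
    # Each character is translated once, so "%" is never double-escaped.
    return raw_url.translate(_TABLE)
```

For example, escape_for_manifest("http://www.cnn.com/file with space") returns "http://www.cnn.com/file%20with%20space".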
Assign a Playserver
The playserver is assigned in the manifest file. See the "Generating a Playserver List" section on page A-13 for more details.
The <playServer> tag is very important for playing back pre-positioned content; it contains a list of playservers to play back the content. All ACNS pre-positioned content needs to use one of the ACNS software-supported playservers to play back content to end users.
ACNS 5.x software supports playservers that play back the following pre-positioned content types on the ACNS network: HTTP, HTTPS, WMT, and RTSP (RealMedia and QuickTime Streaming Server [QTSS]).
You can use any protocol to request content; the protocol in the request implies which playserver is needed to play the content. The ACNS software checks whether the requested protocol matches the playserver list for that content. If it matches, the request is delivered. If it does not match, the request is rejected.
You can generate a playserver list through:
• The manifest file, by configuring playServer attributes in an <item> tag
• The <playServerTable> tag, by configuring playserver MIME-type extension names
To create the playserver list directly through the manifest file, configure playServer attributes of the playserver list in an <item> tag. If an <item> tag does not have a playServer attribute, its playserver list is generated through the <playServerTable> tag. If the <playServerTable> tag is omitted in the manifest file, a built-in default <playServerTable> tag is used to generate the playserver list. Multiple servers are separated by commas, as shown in the following example:
<item src="video.mpg" playServer="real,wmt" />
You can also generate the playserver list that supports these streaming media types through the <playServerTable> tag. The <playServerTable> tag maps content into a playserver list based on the MIME-type extension name. If there is a <playServerTable> tag in the manifest file, use it to generate the playserver list.
To generate the playserver list through the <playServerTable> tag, use MIME-type extension names to configure which playserver can play the particular pre-positioned content, as shown in the following example:
<contentType name="application/x-pn-realaudio" />
<contentType name="application/vnd.rn-rmadriver" />
<contentType name="application/pdf" />
<contentType name="application/postscript" />
The <playServerTable> tag is used to generate a playserver list for each content type. Note that in the preceding example, any file with a PDF or a PostScript extension uses HTTP to play the content.
HTTP and HTTPS are the default playservers. In other words, if you did not use the customized <playServerTable> tag or the playServer name field to specify playservers for content, HTTP and HTTPS are added to the playServer name fields to make sure the content can always be played back through HTTP and HTTPS.
If you do use a <playServerTable> tag or playServer attribute in the manifest file, HTTP and HTTPS are not automatically allowed for playback. In this case, if you want to use HTTP or HTTPS to play certain content, you must specify the protocol, as shown in this example:
<playServer name="tvout">
<playServer name="https">
<contentType name="text/html" />
<item src="http://www.aaa.com/a.asf" />
<item src="http://www.aaa.com/b.html" />
<item src="http://www.aaa.com/c.mpg" />
<item src="http://www.aaa.com/d.jpg" playServer="http,https" />
<item src="http://www.aaa.com/e.rm" />
In this example, the playserver list is generated as follows:
a.asf: wmt: from <playServerTable> by extension
b.html: https: from <playServerTable> by contentType
c.mpg: tvout: from <playServerTable> by extension
d.jpg: http + https: from "playServer" attribute
e.rm: real + http + https: from built-in <playServerTable>
Because there is no playserver specified in the manifest file for the .rm extension, the built-in rule matches it to the RealMedia playserver. The HTTP and HTTPS playservers are also added automatically whenever the built-in default <playServerTable> tag is used.
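The resolution order described above (playServer attribute first, then the manifest <playServerTable> tag, then the built-in table with HTTP and HTTPS added) can be sketched as follows. This is a simplified illustration: it matches by file extension only, and BUILTIN_TABLE is a hypothetical stand-in because the real built-in rules are not listed in this section.

```python
import posixpath
from typing import Dict, List, Optional

# Hypothetical stand-in for the built-in table; only the .rm rule is taken
# from the example above.
BUILTIN_TABLE: Dict[str, List[str]] = {".rm": ["real"]}


def playserver_list(src: str,
                    item_attr: Optional[str] = None,
                    custom_table: Optional[Dict[str, List[str]]] = None) -> List[str]:
    """Resolve the playserver list for one content item (simplified sketch)."""
    if item_attr:
        # 1. A playServer attribute on the <item> tag wins.
        return [name.strip() for name in item_attr.split(",")]
    extension = posixpath.splitext(src)[1].lower()
    if custom_table and extension in custom_table:
        # 2. A rule in the manifest <playServerTable>; HTTP/HTTPS are not added.
        return list(custom_table[extension])
    # 3. Built-in table; HTTP and HTTPS are added automatically.
    return BUILTIN_TABLE.get(extension, []) + ["http", "https"]
```

With a custom table of {".mpg": ["tvout"]}, the item c.mpg resolves to tvout only, while e.rm falls through to the built-in rule and resolves to real, http, and https.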
The TV-out playserver is a special playserver because it allows all pre-positioned content to be played back. If you want to exclude content from being played back by other playservers, specify the TV-out playserver. ACNS software does not allow other playservers to play content once a playserver has been specified.
Acquiring Content through a Proxy Server
Acquisition through a proxy server can be configured when the root Content Engine cannot directly access the origin server because the origin server is set up to allow access only by a specified proxy server. When a proxy server is configured for root Content Engine content acquisition, the acquirer contacts the proxy server instead of the origin server, and all requests to that origin server go through the proxy server.
Note
Content acquisition through a proxy server is only supported for HTTP requests. It is not supported for HTTPS or FTP.
If a transparent WCCP proxy is used, you do not need to configure a proxy for content acquisition. In such cases, HTTP requests from the acquirer are redirected to the WCCP proxy by a router. However, when you do not have a router to redirect HTTP requests to the proxy server, you must configure the Content Engine to use the proxy server.
There are three ways to configure the proxy server: through the Content Distribution Manager GUI, through the Content Engine CLI, or through the manifest file. If you need to configure the Content Engine to use the proxy for both caching and pre-positioned content, use the CLI to configure the proxy. The CLI command is a global configuration command that configures the entire Content Engine to use the proxy. If only the acquirer portion of the Content Engine needs to use the proxy for acquiring pre-positioned content, use the manifest file to specify the outgoing proxy. When you configure the proxy server in the manifest file, you are configuring the acquirer to use the proxy to fetch content for a particular channel.
Note
Proxy configurations in the manifest file take precedence over proxy configurations in the CLI. Furthermore, a noProxy configuration in the manifest file takes precedence over the other proxy server configurations in the manifest file.
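The precedence rules in the preceding note can be summarized in a short sketch. The function and parameter names are hypothetical; a return value of None stands for fetching directly from the origin server.

```python
from typing import Optional


def effective_proxy(no_proxy: bool,
                    manifest_proxy: Optional[str],
                    cli_proxy: Optional[str]) -> Optional[str]:
    """Return the proxy the acquirer would use for a request, or None for direct access."""
    if no_proxy:
        # noProxy in the manifest file overrides every other proxy setting.
        return None
    if manifest_proxy:
        # A manifest file proxy takes precedence over the CLI proxy.
        return manifest_proxy
    # Otherwise fall back to the globally configured CLI proxy, if any.
    return cli_proxy
```

For example, when both a manifest proxy and a CLI proxy are configured, the manifest proxy is selected unless noProxy is set for that host, item, or crawl task.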
You can also configure a proxy for fetching the manifest file by using the Content Distribution Manager GUI (the Creating New Channel or Modifying Channel window). When you configure a proxy server in the Content Distribution Manager GUI, the proxy configuration is valid only for acquiring the manifest file itself and not for acquiring the channel content. Requests for the manifest file go through the proxy server, whereas requests for content go directly to the origin server.
Tip
Before configuring a proxy server, verify that the root Content Engine is able to ping the proxy server. If the proxy is not servicing the configured port, you will get the message: "failed: Connection refused."
Configuring Host and Proxy Server Settings Using the Content Distribution Manager GUI
To configure proxy server settings for content items defined using the Content Distribution Manager GUI, follow these steps:
Step 1
Choose Services > Web > Channels. The Channels window appears, listing all channels in your ACNS network.
Step 2
Click the Edit icon next to the channel for which you want to configure proxy server settings for the manifest file items. The Modifying Channel window appears.
Step 3
In the Contents pane, choose Channel Content. The Content Acquisition Method for Channel-Use GUI to Specify Content Acquisition window appears.
Step 4
Click the Manage Host and Proxy Settings icon in the taskbar to configure proxy server information. The Content Hosts window appears, listing all previously created host URLs, the number of content items for each host, and proxy server (if configured).
Step 5
Check the check box next to the host URL for which you want to configure proxy server settings. You can also select multiple host URLs and configure proxy servers.
Step 6
Click the Manage Proxy for Selected Hosts button. A new window appears. Under the Defining proxy server for the following hosts heading, a bulleted list of host servers is displayed for which proxy servers are being configured.
To return to the Content Acquisition Method for Channel-Use GUI to Specify Content Acquisition window, click the Return to Content Listing button.
Step 7
Use the fields provided under the Proxy Server specifications section on the right to configure the proxy information for the acquirer to fetch the manifest file. If the proxy server is configured, requests to fetch the manifest file from the origin server will go through the proxy server. (See Table 6-4 for a description of the fields for configuring proxy server properties.)
You can configure the proxy server for the acquirer by using the fields in this section. These fields are associated with the attributes specified in the <proxyServer> tag in the manifest file. The proxy server configured in the manifest file is valid for all the hosts named in the manifest file. If the specified proxy fails, the acquirer, by default, contacts the origin server directly and tries to fetch the object.
Step 8
Click Add to add the proxy server to the Select a proxy server section on the left. If you need to change any of the fields, click Cancel to erase the values and start over.
Step 9
To modify a proxy host, click Edit at the bottom of the Select a proxy server section. The configured values are displayed in the fields provided under the Proxy Server specifications section on the right and can be modified. Once you have finished, click Update to save the modified settings.
Step 10
To delete a proxy host, click Delete at the bottom of the Select a proxy server section.
Step 11
In the Select a Proxy Server section, a list of already defined proxy servers (listed by proxy host name) is displayed.
• To define a proxy server for a host URL, choose the proxy host name (the selected proxy host is highlighted) and click the Save Assignment button. The Content Hosts window appears.
• To specify that all the requests for a particular host should go directly to the origin server and should not use the proxy, choose Do not use proxy server and click the Save Assignment button. The Content Hosts window appears.
This option is associated with the noProxy attribute specified in the <host>, <item>, or <crawler> tags. The noProxy="1" designation specifies that the item and the crawl task data are to be fetched directly from the origin server. (See the "About the noProxy Attribute" section for more information.)
• If you have selected multiple host URLs (from the Content Hosts window) for which you want to define and assign a proxy server, a "Hosts selected use different proxy servers." warning message is displayed on top of the window only if the selected host URLs have been previously assigned to different proxy servers.
To assign a common proxy server for all selected host URLs, choose the proxy host name (the selected proxy host is highlighted) and click the Save Assignment button. Otherwise, choose Do not change assignments to retain the previously defined proxy host assignments for the host URLs and click the Save Assignment button.
• If you have changed the proxy server for host URLs and want to leave the assignment unchanged, click the Cancel Assignment button. The Content Hosts window appears.
Table 6-4 Proxy Server Settings
GUI Parameter | Function | Manifest Attribute
Proxy Hostname | Host name or IP address of the proxy server used by the acquirer for content acquisition. When you use a domain name instead of an IP address, make sure that the domain name can be resolved by the DNS server configured. | serverName
Proxy Port | Port number of the proxy server on which the acquirer fetches content. The range is 1 to 65535. | port
Disable Basic Authentication | When checked, NTLM headers cannot be stripped off to allow fallback to the basic authentication method. If you leave this check box unchecked, NTLM authentication headers can be stripped to allow fallback to the basic authentication method and the username and password information can be passed to the origin server in clear text with a basic authentication header. | disableBasicAuth
User Name | Name of the user to be authenticated to fetch the manifest file. | user
Password | Password of the user to pass authentication from the proxy. | password
Configuring an HTTP Proxy Server Using the CLI
To configure a proxy server using the CLI, use the http proxy outgoing global configuration command. This is a global proxy configuration for caching and pre-positioned content acquisition. You can use this command to configure a list of outgoing proxy servers through the CLI, and when configured, all the requests (for any host) will go through the proxy server. The global proxy configuration can also be reset using the no form of the command, as shown in this example:
ContentEngine(config)# http proxy outgoing ?
connection-timeout Timeout period used for probing outgoing proxy servers in microseconds
host Use Outgoing HTTP Proxy
monitor Interval at which to monitor the outgoing proxy servers
origin-server Use Origin Server if all outgoing proxies failed.
preserve-407 Preserve 407 HTTP authentication header
ContentEngine(config)# http proxy outgoing host ?
Hostname or A.B.C.D Hostname or IP address of outgoing proxy
ContentEngine(config)# http proxy outgoing host 128.107.192.24 ?
<1-65535> Port of outgoing proxy
ContentEngine(config)# http proxy outgoing host 128.107.192.24 80
If the acquirer is unable to contact a proxy server, the proxy is considered disabled. The acquirer retries a disabled proxy after the period specified in the monitor option of the http proxy outgoing command. If no monitor interval is configured, the retry interval defaults to every 30 minutes.
If all outgoing proxies fail and you want the acquirer to contact the origin server directly, use the origin-server keyword, as shown in this example:
ContentEngine(config)# http proxy outgoing origin-server
If a list of outgoing proxy servers is configured in the CLI and the first proxy server fails, the next proxy server in the list will be tried, and so on. If all the proxies fail, the acquirer will fail over to the origin server, depending on the configuration specified in the http proxy outgoing origin-server command.
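The failover order described above can be sketched as follows. This is an illustration only: the function name, the is_reachable callback, and the "origin" return value are assumptions of the sketch and do not reflect how the acquirer is implemented.

```python
from typing import Callable, Optional, Sequence


def pick_upstream(proxies: Sequence[str],
                  is_reachable: Callable[[str], bool],
                  use_origin_on_failure: bool) -> Optional[str]:
    """Return the first reachable proxy, "origin" as a fallback, or None."""
    for proxy in proxies:
        # Proxies are tried in the order in which they were configured.
        if is_reachable(proxy):
            return proxy
    # All proxies failed: go to the origin server only if the
    # origin-server option is configured.
    return "origin" if use_origin_on_failure else None
```

With proxies ["128.107.192.24", "128.107.192.190"] and only the second one reachable, the sketch returns "128.107.192.190". If neither is reachable, it returns "origin" only when the origin-server option is enabled.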
To verify that the outgoing proxy is configured in the CLI, use the show http proxy EXEC command, as shown in this example:
Not servicing incoming proxy mode connections.
Primary Proxy Server: 128.107.192.190 port 9090
Monitor Interval for Outgoing Proxy Servers is 60 seconds
Timeout period for probing Outgoing Proxy Servers is 300000 microseconds
Use of Origin Server upon Proxy Failures is disabled.
To confirm whether the content is being acquired through a proxy, use the acquisition and distribution transaction log, and check the proxyUsed line. This line shows the proxy server used for the request.
Configuring a Proxy Server Using the Manifest File
You can configure the proxy server for the acquirer by using the <proxyServer> tag in the manifest file. The proxy server configured in the manifest file is valid for all the hosts named in the manifest file. If the specified proxy fails, the acquirer, by default, contacts the origin server directly and tries to fetch the object.
If you have multiple proxy servers in the manifest file, and if you want to associate different proxy servers for different hosts, you must include the proxyServer attribute in the <host> tag. In this first example, the <proxyServer> tag is associated with the <host> tag through its proxyServer attribute (proxyServer="172.19.226.242"). All items that use this <host> tag will use this proxy server.
Note
The <proxyServer> tag must be located at the top level of the manifest file, directly under the <CdnManifest> tag; it cannot be used as a subtag of any other tags. It is associated with the <host>, <item>, and <crawler> tags through the proxyServer attribute.
serverName="172.19.226.242"
<server name="my-devbox">
<host name="http://vista2.cisco.com"
proxyServer="172.19.226.242"
<item src="HR/salary.html" />
<item src="HR/Holidays.html" />
In the following example, the proxyServer attribute is associated with the <item> tag and the <crawler> tag. The proxy server is applied to the second <item> tag and the <crawler> tag, but not the first <item> tag.
<item src="http://www.msnbc.com/computer/jokes.html" />
<proxyServer serverName="128.107.192.24" port = "80" />
<item src="http://www.cnn.com/War/World-war-two.jpg" />
<crawler start-url="http://www.abc.com/War/World-war-two/index.html" />
Instead of using an IP address for the serverName attribute in the <proxyServer> tag, you can also use a domain name, as shown in this example:
serverName="spachiap-uni1"
<server name="my-devbox">
<host name="http://128.107.150.26/nfs-obsidian/Unicorn"
proxyServer="spachiap-uni1"
When you use a domain name instead of an IP address, make sure that the domain name can be resolved by the DNS server configured.
About the noProxy Attribute
You can configure the noProxy attribute in the manifest file to specify that all the requests for a particular host should go directly to the origin server and should not use the proxy specified in the http proxy outgoing host command, if any proxies were configured in the CLI.
The noProxy attribute can be specified in the <host>, <item>, or <crawler> tags. In the following example, if the root Content Engine has a proxy configured through the http proxy outgoing host command, the first item in the manifest file uses that proxy, but the rest of the items or crawl tasks in this manifest file do not. The noProxy="1" designation specifies that the second item and the crawl task data are to be fetched directly from the origin server.
<item src="http://www.msnbc.com/computer/jokes.html" />
<item src="http://www.cnn.com/War/World-war-two.jpg" noProxy="1" />
<crawler start-url="http://www.abc.com/War/World-war-two/index.html" noProxy="1" />
<host name="http://www.cisco.com" noProxy="1" />
<crawler start-url="product/routers/" />
Configuring a Proxy Server and Port for the Manifest File
A proxy server and port for fetching the manifest file can be set in the Channel Content window of the Content Distribution Manager GUI. If a proxy server is configured, requests for the manifest file go through the proxy server that is specified in the Content Distribution Manager GUI.
Note
The proxy configuration from the Channel Content window only applies to fetching the manifest file. It does not apply to content acquisition.
If the manifest file resides on an origin server that requires a proxy, you need to specify the proxy name or IP address and the proxy port in the fields provided in the Define Manifest Proxy Information section of the Channel Content window in the Content Distribution Manager GUI. (See Figure 6-9. For a description of the fields in this window, see Table 6-3.)
If the manifest file resides on an origin server that cannot use the proxy configured from the http proxy outgoing host CLI command, you must check the Disable All Proxy check box.
The Disable All Proxy check box, when checked, overrides the values in the Proxy Hostname and Proxy Port fields.
Note
To activate the Manifest Proxy Information fields in the Channel Content window, you must first enter a URL in the Manifest URL field.
Acquiring Content Using Indexed Web Servers
A root Content Engine using HTTP that crawls an origin server to acquire content first makes a request for the starting URL that was specified when you configured the Channel Content window settings in the Content Distribution Manager GUI (Services > Web > Channels > Channel Content). The root Content Engine then parses the HTML of that location for links to other files. Those other files are then requested. If those files are HTML files, they are also parsed for links to additional files. In this manner, the Content Engine crawls through the origin server much like a client browses through a website.
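The crawl described above (request a page, parse its HTML for links, request those links, and repeat) can be sketched with the Python standard library. This is not the acquirer's implementation: the fetch callback, the depth limit handling, and the rule that only URLs under the starting URL are followed are assumptions made for illustration.

```python
from collections import deque
from html.parser import HTMLParser
from typing import Callable, List, Optional, Set
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collect href and src attribute values from HTML tags."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def crawl(start_url: str,
          fetch: Callable[[str], Optional[str]],
          max_depth: int = 5) -> Set[str]:
    """Return every URL requested, starting from start_url.

    fetch(url) returns the HTML text of a page, or None for non-HTML files.
    """
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)  # every discovered file is requested
        if html is None or depth >= max_depth:
            continue  # non-HTML files and pages at the depth limit are not parsed
        parser = LinkParser()
        parser.feed(html)
        for link in parser.links:
            target = urljoin(url, link)
            # Assumed scope rule: stay under the starting URL.
            if target.startswith(start_url) and target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return seen
```

In this sketch, fetch can be any function that returns page text, for example the get method of a dictionary that maps URLs to HTML strings when you want to try the logic without network access.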
A website that has indexing enabled and the default document feature disabled generates HTML that contains a directory listing whenever a directory URL is given. That HTML contains links to the files contained in that directory. This indexing feature makes it very easy for the root Content Engine to get a full listing of all the content in that directory so that it can be acquired.
Note
When you specify a starting URL for acquisition, if that URL does not end with "/" then the acquirer parses the ".." link that is returned with a directory index, and the acquirer crawls the parent directory of that URL. This behavior is not usually desired. When you enter an indexed directory as the starting URL for a crawl, always end that URL with "/" so that crawling is limited to that subdirectory.
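One way to picture why the trailing "/" matters is standard relative URL resolution, in which the directory of a URL without a trailing slash is its parent. The sketch below uses that analogy; whether the acquirer computes its crawl scope exactly this way is an assumption, and both function names are hypothetical.

```python
from urllib.parse import urljoin


def directory_of(start_url: str) -> str:
    """Return the directory that relative links on start_url resolve against."""
    return urljoin(start_url, ".")


def as_directory_url(start_url: str) -> str:
    """Append the trailing slash recommended for indexed directory crawls."""
    return start_url if start_url.endswith("/") else start_url + "/"
```

For the starting URL http://10.86.46.42:81/intro, directory_of returns http://10.86.46.42:81/ (the parent), whereas for http://10.86.46.42:81/intro/ it returns the intro directory itself.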
Certain limitations to the acquirer's crawling functionality might prevent the complete set of desired content from being acquired by the Content Engine. For example, the acquirer does not crawl within script elements. So, if a client browser runs a script that provides links to additional content, those links and that additional content cannot be seen or acquired by the Content Engine.
One way to ensure that all the desired content is acquired is to configure the Channel Content window settings or manifest file to acquire individual items rather than perform a crawl. However, that method is not practical for large web sites or for web sites that are constantly changing. A better workaround would be to create a multi-homed web server, or virtual website, to allow indexed access to the same directory structure. This virtual website can be secured so that only the root Content Engine is allowed access.
The steps for creating a virtual website and performing acquisition will vary based on the web server software you use, how your website is configured, where the content exists, and many other factors. For this discussion we use the Microsoft IIS server as an example, but other web servers can be configured similarly.
To create a virtual website and configure an IIS server to allow indexed access, follow these steps:
Step 1
Log in to the IIS server and create a virtual website.
Step 2
In the Properties dialog box for the virtual website that you created, assign an IP address or TCP port to the virtual website that is different from your original website IP address or TCP port setting.
Step 3
Choose the Document tab and uncheck the Enable Default Document check box.
Clearing this check box prevents the default document from interfering with directory indexing.
Step 4
Set the Directory Security settings with the same security settings as your original website, or restrict access so that only the root Content Engine and any backup root Content Engines have access to the site.
Step 5
Choose the Home Directory tab, and in the Local Path field, enter the same path as your original website directory.
Step 6
Check the Directory browsing check box.
Step 7
Create the manifest file, making sure that you enter the hostname of the virtual website and include the correct port.
The first line of your manifest file must contain the XML version tag <?xml version="1.0"?> so that the manifest can be validated without error, as shown in the following example:
<host name="http://10.86.46.42:81" />
<crawler start-url="/intro/" depth="5" />
The root Content Engine is now able to acquire all the content within the subdirectories.
Authentication Support
The acquirer supports two types of authentication schemes: basic and NTLM authentication. Authentication information is configured either in the Content Distribution Manager GUI or in the manifest file, depending on what is being acquired.
If authentication is required to fetch the manifest file from the origin server or from a proxy server, you can specify the authentication information in the Content Distribution Manager GUI. In the Channel Content window (Services > Web > Channels > Channel Content), you can disable basic authentication, enter a username and password, and specify an NTLM user domain name. (Figure 6-9 shows the fields for entering authentication information, and Table 6-3 describes these fields.)
Note
The authentication configuration from the Channel Content window only applies to fetching the manifest file. It does not apply to content acquisition.
If authentication is required to fetch pre-positioned content from the origin server or from a proxy server, you must specify the authentication information in the manifest file. (See the next section, "Acquiring Content that Requires NTLM Authentication.")
Acquiring Content that Requires NTLM Authentication
In ACNS 5.x software, a root Content Engine can acquire content from HTTP origin servers or proxy servers after performing basic authentication. However, origin servers often support NTLM authentication only. In ACNS 5.5 software, the acquirer can act as an NTLM client to talk to NTLM-capable origin servers.
Note
The Content Engine acquirer only supports NTLM Version 1.
If authentication is required to fetch content, you must specify the authentication information in the manifest file by using the following manifest file attributes:
• user: User account
• password: Password
If NTLM authentication is required, you can specify the following two additional manifest file attributes:
• ntlmUserDomain: NTLM user domain name. This attribute is required for an NTLM authentication scheme.
• disableBasicAuth: (Optional attribute) Disables basic authentication, if needed. If you always want your username and password to be used for NTLM authentication and not basic authentication, set this attribute to true. If this attribute is omitted, the default is false.
These attributes can be specified in the <host>, <item>, <item-group>, and <crawler> tags. See Appendix A, "Creating Manifest Files," for more information.
If an origin server or proxy server supports both basic authentication and NTLM authentication, the acquirer chooses the first supported authentication scheme from the response challenge list. The authentication scheme used by the acquirer can also depend on the specified authentication information. For example, an IIS server responds with NTLM as the first scheme and basic authentication as the second. In this case the acquirer chooses NTLM. However, if the ntlmUserDomain attribute is not specified in the manifest file, the acquirer chooses basic authentication. If the disableBasicAuth attribute is set to true, the acquirer chooses NTLM authentication.
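The selection rules above can be sketched as a small function. This is an interpretation of the text rather than the acquirer's actual logic: the function name is hypothetical, and treating NTLM as unusable when ntlmUserDomain is missing is how the sketch models "supported".

```python
from typing import Optional, Sequence


def choose_auth_scheme(challenges: Sequence[str],
                       ntlm_user_domain: Optional[str] = None,
                       disable_basic_auth: bool = False) -> Optional[str]:
    """Pick the first usable scheme from the server's challenge list."""
    for scheme in challenges:
        name = scheme.lower()
        if name == "ntlm" and ntlm_user_domain:
            # NTLM is usable only when ntlmUserDomain is specified.
            return "ntlm"
        if name == "basic" and not disable_basic_auth:
            # Basic is usable unless disableBasicAuth is set to true.
            return "basic"
    return None  # no usable scheme was offered
```

For an IIS-style challenge list of NTLM followed by Basic, the sketch selects NTLM when ntlmUserDomain is specified and falls back to basic authentication when it is not.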
You can view the acquisition and distribution transaction log from the root Content Engine CLI to determine which authentication scheme the acquirer has used.
Proxy Authentication Support
If the proxy for the root Content Engine was configured in the root Content Engine CLI (http proxy outgoing host command), you can set the authentication information for the proxy from either the Content Distribution Manager GUI (see the "Configuring Authentication for an HTTP Proxy Using the Content Distribution Manager GUI" section) or from the root Content Engine CLI (see the next section, "Configuring Authentication for a Proxy Using the CLI").
Configuring Authentication for a Proxy to Fetch the Manifest File
To configure the authentication information for fetching the manifest file from a proxy server, follow these steps:
Step 1
From the Content Distribution Manager GUI, choose Services > Web > Channels.
Step 2
Click the Edit icon next to the name of the channel that you want to modify.
Step 3
In the Contents pane, choose Channel Content.
Step 4
Enter the authentication information in the appropriate fields under the Define Manifest Proxy Information heading. (Figure 6-9 shows the fields for entering authentication information, and Table 6-3 describes these fields.)
Step 5
To save the settings, click Submit.
Configuring Authentication for a Proxy Using the Manifest File
The following example specifies authentication information for the proxy server. See Appendix A, "Creating Manifest Files," for more information.
<!-- specify a proxy server/port and its authentication info
this proxy supports Basic auth so only user/password is needed -->
<proxyServer serverName="128.107.192.24" port = "80"
user="johnz" password="xxx123yyy"/>
<!-- this item is below the proxyServer tag; it is using the above proxy -->
<item src="http://www.cnn.com/War/World-war-two.jpg" />
<!-- specify a proxy server/port and its authentication info
this proxy requires NTLM auth, so ntlmUserDomain is needed
for extra password security, users want to disable basic auth
so their user/password is not sent over the wire -->
<proxyServer serverName="company-proxy" port = "80"
user="johnz" password="xxx123yyy"
ntlmUserDomain="cisco-eng"
disableBasicAuth="true" />
<!-- this crawler is below the proxyServer tag; it is using the proxy -->
<crawler start-url="http://www.abc.com/War/World-war-two/index.html"/>
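To check which proxy each item or crawl job in a manifest would use, the "most recent proxyServer tag above it" rule can be sketched with Python's standard XML parser. This is a rough sketch under stated assumptions: the CdnManifest root element and the placement of proxyServer, item, and crawler as its direct children are assumed here (see Appendix A for the actual schema), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed minimal manifest; the root element name is an assumption.
MANIFEST = """
<CdnManifest>
  <proxyServer serverName="128.107.192.24" port="80"
               user="johnz" password="xxx123yyy"/>
  <item src="http://www.cnn.com/War/World-war-two.jpg"/>
  <proxyServer serverName="company-proxy" port="80"
               user="johnz" password="xxx123yyy"
               ntlmUserDomain="cisco-eng" disableBasicAuth="true"/>
  <crawler start-url="http://www.abc.com/War/World-war-two/index.html"/>
</CdnManifest>
"""


def proxies_by_source(manifest_xml):
    """Map each item or crawler URL to the proxyServer tag that precedes it."""
    root = ET.fromstring(manifest_xml)
    current_proxy = None
    result = {}
    for element in root:  # children are visited in document order
        if element.tag == "proxyServer":
            current_proxy = element.get("serverName")
        elif element.tag in ("item", "crawler"):
            source = element.get("src") or element.get("start-url")
            result[source] = current_proxy
    return result


for url, proxy in proxies_by_source(MANIFEST).items():
    print(proxy, "->", url)
```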
Configuring Authentication for an HTTP Proxy Using the Content Distribution Manager GUI
If a root Content Engine is configured to receive content through a proxy server, the acquirer running on the root Content Engine must be authenticated by the proxy server before it can obtain content from the origin server.
To configure authentication information for a proxy that was specified using the http proxy outgoing host command, you can use the Acquirer Outgoing Proxy Authentication section in the Content Distribution Manager GUI (Devices > Devices > Applications > Web > HTTP > HTTP Connections) to set the authentication information for the proxy. ACNS software supports multiple proxies; therefore, you can set the proxy authentication information for multiple proxies as well.
Note
If you specify the <proxyServer> tag in the manifest file, or if you enter the proxy host IP address in the Manifest URL field in the Channel Content window of the Content Distribution Manager GUI (Services > Web > Channels), the http proxy outgoing host command is ignored.
To configure the acquirer outgoing proxy authentication information, follow these steps:
Step 1
In the Content Distribution Manager GUI, choose Devices > Devices.
Step 2
Click the Edit icon next to the name of the root Content Engine.
Step 3
In the Contents pane, choose Applications > Web > HTTP > HTTP Connections. The HTTP Connection Settings for Content Engine window appears.
Step 4
Under the Acquirer Outgoing Proxy Authentication heading, enter the following information for the outgoing proxy that is listed in each row:
a.
To acquire content from the origin server, enter the name of the user to be authenticated in the Username field. This username will be used for both NTLM and basic authentication.
b.
Enter the password of the user in the Password field. Reenter the same password in the Confirm Password field for confirmation. The password details appear as asterisks.
c.
In the NTLM User Domain field, enter the NTLM server domain name to be used to authenticate user access.
d.
To prevent NTLM headers from being removed, and therefore prevent fallback to the basic authentication method, check the Disable basic authentication check box.
If you leave this check box unchecked, NTLM authentication headers can be stripped to allow fallback to the basic authentication method against Microsoft Internet Information Services (IIS) servers; the username and password information can be passed to the origin server in clear text with a basic authentication header.
Step 5
To save the authentication settings for the outgoing proxy, click Submit.
Configuring Authentication for a Proxy Using the CLI
To configure the authentication information for a proxy that was specified using the http proxy outgoing host command, you can also use the acquirer proxy authentication outgoing command from the root Content Engine CLI. The following example shows the authentication configuration for a nontransparent proxy server (IP address 192.168.1.1, port 8080) with NTLM authentication:
CE(config)# acquirer proxy authentication outgoing 192.168.1.1 8080 myname password password ntlm mydomain basic-auth-disable
Note
If there is a transparent proxy between the root Content Engine and the origin server, you can use the acquirer proxy authentication transparent command from the root Content Engine CLI to specify the authentication information for the transparent proxy. (See the "Configuring Authentication for a WCCP Proxy Using the CLI" section.)
To verify that the authentication information has been set up correctly, use the show acquirer proxy authentication EXEC command.
The acquirer supports proxy chaining as long as there is only one proxy in the chain that requires authentication. The acquirer will fail if more than one proxy in the chain requires authentication.
Configuring Authentication for a WCCP Proxy Using the Content Distribution Manager GUI
In a transparent caching environment using a WCCP proxy, requests for content are redirected to a Content Engine by a WCCP-enabled router. When an HTTP proxy-style request from the root Content Engine to the origin server is intercepted by a WCCP proxy that requires authentication, you can configure authentication settings for the WCCP proxy.
To configure acquirer WCCP proxy authentication settings in a transparent proxy environment, follow these steps:
Step 1
Choose Devices > Devices.
Step 2
Click the Edit icon next to the Content Engine for which you want to specify acquirer WCCP proxy authentication settings. The Device Home window appears.
Step 3
In the Contents pane, choose Prepositioning > Acquirer WCCP Proxy Authentication. The Acquirer WCCP Proxy Authentication for Content Engine window appears. (See Figure 6-10.)
Figure 6-10 Acquirer WCCP Proxy Authentication
Step 4
Check the Enable check box to enable transparent proxy authentication for the acquirer.
Step 5
In the Username field, enter the name of the user to be authenticated to acquire content from the origin server. This username is used for both NTLM and basic authentication.
Step 6
Enter the password of the user in the Password field. Reenter the same password in the Confirm Password field for confirmation. The password details appear as asterisks.
Step 7
In the NTLM user domain field, enter the NTLM server domain name to be used to authenticate user access.
Step 8
Check the Disable basic authentication check box to disallow the removal of NTLM headers. When checked, NTLM headers will not be stripped off, and therefore fallback to the basic authentication method is prevented.
If you leave this check box unchecked, NTLM authentication headers can be stripped to allow fallback to the basic authentication method against Microsoft Internet Information Services (IIS) servers; the username and password information can be passed to the origin server in clear text with a basic authentication header.
Step 9
To save your settings, click Submit.
A "Click Submit to Save" message appears in red next to the Current Settings line when there are pending changes to be saved after you have applied default or device group settings. You can also revert to the previously configured window settings by clicking Reset. The Reset button is visible only when you apply default or device group settings to change the current device settings but the settings have not yet been submitted.
Modifying WCCP Proxy Authentication Settings
You can modify the following WCCP proxy authentication settings in any order:
•
To delete the configured settings for the Content Engine, click the Remove Device Settings icon in the taskbar. This icon appears only if you have configured the settings for the Content Engine.
•
To restore the factory default settings to the Content Engine, click the Apply Defaults icon in the taskbar.
•
To override the device group settings applied to the Content Engine with the factory default settings, click the Override Group Settings with Defaults icon in the taskbar. This icon appears only if you have applied the device group settings to the Content Engine.
•
When settings have been applied from device groups with which the Content Engine is associated, click the Override Group Settings icon in the taskbar to override the device group settings and configure the device settings. This icon appears only if you have applied the device group settings to the Content Engine.
•
When a Content Engine is associated with one or many device groups that have been configured with acquirer WCCP proxy authentication settings, a drop-down list appears in the taskbar. You can choose a device group name from this list if you want to apply settings from a different device group to this Content Engine.
Configuring Authentication for a WCCP Proxy Using the CLI
For a transparent WCCP proxy, if authentication is required, you can specify the authentication information by using the acquirer proxy authentication transparent global configuration command in the root Content Engine CLI. The following example shows the authentication configuration for a transparent WCCP proxy server with basic authentication. The username is admin, and the password is default.
CE(config)# acquirer proxy authentication transparent
CE(config)# acquirer proxy authentication transparent admin
CE(config)# acquirer proxy authentication transparent admin password
CE(config)# acquirer proxy authentication transparent admin password default
Verifying the Results
Use the following show CLI commands to verify acquisition results.
Table 6-5 show Commands for Content Acquisition
Command | Description
show acquirer channel | Shows how many channels use the Content Engine as the root Content Engine.
show acquirer progress | Shows the progress of the acquisition.
show statistics acquirer | Shows the result of content acquisition: how many items have been acquired and how much disk space has been used.
show statistics acquirer job-list | Displays the details of all the single items and crawler jobs for the channel specified by ID or name.
show statistics acquirer error | Shows the detailed error message for an acquisition failure. Note: For a crawl job, only the first 100 errors encountered are displayed.
show statistics replication | Shows the replication status.
Troubleshooting Content Acquisition
To monitor acquisition progress and to troubleshoot, use the following commands from the root Content Engine CLI:
•
Use the show acquirer channel EXEC command to obtain channel information, such as the channel-id and channel-name, that you need to enter in other show acquirer commands, such as the show acquirer progress command. In the following example, the channel-id is 793 and the channel-name is group01-cifs.
CE# show acquirer channels
Acquirer information for all channels:
--------------------------------------
Channel-Name : group01-cifs
WebSite-Name : group01-cifs
Root-CE-Type : Configured
Origin FQDN : cdn.allcisco.com
Manifestfile-URL : ftp://10.1.1.1/cifs.xml
•
Use the show acquirer EXEC command to make sure that the acquirer process on the root Content Engine is working correctly, and that the device is using the expected amount of bandwidth for acquisition. The following example shows that the acquirer is running properly and that the device is configured with unlimited bandwidth for acquisition of content.
ContentEngine# show acquirer
Current Acquisition Bandwidth:Not Limited
•
Use the show acquirer progress EXEC command to check how far the acquisition of content has progressed. A specific channel ID or channel name can be specified to obtain the progress for a specific channel. In the example below, the acquirer has already acquired 2237 items.
ContentEngine# show acquirer progress channel-id 793
Acquirer progress information for channel ID:793 Channel-Name:group01-cifs
-----------------------------------------------------------------
Acquired Single Items : 0 / 0
Acquired Crawl Items : 2237 / 2500 -- start-url=www.mtv.com//
•
Use the show statistics acquirer channel-id or show statistics acquirer channel-name EXEC command to obtain the detailed acquisition statistics for a given channel. In the example below, there was an error acquiring two items.
ContentEngine# show statistics acquirer channel-id 793
Statistics for Channel Channel-id :793 Channel-Name :group01-cifs
---------------------------------------------------------
Total Number of Acquired Objects :2237
Total Disk Used for Acquired Objects :981511280 Bytes
Total Number of Failed Objects :2
Total Number of Re-Check Failed Objects :0
•
Use the show statistics acquirer errors channel-id or show statistics acquirer errors channel-name EXEC command to see the reasons why the errors occurred. In the example below, one error occurred because there was a problem acquiring the URL. The other error occurred because the disk quota for the channel configured in the Content Distribution Manager GUI would have been exceeded if the specified URL had been acquired. You can increase the channel disk quota to correct this error.
ContentEngine# show statistics acquirer errors channel-id 793
Acquisition Errors for the Channel ID:793
-------------------------------------
Crawl job:start-url http://www.mtv.com//
Internal Server Error(500):http://cgi.cnn.com/entries/intl-emailsubs-confirm
Exceeded Disk Quota(703):http://www.cdt.org/copyright/backgroundchart.pdf
•
If more detailed troubleshooting of content acquisition is required, you can increase the debug level of the acquirer using the debug acquirer trace EXEC command. The logs are written to local1/errorlog/acquirer-errorlog.current.
•
To verify that an expected object has been pre-positioned on the Content Engine, use the show distribution object-status EXEC command, as shown in the following example:
CE# show distribution object-status
http://172.18.81.168/Videos/SM-final%20Innebandy%202003.wmv
========== Website Information ==========
Origin Server FQDN: 172.18.81.168
Content UNS Reference #: 1
========== Channels Information ==========
*** Channel 1903 (name = A_Multicast) ***
File State: Ready for distribution
Multicast for Channel: Not Enabled
Replication Lock: Received by Unicast-Receiver/Acquirer
MD5 of MD5: tjS#DxqE5oUc024Z8XtFDw..
Source Url: http://172.18.81.168/Videos/SM-final%20Innebandy%202003.wmv
Source Last Modified Time: Wed Jan 7 19:03:48 2004
Requires Authentation: No
Play servers: HTTP HTTPS WMT
Content uns_id: NgcJTCU#JaY4ZGIPbsrONw..
Content gen-id: 1768:1136512329:2
========== CDNFS Information ==========
Internal File Name:
/disk00-04/d/http-172.18.81.168-k5bsm1o+y14jgiqsvwaohq/19/19f6d5cec7266c33f419709dc28c8d9b.0.data.wmv
Actual File Size: 2756437 bytes
MD5 of MD5 (Re-calculated): tjS#DxqE5oUc024Z8XtFDw..
Metadata match with: Channel 1903
Source-url to CDN-object mapping:
Source-url: http://172.18.81.168/Videos/SM-final%20Innebandy%202003.wmv
Used by CDN object: ---- Yes ----
Internal File Name:
/disk00-04/d/http-172.18.81.168-k5bsm1o+y14jgiqsvwaohq/19/19f6d5cec7266c33f419709dc28c8d9b.0.data.wmv
Actual File Size: 2756437 bytes
========== CDNFS lookup output ==========
Allowed Playback via HTTP WMT HTTPS
Last-modified Time Wed Jan 7 19:03:48 2004
cache-control max-age=864000
cdn_uns_id NgcJTCU#JaY4ZGIPbsrONw..
content-type video/x-ms-wmv
etag "042e6fa50d5c31:b39"
last-modified Wed, 07 Jan 2004 19:03:48 GMT
Internal path to data file:
/disk00-04/d/http-172.18.81.168-k5bsm1o+y14jgiqsvwaohq/19/19f6d5cec7266c33f419709dc28c8d9b.0.data.wmv
By comparing fields, such as Total Size, Transfered Size, and Source URL in the Object Replication output and Actual File Size and Source URL in the Source-URL-to-CDN-Object Mapping output, you can determine whether or not the object that is stored is the same as the object that was requested.
•
To view the file directory structure on the Content Engine and verify the physical file on the disk, you can use the cdnfs browse EXEC command, as shown in the following example:
------ CDNFS interactive browsing ------
dir, ls: list directory contents
cd,chdir: change current working directory
info: display attributes of a file
more: page through a file
exit,quit: quit CDNFS browse shell
http-172.18.81.151-glfc4h-b9gywnf5rlnfweg/
http-172.18.81.163-og5o21u178nrhw1mctgtiq/
file--xrnfwxifgu62jtiwtyixvg/
http-172.18.81.168-k5bsm1o+y14jgiqsvwaohq/
376 Bytes manifest-Channel_1903.xml-lEYmrfnjt2o5GbUNwLCApA
/172.18.81.168/>cd Videos
/172.18.81.168/Videos/>ls
2756437 Bytes SM-final Innebandy 2003.wmv <===============Physical file on disk
/172.18.81.168/Videos/>quit
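When the same checks are run repeatedly, the output of the show statistics acquirer command shown earlier in this section can be captured and parsed by a script. The following Python sketch assumes the output has already been collected as text (how it is collected is outside the scope of this example) and that the "Total ..." lines keep the layout shown in the sample; the function name is hypothetical.

```python
import re

# Sample text copied from the show statistics acquirer example in this section.
SAMPLE = """\
Statistics for Channel Channel-id :793 Channel-Name :group01-cifs
---------------------------------------------------------
Total Number of Acquired Objects :2237
Total Disk Used for Acquired Objects :981511280 Bytes
Total Number of Failed Objects :2
Total Number of Re-Check Failed Objects :0
"""


def parse_acquirer_statistics(text):
    """Return a dict mapping each 'Total ...' label to its integer value."""
    stats = {}
    for line in text.splitlines():
        match = re.match(r"^(Total .+?)\s*:\s*(\d+)", line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


stats = parse_acquirer_statistics(SAMPLE)
if stats["Total Number of Failed Objects"] > 0:
    print("Failed objects found; check the acquirer error statistics.")
```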
Retry and Refresh Mechanisms
ACNS 5.x software supports several retry and refresh mechanisms in the manifest file. The following sections describe how each mechanism behaves.
Using the <schedule> and <repeat> Subelements
ACNS 5.5 software supports two subelements in the manifest file: <schedule> and <repeat>. These tags allow you to specify a time to begin a recrawl or refetch and are more powerful than the ttl schema, which only allows you to specify the interval between refetch or recrawl occurrences. The <schedule> element can be a subtag of <options>, <item-group>, <item> or <crawler>. If both the <schedule> tag and the ttl attribute are specified in the manifest file, the <schedule> designation takes precedence over ttl. (See the "item" section on page A-44 for a detailed description of these subelements.)
Using the failRetryInterval Attribute
When the acquirer tries to acquire a content item specified in an <item> or <crawler> tag and the acquisition fails (for example, because of an intermittent error), the acquisition task (corresponding to the <item> or <crawler> tag) is retried after the interval specified in the failRetryInterval attribute.
If the failRetryInterval attribute is not specified, the default interval for retrying the task is 5 minutes. (For a crawl job, the task is considered to be a failure only when crawling fails to occur at all. If only some of the crawled pages fail to be acquired, the task is still considered a success.) The following rules apply to the retry mechanism:
•
Single item tasks specified in the <item> tag are retried for all errors except the EXCEED_DISK_QUOTA error.
•
Crawl tasks are not considered a failure if the error status is between 300 and 500. The EXCEED_DISK_QUOTA error does not cause a retry either.
•
When you change the disk quota using the Content Distribution Manager GUI, the acquirer is notified automatically and retries all status error nodes containing the EXCEED_DISK_QUOTA error.
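The retry rules above can be summarized as a small decision function. The following Python sketch is illustrative only: the function name, the task_kind values, and every error name other than EXCEED_DISK_QUOTA are hypothetical, and treating the 300 to 500 status range as inclusive is an assumption, because the text does not say whether the endpoints are included.

```python
from typing import Optional

DEFAULT_FAIL_RETRY_MINUTES = 5  # default interval stated in this section


def retry_delay_minutes(task_kind: str,
                        error: str,
                        http_status: Optional[int] = None,
                        fail_retry_interval: Optional[int] = None) -> Optional[int]:
    """Return the minutes to wait before a retry, or None if no retry is scheduled.

    task_kind is "item" or "crawler" (hypothetical labels for the two tags).
    """
    # A disk quota error never triggers a timed retry for either task type;
    # it is retried only after the quota is changed in the GUI.
    if error == "EXCEED_DISK_QUOTA":
        return None
    # For crawl tasks, statuses from 300 to 500 are not treated as failures
    # (inclusive bounds are an assumption).
    if task_kind == "crawler" and http_status is not None and 300 <= http_status <= 500:
        return None
    if fail_retry_interval is not None:
        return fail_retry_interval
    return DEFAULT_FAIL_RETRY_MINUTES


print(retry_delay_minutes("item", "HTTP_ERROR", http_status=500))
print(retry_delay_minutes("crawler", "HTTP_ERROR", http_status=404))
```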
Using the ttl Attribute or the Fetch Manifest Now Button
When an item is acquired successfully, and if a positive value is specified in the ttl attribute, the acquirer rechecks the content for freshness at the interval specified by the ttl attribute. A ttl value of 0 (zero) means that the acquirer will not recheck the item unless the manifest file is updated. A negative ttl value means that the acquirer will never recheck the item. The following rules apply to the refresh mechanism:
•
When ttl > 0: Recheck every ttl minutes. The acquirer also rechecks the content when the manifest file is reparsed and the content specification in the manifest file has changed, or when you click the Fetch Manifest Now button in the Content Distribution Manager GUI. (See the "Updating Channel Content" section.)
•
When ttl = 0: Only acquire once. The acquirer only rechecks when the manifest file is reparsed and the content specification in the manifest file has changed, or when you click the Fetch Manifest Now button in the Content Distribution Manager GUI.
•
When ttl < 0: Only acquire once. The acquirer does not recheck the content even if the manifest file is reparsed or you click the Fetch Manifest Now button in the Content Distribution Manager GUI.
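The three ttl cases can be expressed as a single predicate. The following Python sketch restates the rules listed above; the function and parameter names are hypothetical, and manifest_changed stands for "the manifest file was reparsed and the content specification for this item changed."

```python
def should_recheck(ttl: int,
                   minutes_since_last_check: int,
                   manifest_changed: bool = False,
                   fetch_manifest_now: bool = False) -> bool:
    """Decide whether an already-acquired item is rechecked for freshness."""
    if ttl < 0:
        # Negative ttl: acquired once and never rechecked.
        return False
    if manifest_changed or fetch_manifest_now:
        # For ttl >= 0, a changed manifest entry or the Fetch Manifest Now
        # button always triggers a recheck.
        return True
    # Otherwise only a positive ttl causes periodic rechecks.
    return ttl > 0 and minutes_since_last_check >= ttl


print(should_recheck(60, 61))                          # periodic recheck is due
print(should_recheck(0, 10000))                        # ttl = 0, nothing changed
print(should_recheck(-1, 0, fetch_manifest_now=True))  # ttl < 0, never rechecked
```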
Reparsing the Local Manifest File from the Content Distribution Manager GUI
When you add content to a channel using the Content Distribution Manager GUI to define the content, a manifest file is automatically generated and stored locally in the Content Distribution Manager. When you make any changes such as adding, removing, or modifying content definitions in the GUI and click the Submit or Update button, the local manifest file is automatically reparsed, changes are detected, and the corresponding items are acquired or removed. Note that when you click the Submit or Update button, it does not trigger a recheck of all the content in the channel.
Updating Channel Content
At any point after you have replicated content to the Content Engines that are associated with your channel, you can update that content using the fetch manifest feature. For example, if you modify your manifest file to point to new content or remove references to content that you want to make obsolete, you must fetch the manifest file to begin replication of any new channel content, and to sever connections to content that you want to make obsolete.
Note
Content that is removed from the manifest file is made unavailable as soon as that updated manifest file is fetched. Obsolete content is not immediately deleted from the channel cache but is eventually removed to make room for new channel content.
To fetch a new or updated manifest file, follow these steps:
Step 1
From the Content Distribution Manager GUI, choose Services > Web > Channels.
Step 2
Click the Edit icon next to the name of the channel to open your channel for editing.
Step 3
In the Contents pane, choose Channel Content.
Step 4
Verify that the Manifest URL field points to the correct manifest file for the channel.
Step 5
Click the Fetch Manifest Now button. You are prompted to confirm your decision.
When you click this button, the software checks whether the manifest file has been updated and, if so, downloads and reparses the updated manifest file. In addition, regardless of whether the manifest file has been updated, all content in the channel is rechecked and any updated content is downloaded.
Step 6
Click OK to execute your request.
To force the replication of channel content and refresh the information, follow these steps:
Step 1
From the Content Distribution Manager GUI, choose Services > Web > Channels.
Step 2
Click the Edit icon next to the name of the channel that you want to modify to open your channel for editing.
Step 3
In the Contents pane, click Replication Status. The Replication Status for Channel window appears.
Step 4
Click the Force Replication information refresh icon in the taskbar. You are prompted to confirm your decision.
Step 5
Click OK. You are notified that your request has been sent and prompted to check back in a few minutes.
Step 6
Click OK. After a time, the Replication Status for Channel window refreshes.
For a description of the channel replication status data, see Chapter 11, "Viewing Content Replication Status."
Bandwidth Control
The bandwidth control feature allows you to specify how much bandwidth in the network is consumed by data replication from the root Content Engine to the edge Content Engines. Bandwidth controls allow you to specify the amount of bandwidth to be used at various times during the day and also to set up a weekly schedule that repeats week after week. This section describes the bandwidth controls as they relate to content acquisition and distribution. Bandwidth controls are available for the acquisition process, replication process, and multicast sender.
Configuring Acquisition and Distribution Default Bandwidth Settings
Default bandwidth settings can be configured for acquisition and distribution of content. Default bandwidth is the amount of bandwidth allocated for content acquisition and distribution when there is no scheduled bandwidth.
If a Content Engine is assigned to a device group and no default bandwidth has been set for the device, the device group default bandwidth settings are applied. If the Content Engine is part of multiple device groups, the most recently updated default bandwidth settings are applied.
However, if a default bandwidth is specified for the device itself, the device setting overrides the device group settings, even when the Content Engine is a member of a device group.
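The precedence between device and device group defaults can be sketched as follows. This Python example only models the rules stated above (a device setting wins; otherwise the most recently updated device group setting applies); the class, field, and function names are hypothetical and do not correspond to any ACNS data structure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GroupDefault:
    name: str          # device group name
    kbps: int          # default bandwidth configured on the group
    updated_at: float  # time of the last update (any increasing timestamp)


def effective_default_kbps(device_kbps: Optional[int],
                           groups: List[GroupDefault]) -> Optional[int]:
    """Resolve the default bandwidth that applies to one Content Engine."""
    if device_kbps is not None:
        # A value set on the device itself overrides all group settings.
        return device_kbps
    if groups:
        # Otherwise the most recently updated group setting applies.
        return max(groups, key=lambda g: g.updated_at).kbps
    return None  # nothing configured at either level


groups = [GroupDefault("east", 512, 100.0), GroupDefault("west", 256, 200.0)]
print(effective_default_kbps(None, groups))
print(effective_default_kbps(1024, groups))
```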
To configure the default bandwidth settings for acquisition and distribution for the Content Engine, follow these steps:
Step 1
In the Content Distribution Manager GUI, choose Devices > Devices.
Step 2
Click the Edit icon next to the desired Content Engine. The Device Home window for the Content Engine appears.
Step 3
In the Contents pane, choose Prepositioning > Default Bandwidth. The Acquisition and Distribution Default Bandwidth for Content Engine window appears. (See Figure 6-11.)
Figure 6-11 Acquisition and Distribution Default Bandwidth Window
Step 4
In the Acquisition-in Bandwidth field, enter the bandwidth value in kbps for incoming content acquisition traffic from origin servers. The default is 1024 kbps. Enter zero (0) to disable acquisition and distribution activity during unscheduled times of the day.
Note
If you leave this field blank, acquisition and distribution traffic may use the maximum bandwidth allowance for the unscheduled periods.
Step 5
In the Distribution-in Bandwidth field, enter the bandwidth value in kbps for incoming unicast content distribution traffic from Content Engines. The default is 56 kbps. Enter zero (0) to disable acquisition and distribution activity during unscheduled times of the day.
Note
If you leave this field blank, acquisition and distribution traffic may use the maximum bandwidth allowance for the unscheduled periods.
Step 6
In the Minimum Nonstreaming Acquisition-in Bandwidth field, enter the bandwidth value in kbps for incoming nonstreaming content acquisition traffic from the origin server. The default is 50 kbps. Enter zero (0) to disable acquisition and distribution activity during unscheduled times of the day.
Minimum nonstreaming bandwidth is the bandwidth allocated for acquiring content using nonstreaming protocols such as HTTP, HTTPS, and FTP. This setting is useful when streaming protocols, such as RTSP, take up all the available bandwidth in certain cases, yet bandwidth needs to be reserved for acquiring manifest files and other content items from origin servers using nonstreaming protocols. We recommend that you retain the minimum nonstreaming bandwidth default setting, and adjust it only when it might be necessary to increase the value, such as when certain large files fail to be acquired because of many stream acquisitions taking place simultaneously.
Note
If you leave this field blank, acquisition and distribution traffic may use the maximum bandwidth allowance for the unscheduled periods.
Step 7
In the Distribution-out Bandwidth field, enter a bandwidth value in kbps for outgoing unicast content distribution traffic to various Content Engines in the ACNS network. The default is 128 kbps. Enter zero (0) to disable acquisition and distribution activity during unscheduled times of the day.
Note
The Multicast-out Bandwidth field is read-only. This field displays the multicast-out bandwidth setting for sender Content Engines. This value is configured in the Default Multicast-out Bandwidth field in the Creating New Multicast Cloud window. (See the "Configuring Multicast Cloud Properties" section on page 5-34.) For receiver Content Engines, this field is blank.
Note
If you leave this field blank, acquisition and distribution traffic may use the maximum bandwidth allowance for the unscheduled periods.
Step 8
To revert to the previously configured window settings, click Reset. The Reset button is visible only when you apply default or group settings to change the current device settings, but you have not yet clicked Submit.
Step 9
To save your settings, click Submit.
Displaying a Graphical Representation of the Acquisition and Distribution Bandwidth Settings
You can view a graphical representation of the bandwidth settings configured on a Content Engine for the acquisition and distribution of files. The vertical axis of the graph represents the amount of bandwidth in kbps and the horizontal axis represents the days of the week. The scale shown on the vertical axis is determined dynamically based on the bandwidth rate for a particular type of bandwidth and is incremented appropriately. The scale shown on the horizontal axis for each day is incremented for each hour. Each type of bandwidth is represented by a unique color. A legend at the bottom of the graph maps the colors to the corresponding bandwidths.
To view the graph that displays the acquisition and distribution bandwidth, follow these steps:
Step 1
In the Content Distribution Manager GUI, choose Devices > Devices.
Step 2
Click the Edit icon next to the Content Engine for which you want to view the acquisition and distribution bandwidth. The Device Home window appears.
Step 3
In the Contents pane, choose Prepositioning > Default Bandwidth. The Acquisition and Distribution Default Bandwidth for Content Engine window appears. (See Figure 6-11.)
Step 4
Click the Display Graph icon in the taskbar. A new Acquisition and Distribution bandwidth for Content Engine popup window appears, displaying the bandwidth graph.
You can choose the view that you want to apply to the bandwidth graph by clicking the different viewing options. (See Table 6-6 for a description of the viewing options.)
Step 5
Click Close once you have finished viewing the settings. Alternatively, you can click Refresh to view the most recently applied bandwidth settings.
Table 6-6 Viewing Options in Acquisition and Distribution Bandwidth Graph
Item | Description
View specific servers | Displays the bandwidth settings for the corresponding bandwidth type selected.
Distribution In | Displays the bandwidth settings for incoming content distribution traffic.
Distribution Out | Displays the bandwidth settings for outgoing content distribution traffic.
Acquisition In | Displays the bandwidth settings for incoming content acquisition traffic.
All Servers | Displays a consolidated view of all configured bandwidth types. This is the default view combined with the Full Week view.
View mode | Displays detailed and composite bandwidth settings.
Show Detailed Bandwidth | Toggles with the Show Effective Bandwidth option. Displays the detailed bandwidth settings for the device and its associated device groups. The bandwidth settings of the device and device groups are shown in different colors for easy identification.
Show Effective Bandwidth | Toggles with the Show Detailed Bandwidth option. Displays the consolidated or composite bandwidth settings for the device and its associated device groups.
Show Aggregate View | Toggles with the Show Non-Aggregate View option. Displays the bandwidth settings configured for the corresponding device groups.
Show Non-Aggregate View | Toggles with the Show Aggregate View option. Hides the bandwidth settings configured for the corresponding device groups.
View by day | Displays the bandwidth settings for a particular day or all days of the week.
Sun, Mon, Tues, Wed, Thurs, Fri, Sat | Displays the bandwidth settings for the corresponding day of the week.
Full Week | Displays the bandwidth settings for the entire week. This is the default view combined with the All Servers view.
Configuring Acquisition and Distribution Bandwidth Settings for Scheduled Times
In addition to being able to set the default bandwidth limits for incoming and outgoing traffic on the Content Engine, you can use the Content Distribution Manager GUI to configure different limits for different time segments that form a week-long cycle. For example, you can configure the acquisition-in limit at a maximum of 100 kbps from 8:00 a.m. to 8:00 p.m., Monday through Friday, and extend that limit to as high as 10 Mbps during the nights from 8:00 p.m. to 8:00 a.m., Monday through Friday and all day Saturday and Sunday. These settings override the default value for the time and period that you specify.
Note
For a schedule from 8:00 p.m. to 8:00 a.m., the administrator must configure two schedules in order to span the two days: one from 8:00 p.m. to 11:59 p.m. (2000 to 2359) and another from 12:00 a.m. to 8:00 a.m. (0000 to 0800).
Note
Distribution bandwidth settings apply only to unicast distribution.
Note
To disable acquisition and distribution activity during unscheduled times of the day, you must enter zeros (0) for the default bandwidth settings in the Acquisition and Distribution Default Bandwidth window. (See Figure 6-11.) If you do not set the default bandwidth fields to 0, acquisition and distribution traffic continues during the unscheduled times of the day. If you leave the default bandwidth fields blank, acquisition and distribution traffic may use the maximum bandwidth allowance for the unscheduled periods.
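The notes above imply two behaviors that are easy to model: a schedule that crosses midnight must be entered as two schedules, and the default bandwidth applies whenever no schedule covers the current time. The following Python sketch illustrates both under stated assumptions; the Slot type, the day numbering (0 = Monday), and the function names are hypothetical, and 10 Mbps is approximated as 10000 kbps.

```python
from typing import List, NamedTuple


class Slot(NamedTuple):
    day: int    # 0 = Monday ... 6 = Sunday (assumed numbering)
    start: int  # start time as HHMM, for example 2000
    end: int    # end time as HHMM, for example 2359
    kbps: int   # bandwidth limit for this slot


def split_overnight(day: int, start: int, end: int, kbps: int) -> List[Slot]:
    """Split a window that crosses midnight into two same-day slots."""
    if start <= end:
        return [Slot(day, start, end, kbps)]
    next_day = (day + 1) % 7
    return [Slot(day, start, 2359, kbps), Slot(next_day, 0, end, kbps)]


def effective_kbps(slots: List[Slot], day: int, hhmm: int, default_kbps: int) -> int:
    """Return the scheduled limit if a slot covers the time, else the default."""
    for slot in slots:
        if slot.day == day and slot.start <= hhmm <= slot.end:
            return slot.kbps
    return default_kbps


# Monday 8:00 p.m. to Tuesday 8:00 a.m. at roughly 10 Mbps.
slots = split_overnight(0, 2000, 800, 10000)
print(slots)
print(effective_kbps(slots, 1, 1200, 1024))  # Tuesday noon: default applies
```

Passing 0 as default_kbps models the case in which the default bandwidth fields are set to zero, so that no acquisition or distribution traffic is allowed outside the scheduled slots.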
To configure acquisition and distribution bandwidth settings for specific days and times, follow these steps:
Step 1
From the Content Distribution Manager GUI, choose Devices > Devices.
Step 2
Click the Edit icon next to the name of the Content Engine that you want to configure. The Device Home window appears.
Step 3
In the Contents pane, choose Prepositioning > Scheduled Bandwidth. The Acquisition and Distribution Bandwidth Schedule for Content Engine window appears.
Step 4
The Aggregate Settings Yes radio button is selected by default. With this setting, the acquisition and distribution bandwidth schedule is displayed for the Content Engine and for any associated device groups. The schedule settings for device groups are read-only; they cannot be modified or deleted. If you click the Aggregate Settings No radio button, you can view and modify the acquisition and distribution bandwidth schedule for the Content Engine only. The schedule settings for any associated device groups are not displayed.
Step 5
Click the Create New Bandwidth Setting icon in the taskbar. The Creating New Acquisition and Distribution Bandwidth Schedule for Content Engine window appears. (See Figure 6-12.)
Figure 6-12 Creating New Acquisition and Distribution Bandwidth Schedule Window
Step 6
Choose a bandwidth type from the drop-down list. (See Table 6-7 for a description of each field in this window. All fields are required.)
Step 7
Enter the bandwidth rate, start time, end time, and day of the week in the appropriate fields.
Table 6-7 Configuring Acquisition and Distribution Bandwidth Settings
Field
|
Description
|
Bandwidth Type
|
Distribution-in—For incoming unicast content distribution traffic from Content Engines.
Distribution-out—For outgoing unicast content distribution traffic to Content Engines.
Acquisition-in—For incoming content acquisition traffic from origin servers.
Multicast-out—For outgoing multicast content distribution traffic to Content Engines.
|
Bandwidth Rate
|
Maximum amount of bandwidth that you want to allow (in kbps).
|
Start Time
|
Time of day for the bandwidth setting to begin, using a 24-hour clock in local time (hh:mm).
|
End Time
|
Time of day for the bandwidth setting to end (hh:mm).
|
Day Selection
|
Days on which bandwidth settings apply.
• Full Week—Specifies that the allowable bandwidth settings are applied for an entire week.
• Sun, Mon, Tue, Wed, Thu, Fri, and Sat—Specifies individual days of the week on which the allowable bandwidth settings take effect.
|
Step 8
Click Submit.
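The way scheduled settings override the default bandwidth can be modeled in a few lines of Python. This is a sketch under stated assumptions, not the Content Engine's actual logic: the class and function names are hypothetical, the end time is treated as inclusive, and when two entries overlap the first match wins. The field names mirror Table 6-7, and the example values follow the 100 kbps daytime and 10 Mbps night-time example given earlier, with the default set to 0.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BandwidthSetting:
    bandwidth_type: str   # for example "Acquisition-in"
    rate_kbps: int        # maximum bandwidth allowed, in kbps
    start: str            # "hh:mm", 24-hour clock, local time
    end: str              # "hh:mm", assumed inclusive in this sketch
    days: frozenset       # days on which the setting applies

def effective_limit(settings, defaults, bandwidth_type, day, time):
    """Return the limit in force: a matching schedule, else the default."""
    for s in settings:
        if (s.bandwidth_type == bandwidth_type
                and day in s.days
                and s.start <= time <= s.end):
            return s.rate_kbps      # assumption: first matching entry wins
    return defaults[bandwidth_type]

WEEKDAYS = frozenset({"Mon", "Tue", "Wed", "Thu", "Fri"})
settings = [
    BandwidthSetting("Acquisition-in", 100, "08:00", "20:00", WEEKDAYS),
    BandwidthSetting("Acquisition-in", 10000, "20:00", "23:59", WEEKDAYS),
]
defaults = {"Acquisition-in": 0}    # 0 disables traffic in unscheduled times

print(effective_limit(settings, defaults, "Acquisition-in", "Mon", "09:30"))
print(effective_limit(settings, defaults, "Acquisition-in", "Sat", "12:00"))
```

With the default set to 0, any time not covered by a schedule entry (Saturday at noon in this example) resolves to no acquisition bandwidth.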
Scheduling Bandwidth for Streaming Acquisition
When streaming acquisition is configured, the streaming content consumes a large amount of bandwidth over the playtime duration of the file. You must ensure that enough bandwidth is available for blocks of time long enough to cover that duration. For multi-bit-rate (MBR) streams, the required bandwidth is the sum of all bit rates encoded in the stream.
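As a worked illustration of the MBR rule, the following Python sketch adds up the encoded bit rates and checks them against a scheduled limit and window. The function names and the sample bit rates are hypothetical; only the summing rule comes from the text above.

```python
def required_bandwidth_kbps(bit_rates_kbps):
    """Bandwidth needed for an MBR stream: the sum of its encoded bit rates."""
    return sum(bit_rates_kbps)

def fits_schedule(bit_rates_kbps, limit_kbps, window_minutes, playtime_minutes):
    """True if the scheduled limit and window cover the whole stream."""
    enough_rate = required_bandwidth_kbps(bit_rates_kbps) <= limit_kbps
    enough_time = playtime_minutes <= window_minutes
    return enough_rate and enough_time

# A stream encoded at three bit rates (sample values).
rates = [56, 128, 300]
print(required_bandwidth_kbps(rates))        # 484
print(fits_schedule(rates, 500, 120, 90))    # True
```

A 90-minute stream with these rates needs at least 484 kbps of acquisition bandwidth for a window of 90 minutes or more.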
Error Codes
This section describes the error codes for content replication. The following error codes are shown in the Content Distribution Manager GUI Replication Status window:
ACNS Unified Name Space Errors
1401: Bad magic number in unified name space (UNS) meta file
1402: Unknown version in UNS meta file
1403: Bad checksum on UNS meta file
1404: Internal error - URL mismatch between caller specified URL and URL in meta file
1405: Invalid URL syntax
1406: Attempt to create URL already in UNS
1407: Insufficient space to store requested object
1408: Internal logic error in UNS server code
1409: Requested object not found in UNS
1410: Requested UNS operation not implemented
1411: Failure in underlying RPC transport
1412: Destination URL already exists
1413: Channel does not exist
1414: Writes failed to all metadata files
1415: Object is not servable
1416: Object is out of presentation time
1417: Object playback not allowed by this playback server
1418: Live object, but attributes invalid
1419: Alternate media attributes invalid
1420: Alternate media is not servable
1421: Channel would be over disk quota
1422: Cannot change channel ID
1423: Object already in specified channel
1424: Object not in specified channel
1425: Metadata operations disabled (no file systems)
1426: Channel operations disabled (no file systems)
1427: Nonobject metadata services not initialized
1428: Out of handles for nonobject metadata files
1429: Internal error loading metadata file
1430: (No error text available)
1431: Too many CDN files (cannot add to URL for file system map [UFM])
1432: Specified legacy ECDNFS file not found
1433: Bad MD5 checksum passed by caller
1434: Bad tags present in supplied attributes (OBSOLETE)
1435: Cannot resize content file migrated from ECDN
1436: UNS symlink references nonexistent URL
1437: URL is not a UNS symlink
1438: Too many levels of UNS symlinks
1439: UNS entry metafile truncated
1440: Output to be returned is too big in size
1441: URL request was made on a nondefault service port
1442: Specified legacy file not on any UNS filesystem
1443: Specified legacy file is not in the 'cache' directory
1444: Specified legacy file is referenced by a UNS entry
1445: Specified legacy file is not a .data file
1446: File open failed during ASX/SMIL rewrite operation
1447: Parse error during ASX/SMIL rewrite operation
1448: Initialization error during ASX/SMIL rewrite operation
1449: Actual Rewriting failed during ASX/SMIL rewrite operation
1450: No more file system slots
1451: Specified file system is already in use
1452: Specified file system not known to UNS
1453: Specified file system bytes in use exceeds target
1454: Cannot unuse the file system containing the symlink tree
1455: Cannot add file system because there is no local disk-based CDNFS storage
Acquirer Error Codes
The following error codes are used for content acquisition errors:
700: Acquirer internal error
701: Manifest parser error
702: Manifest parse warning
703: Exceeded disk quota
704: No space in UNS
705: It is a folder
706: ACCEPT failed
707: Connection refused
708: Listen failed
709: Mismatched in crawling
710: No-cache instructed from server
711: Disabled by the user
712: Downloaded size mismatched with content length
713: Invalid response received from server
714: Database access error
715: Not to acquire URL with ?
716: Invalid redirect foo to foo/
717: Illegal folder for file import
718: Unable to connect to proxy server
719: URL is too long
720: HTTP metadata is too long
721: Connection closed by peer
722: Invalid content length
730: MmsOverHttp is not supported
905: Socket timeout
906: No host
907: Zero bandwidth
908: File download aborted
909: Content expired before fetch
912: Request timed out
1000: UNS error
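When replication status is reviewed in a script or log, a small lookup table can turn a code into its description. The sketch below is hypothetical tooling, not an ACNS feature. It includes only a few of the codes listed above, copied verbatim, and groups them the way this section does.

```python
# Subset of the codes listed in this section; extend as needed.
UNS_ERRORS = {
    1407: "Insufficient space to store requested object",
    1413: "Channel does not exist",
    1421: "Channel would be over disk quota",
}
ACQUIRER_ERRORS = {
    700: "Acquirer internal error",
    703: "Exceeded disk quota",
    707: "Connection refused",
    718: "Unable to connect to proxy server",
    905: "Socket timeout",
    907: "Zero bandwidth",
    1000: "UNS error",
}

def describe(code):
    """Return a labeled description for a replication error code."""
    if code in ACQUIRER_ERRORS:
        return f"Acquirer error {code}: {ACQUIRER_ERRORS[code]}"
    if code in UNS_ERRORS:
        return f"UNS error {code}: {UNS_ERRORS[code]}"
    return f"Unknown error code {code}"

print(describe(907))
print(describe(1421))
```

For example, code 907 (Zero bandwidth) can appear when acquisition is attempted during a period in which the bandwidth limit is 0.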
Where to Go Next
In this chapter you accomplished the following tasks:
• Defined content for pre-positioning either directly through the Content Distribution Manager GUI or using a manifest file as described in Appendix A, "Creating Manifest Files"
• Configured the basic manifest file settings in the Content Distribution Manager GUI so that the root Content Engine could fetch the manifest file and update the channel content at specified intervals
• Configured bandwidth settings for the root Content Engine to optimize bandwidth usage in your network
• Configured a proxy to fetch the manifest file, if needed
• Configured authentication as needed to fetch the manifest file, acquire content, or both
Now you need to consider how the content is going to be played back to the end user. If the content is to be scheduled or rebroadcast, proceed to Chapter 7, "Creating and Managing Programs."