Table Of Contents
Configuring the ACNS Network for Content Acquisition
Acquiring Pre-Positioned Content
Set Up the Origin Server
Using HTTP and HTTPS
Using FTP
Using MMS and MMS-over-HTTP
Create a Channel
Assign a Root Content Engine
Assigning a Backup Root Content Engine
Create a Manifest File
Specify Channel Acquisition and Distribution Properties
Distribution Priority
Item Priority
Channel Quota
Update Interval
Configure Manifest File and Proxy Information for the Channel
Generate the Publishing URL
Assign a Playserver
Retry and Refresh Mechanisms
Verifying the Results
Bandwidth Control
Configuring Acquisition and Distribution Default and Maximum Bandwidth Settings
Displaying a Graphical Representation of the Acquisition and Distribution Bandwidth Settings
Configuring Acquisition and Distribution Bandwidth Settings for Scheduled Times
Scheduling Bandwidth for Streaming Acquisition
Proxy Support
Configuring an HTTP Proxy Server Using the CLI
Configuring a Proxy Server Using the Manifest File
Configuring the noProxy Attribute in the Manifest File
Configuring a Proxy Server and Port for the Manifest File
Authentication Support
Acquiring Content That Requires NTLM Authentication
Proxy Authentication Support
Configuring Authentication for a Proxy to Fetch the Manifest File
Configuring Authentication for a Proxy Using the Manifest File
Configuring Authentication for an HTTP Proxy Using the Content Distribution Manager GUI
Configuring Authentication for a Proxy Using the CLI
Configuring Authentication for a WCCP Proxy Using the Content Distribution Manager GUI
Modifying WCCP Proxy Authentication Settings
Configuring Authentication for a WCCP Proxy Using the CLI
Content Acquisition Guidelines
Error Codes
ACNS Unified Name Space Errors
Acquirer Error Codes
MMS-Specific Acquisition Error Codes
Configuring the ACNS Network for Content Acquisition
Content acquisition is an important part of content pre-positioning in the ACNS network. ACNS software uses the concept of a channel to map a set of content objects to a set of Content Engines. Before pre-positioned content can be acquired or distributed in the ACNS network, users must create a new channel, subscribe Content Engines to that channel, and designate one of the Content Engines as the root Content Engine. The source of the content might be stored in various file servers or web servers. The root Content Engine of the channel fetches the content objects from these origin servers and then replicates this content to all the Content Engines in the channel.

Note
Pre-positioned content is served only on ports that are standard for the protocol. If the incoming URL contains a port number other than the protocol's standard port (for example, HTTP uses port 80, RTSP uses port 554, and WMT uses port 1755), then the Content Engine does not attempt to serve the content from the pre-positioned file system (cdnfs). Instead, the Content Engine tries to serve the content from the cache file system (cfs) or tries to fetch the content from the origin server, depending on the existing configuration of the Content Engine.
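The port rule in this note can be sketched as a small check. This is an illustration only, not ACNS code: the function name and the mapping of URL schemes (http, rtsp, mms) to the ports listed above are assumptions.

```python
from urllib.parse import urlsplit

# Standard ports named in the note; the scheme keys are assumptions about
# how each protocol appears in a request URL.
STANDARD_PORTS = {"http": 80, "rtsp": 554, "mms": 1755}


def uses_standard_port(url):
    """Return True if the URL has no explicit port or uses the standard one."""
    parts = urlsplit(url)
    standard = STANDARD_PORTS.get(parts.scheme)
    if standard is None:
        return False  # protocol not covered by this sketch
    return parts.port is None or parts.port == standard
```

Under this sketch, a request for http://www.example.com:8080/a.html would not qualify for serving from cdnfs, while the same URL without the port would.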
The root Content Engine uses a software agent, referred to as the acquirer, that gathers channel content before it is distributed to the receiver Content Engines in the ACNS network. The acquirer maintains a task list, which it updates after receiving a notification of changes in its channel configuration.
This chapter outlines the tasks necessary for acquiring pre-positioned content in the ACNS network and contains information on the following topics:
• Acquiring Pre-Positioned Content
• Retry and Refresh Mechanisms
• Verifying the Results
• Bandwidth Control
• Proxy Support
• Authentication Support
• Content Acquisition Guidelines
• Error Codes
Acquiring Pre-Positioned Content
To configure your ACNS network to acquire content, you must complete the following tasks:
1. Set Up the Origin Server
2. Create a Channel
3. Assign a Root Content Engine
4. Create a Manifest File
5. Specify Channel Acquisition and Distribution Properties
6. Generate the Publishing URL
7. Assign a Playserver
Set Up the Origin Server
Content is usually stored in file servers or web servers that are not part of the ACNS network of managed devices. In order to pre-position this content for ACNS network acquisition and distribution, the origin file server or web server must support at least one of the following protocols that are used by ACNS software to acquire content:
• HTTP
• HTTPS
• FTP
• MMS
• MMS-HTTP
Using HTTP and HTTPS
Any standard web server supports the HTTP and HTTPS protocols. You can set up your web server as an origin server for pre-positioned content intended for the ACNS network by moving the content over to the web server or by configuring the web server to access the desired content. The following two web servers are the most popular:
• Apache: Supported on UNIX, Linux, and Microsoft NT platforms
• Microsoft IIS: Supported only on Microsoft platforms
For the HTTP and HTTPS protocols, content can be fetched as single content items by using the <item> tag in the manifest file, or content can be fetched by using the crawling feature to crawl the web server directories. If you want to use the crawl feature, you have to enable directory indexing and make sure the directory does not contain index.html, default.html, or home.html files.
Tip
You might need to install SSL certificates in order to set up the web server for HTTPS content acquisition. If your server is using an expired certificate, or a self-signed certificate, you should set sslAuthType="weak" in the manifest file <host> tag.
Using FTP
The root Content Engine acquirer supports acquiring files from FTP servers. Using FTP, content can be acquired as single content items by using the <item> tag in the manifest file, or content can be fetched by using the crawling feature to crawl the FTP server directories. The following popular FTP servers are supported:
• Microsoft IIS 4.0, 5.0, 6.0: For Windows platforms
• Wu-2.6.1-18: For Linux platforms
• FTP Server: In SunOS (5.6)
• proFTPd: For Linux platforms
Other supported Windows FTP servers are:
• WS_FTP server
• Bulletproof FTP server
• SurgeFTP
• SlimFTPd
You can use other FTP servers, as long as the following FTP commands are supported:
• USER, PASS
• [SIZE, MDTM] [or] [LIST -a]
• PASV [or] PORT
• CWD ~ [or] CWD <SPACE> [or] CWD /
• RETR
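As a rough illustration, the command requirements above can be expressed as a check against the set of commands that a server advertises. The function name is hypothetical, and the reading of "[SIZE, MDTM] [or] [LIST -a]" as "both SIZE and MDTM, or else LIST" is an assumption.

```python
def ftp_server_is_usable(supported_commands):
    """Check a collection of FTP command names against the list above."""
    cmds = {c.upper() for c in supported_commands}
    has_login = {"USER", "PASS"} <= cmds
    # Assumed reading: SIZE and MDTM together, or LIST as the alternative.
    has_metadata = {"SIZE", "MDTM"} <= cmds or "LIST" in cmds
    has_data_mode = "PASV" in cmds or "PORT" in cmds
    has_navigation_and_fetch = {"CWD", "RETR"} <= cmds
    return has_login and has_metadata and has_data_mode and has_navigation_and_fetch
```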
Using MMS and MMS-over-HTTP
The root Content Engine supports acquiring streaming media files using either the MMS protocol or MMS-over-HTTP. In the manifest file, MMS-over-HTTP can be specified as MMS-HTTP or simply as HTTP.
When the MMS protocol is specified, MMST is used to fetch the stream over TCP. When MMS-HTTP is specified, the software uses the MMS-over-HTTP protocol to download the stream. When HTTP is specified, the software automatically detects whether the download is a regular HTTP download or a streaming MMS over HTTP download, and uses the corresponding protocol.
Users sometimes need to pre-position streaming content that is hosted on nonstandard non-Microsoft Windows Media Technologies (WMT) servers. Using the MMS-HTTP protocol addresses this situation by allowing HTTP streaming acquisition from nonstandard origin servers that do not allow automatic switch-over from HTTP-download to HTTP-stream-download.
Note
The MMS-HTTP protocol can be specified in the manifest file only when fetching single content items with the <host> and <item> tags.
To acquire content using the MMST protocol, make sure that port 1755 is open on the firewall. For MMS-HTTP, make sure that the HTTP port, which is typically port 80, is open.
Create a Channel
Channels map a set of content objects to a set of Content Engines. You must have channels configured before you can acquire pre-positioned content or distribute it. To create channels, see "Configuring the ACNS Network for Content Distribution." Chapter 5 discusses how to create channels for content acquisition and distribution by configuring the following network elements:
1. Locations: Creating and Modifying Locations
2. Content Providers: Creating and Modifying Content Providers
3. Websites: Creating and Modifying Websites
4. Channels: Creating and Modifying Channels
Assign a Root Content Engine
A root Content Engine is used to acquire content for a channel. A channel can have only one root Content Engine. We recommend that you choose a root Content Engine that has enough bandwidth to access the content at the origin server.
For information on how to assign a root Content Engine, see the "Designating the Root Content Engine" section.
Assigning a Backup Root Content Engine
You do not need to specify a backup root Content Engine; however, for backup purposes, you must have a second Content Engine in the same location as the primary Content Engine that is subscribed to the channel. If the designated root Content Engine becomes inactive, the other Content Engine in the location automatically becomes a temporary root Content Engine. If the designated root Content Engine comes back online, it takes over as the root and the temporary root Content Engine becomes a regular Content Engine.
During content acquisition, the acquirer uses a lot of CPU power. We recommend that you set up a dedicated high-end root Content Engine for content acquisition. The root Content Engine can also be used for streaming video files and for serving pre-positioned content; however, streaming video quality might suffer during periods of heavy content acquisition.
Create a Manifest File
See "Creating Manifest Files," for details on creating manifest files.
After you create the manifest file, use the Manifest Validator tool to verify the syntax. Next, specify the manifest URL in the Creating Channel window (when creating a new channel) or in the Modifying Channel window of the Content Distribution Manager GUI. If authentication is required to fetch the manifest file, specify a username and password as well. (See Figure 6-1.)
Figure 6-1 Manifest File and Proxy Information for a Channel
Specify Channel Acquisition and Distribution Properties
This section discusses four important content acquisition and distribution properties that you need to define:
• Distribution priority
• Item priority
• Channel quota
• Update interval
Some of these properties are configured in the Channels > Channels section of the Content Distribution Manager GUI and some are defined in the manifest file.
Distribution Priority
The distribution priority setting determines the priority of content acquisition and distribution. You configure this setting from the Distribution Priority drop-down list in the Content Distribution Manager GUI. The distribution priority values are High (750), Normal (500), or Low (250). Figure 6-2 shows the acquisition and distribution properties and the manifest properties that you can configure in the Content Distribution Manager GUI. See the "Creating a Channel" section for channel configuration procedures and field descriptions.
Figure 6-2 Acquisition and Distribution Properties for a Channel
The priority of content acquisition also depends on the origin server. Requests from different origin servers are processed in parallel. Requests from the same origin server are processed sequentially by their overall priority. The exception is Microsoft Media Server (MMS) and MMS-HTTP requests: all MMS and MMS-HTTP requests are processed sequentially by their overall priority. The overall priority is calculated as follows:
overall priority = (distribution priority * 10000) + item priority
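The overall priority calculation can be sketched as follows. The distribution priority values come from the drop-down list described above; the function and dictionary names are illustrative only.

```python
# Distribution priority values from the Distribution Priority drop-down list.
DISTRIBUTION_PRIORITY = {"High": 750, "Normal": 500, "Low": 250}


def overall_priority(distribution, item_priority):
    """overall priority = distribution priority * 10000 + item priority."""
    return DISTRIBUTION_PRIORITY[distribution] * 10000 + item_priority
```

For example, an item with item priority 3 in a channel set to Normal has an overall priority of 5000003.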
Item Priority
The item priority is determined by the manifest file. If the priority attribute is specified in the manifest file under the <item> tag, it is used as the item priority; if it is not specified, the index number of the item in the manifest file is used as the item priority. (All crawled pages have the same item priority; therefore, you do not need to specify the priority attribute under the <crawler> tag.) See the "Specifying Content Priority" section for more information.
Channel Quota
The channel quota is the disk space allowed for the channel. You configure the channel quota in the Content Distribution Manager GUI (see Figure 6-2). See the "Configure Manifest File and Proxy Information for the Channel" section for configuration information.
When configuring the channel quota, keep in mind the following:
• The total of the channel quotas for all subscribed channels should not exceed the cdnfs disk space allocation of the Content Engine. See the "Updating Storage Capacity Through the Content Distribution Manager GUI" section.
• Total used disk space in a channel should not exceed the amount of disk space that you allocated for the channel in the Channel Quota field of the Content Distribution Manager GUI (Channels > Channels > Basic Settings > Definition).
Because of overhead, the amount of disk space used by a file is always larger than the size of the file itself. To figure the amount of disk space needed for a file, follow these steps:
a. Divide the actual file size in bytes by the file system block size, which is a fixed 4-KB (4096-byte) unit, and then round up the result to the nearest integer. This provides the number of filled and partially filled 4-KB blocks used by the file.
(File size in bytes / 4096) rounded up to the next integer value = Total number of blocks per file
b. Multiply the total number of file system blocks used by 4096 bytes (4 KB) to calculate the actual disk space consumed in bytes.
Total blocks per file * 4096 = Total disk usage in bytes
c. Multiply 4096 bytes (4 KB) by 4 and add the product to the total disk space consumed. (The integer 4 represents disk space that is reserved for internal system usage.)
Total disk usage in bytes + (4096 bytes * 4) = Disk usage per file
Also, because the software attempts to reserve enough space for other minor internal system functions, it is helpful to configure your channel quotas (and pre-positioned disk space) with a modest amount (perhaps 10 percent) of extra space beyond the total disk space consumed.
Channel quota in kilobytes = (Total disk usage in kilobytes) + (0.1 * Total disk usage in kilobytes)
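The calculation can be sketched in a few lines, assuming file sizes are given in bytes. The function names are hypothetical, and the 10 percent margin is the suggested figure from the text rather than a fixed requirement.

```python
BLOCK_SIZE = 4096      # fixed file system block size in bytes
RESERVED_BLOCKS = 4    # blocks reserved per file for internal system usage


def disk_usage_bytes(file_size_bytes):
    """Disk space consumed by one file, following steps a through c."""
    blocks = -(-file_size_bytes // BLOCK_SIZE)  # integer division, rounded up
    return blocks * BLOCK_SIZE + RESERVED_BLOCKS * BLOCK_SIZE


def suggested_quota_bytes(file_sizes_bytes):
    """Total disk usage for all files plus a 10 percent margin, rounded up."""
    total = sum(disk_usage_bytes(size) for size in file_sizes_bytes)
    return total + -(-total // 10)
```

A 10,000-byte file occupies 3 blocks (12,288 bytes) plus 16,384 reserved bytes, for 28,672 bytes in total. Convert the result to the unit that the Channel Quota field expects before entering it.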
Update Interval
The update interval is the interval for the root Content Engine to check the manifest file itself. (This is not the update interval for checking content.) You configure the update interval in the Content Distribution Manager GUI (see Figure 6-2). See the "Configure Manifest File and Proxy Information for the Channel" section for configuration information.
Configure Manifest File and Proxy Information for the Channel
To configure manifest file and proxy information for the channel, follow these steps:
Step 1
In the Content Distribution Manager GUI, choose Channels > Channels.
Step 2
Click the Edit icon next to the name of the channel that you want to modify. The Modifying Channel window appears.
Step 3
Use the fields provided under the Manifest heading to configure the location or information identifying the manifest file for the channel. (See Figure 6-1.)
The manifest file provides information about the content to be pre-positioned through the channel or information about the live and video-on-demand (VOD) content served through the channel.
Step 4
Use the fields provided under the Manifest Proxy Information heading to configure manifest proxy information.
See Table 6-1 for a description of the manifest fields. Required fields are indicated by an asterisk in the GUI and in the table.
Table 6-1 Manifest Properties
Property | Description
Manifest |
Manifest URL | Address of the manifest file for the channel. The manifest URL must be a well-formed URL. If the protocol (FTP, HTTP, or HTTPS) for the URL is not specified, HTTP is used.
Channel Quota* | Maximum content storage size in megabytes for pre-positioning content for this channel. (This field is required if the manifest URL is specified.)
Update Interval* | Frequency in minutes (0 to 52560000) with which the Content Engines assigned to the channel check for updates to the manifest file. (This field is required if the manifest file is specified.)
Weak Certificate Verification | When checked, enables weak certificate verification for the manifest file. This is applicable when the manifest file is fetched using the HTTPS protocol. Note: To use weak certification for channel content, you need to specify weak certification within the manifest file.
Manifest Username | Username to fetch the manifest. The manifest username must be a valid ID. If the server allows anonymous login, the user ID can be null. Note: The Username and Password fields allow you to enter any secure login information needed to access the manifest file at its remote location.
Manifest Password | Password for the user.
Confirm Password | Password confirmation.
Disable basic authentication | When checked, NTLM headers cannot be stripped off to allow fallback to the basic authentication method. If you leave this check box unchecked, NTLM authentication headers can be stripped to allow fallback to the basic authentication method, and the username and password information can be passed to the origin server in clear text with a basic authentication header.
NTLM user domain name | NTLM user domain name to pass the NTLM authentication scheme configured on the origin server.
Manifest Proxy Information |
Disable All Proxy | Disables the outgoing proxy server for fetching the manifest file. Any outgoing proxy server configured on the root Content Engine is bypassed, and the acquirer contacts the origin server directly.
Proxy Hostname | Host name or IP address of the proxy server used by the acquirer to retrieve the manifest file.
Proxy Port | Port number of the proxy on which the acquirer fetches the manifest file. The range is from 1 to 65535.
Disable proxy basic authentication | When checked, NTLM headers are not stripped off to allow fallback to the basic authentication method against Microsoft Internet Information Services (IIS) servers. If you leave this check box unchecked, NTLM authentication headers can be stripped to allow fallback to the basic authentication method; the username and password information can be passed to the proxy server in clear text with a basic authentication header.
Proxy Username | Name of the user to be authenticated to fetch the manifest file.
Proxy Password | Password of the user to pass authentication from the proxy.
Confirm Password | Reentry of the same password for confirmation to pass authentication from the proxy.
Proxy NTLM domain name | NTLM user domain name to pass the NTLM authentication scheme configured on the proxy.
Generate the Publishing URL
A publishing URL is the URL that plays back pre-positioned content in the ACNS network. A complete publishing URL consists of three parts:
• Scheme
• Domain name
• Path
The path includes both the file directory path and the filename. The playserver list determines the publishing URL for the ACNS network. The playserver list is generated directly through the manifest file, through the <playServerTable> tag in the manifest file, or through the default playserver table.
Scheme
The scheme of the publishing URL is the protocol used to play the content type. For example, if an .asf video file can be played by both an HTTP and a WMT playserver, two URL schemes can be used to access this content: HTTP and MMS.
The scheme is determined by the type of playserver. The direct mapping between playserver and scheme is as follows:
Playserver | Scheme
HTTP | HTTP
Real | RTSP
WMT | MMS
QTSS | RTSP
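To make the three-part structure concrete, the following sketch assembles a publishing URL from a playserver type, a domain name, and a path. The playserver-to-scheme mapping is taken from the table above; the function name, the lowercase keys, and the example host are assumptions for illustration.

```python
# Playserver-to-scheme mapping from the table above (lowercased keys assumed).
PLAYSERVER_SCHEME = {"http": "http", "real": "rtsp", "wmt": "mms", "qtss": "rtsp"}


def publishing_url(playserver, domain_name, path):
    """Join scheme, domain name, and path into a publishing URL."""
    scheme = PLAYSERVER_SCHEME[playserver.lower()]
    return f"{scheme}://{domain_name}/{path.lstrip('/')}"
```

For an .asf file that both the HTTP and WMT playservers can play, calling the function once per playserver yields the two URLs (http:// and mms://) that can be used to access the content.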
Domain Name
The domain name of the publishing URL is determined by the configuration of the ACNS network. If WCCP is used to redirect requests to a Content Engine, its domain name is the origin server FQDN (fully qualified domain name) in the website or channel. If content routing is used, the content routing FQDN (the FQDN of the website) becomes the domain name.
All content acquired through the manifest file is published under the domain name that is entered in the Content Distribution Manager GUI (Channels > Web Sites) in the website Origin Server field, in the case of WCCP routing, or in the Request Routed FQDN field, in the case of a Content Router.
The content must be accessible from the website origin server FQDN, even if it is acquired from a different server.
The manifest file contains a <server> tag. The "server" in the case of this manifest tag refers to "acquisition server," and may or may not be the same as the website origin server FQDN that is published during content serving. Note the distinction between the acquisition server and the website origin server FQDN.
Manifest file attributes such as requireAuth and noRedirectToOrigin refer to the website origin server. Attributes such as ttl and username-password are related to the acquisition server. Thus, if you specify the requireAuth attribute for any content item, make sure that the origin server FQDN you entered in the Content Distribution Manager GUI Create New Web Site window or Modifying Web Site window (accessed through Channels > Web Sites) is accurate and can do the following:
1. Accept requests for the path as stated in the manifest file.
2. Accept authentication requests from end users for this URL.
This reminder is applicable even when you have multiple acquisition servers specified in the manifest file. Because the "real" origin server is still the website origin server FQDN, you need to make sure that content is accessible from the website origin server FQDN and that the website origin server can accept authentication requests.
Path
In most cases, the path of the publishing URL is the relative src URL, or the src attribute in the <item> tags. For content crawling, it is a relative URL, relative to the host name of the origin server.
Certain attributes in the manifest file allow you to alter the publishing URL path. These attributes are cdn-url in the <item> tag, and srcPrefix or cdnPrefix in the <crawler> and <item-group> tags. These attributes convert a relative source URL into a completely new relative ACNS network URL.
For the content in the following example, the path uses default.html instead of index.html.
<item src="index.html" cdn-url="default.html" />
The relative URL is always relative to the host name. In the following example, the relative URL is index.html, not sport/index.html.
<host name="http://www.cnn.com/sport/" />
<item src="index.html" />
In the following example, the srcPrefix and cdnPrefix attributes convert the prefix of every crawled content object from NBA/ to ABC/. The relative cdn-url is ABC/*. The path for the start-url attribute is ABC/index.html.
start-url="NBA/index.html"
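The prefix conversion described above can be sketched as a simple string rewrite. This is an illustration of the NBA/ to ABC/ example, not the acquirer's implementation; the function name is hypothetical, and leaving non-matching paths unchanged is an assumption.

```python
def rewrite_path(src_path, src_prefix, cdn_prefix):
    """Replace a leading src_prefix in a relative source URL with cdn_prefix."""
    if src_path.startswith(src_prefix):
        return cdn_prefix + src_path[len(src_prefix):]
    return src_path  # assumption: paths without the prefix are left unchanged
```

With src_prefix "NBA/" and cdn_prefix "ABC/", the start URL NBA/index.html is published under the path ABC/index.html.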
Assign a Playserver
The playserver is assigned in the manifest file. See the "Generating a Playserver List" section for more details.
The <playServer> tag is very important for playing back pre-positioned content; it contains a list of playservers to play back the content. All ACNS pre-positioned content needs to use one of the ACNS software-supported playservers to play back content to end users.
ACNS 5.x software supports playservers that play back the following pre-positioned content types on the ACNS network: HTTP, HTTPS, WMT, and RTSP (RealMedia and QuickTime Streaming Server [QTSS]).
You can use any supported protocol to request content. The protocol in the request implies which playserver is needed to play the content. The ACNS software checks whether the requested protocol matches an entry in the playserver list. If it matches, the request is delivered. If it does not match, the request is rejected.
You can generate a playserver list through:
• The manifest file, by configuring playServer attributes in an <item> tag
• The <playServerTable> tag, by configuring playserver MIME-type extension names
To create the playserver list directly through the manifest file, configure playServer attributes of the playserver list in an <item> tag. If an <item> tag does not have a playServer attribute, its playserver list is generated through the <playServerTable> tag. If the <playServerTable> tag is omitted in the manifest file, a built-in default <playServerTable> tag is used to generate the playserver list. Multiple servers are separated by commas, as shown in the following example:
<item src="video.mpg" playServer="real,wmt" />
You can also generate the playserver list that supports these streaming media types through the <playServerTable> tag. The <playServerTable> tag maps content into a playserver list based on the MIME-type extension name. If there is a <playServerTable> tag in the manifest file, use it to generate the playserver list.
To generate the playserver list through the <playServerTable> tag, use MIME-type extension names to configure which playserver can play the particular pre-positioned content, as shown in the following example:
<contentType name="application/x-pn-realaudio" />
<contentType name="application/vnd.rn-rmadriver" />
<contentType name="application/pdf" />
<contentType name="application/postscript" />
The <playServerTable> tag is used to generate a playserver list for each content type. Note that in the preceding example, any file with a PDF or PostScript extension uses HTTP to play the content.
In ACNS 5.1 software, HTTP and HTTPS are the default playservers. In other words, if you did not use the customized <playServerTable> tag or the playServer name field to specify playservers for content, HTTP and HTTPS are added to the playServer name fields to make sure the content can be always played back through HTTP and HTTPS.
If you do use a <playServerTable> tag or playServer attribute in the manifest file, HTTP and HTTPS are not automatically allowed for playback. In this case, if you want to use HTTP or HTTPS to play certain content, you must specify the protocol. For example:
<playServer name="tvout">
<playServer name="https">
<contentType name="text/html" />
<item src="http://www.aaa.com/a.asf" />
<item src="http://www.aaa.com/b.html" />
<item src="http://www.aaa.com/c.mpg" />
<item src="http://www.aaa.com/d.jpg" playServer="http,https" />
<item src="http://www.aaa.com/e.rm" />
In this example, the playserver list is generated as follows:
a.asf: wmt: from <playServerTable> by extension
b.html: https: from <playServerTable> by contentType
c.mpg: tvout: from <playServerTable> by extension.
d.jpg : http + https: from "playServer" attribute
e.rm : real + http + https: from built-in <playServerTable>
Because no playserver is specified in the manifest file for the .rm extension, the built-in rule matches it to the RealMedia playserver. The http and https playservers are also added automatically when the built-in playserver table is used.
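The lookup order in this example can be sketched as follows. This is a simplified model under several assumptions: tables are keyed by file extension only (contentType matching is folded into the extension), the built-in table contains only the .rm rule mentioned above, and all names are hypothetical.

```python
import os

# Hypothetical stand-in for the built-in table; only the .rm rule is
# stated in the text, so nothing else is listed here.
BUILTIN_TABLE = {"rm": ["real"]}


def playserver_list(src, play_server_attr=None, custom_table=None):
    """Resolve playservers: item attribute, then custom table, then built-in."""
    if play_server_attr:
        return [name.strip() for name in play_server_attr.split(",")]
    extension = os.path.splitext(src)[1].lstrip(".").lower()
    if custom_table and extension in custom_table:
        return list(custom_table[extension])
    # http and https are added only when the built-in table is used.
    return BUILTIN_TABLE.get(extension, []) + ["http", "https"]
```

With a custom table of {"asf": ["wmt"], "mpg": ["tvout"], "html": ["https"]}, the sketch reproduces the five results listed above.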
The TV-out playserver is a special playserver because it allows all pre-positioned content to be played back. If you want to exclude the content from being played back by other playservers, specify the TV-out playserver. ACNS software does not allow other playservers to play content once a playserver has been specified.
Retry and Refresh Mechanisms
When the acquirer tries to acquire a content item specified in the <item> or <crawler> tags, and the acquisition fails (for example, because of some intermittent error or for some other reason), the item task (corresponding to the <item> or <crawler> tag) is retried after the interval specified in the failRetryInterval attribute.
If the failRetryInterval attribute is not specified, the default interval for retrying the task is 5 minutes. (For a crawl job, the task is considered to be a failure only when crawling fails to occur at all. If only some of the crawled pages fail to be acquired, the task is still considered a success.) The following rules apply to the retry mechanism:
• Single item tasks specified in the <item> tag are retried for all errors except the EXCEED_DISK_QUOTA error.
• Crawl tasks are not considered a failure if the error status is between 300 and 500. The EXCEED_DISK_QUOTA error does not cause a retry, either.
• When you change the disk quota from the Content Distribution Manager GUI, the acquirer is notified automatically and retries all status error nodes containing the EXCEED_DISK_QUOTA error.
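The retry decision for a failed single item task can be sketched as follows. The function name is hypothetical, and treating the failRetryInterval value as minutes is an assumption based on the 5-minute default described above.

```python
DEFAULT_FAIL_RETRY_MINUTES = 5  # default when failRetryInterval is not specified


def retry_delay_minutes(error_name, fail_retry_interval=None):
    """Return the delay before the next retry, or None if no retry is scheduled."""
    if error_name == "EXCEED_DISK_QUOTA":
        return None  # retried only after the disk quota is changed
    if fail_retry_interval is None:
        return DEFAULT_FAIL_RETRY_MINUTES
    return fail_retry_interval
```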
When an item is acquired successfully, and if a positive value is specified in the ttl attribute, the acquirer rechecks the content for freshness at the interval specified by the ttl attribute. A ttl value of 0 (zero) means that the acquirer will not recheck the item unless the manifest file is updated. A negative ttl value means that the acquirer will never recheck the item. The following rules apply to the refresh mechanism:
• When ttl > 0: recheck every ttl minutes. The acquirer also rechecks the content when the manifest file is reparsed, or when you click the Refetch button in the Content Distribution Manager GUI.
• When ttl = 0: acquire only once. The acquirer rechecks only when the manifest file is reparsed, or when you click the Refetch button in the Content Distribution Manager GUI.
• When ttl < 0: acquire only once. The acquirer does not recheck even if the manifest file is reparsed or you click the Refetch button in the Content Distribution Manager GUI.
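The three ttl cases can be summarized in a small decision function. The trigger names ("timer", "manifest_reparse", "refetch") are labels invented for this sketch.

```python
TRIGGERS = ("timer", "manifest_reparse", "refetch")


def should_recheck(ttl, trigger):
    """Decide whether an acquired item is rechecked for a given trigger."""
    if trigger not in TRIGGERS:
        raise ValueError(f"unknown trigger: {trigger}")
    if ttl > 0:
        return True  # periodic recheck, plus reparse and Refetch
    if ttl == 0:
        return trigger != "timer"  # only on reparse or Refetch
    return False  # negative ttl: never rechecked
```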
Verifying the Results
Use the following show CLI commands to verify acquisition results:
Table 6-2 show Commands for Content Acquisition
Command | Description
show acquirer channel | Shows how many channels use the Content Engine as root Content Engine.
show acquirer progress | Shows the acquisition progress.
show acquirer progress streams | Shows the progress of the MMS streaming download.
show statistics acquirer | Shows the result of content acquisition: how many items have been acquired and how much disk space has been used.
show statistics acquirer job-list | Shows the acquisition task lists.
show statistics acquirer error | Shows the detailed error message for an acquisition failure. Note: For a crawl job, only the first 100 errors encountered are displayed.
show statistics replication | Shows the replication status.
For a Content Engine that is the root Content Engine for a channel, use the following commands to monitor acquisition progress and to troubleshoot.
Use the show acquirer EXEC command to make sure that the acquirer process on the root Content Engine is working correctly, and that the device is using the expected amount of bandwidth for acquisition. The following example shows that the acquirer is running properly and that the device is configured with unlimited bandwidth for acquisition of content.
ContentEngine# show acquirer
Current Acquisition Bandwidth:Not Limited
Use the show acquirer progress EXEC command to check how far the acquisition of content has progressed. A specific channel ID or channel name can be specified to obtain the progress for a specific channel. In the example below, the acquirer has already acquired 2237 items.
ContentEngine# show acquirer progress channel-id 639
Acquirer progress information for channel ID:639 Channel-Name:external
-----------------------------------------------------------------
Acquired Single Items : 0 / 0
Acquired Crawl Items : 2237 / 2500 -- start-url=www.mtv.com//
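When monitoring from a script, the progress lines can be parsed to compute a completion percentage. The following is a minimal sketch that assumes the line layout shown in the sample output above; the function name is hypothetical, and other software versions may format the output differently.

```python
import re

# Matches lines such as "Acquired Crawl Items : 2237 / 2500 -- start-url=..."
PROGRESS_RE = re.compile(r"Acquired (Single|Crawl) Items\s*:\s*(\d+)\s*/\s*(\d+)")


def parse_progress(line):
    """Return (kind, acquired, total, percent) or None if the line does not match."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    kind = match.group(1)
    acquired, total = int(match.group(2)), int(match.group(3))
    percent = 100.0 * acquired / total if total else None
    return kind, acquired, total, percent
```

For the sample output, 2237 of 2500 crawl items corresponds to roughly 89.5 percent.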
Use the show statistics acquirer channel-id or show statistics acquirer channel-name EXEC command to obtain the detailed acquisition statistics for a given channel. In the example below, there was an error acquiring two items.
ContentEngine# show statistics acquirer channel-id 639
Statistics for Channel Channel-id :639 Channel-Name :external
---------------------------------------------------------
Total Number of Acquired Objects :2237
Total Disk Used for Acquired Objects :981511280 Bytes
Total Number of Failed Objects :2
Total Number of Re-Check Failed Objects :0
Use the show statistics acquirer errors channel-id or show statistics acquirer errors channel-name EXEC command to see the reasons why the errors occurred. In the example below, one error occurred because there was a problem acquiring the URL. The other error occurred because the disk quota for the channel configured in the Content Distribution Manager GUI would have been exceeded if the specified URL had been acquired. You can increase the channel disk quota to correct this error.
Content Engine# show statistics acquirer errors channel-id 639
Acquisition Errors for the Channel ID:639
-------------------------------------
Crawl job:start-url http://www.mtv.com//
Internal Server Error(500):http://cgi.cnn.com/entries/intl-emailsubs-confirm
Exceeded Disk Quota(703):http://www.cdt.org/copyright/backgroundchart.pdf
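When many URLs fail, it can help to group the error lines by error code. The following Python sketch assumes that each error line has the form `reason(code):url`, as in the sample output above; the function name `parse_errors` is hypothetical, and other output layouts may exist that this pattern does not cover.

```python
import re

# "<reason>(<code>):<url>", for example "Exceeded Disk Quota(703):http://..."
ERROR_RE = re.compile(r"^(?P<reason>.+?)\((?P<code>\d+)\):(?P<url>\S+)$")


def parse_errors(output):
    """Return a list of (reason, code, url) tuples found in the captured output."""
    errors = []
    for line in output.splitlines():
        match = ERROR_RE.match(line.strip())
        if match:  # header, separator, and crawl-job lines do not match
            errors.append(
                (match.group("reason"), int(match.group("code")), match.group("url"))
            )
    return errors


sample = """Acquisition Errors for the Channel ID:639
-------------------------------------
Crawl job:start-url http://www.mtv.com//
Internal Server Error(500):http://cgi.cnn.com/entries/intl-emailsubs-confirm
Exceeded Disk Quota(703):http://www.cdt.org/copyright/backgroundchart.pdf"""

for reason, code, url in parse_errors(sample):
    print(code, reason, url)
```

In this sample, the entry with code 703 is the one that would be addressed by increasing the channel disk quota.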
Use the acquirer test-url EXEC command to test whether the Content Engine can acquire the problem URLs. The command uses WGET for the test, or an MMS test program if the URL is a Microsoft Media Server (MMS) URL. This command can be used to detect network or server-side problems in individual URLs so that they can be corrected.
Content Engine# acquirer test-url http://cgi.cnn.com/entries/intl-emailsubs-confirm
--19:13:14-- http://cgi.cnn.com/entries/intl-emailsubs-confirm
Len - 50 , Restval - 0 , contlen - 0 , Res - 134727928Resolving cgi.cnn.com...
Connecting to cgi.cnn.com[207.25.71.15]:80... connected.
HTTP request sent, awaiting response... 500 An error occurred processing your request.
(365 to go)ERROR 500:An error occurred processing your request..
Use the show acquirer progress streams EXEC command to view the progress on the acquisition of streaming media. This command provides useful information for a channel that is acquiring streaming content. The example below shows that acquisition of the first URL is 43 percent complete and acquisition of the second one is 98 percent complete.
Content Engine# show acquirer progress streams
MMS URL = mms://172.19.224.235/MBR_Consumer_Med.wmv
Stream Start Time = Mon Sep 22 19:29:44 2003
Duration Requested = 258 sec
Duration Passed = 112 sec
Percentage Complete = 43%
Requested Bandwidth = 1067 kbps
Minimum Bandwidth Required = 361 kbps
Reserved Bandwidth = 1067 kbps
Size To Be Downloaded = 34418305 bytes
Size Downloaded = 14881905 bytes
MMS URL = http://172.19.224.235:8080/BillGSpeech_32k.wma
Stream Start Time = Mon Sep 22 19:29:44 2003
Duration Requested = 111 sec
Duration Passed = 112 sec
Percentage Complete = 98%
Requested Bandwidth = 32 kbps
Minimum Bandwidth Required = 24 kbps
Reserved Bandwidth = 32 kbps
Size To Be Downloaded = 456245 bytes
Size Downloaded = 450720 bytes
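In the sample output, the Percentage Complete values are consistent with the ratio of Size Downloaded to Size To Be Downloaded, truncated to a whole number. This is an observation about the sample, not a documented formula. The short Python sketch below reproduces the arithmetic; the function name is hypothetical.

```python
def percent_complete(size_downloaded, size_to_download):
    """Whole-number percentage of bytes downloaded (truncated, not rounded)."""
    if size_to_download <= 0:
        return 0
    return (size_downloaded * 100) // size_to_download


# Byte counts taken from the two streams in the sample output.
print(percent_complete(14881905, 34418305))
print(percent_complete(450720, 456245))
```

The two calls print 43 and 98, which match the percentages shown for the two streams.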
If more detailed troubleshooting of content acquisition is required, you can increase the debug level of the acquirer using the debug acquirer trace command. The logs are written to local1/errorlog/acquirer-errorlog.current. The logs for acquisition through the MMS protocol are written to local1/errorlog/acquirer-mms-errorlog.current.
Bandwidth Control
The bandwidth control feature allows you to specify how much bandwidth in the network is consumed by data replication from the root Content Engine to the edge Content Engines. Bandwidth controls allow you to specify the amount of bandwidth to be used at various times during the day and also to set up a weekly schedule that repeats week after week. This section describes the bandwidth controls as they relate to content acquisition and distribution. Bandwidth controls are available for the acquisition process, replication process, and multicast sender.
Configuring Acquisition and Distribution Default and Maximum Bandwidth Settings
Default bandwidth settings can be configured for acquisition and distribution of content. Default bandwidth is the amount of bandwidth allocated for content acquisition and distribution when there is no scheduled bandwidth.
If a Content Engine is assigned to a device group and no default bandwidth has been set for the device, the device group default bandwidth settings are applied. If the Content Engine is part of multiple device groups, the most recently updated default bandwidth settings are applied.
However, if a default bandwidth is specified for the device itself, that setting overrides the device group settings for a Content Engine that is a member of a device group.
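The precedence rules above can be summarized in a small Python sketch. It is a model of the described behavior, not ACNS code: the `GroupDefault` class, the `updated_at` timestamp field, and the function name are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class GroupDefault:
    """Default bandwidth configured on a device group (hypothetical model)."""
    name: str
    bandwidth_kbps: int
    updated_at: float  # time of the last update; larger means more recent


def effective_default(device_kbps: Optional[int],
                      groups: Sequence[GroupDefault]) -> Optional[int]:
    """Return the default bandwidth that applies to a Content Engine."""
    # A default set on the device itself always wins.
    if device_kbps is not None:
        return device_kbps
    # Otherwise the most recently updated device group setting applies.
    if groups:
        return max(groups, key=lambda g: g.updated_at).bandwidth_kbps
    # Nothing configured at either level.
    return None


groups = [GroupDefault("east", 512, updated_at=100.0),
          GroupDefault("west", 256, updated_at=200.0)]
print(effective_default(None, groups))   # group setting applies
print(effective_default(1024, groups))   # device setting overrides
```

With these hypothetical values, the first call returns 256 (the more recently updated group) and the second returns 1024.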
To configure the default bandwidth settings for acquisition and distribution for the Content Engine, follow these steps:
Step 1
In the Content Distribution Manager GUI, choose Devices > Content Engines. The Content Engines window appears.
Step 2
Click the Edit icon next to the desired Content Engine. The Modifying Content Engine window appears.
Step 3
In the Contents pane, choose CDN Settings > Acquisition & Distribution Default Bandwidth. The Default and Max Bandwidth for Content Engine window appears.
Step 4
In the Acquisition-in Bandwidth field, enter the bandwidth value in kbps for incoming content acquisition traffic from origin servers. The default is 1024 kbps.
Step 5
In the Distribution-in Bandwidth field, enter the bandwidth value in kbps for incoming unicast content distribution traffic from Content Engines. The default is 56 kbps.
Step 6
In the Minimum nonstreaming Acquisition-in Bandwidth field, enter the bandwidth value in kbps for incoming nonstreaming content acquisition traffic from the origin server. The default is 50 kbps.
Minimum nonstreaming bandwidth is the bandwidth allocated for acquiring content using nonstreaming protocols such as HTTP, HTTPS, and FTP. This setting is useful when streaming protocols, such as MMS, take up all the available bandwidth in certain cases, and bandwidth needs to be reserved for acquiring manifest files and other content items from origin servers using nonstreaming protocols. We recommend that you retain the minimum nonstreaming bandwidth default setting, and adjust it only when it might be necessary to increase the value, such as when certain large files fail to be acquired because of many stream acquisitions taking place simultaneously.
Step 7
In the Distribution-out Bandwidth field, enter a bandwidth value in kbps for outgoing unicast content distribution traffic to various Content Engines in the ACNS network. The default is 128 kbps.
Note
The Multicast-out Bandwidth field is read-only. This field displays the multicast-out bandwidth setting for sender Content Engines. This value is configured in the Default Multicast-out Bandwidth field in the Creating New Multicast Cloud window. (See the "Configuring Multicast Cloud Properties" section.) For receiver Content Engines, this field is blank.
Step 8
To revert to the previously configured window settings, click Reset. The Reset button is visible only when you apply default or group settings to change the current device settings, but you have not yet clicked Submit.
Step 9
Click Submit to save your settings.
Displaying a Graphical Representation of the Acquisition and Distribution Bandwidth Settings
You can view a graphical representation of the bandwidth settings configured on a Content Engine for the acquisition and distribution of files. The vertical axis of the graph represents the amount of bandwidth in kbps and the horizontal axis represents the days of the week. The scale shown on the vertical axis is determined dynamically based on the bandwidth rate for a particular bandwidth and is incremented appropriately. The scale shown on the horizontal axis for each day is incremented for each hour. Each type of bandwidth is represented by a unique color. A legend at the bottom of the graph maps the colors to the corresponding bandwidths.
To view the graph that displays the acquisition and distribution bandwidth, follow these steps:
Step 1
In the Content Distribution Manager GUI, choose Devices > Content Engines. The Content Engines window appears.
Step 2
Click the Edit icon next to the desired Content Engine. The Modifying Content Engine window appears.
Step 3
In the Contents pane, choose CDN Settings > Acquisition & Distribution Default Bandwidth. The Default and Max Bandwidth for Content Engine window appears.
Step 4
Click the Display Graph icon in the taskbar. A new Acquisition and Distribution bandwidth for Content Engine popup window appears, displaying the bandwidth configuration graph.
You can choose the view that you want to apply to the bandwidth graph by clicking the different viewing options. See Table 6-3 for a description of the viewing options.
Step 5
Click Close when you have finished viewing the settings. Alternatively, you can click Refresh to view the most recently applied bandwidth settings.
Table 6-3 Viewing Options in Acquisition and Distribution Bandwidth Graph
Item | Description
View specific servers | Displays the bandwidth settings for the corresponding bandwidth type selected.
Distribution In | Displays the bandwidth settings for incoming content distribution traffic.
Distribution Out | Displays the bandwidth settings for outgoing content distribution traffic.
Acquisition In | Displays the bandwidth settings for incoming content acquisition traffic.
All Servers | Displays a consolidated view of all configured bandwidth types. This is the default view combined with the Full Week view.
View mode | Displays detailed and composite bandwidth settings.
Show Detailed Bandwidth | Toggles with the Show Effective Bandwidth option. Displays the detailed bandwidth settings for the device and its associated device groups. The bandwidth settings of the device and device groups are shown in different colors for easy identification.
Show Effective Bandwidth | Toggles with the Show Detailed Bandwidth option. Displays the consolidated or composite bandwidth settings for the device and its associated device groups.
Show Aggregate View | Toggles with the Show Non-Aggregate View option. Displays the bandwidth settings configured for the corresponding device groups.
Show Non-Aggregate View | Toggles with the Show Aggregate View option. Hides the bandwidth settings configured for the corresponding device groups.
View by day | Displays the bandwidth settings for a particular day or all days of the week.
Sun, Mon, Tues, Wed, Thurs, Fri, Sat | Displays the bandwidth settings for the corresponding day of the week.
Full Week | Displays the bandwidth settings for the entire week. This is the default view combined with the All Servers view.
Configuring Acquisition and Distribution Bandwidth Settings for Scheduled Times
In addition to being able to set the default bandwidth limits for incoming and outgoing traffic on the Content Engine, you can use the Content Distribution Manager GUI to configure different limits for different time segments that form a week-long cycle. For example, you can configure the acquisition-in limit at a maximum of 100 kbps from 8:00 a.m. to 8:00 p.m., Monday through Friday, and extend that limit to as high as 10 Mbps during the nights from 8:00 p.m. to 8:00 a.m., Monday through Friday and all day Saturday and Sunday. These settings override the default value for the time and period that you specify.
Note
For a schedule from 8:00 p.m. to 8:00 a.m., the administrator must configure two schedules in order to span the two days: one from 8:00 p.m. to 11:59 p.m. (2000 to 2359) and another from 12:00 a.m. to 8:00 a.m. (0000 to 0800).
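The split described in this note can be computed mechanically. The following Python sketch assumes times are given as zero-padded 24-hour `HHMM` strings, as in the note; the function name `split_schedule` is hypothetical. It only produces the time ranges and does not address which day each range is assigned to.

```python
def split_schedule(start, end):
    """Split a start/end pair into ranges that do not cross midnight.

    Times are zero-padded 24-hour strings such as "2000" or "0800".
    """
    for value in (start, end):
        if len(value) != 4 or not value.isdigit():
            raise ValueError("expected HHMM, got %r" % value)
        if int(value[:2]) > 23 or int(value[2:]) > 59:
            raise ValueError("time out of range: %r" % value)
    if start == end:
        raise ValueError("start and end times must differ")
    if start < end:
        # The range lies within a single day.
        return [(start, end)]
    # The range crosses midnight: one schedule up to 2359, one from 0000.
    return [(start, "2359"), ("0000", end)]


print(split_schedule("2000", "0800"))
print(split_schedule("0800", "2000"))
```

The first call yields the two schedules named in the note (2000 to 2359 and 0000 to 0800); the second yields a single schedule.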
Note
Distribution bandwidth settings apply only to unicast distribution.
To configure distribution bandwidth settings for specific days and times, follow these steps:
Step 1
From the Content Distribution Manager GUI, choose Devices > Content Engines. The Content Engines window appears.
Step 2
Click the Edit icon next to the name of the Content Engine that you want to view. The Modifying Content Engine window appears.
Step 3
In the Contents pane, choose CDN Settings > Acquisition & Distribution Bandwidth. The Bandwidth Setting for Content Engine window appears.
Step 4
In the Bandwidth Setting window, the Aggregate Settings Yes option is chosen by default. This specifies that the bandwidth configurations of the Content Engine, as well as the device groups to which it is assigned, are displayed in this window. Click the No radio button to apply and view the bandwidth settings of the Content Engine only.
Step 5
Click the Create New Bandwidth Setting icon in the taskbar. The Create New A&D Bandwidth Settings window appears. (See Figure 6-3.)
Figure 6-3 Create New A&D Bandwidth Settings Window
Step 6
Choose a bandwidth type from the drop-down list. (See Table 6-4 for a description of each field in this window. All fields are required.)
Step 7
Enter the bandwidth rate, start time, end time, and day of the week in the appropriate fields.
Table 6-4 Configuring Acquisition and Distribution Bandwidth Settings
Field | Description
Bandwidth Type | Distribution-in: For incoming unicast content distribution traffic from Content Engines. Distribution-out: For outgoing unicast content distribution traffic to Content Engines. Acquisition-in: For incoming content acquisition traffic from origin servers.
Bandwidth Rate | Maximum amount of bandwidth that you want to allow (in kbps).
Start Time | Time of day for the bandwidth setting to begin, using a 24-hour clock in local time (hh:mm).
End Time | Time of day for the bandwidth setting to end (hh:mm).
Day Selection | Days on which bandwidth settings apply. Full Week: Specifies that the allowable bandwidth settings are applied for an entire week. Sun, Mon, Tue, Wed, Thu, Fri, and Sat: Specifies individual days of the week on which the allowable bandwidth settings take effect.
Step 8
Click Submit.
Scheduling Bandwidth for Streaming Acquisition
When MMS acquisition is configured, the streaming content consumes a large amount of bandwidth over the playtime duration of the file. You must ensure that sufficient bandwidth is available for sufficiently large blocks of time. For multi-bit-rate (MBR) streams, the required bandwidth is the sum of all the bit rates encoded in the stream.
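As a worked illustration of this rule, the Python sketch below adds up the encoded bit rates of an MBR stream and checks them against a scheduled bandwidth window. The bit rates, the window length, and the function names are hypothetical values chosen for the example; the check that the window must be at least as long as the playtime follows from the statement that acquisition runs over the playtime duration of the file.

```python
def required_bandwidth_kbps(bit_rates_kbps):
    """Bandwidth needed for an MBR stream: the sum of all encoded bit rates."""
    return sum(bit_rates_kbps)


def window_is_sufficient(limit_kbps, window_seconds, bit_rates_kbps, play_seconds):
    """True if a scheduled window has enough bandwidth for long enough."""
    enough_bandwidth = limit_kbps >= required_bandwidth_kbps(bit_rates_kbps)
    enough_time = window_seconds >= play_seconds
    return enough_bandwidth and enough_time


rates = [300, 100, 56]  # hypothetical bit rates (kbps) encoded in one stream
print(required_bandwidth_kbps(rates))
print(window_is_sufficient(1024, 3600, rates, 258))
print(window_is_sufficient(256, 3600, rates, 258))
```

With these values, the stream needs 456 kbps, so a 1024-kbps window is sufficient and a 256-kbps window is not.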
Proxy Support
ACNS 5.1 software supports HTTP content acquisition through a proxy server. Acquisition through a proxy server can be configured when the root Content Engine cannot directly access the origin server, because the origin server is set up to allow access only by a specified proxy server. When a proxy server is configured for root Content Engine acquisition, the acquirer contacts the proxy server instead of the origin server and all requests to that origin server go through the proxy server.
Note
In ACNS 5.1 software, content acquisition through a proxy server is only supported for HTTP requests. It is not supported for HTTPS, FTP, MMS, or MMS-HTTP requests.
If a transparent WCCP proxy is used, you do not need to configure a proxy for content acquisition. In such cases, HTTP requests from the acquirer are redirected to the WCCP proxy by a router. However, when you do not have a router to redirect HTTP requests to the proxy server, you must configure the Content Engine to use the proxy server.
There are two ways to configure the proxy server: through the Content Engine CLI or through the manifest file. If you need to configure the Content Engine to use the proxy for both caching and pre-positioned content, use the CLI to configure the proxy. The CLI command is a global configuration command that configures the entire Content Engine to use the proxy. If only the acquirer portion of the Content Engine needs to use the proxy for acquiring pre-positioned content, use the manifest file to specify the outgoing proxy. When you configure the proxy server in the manifest file, you are configuring the acquirer to use the proxy to fetch content for a particular channel.
Note
Proxy configurations in the manifest file take precedence over proxy configurations in the CLI. Furthermore, a noProxy configuration in the manifest file takes precedence over the other proxy server configurations in the manifest file.
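The precedence stated in this note can be expressed as a short decision function. The Python sketch below is a model of the rule only; the function and parameter names are hypothetical, and a return value of None stands for contacting the origin server directly.

```python
def resolve_proxy(no_proxy, manifest_proxy, cli_proxy):
    """Pick the proxy for a request, following the documented precedence.

    no_proxy       -- True if noProxy is set in the manifest for this request
    manifest_proxy -- proxy named in the manifest file, or None
    cli_proxy      -- proxy configured through the CLI, or None
    """
    if no_proxy:
        # noProxy overrides every other proxy configuration.
        return None
    if manifest_proxy is not None:
        # The manifest file takes precedence over the CLI.
        return manifest_proxy
    return cli_proxy


print(resolve_proxy(False, "172.19.226.242", "128.107.192.24"))
print(resolve_proxy(False, None, "128.107.192.24"))
print(resolve_proxy(True, "172.19.226.242", "128.107.192.24"))
```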
You can also configure a proxy for fetching the manifest file by using the Content Distribution Manager GUI (Creating New Channel or Modifying Channel window). When you configure a proxy server in the Content Distribution Manager GUI, the proxy configuration is valid only for acquiring the manifest file itself and not for acquiring the channel content. Requests for the manifest file go through the proxy server, whereas requests for content go directly to the origin server.
Tip
Before configuring a proxy server, verify that the root Content Engine is able to ping the proxy server. To check whether the proxy server is accepting incoming HTTP traffic at the configured port, use the acquirer test-url http://proxyIP:proxyport command in the root Content Engine CLI, where the URL in the command is the URL of the proxy server being tested. If the proxy is not servicing the configured port, you will get the message: "failed: Connection refused."
Configuring an HTTP Proxy Server Using the CLI
To configure a proxy server using the CLI, use the http proxy outgoing global configuration command. This is a global proxy configuration for cache and pre-positioned content acquisition. You can use this command to configure a list of outgoing proxy servers through the CLI, and when configured, all the requests (for any host) will go through the proxy server. The global proxy configuration can also be reset using the no form of the command. For example:
ContentEngine(config)# http proxy outgoing ?
connection-timeout Timeout period used for probing outgoing proxy servers in microseconds
host Use Outgoing HTTP Proxy
monitor Interval at which to monitor the outgoing proxy servers
origin-server Use Origin Server if all outgoing proxies failed.
preserve-407 Preserve 407 HTTP authentication header
ContentEngine(config)# http proxy outgoing host ?
Hostname or A.B.C.D Hostname or IP address of outgoing proxy
ContentEngine(config)# http proxy outgoing host 128.107.192.24 ?
<1-65535> Port of outgoing proxy
ContentEngine(config)# http proxy outgoing host 128.107.192.24 80
If the acquirer is unable to contact a proxy server, the proxy is considered disabled. The acquirer retries a disabled proxy after the period specified in the monitor option of the http proxy outgoing command. If no monitor interval is configured, the retry interval defaults to every 30 minutes.
If all outgoing proxies fail and you want the acquirer to contact the origin server directly, use the origin-server keyword. For example:
ContentEngine(config)# http proxy outgoing origin-server
If a list of outgoing proxy servers is configured in the CLI and the first proxy server fails, the next proxy server in the list will be tried, and so on. If all the proxies fail, the acquirer will fail over to the origin server, depending on the configuration specified in the http proxy outgoing origin-server command.
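The failover order described above can be sketched as follows. This Python model is illustrative only: the `is_reachable` callback stands in for whatever probing the acquirer performs, and the function name and return values are assumptions, not ACNS internals.

```python
def choose_target(proxies, is_reachable, use_origin_on_failure):
    """Return ("proxy", host), ("origin", None), or ("fail", None).

    proxies               -- outgoing proxies in configured order
    is_reachable          -- callable that reports whether a proxy responds
    use_origin_on_failure -- True if http proxy outgoing origin-server is set
    """
    for proxy in proxies:
        # Try each proxy in list order and stop at the first that responds.
        if is_reachable(proxy):
            return ("proxy", proxy)
    if use_origin_on_failure:
        # All proxies failed; fall back to the origin server.
        return ("origin", None)
    return ("fail", None)


proxies = ["128.107.192.24", "128.107.192.190"]
only_second_up = lambda host: host == "128.107.192.190"
all_down = lambda host: False

print(choose_target(proxies, only_second_up, False))
print(choose_target(proxies, all_down, True))
print(choose_target(proxies, all_down, False))
```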
To verify that the outgoing proxy is configured in the CLI, use the show http proxy command. For example:
ContentEngine# show http proxy
Not servicing incoming proxy mode connections.
Primary Proxy Server: 128.107.192.190 port 9090
Monitor Interval for Outgoing Proxy Servers is 60 seconds
Timeout period for probing Outgoing Proxy Servers is 300000 microseconds
Use of Origin Server upon Proxy Failures is disabled.
To confirm whether the content is being acquired through a proxy, use the acquisition and distribution transaction log, and check the proxyUsed line. This line shows the proxy server used for the request.
Configuring a Proxy Server Using the Manifest File
You can configure the proxy server for the acquirer by using the <proxyServer> tag in the manifest file. The proxy server configured in the manifest file is valid for all the hosts named in the manifest file. If the specified proxy fails, the acquirer, by default, contacts the origin server directly and tries to fetch the object.
If you have multiple proxy servers in the manifest file, and if you want to associate different proxy servers for different hosts, you must include the proxyServer attribute in the <host> tag. In this first example, the <proxyServer> tag is associated with the <host> tag through its proxyServer attribute (proxyServer="172.19.226.242"). All items that use this <host> tag will use this proxy server.
Note
The <proxyServer> tag must be located at the top level of the manifest file, directly under the <CdnManifest> tag; it cannot be used as a subtag of any other tags. It is associated with the <host>, <item>, and <crawler> tags through the proxyServer attribute.
serverName="172.19.226.242"
<server name="my-devbox">
<host name="http://vista2.cisco.com"
proxyServer="172.19.226.242"
<item src="HR/salary.htlm" />
<item src="HR/Holidays.html" />
In the following example, the proxyServer attribute is associated with the <item> tag and the <crawler> tag. The proxy server is applied to the second <item> tag and the <crawler> tag, but not to the first <item> tag.
<item src="http://www.msnbc.com/computer/jokes.html" />
<proxyServer serverName="128.107.192.24" port = "80" />
<item src="http://www.cnn.com/War/World-war-two.jpg" />
<crawler start-url="http://www.abc.com/War/World-war-two/index.html" />
Instead of using an IP address for the serverName attribute in the <proxyServer> tag, you can also use a domain name. For example:
serverName="spachiap-uni1"
<server name="my-devbox">
<host name="http://128.107.150.26/nfs-obsidian/Unicorn"
proxyServer="spachiap-uni1"
When you use a domain name instead of an IP address, make sure that the domain name can be resolved by the DNS server configured.
Configuring the noProxy Attribute in the Manifest File
You can configure the noProxy attribute in the manifest file to specify that all the requests for a particular host should go directly to the origin server and should not use the proxy specified in the http proxy outgoing host command, if any proxies were configured in the CLI.
The noProxy attribute can be specified in the <host>, <item>, or <crawler> tags. In the following example, if the root Content Engine has a proxy configured through the http proxy outgoing host command, the first item in the manifest file uses that proxy, but the rest of the items or crawl tasks in this manifest file do not. The noProxy="1" designation specifies that the second item and the crawl task data are to be fetched directly from the origin server.
<item src="http://www.msnbc.com/computer/jokes.html" />
<item src="http://www.cnn.com/War/World-war-two.jpg" noProxy="1" />
<crawler start-url="http://www.abc.com/War/World-war-two/index.html" noProxy="1" />
<host name="http://www.cisco.com" noProxy="1" />
<crawler start-url="product/routers/" />
Configuring a Proxy Server and Port for the Manifest File
A proxy server and port for fetching the manifest file can be set in the Content Distribution Manager GUI, Creating New Channel or Modifying Channel window. If configured, requests for the manifest file go through the proxy server that is specified in the Content Distribution Manager GUI.
Note
The proxy configuration from the Channel window only applies to fetching the manifest file. It does not apply to content acquisition.
If the manifest file resides on an origin server that requires a proxy, you need to specify the proxy name or IP address and the proxy port in the fields provided in the Content Distribution Manager GUI, Creating New Channel or Modifying Channel window, Manifest Proxy Information section of the window. (See Figure 6-1.) If the manifest file resides on an origin server that cannot use the proxy configured from the http proxy outgoing host CLI command, you must check the Disable All Proxy check box.
The Disable All Proxy check box, when checked, overrides the values in the Proxy Hostname and Proxy Port fields.
Note
To activate the Manifest Proxy Information fields in the Creating New Channel window, you must first enter a URL in the Manifest URL field.
Authentication Support
The acquirer supports two types of authentication schemes: basic and NTLM authentication. Authentication information is configured either in the Content Distribution Manager GUI or in the manifest file, depending on what is being acquired.
If authentication is required to fetch the manifest file from the origin server or from a proxy server, you can specify the authentication information in the Content Distribution Manager GUI. From the Channels > Channels Creating New Channel or Modifying Channel window, you can disable basic authentication, enter a username and password, and specify an NTLM user domain name. (Figure 6-1 shows the fields for entering authentication information and Table 6-1 describes these fields.)
Note
The authentication configuration from the Channels > Channels Creating New Channel or Modifying Channel window only applies to fetching the manifest file. It does not apply to content acquisition.
If authentication is required to fetch pre-positioned content from the origin server or from a proxy server, you must specify the authentication information in the manifest file. (See the next section, "Acquiring Content That Requires NTLM Authentication.")
Acquiring Content That Requires NTLM Authentication
In ACNS 5.x software, a root Content Engine can acquire content from HTTP origin servers or proxy servers after performing basic authentication. However, origin servers often support NTLM authentication only. ACNS 5.1 software extends the authentication functionality so that the acquirer can act as an NTLM client to talk to NTLM-capable origin servers.
Note
The Content Engine acquirer only supports NTLM Version 1.
If authentication is required to fetch content, you must specify the authentication information in the manifest file by using the following manifest file attributes:
•
user—User account
•
password—Password
If NTLM authentication is required, you can specify the following two additional manifest file attributes:
•
ntlmUserDomain—NTLM user domain name. This attribute is required for an NTLM authentication scheme.
•
disableBasicAuth—(Optional attribute) Disables basic authentication, if needed. If you always want your username and password to be used for NTLM authentication and not basic authentication, set this attribute to true. If this attribute is omitted, the default is false.
These attributes can be specified in the <host>, <item>, <item-group>, and <crawler> tags. See "Creating Manifest Files," for more information.
If an origin server or proxy server supports both basic authentication and NTLM authentication, the acquirer chooses the first supported authentication scheme from the response challenge list. The authentication scheme used by the acquirer can also depend on the specified authentication information. For example, an IIS server responds with NTLM as the first scheme and basic authentication as the second. In this case the acquirer chooses NTLM. However, if the ntlmUserDomain attribute is not specified in the manifest file, the acquirer chooses basic authentication. If the disableBasicAuth attribute is set to true, the acquirer chooses NTLM authentication.
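The selection rules in the preceding paragraph can be modeled with a short function. The Python sketch below is one reading of those rules, not the acquirer implementation. The function name is hypothetical, and the case in which basic authentication is disabled but no NTLM user domain is supplied is not covered by the text; returning None for that case is an assumption.

```python
def choose_auth_scheme(challenges, ntlm_user_domain=None, disable_basic_auth=False):
    """Pick "ntlm", "basic", or None from the server's challenge list.

    challenges         -- schemes offered by the server, in response order
    ntlm_user_domain   -- value of the ntlmUserDomain attribute, if any
    disable_basic_auth -- value of the disableBasicAuth attribute
    """
    for scheme in (c.lower() for c in challenges):
        # NTLM is usable only when an NTLM user domain is specified.
        if scheme == "ntlm" and ntlm_user_domain:
            return "ntlm"
        # Basic is usable unless it has been explicitly disabled.
        if scheme == "basic" and not disable_basic_auth:
            return "basic"
    return None  # assumption: no usable scheme means no authentication attempt


iis = ["NTLM", "Basic"]  # IIS offers NTLM first, then basic
print(choose_auth_scheme(iis, ntlm_user_domain="cisco-eng"))
print(choose_auth_scheme(iis))
print(choose_auth_scheme(["Basic", "NTLM"], "cisco-eng", disable_basic_auth=True))
```

The three calls return ntlm, basic, and ntlm, matching the IIS example, the case with no ntlmUserDomain attribute, and the case with disableBasicAuth set to true.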
You can view the acquisition and distribution transaction log from the root Content Engine CLI to determine which authentication scheme the acquirer has used.
Proxy Authentication Support
If the proxy for the root Content Engine was configured in the root Content Engine CLI (http proxy outgoing host command), you can set the authentication information for the proxy from either the Content Distribution Manager GUI (see the "Configuring Authentication for an HTTP Proxy Using the Content Distribution Manager GUI" section) or from the root Content Engine CLI (see the "Configuring Authentication for a Proxy Using the CLI" section).
Configuring Authentication for a Proxy to Fetch the Manifest File
To configure the authentication information for fetching the manifest file from a proxy server, follow these steps:
Step 1
From the Content Distribution Manager GUI, choose Channels > Channels.
Step 2
Click the Edit icon next to the name of a channel, or click the Create New Channel icon in the taskbar to get to the Creating New Channel or Modifying Channel window.
Step 3
Enter the authentication information in the appropriate fields under the Manifest Proxy Information heading. (Figure 6-1 shows the fields for entering authentication information and Table 6-5 describes these fields.)
Table 6-5 Manifest Proxy Information
Field | Description
Disable All Proxy | Disables the outgoing proxy server for fetching the manifest file. Any outgoing proxy server configured on the root Content Engine is bypassed and the acquirer contacts the origin server directly.
Proxy Hostname | Host name or IP address of the proxy server used by the acquirer to retrieve the manifest file.
Proxy Port | Port number of the proxy on which the acquirer fetches the manifest file. The range is from 1 to 65535.
Disable basic authentication | Disallows removal of NTLM headers to fall back to the basic authentication method.
Proxy Username | Name of the user to be authenticated to fetch the manifest file.
Proxy Password | Password of the user to pass authentication from the proxy.
Confirm Password | Reentry of the same password for confirmation to pass authentication from the proxy.
Proxy NTLM domain name | NTLM user domain name to pass the NTLM authentication scheme configured on the proxy.
Step 4
Click Submit to save the settings.
Configuring Authentication for a Proxy Using the Manifest File
The following example specifies authentication information for the proxy server. See "Creating Manifest Files," for more information.
<!-- specify a proxy server/port and its authentication info
this proxy supports Basic auth so only user/password is needed -->
<proxyServer serverName="128.107.192.24" port = "80"
user="johnz" password="xxx123yyy"/>
<!-- this item is below the proxyServer tag; it is using the above proxy -->
<item src="http://www.cnn.com/War/World-war-two.jpg" />
<!-- specify a proxy server/port and its authentication info
this proxy requires NTLM auth, so ntlmUserDomain is needed
for extra password security, users want to disable basic auth
so their user/password is not sent over the wire -->
<proxyServer serverName="company-proxy" port = "80"
user="johnz" password="xxx123yyy"
ntlmUserDomain="cisco-eng"
disableBasicAuth="true" />
<!-- this crawler is below the proxyServer tag; it is using the proxy -->
<crawler start-url="http://www.abc.com/War/World-war-two/index.html"/>
Configuring Authentication for an HTTP Proxy Using the Content Distribution Manager GUI
If a root Content Engine is configured to receive content through a proxy server, the acquirer running on the root Content Engine must be authenticated by the proxy server before it can obtain content from the origin server.
To configure authentication information for a proxy that was specified using the http proxy outgoing host command, you can use the Acquirer Outgoing Proxy Authentication section in the Content Distribution Manager GUI (Devices > Content Engines > HTTP/S > HTTP Connections) to set the authentication information for the proxy. ACNS software supports multiple proxies; therefore, you can set the proxy authentication information for multiple proxies as well.
Note
If you specify the <proxyServer> tag in the manifest file, or if you enter the proxy host IP address in the Manifest URL field in the Creating New Channel window of the Content Distribution Manager GUI (Channels > Channels), the http proxy outgoing host command is ignored.
To configure the acquirer outgoing proxy authentication information, follow these steps:
Step 1
In the Content Distribution Manager GUI, choose Devices > Content Engines.
Step 2
Click the Edit icon next to the name of the root Content Engine.
Step 3
In the Contents pane, choose HTTP/S > HTTP Connections. The HTTP Connection Settings for Content Engine window appears.
Step 4
Under the Acquirer Outgoing Proxy Authentication heading, enter the following information for the outgoing proxy that is listed in each row:
a.
To acquire content from the origin server, enter the name of the user to be authenticated in the Username field. This username will be used for both NTLM and basic authentication.
b.
Enter the password of the user in the Password field. Reenter the same password in the Confirm Password field for confirmation. The password details appear as asterisks.
c.
In the NTLM User Domain field, enter the NTLM server domain name to be used to authenticate user access.
d.
To prevent NTLM headers from being removed in order to fall back to the basic authentication method, check the Disable basic authentication check box.
If you leave this check box unchecked, NTLM authentication headers can be stripped to allow fallback to the basic authentication method against Microsoft Internet Information Services (IIS) servers; the username and password information can be passed to the origin server in clear text with a basic authentication header.
Step 5
Click Submit to save the authentication settings for the outgoing proxy.
Configuring Authentication for a Proxy Using the CLI
To configure the authentication information for a proxy that was specified using the http proxy outgoing host command, you can also use the acquirer proxy authentication outgoing command from the root Content Engine CLI. The following example shows the authentication configuration for a nontransparent proxy server (IP address 192.168.1.1, port 8080) with NTLM authentication:
CE(config)# acquirer proxy authentication outgoing 192.168.1.1 8080 myname password password ntlm mydomain basic-auth-disable
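When the same credentials must be entered on several root Content Engines, it can be convenient to generate the command text instead of typing it. The following is a minimal sketch, assuming only the argument order visible in the example above (host, port, username, the keyword password, the password, then the optional ntlm domain and basic-auth-disable). The function name build_outgoing_proxy_auth is hypothetical, and the ACNS command reference remains the authority on the syntax.

```python
def build_outgoing_proxy_auth(host, port, username, password,
                              ntlm_domain=None, disable_basic=False):
    """Assemble the config-mode command in the order the example shows."""
    if not 1 <= int(port) <= 65535:
        raise ValueError("port must be in the range 1-65535")
    parts = ["acquirer", "proxy", "authentication", "outgoing",
             host, str(port), username, "password", password]
    if ntlm_domain:
        # Optional NTLM domain, as in the example above.
        parts += ["ntlm", ntlm_domain]
    if disable_basic:
        # Optional keyword that disables fallback to basic authentication.
        parts.append("basic-auth-disable")
    return " ".join(parts)


print(build_outgoing_proxy_auth("192.168.1.1", 8080, "myname", "password",
                                ntlm_domain="mydomain", disable_basic=True))
```

The printed line can be pasted at the CE(config)# prompt of the root Content Engine.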
Note
If there is a transparent proxy between the root Content Engine and the origin server, you can use the acquirer proxy authentication transparent command from the root Content Engine CLI to specify the authentication information for the transparent proxy. (See the "Configuring Authentication for a WCCP Proxy Using the CLI" section.)
To verify that the authentication information has been set up correctly, use the show acquirer proxy authentication EXEC command.
The acquirer supports proxy chaining as long as there is only one proxy in the chain that requires authentication. The acquirer will fail if more than one proxy in the chain requires authentication.
Configuring Authentication for a WCCP Proxy Using the Content Distribution Manager GUI
In a transparent caching environment using a WCCP proxy, requests for content are redirected to a Content Engine by a WCCP-enabled router. When an HTTP proxy-style request from the root Content Engine to the origin server is intercepted by a WCCP proxy that requires authentication, you can configure authentication settings for the WCCP proxy.
To configure acquirer WCCP proxy authentication settings in a transparent proxy environment, follow these steps:
Step 1
Choose Devices > Content Engines. The Content Engines window appears.
Step 2
Click the Edit icon next to the Content Engine for which you want to specify acquirer WCCP proxy authentication settings. The Modifying Content Engines window appears.
Step 3
In the Contents pane, choose HTTP/S > Acquirer WCCP Proxy Authentication. The Acquirer WCCP Proxy Authentication for Content Engine window appears. (See Figure 6-4.)
Figure 6-4 Acquirer WCCP Proxy Authentication
Step 4
Check the Enable check box to enable transparent proxy authentication for the acquirer.
Step 5
In the Username field, enter the name of the user to be authenticated to acquire content from the origin server. This username is used for both NTLM and basic authentication.
Step 6
Enter the password of the user in the Password field. Reenter the same password in the Confirm Password field for confirmation. The password details appear as asterisks.
Step 7
In the NTLM user domain field, enter the NTLM server domain name to be used to authenticate user access.
Step 8
To prevent NTLM headers from being removed and to prevent fallback to the basic authentication method, check the Disable basic authentication check box.
If you leave this check box unchecked, NTLM authentication headers can be stripped to allow fallback to the basic authentication method against Microsoft Internet Information Services (IIS) servers; the username and password information can be passed to the origin server in clear text with a basic authentication header.
Step 9
Click Submit to save your settings.
A "Click Submit to Save" message appears in red next to the current settings when there are pending changes to be saved. You can also revert to the previously configured window settings by clicking Reset. The Reset button is visible only when you apply default or device group settings to change the current device settings but the settings have not yet been submitted.
Modifying WCCP Proxy Authentication Settings
To modify WCCP proxy authentication settings, follow these steps:
Step 1
To delete the configured settings for the Content Engine, click the Remove Device Settings icon in the taskbar. This icon appears only if you have configured the settings for the Content Engine.
Step 2
To restore the factory default settings to the Content Engine, click the Apply Defaults icon in the taskbar.
Step 3
To override the device group settings applied to the Content Engine with the factory default settings, click the Override Group Settings with Defaults icon in the taskbar. This icon appears only if you have applied the device group settings to the Content Engine.
Step 4
When settings have been applied from device groups with which the Content Engine is associated, click the Override Group Settings icon in the taskbar to override the device group settings and configure the device settings. This icon appears only if you have applied the device group settings to the Content Engine.
Step 5
When a Content Engine is associated with one or more device groups that have been configured with acquirer WCCP proxy authentication settings, choose the device group name from the drop-down list that appears in the taskbar if you want to apply settings from a different device group to this Content Engine.
Configuring Authentication for a WCCP Proxy Using the CLI
For a transparent WCCP proxy, if authentication is required, you can specify the authentication information by using the acquirer proxy authentication transparent command in the root Content Engine CLI. The following example shows the authentication configuration for a transparent WCCP proxy server with basic authentication. The username is admin and the password is default.
CE(config)# acquirer proxy authentication transparent
CE(config)# acquirer proxy authentication transparent admin
CE(config)# acquirer proxy authentication transparent admin password
CE(config)# acquirer proxy authentication transparent admin password default
Content Acquisition Guidelines
The following guidelines apply to content acquisition in the ACNS network.
Special Characters
Certain special characters are not allowed in URLs, according to the RFC 2396 standard. If the URL of the content to be acquired contains illegal characters, you must rewrite the URL using the ASCII code equivalent escape characters to represent the illegal special characters.
For example, the escape characters for a blank space are %20. An illegal URL that uses blank spaces in the file name, such as:
http://www.cnn.com/file with space
can be rewritten as:
http://www.cnn.com/file%20with%20space
You might have to rewrite the URL using the ASCII equivalent escape characters when you specify a URL in the Manifest URL field in the Content Distribution Manager GUI, Channel definition window (Channels > Channels > Basic Settings > Definition) or when you specify a URL in the manifest file within the following tags:
<host name=>
<item src=>
<crawler start-url=>
<crawler externalPrefixes=>
<match prefix=>
ACNS software enforces the following rules for escaping a URL:
•
The special characters ! # $ & ' ( ) , ; = ? serve special functions in a URL. If your filename or folder name on the origin server uses these special characters, you must rewrite the URL.
In addition to the above special characters, you must also avoid using the following characters in filenames and folder names and substitute the ASCII equivalent shown:
space ==> %20
" ==> %22
% ==> %25
< ==> %3c
> ==> %3e
[ ==> %5b
\ ==> %5c
] ==> %5d
^ ==> %5e
{ ==> %7b
| ==> %7c
} ==> %7d
~ ==> %7e
•
If a URL in the manifest file or in an HTML file uses a "?" character, the URL is rejected. If a URL uses a "#" character, the portion of the URL after the "#" is discarded.
•
If a URL in a manifest file or in an HTML file uses other special characters, follow the requirements in RFC 2396.
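The substitution table and the "?" and "#" rules above can be sketched in a few lines of Python. This is an illustration only, not the ACNS implementation: the function names escape_url and check_manifest_url are hypothetical, and the sketch assumes that the "#" character itself is dropped together with the text that follows it.

```python
# Substitutions copied from the table in this section.
ESCAPES = {
    " ": "%20", '"': "%22", "%": "%25", "<": "%3c", ">": "%3e",
    "[": "%5b", "\\": "%5c", "]": "%5d", "^": "%5e",
    "{": "%7b", "|": "%7c", "}": "%7d", "~": "%7e",
}


def escape_url(url):
    """Replace each listed character with its escape sequence."""
    # Character-by-character, so a literal "%" is escaped exactly once.
    return "".join(ESCAPES.get(ch, ch) for ch in url)


def check_manifest_url(url):
    """Return the URL as it would be used, or None if it would be rejected."""
    if "?" in url:
        return None                 # a URL containing "?" is rejected
    return url.split("#", 1)[0]     # text after "#" is discarded (assumed: "#" too)


print(escape_url("http://www.cnn.com/file with space"))
```

Apply escape_url only to a URL that has not been escaped yet; running it on an already escaped URL would turn %20 into %2520.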
Redirection Support
For a single-item HTTP redirect, the acquirer follows the redirect to the new link to obtain the file. If the HTTP redirect occurs in a crawl job, the acquirer checks the new link against the crawl parameters. For example, if a start-url http://www.ibm.com/ is redirected to http://www.cnn.com/, the acquirer checks whether www.cnn.com matches the crawl parameters. Unless the externalPrefixes attribute is specified, the redirection is usually rejected because it does not match the default prefix.
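The crawl-job check can be pictured as a prefix comparison. The sketch below is an assumption about how such a check might look, not the acquirer's actual code. In particular, it assumes that the default prefix is the scheme and host of the start-url, and the names default_prefix and redirect_allowed are hypothetical.

```python
from urllib.parse import urlsplit


def default_prefix(start_url):
    """Assumed default prefix: the scheme and host of the start-url."""
    parts = urlsplit(start_url)
    return f"{parts.scheme}://{parts.netloc}/"


def redirect_allowed(start_url, redirected_url, external_prefixes=()):
    """Accept a crawl redirect only if the new link matches an allowed prefix."""
    allowed = (default_prefix(start_url), *external_prefixes)
    return any(redirected_url.startswith(prefix) for prefix in allowed)


# Rejected: www.cnn.com does not match the default prefix.
print(redirect_allowed("http://www.ibm.com/", "http://www.cnn.com/"))
# Accepted once the target is listed as an external prefix.
print(redirect_allowed("http://www.ibm.com/", "http://www.cnn.com/",
                       ["http://www.cnn.com/"]))
```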
HTTP Response no-cache Directive
The no-cache directive is related to the HTTP protocol. The HTTP protocol uses headers to communicate between the client and server. The client sends a request header, and the server responds with a response header. Headers specify the various attributes for the resource being requested. For example, the client sends the request:
The server might send a response, as follows:
The no-cache directive in the response header tells the client that the content being requested is not cacheable. When an HTTP server responds to a request with a no-cache directive in the header, the acquirer behaves as follows:
•
If the content to be acquired is specified in an <item> tag, the acquirer ignores the no-cache directive and fetches the content anyway.
•
If the content to be acquired is specified in a <crawler> tag in the manifest file, and the response sent out by the server contains a no-cache directive in the response header, the acquirer honors the directive and does not fetch the content or pre-position it.
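The two cases above reduce to a small decision function. The following is a minimal sketch under stated assumptions: it looks for no-cache in the standard Cache-Control and Pragma response headers, which may not be exactly what the acquirer inspects, and the names has_no_cache and should_fetch are hypothetical.

```python
def has_no_cache(headers):
    """True if the response headers carry a no-cache directive."""
    lowered = {name.lower(): value.lower() for name, value in headers.items()}
    directives = [d.strip() for d in lowered.get("cache-control", "").split(",")]
    return "no-cache" in directives or "no-cache" in lowered.get("pragma", "")


def should_fetch(source_tag, headers):
    """source_tag is "item" or "crawler": the manifest tag that named the content."""
    if source_tag == "item":
        return True                     # <item>: the directive is ignored
    return not has_no_cache(headers)    # <crawler>: the directive is honored


response = {"Content-Type": "text/html", "Cache-Control": "no-cache"}
print(should_fetch("item", response), should_fetch("crawler", response))
```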
HTTP Response expires Directive
An expires directive in the HTTP response header indicates that the content being served is not valid after the time specified in the response. For example, if the response header contains an expires directive with a specific date and time, the client (the acquirer in this case) is directed not to use the content after that time. The expires directive is honored only for content that is specified in the <crawler> tag. Single items configured using the <item> tag in the manifest file do not honor the expires directive sent by the server.
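As an illustration of the rule above, the following sketch checks the standard HTTP Expires response header for crawled content and ignores it for single items. It is not the acquirer's code; the function name content_usable and the sample date are hypothetical. An unparsable date is treated as already expired, which is how HTTP caches are expected to treat invalid Expires values.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def content_usable(source_tag, headers, now=None):
    """Return False only for crawled content whose Expires time has passed."""
    value = next((v for k, v in headers.items() if k.lower() == "expires"), None)
    if source_tag != "crawler" or value is None:
        return True                 # <item> content does not honor the directive
    now = now or datetime.now(timezone.utc)
    try:
        expires_at = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return False                # invalid date: treat as already expired
    if expires_at.tzinfo is None:
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return now < expires_at


sample = {"Expires": "Thu, 01 Dec 1994 16:00:00 GMT"}   # hypothetical header value
print(content_usable("crawler", sample), content_usable("item", sample))
```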
EMBED and OBJECT ID Tag Support
If an HTML file includes an EMBED tag or an OBJECT ID tag, the acquirer parses the file, obtains the file link from the src and filename attributes, and fetches these files.
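The kind of parsing described above can be approximated with the Python standard library. This is a sketch only: it assumes that the file name of an OBJECT element is carried in a nested PARAM element named filename (or src), which is a common authoring pattern but is not confirmed by this guide, and the names MediaLinkParser and extract_media_links are hypothetical.

```python
from html.parser import HTMLParser


class MediaLinkParser(HTMLParser):
    """Collect links referenced by EMBED and OBJECT markup."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._in_object = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)     # tag and attribute names arrive lowercased
        if tag == "object":
            self._in_object = True
        elif tag == "embed" and attrs.get("src"):
            self.links.append(attrs["src"])
        elif tag == "param" and self._in_object:
            # Assumption: <param name="filename" value="..."> names the file.
            name = (attrs.get("name") or "").lower()
            if name in ("src", "filename") and attrs.get("value"):
                self.links.append(attrs["value"])

    def handle_endtag(self, tag):
        if tag == "object":
            self._in_object = False


def extract_media_links(html):
    parser = MediaLinkParser()
    parser.feed(html)
    return list(dict.fromkeys(parser.links))    # de-duplicate, keep order


page = '<OBJECT ID="p"><PARAM NAME="FileName" VALUE="a.wmv"><EMBED SRC="b.asf"></OBJECT>'
print(extract_media_links(page))
```

The links returned are as written in the page; resolving relative links against the page URL is left out of the sketch.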
Crawl .asx File
The acquirer can parse an .asx file and fetch its contents.
expires Attribute Specified in the Manifest File
The expires attribute, specified as part of the <item> tag or <crawler> tag in the manifest file, designates a time in yyyy-mm-dd hh:mm:ss format for content to be removed.
If you specify the expires attribute in the manifest file, the root Content Engine deletes the content at the time specified. The root Content Engine then replicates the deletion event to receiver Content Engines, which also delete the content from the channel.
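The yyyy-mm-dd hh:mm:ss format maps directly onto a standard date pattern. The following sketch shows how a script that generates or audits manifest files might test whether an item is due for removal. The function name is hypothetical, and the sketch assumes that the attribute value and the comparison time are expressed in the same time zone.

```python
from datetime import datetime

EXPIRES_FORMAT = "%Y-%m-%d %H:%M:%S"    # yyyy-mm-dd hh:mm:ss


def is_due_for_removal(expires_attr, now):
    """True once the time named by the expires attribute has been reached."""
    return now >= datetime.strptime(expires_attr, EXPIRES_FORMAT)


print(is_due_for_removal("2005-06-30 23:59:59", datetime(2005, 7, 1)))
```

A value that does not follow the format raises ValueError, which makes malformed attributes easy to catch before the manifest file is published.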
Secure FTP
The acquirer currently does not support secure FTP.
Parsing Scripts in an HTML File
If an HTML file contains JavaScript or VBScript (Microsoft's Visual Basic Scripting Edition, a scaled-down version of Visual Basic) and the script code contains links, the acquirer does not parse the script code to get the links.
Cookie Support
Cookies are a general mechanism that server-side connections (such as CGI scripts) use to both store and retrieve information on the client side of the connection. The addition of a simple, persistent, client-side state significantly extends the capabilities of web-based client/server applications. Although the acquirer is also a client, it does not support cookies.
Error Codes
This section describes the error codes for content replication. The following error codes are shown in the Content Distribution Manager GUI Replication Status window:
ACNS Unified Name Space Errors
1401: Bad magic number in unified name space (UNS) meta file
1402: Unknown version in UNS meta file
1403: Bad checksum on UNS meta file
1404: Internal error - URL mismatch between caller specified URL and URL in meta file
1405: Invalid URL syntax
1406: Attempt to create URL already in UNS
1407: Insufficient space to store requested object
1408: Internal logic error in UNS server code
1409: Requested object not found in UNS
1410: Requested UNS operation not implemented
1411: Failure in underlying RPC transport
1412: Destination URL already exists
1413: Channel does not exist
1414: Writes failed to all metadata files
1415: Object is not servable
1416: Object is out of presentation time
1417: Object playback not allowed by this playback server
1418: Live object, but attributes invalid
1419: Alternate media attributes invalid
1420: Alternate media is not servable
1421: Channel would be over disk quota
1422: Cannot change channel ID
1423: Object already in specified channel
1424: Object not in specified channel
1425: Metadata operations disabled (no file systems)
1426: Channel operations disabled (no file systems)
1427: Nonobject metadata services not initialized
1428: Out of handles for nonobject metadata files
1429: Internal error loading metadata file
1430: (No error text available)
1431: Too many CDN files (cannot add to URL for file system map [UFM])
1432: Specified legacy ECDNFS file not found
1433: Bad MD5 checksum passed by caller
1434: Bad tags present in supplied attributes (OBSOLETE)
1435: Cannot resize content file migrated from ECDN
1436: UNS symlink references nonexistent URL
1437: URL is not a UNS symlink
1438: Too many levels of UNS symlinks
1439: UNS entry metafile truncated
1440: Output to be returned is too big in size
1441: URL request was made on a nondefault service port
1442: Specified legacy file not on any UNS filesystem
1443: Specified legacy file is not in the 'cache' directory
1444: Specified legacy file is referenced by a UNS entry
1445: Specified legacy file is not a .data file
1446: File open failed during ASX/SMIL rewrite operation
1447: Parse error during ASX/SMIL rewrite operation
1448: Initialization error during ASX/SMIL rewrite operation
1449: Actual Rewriting failed during ASX/SMIL rewrite operation
1450: No more file system slots
1451: Specified file system is already in use
1452: Specified file system not known to UNS
1453: Specified file system bytes in use exceeds target
1454: Cannot unuse the file system containing the symlink tree
1455: Cannot add file system because there is no local disk-based CDNFS storage
Acquirer Error Codes
The following error codes are common for HTTP, FTP, HTTPS, and MMS content acquisition:
700: Acquirer internal error
701: Manifest parser error
702: Manifest parse warning
703: Exceeded disk quota
704: No space in UNS
705: It is a folder
706: ACCEPT failed
707: Connection refused
708: Listen failed
709: Mismatched in crawling
710: No-cache instructed from server
711: Disabled by the user
712: Downloaded size mismatched with content length
713: Invalid response received from server
714: Database access error
715: Not to acquire URL with ?
716: Invalid redirect foo to foo/
717: Illegal folder for file import
718: Unable to connect to proxy server
719: URL is too long
720: HTTP metadata is too long
721: Connection closed by peer
722: Invalid content length
905: Socket timeout
906: No host
907: Zero bandwidth
908: File download aborted
909: Content expired before fetch
910: Insufficient bandwidth to download the stream
911: Live stream not supported
912: Request timed out
1000: UNS error
MMS-Specific Acquisition Error Codes
2002: Cannot connect to remote server
2003: Remote cannot connect
2004: Requested file not found
2005: Requested remote file not found
2006: Max number of connections is reached
2007: Remote server reached max number of connections
2008: Max bandwidth limit is reached
2009: Remote server reached max bandwidth limit
2010: Max bit rate limit reached
2011: Remote server reached max bit rate limit
2012: Illegal memory address encountered
2013: Illegal memory address encountered at remote server
2015: Error creating a socket
2016: Error creating a socket at remote server
2017: Internal system error
2018: Error receiving data from server
2019: Server error at remote server
2020: Authentication failed
2021: Remote server timed out
2022: Error in stream type
2023: Remote proxy error
2024: Invalid request at remote server
2025: Stream file is corrupt
2026: Stream file is corrupt at remote server
2027: Data received is corrupt
2028: Data received is corrupt at remote server
2029: Remote access denied
2030: Remote connection refused
2031: MMS over UDP not allowed in WCCP mode
2032: MMS over UDP has been disabled
2033: MMS over TCP has been disabled
2034: MMS over TCP and UDP has been disabled
2035: MMS over HTTP has been disabled
2036: Client error at first round of handshake over MMS
2037: Client error
2038: Client request blocked
2039: Client request filtered
2041: Unknown MMS error
2042: Unknown error from remote server
2043: Max incoming bandwidth limit is reached
2044: Max incoming bit rate limit reached
2100: Generic MMS acquisition error
2101: Acquirer could not be contacted to check header
2102: Check of head result failed to match criteria
2103: Could not write stream to disk
2104: Could not open file to write
2105: Stream header could not be retrieved
2106: Remote stream was closed abnormally
2107: Header obtained from remote stream was in invalid format
2108: Remote server closed the connection
2109: Connection to remote server timed out
2110: Remote file contains no streams
2111: Could not load ASF block for indexing
2112: Specified URL could not be resolved to a remote host
2113: File size is bigger than that supported
2114: Acquirer could not be contacted to send status of download
2115: Error in forking a process
2116: Stream acquisition terminated
2117: MMS acquisition was stopped
2118: Bandwidth not available for streaming in next packet
2119: RPC exception thrown when contacting main acquirer
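When replication status is processed by a script, it can help to sort a numeric code into one of the three lists above before looking up its text. The following is a minimal sketch based only on the number ranges that appear in this section. The function name error_family is hypothetical, and the range test is coarse: a number inside a range that is not listed above (for example, 2014) is still reported as belonging to that family.

```python
def error_family(code):
    """Map an error code to the list in this section that it falls under."""
    if 1401 <= code <= 1455:
        return "ACNS unified name space"
    if 700 <= code <= 722 or 905 <= code <= 912 or code == 1000:
        return "acquirer"
    if 2002 <= code <= 2119:
        return "MMS-specific acquisition"
    return "unknown"


for code in (703, 1407, 2020, 42):
    print(code, error_family(code))
```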