Cisco ACNS Software Caching Configuration Guide, Release 4.2
Chapter 1: Cisco Cache Application Product Overview

Table Of Contents

Cisco Cache Application Overview

Cache Application Introduction

Introduction to Caching

Network Protocols and Caching

HTTP

HTTP Message Definition

FTP over HTTP

RealNetworks Real-Time Streaming Protocol

Windows Media Technologies MMS

Deployment Configurations

Transparent Caching

Transparent Caching Through a WCCP-Enabled Router

Transparent Caching with the Cisco CSS 11000 Series Switch

Nontransparent Caching

Using the Content Engine as a Proxy

Using a .pac File

Reverse Proxy Caching

Web Acceleration

Caching Hierarchy

Advanced Transparent Caching Service


Cisco Cache Application Overview


This chapter provides an overview of the Cisco caching application portion of Cisco Application and Content Networking System (ACNS) software deployed on Cisco Content Engines. This chapter also describes the three different deployment configurations for the Cisco Content Engines. This chapter contains the following sections:

Cache Application Introduction

Introduction to Caching

Network Protocols and Caching

Deployment Configurations

Caching Hierarchy

Advanced Transparent Caching Service

Cache Application Introduction

The Cache application portion of Cisco ACNS software deployed on Cisco Content Engines is one of the content delivery elements of the Content Delivery Network (CDN) solution from Cisco Systems. The CDN solution allows the proactive distribution of rich media files to Content Engines at the network edge for local access to e-business applications such as e-learning, e-commerce, knowledge sharing, and corporate communications. Designed for affordability and ease of installation, the CDN solution enables you to quickly deploy high-impact, high-bandwidth rich media, such as high-quality streaming video, with minimal administration.

Cisco Content Engines with Cache application software installed accelerate content delivery by caching frequently accessed content (transparently or proxy-style) and then locally fulfilling content requests rather than traversing the Internet or intranet to a distant server. This solution helps to protect your network from uncontrollable bottlenecks and accelerates the delivery of content, enabling service providers to offer higher service quality and enabling enterprise employees to be more productive. By caching content such as Hypertext Transfer Protocol (HTTP) and File Transfer Protocol (FTP) traffic, Cisco Content Engines minimize redundant network traffic that traverses WAN links. As a result, WAN bandwidth costs either decrease or grow less quickly. This bandwidth optimization increases network capacity for additional users or traffic and for new services, such as voice.

Introduction to Caching

Typically, a web user (client) uses a web browser to request an object from a web server. (See Figure 1-1.) Caching is the ability to store web objects, such as web pages, for later retrieval. Web browsers have a local cache that the user can configure to retain recently viewed pages for faster access while browsing. (See Figure 1-2.) For instance, clicking the Back button on a browser takes advantage of local caching, because the recently viewed page is displayed rapidly from the cache.

Figure 1-1 Basic Web Request

Figure 1-2 Local Browser Cache Configuration

Despite the inherent advantages obtained using a local cache, a device called a Content Engine is often used to provide specialized caching features to many users.

This ability to store information on a large scale for later retrieval has significant advantages for the user and Internet traffic on the whole:

1. It reduces network congestion by keeping web objects close to the user instead of accessing the same content through the network.

2. It reduces the time it takes to display a web page, because the page is stored locally or close to the user.

3. It reduces the load on the server, because the server does not have to redistribute content that has recently been acquired.

In this scenario, a user attempts to retrieve a web object from a web server. Because the local browser does not have the page stored in its cache, the browser sends the request to a web server using the Content Engine as its proxy for the web request. In this deployment, the Content Engine serves as a proxy to the client, and at first it tries to satisfy the web request from its cache of stored objects. See the "Using the Content Engine as a Proxy" section for more information on proxy caching.
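The hit-or-miss behavior described above can be sketched in a few lines of Python. This is a minimal illustration only, not how the Content Engine is implemented: the `ProxyCache` class, the `fake_origin` function, and the example URL are all hypothetical names, and a real cache would also honor expiry and freshness rules.

```python
from typing import Callable, Dict, List, Tuple


class ProxyCache:
    """Toy in-memory object cache keyed by URL (illustrative only)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}

    def get(self, url: str) -> Tuple[bytes, str]:
        """Return the object and whether it was a cache hit or miss."""
        if url in self._store:
            return self._store[url], "hit"
        body = self._fetch(url)      # cache miss: go to the origin server
        self._store[url] = body      # keep a copy for later requests
        return body, "miss"


origin_calls: List[str] = []


def fake_origin(url: str) -> bytes:
    """Stand-in for the origin web server; records every request it sees."""
    origin_calls.append(url)
    return b"<html>" + url.encode("ascii") + b"</html>"


cache = ProxyCache(fake_origin)
first = cache.get("http://www.example.com/index.html")
second = cache.get("http://www.example.com/index.html")
print(first[1], second[1], len(origin_calls))  # miss hit 1
```

The second request is served from the cache, so the origin server is contacted only once.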

Network Protocols and Caching

The interaction between a web browser and a web server makes use of existing standard application-layer Internet protocols such as HTTP, Microsoft Media Server (MMS), and Real-Time Streaming Protocol (RTSP). Every web object has a unique address, its Uniform Resource Locator (URL), that allows the web browser to retrieve the object, assuming that the server address exists in the Domain Name System (DNS) space. The basic components of a URL and their relationship to the network protocol are shown in Figure 1-3.

Figure 1-3 Basic Components of URL Address
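As a rough illustration of how a URL breaks down into a protocol, a server address, and an object path, the following Python sketch uses the standard library `urllib.parse.urlsplit` function. The URL is a made-up example, and the split shown here may not match the labels used in Figure 1-3 exactly.

```python
from urllib.parse import urlsplit

# Hypothetical URL used only for illustration.
parts = urlsplit("http://www.example.com:8080/docs/index.html?lang=en")

print(parts.scheme)    # protocol used to retrieve the object
print(parts.hostname)  # server name, resolved through DNS
print(parts.port)      # TCP port on the server
print(parts.path)      # location of the object on the server
print(parts.query)     # optional parameters passed to the server
```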

HTTP

HTTP is a request-response protocol used to send and receive messages between clients and servers. HTTP uses URLs to define web server addresses and objects that clients can retrieve.

The type of request-response scenario that concerns caching is the one defined by the presence of the Content Engine as an intermediary in the request-response process. As defined in the "Deployment Configurations" section, the Content Engine can serve as an intermediary in the following scenarios:

Direct proxy acting on behalf of the requesting client

Transparent caching agent intercepting traffic sent through either a router or a Layer 4 switch

Reverse proxy caching agent acting as a proxy on behalf of the server

HTTP Message Definition

An HTTP message consists of an HTTP header and an entity body. Two types of messages in HTTP are directly related to the exchange between a client and a server: an HTTP request and an HTTP response.

HTTP Request

The HTTP request message consists of a header field and an entity body. (See Figure 1-4.) Within the header field, the request line states what is requested from the server, such as the requested URL. The general headers qualify the nature of the request, such as content format. The request headers pertain mainly to authorization credentials. These are the headers that authentication servers, such as NTLM, LDAP, RADIUS, or TACACS+, examine when performing authentication. The entity header describes the data contained in the request. In particular, the request headers play an important role in providing the authorization credentials that control whether or not caching can be authorized when an HTTP request is made.

See the "HTTP Request Authentication" section for more information about how the Content Engine uses authentication servers to handle authentication requests in an HTTP message.

Figure 1-4 Components of HTTP Request

In the case of HTTP request authentication, a server can require that a client supply a username and a password before the server can satisfy the request. This allows for restricted information and easier tracking of users visiting particular websites.
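To make the parts of a request concrete, the following Python sketch builds a small HTTP request that carries a username and password, then splits it back into a request line, headers, and entity body. It assumes HTTP Basic authentication purely for illustration; the function names, host, path, and credentials are hypothetical, and the authentication servers named above use their own mechanisms.

```python
import base64
from typing import Dict, Tuple


def build_request(host: str, path: str, username: str, password: str) -> bytes:
    """Build a GET request with a Basic Authorization request header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    lines = [
        f"GET {path} HTTP/1.1",                         # request line
        f"Host: {host}",                                # request header
        f"Authorization: Basic {token.decode('ascii')}",  # credentials
        "",                                             # blank line ends the headers
        "",                                             # empty entity body
    ]
    return "\r\n".join(lines).encode("ascii")


def split_request(raw: bytes) -> Tuple[str, Dict[str, str], bytes]:
    """Split a raw request into request line, headers, and entity body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    request_line, *header_lines = head.decode("ascii").split("\r\n")
    headers: Dict[str, str] = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return request_line, headers, body


raw = build_request("www.example.com", "/private/report.html", "user", "secret")
request_line, headers, body = split_request(raw)
print(request_line)
print(sorted(headers))
```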

HTTP Response

The structure of the HTTP response is similar to the structure of the HTTP request shown in Figure 1-4. The main difference between the two is the presence of the status line in the header field of the HTTP response. (See Figure 1-5.) The status line provides a response code and a short description of that code. (See Table 1-1.) For instance, the 200 OK status code indicates a successful response, and the 404 Not Found status code is sent when the server cannot find the requested resource.

See "HTTP Caching Parameter Settings" for more information regarding the configuration of features related to HTTP caching parameters.

Figure 1-5 Components of HTTP Response

Table 1-1 lists the categories of HTTP response status codes.

Table 1-1 HTTP Response Status Codes

Status Code    Response Category
-----------    -----------------
1xx            Informational
2xx            Successful
3xx            Redirection
4xx            Client error
5xx            Server error
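The mapping in Table 1-1 depends only on the first digit of the status code, which the following Python sketch shows. The function name `status_category` is a hypothetical helper, not part of any Cisco software.

```python
CATEGORIES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def status_category(code: int) -> str:
    """Return the Table 1-1 category for an HTTP status code."""
    if not 100 <= code <= 599:
        raise ValueError(f"not an HTTP status code: {code}")
    return CATEGORIES[code // 100]


print(status_category(200))  # Successful
print(status_category(404))  # Client error
```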


FTP over HTTP

FTP over HTTP allows for sending and receiving files from remote FTP server locations using a web browser. See Figure 1-6 for an example of accessing public files over HTTP from an FTP server.

Figure 1-6 Example of FTP over HTTP Request

RealNetworks Real-Time Streaming Protocol

Real-Time Streaming Protocol (RTSP) is a streaming media protocol used to deliver two-way streaming media over IP networks. The Content Engine can be configured to accept transparently redirected RTSP requests as well as traditional proxy-style RTSP requests from RealPlayer client software. The redirection of RTSP traffic to the Content Engine media cache is enabled with the Content Engine CLI. The RealProxy software is configured with the RealAdministrator GUI, accessed from the RealProxy page of the Content Engine management GUI.

For more information regarding this feature, see the "Using the RealProxy Streaming Solution" section.

Windows Media Technologies MMS

Windows Media Technologies (WMT) uses an application-level protocol called Microsoft Media Server (MMS) to send active streaming format (ASF) files across the Internet. A URL that points to a streaming ASF file includes MMS as its protocol, as shown in the following example:

mms://servername/filename.asf

The MMS protocol automatically looks for the optimal transport to deliver the streaming media in the following order:

UDP (User Datagram Protocol)

TCP (Transmission Control Protocol)

HTTP

UDP is a connectionless transport-layer protocol that is well suited to real-time media precisely because it does not guarantee delivery. Although this sounds like a drawback rather than an advantage, it is a characteristic particularly suited for streaming media. Unlike data such as files or e-mail, which must be delivered in their entirety no matter how long the transmission takes, the value of streaming media data is constrained by time. If a frame of video is lost, retransmitting it is worthless, because the retransmitted frame would not arrive within the correct time frame.
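The transport preference order listed above (UDP, then TCP, then HTTP) amounts to trying each option in turn and keeping the first one that works. The Python sketch below shows only that ordering logic. The `is_usable` callback is a hypothetical stand-in for whatever probing the player and server actually perform, which this chapter does not describe.

```python
from typing import Callable, Optional, Sequence

PREFERENCE: Sequence[str] = ("UDP", "TCP", "HTTP")


def choose_transport(
    is_usable: Callable[[str], bool],
    preference: Sequence[str] = PREFERENCE,
) -> Optional[str]:
    """Return the first usable transport in preference order, or None."""
    for transport in preference:
        if is_usable(transport):
            return transport
    return None


# Example: a firewall that blocks UDP forces the fallback to TCP.
blocked = {"UDP"}
print(choose_transport(lambda t: t not in blocked))  # TCP
```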

See the "Understanding Streaming Media Caching" section for more information regarding this feature.

Deployment Configurations

The Content Engine can be deployed in three basic configurations:

Transparent Caching

Nontransparent Caching

Reverse Proxy Caching

Transparent Caching

In transparent caching, a user requests web objects directly from their source. In other words, the URL entered in the browser is that of the origin web server that holds the desired content. The user is not aware of the presence of the Content Engine in this configuration.

This request is intercepted by the Content Engine, which checks to see if it contains a copy of the requested content. If the Content Engine does not hold the page (cache miss), it forwards the intercepted request through a router (Figure 1-7) or Content Services Switch (Figure 1-9), which then sends the request to the origin server. Once the origin server returns the content, the Content Engine stores a copy of the requested object in its cache and then sends a copy of the content to the client that requested it.

Figure 1-7 Transparent Caching Using a Router

Transparent Caching Through a WCCP-Enabled Router

In transparent caching through the Web Cache Communication Protocol (WCCP), the user is unaware that the request made to an origin server is redirected to the Content Engine by a WCCP-enabled router. WCCP allows interception of traffic on any port number, provided that the traffic traverses the WCCP-enabled router or switch. WCCP contains many fail-safe mechanisms to ensure that the caching solution remains entirely transparent to the end user.

A Content Engine transparently caches content as follows. (See Figure 1-8.)

1. A user requests a web page from a browser.

2. The WCCP-enabled router analyzes the request, and based on TCP destination port number, the router determines whether it should transparently redirect the request to a Content Engine. Access lists are used to control which requests can be redirected.

3. If the Content Engine does not have the requested content, it sets up a separate TCP connection to the end server to retrieve the content (3a). The content returns to, and is stored on, the Content Engine (3b).

4. The Content Engine sends the content to the client (4). Upon subsequent requests for the same content, the Content Engine transparently fulfills the request from its local storage.

See the "Transparent Caching Through WCCP" section for more information regarding this feature.
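The redirect decision in step 2 can be sketched as a check on the TCP destination port followed by an access-list check. The Python below is an illustration of that logic only. The port set and the denied network are hypothetical values, and on a real router this behavior is configured through WCCP and access lists rather than written as code.

```python
import ipaddress

# Hypothetical values for illustration only.
REDIRECT_PORTS = {80}                                   # ports handled by the cache service
DENIED_SOURCES = [ipaddress.ip_network("10.1.1.0/24")]  # clients excluded by an access list


def should_redirect(src_ip: str, dst_port: int) -> bool:
    """Decide whether a request is redirected to the Content Engine."""
    if dst_port not in REDIRECT_PORTS:
        return False  # not a port the cache service handles
    source = ipaddress.ip_address(src_ip)
    # Redirect unless an access-list entry excludes this client.
    return not any(source in network for network in DENIED_SOURCES)


print(should_redirect("10.2.0.5", 80))   # True
print(should_redirect("10.1.1.9", 80))   # False
print(should_redirect("10.2.0.5", 25))   # False
```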

Figure 1-8 Transparent Caching Through WCCP

WCCP can also handle asymmetric packet flows and always maintains a consistent mapping of web servers to caches regardless of the number of switches or routers used in a WCCP service group (up to 32 routers or switches communicating with up to 32 caches in a cluster).
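One way to picture a consistent mapping of web servers to caches is a fixed table of hash buckets that every router in the service group shares. The Python sketch below uses 256 buckets and a CRC-32 hash of the server address. Both choices, along with the function names and IP addresses, are assumptions made for illustration and are not the actual WCCP assignment algorithm.

```python
import ipaddress
import zlib
from typing import List

BUCKETS = 256  # assumed bucket count for this sketch


def assign_buckets(caches: List[str]) -> List[str]:
    """Spread the buckets evenly across the caches in the cluster."""
    return [caches[i % len(caches)] for i in range(BUCKETS)]


def cache_for(server_ip: str, table: List[str]) -> str:
    """Map an origin server address to a cache through its hash bucket."""
    packed = ipaddress.ip_address(server_ip).packed
    return table[zlib.crc32(packed) % BUCKETS]


table = assign_buckets(["ce-1", "ce-2", "ce-3"])
# Any router that uses the same table picks the same cache for a server.
print(cache_for("192.0.2.10", table) == cache_for("192.0.2.10", table))  # True
```

Because the choice depends only on the server address and the shared table, requests for the same server reach the same cache regardless of which router redirects them.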

There are some significant advantages to deploying caches in transparent mode:

No end user configuration—The user does not have to point to the Content Engine.

Fail-safe operation—Caches are automatically fault-tolerant and fail-safe. Any cache failure does not cause denial of service to the end user.

Scalability—Cache service can be scaled by deploying multiple caches.

Automatic bypass—Sites which depend on end user authentication or which fail to conform to HTTP standards will automatically bypass a transparent cache.

Transparent Caching with the Cisco CSS 11000 Series Switch

Transparent caching deploys cache servers that are transparent to the browsers. You do not have to configure browsers to point to a cache server. Cache servers duplicate and store inbound Internet data previously requested by clients.

When you configure transparent caching on the CSS switch, the switch intercepts and redirects outbound client requests for Internet data to the cache servers on your network. The cache either returns the requested content if it has a local copy or sends a new request to the origin server for the information. If all cache servers are unavailable in a transparent cache configuration, the CSS switch allows all client requests to progress to the origin servers.

A CSS switch is introduced between the user and the cache. (See Figure 1-9.) You can configure the CSS switch to dynamically analyze the content and determine if it is cacheable or not. If it is cacheable, the CSS switch directs it to the cache service. If it is not cacheable, the CSS switch sends it directly to the origin server.

See the "Configuring Transparent Caching with the Cisco CSS 11000 Series Switch" section for more information regarding transparent caching.

Figure 1-9 Transparent Caching Network Diagram with the CSS 11000 Series Switch

The Content Engines will be stocked with static data (that is, HTML, Audio Video Interleaved [AVI], Joint Photographic Experts Group [JPEG], or Graphics Interchange Format [GIF] files). Any files that are not cacheable will be passed directly to the server.

Requests for cacheable content are load-balanced over the two cache servers based on the URL. In a real-world scenario, they could also be balanced based on the domain name.
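The switching behavior described in this section can be approximated as follows: decide whether a request is cacheable, send noncacheable requests to the origin server, and balance cacheable requests across the available cache servers by URL. This Python sketch is a simplification under stated assumptions. The file-extension test, the CRC-32 hash, and the cache names are hypothetical and do not represent the CSS switch configuration or its balancing algorithm.

```python
import zlib
from typing import List
from urllib.parse import urlsplit

# Extensions treated as static, cacheable content in this sketch.
CACHEABLE_EXTENSIONS = (".html", ".htm", ".avi", ".jpg", ".jpeg", ".gif")


def route(url: str, caches: List[str]) -> str:
    """Return the cache that should serve the URL, or "origin"."""
    path = urlsplit(url).path.lower()
    if not caches:
        return "origin"  # all cache servers unavailable
    if not path.endswith(CACHEABLE_EXTENSIONS):
        return "origin"  # noncacheable content goes straight to the server
    # Balance on the URL so the same object always lands on the same cache.
    return caches[zlib.crc32(url.encode("utf-8")) % len(caches)]


caches = ["cache-1", "cache-2"]
print(route("http://www.example.com/cgi-bin/search?q=x", caches))  # origin
print(route("http://www.example.com/images/logo.gif", caches) in caches)  # True
print(route("http://www.example.com/images/logo.gif", []))  # origin
```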

Nontransparent Caching

In nontransparent caching, the user specifically sends all the requests to the Content Engine. The Content Engine acts on behalf of the client as a proxy.

Using the Content Engine as a Proxy

A proxy-style request arrives with the IP address of the Content Engine as its destination address; it has been specifically routed to the Content Engine by the client. The Content Engine supports up to eight incoming ports each for FTP, HTTPS, HTTP, MMS, and RTSP proxy modes. The incoming proxy ports can be the same ports that are used by transparent mode services. The incoming proxy ports can be changed without stopping any WCCP services running on the Content Engine or on other Content Engines in the Content Engine farm.

In proxy mode, the Content Engine services any protocols for which it has been configured. The supported protocols are HTTP, HTTPS, FTP, MMS, and RTSP. If the Content Engine is not configured to support a received protocol, the proxy server returns an error. For example, if port 8080 is configured to run an HTTP and HTTPS proxy service, an FTP request coming to this port is rejected.
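The port rules above (up to eight incoming ports per protocol, and rejection of any protocol that a port is not configured for) can be modeled with a small lookup structure. The following Python sketch is illustrative only. The `ProxyListener` class and its methods are hypothetical and do not correspond to Content Engine CLI commands.

```python
from typing import Dict, Set

MAX_PORTS_PER_PROTOCOL = 8
SUPPORTED_PROTOCOLS = ("ftp", "https", "http", "mms", "rtsp")


class ProxyListener:
    """Tracks which incoming ports accept which proxied protocols."""

    def __init__(self) -> None:
        self._ports: Dict[str, Set[int]] = {p: set() for p in SUPPORTED_PROTOCOLS}

    def add_port(self, protocol: str, port: int) -> None:
        if protocol not in self._ports:
            raise ValueError(f"unsupported protocol: {protocol}")
        ports = self._ports[protocol]
        if port not in ports and len(ports) >= MAX_PORTS_PER_PROTOCOL:
            raise ValueError(f"{protocol} already has {MAX_PORTS_PER_PROTOCOL} ports")
        ports.add(port)

    def accepts(self, protocol: str, port: int) -> bool:
        """Return True if the port is configured for the protocol."""
        return port in self._ports.get(protocol, set())


listener = ProxyListener()
listener.add_port("http", 8080)
listener.add_port("https", 8080)
print(listener.accepts("http", 8080))  # True
print(listener.accepts("ftp", 8080))   # False: the FTP request is rejected
```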

A Content Engine in proxy mode caches content as follows.

1. A user requests a web page from a browser.

2. If the Content Engine does not have the requested content (cache-miss), it sets up a connection to the web server to retrieve the content.

3. The content returns to, and is stored on, the Content Engine.

4. The Content Engine sends the content to the client.

5. Upon subsequent requests for the same content by the same user or a different user, the Content Engine transparently fulfills the request from its local storage (cache-hit).

See the "Proxy Mode Operation" section for more information regarding proxy-style caching.

Figure 1-10 Web Caching with the Content Engine in Proxy Mode

Using a .pac File

You can also use proxy automatic configuration files (.pac files) in the deployment of nontransparent caching. When you have multiple Content Engines that support many clients, you can use a .pac file to configure all of your browser clients. When the browser starts, it loads the .pac file and then uses this configuration file to obtain a proxy Content Engine IP address and port configuration information. An administrator can configure all browsers in any organization by using a single .pac file. For more information regarding the use of a .pac file, see the "Browser Autoconfiguration" section.
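A .pac file is a short JavaScript file that defines a `FindProxyForURL(url, host)` function, which the browser calls to learn which proxy to use. The Python sketch below generates a minimal file of that kind for a list of Content Engines. The `build_pac` helper, the IP addresses, and the port are hypothetical, and the trailing `DIRECT` fallback is an assumption about the desired behavior when no proxy responds.

```python
from typing import List, Tuple


def build_pac(proxies: List[Tuple[str, int]]) -> str:
    """Return the text of a minimal .pac file for the given proxies."""
    chain = "; ".join(f"PROXY {host}:{port}" for host, port in proxies)
    return (
        "function FindProxyForURL(url, host) {\n"
        f'    return "{chain}; DIRECT";\n'
        "}\n"
    )


# Hypothetical Content Engine addresses, for illustration only.
pac_text = build_pac([("10.0.0.10", 8080), ("10.0.0.11", 8080)])
print(pac_text)
```

The browser tries the listed proxies in order, so placing more than one Content Engine in the list gives clients a fallback if the first is unreachable.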

Reverse Proxy Caching

In reverse-proxy caching mode, the Content Engine acts as a proxy on behalf of the origin server. (See Figure 1-11.)

A Content Engine in reverse proxy mode caches content as follows.

1. A user requests a web page from a browser.

2. A WCCP-enabled router intercepts the request and forwards it to a Content Engine.

3. If the Content Engine does not have the requested content (cache miss), it sets up a connection to the web server to retrieve the content.

4. A Content Engine at the content provider site, acting as reverse proxy for the web server, tries to deliver the requested content.

5. If the Content Engine in reverse proxy mode does not have the content, it sets up a connection to the web server to retrieve the original content requested.

6. The content returns to, and is stored on, the Content Engine at the enterprise.

7. The Content Engine at the enterprise sends the content to the client.

Upon subsequent requests for the same content by the same user or a different user, the Content Engine transparently fulfills the request from its local storage (cache hit).

Figure 1-11 Content Engine in Reverse Proxy Mode

Web Acceleration

The Content Engine accelerates web server performance by offloading common or static pages from the origin web server. Users requesting objects from the origin server receive the static pages from the Content Engine acting in reverse proxy mode rather than from the origin server. This provides an alternative to web server expansion, as well as a possible way of replicating content to geographically dispersed areas by deploying Content Engines in these areas.

Caching Hierarchy

Because a Cisco Content Engine can be transparent to the client and to network operation, customers can easily place Content Engines in several network locations in a hierarchical fashion. For example, if an Internet service provider (ISP) deploys a Content Engine at its main point of access to the Internet, all of its points of presence (POPs) benefit, because requested content can be available at this main point of access without going through the Internet. Figure 1-12 depicts a typical caching hierarchy using Content Engines.

Figure 1-12 Caching Hierarchy

Client requests reach the Content Engine and are fulfilled from its storage. To further improve service to favored clients, ISPs can deploy Content Engines at each POP. Then, when a client accesses the Internet, the request is first redirected to the POP Content Engine. If the POP Content Engine is unable to fulfill the request from local storage, it makes a normal web request to the end server.

Upstream, this request is redirected to the Content Engine at the main Internet access point. If the request is fulfilled by the Content Engine, traffic on the main Internet access link is avoided, the origin web servers experience lower demand, and the client experiences better network response times.
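The hierarchy described above can be sketched as a chain of caches in which each tier asks the next tier upstream on a miss. This Python example is a simplified model, not the Content Engine implementation. The `CacheTier` class, the tier names, and the `origin` function are hypothetical.

```python
from typing import Callable, Dict, Tuple

Fetch = Callable[[str], Tuple[bytes, str]]


class CacheTier:
    """One cache in the hierarchy; forwards misses to its upstream."""

    def __init__(self, name: str, upstream: Fetch) -> None:
        self.name = name
        self._upstream = upstream
        self._store: Dict[str, bytes] = {}

    def get(self, url: str) -> Tuple[bytes, str]:
        """Return the object and the name of the tier that supplied it."""
        if url in self._store:
            return self._store[url], self.name
        body, served_by = self._upstream(url)
        self._store[url] = body
        return body, served_by


def origin(url: str) -> Tuple[bytes, str]:
    """Stand-in for the origin web server."""
    return b"content of " + url.encode("ascii"), "origin"


main = CacheTier("main", origin)        # main Internet access point
pop_a = CacheTier("pop-a", main.get)    # POP caches sit below the main cache
pop_b = CacheTier("pop-b", main.get)

url = "http://www.example.com/index.html"
print(pop_a.get(url)[1])  # origin
print(pop_b.get(url)[1])  # main
print(pop_a.get(url)[1])  # pop-a
```

The second POP benefits from the first POP's request, because the object is already held at the main access point and the main Internet link is not used.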

Enterprise networks can apply this hierarchical transparent architecture in the same way. (See Figure 1-13.)

Figure 1-13 Caching Hierarchy in Enterprise Solutions

Advanced Transparent Caching Service

Cisco Content Engines offer advanced transparent caching technologies that include:

Overload bypass—Prevents a Content Engine from becoming a bottleneck when traffic loads exceed the capacity of a Content Engine

Dynamic client bypass—Prevents source IP authentication problems by selectively allowing clients to directly connect to origin servers

Flow protection—Prevents existing flows from being broken when the WCCP cluster load distribution changes because a Content Engine is added to or removed from a cluster

WCCP slow start—Prevents cluster destabilization when a new Content Engine is added to a heavily loaded cluster

Rules Template—Enables flexible establishment of caching policies or rules, for example, "no-cache" policies, refresh policies, upstream proxy selection rules, and URL rewrite rules

To integrate with existing proxy infrastructures, the Cisco ACNS software supports a number of proxied protocols, including FTP, Hypertext Transfer Protocol Secure (HTTPS), HTTP 1.0, and HTTP 1.1. With the Rules Template feature, administrators can establish proxy policies, providing control over how traffic is proxied.
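As a rough picture of what rule-based policies such as "no-cache" and URL rewrite rules do, the following Python sketch applies a list of pattern-based rules to a request URL. It is a conceptual model only. The `Rule` structure, the action names, and the patterns are hypothetical and are not the Rules Template syntax used by ACNS software.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    action: str                        # "no-cache" or "rewrite" in this sketch
    pattern: str                       # regular expression matched against the URL
    replacement: Optional[str] = None  # used only by "rewrite" rules


def apply_rules(url: str, rules: List[Rule]) -> Tuple[str, bool]:
    """Return the (possibly rewritten) URL and whether it may be cached."""
    cacheable = True
    for rule in rules:
        if rule.action == "rewrite" and rule.replacement is not None:
            url = re.sub(rule.pattern, rule.replacement, url)
        elif rule.action == "no-cache" and re.search(rule.pattern, url):
            cacheable = False
    return url, cacheable


rules = [
    Rule("rewrite", r"^http://old\.example\.com/", "http://www.example.com/"),
    Rule("no-cache", r"/cgi-bin/"),
]
print(apply_rules("http://old.example.com/cgi-bin/query", rules))
print(apply_rules("http://www.example.com/logo.gif", rules))
```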

The Cisco Content Engines can be deployed in front of a website (reverse proxy) to transparently cache inbound requests for content, significantly reducing the traffic and TCP connection maintenance performed by origin servers.

By supporting WCCP Version 2 or by interoperating with Cisco CSS 11000 Series switches, a Content Engine can achieve a basic level of transparency that includes:

Transparently receiving content traffic

Fault tolerance

Scalable clustering

See the "Advanced Transparent Caching Features" section for information related to advanced transparent caching services.