Cisco ACNS Software Caching Configuration Guide, Release 4.1
Chapter 15: Configuring the Rules Template

Rules Template

The Rules Template feature matches requests against any number of parameters and applies any number of policies to the matches. Requests can be matched against regular expressions representing domain names, source IP addresses and network masks, destination IP addresses and network masks, destination port numbers, MIME types, or regular expressions representing a URL.

Policies that can be applied include:

  • Blocking the request

  • Bypassing authentication for the request

  • Resetting the request

  • Using a specific object freshness calculation factor

  • Not caching an object

  • Bypassing an upstream proxy for the request

  • Redirecting the request to a different URL

  • Revalidating the object with the origin server

  • Rewriting the URL

  • Selectively caching the object

  • Using a specific upstream proxy

  • Using a specific server for the request

  • Setting TOS/DSCP in response sent to client

  • Setting TOS/DSCP in requests sent to the server

The Rules Template feature is applicable only for HTTP, FTP, and HTTPS traffic and is not applicable for streaming protocols (RTSP, Progressive Networks Audio [PNA], and MMS) implemented in ACNS 4.1 software.


Note   To enter a question mark (?) character in a rule regular expression configuration from the command-line interface, use the escape character (\) followed by a question mark (?) character. This prevents the command-line interface from displaying context-sensitive help.

Actions and Patterns

A rule is an action and a pattern. An action is performed on an HTTP request if this request matches the pattern specified in the rule command.

An action is something that the Content Engine performs when processing an HTTP request, for instance, blocking the request, using an alternative proxy, and so forth.

A pattern defines the limits of an HTTP request; for instance, a pattern may specify that the source IP address fall in the subnet range 172.16.*.*.
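The subnet pattern above can be read as an address-and-netmask test. The following is a minimal Python sketch of that idea, not the Content Engine implementation; the helper name src_ip_matches is hypothetical, and the 255.255.0.0 netmask is the assumed equivalent of the 172.16.*.* range.

```python
import ipaddress

def src_ip_matches(client_ip: str, rule_ip: str, rule_netmask: str) -> bool:
    """Return True if client_ip falls inside the network rule_ip/rule_netmask."""
    # strict=False tolerates a rule address that has host bits set.
    network = ipaddress.ip_network(f"{rule_ip}/{rule_netmask}", strict=False)
    return ipaddress.ip_address(client_ip) in network

# 172.16.*.* expressed as address 172.16.0.0 with netmask 255.255.0.0
print(src_ip_matches("172.16.90.4", "172.16.0.0", "255.255.0.0"))  # True
print(src_ip_matches("172.17.0.1", "172.16.0.0", "255.255.0.0"))   # False
```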

Rules can be dynamically added, displayed, or deleted from the Content Engine. The rules are preserved across reboots because the appropriate CLI commands write them to persistent storage such as NVRAM. The number of rules that the Content Engine can support is limited only by system resources. Because rules consume resources, Content Engine performance may degrade as more rules are defined.

Actions

The Rules Template feature supports the following types of actions:

  • Block—Blocks this request.

  • DSCP—Configures the IP ToS/DSCP codepoint field.

    • client cache-hit—Configures the IP ToS/DSCP codepoint field for cache-hit responses to the client.

    • client cache-miss—Configures the IP ToS/DSCP codepoint field for cache-miss responses to the client.

Setting the Type of Service (ToS) or differentiated services code point (DSCP) is called packet marking, allowing you to partition network data into multiple priority levels or types of service. You can set the ToS or DSCP values in IP packets based on a URL match, a file type, a domain, a destination IP address, a source IP address, or a destination port.

You can set specific ToS or DSCP values for the following:

  • Requests from the Content Engine to the server

  • Responses to the client on cache hit

  • Responses to the client on cache miss

The ToS or DSCP may be set based on any of the policies matching the src-ip-address, dst-ip-address, dst-port-number, domain regex, url-regex, or mime-type regex options. In addition, you can now configure global ToS or DSCP settings with the ip dscp command.


Note   The Rules Template configuration takes precedence over the ip dscp command, and the url-filter command takes precedence over the rule command to the extent that even the rule no-block command is executed only if the url-filter command has not blocked the request.

  • DSCP server—Configures the IP ToS/DSCP codepoint field for requests to the origin server.

  • Freshness-factor—Determines the Time To Live if the request URL matches a specified regular expression. The refresh configuration takes priority over freshness-factor configurations.

  • No-auth—Does not authenticate.

Note that the no-auth rules result in the display of multiple authentication windows in the following scenario:

  • When the main page (for example, index.htm) is excluded from proxy authentication by using no-auth rules

  • When the user entry is not already included in the Content Engine authentication cache

  • When the index.htm page contains objects belonging to different domains

To avoid multiple authentication windows, configure the http avoid-multiple-auth-prompts command in global configuration mode. Once it is configured, check the configuration with the show http avoid-multiple-auth-prompts command as shown in the following example.

    ContentEngine# show http avoid-multiple-auth-prompts
    
    Avoiding multiple authentication prompts due to no-auth rules is enabled
     
    

    Note   The command in the example is hidden, because it is applicable only to this specific scenario.

  • No-cache—Does not cache this object. If both no-cache and selective-cache actions are matched, no-cache takes precedence.

  • No-proxy—For a cache miss, does not use the configured upstream proxy but rather contacts the server directly.

  • Redirect—Redirects the original request to a specified URL. Redirect is relevant to the RADIUS server only if the RADIUS server has been configured for redirect.

  • Refresh—For a cache hit, forces an object freshness check with the server.

  • Reset—Issues a TCP RST. This action is useful for resetting Code Red or Nimda virus requests.

  • Rewrite—Rewrites the original request as a specified URL. The Content Engine searches for the rewritten URL in cache, and then on a cache miss, fetches the rewritten URL and returns the object transparently to the client. It is preferable to use a redirect rule rather than rewrite because of possible performance impacts.

The URL rewrite could change the domain name of the URL, which necessitates a DNS lookup to find the destination (dst) IP address of the new rewritten server to which the request must be sent. The original dst IP address derived from the WCCP redirect packet cannot be used.

  • Selective-cache—Caches this object only if it is a match and is allowed to be cached by HTTP. If one or more rules specify this action, an object is cached if and only if it matches at least one of the selective-cache rules and passes every other caching restriction such as the object-size check and the no-cache-on-authenticated-object check. If the object does not match any of the selective-cache rules, the object is not cached.

  • Use-proxy—For a cache miss, uses a specific upstream proxy. Specify the upstream proxy IP address (or domain name) and port number. If both no-proxy and use-proxy are matched, no-proxy takes precedence.

  • Use-proxy-failover—Supports failover capability. The use-proxy-failover rule is similar to the use-proxy rule, except that if the connection attempt on the configured outgoing proxy fails, the requests fail over to the outgoing proxies configured with the HTTP proxy outgoing configuration. The rule requests use the HTTP proxy outgoing origin-server option, if it is configured. The use-proxy-failover rule takes precedence over the use-proxy rule. If both no-proxy and use-proxy-failover are matched, no-proxy takes precedence.

The HTTP failover does not apply if the destination is on the exclude list. When in transparent mode, the setting for the original proxy takes precedence.

  • Use-server—Sends server-style HTTP requests from the Content Engine to the specified IP address and port on a cache miss.

Among use-server, no-proxy, and use-proxy rules, the use-server rule is the first one to be checked. If it results in a rule miss, no-proxy and use-proxy rules are executed in succession (use-proxy is not checked if a no-proxy rule matches).

If a rule is configured with a fully qualified domain name (FQDN) and a request is received with the partial domain name in transparent mode, the rule fails to be executed, as the FQDN is not in the request URL. In transparent mode, if a request is destined for a particular domain (for which a domain rule is configured) and does not contain the Host header, the rule pattern match fails.
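The precedence among the use-server, no-proxy, use-proxy-failover, and use-proxy actions described in this section can be sketched as a simple decision function. This is an illustrative Python sketch only; the function name choose_upstream and the returned labels are hypothetical, and the input is assumed to be the set of actions whose rules matched a cache-miss request.

```python
from typing import Set

def choose_upstream(matched_actions: Set[str]) -> str:
    """Pick how a cache-miss request is forwarded, given the matched rule actions."""
    # use-server is checked first among these rules.
    if "use-server" in matched_actions:
        return "use-server"
    # no-proxy takes precedence over use-proxy-failover and use-proxy.
    if "no-proxy" in matched_actions:
        return "direct-to-origin"
    # use-proxy-failover takes precedence over use-proxy.
    if "use-proxy-failover" in matched_actions:
        return "use-proxy-failover"
    if "use-proxy" in matched_actions:
        return "use-proxy"
    # No rule matched: fall back to whatever outgoing proxy configuration exists.
    return "configured-default"

print(choose_upstream({"use-proxy", "no-proxy"}))            # direct-to-origin
print(choose_upstream({"use-proxy", "use-proxy-failover"}))  # use-proxy-failover
```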

Patterns

The Rules Template feature supports the following types of patterns.

  • Domain—Matches the domain name in the URL or the Host header against a regular expression. For example, ".*ibm.*" matches any domain name that contains the "ibm" substring. "\.foo\.com$" matches any domain name that ends with the ".foo.com" substring.


Note   In regular expression syntax, the dollar sign "$" metacharacter directs that a match is made only when the pattern is found at the end of a line.

  • Dst-ip—Matches the request's destination IP address and netmask. Specify an IP address and a netmask. In proxy mode, the Content Engine does a DNS lookup to resolve the destination IP address of the HTTP request, making the response time longer, and possibly negating the benefit of setting a dst-ip rule. When an outgoing proxy is configured, cache miss requests are forwarded by the Content Engine to the outgoing proxy without examination of the destination server IP address, making the dst-ip rule unenforceable on the first Content Engine.

  • Dst-port—Matches the request's destination port number. Specify a port number.

  • Mime-type—Matches the MIME type of the response. Specify a MIME type string, for example, "image/gif", as defined in RFC 2046 (http://www.faqs.org/rfcs/rfc2046.html). The administrator can specify a substring, for example, "java", and have it apply to all MIME types with the "java" substring, such as "application/x-javascript".

  • Src-ip—Matches the request's source IP address and netmask. Specify an IP address and a netmask.

  • URL-regex—Matches the URL against a regular expression. The match is case insensitive. Specify a regular expression whose syntax can be found at the following URL:

http://yenta.www.media.mit.edu/projects/Yenta/Releases/Documentation/regex-0.12/ .

  • URL-regsub—For the rewrite and redirect actions, matches the URL against a regular expression to form a new URL per pattern substitution specification. The match is case insensitive. The valid substitution index range is from 1 to 9.

  • Header-field—Matches a request header field pattern.

Request header field patterns referer, request-line, and user-agent are supported for actions block, reset, redirect, and rewrite. The referer pattern matches against the Referer header in the request, request-line pattern matches against the first line of the request, and user-agent pattern matches against the User-Agent header in the request.
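The regular-expression patterns listed above can be tried out with a short Python sketch. It uses Python's re module as a stand-in, which is an assumption: the Content Engine uses the regex library referenced above, and the two syntaxes may differ in detail. The helper names are hypothetical. Only the URL match is documented as case insensitive, so the other helpers leave case handling at the Python default.

```python
import re

def domain_matches(pattern: str, domain: str) -> bool:
    """Domain pattern: regular expression searched for within the domain name."""
    return re.search(pattern, domain) is not None

def mime_matches(pattern: str, mime_type: str) -> bool:
    """Mime-type pattern: a substring such as "java" is enough to match."""
    return re.search(pattern, mime_type) is not None

def url_matches(pattern: str, url: str) -> bool:
    """URL-regex pattern: the match is case insensitive."""
    return re.search(pattern, url, re.IGNORECASE) is not None

print(domain_matches(r".*ibm.*", "www.ibm.com"))          # True
print(domain_matches(r"\.foo\.com$", "www.foo.com"))      # True
print(domain_matches(r"\.foo\.com$", "www.foo.com.au"))   # False
print(mime_matches("java", "application/x-javascript"))   # True
print(url_matches(r".*\.jpg$", "http://host/PHOTO.JPG"))  # True
```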

Rules Template Processing Considerations

There is a predefined order of execution among the actions and the patterns. In other words, a group of rules with the same action will always be executed either before or after another group of rules with a different action. See the "Rule Action Execution Order" section for the order of rule action execution. This order is not affected by the order in which the rules are entered using CLI commands.

Among the rules of the same action, there is a predefined execution order among the rules pattern. This means that within a group of rules of the same action, one group of rules with the same pattern will always be executed either before or after another group of rules with a different pattern. See the "Rule Pattern Execution Order" section for the order of rule pattern execution. This order is not affected by the order in which the rules are entered using CLI commands.

Rule Action Execution Order

The order of rule action execution is as follows:

1. No-Auth—Before authentication using RADIUS/LDAP/NTLM

2. Reset—Before cache lookup

3. Block—Before cache lookup

4. Redirect—Before cache lookup

5. Rewrite—Before cache lookup

6. Refresh—On cache hit

7. Freshness-factor—On cache hit

8. Use-server—On cache miss

9. No-proxy—On cache miss

10. Use-proxy-failover—On cache miss

11. Use-proxy—On cache miss

12. TOS/DSCP server—On cache miss

13. TOS/DSCP client

14. No-cache—On cache miss

15. Selective-cache—On cache miss


Note   The commands rule no-proxy, rule use-proxy-failover, and rule use-proxy take precedence over https proxy outgoing, http proxy outgoing, and ftp proxy outgoing commands.

During request processing with rules configured through the rules template CLI commands, rule actions 1-4 use the original request URL for pattern matches. After a URL rewrite (rule action 5), rule actions 6-15 use the transformed URL for rule execution.

The commands rule reset, rule block, rule rewrite, and rule redirect support the following additional patterns for rule templates request:

  • request-line—Matches first line.

  • referer—Matches referer header.

  • user-agent—Matches user-agent header.
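The fixed action order in this section, and the switch from the original URL to the rewritten URL, can be sketched in Python as follows. This is an illustration under stated assumptions, not the Content Engine implementation: the names ACTION_ORDER, execution_order, and url_for_matching are hypothetical, the labels "dscp-server" and "dscp-client" stand for list items 12 and 13, and the rewrite action itself is assumed to match against the original URL because it produces the transformed one. The sketch orders groups by action only.

```python
from typing import List, Tuple

# Action names in the documented execution order (items 1 through 15).
ACTION_ORDER = [
    "no-auth", "reset", "block", "redirect", "rewrite",
    "refresh", "freshness-factor", "use-server", "no-proxy",
    "use-proxy-failover", "use-proxy", "dscp-server", "dscp-client",
    "no-cache", "selective-cache",
]
REWRITE_INDEX = ACTION_ORDER.index("rewrite")

def execution_order(entered: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Order (action, pattern) rules by action, ignoring the order of entry."""
    return sorted(entered, key=lambda rule: ACTION_ORDER.index(rule[0]))

def url_for_matching(action: str, original_url: str, rewritten_url: str) -> str:
    """Return the URL that an action's pattern match is assumed to use."""
    if ACTION_ORDER.index(action) <= REWRITE_INDEX:
        return original_url
    return rewritten_url

rules = [("no-cache", "url-regex"), ("block", "domain"), ("no-auth", "src-ip")]
print(execution_order(rules))
# [('no-auth', 'src-ip'), ('block', 'domain'), ('no-cache', 'url-regex')]
```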

Rule Pattern Execution Order

The order of rule pattern execution is as follows:

1. Dst-port—Destination port check.

2. Src-ip—Source IP address check.

3. URL-regex—URL regex check.

4. Domain—Domain rule check.

5. Dst-ip—Destination IP address check.

6. MIME-type—Mime-type regex check.


Note   Because the MIME type exists only in the response, only the actions freshness-factor, refresh, no-cache, and selective-cache apply to a rule of MIME type.

A search for a rule match with the remaining pattern will not be performed if a match has already been found. For instance, if a match for the rule block action is found with a URL-regex request, then the remaining patterns Domain, Dst-ip, or MIME-type are not searched.
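The first-match behavior described above can be sketched in Python. The sketch assumes that the rules of a single action are grouped by pattern type and that each group is represented by a callable returning True on a match; the names PATTERN_ORDER and find_match are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Pattern types in the documented execution order.
PATTERN_ORDER = ["dst-port", "src-ip", "url-regex", "domain", "dst-ip", "mime-type"]

def find_match(checks: Dict[str, Callable[[], bool]]) -> Tuple[Optional[str], List[str]]:
    """Evaluate pattern checks in order and stop at the first match.

    Returns the matching pattern type (or None) and the list of pattern
    types that were actually evaluated.
    """
    evaluated: List[str] = []
    for pattern in PATTERN_ORDER:
        check = checks.get(pattern)
        if check is None:
            continue  # no rule of this pattern type is configured
        evaluated.append(pattern)
        if check():
            return pattern, evaluated  # remaining patterns are not searched
    return None, evaluated

checks = {"domain": lambda: True, "url-regex": lambda: True, "dst-port": lambda: False}
print(find_match(checks))  # ('url-regex', ['dst-port', 'url-regex'])
```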

Rules are ORed together. Multiple rules may all match a request; then all actions are taken, with precedence among conflicting actions. Each rule contains one pattern; patterns cannot be ANDed together. In future releases, ANDed patterns may be supported.

It is possible to circumvent some rules. For example, to circumvent a rule with the domain pattern, enter the web server IP address instead of the domain name in the browser. A rule may have unintended effects. For instance, a rule with the domain pattern specified as "ibm" that is intended to match "www.ibm.com" can also match domain names like www.ribman.com.
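The unintended match described above is easy to reproduce. The Python sketch below uses the re module as a stand-in for the Content Engine regex engine, which is an assumption. The tighter pattern is a hypothetical alternative shown only to illustrate anchoring, not a recommendation from this guide.

```python
import re

def matches(pattern: str, domain: str) -> bool:
    """Return True if the regular expression is found anywhere in the domain."""
    return re.search(pattern, domain) is not None

loose = "ibm"
print(matches(loose, "www.ibm.com"))     # True (intended)
print(matches(loose, "www.ribman.com"))  # True (unintended)

# Hypothetical tighter pattern: "ibm.com" at the end of the name,
# preceded by a dot or by the start of the name.
tight = r"(^|\.)ibm\.com$"
print(matches(tight, "www.ibm.com"))     # True
print(matches(tight, "www.ribman.com"))  # False
```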

A src-ip rule may not apply as intended to requests that a Content Engine receives from another proxy or Content Engine, because the original client IP address is carried in an X-Forwarded-For header. The forwarding Content Engine replaces the source IP address of the original request with its own IP address when it sends the request to another proxy or Content Engine and then on to the origin server.

If a rule pattern match occurs, then the rest of the patterns are not searched. If the server has already marked an object as non-cacheable, no-cache rules are not checked at all, since the server already recognizes that this object is not cached. Any no-cache rule checks are performed only for cacheable requests.
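The interaction between server cacheability, no-cache rules, and selective-cache rules can be summarized in a short Python sketch. The function name should_cache and its parameters are hypothetical; passes_other_checks stands for the remaining caching restrictions, such as the object-size check.

```python
def should_cache(
    server_cacheable: bool,
    matches_no_cache: bool,
    selective_rules_exist: bool,
    matches_selective: bool,
    passes_other_checks: bool = True,
) -> bool:
    """Decide whether an object is cached, following the rules in this chapter."""
    # Objects the server marks non-cacheable are never cached;
    # no-cache rules are not even checked for them.
    if not server_cacheable:
        return False
    # no-cache takes precedence over selective-cache.
    if matches_no_cache:
        return False
    # If any selective-cache rule is configured, the object must match one.
    if selective_rules_exist and not matches_selective:
        return False
    # Every other caching restriction must still pass.
    return passes_other_checks

print(should_cache(True, True, True, True))     # False
print(should_cache(True, False, True, False))   # False
print(should_cache(True, False, True, True))    # True
print(should_cache(True, False, False, False))  # True
```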

Order of Execution Among Rules of Same Action and Same Pattern

Among the rules of the same action and the same pattern, the order of execution of rules is in the reverse order in which the rules are entered. For instance, if the use-proxy commands are entered in the following order:

use-proxy 1.2.3.4 abc.abc.com

use-proxy 2.3.4.5 *.abc.com

then a request to abc.abc.com is sent to proxy 2.3.4.5 because the use-proxy 2.3.4.5 *.abc.com command is entered last and evaluated first. However, if the same commands are entered in a reverse order as follows:

use-proxy 2.3.4.5 *.abc.com

use-proxy 1.2.3.4 abc.abc.com

then a request to abc.abc.com is sent to proxy 1.2.3.4, as the use-proxy 1.2.3.4 abc.abc.com command is entered last and evaluated first.
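The reverse-entry evaluation described in this section can be sketched in Python. The function name pick_proxy is hypothetical, and the regular expressions are assumed translations of the abc.abc.com and *.abc.com patterns used in the commands above.

```python
import re
from typing import List, Optional, Tuple

def pick_proxy(entered_rules: List[Tuple[str, str]], host: str) -> Optional[str]:
    """Return the proxy of the most recently entered rule that matches host."""
    # Rules of the same action and pattern are evaluated last-entered first.
    for proxy, pattern in reversed(entered_rules):
        if re.search(pattern, host):
            return proxy
    return None

exact = ("1.2.3.4", r"^abc\.abc\.com$")    # stands for: use-proxy 1.2.3.4 abc.abc.com
wildcard = ("2.3.4.5", r".*\.abc\.com$")   # stands for: use-proxy 2.3.4.5 *.abc.com

print(pick_proxy([exact, wildcard], "abc.abc.com"))  # 2.3.4.5
print(pick_proxy([wildcard, exact], "abc.abc.com"))  # 1.2.3.4
```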

Examples

In the following example the rule block command (action) blocks all domains that contain .foo.com in the URL request using the domain \.foo.com pattern.

ContentEngine(config)# rule block domain \.foo.com ?

 LINE    <cr>

 

Multiple patterns can be entered on the same line. If any of them matches the incoming HTTP request, the corresponding action is taken. In the following example the rule block command (action) blocks all domains that contain .foo.com or bar.com in the URL request.

ContentEngine(config)# rule block domain \.foo.com bar.com

ContentEngine(config)#
 

The following example prevents caching of requests that match a URL request which contains the *cgi-bin* string.

ContentEngine(config)# rule no-cache url-regex \.*cgi-bin.*

ContentEngine(config)#
 

Most actions do not have any parameters, as in the preceding examples. Exceptions to this are use-server, freshness-factor, and use-proxy, as in the following example.

ContentEngine(config)# rule use-proxy CE.foo.com 8080 url-regex .*\.jpg$ .*\.gif$ .*\.pdf$ 

ContentEngine(config)#
 

To delete rules, use no in front of the rule creation command.

ContentEngine(config)# no rule block url-regex .*\.jpg$ .*\.gif$ .*\.pdf$

 

The following example sets the freshness factor for MIME-type images.

ContentEngine(config)# rule freshness-factor 75 mime-type image/.* 

 

The following example redirects a request for old-domain-name, which has been changed to new-domain-name.

ContentEngine(config)# rule redirect url-regsub http://old-domain-name/   
http://new-domain-name/ 

 

The following example redirects requests from an IETF site to one that is locally mirrored:

ContentEngine(config)# rule redirect url-regsub http://www.ietf.org/rfc/(.*)   
http://wwwin-eng.cisco.com/RFC/RFC/\1   

 

For the preceding example, if the request URL is http://www.ietf.org/rfc/rfc1111.txt, the Content Engine rewrites the URL as http://wwwin-eng.cisco.com/RFC/RFC/rfc1111.txt and sends a 302 Temporary Redirect response with the rewritten URL in the Location header to the client. The browser automatically initiates a request to the rewritten URL.
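The substitution in the preceding example can be reproduced with a short Python sketch. It uses Python's re module as a stand-in for the Content Engine regex engine, which is an assumption, and the function name regsub is hypothetical. The sketch also assumes that the new URL is built entirely from the replacement string, with \1 replaced by the first captured group.

```python
import re
from typing import Optional

def regsub(url: str, pattern: str, replacement: str) -> Optional[str]:
    """Return the new URL if pattern matches url, or None on a rule miss."""
    # The url-regsub match is case insensitive.
    match = re.search(pattern, url, re.IGNORECASE)
    if match is None:
        return None
    # expand() substitutes \1 through \9 with the captured groups.
    return match.expand(replacement)

new_url = regsub(
    "http://www.ietf.org/rfc/rfc1111.txt",
    r"http://www.ietf.org/rfc/(.*)",
    r"http://wwwin-eng.cisco.com/RFC/RFC/\1",
)
print(new_url)  # http://wwwin-eng.cisco.com/RFC/RFC/rfc1111.txt
```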

The following example redirects all requests for linux.org to a local server in India that is closer to where the Content Engine is located:

ContentEngine(config)# rule redirect url-regsub http://linux.org/(.*) 
http://linux.org.in/\1 

 

The following example rewrites requests from an IETF site to one that is locally mirrored:

ContentEngine(config)# rule rewrite url-regsub http://www.ietf.org/rfc/(.*) 
http://wwwin-eng.cisco.com/RFC/\1 
 

The following example replaces the string internal.domain.com in the Referer header of the request with dummy when forwarding the request to the server.

ContentEngine(config)# rule rewrite header-field referer internal.domain.com dummy

ContentEngine(config)# 
 

If an empty string is given as a replacement pattern, then the Referer header is stripped. The same usage applies to the user-agent pattern.