Cache Key Manipulation Plugin

Description

This plugin allows some common cache key manipulations based on various HTTP request components. It can:

  • sort query parameters so that reordered parameters do not cause a cache miss
  • exclude specific query parameters from the cache key by name or regular expression
  • exclude all query parameters from the cache key
  • use only specific query parameters in the cache key by name or regular expression
  • include headers or cookies by name
  • capture values from the User-Agent header
  • classify requests using the User-Agent header and a list of regular expressions
  • capture and replace strings from the URI and include them in the cache key
  • do more; see the examples below.

URI type

The plugin manipulates the remap URI (value set during URI remap) by default. If manipulation needs to be based on the pristine URI (the value before URI remapping takes place) one could use the following option:

  • --uri-type=[remap|pristine] (default: remap)

Detailed examples and troubleshooting

             |                           hierarchical part                                    query
HTTP request | ┌────────────────────────────────┴─────────────────────────────────────────┐┌────┴─────┐
components   |   URI host and port       HTTP headers and cookies               URI path    URI query
             | ┌────────┴────────┐┌────────────────┴─────────────────────────┐┌─────┴─────┐┌────┴─────┐
Sample 1     | /www.example.com/80/popular/Mozilla/5.0/H1:v1/H2:v2/C1=v1;C2=v2/path/to/data?a=1&b=2&c=3
Sample 2     | /nice_custom_prefix/popular/Mozilla/5.0/H1:v1/H2:v2/C1=v1;C2=v2/path/to/data?a=1&b=2&c=3
             | └────────┬────────┘└───┬──┘└─────┬────┘└────┬─────┘└─────┬────┘└─────┬─────┘└────┬─────┘
Cache Key    |     host:port or   UA-class UA-captures   headers     cookies       path       query
components   |     custom prefix           replacement

The following is an example of how the above sample keys were generated (Sample 1 and Sample 2).

Traffic server configuration

$ cat etc/trafficserver/remap.config
map http://www.example.com http://www.origin.com \
    @plugin=cachekey.so \
        @pparam=--ua-whitelist=popular:popular_agents.config \
        @pparam=--ua-capture=(Mozilla\/[^\s]*).* \
        @pparam=--include-headers=H1,H2 \
        @pparam=--include-cookies=C1,C2 \
        @pparam=--include-params=a,b,c \
        @pparam=--sort-params=true

$ cat etc/trafficserver/popular_agents.config
^Mozilla.*
^Twitter.*
^Facebo.*

$ cat etc/trafficserver/plugin.config
xdebug.so

HTTP request

$ curl 'http://www.example.com/path/to/data?c=3&a=1&b=2&x=1&y=2&z=3' \
    -v -x 127.0.0.1:8080 -o /dev/null -s \
    -H "H1: v1" \
    -H "H2: v2" \
    -H "Cookie: C1=v1; C2=v2" \
    -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/7046A194A' \
    -H 'X-Debug: X-Cache-Key'
* About to connect() to proxy 127.0.0.1 port 8080 (#0)
*   Trying 127.0.0.1... connected
* Connected to 127.0.0.1 (127.0.0.1) port 8080 (#0)
> GET http://www.example.com/path/to/data?c=3&a=1&b=2&x=1&y=2&z=3 HTTP/1.1
> Host: www.example.com
> Accept: */*
> Proxy-Connection: Keep-Alive
> H1: v1
> H2: v2
> Cookie: C1=v1; C2=v2
> User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/7046A194A
> X-Debug: X-Cache-Key
>
< HTTP/1.1 200 OK
< Server: ATS/6.1.0
< Date: Thu, 19 Nov 2015 23:17:58 GMT
< Content-type: application/json
< Age: 0
< Transfer-Encoding: chunked
< Proxy-Connection: keep-alive
< X-Cache-Key: /www.example.com/80/popular/Mozilla/5.0/H1:v1/H2:v2/C1=v1;C2=v2/path/to/data?a=1&b=2&c=3
<
{ [data not shown]
* Connection #0 to host 127.0.0.1 left intact
* Closing connection #0

The X-Cache-Key response header contains the cache key:

/www.example.com/80/popular/Mozilla/5.0/H1:v1/H2:v2/C1=v1;C2=v2/path/to/data?a=1&b=2&c=3

The xdebug.so plugin and X-Debug request header are used just to demonstrate basic cache key troubleshooting.

If we add --static-prefix=nice_custom_prefix to the remap rule then the cache key would look like the following:

/nice_custom_prefix/popular/Mozilla/5.0/H1:v1/H2:v2/C1=v1;C2=v2/path/to/data?a=1&b=2&c=3
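
The layout of the two sample keys can be modelled in a few lines of Python. This is only an illustrative sketch, not the plugin's code: the function build_cache_key and its arguments are hypothetical, and the order and formatting of the elements are inferred from Sample 1 and Sample 2 above.

```python
def build_cache_key(prefix, ua_class, ua_captures, headers, cookies,
                    path, query, separator="/"):
    """Assemble a cache key shaped like the samples above (hypothetical model)."""
    elements = list(prefix)                    # host and port, or a custom prefix
    if ua_class:
        elements.append(ua_class)              # User-Agent class name
    elements.extend(ua_captures)               # values captured from User-Agent
    elements.extend(f"{name}:{value}" for name, value in headers)
    if cookies:
        elements.append(";".join(f"{name}={value}" for name, value in cookies))
    elements.append(path.lstrip("/"))          # URI path
    key = "".join(separator + element for element in elements)
    return key + ("?" + query if query else "")


# Components taken from the request shown above.
common = dict(
    ua_class="popular",
    ua_captures=["Mozilla/5.0"],
    headers=[("H1", "v1"), ("H2", "v2")],
    cookies=[("C1", "v1"), ("C2", "v2")],
    path="/path/to/data",
    query="a=1&b=2&c=3",
)
sample_1 = build_cache_key(["www.example.com", "80"], **common)
sample_2 = build_cache_key(["nice_custom_prefix"], **common)
print(sample_1)
print(sample_2)
```

The only difference between the two calls is the prefix element, which mirrors the effect of adding --static-prefix to the remap rule.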

Usage examples

URI query parameters

Ignore the query string (all query parameters)

The following added to the remap rule will ignore the query, removing it from the cache key.

@plugin=cachekey.so @pparam=--remove-all-params=true

Cache key normalization by sorting the query parameters

The following will normalize the cache key by sorting the query parameters.

@plugin=cachekey.so @pparam=--sort-params=true

If the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1 the cache key will use a=1&b=2&c=1&k=1&u=1&x=1&y=1
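
The normalization can be sketched in Python as follows. The helper sort_params is hypothetical, and sorting the whole name=value strings lexicographically is an assumption that happens to reproduce the example above; the plugin's exact ordering rules may differ.

```python
def sort_params(query):
    """Sort the name=value pairs of a query string (illustrative only)."""
    return "&".join(sorted(query.split("&")))


sorted_query = sort_params("c=1&a=1&b=2&x=1&k=1&u=1&y=1")
print(sorted_query)

# Two requests that differ only in parameter order map to the same string.
same = sort_params("b=2&a=1") == sort_params("a=1&b=2")
print(same)
```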

Ignore (exclude) certain query parameters

The following will make sure query parameters a and b will not be used when constructing the cache key.

@plugin=cachekey.so @pparam=--exclude-params=a,b

If the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1 the cache key will use c=1&x=1&k=1&u=1&y=1

Ignore (exclude) certain query parameters from the cache key by using regular expression (PCRE)

The following will make sure query parameters a and b will not be used when constructing the cache key.

@plugin=cachekey.so @pparam=--exclude-match-params=(a|b)

If the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1 the cache key will use c=1&x=1&k=1&u=1&y=1

Include only certain query parameters

The following will make sure only query parameters a and c will be used when constructing the cache key and the rest will be ignored.

@plugin=cachekey.so @pparam=--include-params=a,c

If the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1 the cache key will use c=1&a=1

Include only certain query parameters by using regular expression (PCRE)

The following will make sure only query parameters a and c will be used when constructing the cache key and the rest will be ignored.

@plugin=cachekey.so @pparam=--include-match-params=(a|c)

If the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1 the cache key will use c=1&a=1

White-list + black-list certain parameters using multiple parameters in the same remap rule.

If the plugin is used with the following plugin parameters in the remap rule:

@plugin=cachekey.so \
    @pparam=--exclude-params=x \
    @pparam=--exclude-params=y \
    @pparam=--exclude-params=z \
    @pparam=--include-params=y,c \
    @pparam=--include-params=x,b

and if the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1 the cache key will use c=1&b=2

White-list + black-list certain parameters using multiple parameters in the same remap rule and regular expressions (PCRE).

If the plugin is used with the following plugin parameters in the remap rule:

@plugin=cachekey.so \
    @pparam=--exclude-match-params=x \
    @pparam=--exclude-match-params=y \
    @pparam=--exclude-match-params=z \
    @pparam=--include-match-params=(y|c) \
    @pparam=--include-match-params=(x|b)

and if the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1 the cache key will use c=1&b=2

Mixing --include-params, --exclude-params, --include-match-params and --exclude-match-params

If the plugin is used with the following plugin parameters in the remap rule:

@plugin=cachekey.so \
    @pparam=--exclude-params=x \
    @pparam=--exclude-match-params=y \
    @pparam=--exclude-match-params=z \
    @pparam=--include-params=y,c \
    @pparam=--include-match-params=(x|b)

and if the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1 the cache key will use c=1&b=2
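
The way the include and exclude options combine in the examples of this section can be modelled in Python. This is a sketch under stated assumptions, not the plugin's implementation: the function filter_params is hypothetical, a parameter is assumed to be kept when it is included (or no include rule is given) and not excluded, and Python's re.search stands in for unanchored PCRE matching. Parameter values are carried over from the query string unchanged.

```python
import re


def filter_params(query, include=(), exclude=(),
                  include_match=(), exclude_match=()):
    """Keep the query parameters that are included and not excluded."""
    kept = []
    for pair in query.split("&"):
        name = pair.split("=", 1)[0]
        # With no include rule at all, every parameter starts out included.
        included = (
            (not include and not include_match)
            or name in include
            or any(re.search(p, name) for p in include_match)
        )
        excluded = (
            name in exclude
            or any(re.search(p, name) for p in exclude_match)
        )
        if included and not excluded:
            kept.append(pair)
    return "&".join(kept)


QUERY = "c=1&a=1&b=2&x=1&k=1&u=1&y=1"
print(filter_params(QUERY, exclude=["a", "b"]))
print(filter_params(QUERY, include=["a", "c"]))
print(filter_params(QUERY, exclude=["x"], exclude_match=["y", "z"],
                    include=["y", "c"], include_match=["(x|b)"]))
```

In this model x and y are dropped even though they also appear in an include rule, which is consistent with the mixed examples above.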

HTTP Headers

Include certain headers in the cache key

The following headers HeaderA and HeaderB will be used when constructing the cache key and the rest will be ignored.

@plugin=cachekey.so @pparam=--include-headers=HeaderA,HeaderB

HTTP Cookies

Include certain cookies in the cache key

The following cookies CookieA and CookieB will be used when constructing the cache key and the rest will be ignored.

@plugin=cachekey.so @pparam=--include-cookies=CookieA,CookieB

Prefix (host, port, capture and replace from URI)

Replacing host:port with a static cache key prefix

If the plugin is used with the following plugin parameter in the remap rule:

@plugin=cachekey.so @pparam=--static-prefix=static_prefix

the cache key will be prefixed with /static_prefix instead of the host:port that is used when --static-prefix is not set.

Capturing from the host:port and adding it to the prefix section

If the plugin is used with the following plugin parameter in the remap rule:

@plugin=cachekey.so \
    @pparam=--capture-prefix=(test_prefix).*:([^\s\/$]*)

the cache key will be prefixed with /test_prefix/80 instead of the test_prefix_371.example.com:80 that is used when --capture-prefix is not set.
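
The capture step can be tried out in Python before editing the remap rule. This is a sketch only: the helper capture_groups is hypothetical, and Python's re module is used as a stand-in for PCRE, which behaves the same way for this particular pattern.

```python
import re


def capture_groups(pattern, subject):
    """Return the groups captured by the first match, or an empty list."""
    match = re.search(pattern, subject)
    return list(match.groups()) if match else []


groups = capture_groups(r"(test_prefix).*:([^\s\/$]*)",
                        "test_prefix_371.example.com:80")
# Each captured group becomes one element of the prefix section.
prefix = "".join("/" + group for group in groups)
print(groups)
print(prefix)
```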

Capturing from the entire URI and adding it to the prefix section

If the plugin is used with the following plugin parameter in the remap rule:

@plugin=cachekey.so \
    @pparam=--capture-prefix-uri=/(test_prefix).*:.*(object).*$/$1_$2/

and if the request URI is the following:

http://test_prefix_123.example.com/path/to/object?a=1&b=2&c=3

then the cache key will be prefixed with /test_prefix_object instead of the test_prefix_123.example.com:80 that is used when --capture-prefix-uri is not set.

Combining prefix plugin parameters, i.e. --static-prefix and --capture-prefix

If the plugin is used with the following plugin parameters in the remap rule:

@plugin=cachekey.so \
    @pparam=--capture-prefix=(test_prefix).*:([^\s\/$]*) \
    @pparam=--static-prefix=static_prefix

the cache key will be prefixed with /static_prefix/test_prefix/80 instead of the test_prefix_371.example.com:80 that is used when neither --capture-prefix nor --static-prefix is set.

Path, capture and replace from the path or entire URI

Capture and replace groups from path for the “Path” section

If the plugin is used with the following plugin parameter in the remap rule:

@plugin=cachekey.so \
    @pparam=--capture-path=/.*(object).*/const_path_$1/

and the request URI is the following:

http://test_path_123.example.com/path/to/object?a=1&b=2&c=3

then the cache key will have /const_path_object in the path section of the cache key instead of the /path/to/object that is used when neither --capture-path nor --capture-path-uri is set.

Capture and replace groups from whole URI for the “Path” section

If the plugin is used with the following plugin parameter in the remap rule:

@plugin=cachekey.so \
    @pparam=--capture-path-uri=/(test_path).*(object).*/$1_$2/

and the request URI is the following:

http://test_path_123.example.com/path/to/object?a=1&b=2&c=3

then the cache key will have /test_path_object in the path section of the cache key instead of the /path/to/object that is used when neither --capture-path nor --capture-path-uri is set.

Combining path plugin parameters --capture-path and --capture-path-uri

If the plugin is used with the following plugin parameters in the remap rule:

@plugin=cachekey.so \
    @pparam=--capture-path=/.*(object).*/const_path_$1/ \
    @pparam=--capture-path-uri=/(test_path).*(object).*/$1_$2/

and the request URI is the following:

http://test_path_123.example.com/path/to/object?a=1&b=2&c=3

then the cache key will have /test_path_object/const_path_object in the path section of the cache key instead of the /path/to/object that is used when neither --capture-path nor --capture-path-uri is set.

User-Agent capturing, replacement and classification

Let us say we have a request with User-Agent header:

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3)
AppleWebKit/537.75.14 (KHTML, like Gecko)
Version/7.0.3 Safari/7046A194A

Capture PCRE groups from User-Agent header

If the plugin is used with the following plugin parameter:

@plugin=cachekey.so \
    @pparam=--ua-capture=(Mozilla\/[^\s]*).*(AppleWebKit\/[^\s]*)

then Mozilla/5.0 and AppleWebKit/537.75.14 will be used when constructing the key.

Capture and replace groups from User-Agent header

If the plugin is used with the following plugin parameter:

@plugin=cachekey.so \
    @pparam=--ua-capture=/(Mozilla\/[^\s]*).*(AppleWebKit\/[^\s]*)/$1_$2/

then Mozilla/5.0_AppleWebKit/537.75.14 will be used when constructing the key.
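
The capture-and-replace form can be checked offline with a short Python sketch. The helper capture_and_replace is hypothetical and takes the regular expression and the replacement as two separate arguments instead of parsing the /regex/replacement/ string; only single-digit references such as $0, $1 and $2 are handled, and Python's re module stands in for PCRE.

```python
import re

# The User-Agent value from the example above, joined into one header line.
USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) "
    "AppleWebKit/537.75.14 (KHTML, like Gecko) "
    "Version/7.0.3 Safari/7046A194A"
)


def capture_and_replace(pattern, replacement, subject):
    """Expand $N references in replacement using the first match of pattern."""
    match = re.search(pattern, subject)
    if match is None:
        return None
    return re.sub(r"\$(\d)",
                  lambda ref: match.group(int(ref.group(1))) or "",
                  replacement)


ua_element = capture_and_replace(
    r"(Mozilla\/[^\s]*).*(AppleWebKit\/[^\s]*)", "$1_$2", USER_AGENT)
print(ua_element)
```

The same helper can be used to try out the replacement strings of the path and prefix options, since they use the same $N notation.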

User-Agent white-list classifier

If the plugin is used with the following plugin parameter:

@plugin=cachekey.so \
    @pparam=--ua-whitelist=browser:browser_agents.config

and if browser_agents.config contains:

^Mozilla.*
^Twitter.*
^Facebo.*

then browser will be used when constructing the key.

User-Agent black-list classifier

If the plugin is used with the following plugin parameter:

@plugin=cachekey.so \
    @pparam=--ua-blacklist=browser:tool_agents.config

and if tool_agents.config contains:

^PHP.*
^Python.*
^curl.*

then browser will be used when constructing the key.
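
The two classifiers can be contrasted with a small Python model. It is a sketch under assumptions rather than the plugin's logic: the function classify is hypothetical, a white-list is assumed to apply the class name when any pattern matches, a black-list is assumed to apply it when no pattern matches, and a request that does not qualify is assumed to get no class element at all.

```python
import re


def classify(user_agent, class_name, patterns, blacklist=False):
    """Return class_name if the User-Agent qualifies for it, else None."""
    matched = any(re.search(pattern, user_agent) for pattern in patterns)
    if blacklist:
        matched = not matched      # black-list: qualify when nothing matches
    return class_name if matched else None


BROWSER_AGENTS = ["^Mozilla.*", "^Twitter.*", "^Facebo.*"]   # browser_agents.config
TOOL_AGENTS = ["^PHP.*", "^Python.*", "^curl.*"]             # tool_agents.config
UA = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3)"

print(classify(UA, "browser", BROWSER_AGENTS))
print(classify(UA, "browser", TOOL_AGENTS, blacklist=True))
print(classify("curl/7.35.0", "browser", TOOL_AGENTS, blacklist=True))
```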

Cacheurl plugin to cachekey plugin migration

The cachekey plugin was not meant to replace the cacheurl plugin in terms of generating exactly the same cache key strings. It just allows the operator to extract elements from the HTTP URI in the same way cacheurl does (through a regular expression, please see <capture_definition> above).

The following examples demonstrate different ways to achieve cacheurl compatibility on a cache key string level in order to avoid invalidation of the cache.

The operator could use --capture-path-uri, --capture-path, --capture-prefix-uri and --capture-prefix to capture elements from the URI, path and authority elements.

By using --separator=<string> the operator could override the default separator, for example with an empty string (--separator=), and thus make sure there are no cache key element separators.

Example 1: Suppose a capture definition is already used in cacheurl. With --capture-prefix-uri one could extract elements through the same capture definition, remove the cache key element separator with --separator=, remove the URI path with --capture-path-uri and remove the query string with --remove-all-params=true:

@plugin=cachekey.so \
    @pparam=--capture-prefix-uri=/.*/$0/ \
    @pparam=--capture-path-uri=/.*// \
    @pparam=--remove-all-params=true \
    @pparam=--separator=

Example 2: A more efficient way is to use --capture-prefix-uri to capture from the URI, --separator= to remove the cache key element separator, --remove-path to remove the URI path and --remove-all-params=true to remove the query string:

@plugin=cachekey.so \
    @pparam=--capture-prefix-uri=/.*/$0/ \
    @pparam=--remove-path=true \
    @pparam=--remove-all-params=true \
    @pparam=--separator=

Example 3: The same result as above, but this time using --capture-path-uri to capture from the URI, --separator= to remove the cache key element separator, --remove-prefix to remove the URI authority elements and --remove-all-params=true to remove the query string:

@plugin=cachekey.so \
    @pparam=--capture-path-uri=/(.*)/$0/ \
    @pparam=--remove-prefix=true \
    @pparam=--remove-all-params=true \
    @pparam=--separator=

Example 4: Suppose we would like to capture from the URI in a way similar to cacheurl but also sort the query parameters (which cacheurl does not support). We could achieve that by using --capture-prefix-uri with a capture definition that processes the URI before the ?, --remove-path to remove the URI path and --sort-params=true to sort the query parameters:

@plugin=cachekey.so \
    @pparam=--capture-prefix-uri=/([^?]*)/$1/ \
    @pparam=--remove-path=true \
    @pparam=--sort-params=true \
    @pparam=--separator=