Cache Key and Parent Selection URL Manipulation Plugin

Description

This plugin performs common cache key or parent selection URL manipulations based on various HTTP request components. Although this document refers to the cache key throughout, the same manipulations can be applied to the parent selection URL by switching the key type. The plugin can:

  • sort query parameters so that reordered query parameters do not cause a cache miss

  • ignore specific query parameters from the cache key by name or regular expression

  • ignore all query parameters from the cache key

  • only use specific query parameters in the cache key by name or regular expression

  • include headers or cookies by name

  • capture values from the User-Agent header

  • classify requests using the User-Agent header and a list of regular expressions

  • capture and replace strings from the URI and include them in the cache key

  • do more - please find more examples below.

URI type

The plugin manipulates the remap URI (value set during URI remap) by default. If manipulation needs to be based on the pristine URI (the value before URI remapping takes place) one could use the following option:

  • --uri-type=[remap|pristine] (default: remap)

Key type

The plugin manipulates the cache key by default. If parent selection URL manipulation is needed the following option can be used:

  • --key-type=<list of target types> (default: cache_key) - a comma-separated list of cache_key and/or parent_selection_url. If multiple --key-type options are specified, all values are combined.

An instance of this plugin can be used for applying manipulations to cache key, parent selection URL or both depending on the need. See simultaneous cache key and parent selection URL manipulation for examples of how to apply the same set of manipulations to both targets with a single plugin instance or applying different sets of manipulations to each target using separate plugin instances.

How to run the plugin

The plugin can run as a global plugin (a single global instance configured using plugin.config) or as a per-remap plugin (a separate instance configured per remap rule in remap.config).

Global instance

$ cat plugin.config
cachekey.so \
    --include-params=a,b,c \
    --sort-params=true

Per-remap instance

$ cat remap.config
map http://www.example.com http://www.origin.com \
    @plugin=cachekey.so \
        @pparam=--include-params=a,b,c \
        @pparam=--sort-params=true

If both global and per-remap instances are used, the per-remap configuration takes precedence (the per-remap configuration is applied and the global configuration is ignored).

Because of the ATS core (remap) and the CacheKey plugin implementation there is a slight difference between the global and the per-remap functionality when --uri-type=remap is used.

  • The global instance always uses the URI after remap (at TS_HTTP_POST_REMAP_HOOK).

  • The per-remap instance uses the URI during remap (after TS_HTTP_PRE_REMAP_HOOK and before TS_HTTP_POST_REMAP_HOOK), which means a different URI may be used depending on the plugin order in the remap rule.

    • If the CacheKey plugin is the first plugin in the remap rule the URI used will be practically the same as the pristine URI.

    • If the CacheKey plugin is the last plugin in the remap rule (which is right before TS_HTTP_POST_REMAP_HOOK) the behavior will be similar to the global instance.

Detailed examples and troubleshooting

             |                           hierarchical part                                    query
HTTP request | ┌────────────────────────────────┴─────────────────────────────────────────┐┌────┴─────┐
components   |   URI host and port       HTTP headers and cookies               URI path    URI query
             | ┌────────┴────────┐┌────────────────┴─────────────────────────┐┌─────┴─────┐┌────┴─────┐
Sample 1     | /www.example.com/80/popular/Mozilla/5.0/H1:v1/H2:v2/C1=v1;C2=v2/path/to/data?a=1&b=2&c=3
Sample 2     | /nice_custom_prefix/popular/Mozilla/5.0/H1:v1/H2:v2/C1=v1;C2=v2/path/to/data?a=1&b=2&c=3
             | └────────┬────────┘└───┬──┘└─────┬────┘└────┬─────┘└─────┬────┘└─────┬─────┘└────┬─────┘
Cache Key    |     host:port or   UA-class UA-captures   headers     cookies       path       query
components   |     custom prefix           replacement

The following is an example of how the above sample keys were generated (Sample 1 and Sample 2).

Traffic Server configuration

$ cat etc/trafficserver/remap.config
map http://www.example.com http://www.origin.com \
    @plugin=cachekey.so \
        @pparam=--ua-allowlist=popular:popular_agents.config \
        @pparam=--ua-capture=(Mozilla\/[^\s]*).* \
        @pparam=--include-headers=H1,H2 \
        @pparam=--include-cookies=C1,C2 \
        @pparam=--include-params=a,b,c \
        @pparam=--sort-params=true

$ cat etc/trafficserver/popular_agents.config
^Mozilla.*
^Twitter.*
^Facebo.*

$ cat etc/trafficserver/plugin.config
xdebug.so

HTTP request

$ curl 'http://www.example.com/path/to/data?c=3&a=1&b=2&x=1&y=2&z=3' \
    -v -x 127.0.0.1:8080 -o /dev/null -s \
    -H "H1: v1" \
    -H "H2: v2" \
    -H "Cookie: C1=v1; C2=v2" \
    -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/7046A194A' \
    -H 'X-Debug: X-Cache-Key'
* About to connect() to proxy 127.0.0.1 port 8080 (#0)
*   Trying 127.0.0.1... connected
* Connected to 127.0.0.1 (127.0.0.1) port 8080 (#0)
> GET http://www.example.com/path/to/data?c=3&a=1&b=2&x=1&y=2&z=3 HTTP/1.1
> Host: www.example.com
> Accept: */*
> Proxy-Connection: Keep-Alive
> H1: v1
> H2: v2
> Cookie: C1=v1; C2=v2
> User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/7046A194A
> X-Debug: X-Cache-Key
>
< HTTP/1.1 200 OK
< Server: ATS/6.1.0
< Date: Thu, 19 Nov 2015 23:17:58 GMT
< Content-type: application/json
< Age: 0
< Transfer-Encoding: chunked
< Proxy-Connection: keep-alive
< X-Cache-Key: /www.example.com/80/popular/Mozilla/5.0/H1:v1/H2:v2/C1=v1;C2=v2/path/to/data?a=1&b=2&c=3
<
{ [data not shown]
* Connection #0 to host 127.0.0.1 left intact
* Closing connection #0

The X-Cache-Key response header contains the cache key:

/www.example.com/80/popular/Mozilla/5.0/H1:v1/H2:v2/C1=v1;C2=v2/path/to/data?a=1&b=2&c=3

The xdebug.so plugin and X-Debug request header are used just to demonstrate basic cache key troubleshooting.

If we add --static-prefix=nice_custom_prefix to the remap rule then the cache key would look like the following:

/nice_custom_prefix/popular/Mozilla/5.0/H1:v1/H2:v2/C1=v1;C2=v2/path/to/data?a=1&b=2&c=3
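The layout of the two sample keys can be reproduced with a short Python sketch. This is only an illustration of the component order shown in the diagram, not the plugin's implementation: the function name build_cache_key and its arguments are hypothetical, and "/" is assumed to be the default element separator because that is what the sample keys show.

```python
def build_cache_key(prefix, ua_class, ua_captures, headers, cookies, path, query,
                    separator="/"):
    """Join cache key components in the order shown in the diagram (sketch only)."""
    parts = list(prefix)                       # host and port, or a custom prefix
    if ua_class:
        parts.append(ua_class)                 # User-Agent classification
    parts.extend(ua_captures)                  # User-Agent captures
    parts.extend(f"{name}:{value}" for name, value in headers)
    if cookies:
        parts.append(";".join(f"{name}={value}" for name, value in cookies))
    parts.append(path)
    key = "".join(separator + part for part in parts)
    if query:
        key += "?" + query                     # the query follows the path directly
    return key


common = dict(
    ua_class="popular",
    ua_captures=["Mozilla/5.0"],
    headers=[("H1", "v1"), ("H2", "v2")],
    cookies=[("C1", "v1"), ("C2", "v2")],
    path="path/to/data",
    query="a=1&b=2&c=3",
)

sample_1 = build_cache_key(prefix=["www.example.com", "80"], **common)
sample_2 = build_cache_key(prefix=["nice_custom_prefix"], **common)
print(sample_1)
print(sample_2)
```

The two printed strings correspond to Sample 1 and Sample 2 in the diagram; only the prefix component differs between them.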

Usage examples

URI query parameters

Ignore the query string (all query parameters)

The following added to the remap rule will ignore the query, removing it from the cache key.

@plugin=cachekey.so @pparam=--remove-all-params=true

Cache key normalization by sorting the query parameters

The following will normalize the cache key by sorting the query parameters.

@plugin=cachekey.so @pparam=--sort-params=true

If the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1 the cache key will use a=1&b=2&c=1&k=1&u=1&x=1&y=1
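The effect of sorting can be illustrated with a minimal Python sketch. It assumes the parameters are ordered as plain name=value strings, which is consistent with the example above; the helper name sort_query is hypothetical and the plugin's actual ordering rules may differ in edge cases.

```python
def sort_query(query):
    """Return the query string with its name=value pairs in sorted order."""
    return "&".join(sorted(query.split("&")))


# Two requests that differ only in parameter order map to the same string.
first = sort_query("c=1&a=1&b=2&x=1&k=1&u=1&y=1")
second = sort_query("y=1&u=1&k=1&x=1&b=2&a=1&c=1")
print(first)
print(first == second)
```

Because both orderings normalize to the same string, they would share one cache entry instead of producing two.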

Ignore (exclude) certain query parameters

The following will make sure query parameters a and b will not be used when constructing the cache key.

@plugin=cachekey.so @pparam=--exclude-params=a,b

If the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1 the cache key will use c=1&x=1&k=1&u=1&y=1

Ignore (exclude) certain query parameters from the cache key by using regular expression (PCRE)

The following will make sure query parameters a and b will not be used when constructing the cache key.

@plugin=cachekey.so @pparam=--exclude-match-params=(a|b)

If the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1 the cache key will use c=1&x=1&k=1&u=1&y=1

Include only certain query parameters

The following will make sure only query parameters a and c will be used when constructing the cache key and the rest will be ignored.

@plugin=cachekey.so @pparam=--include-params=a,c

If the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1 the cache key will use c=1&a=1

Include only certain query parameters by using regular expression (PCRE)

The following will make sure only query parameters a and c will be used when constructing the cache key and the rest will be ignored.

@plugin=cachekey.so @pparam=--include-match-params=(a|c)

If the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1 the cache key will use c=1&a=1

Include and exclude certain parameters using multiple parameters in the same remap rule.

If the plugin is used with the following plugin parameters in the remap rule:

@plugin=cachekey.so \
    @pparam=--exclude-params=x \
    @pparam=--exclude-params=y \
    @pparam=--exclude-params=z \
    @pparam=--include-params=y,c \
    @pparam=--include-params=x,b

and if the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1 the cache key will use c=1&b=2

Include and exclude certain parameters using multiple parameters in the same remap rule and regular expressions (PCRE).

If the plugin is used with the following plugin parameters in the remap rule:

@plugin=cachekey.so \
    @pparam=--exclude-match-params=x \
    @pparam=--exclude-match-params=y \
    @pparam=--exclude-match-params=z \
    @pparam=--include-match-params=(y|c) \
    @pparam=--include-match-params=(x|b)

and if the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1 the cache key will use c=1&b=2

Mixing --include-params, --exclude-params, --include-match-params and --exclude-match-params

If the plugin is used with the following plugin parameters in the remap rule:

@plugin=cachekey.so \
    @pparam=--exclude-params=x \
    @pparam=--exclude-match-params=y \
    @pparam=--exclude-match-params=z \
    @pparam=--include-params=y,c \
    @pparam=--include-match-params=(x|b)

and if the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1 the cache key will use c=1&b=2
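The include and exclude examples in this section are all consistent with one rule: a parameter is kept when it passes the include rules (or no include rules are given) and is not hit by any exclude rule. The Python sketch below encodes that reading as an assumption; it is not taken from the plugin source. The function name filter_params and its keyword arguments are hypothetical, and Python's re module stands in for PCRE.

```python
import re


def filter_params(query, include=(), exclude=(), include_match=(), exclude_match=()):
    """Keep name=value pairs that are included and not excluded (assumed rule)."""
    has_include_rules = bool(include) or bool(include_match)
    kept = []
    for pair in query.split("&"):
        name = pair.split("=", 1)[0]
        included = (
            not has_include_rules
            or name in include
            or any(re.search(pattern, name) for pattern in include_match)
        )
        excluded = name in exclude or any(
            re.search(pattern, name) for pattern in exclude_match
        )
        if included and not excluded:
            kept.append(pair)          # original order and values are preserved
    return "&".join(kept)


query = "c=1&a=1&b=2&x=1&k=1&u=1&y=1"
print(filter_params(query, exclude=["a", "b"]))
print(filter_params(query, include_match=["(a|c)"]))
print(filter_params(query, exclude=["x"], exclude_match=["y", "z"],
                    include=["y", "c"], include_match=["(x|b)"]))
```

In the last call x and y pass the include rules but are removed again by the exclude rules, which leaves only c and b.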

HTTP Headers

Include certain headers in the cache key

The following headers HeaderA and HeaderB will be used when constructing the cache key and the rest will be ignored.

@plugin=cachekey.so @pparam=--include-headers=HeaderA,HeaderB

The following captures from the Authorization header and adds the captured element to the cache key:

@plugin=cachekey.so \
    @pparam=--capture-header=Authorization:/AWS\s(?<clientID>[^:]+).*/clientID:$1/

If the request looks like the following:

http://example-cdn.com/path/file
Authorization: AWS MKIARYMOG51PT0DLD:DLiWQ2lyS49H4Zyx34kW0URtg6s=

The cache key would be set to:

/example-cdn.com/80/clientID:MKIARYMOG51PT0DLD/path/file

HTTP Cookies

Include certain cookies in the cache key

The following cookies CookieA and CookieB will be used when constructing the cache key and the rest will be ignored.

@plugin=cachekey.so @pparam=--include-cookies=CookieA,CookieB

Prefix (host, port, capture and replace from URI)

Replacing host:port with a static cache key prefix

If the plugin is used with the following plugin parameter in the remap rule:

@plugin=cachekey.so @pparam=--static-prefix=static_prefix

the cache key will be prefixed with /static_prefix instead of the host:port prefix used when --static-prefix is not specified.

Capturing from the host:port and adding it to the prefix section

If the plugin is used with the following plugin parameter in the remap rule:

@plugin=cachekey.so \
    @pparam=--capture-prefix=(test_prefix).*:([^\s\/$]*)

the cache key will be prefixed with /test_prefix/80 instead of the test_prefix_371.example.com:80 used when --capture-prefix is not specified.

Capturing from the entire URI and adding it to the prefix section

If the plugin is used with the following plugin parameter in the remap rule:

@plugin=cachekey.so \
    @pparam=--capture-prefix-uri=/(test_prefix).*:.*(object).*$/$1_$2/

and if the request URI is the following

http://test_prefix_123.example.com/path/to/object?a=1&b=2&c=3

the cache key will be prefixed with /test_prefix_object instead of the test_prefix_123.example.com:80 used when --capture-prefix-uri is not specified.

Combining prefix plugin parameters, i.e. --static-prefix and --capture-prefix

If the plugin is used with the following plugin parameters in the remap rule:

@plugin=cachekey.so \
    @pparam=--capture-prefix=(test_prefix).*:([^\s\/$]*) \
    @pparam=--static-prefix=static_prefix

the cache key will be prefixed with /static_prefix/test_prefix/80 instead of the test_prefix_371.example.com:80 used when neither --capture-prefix nor --static-prefix is specified.

Path, capture and replace from the path or entire URI

Capture and replace groups from path for the “Path” section

If the plugin is used with the following plugin parameter in the remap rule:

@plugin=cachekey.so \
    @pparam=--capture-path=/.*(object).*/const_path_$1/

and the request URI is the following

http://test_path_123.example.com/path/to/object?a=1&b=2&c=3

then the cache key will have /const_path_object in its path section instead of the /path/to/object used when neither --capture-path nor --capture-path-uri is specified.

Capture and replace groups from whole URI for the “Path” section

If the plugin is used with the following plugin parameter in the remap rule:

@plugin=cachekey.so \
    @pparam=--capture-path-uri=/(test_path).*(object).*/$1_$2/

and the request URI is the following

http://test_path_123.example.com/path/to/object?a=1&b=2&c=3

the cache key will have /test_path_object in its path section instead of the /path/to/object used when neither --capture-path nor --capture-path-uri is specified.

Combining path plugin parameters --capture-path and --capture-path-uri

If the plugin is used with the following plugin parameters in the remap rule:

@plugin=cachekey.so \
    @pparam=--capture-path=/.*(object).*/const_path_$1/ \
    @pparam=--capture-path-uri=/(test_path).*(object).*/$1_$2/

and the request URI is the following

http://test_path_123.example.com/path/to/object?a=1&b=2&c=3

the cache key will have /test_path_object/const_path_object in its path section instead of the /path/to/object used when neither --capture-path nor --capture-path-uri is specified.

User-Agent capturing, replacement and classification

Let us say we have a request with User-Agent header:

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3)
AppleWebKit/537.75.14 (KHTML, like Gecko)
Version/7.0.3 Safari/7046A194A

Capture PCRE groups from User-Agent header

If the plugin is used with the following plugin parameter:

@plugin=cachekey.so \
    @pparam=--ua-capture=(Mozilla\/[^\s]*).*(AppleWebKit\/[^\s]*)

then Mozilla/5.0 and AppleWebKit/537.75.14 will be used when constructing the key.

Capture and replace groups from User-Agent header

If the plugin is used with the following plugin parameter:

@plugin=cachekey.so \
    @pparam=--ua-capture=/(Mozilla\/[^\s]*).*(AppleWebKit\/[^\s]*)/$1_$2/

then Mozilla/5.0_AppleWebKit/537.75.14 will be used when constructing the key.
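The two forms of capture definition used in this document, a bare regular expression and /regex/replacement/, can be sketched in Python as follows. This is an approximation under stated assumptions: the helper name apply_capture is hypothetical, Python's re module stands in for PCRE (the two differ in some syntax, for example named groups), and the handling of a bare pattern without groups is a guess rather than documented behavior.

```python
import re


def apply_capture(definition, subject):
    """Return the captured or replaced key elements, or [] when nothing matches."""
    if definition.startswith("/"):
        # "/<regex>/<replacement>/": split on slashes that are not backslash-escaped.
        _, regex, replacement, _ = re.split(r"(?<!\\)/", definition)
    else:
        regex, replacement = definition, None

    match = re.search(regex, subject)
    if match is None:
        return []
    if replacement is None:
        # Capture only: each group becomes its own element
        # (falling back to the whole match is an assumption).
        return list(match.groups()) or [match.group(0)]
    # Capture and replace: substitute $0..$9 with the corresponding group.
    expand = lambda ref: match.group(int(ref.group(1))) or ""
    return [re.sub(r"\$(\d)", expand, replacement)]


user_agent = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) "
    "AppleWebKit/537.75.14 (KHTML, like Gecko) "
    "Version/7.0.3 Safari/7046A194A"
)
captured = apply_capture(r"(Mozilla\/[^\s]*).*(AppleWebKit\/[^\s]*)", user_agent)
replaced = apply_capture(r"/(Mozilla\/[^\s]*).*(AppleWebKit\/[^\s]*)/$1_$2/", user_agent)
print(captured)
print(replaced)
```

The first call yields the two captured groups as separate elements and the second yields a single element built from the replacement string. The same two forms appear in the prefix and path capture options earlier in this document.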

User-Agent allow-list classifier

If the plugin is used with the following plugin parameter:

@plugin=cachekey.so \
    @pparam=--ua-allowlist=browser:browser_agents.config

and if browser_agents.config contains:

^Mozilla.*
^Twitter.*
^Facebo.*

then browser will be used when constructing the key, because the sample User-Agent matches ^Mozilla.*.

User-Agent deny-list classifier

If the plugin is used with the following plugin parameter:

@plugin=cachekey.so \
    @pparam=--ua-denylist=browser:tool_agents.config

and if tool_agents.config contains:

^PHP.*
^Python.*
^curl.*

then browser will be used when constructing the key, because the sample User-Agent matches none of the listed patterns.
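The difference between the two classifiers can be summarized in a small Python sketch. It assumes, in line with the two examples above, that an allow-list adds the class name when some pattern matches and a deny-list adds it when no pattern matches. The function name classify is hypothetical and Python's re module stands in for PCRE.

```python
import re


def classify(user_agent, patterns, classname, denylist=False):
    """Return classname if the classifier applies to this User-Agent, else None."""
    matched = any(re.search(pattern, user_agent) for pattern in patterns)
    # Allow-list: class applies on a match. Deny-list: class applies on no match.
    return classname if matched != denylist else None


browser_agents = [r"^Mozilla.*", r"^Twitter.*", r"^Facebo.*"]
tool_agents = [r"^PHP.*", r"^Python.*", r"^curl.*"]
user_agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3)"

print(classify(user_agent, browser_agents, "browser"))
print(classify(user_agent, tool_agents, "browser", denylist=True))
print(classify("curl/7.64.1", tool_agents, "browser", denylist=True))
```

The first two calls return browser for the sample User-Agent, while the curl User-Agent in the third call matches the deny-list and gets no class.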

Cacheurl plugin to cachekey plugin migration

The cachekey plugin was not meant to replace the cacheurl plugin in terms of generating exactly the same cache key strings. It just allows the operator to extract elements from the HTTP URI in the same way the cacheurl plugin does (through a regular expression, please see <capture_definition> above).

The following examples demonstrate different ways to achieve cacheurl compatibility on a cache key string level in order to avoid invalidation of the cache.

The operator could use --capture-path-uri, --capture-path, --capture-prefix-uri and --capture-prefix to capture elements from the URI, path and authority elements.

By using --separator=<string> the operator could override the default separator, for example with an empty string (--separator=), and thus make sure there are no cache key element separators.

Example 1: Let us say we have a capture definition used in cacheurl. By using --capture-prefix-uri one could extract elements through the same capture definition, remove the cache key element separator with --separator=, remove the URI path with --capture-path-uri, and remove the query string with --remove-all-params=true:

@plugin=cachekey.so \
    @pparam=--capture-prefix-uri=/.*/$0/ \
    @pparam=--capture-path-uri=/.*// \
    @pparam=--remove-all-params=true \
    @pparam=--separator=

Example 2: A more efficient way is to use --capture-prefix-uri to capture from the URI, remove the cache key element separator with --separator=, remove the URI path with --remove-path, and remove the query string with --remove-all-params=true:

@plugin=cachekey.so \
    @pparam=--capture-prefix-uri=/.*/$0/ \
    @pparam=--remove-path=true \
    @pparam=--remove-all-params=true \
    @pparam=--separator=

Example 3: The same result as above, but this time using --capture-path-uri to capture from the URI, removing the cache key element separator with --separator=, removing the URI authority elements with --remove-prefix, and removing the query string with --remove-all-params=true:

@plugin=cachekey.so \
    @pparam=--capture-path-uri=/(.*)/$0/ \
    @pparam=--remove-prefix=true \
    @pparam=--remove-all-params=true \
    @pparam=--separator=

Example 4: Let us say that we would like to capture from the URI in a way similar to cacheurl but also sort the query parameters (which is not supported by cacheurl). We could achieve that by using --capture-prefix-uri with a capture definition that processes the URI before the ?, --remove-path to remove the URI path, and --sort-params=true to sort the query parameters:

@plugin=cachekey.so \
    @pparam=--capture-prefix-uri=/([^?]*)/$1/ \
    @pparam=--remove-path=true \
    @pparam=--sort-params=true \
    @pparam=--separator=

Simultaneous cache key and parent selection URL manipulation

The following is an example of how to manipulate both cache key and parent selection URL in the same remap rule. For this purpose two separate instances are loaded for that remap rule:

@plugin=cachekey.so \
    @pparam=--key-type=parent_selection_url \
    @pparam=--static-prefix=this://goes.to/parent/selection/url \
    @pparam=--canonical-prefix=true \
@plugin=cachekey.so \
    @pparam=--key-type=cache_key \
    @pparam=--static-prefix=this://goes.to/cache/key \
    @pparam=--canonical-prefix=true

In the example above the first instance of the plugin sets the prefix of the parent selection URL and the second instance sets the prefix of the cache key.

The same string manipulations can be applied to both the cache key and the parent selection URL more concisely, without chaining cachekey plugin instances, by specifying multiple target types with --key-type.

Instead of:

@plugin=cachekey.so \
    @pparam=--key-type=parent_selection_url \
    @pparam=--remove-all-params=true \
@plugin=cachekey.so \
    @pparam=--key-type=cache_key \
    @pparam=--remove-all-params=true

one could write:

@plugin=cachekey.so \
    @pparam=--key-type=parent_selection_url,cache_key \
    @pparam=--remove-all-params=true