Cache Key and Parent Selection URL Manipulation Plugin
Description
This plugin allows some common cache key or parent selection URL manipulations based on various HTTP request components. Although cache key is used everywhere in this document, the same manipulations can be applied to parent selection URL by switching key type. The plugin can
sort query parameters to prevent query parameter reordering being a cache miss
ignore specific query parameters from the cache key by name or regular expression
ignore all query parameters from the cache key
only use specific query parameters in the cache key by name or regular expression
include headers or cookies by name
capture values from the User-Agent header
classify requests using User-Agent and a list of regular expressions
capture and replace strings from the URI and include them in the cache key
do more; please find more examples below.
URI type
The plugin manipulates the remap URI (the value set during URI remap) by default. If manipulation needs to be based on the pristine URI (the value before URI remapping takes place), use the following option:
--uri-type=[remap|pristine] (default: remap)
Key type
The plugin manipulates the cache key by default. If parent selection URL manipulation is needed the following option can be used:
--key-type=<list of target types> (default: cache_key) - a list of cache_key or parent_selection_url; if multiple --key-type options are specified, all values are combined.
An instance of this plugin can be used for applying manipulations to cache key, parent selection URL or both depending on the need. See simultaneous cache key and parent selection URL manipulation for examples of how to apply the same set of manipulations to both targets with a single plugin instance or applying different sets of manipulations to each target using separate plugin instances.
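The way repeated --key-type options combine can be pictured with a small Python sketch. It is not the plugin's own option parser: the function name collect_key_types and the choice to raise an error on unknown values are assumptions made only for illustration.

```python
VALID_TARGETS = {"cache_key", "parent_selection_url"}
DEFAULT_TARGETS = {"cache_key"}


def collect_key_types(pparams):
    """Combine the values of every --key-type option into one set of targets."""
    targets = set()
    for param in pparams:
        if not param.startswith("--key-type="):
            continue  # some other plugin parameter
        for value in param.split("=", 1)[1].split(","):
            value = value.strip()
            if value not in VALID_TARGETS:
                # Assumption: reject unknown targets; the plugin may behave differently.
                raise ValueError(f"unknown key type: {value!r}")
            targets.add(value)
    # With no --key-type option at all, fall back to the documented default.
    return targets or set(DEFAULT_TARGETS)


print(sorted(collect_key_types(["--key-type=cache_key", "--key-type=parent_selection_url"])))
```

In this sketch a single comma-separated option and two separate options produce the same set of targets.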
How to run the plugin
The plugin can run as a global plugin (a single global instance configured using plugin.config) or as per-remap plugin (a separate instance configured per remap rule in remap.config).
Global instance
$ cat plugin.config
cachekey.so \
--include-params=a,b,c \
--sort-params=true
Per-remap instance
$ cat remap.config
map http://www.example.com http://www.origin.com \
@plugin=cachekey.so \
@pparam=--include-params=a,b,c \
@pparam=--sort-params=true
If both global and per-remap instance are used the per-remap configuration would take precedence (per-remap configuration would be applied and the global configuration ignored).
Because of the ATS core (remap) and the CacheKey plugin implementation there is a slight difference between the global and the per-remap functionality when --uri-type=remap is used.
The global instance always uses the URI after remap (at TS_HTTP_POST_REMAP_HOOK).
The per-remap instance uses the URI during remap (after TS_HTTP_PRE_REMAP_HOOK and before TS_HTTP_POST_REMAP_HOOK), so the URI used depends on the plugin order in the remap rule.
If the CacheKey plugin is the first plugin in the remap rule, the URI used will be practically the same as the pristine URI.
If the CacheKey plugin is the last plugin in the remap rule (which is right before TS_HTTP_POST_REMAP_HOOK), the behavior will be similar to the global instance.
Detailed examples and troubleshooting
             |                          hierarchical part                                     query
HTTP request | ┌────────────────────────────────┴─────────────────────────────────────────┐┌────┴─────┐
components   |  URI host and port      HTTP headers and cookies                 URI path    URI query
             | ┌────────┴────────┐┌────────────────┴─────────────────────────┐┌─────┴─────┐┌────┴─────┐
Sample 1     | /www.example.com/80/popular/Mozilla/5.0/H1:v1/H2:v2/C1=v1;C2=v2/path/to/data?a=1&b=2&c=3
Sample 2     | /nice_custom_prefix/popular/Mozilla/5.0/H1:v1/H2:v2/C1=v1;C2=v2/path/to/data?a=1&b=2&c=3
             | └────────┬────────┘└───┬──┘└─────┬────┘└────┬─────┘└─────┬────┘└─────┬─────┘└────┬─────┘
Cache Key    |    host:port or    UA-class UA-captures  headers      cookies       path       query
components   |    custom prefix            replacement
The following is an example of how the above sample keys (Sample 1 and Sample 2) were generated.
Traffic Server configuration
$ cat etc/trafficserver/remap.config
map http://www.example.com http://www.origin.com \
@plugin=cachekey.so \
@pparam=--ua-allowlist=popular:popular_agents.config \
@pparam=--ua-capture=(Mozilla\/[^\s]*).* \
@pparam=--include-headers=H1,H2 \
@pparam=--include-cookies=C1,C2 \
@pparam=--include-params=a,b,c \
@pparam=--sort-params=true
$ cat etc/trafficserver/popular_agents.config
^Mozilla.*
^Twitter.*
^Facebo.*
$ cat etc/trafficserver/plugin.config
xdebug.so
HTTP request
$ curl 'http://www.example.com/path/to/data?c=3&a=1&b=2&x=1&y=2&z=3' \
-v -x 127.0.0.1:8080 -o /dev/null -s \
-H "H1: v1" \
-H "H2: v2" \
-H "Cookie: C1=v1; C2=v2" \
-H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/7046A194A' \
-H 'X-Debug: X-Cache-Key'
* About to connect() to proxy 127.0.0.1 port 8080 (#0)
* Trying 127.0.0.1... connected
* Connected to 127.0.0.1 (127.0.0.1) port 8080 (#0)
> GET http://www.example.com/path/to/data?c=3&a=1&b=2&x=1&y=2&z=3 HTTP/1.1
> Host: www.example.com
> Accept: */*
> Proxy-Connection: Keep-Alive
> H1: v1
> H2: v2
> Cookie: C1=v1; C2=v2
> User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/7046A194A
> X-Debug: X-Cache-Key
>
< HTTP/1.1 200 OK
< Server: ATS/6.1.0
< Date: Thu, 19 Nov 2015 23:17:58 GMT
< Content-type: application/json
< Age: 0
< Transfer-Encoding: chunked
< Proxy-Connection: keep-alive
< X-Cache-Key: /www.example.com/80/popular/Mozilla/5.0/H1:v1/H2:v2/C1=v1;C2=v2/path/to/data?a=1&b=2&c=3
<
{ [data not shown]
* Connection #0 to host 127.0.0.1 left intact
* Closing connection #0
The X-Cache-Key response header contains the cache key:
/www.example.com/80/popular/Mozilla/5.0/H1:v1/H2:v2/C1=v1;C2=v2/path/to/data?a=1&b=2&c=3
The xdebug.so plugin and the X-Debug request header are used just to demonstrate basic cache key troubleshooting.
If we add --static-prefix=nice_custom_prefix to the remap rule then the cache key would look like the following:
/nice_custom_prefix/popular/Mozilla/5.0/H1:v1/H2:v2/C1=v1;C2=v2/path/to/data?a=1&b=2&c=3
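How the components line up in the two sample keys can be sketched in Python. This is an illustration of the layout shown in the diagram, not the plugin's implementation: the function build_cache_key, its parameters, and the exact joining rules (one separator before every element, cookies joined with a semicolon) are assumptions inferred from the sample keys only.

```python
def build_cache_key(prefix, ua_class, ua_captures, headers, cookies, path, query,
                    separator="/"):
    """Assemble a key in the order: prefix, UA class, UA captures, headers, cookies, path, query."""
    elements = list(prefix)
    if ua_class:
        elements.append(ua_class)
    elements.extend(ua_captures)
    elements.extend(f"{name}:{value}" for name, value in headers)
    if cookies:
        # Assumption: all selected cookies form a single element.
        elements.append(";".join(f"{name}={value}" for name, value in cookies))
    elements.append(path.lstrip("/"))
    key = "".join(separator + element for element in elements)
    if query:
        key += "?" + query
    return key


common = dict(
    ua_class="popular",
    ua_captures=["Mozilla/5.0"],
    headers=[("H1", "v1"), ("H2", "v2")],
    cookies=[("C1", "v1"), ("C2", "v2")],
    path="/path/to/data",
    query="a=1&b=2&c=3",
)
print(build_cache_key(prefix=["www.example.com", "80"], **common))
print(build_cache_key(prefix=["nice_custom_prefix"], **common))
```

The two calls differ only in the prefix argument, which mirrors the effect of adding --static-prefix to the remap rule.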
Usage examples
URI query parameters
Ignore the query string (all query parameters)
The following added to the remap rule will ignore the query, removing it from the cache key.
@plugin=cachekey.so @pparam=--remove-all-params=true
Cache key normalization by sorting the query parameters
The following will normalize the cache key by sorting the query parameters.
@plugin=cachekey.so @pparam=--sort-params=true
If the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1
the cache key will use a=1&b=2&c=1&k=1&u=1&x=1&y=1
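The normalization can be sketched in a few lines of Python. The sketch sorts whole name=value tokens lexicographically; that ordering rule is an assumption that happens to reproduce the example above, and sort_query_params is a hypothetical helper, not part of the plugin.

```python
def sort_query_params(query):
    """Return the query string with its '&'-separated parameters sorted."""
    tokens = [token for token in query.split("&") if token]
    return "&".join(sorted(tokens))


print(sort_query_params("c=1&a=1&b=2&x=1&k=1&u=1&y=1"))
```

Because every ordering of the same parameters maps to one string, requests that differ only in parameter order share a cache entry.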
Ignore (exclude) certain query parameters
The following will make sure query parameters a and b will not be used when constructing the cache key.
@plugin=cachekey.so @pparam=--exclude-params=a,b
If the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1
the cache key will use c=1&x=1&k=1&u=1&y=1
Ignore (exclude) certain query parameters from the cache key by using regular expression (PCRE)
The following will make sure query parameters a and b will not be used when constructing the cache key.
@plugin=cachekey.so @pparam=--exclude-match-params=(a|b)
If the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1
the cache key will use c=1&x=1&k=1&u=1&y=1
Include only certain query parameters
The following will make sure only query parameters a and c will be used when constructing the cache key and the rest will be ignored.
@plugin=cachekey.so @pparam=--include-params=a,c
If the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1
the cache key will use c=1&a=1
Include only certain query parameters by using regular expression (PCRE)
The following will make sure only query parameters a and c will be used when constructing the cache key and the rest will be ignored.
@plugin=cachekey.so @pparam=--include-match-params=(a|c)
If the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1
the cache key will use c=1&a=1
Include and exclude certain parameters using multiple parameters in the same remap rule.
If the plugin is used with the following plugin parameters in the remap rule:
@plugin=cachekey.so \
@pparam=--exclude-params=x \
@pparam=--exclude-params=y \
@pparam=--exclude-params=z \
@pparam=--include-params=y,c \
@pparam=--include-params=x,b
and if the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1
the cache key will use c=1&b=2
Include and exclude certain parameters using multiple parameters in the same remap rule and regular expressions (PCRE).
If the plugin is used with the following plugin parameters in the remap rule:
@plugin=cachekey.so \
@pparam=--exclude-match-params=x \
@pparam=--exclude-match-params=y \
@pparam=--exclude-match-params=z \
@pparam=--include-match-params=(y|c) \
@pparam=--include-match-params=(x|b)
and if the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1
the cache key will use c=1&b=2
Mixing --include-params, --exclude-params, --include-match-params and --exclude-match-params
If the plugin is used with the following plugin parameters in the remap rule:
@plugin=cachekey.so \
@pparam=--exclude-params=x \
@pparam=--exclude-match-params=y \
@pparam=--exclude-match-params=z \
@pparam=--include-params=y,c \
@pparam=--include-match-params=(x|b)
and if the URI has the following query string c=1&a=1&b=2&x=1&k=1&u=1&y=1
the cache key will use c=1&b=2
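The way include and exclude rules interact in the examples of this section can be modeled with the Python sketch below. It assumes a parameter is kept when it passes the include rules (or no include rules exist) and matches no exclude rule, and that regular expressions are matched unanchored against the parameter name. Both points are assumptions consistent with the examples, and filter_params is a hypothetical helper rather than the plugin's code.

```python
import re


def filter_params(query, include=(), exclude=(), include_match=(), exclude_match=()):
    """Keep parameters that pass the include rules and match no exclude rule."""
    include_names, exclude_names = set(include), set(exclude)
    include_res = [re.compile(p) for p in include_match]
    exclude_res = [re.compile(p) for p in exclude_match]
    has_include_rules = bool(include_names or include_res)

    kept = []
    for token in query.split("&"):
        if not token:
            continue
        name = token.split("=", 1)[0]
        included = (
            not has_include_rules
            or name in include_names
            or any(r.search(name) for r in include_res)
        )
        excluded = name in exclude_names or any(r.search(name) for r in exclude_res)
        if included and not excluded:
            kept.append(token)  # original order and value are preserved
    return "&".join(kept)


QUERY = "c=1&a=1&b=2&x=1&k=1&u=1&y=1"
print(filter_params(QUERY, exclude=["x", "y", "z"], include=["y", "c", "x", "b"]))
print(filter_params(QUERY, exclude=["x"], exclude_match=["y", "z"],
                    include=["y", "c"], include_match=["(x|b)"]))
```

In both calls x and y are named by an include rule but dropped again by an exclude rule, so only the c and b parameters survive with their original values.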
HTTP Headers
Include certain headers in the cache key
The headers HeaderA and HeaderB will be used when constructing the cache key and the rest will be ignored.
@plugin=cachekey.so @pparam=--include-headers=HeaderA,HeaderB
The following would capture from the Authorization header and add the captured element to the cache key:
@plugin=cachekey.so \
@pparam=--capture-header=Authorization:/AWS\s(?<clientID>[^:]+).*/clientID:$1/
If the request looks like the following:
http://example-cdn.com/path/file
Authorization: AWS MKIARYMOG51PT0DLD:DLiWQ2lyS49H4Zyx34kW0URtg6s=
The cache key would be set to:
/example-cdn.com/80/clientID:MKIARYMOG51PT0DLD/path/file
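The capture-and-replace step can be tried outside Traffic Server with a short Python sketch. The plugin uses PCRE, while Python's re module spells named groups (?P<name>...), so the pattern is adapted accordingly. The helper capture_and_replace and its handling of $N references are assumptions made for illustration only.

```python
import re


def capture_and_replace(pattern, replacement, subject):
    """Match pattern against subject and expand $0..$9 in replacement."""
    match = re.search(pattern, subject)
    if match is None:
        return None  # nothing captured, so nothing would be added to the key

    def expand(ref):
        index = int(ref.group(1))
        if index > match.re.groups:
            return ""  # reference to a group the pattern does not define
        return match.group(index) or ""

    return re.sub(r"\$(\d)", expand, replacement)


header_value = "AWS MKIARYMOG51PT0DLD:DLiWQ2lyS49H4Zyx34kW0URtg6s="
print(capture_and_replace(r"AWS\s(?P<clientID>[^:]+).*", "clientID:$1", header_value))
```

Only the access key ID before the colon ends up in the element; the signature after the colon is matched by the trailing .* and discarded, so requests from the same client share a key.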
Prefix (host, port, capture and replace from URI)
Replacing host:port with a static cache key prefix
If the plugin is used with the following plugin parameter in the remap rule:
@plugin=cachekey.so @pparam=--static-prefix=static_prefix
the cache key will be prefixed with /static_prefix instead of host:port, which is used when --static-prefix is not set.
Capturing from the host:port and adding it to the prefix section
If the plugin is used with the following plugin parameter in the remap rule:
@plugin=cachekey.so \
@pparam=--capture-prefix=(test_prefix).*:([^\s\/$]*)
the cache key will be prefixed with /test_prefix/80 instead of test_prefix_371.example.com:80, which is used when --capture-prefix is not set.
Capturing from the entire URI and adding it to the prefix section
If the plugin is used with the following plugin parameter in the remap rule:
@plugin=cachekey.so \
@pparam=--capture-prefix-uri=/(test_prefix).*:.*(object).*$/$1_$2/
and if the request URI is the following
http://test_prefix_123.example.com/path/to/object?a=1&b=2&c=3
the cache key will be prefixed with /test_prefix_object instead of test_prefix_123.example.com:80, which is used when --capture-prefix-uri is not set.
Combining prefix plugin parameters, i.e. --static-prefix and --capture-prefix
If the plugin is used with the following plugin parameters in the remap rule:
@plugin=cachekey.so \
@pparam=--capture-prefix=(test_prefix).*:([^\s\/$]*) \
@pparam=--static-prefix=static_prefix
the cache key will be prefixed with /static_prefix/test_prefix/80 instead of test_prefix_371.example.com:80, which is used when neither --capture-prefix nor --static-prefix is set.
Path, capture and replace from the path or entire URI
Capture and replace groups from path for the “Path” section
If the plugin is used with the following plugin parameter in the remap rule:
@plugin=cachekey.so \
@pparam=--capture-path=/.*(object).*/const_path_$1/
and the request URI is the following
http://test_path_123.example.com/path/to/object?a=1&b=2&c=3
then the cache key will have /const_path_object in the path section instead of /path/to/object, which is used when neither --capture-path nor --capture-path-uri is set.
Capture and replace groups from whole URI for the “Path” section
If the plugin is used with the following plugin parameter in the remap rule:
@plugin=cachekey.so \
@pparam=--capture-path-uri=/(test_path).*(object).*/$1_$2/
and the request URI is the following
http://test_path_123.example.com/path/to/object?a=1&b=2&c=3
the cache key will have /test_path_object in the path section instead of /path/to/object, which is used when neither --capture-path nor --capture-path-uri is set.
Combining path plugin parameters --capture-path and --capture-path-uri
If the plugin is used with the following plugin parameters in the remap rule:
@plugin=cachekey.so \
@pparam=--capture-path=/.*(object).*/const_path_$1/ \
@pparam=--capture-path-uri=/(test_path).*(object).*/$1_$2/
and the request URI is the following
http://test_path_123.example.com/path/to/object?a=1&b=2&c=3
the cache key will have /test_path_object/const_path_object in the path section instead of /path/to/object, which is used when neither --capture-path nor --capture-path-uri is set.
User-Agent capturing, replacement and classification
Let us say we have a request with the following User-Agent header:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3)
AppleWebKit/537.75.14 (KHTML, like Gecko)
Version/7.0.3 Safari/7046A194A
Capture PCRE groups from User-Agent header
If the plugin is used with the following plugin parameter:
@plugin=cachekey.so \
@pparam=--ua-capture=(Mozilla\/[^\s]*).*(AppleWebKit\/[^\s]*)
then Mozilla/5.0 and AppleWebKit/537.75.14 will be used when constructing the key.
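The capture without a replacement string can be checked with a short Python sketch. It assumes that every capturing group that matched becomes one element of the key; capture_groups is a hypothetical helper, and Python's re module stands in for PCRE here.

```python
import re

USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) "
    "AppleWebKit/537.75.14 (KHTML, like Gecko) "
    "Version/7.0.3 Safari/7046A194A"
)


def capture_groups(pattern, subject):
    """Return the non-empty capture groups of the first match, in order."""
    match = re.search(pattern, subject)
    if match is None:
        return []
    return [group for group in match.groups() if group]


print(capture_groups(r"(Mozilla\/[^\s]*).*(AppleWebKit\/[^\s]*)", USER_AGENT))
```

Each group stops at the first whitespace character, so the platform details in parentheses never reach the key.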
Capture and replace groups from User-Agent header
If the plugin is used with the following plugin parameter:
@plugin=cachekey.so \
@pparam=--ua-capture=/(Mozilla\/[^\s]*).*(AppleWebKit\/[^\s]*)/$1_$2/
then Mozilla/5.0_AppleWebKit/537.75.14 will be used when constructing the key.
User-Agent allow-list classifier
If the plugin is used with the following plugin parameter:
@plugin=cachekey.so \
@pparam=--ua-allowlist=browser:browser_agents.config
and if browser_agents.config contains:
^Mozilla.*
^Twitter.*
^Facebo.*
then browser will be used when constructing the key, because the User-Agent matches ^Mozilla.*.
User-Agent deny-list classifier
If the plugin is used with the following plugin parameter:
@plugin=cachekey.so \
@pparam=--ua-denylist=browser:tool_agents.config
and if tool_agents.config contains:
^PHP.*
^Python.*
^curl.*
then browser will be used when constructing the key, because the User-Agent matches none of the listed patterns.
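The two classifiers differ only in how a match is interpreted, which the Python sketch below makes explicit. The function classify and its return value of None for "no class" are assumptions; what the plugin adds to the key when no classifier applies is not covered by this sketch.

```python
import re


def classify(user_agent, class_name, patterns, deny=False):
    """Allow-list: class applies on a match. Deny-list: class applies when nothing matches."""
    matched = any(re.search(pattern, user_agent) for pattern in patterns)
    if deny:
        return None if matched else class_name
    return class_name if matched else None


BROWSER_AGENTS = ["^Mozilla.*", "^Twitter.*", "^Facebo.*"]  # browser_agents.config
TOOL_AGENTS = ["^PHP.*", "^Python.*", "^curl.*"]            # tool_agents.config

ua = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.75.14"
print(classify(ua, "browser", BROWSER_AGENTS))
print(classify(ua, "browser", TOOL_AGENTS, deny=True))
```

Both calls classify the sample User-Agent as browser: the first because it matches an allow-list pattern, the second because it matches no deny-list pattern.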
Cacheurl plugin to cachekey plugin migration
The plugin cachekey was not meant to replace the cacheurl plugin in terms of having exactly the same cache key strings generated. It just allows the operator to extract elements from the HTTP URI in the same way the cacheurl does (through a regular expression, please see <capture_definition> above).
The following examples demonstrate different ways to achieve cacheurl compatibility on a cache key string level in order to avoid invalidation of the cache.
The operator could use --capture-path-uri, --capture-path, --capture-prefix-uri and --capture-prefix to capture elements from the URI, path and authority elements.
By using --separator=<string> the operator could override the default separator with an empty string (--separator=) and thus make sure there are no cache key element separators.
Example 1: Let us say we have a capture definition used in cacheurl. By using --capture-prefix-uri one could extract elements through the same capture definition used with cacheurl, remove the cache key element separator with --separator=, remove the URI path with --capture-path-uri, and remove the query string with --remove-all-params=true:
@plugin=cachekey.so \
@pparam=--capture-prefix-uri=/.*/$0/ \
@pparam=--capture-path-uri=/.*// \
@pparam=--remove-all-params=true \
@pparam=--separator=
Example 2: A more efficient way is to use --capture-prefix-uri to capture from the URI, --separator= to remove the cache key element separator, --remove-path to remove the URI path and --remove-all-params=true to remove the query string:
@plugin=cachekey.so \
@pparam=--capture-prefix-uri=/.*/$0/ \
@pparam=--remove-path=true \
@pparam=--remove-all-params=true \
@pparam=--separator=
Example 3: The same result as above, but this time using --capture-path-uri to capture from the URI, --separator= to remove the cache key element separator, --remove-prefix to remove the URI authority elements and --remove-all-params=true to remove the query string:
@plugin=cachekey.so \
@pparam=--capture-path-uri=/(.*)/$0/ \
@pparam=--remove-prefix=true \
@pparam=--remove-all-params=true \
@pparam=--separator=
Example 4: Let us say that we would like to capture from the URI in a way similar to cacheurl but also sort the query parameters (which is not supported by cacheurl). We could achieve that by using --capture-prefix-uri with a capture definition that processes the URI before ?, --remove-path to remove the URI path and --sort-params=true to sort the query parameters:
@plugin=cachekey.so \
@pparam=--capture-prefix-uri=/([^?]*)/$1/ \
@pparam=--remove-path=true \
@pparam=--sort-params=true \
@pparam=--separator=
Simultaneous cache key and parent selection URL manipulation
The following is an example of how to manipulate both cache key and parent selection URL in the same remap rule. For this purpose two separate instances are loaded for that remap rule:
@plugin=cachekey.so \
@pparam=--key-type=parent_selection_url \
@pparam=--static-prefix=this://goes.to/parent/selection/url \
@pparam=--canonical-prefix=true \
@plugin=cachekey.so \
@pparam=--key-type=cache_key \
@pparam=--static-prefix=this://goes.to/cache/key \
@pparam=--canonical-prefix=true
In the example above the first instance of the plugin sets the prefix to the parent selection URI and the second instance of the plugin sets the prefix to the cache key.
The same string manipulations can be applied to both the cache key and the parent selection URL more concisely, without chaining cachekey plugin instances, by specifying multiple target types with --key-type.
Instead of:
@plugin=cachekey.so \
@pparam=--key-type=parent_selection_url \
@pparam=--remove-all-params=true \
@plugin=cachekey.so \
@pparam=--key-type=cache_key \
@pparam=--remove-all-params=true
one could write:
@plugin=cachekey.so \
@pparam=--key-type=parent_selection_url,cache_key \
@pparam=--remove-all-params=true