Miscellaneous¶
This is text that should be in the document but doesn't have a good place at the moment.
Issues¶
What happens to the remap requirement? What counts as a remap match? Currently a comparison "always", which always matches, is being added so that it can count as a match.
How optional should "do" nodes be? For instance, in selection cases is it OK to omit the node entirely if there is nothing to actually do and the case exists only to match for some other reason (e.g. to prevent requests that match from matching subsequent cases)?
Thinking about this more, it should be possible to omit the comparison or the "do".
If the comparison is omitted, it always matches.
If the "do" is omitted, then nothing is done.
with: ua-req-host
select:
- suffix: ".org"
  do:
  - with: ...
    select:
    - match: "apache"
      do: # stuff for the exact domain "apache.org"
    - suffix: ".apache"
      do:
      - with: ...
        select:
        - match: "mail"
          do: # handle mail.apache.org
        - match: "id"
          do: # handle id.apache.org
        # more comparisons
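The proposed rules for omitted nodes can be modeled outside of the configuration language. The following Python sketch is not TxnBox code: the function run_select and the dictionary keys "match" and "do" are hypothetical names chosen for illustration. It assumes that the first matching case is selected, that an omitted comparison always matches, and that an omitted "do" performs nothing.

```python
def run_select(feature, cases):
    """Hypothetical model of a selection: the first matching case wins.

    Each case is a dict with an optional "match" predicate and an
    optional "do" callable. Returns the index of the selected case,
    or None if no case matched.
    """
    for index, case in enumerate(cases):
        comparison = case.get("match")
        # An omitted comparison always matches.
        if comparison is None or comparison(feature):
            action = case.get("do")
            # An omitted "do" means the case matches but nothing is done.
            if action is not None:
                action(feature)
            return index  # later cases are never considered
    return None


log = []
cases = [
    # Match only: keeps "apache.org" away from the later cases.
    {"match": lambda host: host == "apache.org"},
    {"match": lambda host: host.endswith(".apache.org"), "do": log.append},
    # No comparison: always matches.
    {"do": lambda host: log.append("fallback")},
]

first = run_select("apache.org", cases)
second = run_select("mail.apache.org", cases)
third = run_select("example.one", cases)
```

In this model the first case selects "apache.org" without doing anything, which is exactly the "match for some other reason" situation described above.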
Working with HTTP Fields¶
Due to history and bad practices, working with the fields in an HTTP message can be challenging. Unfortunately this makes the configuration for these more intricate.
The general case, where there is a single field with a single value, is straightforward. A field extractor (such as proxy-req-field) can be used to get the value, and a field directive (such as proxy-req-field) to change it.
Some fields are defined to be multi-valued in which case the field value is really a list. On the other hand, there are exceptions to this as well, most notably Set-Cookie which, while multi-valued, does not follow the standard mechanism.
To handle all of these cases, the extension field of the field extractors can be used to force the field to be handled either by value or by field. In both cases the result is a tuple to which the standard "with" tuple handling can be applied.
In addition, if the field is handled by field, there is special support in the corresponding field directive for modifying that specific field in the message.
For illustrative purposes, the examples will use the upstream response ("ursp") extractors and directives, but all of this applies to the client request ("creq"), proxy request ("preq"), and proxy response ("prsp") extractors and directives in exactly the same way.
In the case of a single-valued field such as "Server", the value is extracted using ursp-field<Server> and can be changed with the directive ursp-field<Server>. Here is an example that looks at the "Transfer-Encoding" field to check for chunked encoding:
with: ursp-field<Transfer-Encoding>
select:
- match<nc>: "chunked"
  do:
  # ... whatever
Another example is setting the "Server" field to "TxnBox-Enhanced-Server" if it is not already set
ursp-field<Server>: [ ursp-field<Server> , { else: "TxnBox-Enhanced-Server" } ]
Finally, for fields like "Set-Cookie", handling must be done by the actual fields in the message header. Here is an example that walks the "Set-Cookie" fields and if the domain is set to "example.one", that specific field is removed.
with: upstream-rsp-field<Set-Cookie>
for-each:
- with: ...
  select:
  - contains<nc>: "Domain=example.one;"
    do:
    - upstream-rsp-field<...>: NULL
The notation "..." is a reference to the active feature, just as it is when used as an extractor. Not all directives support this; if one does, this is noted in the description along with the associated extractor(s) that extract a corresponding active feature. In particular, a field directive of this form can only be used with an active feature extracted by the corresponding extractor. In this case, ua-req-field<...> would be invalid; only upstream-rsp-field<...> will work because the active feature was extracted by the upstream-rsp-field extractor.
It's interesting to compare this to an alternative that looks similar but isn't quite the same
with: upstream-rsp-field<Set-Cookie>::by-field
select:
- for-all:
  - contains<nc>: "Domain=example.one;"
    do:
    - upstream-rsp-field<...>: NULL
The difference is that for-all is a comparison and will stop at the first element that doesn't match, whereas for-each will always process all elements. Similar differences exist if for-any or for-none are used. A way around this difference would be
with: ursp-field<Set-Cookie>::by-field
select:
- for-all:
  - contains<nc>: "Domain=example.one;"
    do:
    - ursp-field<...>: NULL
  - {} # always match
While this will work by causing every element to match, it's a bit obscure and easy to get wrong. If every element in a tuple needs to be processed, use for-each.
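The difference between stopping at the first non-match and visiting every element can be illustrated outside of TxnBox. The following Python sketch is only a model of the behavior described above, not the plugin's implementation; the helper names for_each, for_all, and has_domain, and the sample cookie values, are hypothetical.

```python
def for_each(values, predicate, action):
    """Visit every element; act on each one that satisfies the predicate."""
    for value in values:
        if predicate(value):
            action(value)


def for_all(values, predicate, action):
    """Act on elements in order, but stop at the first one that fails."""
    for value in values:
        if not predicate(value):
            return False  # the comparison fails; later elements are not examined
        action(value)
    return True


def has_domain(cookie):
    # Case-insensitive substring test, mirroring contains<nc> in the examples.
    return "domain=example.one;" in cookie.lower()


cookies = [
    "a=1; Domain=example.one; Path=/",
    "b=2; Domain=other.example; Path=/",
    "c=3; Domain=example.one; Path=/",
]

removed_by_for_each = []
for_each(cookies, has_domain, removed_by_for_each.append)

removed_by_for_all = []
all_matched = for_all(cookies, has_domain, removed_by_for_all.append)
```

With this input the for-each model handles the first and third cookies, while the for-all model handles only the first and never examines the third, because the second cookie fails the comparison.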
Examples¶
For request URLs of the form "A.apache.org/path", change to "apache.org/A/path". This example has the full configuration to demonstrate that. Later examples have just the relevant configuration and assume this outer wrapper is present.
txn_box:
- when: creq
  do:
  - with: ua-req-host
    select:
    - suffix: ".apache.org"
      do:
      - rewrite-url: "{ua-req-scheme}://apache.org/{...}{ua-req-path}"
    - match: "apache.org"
      # Explicitly do nothing with the base host name to count as "matched".
Access Control¶
Access to upstream resources can be controlled at a fine-grained level by doing a selection and using the "deny" directive. For instance, if access to id.example.one should be restricted to the GET, POST, and HEAD methods, this could be done with
with: ua-req-url
select:
- match: "id.example.one"
  do:
  - with: ua-req-method
    select:
    - none-of:
        match: [ "GET", "POST", "HEAD" ]
      do:
        deny:
  - proxy-req-field@Access: "Allowed" # mark OK and let the request go through.
If the method is not one of the allowed ones, the inner "select" matches, resulting in a denial. Otherwise, because there is no match, further directives in the outer "select" continue to be performed and the transaction proceeds.
This could be done in another way
with: "{creq.url}"
select:
- match: "id.example.one"
  do:
  - with: "{ua-req-method}"
    select:
    - match: [ "GET", "POST", "HEAD" ]
      do:
        set-field: [ "Access", "Allowed" ] # mark OK and let the request go through.
  - deny:
The overall flow is the same: if the method doesn't match an allowed method, the "with" is passed over and the "deny" is performed. Which of these is better depends on how much additional processing is to be done.
In either case, the set-field directive isn't required; it is present for illustrative purposes. This form of the inner "select" works as well.
with: "{creq.method}"
select:
- not:
    match: [ "GET", "POST", "HEAD" ]
  do:
    deny:
More complex situations can be handled. Suppose all methods were to be allowed from the loopback address. That could be done (considering just the access control "select") with
- with: "{inbound.remote_addr}"
  select:
  - in: [ "127.0.0.0/8", "::1" ]
    do: # nothing
- with: "{creq.method}"
  select:
  - not:
      match: [ "GET", "HEAD", "POST" ]
    do:
      deny:
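The intended decision can be stated compactly as a predicate. The following Python sketch models only the logic (loopback clients may use any method, everyone else is limited to the allowed set); it is not TxnBox code, and the names is_denied, LOOPBACK_NETS, and ALLOWED_METHODS are hypothetical.

```python
import ipaddress

# Loopback ranges from the example above.
LOOPBACK_NETS = [
    ipaddress.ip_network("127.0.0.0/8"),
    ipaddress.ip_network("::1/128"),
]
ALLOWED_METHODS = {"GET", "HEAD", "POST"}


def is_denied(remote_addr, method):
    """Return True if a request from remote_addr using method should be denied."""
    addr = ipaddress.ip_address(remote_addr)
    # First selection: a loopback address matches and nothing further is done.
    if any(addr in net for net in LOOPBACK_NETS):
        return False
    # Second selection: deny anything that is not an allowed method.
    return method not in ALLOWED_METHODS
```

For example, is_denied("127.0.0.1", "DELETE") is False while is_denied("192.0.2.10", "DELETE") is True.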
Reverse Mapping¶
If the URL "example.one" is rewritten to "some.place", then it is useful to rewrite the Location header from "some.place" to "example.one". This could be done as
with: "{ua-req-host}"
select:
- match: "example.one"
  do:
  - rewrite-url-host: "some.place"
  - when: "send-response"
    do:
    - with: "{prsp.field::Location}"
      select:
      - prefix: "http://some.place"
        do:
        - set-field: [ "Location", "http://example.one{...}" ]
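The effect on the Location value is a prefix swap that keeps the remainder of the URL, which is the role "{...}" plays after the prefix comparison. A minimal Python sketch of that transformation, assuming the two host prefixes from the text; the function name reverse_map_location is hypothetical and this is not TxnBox code.

```python
def reverse_map_location(location,
                         upstream="http://some.place",
                         public="http://example.one"):
    """Replace the upstream prefix of a Location value with the public one."""
    if location.startswith(upstream):
        # Keep everything after the matched prefix unchanged.
        return public + location[len(upstream):]
    # Not a redirect to the upstream host: leave the value alone.
    return location
```

Note that a bare prefix test also matches hosts that merely begin with the same characters, such as "http://some.placeholder.example". A stricter comparison may be needed if that matters.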
Referer Remapping¶
While there is no direct support for the old style referer remapping, it is straightforward to achieve the same functionality using the "Referer" field extractor and selection.
with: "{creq.field::Referer}"
select:
- match: "" # no referrer, equivalent to the "-" notation
  do:
    deny: # must have referer
- suffix: [ ".example.one", ".friends.one" ]
  do:
    rewrite-url-host: "example.one"
- else:
  do:
    rewrite-url: "http://example.com/denied"
RECV_PORT¶
As with referer remapping, this is easily done by extracting and selecting on the local (proxy) port. To restrict the URL "example.one" to connections to the local port 8180
with: "{ua-req-host}"
select:
- match: "example.one"
  do:
    with: "{inbound.local_port}"
    select:
    - ne: 8180
      do:
        deny:
Note that because of the no-backtrack rule, the request will pass through unmodified if allowed. If it needed to be changed to "special.example.one", that would be
with: "{ua-req-host}"
select:
- match: "example.one"
  do:
    with: "{inbound.local_port}"
    select:
    - eq: 8180
      do:
        set-url-host: "special.example.one"
    - else:
      do:
        deny:
A/B Testing¶
The random extractor and the hash modifier are useful for doing bucket testing, where some proportion of requests should be routed to a staging system instead of the main production upstream. This can be done randomly per request with the random extractor, or more deterministically using the hash modifier.
Presuming the production destination is "example.one" and the test destination is "stage.example.one" then 1% of traffic can be sent to the staging system with
with: "{ua-req-host}"
select:
- match: "example.one"
  do:
  - with: "{rand::100}"
    select:
    - eq: 1
      do:
      - rewrite-url: "http://stage.example.one"
    - always: # match to allow unmodified request to continue.
To be more deterministic, the bucket could be based on the client IP address.
with: "{ua-req-host}"
select:
- match: "example.one"
  do:
  - with: [ "{inbound.remote_addr}", { hash: 100} ]
    select:
    - eq: 1
      do:
      - rewrite-url: "http://stage.example.one"
    - always: # match to allow unmodified request to continue.
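The reason hashing is deterministic is that the same address always lands in the same bucket, so a given client consistently sees either production or staging. The following Python sketch illustrates the idea only. SHA-256 is an assumption made for the example, not the hash function the plugin uses, and the names bucket and upstream_for are hypothetical.

```python
import hashlib


def bucket(remote_addr, buckets=100):
    """Map an address string to a stable bucket number in [0, buckets)."""
    digest = hashlib.sha256(remote_addr.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % buckets


def upstream_for(remote_addr):
    """Send bucket 1 (roughly 1% of addresses) to staging, the rest to production."""
    return "stage.example.one" if bucket(remote_addr) == 1 else "example.one"
```

Unlike the random variant, repeated requests from one address never switch between the two destinations.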
As another alternative, this could be done with a cookie in the client request. If the cookie was "example" with the test bucket indicated by the value "test", then it would be.
with: "{ua-req-host}"
select:
- match: "example.one"
  do:
  - with: "{creq.cookie::example}"
    select:
    - match: "test"
      do:
      - rewrite-url: "http://stage.example.one"
    - always: # match to allow unmodified request to continue.
Real Life¶
The following examples are not intended to be illustrative, but are based on actual production requests from internal sources or on the mailing list. These are here to serve as a guide to implementation.
Legacy Appliances¶
Support for legacy appliances is required. The problem is these have very old TLS stacks and among other things do not provide SNI data. However, it is unacceptable to accept such connections in general. The requirement is therefore to allow such connections only to specific upstream destinations and fail the connection otherwise.
Query Only¶
Rewrite the URL iff there is a non-empty query string.
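As a statement of the expected behavior rather than a configuration, the requirement can be sketched in Python. The rewrite target is not specified in the request, so the new_host parameter and the function name rewrite_if_query are hypothetical; only the "non-empty query string" condition comes from the requirement.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_if_query(url, new_host):
    """Rewrite the host of url only when it carries a non-empty query string."""
    parts = urlsplit(url)
    if not parts.query:
        # No query, or an empty one such as a trailing "?": leave the URL alone.
        return url
    return urlunsplit(parts._replace(netloc=new_host))
```

A URL with a bare trailing "?" is treated as having no query, which matches the "non-empty" wording.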
Cache TTL¶
Set a cache TTL if
From a specific domain.
There are no cache TTL control fields.
This is a bit tricky because multiple fields are involved. Could this be done via an extraction vector, or better by doing a single extraction of all fields and checking if that is empty?
Mutual TLS¶
Control remapping based on whether the user agent provided a TLS certificate, whether the certificate was verified, and whether the SNI name is in a whitelist.
Footnotes