Access Control Plugin

Description

The access_control plugin covers common use cases related to controlling access to the objects stored in the CDN cache.

Requirements and features

  1. Cache Access Control - cached objects to be served only to properly authenticated and authorized users.

  2. Authorizing Multiple Requests - all requests from a particular UA in a defined time period to be authenticated and authorized only once.

  3. Multiple Page Versions - objects may have the same name (URI) but different content for each target audience and must be served only to the corresponding target audience.

  4. Proxy Only Mode - CDN proxies the request to the origin. In case of access control failure at the Edge, the UA is not redirected to other services; the failed request is forwarded to the origin instead.

Participants

  • UA - User Agent used by a user whose requests are subject to access control.

  • Target audience - an application-specific audience (role, group of users, etc.) to which the authenticated and authorized user belongs and which defines the user's permissions.

  • CDN / Edge - Content Delivery Network; CDN and Edge are sometimes used interchangeably

  • IdMS - Identity Management Services

  • DS - Directory services

  • Origin / Application - origin server which is part of the particular application; origin and application are sometimes used interchangeably

Design considerations and limitations

The following are some thoughts and ideas behind the access control flow and the plugin design.

Existing standards

OAuth [1] [2] and SAML [3] were evaluated for use cases matching standards-based solutions, but none seemed to match the requirements completely. Supporting some of these standards would also require a more involved design and implementation.

Closed integration

Closed integration with existing IdMS and DS was considered in earlier design stages. It resulted in an access control flow that was too specific and unnecessarily complicated: inflexible, with a higher maintenance cost, possibly not scaling well and not coping well with future changes.

Cache access approval only

Authentication and authorization are to be performed only by the application, using various services like IdMS, DS, etc. The application knows more about its users, their requests and permissions, and needs to deal with them regardless. It is assumed that the origin will perform its own access control. The CDN is to be concerned only with the access approval (the function of actually granting or rejecting access) to the objects in its own caches, based on an access token provided by the application.

Access token

The access token should be a compact and self-contained token containing the information required for properly enforcing the access control. It can be extracted from HTTP headers, cookies, or URI query parameters, which allows various access approval use cases to be supported.

Access token subject

The subject of the token in some of the use cases signifies a separate target audience and should be opaque to the CDN, which allows the CDN to be taken out of the authentication and authorization equation. In those use cases the subject is added to the cache key as a mechanism to support multiple page versions. Special consideration should be given to the cache-hit ratio and the number of distinct target audiences: the larger the number of audiences, the lower the cache-hit ratio.

Cache key modifications

This plugin will not modify the cache key directly but rather rely on plugins like Cache Key and Parent Selection URL Manipulation Plugin.

TLS only

To reduce the risk of man-in-the-middle attacks, spoofing of request elements, and unexpected leaks of security information, only TLS will be used when requesting protected assets and exchanging access tokens.

Use cases

Let us say the CDN's domain name is example-cdn.com, the origin's domain name is example.com, and <access_token> is an access token value created by the origin and validated by the CDN. When necessary, the access token is passed from the origin to the CDN in a response header TokenRespHdr and is stored at the UA in a persistent cookie TokenCookie.

@startuml

participant UA
participant CDN
participant Origin

autonumber "<b><00>"
UA -> CDN : GET https://example-cdn.com/path/object\n[Cookie: TokenCookie=<access_token>]
activate CDN

CDN->CDN:validate access_token

alt valid access_token
  CDN -> CDN: extract access_token //subject//\nand add it to cache key
else missing or invalid token
  CDN -> CDN: skip cache (same as cache-miss)
end

alt invalid access_token OR cache-miss
  alt config:use_redirect=//true//
    CDN -> Origin : HEAD https://example.com/path/object
    activate Origin
  else config:use_redirect=//false//
    CDN -> Origin : GET https://example.com/path/object
    deactivate CDN
  end

  Origin -> Origin: trigger authentication\n+ authorization flow
  note over UA,Origin #white
    Origin <=> UA authentication and authorization flow using IdMS and DS, etc.
  endnote

  alt user unauthorized
    Origin -> CDN : 401 Unauthorized
    activate CDN
    CDN -> UA : 401 Unauthorized
  else user authorized

    Origin -> Origin:create or reuse\naccess_token

    Origin -> CDN: 200 OK\nTokenRespHdr: <access_token>
    deactivate Origin

    CDN->CDN:validate access_token

    alt invalid access_token
      CDN -> UA : 520 Error
    else
      alt config:use_redirect=//true//
        CDN -> UA : 302 Redirect\nSet-Cookie: TokenCookie=<access_token>\nLocation: https://example-cdn.com/path/object
      else config:use_redirect=//false//
        CDN -> UA : 200 OK\nSet-Cookie: TokenCookie=<access_token>
      end
    end

  end
else valid access_token AND cache-hit
  CDN -> UA : 200 OK
  deactivate CDN
end


@enduml

Use Case 1: Proxy only mode using HTTP cookies.

<01-02>. When a request from the UA is received at the CDN, the value of TokenCookie is extracted and the access token is validated (a missing cookie or access token is treated the same as an invalid one).

<03>. If the access token is valid, its opaque subject is extracted and added to the cache key, and a cache lookup is performed.

<06>. A missing or invalid access token, or a cache-miss, leads to forwarding the request to the origin for user authorization and/or for fetching the object.

<07>. The origin performs authentication and authorization of the user using IdMS, DS, etc. All related authentication and authorization flows are out of scope of this document.

<08-09>. If the user is unauthorized then 401 Unauthorized is passed from the origin to the CDN and then to UA.

<10-11>. If the user is authorized then an access token is returned by a response header TokenRespHdr to the CDN and gets validated at the Edge before setting the TokenCookie.

<12-13>. If the validation of the access token received from the origin fails the origin response is considered invalid and 520 Error is returned to UA.

<15>. If the validation of the access token received from the origin succeeds then the object is returned to UA and a TokenCookie is set to the new access token with the CDN response.

<16>. If the access token in the initial UA request is valid and there is a cache-hit, the corresponding object is delivered to the UA.

In this use case a request with a missing or invalid token is never cached (the cache is skipped), since the subject from the access token is either absent or cannot be trusted for a cache lookup. Because Apache Traffic Server cannot change the cache key once it has received the origin response, it is also impossible to cache the object based on the new valid token just received from the origin.

All subsequent requests with a valid token will be cached normally. If the access token is valid for a long enough time, not caching just the first request should not be a problem. For use cases where not caching the first request is unacceptable, see use case 2.
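The decision flow of use case 1 can be summarized in a short Python sketch. The names `edge_decision` and `OriginResponse` are hypothetical and not taken from the plugin source, and the handling of an origin response that carries no TokenRespHdr (passed through unchanged) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class OriginResponse:
    status: int
    token_valid: Optional[bool] = None  # None: no TokenRespHdr in the response


def edge_decision(ua_token_valid: bool, cache_hit: bool,
                  origin: Optional[OriginResponse]) -> Tuple[int, str, bool]:
    """Return (status sent to UA, cache state, whether TokenCookie is set)."""
    # Steps 03 and 16: a valid UA token allows a cache lookup keyed by subject.
    if ua_token_valid and cache_hit:
        return 200, "hit", False
    # Step 04: without a trusted subject the cache is skipped entirely.
    cache_state = "miss" if ua_token_valid else "skipped"
    if origin is None:
        raise ValueError("origin response required when not served from cache")
    # Steps 08-09: the origin's rejection is relayed to the UA.
    if origin.status == 401:
        return 401, cache_state, False
    # Assumption: no token from the origin means nothing to validate or set.
    if origin.token_valid is None:
        return origin.status, cache_state, False
    # Steps 12-13: an invalid origin token invalidates the whole response.
    if not origin.token_valid:
        return 520, cache_state, False
    # Step 15: the object is returned together with the new TokenCookie.
    return origin.status, cache_state, True
```

For example, `edge_decision(False, False, OriginResponse(200, True))` models the first request of a new user: the cache is skipped and the cookie is set.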

Use Case 2: Proxy only mode using HTTP cookies and redirects.

This use case is similar to use case 1 but makes sure all (cacheable) requests are cached (even the one that does not have a valid access token in the UA request).

<05>. In case of an invalid access token, instead of forwarding the original HTTP request to the origin, a HEAD request with all headers from the original request is sent to the origin.

<14>. When the origin responds with a valid access token in TokenRespHdr, the CDN sets the TokenCookie by using a 302 Redirect response with a Location header containing the URI of the original UA request.

In this way, after the initial failure the UA request is repeated with a valid access token and can be safely cached in the CDN cache (if the object is cacheable).

Support for this use case is not implemented yet.

Access token

The access token could contain the following claims:

  • subject - the subject of the access token is an opaque string provided by the application, in use case 1 and use case 2 the subject signifies a target audience

  • expiration - Unix time after which the access token is not considered valid (expired)

  • not before time - Unix time before which the access token is not considered valid (used before its time)

  • issued at time - Unix time the access token was issued

  • token id - unique opaque token id for debugging and tracking, assigned by the application

  • version - access token version

  • scope - defines the scope in which this subject is valid

  • key id - the key in a map or database of secrets to be used to calculate the digest

  • signature type - name of the HMAC hash function / cryptographic signature scheme to be used for calculating the message digest, supported signature types: HMAC-SHA-256, HMAC-SHA-512, RSA-PSS (still not implemented)

  • message digest - the message digest that signs the access token
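The timing claims above can be checked with a few lines of Python. This is a minimal sketch and not the plugin's code: the function name `timing_ok` is hypothetical, and treating the boundary instants themselves as valid is an assumption derived from the wording "after which" and "before which".

```python
import time
from typing import Optional


def timing_ok(expiration: int, not_before: Optional[int] = None,
              now: Optional[int] = None) -> bool:
    """Check the expiration and not-before claims (Unix times) against now."""
    now = int(time.time()) if now is None else now
    # Expired: the current time is after the expiration claim.
    if now > expiration:
        return False
    # Used before its time: only checked when the optional claim is present.
    if not_before is not None and now < not_before:
        return False
    return True
```

The issued-at claim is informational in this sketch and does not take part in the decision.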

To make the plugin more configurable and to support more use cases, various formats could be supported in the future, e.g. Named Claim formats, Positional Claim formats, JWT [4], etc.

By design, the format of the access token is specified only through the plugin configuration and not through the access token itself, since migrations from one format to another during upgrades are not expected in normal circumstances.

Changes in claim names (claim positions in Positional Claims), in their interpretation, adding new claims, removing claims, switching from required to optional and vice versa will be handled by having a version claim in the token.

Version and signature type claims are part of the token ("user input") to allow easier migration between different versions and signature types, but they could be overridable through configuration in future versions to force the usage only to specific versions or signature types (in which case the corresponding claim could be omitted from the token and would be ignored if specified).

The following Named Claim format is the only one currently supported.

Query-Param-Style Named Claim format

  • claim names
    • sub for subject, required

    • exp for expiration, required

    • nbf for not before time, optional

    • iat for issued at time, optional

    • tid for token id, optional

    • ver for version, optional, defaults to ver=1 if not specified.

    • scope for scope, optional, ignored by the current version of the plugin, still not finalized (more applications and their use cases need to be studied to finalize the format)

    • kid for key id, required (tokens to be always signed)

    • st for signature type, optional (default would be SHA-256 if not specified)

    • md for message digest - this claim is required and expected to be always the last claim.

  • delimiters
    • claims are separated by &

    • keys and values in each claim are separated by =

  • notes and limitations
    • if any claim value contains & or =, escaping would be necessary (e.g. through Percent-Encoding [6])

    • the size of the access token cannot be larger than 4K, to limit the amount of data the application can fit in the opaque claims; in general the access token is meant to be small since it could end up stored in a cookie and be sent as part of a large number of requests.

    • during signing the access token payload should end with &md= and the calculated message digest would be appended directly to the payload to form the token (see the example below).
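The format rules above translate into a small parser. The following Python sketch is illustrative only: `parse_token` and `MAX_TOKEN_SIZE` are hypothetical names, interpreting "4K" as 4096 bytes is an assumption, and so is rejecting duplicate claim names.

```python
from typing import Dict
from urllib.parse import unquote

MAX_TOKEN_SIZE = 4096  # assumption: "4K" read as 4096 bytes


def parse_token(token: str) -> Dict[str, str]:
    """Split a Query-Param-Style Named Claim token into a dict of claims."""
    if len(token.encode("utf-8")) > MAX_TOKEN_SIZE:
        raise ValueError("token too large")
    pairs = token.split("&")              # claims are separated by &
    claims: Dict[str, str] = {}
    for pair in pairs:
        name, sep, value = pair.partition("=")   # name and value split on =
        if not sep or not name or name in claims:
            raise ValueError(f"malformed claim: {pair!r}")
        claims[name] = unquote(value)     # undo Percent-Encoding of & and =
    for required in ("sub", "exp", "kid", "md"):
        if required not in claims:
            raise ValueError(f"missing required claim: {required}")
    if not pairs[-1].startswith("md="):
        raise ValueError("md must be the last claim")
    claims.setdefault("ver", "1")         # documented default version
    return claims
```

Signature verification is intentionally left out here; it has to run over the raw token text up to and including `&md=`, not over the decoded claims.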

Let us say we have a user Kermit the Frog and a user Michigan J. Frog who are part of a target audience frogs-in-a-well and a user Nemo the Clownfish who is part of a target audience fish-in-a-sea.

Both users Kermit the Frog and Michigan J. Frog will be authorized with the following access token:

sub="frogs-in-a-well"   # opaque id assigned by origin
exp="1577836800"       # token expires   : 01/01/2020 @ 12:00am (UTC)
nbf="1514764800"       # don't use before: 01/01/2018 @ 12:00am (UTC)
iat="1514160000"       # token issued at : 12/25/2017 @ 12:00am (UTC)
tid="1234567890"       # unique opaque id assigned by origin (e.g. UUID)
kid="key1"             # secret corresponding to this key is 'PEIFtmunx9'

Constructing the access token using the openssl tool (from bash):

payload='sub=frogs-in-a-well&exp=1577836800&nbf=1514764800&iat=1514160000&tid=1234567890&kid=key1&st=HMAC-SHA-256&md='
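# printf avoids echo portability issues; awk drops the "(stdin)= " style prefix printed by many openssl versions
signature=$(printf '%s' "${payload}" | openssl dgst -sha256 -hmac "PEIFtmunx9" | awk '{print $NF}')
access_token="${payload}${signature}"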
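The same signing step, together with the matching verification, can be sketched in Python using only the standard library. The function names `sign` and `verify` and the `HASHES` table are hypothetical; the sketch assumes the digest is the lowercase hex HMAC of the payload text including the trailing `&md=`, as in the openssl command above.

```python
import hashlib
import hmac
from typing import Mapping

# Signature types listed as supported; RSA-PSS is not covered by this sketch.
HASHES = {"HMAC-SHA-256": hashlib.sha256, "HMAC-SHA-512": hashlib.sha512}


def sign(payload: str, secret: bytes, st: str = "HMAC-SHA-256") -> str:
    """Append the hex message digest to a payload that ends with '&md='."""
    if not payload.endswith("&md="):
        raise ValueError("payload must end with '&md='")
    digest = hmac.new(secret, payload.encode("utf-8"), HASHES[st]).hexdigest()
    return payload + digest


def verify(token: str, keys: Mapping[str, bytes]) -> bool:
    """Recompute the digest with the secret named by kid and compare."""
    head, sep, digest = token.rpartition("&md=")
    if not sep:
        return False
    claims = dict(pair.partition("=")[::2] for pair in head.split("&"))
    secret = keys.get(claims.get("kid", ""))
    st = claims.get("st", "HMAC-SHA-256")
    if secret is None or st not in HASHES:
        return False
    expected = hmac.new(secret, (head + sep).encode("utf-8"),
                        HASHES[st]).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("ascii"), digest.encode("utf-8"))
```

Looking the secret up by `kid` is what makes key rotation possible: old tokens keep verifying against their original key while new tokens are signed with a newer one.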

The application would create and send the access token in a response header TokenRespHdr:

TokenRespHdr: sub=frogs-in-a-well&exp=1577836800&nbf=1514764800&iat=1514160000&tid=1234567890&kid=key1&st=HMAC-SHA-256&md=8879af98ab6071315a7ab55e5245cbe1c106303bcc4690cbfc807a4402d11ab3

CDN would set a cookie TokenCookie:

TokenCookie=c3ViPWZyb2dzLWluLWEtd2VsbCZleHA9MTU3NzgzNjgwMCZuYmY9MTUxNDc2NDgwMCZpYXQ9MTUxNDE2MDAwMCZ0aWQ9MTIzNDU2Nzg5MCZraWQ9a2V5MSZzdD1ITUFDLVNIQS0yNTYmbWQ9ODg3OWFmOThhYjYwNzEzMTVhN2FiNTVlNTI0NWNiZTFjMTA2MzAzYmNjNDY5MGNiZmM4MDdhNDQwMmQxMWFiMw; Expires=Wed, 01 Jan 2020 00:00:00 GMT; Secure; HttpOnly

The value of the cookie is the access token provided by the origin, encoded with base64url encoding without padding [4].
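A minimal Python sketch of the cookie value encoding and decoding follows. The helper names `b64url_encode` and `b64url_decode` are hypothetical; the only technique used is standard base64url with the trailing `=` padding removed on encode and restored on decode.

```python
import base64


def b64url_encode(token: str) -> str:
    """base64url-encode the token and strip the '=' padding."""
    raw = base64.urlsafe_b64encode(token.encode("utf-8"))
    return raw.rstrip(b"=").decode("ascii")


def b64url_decode(value: str) -> str:
    """Restore the stripped padding, then base64url-decode."""
    padding = "=" * (-len(value) % 4)
    return base64.urlsafe_b64decode(value + padding).decode("utf-8")


print(b64url_encode("sub=frogs-in-a-well"))  # c3ViPWZyb2dzLWluLWEtd2VsbA
```

The printed value matches the beginning of the TokenCookie value above, up to the point where the next claim changes the final characters.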

The following attributes are added to the Set-Cookie header [5]:

  • Secure - instructs the UA to include the cookie in an HTTP request only if the request is transmitted over a secure channel, typically HTTP over Transport Layer Security (TLS)

  • HttpOnly - instructs the UA to omit the cookie when providing access to cookies via "non-HTTP" APIs such as a web browser API that exposes cookies to scripts

  • Expires - this attribute will be set to the time specified in the expiration claim
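Putting the attributes together, a Set-Cookie value could be assembled as in the following Python sketch. `build_set_cookie` is a hypothetical helper, not the plugin's function; it assumes the token has already been base64url encoded and that the expiration claim is passed as a Unix time.

```python
from email.utils import formatdate


def build_set_cookie(name: str, encoded_token: str, exp: int) -> str:
    """Build a Set-Cookie value with Expires taken from the exp claim."""
    # formatdate with usegmt=True yields the HTTP date format, ending in GMT.
    expires = formatdate(exp, usegmt=True)
    return f"{name}={encoded_token}; Expires={expires}; Secure; HttpOnly"


print(build_set_cookie("TokenCookie", "c3ViPWZyb2dzLWluLWEtd2VsbA", 1577836800))
# TokenCookie=c3ViPWZyb2dzLWluLWEtd2VsbA; Expires=Wed, 01 Jan 2020 00:00:00 GMT; Secure; HttpOnly
```

Tying Expires to the expiration claim means the UA drops the cookie at the moment the token would stop validating anyway.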

For reference, the following access token would be assigned to the user Nemo the Clownfish:

sub=fish-in-a-sea&exp=1577836800&nbf=1514764800&iat=1514160000&tid=2345678901&kid=key1&st=HMAC-SHA-256&md=a43d8a46804d9e9319b7d1337007eed73daf37105f1feaae1d68567389654f88

Plugin configuration

  • Specify where to look for the access token
    • --check-cookie=<cookie_name> (optional, default:empty/unused) - specifies the name of the cookie that contains the access token. Although it is optional, the plugin does not perform access control if it is not specified, since this is the only currently supported access token source.

    • --token-response-header=<header_name> (optional, default:empty/unused) - specifies the name of the origin response header that carries the access token from the origin to the CDN. Although it is optional, this is the only currently supported way to get the access token from the origin.

  • Signify some common failures through HTTP status code.
    • --invalid-syntax-status-code=<http_status_code> (optional, default:400) - access token bad syntax error

    • --invalid-signature-status-code=<http_status_code> (optional, default:401) - invalid access token signature

    • --invalid-timing-status-code=<http_status_code> (optional, default:403) - bad timing when validating the access token (expired, or too early)

    • --invalid-scope-status-code=<http_status_code> (optional, default:403) - access token scope validation failed

    • --invalid-origin-response=<http_status_code> (optional, default:520) - origin response did not look right, i.e. the access token provided by origin is not valid.

    • --internal-error-status-code=<http_status_code> (optional, default:500) - unexpected internal error (ideally should not happen)

  • Extract information into a request header
    • --extract-subject-to-header=<header_name> (optional, default:empty/unused) - extract the access token subject claim into a request header with <header_name> for debugging and logging purposes, or to be able to modify the cache key by using the Cache Key and Parent Selection URL Manipulation Plugin.

    • --extract-tokenid-to-header=<header_name> (optional, default:empty/unused) - extract the access token token id claim into a request header with <header_name> for debugging and logging purposes

    • --extract-status-to-header=<header_name> (optional, default:empty/unused) - extract the access token validation status into a request header with <header_name> for debugging and logging purposes

  • Plugin setup related
    • --symmetric-keys-map=<txt_file_name> (optional, default: empty/unused) - the name of a file containing a map of symmetric secrets, one per line in the format key_name_N=secret_value_N (key names are used in access token signature validation; multiple keys are useful for key rotation). Although it is optional, this is the only supported source of secrets, and if it is not specified access token validation will always fail.

    • --include-uri-paths-file (optional, default:empty/unused) - a file containing a list of regular expressions to be matched against URI paths. The access control is applied to paths that match.

    • --exclude-uri-paths-file (optional, default:empty/unused) - a file containing a list of regular expressions to be matched against URI paths. The access control is applied to paths that do not match.

  • Behavior modifiers to support various use-cases
    • --reject-invalid-token-requests (optional, default:false) - reject invalid token requests instead of forwarding them to origin.

    • --use-redirects (optional, default:false) - used to configure use case 2, not implemented yet.
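The status code options listed above amount to a table of defaults with per-option overrides. The following Python sketch models that table; the `Failure` enum, `DEFAULT_STATUS` and `status_for` are hypothetical names used for illustration, while the option names and default values are the ones documented above.

```python
from enum import Enum
from typing import Mapping, Optional


class Failure(Enum):
    # Enum values are the documented option names without the leading "--".
    INVALID_SYNTAX = "invalid-syntax-status-code"
    INVALID_SIGNATURE = "invalid-signature-status-code"
    INVALID_TIMING = "invalid-timing-status-code"
    INVALID_SCOPE = "invalid-scope-status-code"
    INVALID_ORIGIN_RESPONSE = "invalid-origin-response"
    INTERNAL_ERROR = "internal-error-status-code"


DEFAULT_STATUS = {
    Failure.INVALID_SYNTAX: 400,
    Failure.INVALID_SIGNATURE: 401,
    Failure.INVALID_TIMING: 403,
    Failure.INVALID_SCOPE: 403,
    Failure.INVALID_ORIGIN_RESPONSE: 520,
    Failure.INTERNAL_ERROR: 500,
}


def status_for(failure: Failure,
               overrides: Optional[Mapping[Failure, int]] = None) -> int:
    """Return the configured HTTP status for a failure, else the default."""
    return {**DEFAULT_STATUS, **(overrides or {})}[failure]
```

For instance, `status_for(Failure.INVALID_TIMING)` gives 403 unless an override is supplied for that failure.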

Configuration and Troubleshooting examples

The following configuration can be used to implement use case 1.

Configuration files

  • Apache Traffic Server remap.config

The Cache Key and Parent Selection URL Manipulation Plugin is used to add the access token subject to the cache key (@TokenSubject) and should always follow the Access Control Plugin in the remap rule for this mechanism to work properly.

map https://example-cdn.com http://example.com \
    @plugin=access_control.so \
        @pparam=--symmetric-keys-map=hmac_keys.txt \
        @pparam=--check-cookie=TokenCookie \
        @pparam=--extract-subject-to-header=@TokenSubject \
        @pparam=--extract-tokenid-to-header=@TokenId \
        @pparam=--extract-status-to-header=@TokenStatus \
        @pparam=--token-response-header=TokenRespHdr \
    @plugin=cachekey.so \
        @pparam=--static-prefix=views \
        @pparam=--include-headers=@TokenSubject

  • Secrets map hmac_keys.txt

$ cat etc/trafficserver/hmac_keys.txt
key1=PEIFtmunx9
key2=BtYjpTbH6a
key3=SS75kgYonh
key4=qMmCV2vUsu
key5=YfMxMaygax
key6=tVeuPtfJP8
key7=oplEZT5CpB
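A file in this format is straightforward to load into a key id to secret map. The Python sketch below is an illustration under stated assumptions, not the plugin's loader: `load_keys` is a hypothetical name, blank lines are assumed to be ignorable, and each line is split on the first `=` so that a secret may itself contain `=`.

```python
from typing import Dict, Iterable


def load_keys(lines: Iterable[str]) -> Dict[str, str]:
    """Parse key_name=secret_value lines into a dict."""
    keys: Dict[str, str] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue                      # assumption: blank lines are skipped
        name, sep, secret = line.partition("=")  # split on the first '=' only
        if not sep or not name:
            raise ValueError(f"malformed key line: {line!r}")
        keys[name] = secret
    return keys


print(load_keys(["key1=PEIFtmunx9\n", "key2=BtYjpTbH6a\n"]))
# {'key1': 'PEIFtmunx9', 'key2': 'BtYjpTbH6a'}
```

An open file object can be passed directly, since iterating over it yields lines.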

  • Format the access_control.log

logging:
  formats:
  - format: '%<cqtq> sub=%<{@TokenSubject}cqh> tid=%<{@TokenId}cqh> status=%<{@TokenStatus}cqh> cache=%<{x-cache}psh> key=%<{x-cache-key}psh>'
    name: access_control_format
  logs:
  - filename: access_control
    format: access_control_format
    mode: ascii

  • X-Debug plugin added to plugin.config

$ cat etc/trafficserver/plugin.config
xdebug.so

Configuration tests and troubleshooting

Let us assume that the origin responds with the access tokens considered in Query-Param-Style Named Claim format, corresponding to user Kermit the Frog and user Nemo the Clownfish.

Suppose user Kermit the Frog sends a request without a valid token, i.e. TokenCookie is missing from the request. The cache key would be /views/object but is never used since the cache is always skipped.

After the origin responds with a valid access token (assuming the user authentication and authorization succeeded), the plugin will respond with a Set-Cookie header containing the new access token.

$ curl -sD - https://example-cdn.com/object \
    -H 'X-Debug:X-Cache, X-Cache-Key' \
  | grep -ie 'x-cache' -e 'tokencookie'

set-cookie: TokenCookie=c3ViPWZyb2dzLWluLWEtd2VsbCZleHA9MTU3NzgzNjgwMCZuYmY9MTUxNDc2NDgwMCZpYXQ9MTUxNDE2MDAwMCZ0aWQ9MTIzNDU2Nzg5MCZraWQ9a2V5MSZzdD1ITUFDLVNIQS0yNTYmbWQ9ODg3OWFmOThhYjYwNzEzMTVhN2FiNTVlNTI0NWNiZTFjMTA2MzAzYmNjNDY5MGNiZmM4MDdhNDQwMmQxMWFiMw; Expires=Wed, 01 Jan 2020 00:00:00 GMT; Secure; HttpOnly
x-cache-key: /views/object
x-cache: skipped

Now let us send the same request with a valid access token by adding the TokenCookie to the request. The cache key will be /views/@TokenSubject:frogs-in-a-well/object, but since the object is not in the cache we get a miss.

$ curl -sD - https://example-cdn.com/object \
    -H 'X-Debug:X-Cache, X-Cache-Key' \
    -H 'cookie: TokenCookie=c3ViPWZyb2dzLWluLWEtd2VsbCZleHA9MTU3NzgzNjgwMCZuYmY9MTUxNDc2NDgwMCZpYXQ9MTUxNDE2MDAwMCZ0aWQ9MTIzNDU2Nzg5MCZraWQ9a2V5MSZzdD1ITUFDLVNIQS0yNTYmbWQ9ODg3OWFmOThhYjYwNzEzMTVhN2FiNTVlNTI0NWNiZTFjMTA2MzAzYmNjNDY5MGNiZmM4MDdhNDQwMmQxMWFiMw' \
  | grep -ie 'x-cache' -e 'tokencookie'

x-cache-key: /views/@TokenSubject:frogs-in-a-well/object
x-cache: miss

Now let us send the same request again; since the object is now in the cache we get hit-fresh.

$ curl -sD - https://example-cdn.com/object \
    -H 'X-Debug:X-Cache, X-Cache-Key' \
    -H 'cookie: TokenCookie=c3ViPWZyb2dzLWluLWEtd2VsbCZleHA9MTU3NzgzNjgwMCZuYmY9MTUxNDc2NDgwMCZpYXQ9MTUxNDE2MDAwMCZ0aWQ9MTIzNDU2Nzg5MCZraWQ9a2V5MSZzdD1ITUFDLVNIQS0yNTYmbWQ9ODg3OWFmOThhYjYwNzEzMTVhN2FiNTVlNTI0NWNiZTFjMTA2MzAzYmNjNDY5MGNiZmM4MDdhNDQwMmQxMWFiMw' \
  | grep -ie 'x-cache' -e 'tokencookie'

x-cache-key: /views/@TokenSubject:frogs-in-a-well/object
x-cache: hit-fresh

The previous activity should result in the following log entries (in the format defined in logging.yaml):

1521588755.424 sub=- tid=- status=U_UNUSED,O_VALID cache=skipped key=/views/object
1521588986.262 sub=frogs-in-a-well tid=1234567890 status=U_VALID,O_UNUSED cache=miss key=/views/@TokenSubject:frogs-in-a-well/object
1521589276.535 sub=frogs-in-a-well tid=1234567890 status=U_VALID,O_UNUSED cache=hit-fresh key=/views/@TokenSubject:frogs-in-a-well/object
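For troubleshooting, log lines in this format are easy to split into fields. The Python sketch below is a hypothetical helper (`parse_log_line` is not part of Apache Traffic Server) and assumes that field values contain no whitespace, which holds for the lines shown above.

```python
from typing import Dict


def parse_log_line(line: str) -> Dict[str, str]:
    """Split one access_control log line into a dict of its fields."""
    timestamp, *fields = line.split()
    record = {"time": timestamp}
    for field in fields:
        # Split on the first '=' only, so values may contain '=' themselves.
        name, _, value = field.partition("=")
        record[name] = value
    return record


line = ("1521588755.424 sub=- tid=- status=U_UNUSED,O_VALID "
        "cache=skipped key=/views/object")
record = parse_log_line(line)
print(record["status"], record["cache"])  # U_UNUSED,O_VALID skipped
```

Filtering on `record["cache"] == "skipped"` is a quick way to see how many requests arrived without a usable token.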

For reference, the same request from user Nemo the Clownfish, with a different subject/target audience fish-in-a-sea, will end up with cache key /views/@TokenSubject:fish-in-a-sea/object and will never match the object cached for users in the frogs-in-a-well audience, since they use cache key /views/@TokenSubject:frogs-in-a-well/object.
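The separation between audiences can be illustrated by imitating the layout of the cache keys shown in the examples. The Python sketch below is not the Cache Key plugin's implementation; `cache_key` is a hypothetical function that only reproduces the observed key shape (static prefix, optional subject header, path).

```python
def cache_key(prefix: str, path: str, subject: str = "") -> str:
    """Compose a key shaped like the x-cache-key values in the examples."""
    parts = ["", prefix]
    if subject:
        # The subject header only appears when a valid token supplied one.
        parts.append(f"@TokenSubject:{subject}")
    return "/".join(parts) + path


frogs = cache_key("views", "/object", "frogs-in-a-well")
fish = cache_key("views", "/object", "fish-in-a-sea")
print(frogs)  # /views/@TokenSubject:frogs-in-a-well/object
print(fish)   # /views/@TokenSubject:fish-in-a-sea/object
```

Since the two keys differ, a lookup for one audience can never return the object stored for the other, at the cost of one cached copy per audience.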


References