Handling data in OPA policies

Oct 27th, 2021

Omri Gazitt

Gert Drapers

Open Policy Agent

binary code as a metaphor for data — Image by Gerd Altmann from Pixabay

The Open Policy Agent (OPA) is a decision engine for evaluating a policy based on a set of inputs. Aserto uses OPA to gate the decision of whether a user is authorized to perform a certain operation on a particular resource. Because authorization sits in the critical path of nearly every application request, how you pass the data needed for a decision into the decision engine is a key consideration in designing a robust authorization system.

This post presents four patterns for handling data, each with its own tradeoffs.

Inputs

The decision engine needs three pieces of information in order to evaluate a decision:

Policy context: the set of rules that govern the decision
Identity context: the set of user attributes that can be referenced in the policy
Resource context: the resource (or scope of resources) that the decision applies to

These three inputs are provided by the caller. For example, when you call the is(allowed) API to determine whether a user is allowed to perform an operation on a resource, the payload you’ll send to the authorizer looks something like this:

{
  "policyContext": {
    "decisions": [
      "allowed"
    ],
    "id": "15e4ecf8-dfa0-11eb-98f2-018bdf252971",
    "path": "peoplefinder.PUT.api.users.__id"
  },
  "identityContext": {
    "identity": "euang@acmecorp.com",
    "type": "IDENTITY_TYPE_SUB"
  },
  "resourceContext": {
    "id": "dfdadc39-7335-404d-af66-c77cf13a15f8"
  }
}
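
As a minimal sketch of how a caller might assemble this payload, the Python below builds the same three contexts from plain values. The helper name build_is_request is hypothetical and is not part of any Aserto SDK; only the field names and sample values come from the payload above.

```python
import json


def build_is_request(policy_id, policy_path, decisions, identity, resource_id):
    """Assemble the policy, identity, and resource contexts into one request body.

    Hypothetical helper for illustration; it only mirrors the JSON shape shown above.
    """
    return {
        "policyContext": {
            "decisions": list(decisions),
            "id": policy_id,
            "path": policy_path,
        },
        "identityContext": {
            "identity": identity,
            "type": "IDENTITY_TYPE_SUB",
        },
        "resourceContext": {"id": resource_id},
    }


payload = build_is_request(
    policy_id="15e4ecf8-dfa0-11eb-98f2-018bdf252971",
    policy_path="peoplefinder.PUT.api.users.__id",
    decisions=["allowed"],
    identity="euang@acmecorp.com",
    resource_id="dfdadc39-7335-404d-af66-c77cf13a15f8",
)

# Serialize to JSON, ready to send as a request body.
print(json.dumps(payload, indent=2))
```

Keeping the three contexts as separate arguments makes it harder to forget one of them when calling the authorizer.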

Policy context

A policy is stored and versioned in a policy registry like the Open Policy Registry (OPCR). The Aserto Authorizer can load one or more policies from the registry and have the OPA decision engine hold them in memory. The policy context that the caller provides contains a policy ID, a set of decisions to compute, and a policy path, all of which instruct the OPA engine on which of the rules to execute and which decisions to return.

Identity context

The identity context is often passed in as a JSON web token (JWT), or a JWT subject. Unless every attribute about a user is packed into the JWT, the identity context by itself is often insufficient for making policy decisions.

Most policies are written over roles that the user belongs to (commonly referred to as role-based access control, or RBAC); or attributes about the user that are stored in an identity provider or directory (attribute-based access control, or ABAC).

This is such a common scenario that Aserto has a special mechanism for handling this type of user data: we cache all the user information in our Authorizer, use the identity context as a key for looking up all the known data about that user in constant time, and make this data available to the policy as input.user. For example, directory attributes such as department are made available to the policy as input.user.department, and roles can be made available in a similar fashion.

For example, for the payload we introduced above, the identity context is “exploded” into a full set of attributes about the user, which are cached in the Authorizer.

{
  "identity": {
    "redacted": "..."
  },
  "policy": {
    "redacted": "..."
  },
  "user": {
    "id": "dfdadc39-7335-404d-af66-c77cf13a15f8",
    "display_name": "Euan Garden",
    "email": "euang@acmecorp.com",
    "identities": {
      "redacted": "..."
    },
    "applications": {
      "peoplefinder": {
        "properties": {
          "department": "Sales Engagement Management",
          "manager": "2bfaa552-d9a5-41e9-a6c3-5be62b4433c8",
          "phone": "+1-804-555-3383",
          "title": "Salesperson"
        },
        "roles": [
          "viewer"
        ]
      }
    },
    "attributes": {
      "redacted": "..."
    },
    "metadata": {
       "redacted": "..."
    }
  },
  "resource": {
    "redacted": "..."
  }
}

The example above is redacted for brevity; here's a full gist.
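
The lookup described above can be sketched in a few lines of Python. This is an illustration of the idea, not Aserto's implementation: USER_CACHE is a hypothetical in-memory dictionary standing in for the Authorizer's cache, and explode_identity is an invented name. Only the user and resource keys are built here, since the other keys are redacted in the example.

```python
# Hypothetical stand-in for the Authorizer's user cache, keyed by identity.
USER_CACHE = {
    "euang@acmecorp.com": {
        "id": "dfdadc39-7335-404d-af66-c77cf13a15f8",
        "display_name": "Euan Garden",
        "email": "euang@acmecorp.com",
        "applications": {
            "peoplefinder": {
                "properties": {
                    "department": "Sales Engagement Management",
                    "title": "Salesperson",
                },
                "roles": ["viewer"],
            }
        },
    }
}


def explode_identity(request, user_cache):
    """Replace the identity key with the cached user record (None if unknown)."""
    identity = request["identityContext"]["identity"]
    return {
        # Dictionary lookup: constant time on average.
        "user": user_cache.get(identity),
        "resource": request.get("resourceContext", {}),
    }


request = {
    "identityContext": {"identity": "euang@acmecorp.com", "type": "IDENTITY_TYPE_SUB"},
    "resourceContext": {"id": "dfdadc39-7335-404d-af66-c77cf13a15f8"},
}
policy_input = explode_identity(request, USER_CACHE)

# A policy would read attributes from the exploded user record.
department = policy_input["user"]["applications"]["peoplefinder"]["properties"]["department"]
print(department)
```

An identity that is missing from the cache yields a user of None in this sketch, which a policy would typically treat as a reason to deny.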

Resource context

The resource context is passed into the policy evaluation as a JSON object, and often contains keys to index other data structures.

Modeling data in policies

There are many scenarios where a policy wants to refer to additional data that should be used for evaluating authorization decisions. We’ve already described the most common one, which is to automatically “explode” the attributes of a user and make them available as input.user.

Stepping back, there are four ways to model data in policies, each involving a different set of tradeoffs between how live the data is and how much latency may be incurred in retrieving it.

Embedding data in a policy

The most straightforward way to provide “lookup” data in a policy is to bundle it along with the policy itself. This is done by embedding one or more data.json files inside of the policy. OPA loads this data in memory along with the policy itself. A good example of this is how Aserto’s own OPA policy groups permissions into roles using a data.json file.

Tradeoffs:

Latency: since the data is resident in memory, there is no additional overhead in loading it in at policy evaluation time.
Memory: since OPA loads the entire policy, the size of the data embedded in the policy has a direct impact on the working set of the engine.
Liveness: any refresh of the data requires creating a new policy bundle and reloading it into the OPA engine. This can take a minute or two, depending on how often the Aserto authorizer is configured to poll the policy registry for new versions of the policy (the default is 60 seconds, with a maximum of 120 seconds).

Use this mechanism for a modest amount of data that doesn’t change very often, or that changes along with the policy: for example, lookup tables that map roles into more granular permissions.
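
To make the role-to-permission lookup concrete, here is a small Python sketch of the logic. It is not Rego and not Aserto's actual data.json: the role names, permission names, and the has_permission helper are all invented for illustration.

```python
import json

# Hypothetical data.json content; role and permission names are invented.
DATA_JSON = """
{
  "roles": {
    "viewer": ["users.read"],
    "editor": ["users.read", "users.update"]
  }
}
"""

# Parsed once, analogous to the engine loading bundled data into memory.
DATA = json.loads(DATA_JSON)


def has_permission(user_roles, permission, data=DATA):
    """Return True if any of the user's roles grants the permission."""
    return any(permission in data["roles"].get(role, []) for role in user_roles)


print(has_permission(["viewer"], "users.read"))
print(has_permission(["viewer"], "users.update"))
```

Because the table is resident in memory, each check is a pair of dictionary and list lookups with no I/O, which is the latency property described above. Changing the table means shipping a new bundle.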

Piggy-backing on the User mechanism

Aserto already knows how to load user attributes. By placing custom data into the user attributes in the Aserto directory, a policy can refer to these attributes. For example, to model per-tenant roles, Aserto uses the applications block in the user structure to store roles for each tenant that a user is a member of, and the policy can index into the applications block using the tenant ID in order to look up these roles at policy evaluation time.

Tradeoffs:

Latency: the data is loaded from the cached Aserto directory in constant time.
Memory: the data sits in an embedded database on the same container image that runs the OPA engine, so data is loaded on-demand. The dir.user() call is implemented as a natively compiled OPA built-in function.
Liveness: the data is cached in an embedded database on the authorizer container image. Refreshing the data requires calling one of the various APIs to set user roles, properties, or permissions. The new data can be made available in milliseconds to a small number of seconds.

Use this mechanism if the data you want to use in the policy is easy to model as a set of attributes (or permissions on resources) for each of your users. For example, if your application scopes resources under an “organization”, “tenant”, or “project”, and each user can belong to one or more of a bounded set of these constructs, this is a great mechanism to use. For a more concrete example, we wrote a blog post on how we use this mechanism to grant tenant-specific roles to Aserto users.
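
The per-tenant role lookup can be sketched as follows. This is plain Python that mimics what a policy would do with the user record, under the assumption that roles live at applications, then the tenant ID, then roles, as in the user example earlier. The tenant IDs and function names are hypothetical.

```python
def roles_for_tenant(user, tenant_id):
    """Index into the user's applications block by tenant ID.

    Returns an empty list when the user is not a member of the tenant.
    """
    application = user.get("applications", {}).get(tenant_id, {})
    return application.get("roles", [])


def is_allowed(user, tenant_id, required_role):
    """Allow only if the user holds the required role in that tenant."""
    return required_role in roles_for_tenant(user, tenant_id)


# Hypothetical user who belongs to two tenants with different roles.
user = {
    "id": "dfdadc39-7335-404d-af66-c77cf13a15f8",
    "applications": {
        "tenant-a": {"roles": ["admin"]},
        "tenant-b": {"roles": ["viewer"]},
    },
}

print(is_allowed(user, "tenant-a", "admin"))
print(is_allowed(user, "tenant-b", "admin"))
```

The same role name can therefore mean different things per tenant, and a tenant the user does not belong to simply produces no roles.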

Using the Aserto Resource API

Some types of data don’t fit neatly into the User model: for example, modeling resources as a top-level construct, and providing access control lists (ACLs) that “hang off” each of these resources.

Aserto provides a set of APIs to manage resources (SetResource, GetResource, DeleteResource) which can be helpful to model these scenarios. These APIs are scoped to an Aserto tenant, and provide a key/value abstraction, where the key is a string and the value is a JSON object. A caller can set a new value for a key using SetResource and a policy can load the value for that key using the res.get(<key>) built-in.

This mechanism is modeled after the User data, so the tradeoffs are similar:

Latency: the data is loaded from the cached Aserto directory in constant time.
Memory: the data sits in an embedded database on the same container image that runs the OPA engine, so data is loaded on-demand. As in the case of dir.user(), res.get() is implemented as a natively compiled OPA built-in.
Liveness: the data is cached in an embedded database on the authorizer container image. Refreshing the data requires calling the SetResource API. The new data can be made available in milliseconds to a small number of seconds.

Use this mechanism if you have data that doesn’t neatly layer under the User abstraction, but can be easily modeled as a key/value pair. Any data that is tenant-scoped (as opposed to user-scoped) is a great candidate for this mechanism.
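
As an illustration of the key/value abstraction, the sketch below models a tenant-scoped store in memory and uses it to hold an ACL for a resource. The ResourceStore class is a stand-in written for this example, not the Aserto client; its method names echo SetResource, GetResource, and DeleteResource, and the acl layout and can_access helper are assumptions.

```python
import json


class ResourceStore:
    """In-memory stand-in for a string-key to JSON-object store."""

    def __init__(self):
        self._values = {}

    def set_resource(self, key, value):
        if not isinstance(key, str):
            raise TypeError("key must be a string")
        if not isinstance(value, dict):
            raise TypeError("value must be a JSON object (dict)")
        # Store serialized JSON so callers cannot mutate stored state.
        self._values[key] = json.dumps(value)

    def get_resource(self, key):
        raw = self._values.get(key)
        return json.loads(raw) if raw is not None else None

    def delete_resource(self, key):
        self._values.pop(key, None)


def can_access(store, resource_key, user_id, action):
    """Policy-side check: look up the resource and consult its ACL."""
    resource = store.get_resource(resource_key)
    if resource is None:
        return False  # unknown resource: deny
    return action in resource.get("acl", {}).get(user_id, [])


store = ResourceStore()
store.set_resource("doc-1", {"acl": {"user-1": ["read", "write"], "user-2": ["read"]}})

print(can_access(store, "doc-1", "user-2", "read"))
print(can_access(store, "doc-1", "user-2", "write"))
```

Here the ACL hangs off the resource rather than the user, which is the modeling difference from the previous mechanism.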

Loading external data

Finally, a policy can load data from an external data source by making a REST API call using the OPA built-in http.send(). This allows data that affects the evaluation of a policy to be stored in a separate system or database, and can be used as a general-purpose extension mechanism.

Tradeoffs:

Latency: the data is loaded over a network, so the latency (and availability) of that data is a function of an external system, and can be arbitrary. For example, if the policy makes a (blocking) REST call over the internet to an external web service, the latency can be significant since the authorization request can’t complete until the response is received. Furthermore, if the web service endpoint is down, the policy will fail.
Memory: the data is loaded at policy evaluation time, so there is no memory impact due to caching.
Liveness: the data can be retrieved from the “source of truth” with no caching involved.

Use this mechanism only if the liveness of the data is critical, and only when you can control the availability and latency characteristics of the external data store.

For example, you may have an external database that keeps track of resources and permissions on those resources, and want to easily adapt this system to use OPA (and Aserto) as the authorization API. In this case, your external data source is already likely deployed in the same subnet as your application; or if it’s a cloud data service, you may already be satisfied with the latency and availability characteristics relative to your application. In this scenario, the easiest path may be to use this approach.
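
The latency and availability tradeoff can be made concrete with a short Python sketch of a blocking lookup against an external service. This is not how http.send() is implemented; it only illustrates the same pattern. The URL, the response shape (a JSON object mapping user IDs to lists of actions), and the choice to deny when the call fails are all assumptions of this example.

```python
import json
import urllib.request


def fetch_json(url, timeout_seconds=0.5):
    """Blocking HTTP GET that parses the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
        return json.load(response)


def is_allowed(user_id, resource_id, action, fetch=fetch_json,
               base_url="https://permissions.example.internal"):
    """Decide using live data; deny if the external call fails (assumed policy)."""
    url = f"{base_url}/resources/{resource_id}/permissions"
    try:
        permissions = fetch(url)
    except (OSError, ValueError):
        # OSError covers network errors and timeouts; ValueError covers bad JSON.
        return False
    if not isinstance(permissions, dict):
        return False
    return action in permissions.get(user_id, [])


# Fake fetchers so the example runs without a network connection.
def fake_fetch_ok(url):
    return {"user-1": ["read"]}


def fake_fetch_down(url):
    raise OSError("service unavailable")


print(is_allowed("user-1", "doc-1", "read", fetch=fake_fetch_ok))
print(is_allowed("user-1", "doc-1", "read", fetch=fake_fetch_down))
```

The explicit timeout bounds how long an authorization request can stall, and the failure branch shows why the external service's availability becomes part of your own.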

Summary

We explored four different mechanisms that can be used to bring data to an OPA policy, each with its own set of latency, memory, and liveness tradeoffs. Use the one that best matches your scenario, and let us know on our community Slack if there are other scenarios that don’t fit these use cases!

Omri Gazitt

CEO, Aserto

Gert Drapers
