How do I make user attributes stored in LDAP/AD available to OPA for making decisions?

This best-practice guide explains three options: JSON Web Tokens, synchronization with LDAP/AD, and calling into LDAP/AD during policy evaluation.

How does OPA do conflict resolution?

In Rego (OPA’s policy language), you can write statements that both allow and deny the same request, such as:

    package foo

    allow { input.name == "alice" }
    deny { input.name == "alice" }

Neither allow nor deny is a keyword in Rego, so if you want to treat them as contradictory, you must state explicitly which one takes precedence. When you ask for a policy decision from OPA, you specify both the policy name (foo) and the virtual document that names the decision within foo. Typically in this scenario, you create a virtual document called authz and define it so that allow overrides deny or vice versa. Then, when asking for a policy decision, you ask for foo/authz.

    # deny everything by default
    default authz = false

    # deny overrides allow
    authz {
        allow
        not deny
    }
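
The opposite precedence, where allow overrides deny, can be sketched the same way. The package name and the two sample statements are illustrative assumptions, and the sketch uses the same rule syntax as the rest of this page.

```rego
package foo_allow_wins

# deny everything by default
default authz = false

# allow overrides deny: one matching allow statement is enough,
# even if a deny statement matches the same request
authz {
    allow
}

allow { input.name == "alice" }

# deny is still defined, but authz does not consult it
deny { input.name == "alice" }
```

With this definition a request from alice is authorized even though a deny statement also matches it; any other request falls back to the default.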

If instead you want to resolve conflicts using a first-match strategy (where the first statement applicable makes the decision), see the FAQ entry on statement order.

Does Statement Order Matter?

The order in which statements occur does not matter in Rego. Reorder any two statements and the policy means exactly the same thing. For example, the following two statements mean the same thing whichever order you write them in.

    package unordered

    ratelimit = 4 { input.name == "alice" }
    ratelimit = 5 { input.name == "bob" }

Input:

    {
      "name": "bob"
    }

Query:

    ratelimit

Output:

    5

Sometimes, though, you want the statement order to matter. For example, you might put more specific statements before more general ones so that the more specific statements take precedence (e.g. for conflict resolution). Rego lets you do that using the else keyword. For example, to make the first statement take precedence whenever more than one applies, you would write the following Rego.

    package ordered

    ratelimit = 4 {
        input.owner == "bob"
    } else = 5 {
        input.name == "alice"
    }

Input:

    {
      "name": "alice",
      "owner": "bob"
    }

Query:

    ratelimit

Output:

    4
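
An else chain can have more than two branches and can be combined with a default. The following is a minimal sketch under assumed names: the package name, the third branch, and the default value are not part of the example above.

```rego
package ordered_chain

# value used when none of the branches below applies
default ratelimit = 1

# branches are tried top to bottom; the first body that succeeds wins
ratelimit = 4 {
    input.owner == "bob"
} else = 5 {
    input.name == "alice"
} else = 10 {
    input.name == "carol"
}
```

With the input {"name": "alice", "owner": "bob"} the first two bodies both hold and ratelimit is 4; with {"name": "alice"} alone it is 5; with an input that matches no branch it is the default, 1.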

Which Equality Operator Should I Use?

Rego supports three kinds of equality: assignment (:=), comparison (==), and unification (=). Both assignment (:=) and comparison (==) are only available inside rules (and in the REPL). We recommend using them whenever possible because they make policies easier to read and write.

    # Assignment: declare local variable x and give it value 7
    # If x appears before this statement in the rule, compiler throws error.
    x := 7
    y := {"a", "b", "c"}

    # Comparison: check if two values are the same.
    # Do not assign variables--variables must be "safe".
    x == 7
    x == y
    y == [1, 2, [3]]

    # Unification: assign variables to values that make the
    # equality true
    # Note: = is the only option outside of rule bodies
    x = 7            # causes x to be assigned 7
    [x, 2] = [3, y]  # x is assigned 3 and y is assigned 2
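
To see the three operators inside complete rules, here is a small sketch. The package and rule names are made up for illustration.

```rego
package equality_demo

# := declares x as a new local variable and gives it a value
assigned = x {
    x := 7
}

# == only compares; both sides must already have values
same {
    x := 7
    x == 7
}

# = unifies: OPA finds the values of x and y that make both sides equal
pair = [x, y] {
    [x, 2] = [3, y]
}
```

Here assigned evaluates to 7, same is true, and pair evaluates to [3, 2].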

Collaboration Using Import

OPA lets multiple teams contribute independent policies that you can then combine to make an overall decision. Each team writes its policy in a separate package, then you write one more policy that imports all the teams' policies and makes a decision.

For example, suppose there is a network team, a storage team, and a compute team. Suppose they each write their own policy:

    package compute

    allow { ... }

    package network

    allow { ... }

    package storage

    allow { ... }

Now the cloud team, who is in charge of the overall decision, writes another policy that combines the decisions for each of the team policies. In the example below, all 3 teams must allow for the overall decision to be allowed.

    package main

    import data.compute
    import data.storage
    import data.network

    # allow if all 3 teams allow
    allow {
        compute.allow
        storage.allow
        network.allow
    }

The cloud team could have a more sophisticated scheme for combining policies, e.g. using just the compute policy for compute-only resources or requiring the compute policy to allow the compute-relevant portions of a resource. Remember that allow is not special; it is just another boolean that the policy author can use to make decisions.
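
One such scheme, using only the compute policy for compute-only resources, might look like the sketch below. The package name and the input.resource.kind field are assumptions made for illustration; the real shape of the input depends on your integration.

```rego
package main_by_kind

import data.compute
import data.storage
import data.network

default allow = false

# compute-only resources need only the compute team's approval
# (input.resource.kind is a hypothetical field)
allow {
    input.resource.kind == "compute"
    compute.allow
}

# every other kind of resource still needs all 3 teams to allow
allow {
    input.resource.kind != "compute"
    compute.allow
    storage.allow
    network.allow
}
```

The overall decision is still a single boolean; only the way the team decisions are combined has changed.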

Functions Versus Rules

Rego lets you factor out common logic in 2 different and complementary ways.

One is the function, which is conceptually identical to functions from most programming languages. It takes any input and returns any output. Importantly, a function can take infinitely many inputs, e.g. any string.

    package functions

    trim_and_split(s) = result {
        t := trim(s, " ")
        result := split(t, ".")
    }

Query:

    trim_and_split(" hello.world ")

Output:

    [
      "hello",
      "world"
    ]

The other way to factor out common logic is with a rule. Rules differ in that (i) they support automatic iteration and (ii) they are only defined for finitely many inputs. (Those obviously go hand-in-hand.) For example, you could define a rule that maps an application to the hostnames that app is running on:

    package rules

    app_to_hostnames[app_name] = hostnames {
        app := apps[_]
        app_name := app.name
        hostnames := [hostname | name := app.servers[_]
                                 s := sites[_].servers[_]
                                 s.name == name
                                 hostname := s.hostname]
    }

    apps = [
        {
            "name": "web",
            "servers": ["s1", "s2"],
        },
        {
            "name": "mysql",
            "servers": ["s3"],
        },
        {
            "name": "mongodb",
            "servers": ["s4"],
        },
    ]

    sites = [
        {
            "servers": [
                {
                    "name": "s1",
                    "hostname": "hydrogen",
                },
                {
                    "name": "s3",
                    "hostname": "helium",
                },
                {
                    "name": "s4",
                    "hostname": "nitrogen",
                },
            ],
        },
        {
            "servers": [
                {
                    "name": "s2",
                    "hostname": "carbon",
                },
            ],
        },
    ]

We can then iterate over all the key/value pairs of that app-to-hostname mapping (just as we could iterate over all key/value pairs of a hardcoded JSON object). You can also iterate over just the keys or just the values, look up the value for a key, or look up all the keys for a single value.

    # iterate over all key/value pairs
    app_to_hostnames[app]

    +-----------+-----------------------+
    | app       | app_to_hostnames[app] |
    +-----------+-----------------------+
    | "web"     | ["hydrogen","carbon"] |
    | "mysql"   | ["helium"]            |
    | "mongodb" | ["nitrogen"]          |
    +-----------+-----------------------+

    # iterate over all values
    app_to_hostnames[_]

    +-----------------------+
    | app_to_hostnames[_]   |
    +-----------------------+
    | ["hydrogen","carbon"] |
    | ["helium"]            |
    | ["nitrogen"]          |
    +-----------------------+

    # iterate over all keys
    app_to_hostnames[x] = _

    +-----------+
    | x         |
    +-----------+
    | "web"     |
    | "mysql"   |
    | "mongodb" |
    +-----------+

    # lookup the value for key "web"
    app_to_hostnames["web"]

    [
      "hydrogen",
      "carbon"
    ]

    # lookup keys where value includes "carbon"
    app_to_hostnames[k][_] == "carbon"

    +-------+
    | k     |
    +-------+
    | "web" |
    +-------+
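
Because a rule can be iterated like a JSON object, other rules can be built on top of it. The sketch below derives a reverse mapping from hostname to apps. To stay self-contained it hardcodes the app_to_hostnames result shown above as a plain object; the package name and the apps_on_host rule are hypothetical.

```rego
package rules_reverse

# hardcoded copy of the app_to_hostnames result shown above
app_to_hostnames = {
    "web": ["hydrogen", "carbon"],
    "mysql": ["helium"],
    "mongodb": ["nitrogen"],
}

# hypothetical derived rule: hostname -> set of apps running on it
apps_on_host[hostname] = names {
    # iterate over every hostname of every app
    hostname := app_to_hostnames[_][_]

    # collect all apps whose hostname list includes that hostname
    names := {app | app_to_hostnames[app][_] == hostname}
}
```

Querying apps_on_host["carbon"] yields the set containing "web".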

Obviously with the trim_and_split function we cannot ask for all the inputs/outputs since there are infinitely many. We can’t provide 1 input and ask for all the other inputs that make the function return true, again, because there could be infinitely many. The only thing we can do with a function is provide it all the inputs and ask for the output.

Functions allow you to factor out common logic that has infinitely-many input/output pairs; rules allow you to factor out common logic with finitely many input/outputs and allow you to iterate over them in the same way as native JSON objects.
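
The two mechanisms can also be combined. When the inputs you care about are finite, a rule can call the function once per input, and the resulting mapping can then be iterated like any other rule. This is a minimal sketch: raw_names and parts are hypothetical names, and trim_and_split is repeated so the block stands alone.

```rego
package functions_demo

trim_and_split(s) = result {
    t := trim(s, " ")
    result := split(t, ".")
}

# hypothetical finite list of inputs
raw_names = [" hello.world ", "a.b.c"]

# a rule that calls the function once per known input; unlike the
# function itself, this mapping can be iterated over
parts[name] = result {
    name := raw_names[_]
    result := trim_and_split(name)
}
```

Here parts[" hello.world "] evaluates to ["hello", "world"], and parts[name] iterates over both entries.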

To achieve automatic iteration, there is an additional syntactic requirement on a rule that is NOT present for a function: safety. See the FAQ entry on safety for technical details. Every rule must be safe, which guarantees that OPA can figure out a finite list of possible values for every variable in the body and head of a rule.

We recommend using rules where possible and using functions when rules do not work.

Safety

The compiler will sometimes throw errors that say a rule is not safe. The goal of safety is to ensure that every rule has finitely many inputs/outputs. Safety ensures that every variable has finitely many possible values, so that OPA can iterate over them to find those values that make the rule true. Technically:

    Safety: every variable appearing in the head or in a builtin or inside a negation must appear in a non-negated, non-builtin expression in the body of the rule.

Examples:

    # Unsafe: x in head does not appear in body.
    # There are infinitely many values that make p true
    p[x] { some y; q[y]; r[y] }

    # Safe. q and r are both rules
    # Both q and r are finite; therefore p is also finite.
    p[x] = y { some x, y; q[x]; r[y] }

    # Unsafe: y appears inside a builtin (+) but not in the body.
    # y has infinitely many possible values; so too does x.
    p[x] { some y; x := y + 7 }

    # Safe: the only values for y are those in q.
    # Since q is a rule and finite so is p finite.
    p[x] { some y; x := y + 7; q[y] }

    # Unsafe: x appears inside a negation
    # If q is finite, all the x's not in q are infinite.
    p[x] { some x; not q[x] }

    # Safe: x appears inside of r so p is no larger than r
    # Since r is finite, so too is p
    p[x] { some x; not q[x]; r[x] }

Safety has one implication for negation: you don’t iterate over values NOT in a rule like q. Instead, you iterate over values in another rule like r and then use negation to CHECK whether that value is NOT in q.
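
A concrete version of that pattern, with q and r hardcoded as small sets purely for illustration:

```rego
package safety_demo

# hypothetical finite sets
q = {"a", "b"}
r = {"a", "b", "c", "d"}

# Safe: r[x] binds x to each element of r, and `not q[x]` only
# checks each of those values against q
p[x] {
    r[x]
    not q[x]
}
```

Here p evaluates to the set {"c", "d"}: the elements of r that are not in q.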

Embedded terms like not p[q[_]] sometimes produce error messages that are difficult to decipher. We recommend pulling the embedded terms out into the rule body. The meaning is the same, and the resulting error messages are often easier to read:

    x := q[_]
    not p[x]

JSON Web Tokens (JWTs)

JSON Web Tokens (JWTs) are an industry standard for exchanging information between services. Often they are used to represent information about the users logged into a system. OPA has special-purpose code for dealing with JWTs.

All JWTs with OPA come in as strings. That string is a JSON Web Token encoded with JWS Compact Serialization. JWE and JWS JSON Serialization are not supported.

You can verify that tokens are properly signed.

    # RS256 signature
    io.jwt.verify_rs256(string, certificate)
    # PS256 signature
    io.jwt.verify_ps256(string, certificate)
    # ES256 signature
    io.jwt.verify_es256(string, certificate)
    # HS256 signature (HMAC, so the second argument is a shared secret)
    io.jwt.verify_hs256(string, secret)

You can decode JWTs and use the contents of the JWT to make policy decisions.

    package jwt_decode

Input:

    {
      "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VyIjoiYWxpY2UiLCJhenAiOiJhbGljZSIsInN1Ym9yZGluYXRlcyI6W10sImhyIjpmYWxzZX0.rz3jTY033z-NrKfwrK89_dcLF7TN4gwCMj-fVBDyLoM"
    }

Query:

    io.jwt.decode(input.token)

Output:

    [
      {
        "alg": "HS256",
        "typ": "JWT"
      },
      {
        "azp": "alice",
        "hr": false,
        "subordinates": [],
        "user": "alice"
      },
      "af3de34d8d37df3f8daca7f0acaf3dfdd70b17b4cde20c02323f9f5410f22e83"
    ]

If nested signing was used, the header, payload and signature will represent the most deeply nested token.
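
Verification and decoding are typically used together in one rule: check the signature first, then read the claims. The following is a minimal sketch, assuming an HS256-signed token in input.token. The package name, the secret value, and the check on the user claim are placeholders for illustration.

```rego
package jwt_authz

default allow = false

# hypothetical shared secret used to sign the tokens
secret = "replace-with-real-secret"

allow {
    # fail unless the HS256 signature matches the shared secret
    io.jwt.verify_hs256(input.token, secret)

    # io.jwt.decode returns [header, payload, signature]
    [_, payload, _] := io.jwt.decode(input.token)
    payload.user == "alice"
}
```

If the signature does not match, the first expression is false and allow keeps its default value, regardless of what the payload claims.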

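You can decode and verify in one step using io.jwt.decode_verify. It returns an array of the form [valid, header, payload]; when verification fails, valid is false and header and payload are empty objects.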

    io.jwt.decode_verify(input.token, {
        "secret": "secret",
        "alg": "hs256",
    })

Output:

    [
      false,
      {},
      {}
    ]

See the Policy Reference for additional verification constraints.

To get certificates into the policy, you can either hardcode them or provide them as environment variables to OPA and then use the opa.runtime builtin to retrieve those variables.

    # all runtime information
    runtime := opa.runtime()

    # environment variables provided when OPA started
    runtime.env

    # the env variable PROD_CERTIFICATE
    runtime.env.PROD_CERTIFICATE
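
Put together, a rule that reads the certificate from the environment and uses it to check an RS256 signature might look like this sketch. The package and rule names are assumptions, and the token is assumed to arrive in input.token.

```rego
package jwt_env

default signature_ok = false

signature_ok {
    # all runtime information, as above
    runtime := opa.runtime()

    # if PROD_CERTIFICATE is not set, this lookup is undefined and
    # signature_ok keeps its default value of false
    cert := runtime.env.PROD_CERTIFICATE

    io.jwt.verify_rs256(input.token, cert)
}
```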

How do I Write Policies Securely?

Depending on the use case and the integration with OPA that you are using, the style of policy you choose can impact your overall security posture. Below we show three styles of authoring policy and compare them.

Default allow. This style of policy allows every request by default. The rules you write dictate which requests should be rejected.

    # entry point is 'deny'
    default deny = false

    deny { ... }
    deny { ... }

If you assume all of the rules you write are correct, then you know that every rejection the policy produces should truly be rejected. However, some requests may be allowed that you do not actually want allowed, simply because you neglected to write a rule for them. For operations, this is often a useful style of policy authoring because it allows you to incrementally tighten the controls for a system from wherever that system starts. For security, this style is less appropriate because it allows unknown bad actions to occur.

Default deny. This style of policy rejects every request by default. The rules you write dictate which requests should be allowed.

    # entry point is 'allow'
    default allow = false

    allow { ... }
    allow { ... }

If you assume your rules are correct, the only requests that are accepted are known to be safe. Any rule you leave out causes requests to be rejected that are in fact safe but that you did not know were safe. For operations, these policies are less suitable for incrementally improving the policy posture of a system because the initial policy must explicitly allow all of the behaviors that are necessary for the system to operate correctly. For security, these policies ensure that any request that is allowed is known to be safe (because there is a rule saying it is safe).

Default deny with deny override. This style of policy rejects every request by default. You write rules that dictate which requests should be allowed, and optionally you write other rules that dictate which of those allowed requests should be rejected.

    # entry point is 'authz'
    default authz = false

    authz {
        allow
        not deny
    }

    allow { ... }
    deny { ... }

This hybrid approach to policy authoring combines the two previous styles. These policies allow relatively coarse-grained parts of the request space and then carve out of each part what should actually be denied. Any deny statements that you forget lead to security problems; any allow statements you forget lead to operational problems. But since this approach allows you to implement either of the other two, it is a common pattern across use cases.

Non-boolean policies. The examples above focus on policies with boolean decisions. Policies that make non-boolean decisions typically have similar tradeoffs. Are you enumerating the conditions under which requests are permitted (e.g. the list of clusters to which an app SHOULD be deployed), or are you enumerating the conditions under which requests are prohibited (e.g. the list of clusters to which an app SHOULD NOT be deployed)? While the details differ, the concepts are often similar.
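
The two non-boolean styles can be sketched side by side as set-valued rules. Everything here is hypothetical: the cluster names, the input.app.env field, and the "prod-" naming convention are assumptions chosen only to make the contrast concrete.

```rego
package placement_demo

# hypothetical cluster inventory
clusters = {"prod-east", "prod-west", "staging"}

# Style 1: enumerate the clusters the app SHOULD be deployed to
should_deploy[c] {
    clusters[c]
    input.app.env == "prod"
    startswith(c, "prod-")
}

# Style 2: enumerate the clusters the app SHOULD NOT be deployed to
should_not_deploy[c] {
    clusters[c]
    input.app.env == "prod"
    not startswith(c, "prod-")
}
```

For an input of {"app": {"env": "prod"}}, should_deploy is {"prod-east", "prod-west"} and should_not_deploy is {"staging"}. A cluster missing from the first set is implicitly prohibited, while a cluster missing from the second is implicitly permitted, which mirrors the default-deny and default-allow tradeoff described above.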