- High Performance Policy Decisions
- Benchmarking Queries
- Benchmarking OPA Tests
- Resource Utilization
- Optimization Levels
- Key Takeaways
Policy Performance
High Performance Policy Decisions
For low-latency/high-performance use-cases, e.g. microservice API authorization, policy evaluation has a budget on the order of 1 millisecond. Not all use cases require that kind of performance, and OPA is powerful enough that you can write expressive policies that take longer than 1 millisecond to evaluate. But for high-performance use cases, there is a fragment of the policy language that has been engineered to evaluate quickly. Even as the size of the policies grow, the performance for this fragment can be nearly constant-time.
Linear fragment
The linear fragment of the language is all of those policies where evaluation amounts to walking over the policy once. This means there is no search required to make a policy decision. Any variables you use can be assigned at most one value.
For example, the following rule has one local variable user
, and that variable can only be assigned one value. Intuitively, evaluating this rule requires checking each of the conditions in the body, and if there were N of these rules, evaluation would only require walking over each of them as well.
package linear
allow {
some user
input.method = "GET"
input.path = ["accounts", user]
input.user = user
}
Use objects over arrays
One common mistake people make is using arrays when they could use objects. For example, below is an array of ID/first-name/last-names where ID is unique, and you’re looking up the first-name/last-name given the ID.
# DO NOT DO THIS.
# Array of objects where each object has a unique identifier
d = [{"id": "a123", "first": "alice", "last": "smith"},
{"id": "a456", "first": "bob", "last": "jones"},
{"id": "a789", "first": "clarice", "last": "johnson"}
]
# search through all elements of the array to find the ID
d[i].id == "a789"
d[i].first ...
Instead, use a dictionary where the key is the ID and the value is the first-name/last-name. Given the ID, you can lookup the name information directly.
# DO THIS INSTEAD OF THE ABOVE
# Use object whose keys are the IDs for the objects.
# Looking up an object given its ID requires NO search
d = {"a123": {"first": "alice", "last": "smith"},
"a456": {"first": "bob", "last": "jones"},
"a789": {"first": "clarice", "last": "johnson"}
}
# no search required
d["a789"].first ...
Use indexed statements
The linear-time fragment ensures that the cost of evaluation is no larger than the size of the policy. OPA lets you write non-linear policies, because sometimes you need to, and because sometimes it’s convenient. The blog on partial evaluation describes one mechanism for converting non-linear policies into linear policies.
But as the size of the policy grows, the cost of evaluation grows with it. Sometimes the policy can grow large enough that even the linear-fragment fails to meet the performance budget.
In the linear fragment, OPA includes special algorithms that index rules efficiently, sometimes making evaluation constant-time, even as the policy grows. The more effective the indexing is the fewer rules need to be evaluated.
Here is an example policy from the rule-indexing blog giving the details for these algorithms. See the rest of this section for details on indexed statements.
package indexed
default allow = false
allow {
some user
input.method = "GET"
input.path = ["accounts", user]
input.user = user
}
allow {
input.method = "GET"
input.path = ["accounts", "report"]
roles[input.user][_] = "admin"
}
allow {
input.method = "POST"
input.path = ["accounts"]
roles[input.user][_] = "admin"
}
roles = {
"bob": ["admin", "hr"],
"alice": ["procurement"],
}
allow
{
"user": "bob",
"path": ["accounts", "bob"],
"method": "GET"
}
true
Equality statements
For simple equality statements (=
and ==
) to be indexed one side must be a non-nested reference that does not contain any variables and the other side must be a variable, scalar, or array (which may contain scalars and variables). For example:
Expression | Indexed | Reason |
---|---|---|
input.x = “foo” | yes | n/a |
input.x.y = “bar” | yes | n/a |
input.x = [“foo”, i] | yes | n/a |
input.x[i] = “foo” | no | reference contains variables |
input.x[input.y] = “foo” | no | reference is nested |
Glob statements
For glob.match(pattern, delimiter, match)
statements to be indexed the pattern must be recognized by the indexer and the match be a non-nested reference that does not contain any variables. The indexer recognizes patterns containing the normal glob (*
) operator but not the super glob (**
) or character pattern matching operators.
Expression | Indexed | Reason |
---|---|---|
glob.match(“foo::bar”, [“:”], input.x) | yes | n/a |
glob.match(“foo::bar”, [“:”], input.x) | no | pattern contains
|
glob.match(“foo::bar”, [“:”], input.x[i]) | no | match contains variable(s) |
Comprehension Indexing
Rego does not support mutation. As a result, certain operations like “group by” require use of comprehensions to aggregate values. To avoid O(n^2) runtime complexity in queries/rules that perform group-by, OPA may compute and memoize the entire collection produced by comprehensions at once. This ensures that runtime complexity is O(n) where n is the size of the collection that group-by/aggregation is being performed on.
For example, suppose the policy must check if the number of ports exposed on an interface exceeds some threshold (e.g., any interface may expose up to 100 ports.) The policy is given the port->interface mapping as a JSON array under input
:
{
"exposed": [
{
"interface": "eth0",
"port": 8080,
},
{
"interface": "eth0",
"port": 8081,
},
{
"interface": "eth1",
"port": 443,
},
{
"interface": "lo1",
"port": 5000,
}
]
}
In this case, the policy must count the number of ports exposed on each interface. To do this, the policy must first aggregate/group the ports by the interface name. Conceptually, the policy should generate a document like this:
{
"exposed_ports_by_interface": {
"eth0": [8080, 8081],
"eth1": [443],
"lo1": [5000]
}
}
Since multiple ports could be exposed on a single interface, the policy must use a comprehension to aggregate the port values by the interface names. To implement this logic in Rego, we would write:
some i
intf := input.exposed[i].interface
ports := [port | some j; input.exposed[j].interface == intf; port := input.exposed[j].port]
Without comprehension indexing, this query would be O(n^2) where n is the size of input.exposed
. However, with comprehension indexing, the query remains O(n) because OPA only computes the comprehension once. In this case, the comprehension is evaluated and all possible values of ports
are computed at once. These values are indexed by the assignments of intf
.
To implement the policy above we could write:
deny[msg] {
some i
count(exposed_ports_by_interface[i]) > 100
msg := sprintf("interface '%v' exposes too many ports", [i])
}
exposed_ports_by_interface := {intf: ports |
some i
intf := input.exposed[i].interface
ports := [port |
some j
input.exposed[j].interface == intf
port := input.exposed[j].port
]
}
Indices can be built for comprehensions (nested or not) that generate collections (i.e., arrays, sets, or objects) based on variables in an outer query. In the example above:
intf
is the variable in the outer query.[port | some j; input.exposed[j].interface == intf; port := input.exposed[j].port]
is the comprehension.ports
is the variable the collection is assigned to.
In order to be indexed, comprehensions must meet the following conditions:
- The comprehension appears in an assignment or unification statement.
- The expression containing the comprehension does not include a
with
statement. - The expression containing the comprehension is not negated.
- The comprehension body is safe when considered independent from the outer query.
- The comprehension body closes over at least one variable in the outer query and none of these variables appear as outputs in references or
walk()
calls or inside nested comprehensions.
The following examples show cases that are NOT indexed:
not_indexed_because_missing_assignment {
x := input[_]
[y | some y; x == input[y]]
}
not_indexed_because_includes_with {
x := input[_]
ys := [y | some y; x := input[y]] with input as {}
}
not_indexed_because_negated {
x := input[_]
not data.arr = [y | some y; x := input[y]]
}
not_indexed_because_safety {
obj := input.foo.bar
x := obj[_]
ys := [y | some y; x == obj[y]]
}
not_indexed_because_no_closure {
ys := [y | x := input[y]]
}
not_indexed_because_reference_operand_closure {
x := input[y].x
ys := [y | x == input[y].z[_]]
}
not_indexed_because_nested_closure {
x = 1
y = 2
_ = [i |
x == input.foo[i]
_ = [j | y == input.bar[j]]]
}
The 4th and 5th restrictions may be relaxed in the future.
Profiling
You can also profile your policies using opa eval
. The profiler is useful if you need to understand why policy evaluation is slow.
The opa eval
command provides the following profiler options:
Option | Detail | Default |
---|---|---|
—profile | Enables expression profiling and outputs profiler results. | off |
—profile-sort | Criteria to sort the expression profiling results. This options implies —profile . | total_time_ns => num_eval => num_redo => file => line |
—profile-limit | Desired number of profiling results sorted on the given criteria. This options implies —profile . | 10 |
Sort criteria for the profile results
total_time_ns
- Results are displayed is decreasing order of expression evaluation timenum_eval
- Results are displayed is decreasing order of number of times an expression is evaluatednum_redo
- Results are displayed is decreasing order of number of times an expression is re-evaluated(redo)file
- Results are sorted in reverse alphabetical order based on the rego source filenameline
- Results are displayed is decreasing order of expression line number in the source file
When the sort criteria is not provided total_time_ns
has the highest priority while line
has the lowest.
Example Policy
The different profiling examples shown later on this page use the below sample policy.
package rbac
# Example input request
input = {
"subject": "bob",
"resource": "foo123",
"action": "write",
}
# Example RBAC configuration.
bindings = [
{
"user": "alice",
"roles": ["dev", "test"],
},
{
"user": "bob",
"roles": ["test"],
},
]
roles = [
{
"name": "dev",
"permissions": [
{"resource": "foo123", "action": "write"},
{"resource": "foo123", "action": "read"},
],
},
{
"name": "test",
"permissions": [{"resource": "foo123", "action": "read"}],
},
]
# Example RBAC policy implementation.
default allow = false
allow {
some role_name
user_has_role[role_name]
role_has_permission[role_name]
}
user_has_role[role_name] {
binding := bindings[_]
binding.user == input.subject
role_name := binding.roles[_]
}
role_has_permission[role_name] {
role := roles[_]
role_name := role.name
perm := role.permissions[_]
perm.resource == input.resource
perm.action == input.action
}
Example: Display ALL
profile results with default
ordering criteria
opa eval --data rbac.rego --profile --format=pretty 'data.rbac.allow'
Sample Output
false
+----------+----------+----------+-----------------+
| TIME | NUM EVAL | NUM REDO | LOCATION |
+----------+----------+----------+-----------------+
| 47.148µs | 1 | 1 | data.rbac.allow |
| 28.965µs | 1 | 1 | rbac.rego:11 |
| 24.384µs | 1 | 1 | rbac.rego:41 |
| 23.064µs | 2 | 1 | rbac.rego:47 |
| 15.525µs | 1 | 1 | rbac.rego:38 |
| 14.137µs | 1 | 2 | rbac.rego:46 |
| 13.927µs | 1 | 0 | rbac.rego:42 |
| 13.568µs | 1 | 1 | rbac.rego:55 |
| 12.982µs | 1 | 0 | rbac.rego:56 |
| 12.763µs | 1 | 2 | rbac.rego:52 |
+----------+----------+----------+-----------------+
+------------------------------+----------+
| METRIC | VALUE |
+------------------------------+----------+
| timer_rego_module_compile_ns | 1871613 |
| timer_rego_query_compile_ns | 82290 |
| timer_rego_query_eval_ns | 257952 |
| timer_rego_query_parse_ns | 12337169 |
+------------------------------+----------+
As seen from the above table, all results are displayed. The profile results are sorted on the default sort criteria.
Example: Display top 5
profile results
opa eval --data rbac.rego --profile-limit 5 --format=pretty 'data.rbac.allow'
Sample Output
+----------+----------+----------+-----------------+
| TIME | NUM EVAL | NUM REDO | LOCATION |
+----------+----------+----------+-----------------+
| 46.329µs | 1 | 1 | data.rbac.allow |
| 26.656µs | 1 | 1 | rbac.rego:11 |
| 24.206µs | 2 | 1 | rbac.rego:47 |
| 23.235µs | 1 | 1 | rbac.rego:41 |
| 18.242µs | 1 | 1 | rbac.rego:38 |
+----------+----------+----------+-----------------+
The profile results are sorted on the default sort criteria. Also --profile
option is implied and does not need to be provided.
Example: Display top 5
profile results based on the number of times an expression is evaluated
opa eval --data rbac.rego --profile-limit 5 --profile-sort num_eval --format=pretty 'data.rbac.allow'
Sample Profile Output
+----------+----------+----------+-----------------+
| TIME | NUM EVAL | NUM REDO | LOCATION |
+----------+----------+----------+-----------------+
| 26.675µs | 2 | 1 | rbac.rego:47 |
| 9.274µs | 2 | 1 | rbac.rego:53 |
| 43.356µs | 1 | 1 | data.rbac.allow |
| 22.467µs | 1 | 1 | rbac.rego:41 |
| 22.425µs | 1 | 1 | rbac.rego:11 |
+----------+----------+----------+-----------------+
As seen from the above table, the results are arranged first in decreasing order of number of evaluations and if two expressions have been evaluated the same number of times, the default criteria is used since no other sort criteria is provided. In this case, total_time_ns => num_redo => file => line. Also --profile
option is implied and does not need to be provided.
Example: Display top 5
profile results based on the number of times an expression is evaluated
and number of times an expression is re-evaluated
opa eval --data rbac.rego --profile-limit 5 --profile-sort num_eval,num_redo --format=pretty 'data.rbac.allow'
Sample Profile Output
+----------+----------+----------+-----------------+
| TIME | NUM EVAL | NUM REDO | LOCATION |
+----------+----------+----------+-----------------+
| 22.892µs | 2 | 1 | rbac.rego:47 |
| 8.831µs | 2 | 1 | rbac.rego:53 |
| 13.767µs | 1 | 2 | rbac.rego:46 |
| 10.78µs | 1 | 2 | rbac.rego:52 |
| 42.338µs | 1 | 1 | data.rbac.allow |
+----------+----------+----------+-----------------+
As seen from the above table, result are first arranged based on number of evaluations, then number of re-evaluations and finally the default criteria is used. In this case, total_time_ns => file => line. The --profile-sort
options accepts repeated or comma-separated values for the criteria. The order of the criteria on the command line determine their priority.
Another way to get the same output as above would be the following:
opa eval --data rbac.rego --profile-limit 5 --profile-sort num_eval --profile-sort num_redo --format=pretty 'data.rbac.allow'
Benchmarking Queries
OPA provides CLI options to benchmark a single query via the opa bench
command. This will evaluate similarly to opa eval
but it will repeat the evaluation (in its most efficient form) a number of times and report metrics.
Example: Benchmark rbac allow
Using the same policy source as shown above:
$ opa bench --data rbac.rego 'data.rbac.allow'
Will result in an output similar to:
+-------------------------------------------+------------+
| samples | 27295 |
| ns/op | 45032 |
| B/op | 20977 |
| allocs/op | 382 |
| histogram_timer_rego_query_eval_ns_stddev | 25568 |
| histogram_timer_rego_query_eval_ns_99.9% | 335906 |
| histogram_timer_rego_query_eval_ns_99.99% | 336493 |
| histogram_timer_rego_query_eval_ns_mean | 40355 |
| histogram_timer_rego_query_eval_ns_median | 35846 |
| histogram_timer_rego_query_eval_ns_99% | 133936 |
| histogram_timer_rego_query_eval_ns_90% | 44780 |
| histogram_timer_rego_query_eval_ns_95% | 50815 |
| histogram_timer_rego_query_eval_ns_min | 31284 |
| histogram_timer_rego_query_eval_ns_max | 336493 |
| histogram_timer_rego_query_eval_ns_75% | 38254 |
| histogram_timer_rego_query_eval_ns_count | 27295 |
+-------------------------------------------+------------+
These results capture metrics of samples
runs, where only the query evaluation is measured. All time spent preparing to evaluate (loading, parsing, compiling, etc.) is omitted.
Note: all
*/op
results are an average over the number ofsamples
(orN
in the JSON format)
Options for opa bench
Option | Detail | Default |
---|---|---|
—benchmem | Report memory allocations with benchmark results. | true |
—metrics | Report additional query performance metrics. | true |
—count | Number of times to repeat the benchmark. | 1 |
Benchmarking OPA Tests
There is also a --bench
option for opa test
which will perform benchmarking on OPA unit tests. This will evaluate any loaded tests as benchmarks. There will be additional time for any test-specific actions are included so the timing will typically be longer than what is seen with opa bench
. The primary use-case is not for absolute time, but to track relative time as policies change.
Options for opa test --bench
Option | Detail | Default |
---|---|---|
—benchmem | Report memory allocations with benchmark results. | true |
—count | Number of times to repeat the benchmark. | 1 |
Example Tests
Adding a unit test file for the policy source as shown above:
package rbac
test_user_has_role_dev {
user_has_role["dev"] with input as {"subject": "alice"}
}
test_user_has_role_negative {
not user_has_role["super-admin"] with input as {"subject": "alice"}
}
Which when run normally will output something like:
$ opa test -v ./rbac.rego ./rbac_test.rego
data.rbac.test_user_has_role_dev: PASS (605.076µs)
data.rbac.test_user_has_role_negative: PASS (318.047µs)
--------------------------------------------------------------------------------
PASS: 2/2
Example: Benchmark rbac unit tests
opa test -v --bench ./rbac.rego ./rbac_test.rego
Results in output:
data.rbac.test_user_has_role_dev 44749 27677 ns/op 23146 timer_rego_query_eval_ns/op 12303 B/op 229 allocs/op
data.rbac.test_user_has_role_negative 44526 26348 ns/op 22033 timer_rego_query_eval_ns/op 12470 B/op 235 allocs/op
--------------------------------------------------------------------------------
PASS: 2/2
Example: Benchmark rbac unit tests and compare with benchstat
The benchmark output formats default to pretty
, but support a gobench
format which complies with the Golang Benchmark Data Format. This allows for usage of tools like benchstat to gain additional insight into the benchmark results and to diff between benchmark results.
Example:
opa test -v --bench --count 10 --format gobench ./rbac.rego ./rbac_test.rego | tee ./old.txt
Will result in an old.txt
and output similar to:
BenchmarkDataRbacTestUserHasRoleDev 45152 26323 ns/op 22026 timer_rego_query_eval_ns/op 12302 B/op 229 allocs/op
BenchmarkDataRbacTestUserHasRoleNegative 45483 26253 ns/op 21986 timer_rego_query_eval_ns/op 12470 B/op 235 allocs/op
--------------------------------------------------------------------------------
PASS: 2/2
.
.
Repeated 10 times (as specified by the --count
flag).
This format can then be loaded by benchstat
:
benchstat ./old.txt
Output:
name time/op
DataRbacTestUserHasRoleDev 29.8µs ±18%
DataRbacTestUserHasRoleNegative 32.0µs ±35%
name timer_rego_query_eval_ns/op
DataRbacTestUserHasRoleDev 25.0k ±18%
DataRbacTestUserHasRoleNegative 26.7k ±35%
name alloc/op
DataRbacTestUserHasRoleDev 12.3kB ± 0%
DataRbacTestUserHasRoleNegative 12.5kB ± 0%
name allocs/op
DataRbacTestUserHasRoleDev 229 ± 0%
DataRbacTestUserHasRoleNegative 235 ± 0%
If later on a change was introduced that altered the performance we can run again:
opa test -v --bench --count 10 --format gobench ./rbac.rego ./rbac_test.rego | tee ./new.txt
BenchmarkDataRbacTestUserHasRoleDev 27415 43671 ns/op 39301 timer_rego_query_eval_ns/op 17201 B/op 379 allocs/op
BenchmarkDataRbacTestUserHasRoleNegative 27583 44743 ns/op 40152 timer_rego_query_eval_ns/op 17369 B/op 385 allocs/op
--------------------------------------------------------------------------------
PASS: 2/2
.
.
(Repeated 10 times)
Then we can compare the results via:
benchstat ./old.txt ./new.txt
name old time/op new time/op delta
DataRbacTestUserHasRoleDev 29.8µs ±18% 47.4µs ±15% +59.06% (p=0.000 n=9+10)
DataRbacTestUserHasRoleNegative 32.0µs ±35% 47.1µs ±14% +47.48% (p=0.000 n=10+9)
name old timer_rego_query_eval_ns/op new timer_rego_query_eval_ns/op delta
DataRbacTestUserHasRoleDev 25.0k ±18% 42.6k ±15% +70.51% (p=0.000 n=9+10)
DataRbacTestUserHasRoleNegative 26.7k ±35% 42.3k ±14% +58.15% (p=0.000 n=10+9)
name old alloc/op new alloc/op delta
DataRbacTestUserHasRoleDev 12.3kB ± 0% 17.2kB ± 0% +39.81% (p=0.000 n=10+10)
DataRbacTestUserHasRoleNegative 12.5kB ± 0% 17.4kB ± 0% +39.28% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
DataRbacTestUserHasRoleDev 229 ± 0% 379 ± 0% +65.50% (p=0.000 n=10+10)
DataRbacTestUserHasRoleNegative 235 ± 0% 385 ± 0% +63.83% (p=0.000 n=10+10)
This gives clear feedback that the evaluations have slowed down considerably by looking at the delta
Note that for benchstat you will want to run with
--count
to repeat the benchmarks a number of times (5-10 is usually enough). The tool requires several data points else thep
value will not show meaningful changes and thedelta
will be~
.
Resource Utilization
Policy evaluation is typically CPU-bound unless the policies have to pull additional data on-the-fly using built-in functions like http.send()
(in which case evaluation likely becomes I/O-bound.) Policy evaluation is currently single-threaded. If you are embedding OPA as a library, it is your responsibility to dispatch concurrent queries to different Goroutines/threads. If you are running the OPA server, it will parallelize concurrent requests and use as many cores as possible. You can limit the number of cores that OPA can consume by starting OPA with the GOMAXPROCS
environment variable.
Memory usage scales with the size of the policy (i.e., Rego) and data (e.g., JSON) that you load into OPA. Raw JSON data loaded into OPA uses approximately 20x more memory compared to the same data stored in a compact, serialized format (e.g., on disk). This increased memory usage is due to the need to load the JSON data into Go data structures like maps, slices, and strings so that it can be evaluated. For example, if you load 8MB worth of JSON data representing 100,000 permission objects specifying subject/action/resource triplets, OPA would consume approximately 160MB of RAM.
Memory usage also scales linearly with the number of rules loaded into OPA. For example, loading 10,000 rules that implement an ACL-style authorization policy consumes approximately 130MB of RAM while 100,000 rules implementing the same policy (but with 10x more tuples to check) consumes approximately 1.1GB of RAM.
Optimization Levels
The --optimize
(or -O
) flag on the opa build
command controls how bundles are optimized.
Optimization applies partial evaluation to precompute known values in the policy. The goal of partial evaluation is to convert non-linear-time policies into linear-time policies.
By specifying the --optimize
flag, users can control how much time and resources are spent attempting to optimize the bundle. Generally, higher optimization levels require more time and resources. Currently, OPA supports three optimization levels. The exact optimizations applied in each level may change over time.
-O=0 (default)
By default optimizations are disabled.
-O=1 (recommended)
Policies are partially evaluated. Rules that DO NOT depend on unknowns (directly or indirectly) are evaluated and the virtual documents they produce are inlined into call sites. Virtual documents that are required at evaluation time are not inlined. For example, if a base or virtual document is targetted by a with
statement in the policy, the document will not be inlined.
Rules that depend on unknowns (directly or indirectly) are also partially evaluated however the virtual documents they produce ARE NOT inlined into call sites. The output policy should be structurally similar to the input policy.
The opa build
automatically marks the input
document as unknown. In addition to the input
document, if opa build
is invoked with the -b
/--bundle
flag, any data
references NOT prefixed by the .manifest
roots are also marked as unknown.
-O=2 (aggressive)
Same as -O=1
except virtual documents produced by rules that depend on unknowns may be inlined into call sites. In addition, more aggressive inlining is applied within rules. This includes copy propagation and inlining of certain negated statements that would otherwise generate support rules.
Key Takeaways
For high-performance use cases:
- Write your policies to minimize iteration and search.
- Use objects instead of arrays when you have a unique identifier for the elements of the array.
- Consider partial-evaluation to compile non-linear policies to linear policies.
- Write your policies with indexed statements so that rule-indexing is effective.
- Use the profiler to help identify portions of the policy that would benefit the most from improved performance.
- Use the benchmark tools to help get real world timing data and detect policy performance changes.