Runtime
The runtime configuration specifies a virtual file system tree that contains re-loadable configuration elements. This virtual file system can be realized via a series of local file system, static bootstrap configuration, RTDS and admin console derived overlays.
Virtual file system
Layering
The runtime can be viewed as a virtual file system consisting of multiple layers. The layered runtime bootstrap configuration specifies this layering. Runtime settings in later layers override earlier layers. A typical configuration might be:
layers:
- name: static_layer_0
  static_layer:
    health_check:
      min_interval: 5
- name: disk_layer_0
  disk_layer: { symlink_root: /srv/runtime/current, subdirectory: envoy }
- name: disk_layer_1
  disk_layer: { symlink_root: /srv/runtime/current, subdirectory: envoy_override, append_service_cluster: true }
- name: admin_layer_0
  admin_layer: {}
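In the deprecated runtime bootstrap configuration, the layering was implicit and fixed. From lowest to highest, the layers were the static bootstrap configuration, the local disk file system, the local disk file system override_subdirectory, and admin console overrides, with values in higher layers overriding corresponding values in lower layers.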
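The override rule can be illustrated with a minimal Python sketch. It assumes each layer has already been flattened into a dictionary of dotted keys. The function name `effective_runtime` and the sample values are hypothetical and are not part of Envoy.

```python
from typing import Any, Dict, List


def effective_runtime(layers: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge layers in order; a key in a later layer replaces the same key
    from an earlier layer."""
    merged: Dict[str, Any] = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Hypothetical layer contents, listed from lowest to highest precedence.
static_layer = {"health_check.min_interval": 5}
disk_layer = {"health_check.min_interval": 10, "upstream.healthy_panic_threshold": 50}
admin_layer = {"upstream.healthy_panic_threshold": 40}

snapshot = effective_runtime([static_layer, disk_layer, admin_layer])
```

Here the disk layer wins for `health_check.min_interval` and the admin layer wins for `upstream.healthy_panic_threshold`, because each appears later in the list than the layer it overrides.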
File system layout
Various sections of the configuration guide describe the runtime settings that are available. For example, here are the runtime settings for upstream clusters.
Each ‘.’ in a runtime key indicates a new directory in the hierarchy. The terminal portion of a path is the file. The contents of the file constitute the runtime value. When reading numeric values from a file, spaces and new lines will be ignored.
numerator and denominator are reserved keywords and may not appear in any directory.
Static bootstrap
A static base runtime may be specified in the bootstrap configuration via a protobuf JSON representation.
Local disk file system
When the runtime virtual file system is realized on a local disk, it is rooted at symlink_root + subdirectory. For example, the health_check.min_interval key would have the following full file system path (using the symbolic link):
/srv/runtime/current/envoy/health_check/min_interval
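The mapping from a runtime key to a file path can be sketched in a few lines of Python. The helper name `runtime_key_to_path` is hypothetical; it only mirrors the rule described above (root at symlink_root + subdirectory, one directory per dot-separated component, final component is the file).

```python
import posixpath


def runtime_key_to_path(symlink_root: str, subdirectory: str, key: str) -> str:
    """Map a dotted runtime key to its file path under symlink_root/subdirectory."""
    # Each '.' in the key starts a new directory; the last component is the file.
    return posixpath.join(symlink_root, subdirectory, *key.split("."))


path = runtime_key_to_path("/srv/runtime/current", "envoy", "health_check.min_interval")
print(path)
```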
Overrides
An arbitrary number of disk file system layers can be overlaid in the layered runtime bootstrap configuration.
In the deprecated runtime bootstrap configuration, there was a distinguished file system override. Assume that the folder /srv/runtime/v1 points to the actual file system path where global runtime configurations are stored. The following would be a typical configuration setting for runtime:
symlink_root: /srv/runtime/current
subdirectory: envoy
override_subdirectory: envoy_override
where /srv/runtime/current is a symbolic link to /srv/runtime/v1.
Cluster-specific subdirectories
In the deprecated runtime bootstrap configuration, the override_subdirectory is used along with the --service-cluster CLI option. Assume that --service-cluster has been set to my-cluster. Envoy will first look for the health_check.min_interval key in the following full file system path:
/srv/runtime/current/envoy_override/my-cluster/health_check/min_interval
If found, the value will override any value found in the primary lookup path. This allows the user to customize the runtime values for individual clusters on top of global defaults.
With the layered runtime bootstrap configuration, it is possible to specialize on service cluster via the append_service_cluster option at any disk layer.
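The lookup order for a cluster-specific override can be modeled with the following Python sketch. It is an illustration of the precedence only, not Envoy's implementation (Envoy loads whole layers and merges them). The helper names `candidate_paths` and `lookup` are hypothetical, and the files are created in a temporary directory standing in for the symlink root.

```python
import os
import tempfile
from typing import List, Optional


def candidate_paths(root: str, subdirectory: str, override_subdirectory: str,
                    service_cluster: str, key: str) -> List[str]:
    """Return lookup paths ordered from highest to lowest precedence."""
    parts = key.split(".")
    return [
        os.path.join(root, override_subdirectory, service_cluster, *parts),
        os.path.join(root, subdirectory, *parts),
    ]


def lookup(paths: List[str]) -> Optional[str]:
    """Return the contents of the first existing file, or None."""
    for path in paths:
        if os.path.isfile(path):
            with open(path) as f:
                return f.read().strip()
    return None


with tempfile.TemporaryDirectory() as root:
    paths = candidate_paths(root, "envoy", "envoy_override", "my-cluster",
                            "health_check.min_interval")
    # Only the global default exists.
    os.makedirs(os.path.dirname(paths[1]))
    with open(paths[1], "w") as f:
        f.write("5\n")
    global_value = lookup(paths)
    # Add a cluster-specific override on top of the global default.
    os.makedirs(os.path.dirname(paths[0]))
    with open(paths[0], "w") as f:
        f.write("10\n")
    cluster_value = lookup(paths)
```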
Updating runtime values via symbolic link swap
There are two steps to update any runtime value. First, create a full copy of the entire runtime tree and update the desired runtime values in the copy. Second, atomically swap the symbolic link root from the old tree to the new runtime tree, using the equivalent of the following command:
/srv/runtime:~$ ln -s /srv/runtime/v2 new && mv -Tf new current
How the file system data is deployed, garbage collected, etc. is beyond the scope of this document.
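For deployments driven from a script, the same swap can be sketched in Python. This assumes a POSIX file system, where renaming one symbolic link over another replaces it in a single step. The helper name `swap_symlink` and the `.new` suffix are hypothetical, and the example runs against a temporary directory instead of /srv/runtime.

```python
import os
import tempfile


def swap_symlink(link_path: str, new_target: str) -> None:
    """Point link_path at new_target by renaming a temporary symlink over it."""
    tmp_link = link_path + ".new"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)  # clear a leftover from an interrupted swap
    os.symlink(new_target, tmp_link)
    # rename(2) replaces the old link in one step, so readers see either the
    # old tree or the new one, never a missing link.
    os.replace(tmp_link, link_path)


with tempfile.TemporaryDirectory() as base:
    v1 = os.path.join(base, "v1")
    v2 = os.path.join(base, "v2")
    os.mkdir(v1)
    os.mkdir(v2)
    current = os.path.join(base, "current")
    os.symlink(v1, current)      # initial state: current -> v1
    swap_symlink(current, v2)    # after the swap: current -> v2
    points_to_v2 = os.readlink(current) == v2
```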
Runtime Discovery Service (RTDS)
One or more runtime layers may be specified and delivered via an rtds_layer. This points the runtime layer at a regular xDS endpoint, subscribing to a single xDS resource for the given layer. The resource type for these layers is a Runtime message.
Admin console
Values can be viewed at the /runtime admin endpoint. Values can be modified and added at the /runtime_modify admin endpoint. If runtime is not configured, an empty provider is used which has the effect of using all defaults built into the code, except for any values added via /runtime_modify.
Attention
Use the /runtime_modify endpoint with care. Changes are effectively immediate. It is critical that the admin interface is properly secured.
At most one admin layer may be specified. If a non-empty layered runtime bootstrap configuration is specified with an absent admin layer, any mutating admin console actions will elicit a 503 response.
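A minimal Python sketch of calling the endpoint follows. It assumes the overrides are passed as query parameters on a POST request, as described in the Envoy admin interface documentation, and that the admin listener is at 127.0.0.1:9901; both the address and the helper names are assumptions for illustration.

```python
import urllib.parse
import urllib.request
from typing import Dict


def runtime_modify_url(admin_address: str, overrides: Dict[str, str]) -> str:
    """Build the /runtime_modify URL with overrides as query parameters."""
    query = urllib.parse.urlencode(overrides)
    return f"http://{admin_address}/runtime_modify?{query}"


def apply_overrides(admin_address: str, overrides: Dict[str, str]) -> int:
    """POST the overrides to the admin endpoint and return the HTTP status."""
    request = urllib.request.Request(
        runtime_modify_url(admin_address, overrides), data=b"", method="POST")
    with urllib.request.urlopen(request) as response:
        return response.status


# Hypothetical admin address; only the URL is built here, nothing is sent.
url = runtime_modify_url("127.0.0.1:9901", {"health_check.min_interval": "10"})
```

Calling `apply_overrides` requires a running Envoy instance. An error status such as the 503 mentioned above surfaces as `urllib.error.HTTPError`.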
Atomicity
The runtime will reload and a new snapshot will be generated in a variety of situations, for example:
When a file move operation is detected under the symlink root or the symlink root changes.
When an admin console override is added or modified.
All runtime layers are evaluated during a snapshot. Layers with errors are ignored and excluded from the effective layers (see the num_layers statistic). Walking the symlink root takes a non-zero amount of time, so if true atomicity is desired, the runtime directory should be immutable and symlink changes should be used to orchestrate updates.
Disk layers with the same symlink root will only trigger a single refresh when a file movement is detected. Disk layers with overlapping symlink root paths that are not identical may trigger multiple reloads when a file movement is detected.
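The way failing layers drop out of a snapshot can be sketched as follows. This is a simplified model, not Envoy's code: each layer is represented by a hypothetical loader callable, and `build_snapshot` returns the merged values together with a count that corresponds to the num_layers gauge.

```python
from typing import Any, Callable, Dict, List, Tuple

Layer = Dict[str, Any]


def build_snapshot(loaders: List[Callable[[], Layer]]) -> Tuple[Layer, int]:
    """Evaluate every layer; skip those that fail to load.

    Returns the merged values and the number of layers that loaded cleanly.
    """
    values: Layer = {}
    num_layers = 0
    for load in loaders:
        try:
            layer = load()
        except Exception:
            continue  # a failing layer is excluded from the effective layers
        values.update(layer)
        num_layers += 1
    return values, num_layers


def broken_disk_layer() -> Layer:
    raise OSError("symlink root is missing")  # simulated load error


values, num_layers = build_snapshot([
    lambda: {"health_check.min_interval": 5},
    broken_disk_layer,
    lambda: {"health_check.min_interval": 10},
])
```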
Protobuf and JSON representation
The runtime file system can be represented inside a proto3 message as a google.protobuf.Struct modeling a JSON object with the following rules:
Dot separators map to tree edges.
Scalar leaves (integer, strings, booleans, doubles) are represented with their respective JSON type.
FractionalPercent is represented via its canonical JSON encoding.
An example representation of a setting for the health_check.min_interval key in YAML is:
health_check:
  min_interval: 5
Note
Integer values that are parsed from doubles are rounded down to the nearest whole number.
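The "dot separators map to tree edges" rule can be sketched in Python by expanding flat keys into a nested object. The helper name `keys_to_tree` is hypothetical. The last line illustrates the rounding note for a positive double using `math.floor`.

```python
import math
from typing import Any, Dict


def keys_to_tree(flat: Dict[str, Any]) -> Dict[str, Any]:
    """Expand dotted runtime keys into a nested, JSON-style object."""
    tree: Dict[str, Any] = {}
    for key, value in flat.items():
        node = tree
        *directories, leaf = key.split(".")
        for name in directories:
            node = node.setdefault(name, {})
        node[leaf] = value
    return tree


tree = keys_to_tree({"health_check.min_interval": 5})
# A double read where an integer is expected is rounded down.
as_integer = math.floor(5.9)
```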
Comments
Lines starting with # as the first character are treated as comments.
Comments can be used to provide context on an existing value. Comments are also useful in an otherwise empty file to keep a placeholder for deployment in a time of need.
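A rough Python model of how such a file might be read is shown below. It is an approximation under stated assumptions, not Envoy's parser: lines whose first character is # are dropped and surrounding whitespace is trimmed. The function name `read_runtime_value` and the sample file contents are hypothetical.

```python
def read_runtime_value(text: str) -> str:
    """Drop comment lines and surrounding whitespace from a runtime file."""
    kept = [line for line in text.splitlines() if not line.startswith("#")]
    return "\n".join(kept).strip()


value = read_runtime_value("# raised during the incident\n10\n")
placeholder = read_runtime_value("# intentionally empty; set a value when needed\n")
```

In this sketch a comment-only placeholder file yields an empty string.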
Using runtime overrides for deprecated features
The Envoy runtime is also a part of the Envoy feature deprecation process.
As described in the Envoy breaking change policy, feature deprecation in Envoy is in 3 phases: warn-by-default, fail-by-default, and code removal.
In the first phase, Envoy logs a warning to the warning log that the feature is deprecated and increments the deprecated_feature_use runtime stat. Users are encouraged to consult the list of deprecated features to see how to migrate to the new code path and make sure it is suitable for their use case.
In the second phase, the field will be tagged as disallowed_by_default and use of that configuration field will cause the config to be rejected by default. This disallowed mode can be overridden in runtime configuration by setting envoy.deprecated_features:full_fieldname or envoy.deprecated_features:full_enum_value to true. For example, for a deprecated field Foo.Bar.Eep, set envoy.deprecated_features:Foo.Bar.Eep to true. Use of this override is strongly discouraged. Fail-by-default configuration indicates that the removal of the old code paths is imminent. It is far better for both Envoy users and Envoy contributors if any bugs or feature gaps with the new code paths are flushed out ahead of time, rather than after the code is removed!
Statistics
The file system runtime provider emits some statistics in the runtime. namespace.
Name | Type | Description |
---|---|---|
admin_overrides_active | Gauge | 1 if any admin overrides are active otherwise 0 |
deprecated_feature_use | Counter | Total number of times deprecated features were used. Detailed information about the feature used will be logged to warning logs in the form “Using deprecated option ‘X’ from file Y”. |
load_error | Counter | Total number of load attempts that resulted in an error in any layer |
load_success | Counter | Total number of load attempts that were successful at all layers |
num_keys | Gauge | Number of keys currently loaded |
num_layers | Gauge | Number of layers currently active (without loading errors) |
override_dir_exists | Counter | Total number of loads that did use an override directory |
override_dir_not_exists | Counter | Total number of loads that did not use an override directory |