Overview

The Envoy v2 APIs are defined as proto3 Protocol Buffers in the api tree. They support:

  • Streaming delivery of xDS API updates via gRPC. This reduces resource requirements and can lower the update latency.

  • A new REST-JSON API in which the JSON/YAML formats are derived mechanically via the proto3 canonical JSON mapping.

  • Delivery of updates via the filesystem, REST-JSON or gRPC endpoints.

  • Advanced load balancing through an extended endpoint assignment API and load and resource utilization reporting to management servers.

  • Stronger consistency and ordering properties when needed. The v2 APIs still maintain a baseline eventual consistency model.

See the xDS protocol description for further details on aspects of v2 message exchange between Envoy and the management server.

Bootstrap configuration

To use the v2 API, you must supply a bootstrap configuration file. This provides static server configuration and configures Envoy to access dynamic configuration if needed. It is supplied on the command line via the -c flag, e.g.:

  ./envoy -c <path to config>.{json,yaml,pb,pb_text}

where the extension reflects the underlying v2 config representation.

The Bootstrap message is the root of the configuration. A key concept in the Bootstrap message is the distinction between static and dynamic resources. Resources such as a Listener or Cluster may be supplied either statically in static_resources or have an xDS service such as LDS or CDS configured in dynamic_resources.

Example

Below we use the YAML representation of the config protos and a running example of a service proxying HTTP from 127.0.0.1:10000 to 127.0.0.2:1234.

Static

A minimal fully static bootstrap config is provided below:

  admin:
    access_log_path: /tmp/admin_access.log
    address:
      socket_address: { address: 127.0.0.1, port_value: 9901 }
  static_resources:
    listeners:
    - name: listener_0
      address:
        socket_address: { address: 127.0.0.1, port_value: 10000 }
      filter_chains:
      - filters:
        - name: envoy.http_connection_manager
          typed_config:
            "@type": type.googleapis.com/envoy.config.filter.network.http_connection_manager.v2.HttpConnectionManager
            stat_prefix: ingress_http
            codec_type: AUTO
            route_config:
              name: local_route
              virtual_hosts:
              - name: local_service
                domains: ["*"]
                routes:
                - match: { prefix: "/" }
                  route: { cluster: some_service }
            http_filters:
            - name: envoy.router
    clusters:
    - name: some_service
      connect_timeout: 0.25s
      type: STATIC
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: some_service
        endpoints:
        - lb_endpoints:
          - endpoint:
              address:
                socket_address:
                  address: 127.0.0.1
                  port_value: 1234

Mostly static with dynamic EDS

A bootstrap config that continues from the above example with dynamic endpoint discovery via an EDS gRPC management server listening on 127.0.0.1:5678 is provided below:

  admin:
    access_log_path: /tmp/admin_access.log
    address:
      socket_address: { address: 127.0.0.1, port_value: 9901 }
  static_resources:
    listeners:
    - name: listener_0
      address:
        socket_address: { address: 127.0.0.1, port_value: 10000 }
      filter_chains:
      - filters:
        - name: envoy.http_connection_manager
          typed_config:
            "@type": type.googleapis.com/envoy.config.filter.network.http_connection_manager.v2.HttpConnectionManager
            stat_prefix: ingress_http
            codec_type: AUTO
            route_config:
              name: local_route
              virtual_hosts:
              - name: local_service
                domains: ["*"]
                routes:
                - match: { prefix: "/" }
                  route: { cluster: some_service }
            http_filters:
            - name: envoy.router
    clusters:
    - name: some_service
      connect_timeout: 0.25s
      lb_policy: ROUND_ROBIN
      type: EDS
      eds_cluster_config:
        eds_config:
          api_config_source:
            api_type: GRPC
            grpc_services:
              envoy_grpc:
                cluster_name: xds_cluster
    - name: xds_cluster
      connect_timeout: 0.25s
      type: STATIC
      lb_policy: ROUND_ROBIN
      http2_protocol_options: {}
      upstream_connection_options:
        # configure a TCP keep-alive to detect and reconnect to the admin
        # server in the event of a TCP socket half open connection
        tcp_keepalive: {}
      load_assignment:
        cluster_name: xds_cluster
        endpoints:
        - lb_endpoints:
          - endpoint:
              address:
                socket_address:
                  address: 127.0.0.1
                  port_value: 5678

Notice above that xds_cluster is defined to point Envoy at the management server. Even in an otherwise completely dynamic configuration, some static resources need to be defined to point Envoy at its xDS management server(s).

It’s important to set appropriate TCP Keep-Alive options in the tcp_keepalive block. This will help detect TCP half open connections to the xDS management server and re-establish a full connection.
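As a rough sketch of what explicit keep-alive settings might look like, the fragment below fills in the tcp_keepalive block of xds_cluster. The field names are those of the v2 TcpKeepalive message; the values are illustrative assumptions rather than recommendations. An empty block, as in the example above, leaves these settings at the operating system defaults.

```yaml
upstream_connection_options:
  tcp_keepalive:
    # Assumed values for illustration only; tune them for your network.
    keepalive_probes: 3     # unanswered probes before the connection is considered dead
    keepalive_time: 30      # seconds of idleness before the first probe is sent
    keepalive_interval: 5   # seconds between successive probes
```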

In the above example, the EDS management server could then return a proto encoding of a DiscoveryResponse:

  version_info: "0"
  resources:
  - "@type": type.googleapis.com/envoy.api.v2.ClusterLoadAssignment
    cluster_name: some_service
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: 127.0.0.2
              port_value: 1234

The versioning and type URL scheme that appear above are explained in more detail in the streaming gRPC subscription protocol documentation.

Dynamic

A fully dynamic bootstrap configuration, in which all resources other than those belonging to the management server are discovered via xDS, is provided below:

  admin:
    access_log_path: /tmp/admin_access.log
    address:
      socket_address: { address: 127.0.0.1, port_value: 9901 }
  dynamic_resources:
    lds_config:
      api_config_source:
        api_type: GRPC
        grpc_services:
          envoy_grpc:
            cluster_name: xds_cluster
    cds_config:
      api_config_source:
        api_type: GRPC
        grpc_services:
          envoy_grpc:
            cluster_name: xds_cluster
  static_resources:
    clusters:
    - name: xds_cluster
      connect_timeout: 0.25s
      type: STATIC
      lb_policy: ROUND_ROBIN
      http2_protocol_options: {}
      upstream_connection_options:
        # configure a TCP keep-alive to detect and reconnect to the admin
        # server in the event of a TCP socket half open connection
        tcp_keepalive: {}
      load_assignment:
        cluster_name: xds_cluster
        endpoints:
        - lb_endpoints:
          - endpoint:
              address:
                socket_address:
                  address: 127.0.0.1
                  port_value: 5678

The management server could respond to LDS requests with:

  version_info: "0"
  resources:
  - "@type": type.googleapis.com/envoy.api.v2.Listener
    name: listener_0
    address:
      socket_address:
        address: 127.0.0.1
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.config.filter.network.http_connection_manager.v2.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: AUTO
          rds:
            route_config_name: local_route
            config_source:
              api_config_source:
                api_type: GRPC
                grpc_services:
                  envoy_grpc:
                    cluster_name: xds_cluster
          http_filters:
          - name: envoy.router

The management server could respond to RDS requests with:

  version_info: "0"
  resources:
  - "@type": type.googleapis.com/envoy.api.v2.RouteConfiguration
    name: local_route
    virtual_hosts:
    - name: local_service
      domains: ["*"]
      routes:
      - match: { prefix: "/" }
        route: { cluster: some_service }

The management server could respond to CDS requests with:

  version_info: "0"
  resources:
  - "@type": type.googleapis.com/envoy.api.v2.Cluster
    name: some_service
    connect_timeout: 0.25s
    lb_policy: ROUND_ROBIN
    type: EDS
    eds_cluster_config:
      eds_config:
        api_config_source:
          api_type: GRPC
          grpc_services:
            envoy_grpc:
              cluster_name: xds_cluster

The management server could respond to EDS requests with:

  version_info: "0"
  resources:
  - "@type": type.googleapis.com/envoy.api.v2.ClusterLoadAssignment
    cluster_name: some_service
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: 127.0.0.2
              port_value: 1234

xDS API endpoints

A v2 xDS management server will implement the below endpoints as required for gRPC and/or REST serving. In both streaming gRPC and REST-JSON cases, a DiscoveryRequest is sent and a DiscoveryResponse received following the xDS protocol.
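For illustration, the DiscoveryRequest that Envoy might send when it first subscribes to endpoints for the running example could look roughly like the following in YAML form. The field names follow the DiscoveryRequest message in discovery.proto; the node id and cluster values are hypothetical.

```yaml
version_info: ""          # empty on the first request; later requests echo the last accepted version
node:
  id: node_0              # hypothetical node identifier
  cluster: cluster_0      # hypothetical service cluster name
resource_names:
- some_service            # the cluster whose endpoints are being requested
type_url: type.googleapis.com/envoy.api.v2.ClusterLoadAssignment
response_nonce: ""        # empty until a response has been received on this stream
```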

gRPC streaming endpoints

POST /envoy.api.v2.ClusterDiscoveryService/StreamClusters

See cds.proto for the service definition. This is used by Envoy as a client when

  cds_config:
    api_config_source:
      api_type: GRPC
      grpc_services:
        envoy_grpc:
          cluster_name: some_xds_cluster

is set in the dynamic_resources of the Bootstrap config.

POST /envoy.api.v2.EndpointDiscoveryService/StreamEndpoints

See eds.proto for the service definition. This is used by Envoy as a client when

  eds_config:
    api_config_source:
      api_type: GRPC
      grpc_services:
        envoy_grpc:
          cluster_name: some_xds_cluster

is set in the eds_cluster_config field of the Cluster config.

POST /envoy.api.v2.ListenerDiscoveryService/StreamListeners

See lds.proto for the service definition. This is used by Envoy as a client when

  lds_config:
    api_config_source:
      api_type: GRPC
      grpc_services:
        envoy_grpc:
          cluster_name: some_xds_cluster

is set in the dynamic_resources of the Bootstrap config.

POST /envoy.api.v2.RouteDiscoveryService/StreamRoutes

See rds.proto for the service definition. This is used by Envoy as a client when

  route_config_name: some_route_name
  config_source:
    api_config_source:
      api_type: GRPC
      grpc_services:
        envoy_grpc:
          cluster_name: some_xds_cluster

is set in the rds field of the HttpConnectionManager config.

POST /envoy.api.v2.ScopedRoutesDiscoveryService/StreamScopedRoutes

See srds.proto for the service definition. This is used by Envoy as a client when

  name: some_scoped_route_name
  scoped_rds:
    config_source:
      api_config_source:
        api_type: GRPC
        grpc_services:
          envoy_grpc:
            cluster_name: some_xds_cluster

is set in the scoped_routes field of the HttpConnectionManager config.

POST /envoy.service.discovery.v2.SecretDiscoveryService/StreamSecrets

See sds.proto for the service definition. This is used by Envoy as a client when

  name: some_secret_name
  config_source:
    api_config_source:
      api_type: GRPC
      grpc_services:
        envoy_grpc:
          cluster_name: some_xds_cluster

is set inside an SdsSecretConfig message. This message is used in various places such as the CommonTlsContext.

POST /envoy.service.discovery.v2.RuntimeDiscoveryService/StreamRuntime

See rtds.proto for the service definition. This is used by Envoy as a client when

  name: some_runtime_layer_name
  config_source:
    api_config_source:
      api_type: GRPC
      grpc_services:
        envoy_grpc:
          cluster_name: some_xds_cluster

is set inside the rtds_layer field.

REST endpoints

POST /v2/discovery:clusters

See cds.proto for the service definition. This is used by Envoy as a client when

  cds_config:
    api_config_source:
      api_type: REST
      cluster_names: [some_xds_cluster]

is set in the dynamic_resources of the Bootstrap config.

POST /v2/discovery:endpoints

See eds.proto for the service definition. This is used by Envoy as a client when

  eds_config:
    api_config_source:
      api_type: REST
      cluster_names: [some_xds_cluster]

is set in the eds_cluster_config field of the Cluster config.

POST /v2/discovery:listeners

See lds.proto for the service definition. This is used by Envoy as a client when

  lds_config:
    api_config_source:
      api_type: REST
      cluster_names: [some_xds_cluster]

is set in the dynamic_resources of the Bootstrap config.

POST /v2/discovery:routes

See rds.proto for the service definition. This is used by Envoy as a client when

  route_config_name: some_route_name
  config_source:
    api_config_source:
      api_type: REST
      cluster_names: [some_xds_cluster]

is set in the rds field of the HttpConnectionManager config.

Note

The management server responding to these endpoints must respond with a DiscoveryResponse along with an HTTP status of 200. Additionally, if the configuration that would be supplied has not changed (as indicated by the version supplied by the Envoy client), then the management server can respond with an empty body and an HTTP status of 304.
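Because REST subscriptions are polled rather than streamed, how often Envoy issues these requests is part of the ApiConfigSource. The fragment below is a minimal sketch using the v2 refresh_delay and request_timeout fields; both durations are assumed values chosen only for illustration.

```yaml
cds_config:
  api_config_source:
    api_type: REST
    cluster_names: [some_xds_cluster]
    refresh_delay: 5s      # assumed delay between successive polls
    request_timeout: 1s    # assumed timeout for each poll
```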

Aggregated Discovery Service

While Envoy fundamentally employs an eventual consistency model, ADS provides an opportunity to sequence API update pushes and ensure affinity of a single management server for an Envoy node for API updates. ADS allows one or more APIs and their resources to be delivered on a single, bidirectional gRPC stream by the management server. Without this, some APIs such as RDS and EDS may require the management of multiple streams and connections to distinct management servers.

ADS will allow for hitless updates of configuration by appropriate sequencing. For example, suppose foo.com was mapped to cluster X. We wish to change the mapping in the route table to point foo.com at cluster Y. In order to do this, a CDS/EDS update must first be delivered containing both clusters X and Y.

Without ADS, the CDS/EDS/RDS streams may point at distinct management servers or, when on the same management server, at distinct gRPC streams/connections that require coordination. The EDS resource requests may be split across two distinct streams, one for X and one for Y. ADS allows these to be coalesced to a single stream to a single management server, avoiding the need for distributed synchronization to correctly sequence the update. With ADS, the management server would deliver the CDS, EDS and then RDS updates on a single stream.
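The sequencing described above can be sketched as the series of DiscoveryResponse messages a management server might push on the single ADS stream, shown here as YAML documents separated by ---. The cluster names X and Y and the domain foo.com come from the example; the route configuration name, virtual host name, endpoint addresses and version strings are assumptions, and the cluster definitions are abbreviated.

```yaml
# 1. CDS: clusters X and Y are both present.
version_info: "1"
type_url: type.googleapis.com/envoy.api.v2.Cluster
resources:
- "@type": type.googleapis.com/envoy.api.v2.Cluster
  name: "X"
  connect_timeout: 0.25s
  type: EDS
  eds_cluster_config: { eds_config: { ads: {} } }
- "@type": type.googleapis.com/envoy.api.v2.Cluster
  name: "Y"   # quoted, since a bare Y can be read as a boolean by YAML 1.1 parsers
  connect_timeout: 0.25s
  type: EDS
  eds_cluster_config: { eds_config: { ads: {} } }
---
# 2. EDS: endpoints for both clusters (addresses are assumed).
version_info: "1"
type_url: type.googleapis.com/envoy.api.v2.ClusterLoadAssignment
resources:
- "@type": type.googleapis.com/envoy.api.v2.ClusterLoadAssignment
  cluster_name: "X"
  endpoints:
  - lb_endpoints:
    - endpoint:
        address:
          socket_address: { address: 127.0.0.2, port_value: 1234 }
- "@type": type.googleapis.com/envoy.api.v2.ClusterLoadAssignment
  cluster_name: "Y"
  endpoints:
  - lb_endpoints:
    - endpoint:
        address:
          socket_address: { address: 127.0.0.3, port_value: 1234 }
---
# 3. RDS: only now is foo.com switched from cluster X to cluster Y.
version_info: "1"
type_url: type.googleapis.com/envoy.api.v2.RouteConfiguration
resources:
- "@type": type.googleapis.com/envoy.api.v2.RouteConfiguration
  name: local_route
  virtual_hosts:
  - name: foo
    domains: ["foo.com"]
    routes:
    - match: { prefix: "/" }
      route: { cluster: "Y" }
```

Because Y and its endpoints are known before the route refers to them, no request is routed to an unknown cluster. A later CDS/EDS update could then drop cluster X once nothing references it.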

ADS is only available for gRPC streaming (not REST) and is described more fully in the xDS document. The gRPC endpoint is:

POST /envoy.service.discovery.v2.AggregatedDiscoveryService/StreamAggregatedResources

See discovery.proto for the service definition. This is used by Envoy as a client when

  ads_config:
    api_type: GRPC
    grpc_services:
      envoy_grpc:
        cluster_name: some_ads_cluster

is set in the dynamic_resources of the Bootstrap config.

When this is set, any of the configuration sources above can be set to use the ADS channel. For example, an LDS config could be changed from

  lds_config:
    api_config_source:
      api_type: REST
      cluster_names: [some_xds_cluster]

to

  lds_config: {ads: {}}

with the effect that the LDS stream will be directed to some_ads_cluster over the shared ADS channel.
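Putting the pieces together, a dynamic_resources block that sends both LDS and CDS over the shared ADS channel might look like the sketch below. It assumes that some_ads_cluster is defined as a static cluster, in the same way as xds_cluster in the earlier bootstrap examples.

```yaml
dynamic_resources:
  ads_config:
    api_type: GRPC
    grpc_services:
      envoy_grpc:
        cluster_name: some_ads_cluster   # must be defined in static_resources
  lds_config: {ads: {}}   # listeners delivered over the ADS stream
  cds_config: {ads: {}}   # clusters delivered over the ADS stream
```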

Delta endpoints

The REST, filesystem, and original gRPC xDS implementations all deliver “state of the world” updates: every CDS update must contain every cluster, with the absence of a cluster from an update implying that the cluster is gone. For Envoy deployments with very large numbers of resources and even a trickle of churn, these state-of-the-world updates can be cumbersome.

As of 1.12.0, Envoy supports a “delta” variant of xDS (including ADS), where updates only contain resources added/changed/removed. Delta xDS is a gRPC (only) protocol. Delta uses different request/response protos than SotW (DeltaDiscovery{Request,Response}); see discovery.proto. Conceptually, delta should be viewed as a new xDS transport type: there is static, filesystem, REST, gRPC-SotW, and now gRPC-delta. (Envoy’s implementation of the gRPC-SotW/delta client happens to share most of its code between the two, and something similar is likely possible on the server side. However, they are in fact incompatible protocols. The specification of the delta xDS protocol’s behavior is here.)
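To make the contrast with state-of-the-world concrete, a DeltaDiscoveryResponse for CDS that adds one cluster and removes another might look roughly like this in YAML form. The field names follow the DeltaDiscoveryResponse and Resource messages in discovery.proto; the cluster names new_service and old_service, the version strings and the nonce are hypothetical. Clusters that are not mentioned are left unchanged.

```yaml
system_version_info: "2"
type_url: type.googleapis.com/envoy.api.v2.Cluster
resources:                 # only clusters that were added or changed
- name: new_service
  version: "1"
  resource:
    "@type": type.googleapis.com/envoy.api.v2.Cluster
    name: new_service
    connect_timeout: 0.25s
    lb_policy: ROUND_ROBIN
    type: EDS
    eds_cluster_config:
      eds_config:
        api_config_source:
          api_type: GRPC
          grpc_services:
            envoy_grpc:
              cluster_name: xds_cluster
removed_resources:         # clusters to delete, identified by name
- old_service
nonce: "1"
```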

To use delta, simply set the api_type field of your ApiConfigSource proto(s) to DELTA_GRPC. That works for both xDS and ADS; for ADS, it’s the api_type field of DynamicResources.ads_config, as described in the previous section.
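As a minimal sketch, the two YAML documents below show the setting first for an individual xDS API (CDS here) and then for ADS. The cluster names are reused from the earlier examples and would need to be defined as static clusters.

```yaml
# Delta xDS for a single API.
dynamic_resources:
  cds_config:
    api_config_source:
      api_type: DELTA_GRPC
      grpc_services:
        envoy_grpc:
          cluster_name: xds_cluster
---
# Delta xDS over ADS.
dynamic_resources:
  ads_config:
    api_type: DELTA_GRPC
    grpc_services:
      envoy_grpc:
        cluster_name: some_ads_cluster
  cds_config: {ads: {}}
```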

Management Server Unreachability

When an Envoy instance loses connectivity with the management server, Envoy will latch on to the previous configuration while actively retrying in the background to reestablish the connection with the management server.

Envoy logs at debug level each time it attempts and fails to establish a connection with the management server.

The connected_state statistic provides a signal for monitoring this behavior.

Statistics

The management server has a statistics tree rooted at the control_plane. prefix, with the following statistics:

  Name                  Type      Description
  connected_state       Gauge     A boolean (1 for connected and 0 for disconnected) that indicates the current connection state with the management server
  rate_limit_enforced   Counter   Total number of times rate limit was enforced for management server requests
  pending_requests      Gauge     Total number of pending requests when the rate limit was enforced
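The two rate limit statistics relate to rate limiting of discovery requests, which can be configured through the rate_limit_settings field of an ApiConfigSource for gRPC APIs. The fragment below is a sketch only: the field names are from the v2 RateLimitSettings message, while the values are assumptions picked for illustration.

```yaml
cds_config:
  api_config_source:
    api_type: GRPC
    grpc_services:
      envoy_grpc:
        cluster_name: xds_cluster
    rate_limit_settings:
      max_tokens: 50   # assumed token bucket size
      fill_rate: 5     # assumed refill rate, in tokens per second
```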

xDS subscription statistics

Envoy discovers its various dynamic resources via discovery services referred to as xDS. Resources are requested via subscriptions, by specifying a filesystem path to watch, initiating gRPC streams or polling a REST-JSON URL.
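The gRPC and REST forms of a subscription appear in the endpoint sections above. For completeness, a filesystem subscription might be configured as in the sketch below, where each watched file would contain a DiscoveryResponse for the corresponding resource type. The paths are hypothetical.

```yaml
dynamic_resources:
  lds_config:
    path: /etc/envoy/lds.yaml   # hypothetical path to a file holding a Listener DiscoveryResponse
  cds_config:
    path: /etc/envoy/cds.yaml   # hypothetical path to a file holding a Cluster DiscoveryResponse
```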

The following statistics are generated for all subscriptions.

  Name                            Type      Description
  config_reload                   Counter   Total API fetches that resulted in a config reload due to a different config
  init_fetch_timeout              Counter   Total initial fetch timeouts
  update_attempt                  Counter   Total API fetches attempted
  update_success                  Counter   Total API fetches completed successfully
  update_failure                  Counter   Total API fetches that failed because of network errors
  update_rejected                 Counter   Total API fetches that failed because of schema/validation errors
  version                         Gauge     Hash of the contents from the last successful API fetch
  control_plane.connected_state   Gauge     A boolean (1 for connected and 0 for disconnected) that indicates the current connection state with the management server
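The init_fetch_timeout counter relates to the initial_fetch_timeout field of a ConfigSource, which bounds how long Envoy waits for the first response on a subscription during initialization. A minimal sketch, with an assumed timeout value:

```yaml
lds_config:
  initial_fetch_timeout: 10s   # assumed value; bounds the wait for the first LDS response
  api_config_source:
    api_type: GRPC
    grpc_services:
      envoy_grpc:
        cluster_name: xds_cluster
```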