Architecture
The following diagram shows the main components of the Consul architecture when deployed to an ECS cluster:
- Consul servers: Production-ready Consul server cluster
- Application tasks: Runs user application containers along with two helper containers:
- Consul client: The Consul client container runs Consul. The Consul client communicates with the Consul server and configures the Envoy proxy sidecar. This communication is called control plane communication.
- Sidecar proxy: The sidecar proxy container runs Envoy. All requests to and from the application container(s) run through the sidecar proxy. This communication is called data plane communication.
- Mesh Init: Each task runs a short-lived container, called
mesh-init
, which sets up initial configuration for Consul and Envoy. - Health Syncing: Optionally, an additional
health-sync
container can be included in a task to sync health statuses from ECS into Consul. - ACL Controller: Automatically provisions Consul ACL tokens for Consul clients and service mesh services in an ECS Cluster.
For more information about how Consul works in general, see Consul’s Architecture Overview.
Task Startup
This diagram shows the timeline of a task starting up and all its containers:
T0: ECS starts the task. The
consul-client
andmesh-init
containers start:- consul-client uses the
retry-join
option to join the Consul cluster - mesh-init registers the service for the current task and its sidecar proxy with Consul. It runs
consul connect envoy -bootstrap
to generate Envoy’s bootstrap JSON file and write it to a shared volume.mesh-init
exits after completing these operations.
- consul-client uses the
T1: The following containers start:
- The
sidecar-proxy
container starts and runs Envoy by executingenvoy -c <path-to-bootstrap-json>
. - If applicable, the
health-sync
container syncs health checks from ECS to Consul (see ECS Health Check Syncing).
- The
- T2: The
sidecar-proxy
container is marked as healthy by ECS. It uses a health check that detects if its public listener port is open. At this time, your application containers are started since all Consul machinery is ready to service requests. The only running containers areconsul-client
,sidecar-proxy
, and your application container(s).
Task Shutdown
This diagram shows an example timeline of a task shutting down:
- T0: ECS sends a TERM signal to all containers. Each container reacts to the TERM signal:
- consul-client begins to gracefully leave the Consul cluster.
- health-sync stops syncing health status from ECS into Consul checks.
- sidecar-proxy ignores the TERM signal and continues running until the
user-app
container exits. This allows the application container to continue making outgoing requests through the proxy to the mesh. - user-app exits if it is not configured to ignore the TERM signal. The
user-app
container will continue running if it is configured to ignore the TERM signal.
- T1:
- health-sync updates its Consul checks to critical status and exits. This ensures this service instance is marked unhealthy.
- sidecar-proxy notices the
user-app
container has stopped and exits.
- T2:
consul-client
finishes gracefully leaving the Consul datacenter and exits. - T3:
- ECS notices all containers have exited, and will soon change the Task status to
STOPPED
- Updates about this task have reached the rest of the Consul cluster, so downstream proxies have been updated to stopped sending traffic to this task.
- ECS notices all containers have exited, and will soon change the Task status to
- T4: At this point task shutdown should be complete. Otherwise, ECS will send a KILL signal to any containers still running. The KILL signal cannot be ignored and will forcefully stop containers. This will interrupt in-progress operations and possibly cause errors.
Automatic ACL Token Provisioning
Consul ACL tokens secure communication between agents and services. The following containers in a task require an ACL token:
- consul-client: The Consul client uses a token to authorize itself with Consul servers. All
consul-client
containers share the same token. - mesh-init: The
mesh-init
container uses a token to register the service with Consul. This token is unique for the Consul service, and is shared by instances of the service.
The ACL controller automatically creates ACL tokens for mesh-enabled tasks in an ECS cluster. The acl-controller
Terraform module creates the ACL controller task. The controller creates the ACL token used by consul-client
containers at startup and then watches for tasks in the cluster. It checks tags to determine if the task is mesh-enabled. If so, it creates the service ACL token for the task, if the token does not yet exist.
The ACL controller stores all ACL tokens in AWS Secrets Manager, and tasks are configured to pull these tokens from AWS Secrets Manager when they start.
ECS Health Check Syncing
If the following conditions apply, ECS health checks automatically sync with Consul health checks for all application containers:
- marked as
essential
- have ECS
healthChecks
- are not configured with native Consul health checks
The mesh-init
container creates a TTL health check for every container that fits these criteria and the health-sync
container ensures that the ECS and Consul health checks remain in sync.