Local rate limit

The HTTP local rate limit filter applies a token bucket rate limit when the request’s route or virtual host has a per filter local rate limit configuration.

When a request checks the local rate limit token bucket and no tokens are available, the filter returns a 429 response (the response is configurable) and sets the x-envoy-ratelimited response header. Additional response headers can also be configured.
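As a rough illustration of the configurable response, the sketch below overrides the status code and attaches one extra response header. It assumes the filter's status field with the ServiceUnavailable (503) enum value; the bucket sizes and the x-local-rate-limit header name are illustrative only.

```yaml
name: envoy.filters.http.local_ratelimit
typed_config:
  "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
  stat_prefix: http_local_rate_limiter
  # Assumed override: respond with 503 instead of the default 429
  # when no token is available.
  status:
    code: ServiceUnavailable
  token_bucket:
    max_tokens: 100
    tokens_per_fill: 100
    fill_interval: 1s
  filter_enabled:
    default_value:
      numerator: 100
      denominator: HUNDRED
  filter_enforced:
    default_value:
      numerator: 100
      denominator: HUNDRED
  # Extra header returned alongside x-envoy-ratelimited on limited responses.
  response_headers_to_add:
  - append: false
    header:
      key: x-local-rate-limit
      value: 'true'
```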

The filter can also be configured to add request headers to requests that are forwarded upstream when it is enabled but not enforced.
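A minimal sketch of such a non-enforcing setup follows: the token bucket is checked for every request, nothing is rejected, and a marker header is added to forwarded requests. It assumes the filter's request_headers_to_add_when_not_enforced field, and that the header is attached only to requests that exceeded the limit; the header name x-local-rate-limit-shadow and the bucket sizes are hypothetical.

```yaml
name: envoy.filters.http.local_ratelimit
typed_config:
  "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
  stat_prefix: http_local_rate_limiter
  token_bucket:
    max_tokens: 100
    tokens_per_fill: 100
    fill_interval: 1s
  # Consult the token bucket for 100% of requests...
  filter_enabled:
    runtime_key: local_rate_limit_enabled
    default_value:
      numerator: 100
      denominator: HUNDRED
  # ...but enforce the decision for none of them.
  filter_enforced:
    runtime_key: local_rate_limit_enforced
    default_value:
      numerator: 0
      denominator: HUNDRED
  # Hypothetical marker header so the upstream can see which requests
  # would have been limited.
  request_headers_to_add_when_not_enforced:
  - append: false
    header:
      key: x-local-rate-limit-shadow
      value: 'true'
```

Running in this mode first lets the rate_limited counter and the upstream logs show the likely impact before enforcement is turned on.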

Depending on the value of local_rate_limit_per_downstream_connection, the token bucket is either shared across all workers or allocated per downstream connection. The local rate limits are therefore applied either per Envoy process or per downstream connection. By default, they are applied per Envoy process.
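As a sketch of the per-connection mode, setting the flag to true gives each downstream connection its own bucket instead of one bucket for the whole process. The bucket sizes below are illustrative only.

```yaml
name: envoy.filters.http.local_ratelimit
typed_config:
  "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
  stat_prefix: http_local_rate_limiter
  # With the flag below set to true, this bucket applies to each
  # downstream connection separately.
  token_bucket:
    max_tokens: 50
    tokens_per_fill: 50
    fill_interval: 1s
  filter_enabled:
    default_value:
      numerator: 100
      denominator: HUNDRED
  filter_enforced:
    default_value:
      numerator: 100
      denominator: HUNDRED
  local_rate_limit_per_downstream_connection: true
```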

Example configuration

Example filter configuration for a globally set rate limiter (i.e., all vhosts/routes share the same token bucket):

    name: envoy.filters.http.local_ratelimit
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
      stat_prefix: http_local_rate_limiter
      token_bucket:
        max_tokens: 10000
        tokens_per_fill: 1000
        fill_interval: 1s
      filter_enabled:
        runtime_key: local_rate_limit_enabled
        default_value:
          numerator: 100
          denominator: HUNDRED
      filter_enforced:
        runtime_key: local_rate_limit_enforced
        default_value:
          numerator: 100
          denominator: HUNDRED
      response_headers_to_add:
      - append: false
        header:
          key: x-local-rate-limit
          value: 'true'
      local_rate_limit_per_downstream_connection: false

Example filter configuration for a globally disabled rate limiter but enabled for a specific route:

    name: envoy.filters.http.local_ratelimit
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
      stat_prefix: http_local_rate_limiter

The route specific configuration:

    route_config:
      name: local_route
      virtual_hosts:
      - name: local_service
        domains: ["*"]
        routes:
        - match: { prefix: "/path/with/rate/limit" }
          route: { cluster: service_protected_by_rate_limit }
          typed_per_filter_config:
            envoy.filters.http.local_ratelimit:
              "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
              token_bucket:
                max_tokens: 10000
                tokens_per_fill: 1000
                fill_interval: 1s
              filter_enabled:
                runtime_key: local_rate_limit_enabled
                default_value:
                  numerator: 100
                  denominator: HUNDRED
              filter_enforced:
                runtime_key: local_rate_limit_enforced
                default_value:
                  numerator: 100
                  denominator: HUNDRED
              response_headers_to_add:
              - append: false
                header:
                  key: x-local-rate-limit
                  value: 'true'
        - match: { prefix: "/" }
          route: { cluster: default_service }

Note that if this filter is configured as globally disabled and there are no virtual host or route level token buckets, no rate limiting will be applied.

Using rate limit descriptors for local rate limiting

Rate limit descriptors can be used to override local per-route rate limiting. A route’s rate limit actions are used to look up a local descriptor in the filter configuration’s descriptor list. If a local descriptor’s entries match the descriptor entries produced by the route’s rate limit actions, that descriptor’s token bucket settings decide whether the request is rate limited. If there are no matching descriptor entries, the default token bucket is used.

Example filter configuration using descriptors:

local-rate-limit-with-descriptors.yaml

    route_config:
      name: local_route
      virtual_hosts:
      - name: local_service
        domains: ["*"]
        routes:
        - match: { prefix: "/foo" }
          route:
            cluster: service_protected_by_rate_limit
            rate_limits:
            - actions: # any actions in here
              - request_headers:
                  header_name: x-envoy-downstream-service-cluster
                  descriptor_key: client_cluster
              - request_headers:
                  header_name: ":path"
                  descriptor_key: path
          typed_per_filter_config:
            envoy.filters.http.local_ratelimit:
              "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
              stat_prefix: test
              token_bucket:
                max_tokens: 1000
                tokens_per_fill: 1000
                fill_interval: 60s
              filter_enabled:
                runtime_key: test_enabled
                default_value:
                  numerator: 100
                  denominator: HUNDRED
              filter_enforced:
                runtime_key: test_enforced
                default_value:
                  numerator: 100
                  denominator: HUNDRED
              response_headers_to_add:
              - append: false
                header:
                  key: x-test-rate-limit
                  value: 'true'
              descriptors:
              - entries:
                - key: client_cluster
                  value: foo
                - key: path
                  value: /foo/bar
                token_bucket:
                  max_tokens: 10
                  tokens_per_fill: 10
                  fill_interval: 60s
              - entries:
                - key: client_cluster
                  value: foo
                - key: path
                  value: /foo/bar2
                token_bucket:
                  max_tokens: 100
                  tokens_per_fill: 100
                  fill_interval: 60s
        - match: { prefix: "/" }
          route: { cluster: default_service }

In this example, requests to routes prefixed with “/foo” are rate limited as follows. Requests from the downstream service cluster “foo” for the path “/foo/bar” are allowed 10 req/min. Requests from the downstream service cluster “foo” for the path “/foo/bar2” are allowed 100 req/min. All other requests are allowed 1000 req/min.

Statistics

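The local rate limit filter outputs statistics in the <stat_prefix>.http_local_rate_limit. namespace. For example, with stat_prefix: http_local_rate_limiter, the enabled counter is emitted as http_local_rate_limiter.http_local_rate_limit.enabled. 429 responses (or responses with the configured status code) are emitted to the normal cluster dynamic HTTP statistics.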

| Name | Type | Description |
|------|------|-------------|
| enabled | Counter | Total number of requests for which the rate limiter was consulted |
| ok | Counter | Total under limit responses from the token bucket |
| rate_limited | Counter | Total responses without an available token (but not necessarily enforced) |
| enforced | Counter | Total number of requests for which rate limiting was applied (e.g., 429 returned) |

Runtime

The HTTP local rate limit filter supports the following runtime fractional settings:

http_filter_enabled

% of requests that will check the local rate limit decision, but not enforce, for a given route_key specified in the local rate limit configuration. Defaults to 0.

http_filter_enforcing

% of requests that will enforce the local rate limit decision for a given route_key specified in the local rate limit configuration. Defaults to 0. This can be used to test what would happen before fully enforcing the outcome.
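As a sketch of a gradual rollout, the bootstrap fragment below overrides the runtime keys named by runtime_key in the example configurations above (local_rate_limit_enabled and local_rate_limit_enforced). It assumes a static runtime layer and that integer runtime values are interpreted as percentages out of 100; the layer name static_layer_0 is arbitrary.

```yaml
layered_runtime:
  layers:
  - name: static_layer_0
    static_layer:
      # Consult the token bucket for every request...
      local_rate_limit_enabled: 100
      # ...but enforce the decision for only 10% of them while testing.
      local_rate_limit_enforced: 10
```

Raising the enforced percentage in steps, while watching the rate_limited and enforced counters, shows how many requests would be rejected before the limit is fully enforced.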