Egress Gateway
Note
Egress Gateway requires a 5.2 or more recent kernel.
The egress gateway feature routes all IPv4 connections originating from pods and destined to specific cluster-external CIDRs through particular nodes, from now on called “gateway nodes”.
When the egress gateway feature is enabled and egress gateway policies are in place, matching packets that leave the cluster are masqueraded with selected, predictable IPs associated with the gateway nodes. As an example, this feature can be used in combination with legacy firewalls to allow traffic to legacy infrastructure only from specific pods within a given namespace. The pods typically have ever-changing IP addresses, and even if masquerading were to be used as a way to mitigate this, the IP addresses of nodes can also change frequently over time.
This document explains how to enable the egress gateway feature and how to configure egress gateway policies to route and SNAT the egress traffic for a specific workload.
Note
This guide assumes that Cilium has been correctly installed in your Kubernetes cluster. Please see Quick Installation for more information. If unsure, run cilium status and validate that Cilium is up and running.
Preliminary Considerations
Cilium must make use of network-facing interfaces and IP addresses present on the designated gateway nodes. These interfaces and IP addresses must be provisioned and configured by the operator based on their networking environment. The process is highly dependent on said networking environment. For example, in AWS/EKS, and depending on the requirements, this may mean creating one or more Elastic Network Interfaces with one or more IP addresses and attaching them to instances that serve as gateway nodes so that AWS can adequately route traffic flowing from and to the instances. Other cloud providers have similar networking requirements and constructs.
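As a rough illustration only (the subnet, security group, ENI, and instance IDs below, as well as the IP address, are placeholders), provisioning an extra interface with a dedicated egress IP on an AWS gateway instance might look like this with the AWS CLI:

$ # Illustrative sketch only: all IDs and the IP address are placeholders.
$ aws ec2 create-network-interface \
    --subnet-id subnet-0123456789abcdef0 \
    --groups sg-0123456789abcdef0 \
    --private-ip-address 10.168.60.100
$ aws ec2 attach-network-interface \
    --network-interface-id eni-0123456789abcdef0 \
    --instance-id i-0123456789abcdef0 \
    --device-index 1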
Additionally, the enablement of the egress gateway feature requires that both BPF masquerading and the kube-proxy replacement are enabled, which may not be possible in all environments (due to, e.g., incompatible kernel versions).
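One quick way to check the kernel requirement mentioned above is to inspect the kernel version each node reports to Kubernetes:

$ kubectl get nodes -o custom-columns=NAME:.metadata.name,KERNEL:.status.nodeInfo.kernelVersion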
Compatibility with other features
L7 policies
Egress gateway is currently partially incompatible with L7 policies. Specifically, when an egress gateway policy and an L7 policy both select the same endpoint, traffic from that endpoint will not go through egress gateway, even if the policy allows it. Full support will be added in an upcoming release once GitHub issue 19642 is resolved.
LB acceleration
When the egress gateway feature is used in combination with XDP-based LB acceleration (--bpf-lb-acceleration=native), the user must ensure that the host iptables configuration allows packets through the FORWARD chain. Full support will be added in an upcoming release once GitHub issue 19717 is resolved.
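What "allowing packets through the FORWARD chain" looks like depends on the distribution and on any other firewall tooling in place; as a minimal sketch, an explicit accept rule could be appended on the gateway nodes:

$ # Minimal sketch: accept forwarded packets. Adapt to your firewall policy.
$ sudo iptables -A FORWARD -j ACCEPT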
Enable egress gateway
The egress gateway feature and all its requirements can be enabled as follows:
Helm:

$ helm upgrade cilium cilium/cilium --version 1.12.0 \
    --namespace kube-system \
    --reuse-values \
    --set egressGateway.enabled=true \
    --set bpf.masquerade=true \
    --set kubeProxyReplacement=strict \
    --set l7Proxy=false

ConfigMap:

enable-bpf-masquerade: true
enable-ipv4-egress-gateway: true
enable-l7-proxy: false
kube-proxy-replacement: strict
Roll out both the agent pods and the operator pods to make the changes effective:
$ kubectl rollout restart ds cilium -n kube-system
$ kubectl rollout restart deploy cilium-operator -n kube-system
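After the rollout you can verify that the relevant features are active via cilium status. The exact output varies by version and environment, but it should report BPF masquerading and strict kube-proxy replacement along these lines (the interface shown is illustrative):

$ kubectl -n kube-system exec ds/cilium -- cilium status | grep -E 'KubeProxyReplacement|Masquerading'
KubeProxyReplacement:    Strict   [eth0]
Masquerading:            BPF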
Compatibility with cloud environments
Based on the specific configuration of the cloud provider and network interfaces, it is possible that traffic leaves a node through the wrong interface. To work around this issue, Cilium can be instructed to install the necessary IP rules and routes to direct traffic through the appropriate network-facing interface, as follows:
Helm:

$ helm upgrade cilium cilium/cilium --version 1.12.0 \
    [..] \
    --set egressGateway.installRoutes=true

ConfigMap:

install-egress-gateway-routes: true
Writing egress gateway policies
The API provided by Cilium to drive the egress gateway feature is the CiliumEgressGatewayPolicy resource.
Metadata
CiliumEgressGatewayPolicy is a cluster-scoped custom resource definition, so a .metadata.namespace field should not be specified:
apiVersion: cilium.io/v2
kind: CiliumEgressGatewayPolicy
metadata:
  name: example-policy
To target only pods belonging to a given namespace, labels/expressions should be used instead (as described below).
Selecting source pods
The selectors field of a CiliumEgressGatewayPolicy resource is used to select source pods via a label selector. This can be done using matchLabels:
selectors:
- podSelector:
    matchLabels:
      labelKey: labelVal
It can also be done using matchExpressions:
selectors:
- podSelector:
    matchExpressions:
    - {key: testKey, operator: In, values: [testVal]}
    - {key: testKey2, operator: NotIn, values: [testVal2]}
Moreover, multiple podSelector entries can be specified:
selectors:
- podSelector:
    [..]
- podSelector:
    [..]
To select pods belonging to a given namespace, the special io.kubernetes.pod.namespace label should be used.
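For instance, a selector matching pods labeled app: example in the production namespace (both values are purely illustrative) would look like:

selectors:
- podSelector:
    matchLabels:
      io.kubernetes.pod.namespace: production
      app: example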
Note
Only security identities will be taken into account. See Limiting Identity-Relevant Labels for more information.
Selecting the destination
One or more IPv4 destination CIDRs can be specified with destinationCIDRs:
destinationCIDRs:
- "a.b.c.d/32"
- "e.f.g.0/24"
Note
Any IP belonging to these ranges which is also an internal cluster IP (e.g. pods, nodes, Kubernetes API server) will be excluded from the egress gateway SNAT logic.
Selecting and configuring the gateway node
The node that should act as gateway node for a given policy can be configured with the egressGateway field. The node is matched based on its labels, with the nodeSelector field:
egressGateway:
  nodeSelector:
    matchLabels:
      testLabel: testVal
Note
In case multiple nodes match the given set of labels, the first node in lexical ordering based on name will be selected.
The IP address that should be used to SNAT traffic must also be configured. There are 3 different ways this can be achieved:
1. By specifying the interface:

   egressGateway:
     nodeSelector:
       matchLabels:
         testLabel: testVal
     interface: ethX

   In this case the first IPv4 address assigned to the ethX interface will be used.

2. By explicitly specifying the egress IP:

   egressGateway:
     nodeSelector:
       matchLabels:
         testLabel: testVal
     egressIP: a.b.c.d

   Warning

   The egress IP must be assigned to a network device on the node.

3. By omitting both egressIP and interface properties, which will make the agent use the first IPv4 assigned to the interface for the default route:

   egressGateway:
     nodeSelector:
       matchLabels:
         testLabel: testVal
Regardless of which way the egress IP is configured, the user must ensure that Cilium is running on the device that has the egress IP assigned to it, by setting the --devices agent option accordingly.
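For example, assuming the egress IP lives on enp0s8 (the interface names here are illustrative, and we assume the Helm devices value maps to the agent's --devices flag), the device list could be set via Helm:

$ helm upgrade cilium cilium/cilium --version 1.12.0 \
    --namespace kube-system \
    --reuse-values \
    --set devices='{eth0,enp0s8}'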
Example policy
Below is an example of a CiliumEgressGatewayPolicy resource that conforms to the specification above:
apiVersion: cilium.io/v2
kind: CiliumEgressGatewayPolicy
metadata:
  name: egress-sample
spec:
  # Specify which pods should be subject to the current policy.
  # Multiple pod selectors can be specified.
  selectors:
  - podSelector:
      matchLabels:
        org: empire
        class: mediabot
        # The following label selects default namespace
        io.kubernetes.pod.namespace: default

  # Specify which destination CIDR(s) this policy applies to.
  # Multiple CIDRs can be specified.
  destinationCIDRs:
  - "0.0.0.0/0"

  # Configure the gateway node.
  egressGateway:
    # Specify which node should act as gateway for this policy.
    nodeSelector:
      matchLabels:
        node.kubernetes.io/name: a-specific-node

    # Specify the IP address used to SNAT traffic matched by the policy.
    # It must exist as an IP associated with a network interface on the instance.
    egressIP: 10.168.60.100

    # Alternatively it's possible to specify the interface to be used for egress traffic.
    # In this case the first IPv4 assigned to that interface will be used as egress IP.
    # interface: enp0s8
Creating the CiliumEgressGatewayPolicy resource above would cause all traffic originating from pods with the org: empire and class: mediabot labels in the default namespace and destined to 0.0.0.0/0 (i.e. all traffic leaving the cluster) to be routed through the gateway node with the node.kubernetes.io/name: a-specific-node label, which will then SNAT said traffic with the 10.168.60.100 egress IP.
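Assuming the manifest above is saved as egress-sample.yaml (an arbitrary file name), it can be applied and listed like any other cluster-scoped resource; the output below is indicative:

$ kubectl apply -f egress-sample.yaml
$ kubectl get ciliumegressgatewaypolicies
NAME            AGE
egress-sample   5s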
Testing the egress gateway feature
In this section we are going to show the necessary steps to test the feature. First we deploy a pod that connects to a cluster-external service. Then we apply a CiliumEgressGatewayPolicy and observe that the pod's connection gets redirected through the Gateway node. We assume a 2-node cluster with IPs 192.168.60.11 (node1) and 192.168.60.12 (node2). The client pod gets deployed to node1, and the CEGP selects node2 as Gateway node.
Create an external service (optional)
If you don’t have an external service to experiment with, you can use Nginx, as the server access logs will show from which IP address the request is coming.
Create an nginx service on a Linux node that is external to the existing Kubernetes cluster, and use it as the destination of the egress traffic:
$ # Install and start nginx
$ sudo apt install nginx
$ sudo systemctl start nginx
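You can quickly confirm nginx is serving on the external host before moving on (output abbreviated):

$ curl http://localhost:80
<!DOCTYPE html>
[...]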
In this example, the IP associated with the host running the Nginx instance will be 192.168.60.13.
Deploy client pods
Deploy a client pod that will be used to connect to the Nginx instance:
$ kubectl create -f https://raw.githubusercontent.com/cilium/cilium/v1.12/examples/kubernetes-dns/dns-sw-app.yaml
$ kubectl get pods
NAME           READY   STATUS    RESTARTS   AGE
pod/mediabot   1/1     Running   0          14s
$ kubectl exec mediabot -- curl http://192.168.60.13:80
Verify from the Nginx access log (or other external services) that the request is coming from one of the nodes in the Kubernetes cluster. In this example the access logs should contain something like:
$ tail /var/log/nginx/access.log
[...]
192.168.60.11 - - [04/Apr/2021:22:06:57 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.52.1"
Since the client pod is running on the node 192.168.60.11, it is expected that, without any Cilium egress gateway policy in place, traffic will leave the cluster with the IP of that node.
Apply egress gateway policy
Download the egress-sample Egress Gateway Policy YAML:
$ wget https://raw.githubusercontent.com/cilium/cilium/v1.12/examples/kubernetes-egress-gateway/egress-nat-policy-egress-gateway.yaml
Modify the destinationCIDRs field to include the IP of the host where your designated external service is running.
Specifying an IP address in the egressIP field is optional. To make things easier in this example, it is possible to comment out that line. This way, the agent will use the first IPv4 assigned to the interface for the default route.
To let the policy select the node designated to be the Egress Gateway, apply the label egress-node: true to it:
$ kubectl label nodes <egress-gateway-node> egress-node=true
Note that the Egress Gateway node should be a different node from the one where the mediabot pod is running.
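To double-check both conditions, you can confirm which node the pod landed on and which node carries the label (node names will differ in your cluster):

$ kubectl get pods -o wide
$ kubectl get nodes -l egress-node=true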
Apply the egress-sample Egress Gateway Policy, which will cause all traffic from the mediabot pod to leave the cluster with the IP of the Egress Gateway node:
$ kubectl apply -f egress-nat-policy-egress-gateway.yaml
Verify the setup
We can now verify with the client pod that the policy is working correctly:
$ kubectl exec mediabot -- curl http://192.168.60.13:80
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
[...]
The access log from Nginx should show that the request is coming from the selected Egress IP rather than from the IP of the node where the pod is running:
$ tail /var/log/nginx/access.log
[...]
192.168.60.100 - - [04/Apr/2021:22:06:57 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.52.1"