Setting up Support for External Workloads (beta)
This is a step-by-step guide on how to add external workloads (such as VMs) to your Kubernetes cluster and how to enforce security policies that restrict access.
Note
This is a beta feature. Please provide feedback and file a GitHub issue if you experience any problems.
Prerequisites
- Cilium must be configured to use Kubernetes for identity allocation (identityAllocationMode set to crd). This is the default for new installations.
- External workloads must run a recent enough kernel (>= 4.17) for k8s service access from the external host to work; see Host-Reachable Services for details.
- External workloads must have Docker 20.10 or newer installed (a version that supports the --cgroupns CLI option).
- External workloads must have IP connectivity with the nodes in your cluster. This requirement is typically met by running your VMs in the same cloud provider virtual network (e.g., GCP VPC) as your k8s cluster, or by establishing peering or VPN tunnels between the networks of the nodes of your cluster and your external workloads. Note that this precludes any VMs running behind NATs.
- Every external workload must have a unique IP address assigned to it. The node IPs of the external workloads and of your cluster must not conflict with each other.
- The network between the external workloads and your cluster must allow the node-cluster communication. The exact ports are documented in the Firewall Rules section.
- This guide assumes that your external workload manages domain name resolution either through a stand-alone /etc/resolv.conf or via systemd (e.g., Ubuntu).
- So far this functionality has only been tested with the vxlan tunneling datapath mode (the default for most installations).
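The kernel prerequisite above can be checked on the external workload before installing anything. The following is a minimal sketch, assuming a POSIX shell; kernel_at_least is a hypothetical helper written for this guide and is not part of Cilium.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cilium): succeeds if a kernel release
# string, as printed by `uname -r`, is at least MAJOR.MINOR.
kernel_at_least() {
    release=$1 want_major=$2 want_minor=$3
    # "5.4.0-42-generic" -> major=5, minor=4
    major=${release%%.*}
    rest=${release#*.}
    minor=${rest%%[!0-9]*}
    [ "$major" -gt "$want_major" ] && return 0
    [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]
}

# Compare the running kernel against the 4.17 minimum from the prerequisites.
if kernel_at_least "$(uname -r)" 4 17; then
    echo "kernel OK: $(uname -r)"
else
    echo "kernel too old: $(uname -r), need >= 4.17" >&2
fi
```

The helper only compares the major and minor numbers, which is all the 4.17 requirement needs.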
Limitations
- Transparent encryption of traffic to/from external workloads is currently not supported.
Prepare your cluster
Enable support for external workloads
Your cluster must be configured with support for external workloads enabled. This can be done with the cilium CLI tool by running cilium clustermesh enable after cilium install:
cilium install --config tunnel=vxlan
cilium clustermesh enable
The config option tunnel=vxlan overrides any default that could otherwise be auto-detected for your k8s cluster. This is currently a requirement for external workload support.
Note
If this fails with a message indicating that --service-type needs to be given, add --service-type NodePort to the second command above, i.e. cilium clustermesh enable --service-type NodePort. This will allow you to go through this guide, but be warned that the NodePort service type makes your installation very fragile: it will become non-functional if the node through which the service is accessed is removed from the cluster or otherwise becomes unreachable.
This adds a deployment for clustermesh-apiserver to your cluster, as well as the related cluster resources, such as TLS secrets. The clustermesh-apiserver service is exposed to the external workloads. If you are on GKE, EKS, or AKS, this is done by default using the internal LoadBalancer service type. Override the auto-detection with an explicit --service-type LoadBalancer to use an external LoadBalancer service type with an IP that is accessible from outside of the cluster.
Note
Use the --help option after any of the cilium clustermesh commands to see a short synopsis of available command options.
Tell your cluster about external workloads
To allow an external workload to join your cluster, the cluster must be informed about each such workload. This is done by creating a CiliumExternalWorkload (CEW) resource for each external workload. The CEW resource specifies the name and identity labels (including namespace) for the workload. The name must be the hostname of the external workload, as returned by the hostname command run in the external workload. In this example this is runtime. For now you must also allocate a small IP CIDR that must be unique to each workload. For example, for a VM named runtime that is to join the default namespace (vm is an alias for the external-workload subcommand):
cilium clustermesh vm create runtime -n default --ipv4-alloc-cidr 10.192.1.0/30
-n is an alias for --namespace and can be left out when the value is default. The namespace value will be set as an identity label. The CEW resource itself is not namespaced.
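Because each workload needs its own allocation CIDR, it can help to confirm that a new CIDR does not overlap one already handed out. The following is a minimal sketch, assuming a POSIX shell and IPv4 only; ip_to_int, cidr_range and cidrs_overlap are hypothetical helpers, and 10.192.1.4/30 is an illustrative second CIDR that does not appear in this guide.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the cilium CLI), IPv4 only.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Print "first last" (as integers) for a CIDR such as 10.192.1.0/30.
cidr_range() {
    addr=${1%/*} len=${1#*/}
    size=$(( 1 << (32 - $len) ))
    first=$(( $(ip_to_int "$addr") / size * size ))
    echo "$first $(( first + size - 1 ))"
}

# Succeed if the two CIDRs share at least one address.
cidrs_overlap() {
    set -- $(cidr_range "$1") $(cidr_range "$2")
    [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

# 10.192.1.0/30 is the CIDR used for "runtime"; the second one is made up.
if cidrs_overlap 10.192.1.0/30 10.192.1.4/30; then
    echo "CIDRs overlap, pick another one"
else
    echo "CIDRs are distinct"
fi
```

Two ranges overlap exactly when each one starts no later than the other ends, which is the comparison made in cidrs_overlap.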
To see the list of existing CEW resources, run:
cilium clustermesh vm status
Note that CEW resources are not namespaced, so this command shows the status of all CEW resources regardless of the namespace label that was used when creating them. The --namespace option of the status command controls the namespace of the Cilium deployment in your cluster and usually needs to be left as the default kube-system.
At this point the IP: field in the status for runtime is N/A, indicating that the VM has not yet joined the cluster.
Install and configure Cilium on external workloads
Run the external workload install command on your k8s cluster. This extracts the TLS certificates and other access information from the cluster installation and writes out an installation script to be used in the external workloads to install Cilium and connect it to your k8s cluster:
cilium clustermesh vm install install-external-workload.sh
Note that the created script embeds the IP address of the clustermesh-apiserver service. If the LoadBalancer service type cannot be used, this IP address will be that of the first node in your k8s cluster (for the NodePort service type). If this node is removed from the cluster, the above step for creating the installation script must be repeated and all the external workloads reinstalled. The LoadBalancer service type is not affected by a node removal.
Log in to the external workload. First make sure the hostname matches the name used in the CiliumExternalWorkload resource:
hostname
Next, copy the install-external-workload.sh script created above to the external workload. Then run the installation script:
./install-external-workload.sh
This command launches the Cilium agent in a docker container named cilium and copies the cilium node CLI to your host. This needs sudo permissions, so you may be asked for a password. Note that this cilium command is not the same as the cilium CLI used to manage the Cilium installation on a k8s cluster.
This command waits until the node has been connected to the cluster and the cluster services are available. It then reconfigures /etc/resolv.conf with the IP address of the kube-dns service.
Note
If your external workload node has multiple IP addresses, you may need to tell the Cilium agent which IP to use. To do so, add HOST_IP=<ip-address> to the beginning of the command line above.
Verify basic connectivity
Next you can check the status of the Cilium agent in your external workload:
cilium status
You should see something like:
KVStore: Ok etcd: 1/1 connected, lease-ID=7c02748328e75f57, lock lease-ID=7c02748328e75f59, has-quorum=true: https://clustermesh-apiserver.cilium.io:32379 - 3.4.13 (Leader)
Kubernetes: Disabled
...
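If you want to script this check, the KVStore line can be tested for the Ok state and for quorum. The following is a minimal sketch, assuming a POSIX shell and that the status text looks like the sample above; kvstore_healthy is a hypothetical helper, not a cilium subcommand.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the KVStore line in the given status text
# reports "Ok" and "has-quorum=true".
kvstore_healthy() {
    printf '%s\n' "$1" | grep -E '^KVStore:[[:space:]]+Ok' | grep -q 'has-quorum=true'
}

# Sample text copied from this guide; on a real external workload you would
# pass "$(cilium status)" instead.
sample='KVStore: Ok etcd: 1/1 connected, lease-ID=7c02748328e75f57, lock lease-ID=7c02748328e75f59, has-quorum=true: https://clustermesh-apiserver.cilium.io:32379 - 3.4.13 (Leader)
Kubernetes: Disabled'

if kvstore_healthy "$sample"; then
    echo "KVStore connected with quorum"
else
    echo "KVStore not ready" >&2
fi
```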
Check that cluster DNS works:
nslookup -norecurse clustermesh-apiserver.kube-system.svc.cluster.local
Inspecting status changes in the cluster
The following command in your cluster should show the external workload IPs and their Cilium security IDs:
kubectl get cew
External workloads should also be visible as Cilium Endpoints:
kubectl get cep
Apply Cilium Network Policy to enforce traffic from external workloads
From the external workload, ping the backend IP of the clustermesh-apiserver service to verify connectivity:
ping $(cilium service list get -o jsonpath='{[?(@.spec.flags.name=="clustermesh-apiserver")].spec.backend-addresses[0].ip}')
The ping should keep running when the following CCNP is applied in your cluster:
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: test-ccnp
  namespace: kube-system
spec:
  endpointSelector:
    matchLabels:
      k8s-app: clustermesh-apiserver
  ingress:
  - fromEndpoints:
    - matchLabels:
        io.kubernetes.pod.name: runtime
  - toPorts:
    - ports:
      - port: "2379"
        protocol: TCP
The ping should stop if you delete these lines from the policy (e.g., with kubectl edit ccnp test-ccnp):
  - fromEndpoints:
    - matchLabels:
        io.kubernetes.pod.name: runtime
The ping should continue if you delete the policy:
kubectl delete ccnp test-ccnp
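The three outcomes above follow from how the ingress rules combine: traffic is allowed if it matches any one rule, and an endpoint that no policy selects is not restricted (assuming the default policy enforcement mode). The following is a deliberately simplified sketch of that reasoning in a POSIX shell. It is not how Cilium evaluates policy internally, and ingress_allowed, report and the rule names from-runtime and tcp-2379 are hypothetical labels for the two rules in test-ccnp.

```shell
#!/bin/sh
# Simplified, hypothetical model of the ingress rules in test-ccnp.
# Usage: ingress_allowed SOURCE PROTOCOL PORT RULES
# RULES is a space-separated list of the rules currently in the policy,
# or "none" if the policy has been deleted.
ingress_allowed() {
    src=$1 proto=$2 port=$3 rules=$4
    # No policy selects the endpoint: traffic is not restricted.
    [ "$rules" = none ] && return 0
    for rule in $rules; do
        case $rule in
            # First rule: any traffic from the endpoint named "runtime".
            from-runtime) [ "$src" = runtime ] && return 0 ;;
            # Second rule: TCP port 2379 from any source.
            tcp-2379) [ "$proto" = TCP ] && [ "$port" = 2379 ] && return 0 ;;
        esac
    done
    return 1
}

# Report what happens to an ICMP ping from "runtime" under the given rules.
report() {
    if ingress_allowed runtime ICMP - "$1"; then
        echo "ping allowed"
    else
        echo "ping dropped"
    fi
}

report "from-runtime tcp-2379"   # policy as applied
report "tcp-2379"                # fromEndpoints rule deleted
report "none"                    # policy deleted
```

With only the port rule left, ICMP from runtime matches nothing and is dropped, while TCP connections to port 2379 are still accepted.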
Clean-up
You can remove the Cilium installation from your external workload by running the installation script with the uninstall argument:
./install-external-workload.sh uninstall
Conclusion
With the above we have enabled policy-based communication between external workloads and pods in your Kubernetes cluster. We have also established service load-balancing from external workloads to your cluster backends, and configured domain name lookup in the external workload to be served by kube-dns of your cluster.