Networking
This page explains how CoreDNS, the Traefik Ingress controller, and Klipper service load balancer work within K3s.
Refer to the Installation Network Options page for details on Flannel configuration options and backend selection, or how to set up your own CNI.
For information on which ports need to be opened for K3s, refer to the Networking Requirements.
CoreDNS
CoreDNS is deployed automatically on server startup. To disable it, configure all servers in the cluster with the --disable=coredns option.
If you don’t install CoreDNS, you will need to install a cluster DNS provider yourself.
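The same setting can also be kept in the K3s configuration file rather than on the command line. A minimal sketch, assuming the default configuration path /etc/rancher/k3s/config.yaml (the path is an assumption, not stated on this page); it would need to be present on every server:

```yaml
# /etc/rancher/k3s/config.yaml (assumed default path)
# Intended to be equivalent to passing --disable=coredns at startup.
disable:
  - coredns
```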
Traefik Ingress Controller
Traefik is a modern HTTP reverse proxy and load balancer made to deploy microservices with ease. It simplifies networking complexity while designing, deploying, and running applications.
The Traefik ingress controller deploys a LoadBalancer Service that uses ports 80 and 443. By default, ServiceLB will expose these ports on all cluster members, meaning these ports will not be usable for other HostPort or NodePort pods.
Traefik is deployed by default when starting the server. For more information see Managing Packaged Components. The default config file is found in /var/lib/rancher/k3s/server/manifests/traefik.yaml.
The traefik.yaml file should not be edited manually, as K3s will replace the file with defaults at startup. Instead, customize Traefik by creating an additional HelmChartConfig manifest in /var/lib/rancher/k3s/server/manifests. For more details and an example see Customizing Packaged Components with HelmChartConfig. For more information on the possible configuration values, refer to the official Traefik Helm Configuration Parameters.
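As a rough illustration of such a manifest, the sketch below overrides a single chart value. The file name traefik-config.yaml and the chosen value (the log level) are assumptions made for the example; check the Traefik chart's values for the keys supported by your chart version.

```yaml
# /var/lib/rancher/k3s/server/manifests/traefik-config.yaml
# (hypothetical file name; use any name other than traefik.yaml)
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik            # matches the packaged Traefik HelmChart
  namespace: kube-system
spec:
  valuesContent: |-
    # Illustrative override only: raise Traefik's general log level.
    logs:
      general:
        level: DEBUG
```

Because the override lives in its own file, it is not overwritten when K3s rewrites traefik.yaml at startup.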
To remove Traefik from your cluster, start all servers with the --disable=traefik flag.
K3s versions 1.20 and earlier include Traefik v1. K3s versions 1.21 and later install Traefik v2, unless an existing installation of Traefik v1 is found, in which case Traefik is not upgraded to v2. For more information on the specific version of Traefik included with K3s, consult the Release Notes for your version.
To migrate from an older Traefik v1 instance, refer to the Traefik documentation and migration tool.
Network Policy Controller
K3s includes an embedded network policy controller. The underlying implementation is kube-router’s netpol controller library; no other kube-router functionality is present.
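For context, the controller acts on standard Kubernetes NetworkPolicy resources. The following is a generic sketch of such a policy, not something K3s installs; the policy name and namespace are hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress   # hypothetical name
  namespace: default           # hypothetical namespace
spec:
  podSelector: {}              # empty selector: applies to every Pod in the namespace
  policyTypes:
    - Ingress                  # no ingress rules are listed, so inbound traffic is denied
```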
To disable it, start each server with the --disable-network-policy flag.
note
Network policy iptables rules are not removed if the K3s configuration is changed to disable the network policy controller. To clean up the configured kube-router network policy rules after disabling the network policy controller, use the k3s-killall.sh script, or clean them using iptables-save and iptables-restore. These steps must be run manually on all nodes in the cluster.
iptables-save | grep -v KUBE-ROUTER | iptables-restore
ip6tables-save | grep -v KUBE-ROUTER | ip6tables-restore
Service Load Balancer
Any LoadBalancer controller can be deployed to your K3s cluster. By default, K3s provides a load balancer known as ServiceLB (formerly Klipper LoadBalancer) that uses available host ports.
Upstream Kubernetes allows Services of type LoadBalancer to be created, but doesn’t include a default load balancer implementation, so these services will remain pending until one is installed. Many hosted services require a cloud provider such as Amazon EC2 or Microsoft Azure to offer an external load balancer implementation. By contrast, the K3s ServiceLB makes it possible to use LoadBalancer Services without a cloud provider or any additional configuration.
How ServiceLB Works
The ServiceLB controller watches Kubernetes Services with the spec.type field set to LoadBalancer.
For each LoadBalancer Service, a DaemonSet is created in the kube-system namespace. This DaemonSet in turn creates Pods with a svc- prefix on each node. These Pods use iptables to forward traffic from the Pod’s NodePort to the Service’s ClusterIP address and port.
If the ServiceLB Pod runs on a node that has an external IP configured, the node’s external IP is populated into the Service’s status.loadBalancer.ingress address list. Otherwise, the node’s internal IP is used.
If multiple LoadBalancer Services are created, a separate DaemonSet is created for each Service.
It is possible to expose multiple Services on the same node, as long as they use different ports.
If you try to create a LoadBalancer Service that listens on port 80, the ServiceLB will try to find a free host in the cluster for port 80. If no host with that port is available, the LB will remain Pending.
Usage
Create a Service of type LoadBalancer in K3s.
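A minimal sketch of such a Service is shown below. The Service name, the app label, and the port numbers are hypothetical and assume a workload whose Pods carry the label app: web and listen on port 80.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-lb             # hypothetical name
  namespace: default
spec:
  type: LoadBalancer       # the type that ServiceLB watches for
  selector:
    app: web               # hypothetical Pod label
  ports:
    - name: http
      protocol: TCP
      port: 8080           # port requested from ServiceLB; must be free on the node
      targetPort: 80       # container port of the selected Pods
```

If no eligible node has the requested port free, the Service stays Pending, as described above.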
Controlling ServiceLB Node Selection
Adding the svccontroller.k3s.cattle.io/enablelb=true label to one or more nodes switches the ServiceLB controller into allow-list mode, where only nodes with the label are eligible to host LoadBalancer pods. Nodes that remain unlabeled will be excluded from use by ServiceLB.
note
By default, nodes are not labeled. As long as all nodes remain unlabeled, all nodes with ports available will be used by ServiceLB.
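One way to apply the label is through the node's K3s configuration file, sketched below under the assumption that the default path /etc/rancher/k3s/config.yaml is in use. Labels set this way are applied when the node registers, so a node that has already joined the cluster would typically be labeled with kubectl instead.

```yaml
# /etc/rancher/k3s/config.yaml on a node that should host ServiceLB Pods
# (assumed default path; applied at node registration)
node-label:
  - "svccontroller.k3s.cattle.io/enablelb=true"
```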
Creating ServiceLB Node Pools
To select a particular subset of nodes to host pods for a LoadBalancer, add the enablelb label to the desired nodes, and set matching lbpool label values on the Nodes and Services. For example:
- Label Node A and Node B with svccontroller.k3s.cattle.io/lbpool=pool1 and svccontroller.k3s.cattle.io/enablelb=true
- Label Node C and Node D with svccontroller.k3s.cattle.io/lbpool=pool2 and svccontroller.k3s.cattle.io/enablelb=true
- Create one LoadBalancer Service on port 443 with label svccontroller.k3s.cattle.io/lbpool=pool1. The DaemonSet for this Service will only deploy Pods to Node A and Node B.
- Create another LoadBalancer Service on port 443 with label svccontroller.k3s.cattle.io/lbpool=pool2. The DaemonSet will only deploy Pods to Node C and Node D.
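The Service side of the first pool could look like the sketch below. The Service name, selector, and targetPort are hypothetical; only the lbpool label and port 443 come from the example above.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-pool1                                  # hypothetical name
  labels:
    svccontroller.k3s.cattle.io/lbpool: pool1      # matches the label on Node A and Node B
spec:
  type: LoadBalancer
  selector:
    app: web                                       # hypothetical Pod label
  ports:
    - protocol: TCP
      port: 443
      targetPort: 8443                             # hypothetical container port
```

A second Service labeled with pool2 can reuse port 443 because its Pods land on a disjoint set of nodes.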
Disabling ServiceLB
To disable ServiceLB, configure all servers in the cluster with the --disable=servicelb flag.
This is necessary if you wish to run a different LB, such as MetalLB.
Deploying an External Cloud Controller Manager
In order to reduce binary size, K3s removes all “in-tree” (built-in) cloud providers. Instead, K3s provides an embedded Cloud Controller Manager (CCM) stub that does the following:
- Sets node InternalIP and ExternalIP address fields based on the --node-ip and --node-external-ip flags.
- Hosts the ServiceLB LoadBalancer controller.
- Clears the node.cloudprovider.kubernetes.io/uninitialized taint that is present when the cloud-provider is set to external.
Before deploying an external CCM, you must start all K3s servers with the --disable-cloud-controller flag to disable the embedded CCM.
note
If you disable the built-in CCM and do not deploy and properly configure an external substitute, nodes will remain tainted and unschedulable.
Nodes Without a Hostname
Some cloud providers, such as Linode, will create machines with “localhost” as the hostname, and others may not have a hostname set at all. This can cause problems with domain name resolution. To resolve this issue, run K3s with the --node-name flag or the K3S_NODE_NAME environment variable, which sets the node name explicitly.
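The node name can also be set in the K3s configuration file. A minimal sketch, assuming the default path /etc/rancher/k3s/config.yaml and a hypothetical node name:

```yaml
# /etc/rancher/k3s/config.yaml (assumed default path)
# Intended to be equivalent to --node-name or K3S_NODE_NAME.
node-name: k3s-node-01   # hypothetical; choose a name unique within the cluster
```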
Multicluster CIDR (Experimental)
Version Gate
Experimental as of v1.26.3+k3s1
Warning
The network policy controller might not work properly when this flag is enabled.
Kubernetes v1.26 introduced Multicluster CIDR as an alpha feature (https://github.com/kubernetes/enhancements/tree/master/keps/sig-network/2593-multiple-cluster-cidrs).
This feature can be enabled on K3s servers with the --multi-cluster-cidr flag. It makes it possible to define multiple cluster CIDRs from which the podCIDR of each node is allocated, and to add further CIDRs to an already running cluster. The clustercidr resources will be visible through the API and kubectl (the CIDR configured with --cluster-cidr is defined as the default one).
A new clustercidr can be defined as follows:
apiVersion: networking.k8s.io/v1alpha1
kind: ClusterCIDR
metadata:
  name: new-cidr
spec:
  nodeSelector:
    nodeSelectorTerms:
    - matchExpressions:
      - key: kubernetes.io/hostname
        operator: In
        values:
        - "worker2"
  perNodeHostBits: 8
  ipv4: 10.247.0.0/16
The nodes that match the nodeSelector will use a podCIDR from the newly defined resource.
note
A node that already has a CIDR cannot get a new one. It has to be removed and restarted.
Warning
A dual-stack CIDR can be defined with both ipv4 and ipv6 configuration, but the perNodeHostBits value will be the same for both. When a dual-stack configuration is defined with --cluster-cidr, the --node-cidr-mask-size-ipv6 flag on the kube-controller-manager should be set so that the IPv6 node ranges have the same size as the IPv4 ones.
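To make the shared perNodeHostBits concrete, here is a sketch of a dual-stack resource. The resource name, the node name worker3, and both address ranges are hypothetical. With 8 host bits, each matching node would receive a /24 from the IPv4 range (32 - 8) and a /120 from the IPv6 range (128 - 8).

```yaml
apiVersion: networking.k8s.io/v1alpha1
kind: ClusterCIDR
metadata:
  name: dualstack-cidr          # hypothetical name
spec:
  nodeSelector:
    nodeSelectorTerms:
      - matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
              - "worker3"       # hypothetical node
  perNodeHostBits: 8            # shared by both families: /24 (IPv4) and /120 (IPv6) per node
  ipv4: 10.248.0.0/16           # hypothetical range
  ipv6: fd00:248::/112          # hypothetical range
```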