Networking

This page explains how CoreDNS, the Traefik Ingress controller, and Klipper service load balancer work within K3s.

Refer to the Installation Network Options page for details on Flannel configuration options and backend selection, or how to set up your own CNI.

For information on which ports need to be opened for K3s, refer to the Networking Requirements.

CoreDNS

CoreDNS is deployed automatically on server startup. To disable it, configure all servers in the cluster with the --disable=coredns option.
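
The flag can be passed on the command line or set in the K3s configuration file. A minimal sketch, assuming the standard config file location:

  # CLI flag when launching the server
  k3s server --disable=coredns

  # or equivalently in /etc/rancher/k3s/config.yaml on every server
  disable:
    - coredns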

If you don’t install CoreDNS, you will need to install a cluster DNS provider yourself.

Traefik Ingress Controller

Traefik is a modern HTTP reverse proxy and load balancer made to deploy microservices with ease. It simplifies networking complexity while designing, deploying, and running applications.

The Traefik ingress controller deploys a LoadBalancer Service that uses ports 80 and 443. By default, ServiceLB will expose these ports on all cluster members, meaning these ports will not be usable for other HostPort or NodePort pods.

Traefik is deployed by default when starting the server. For more information see Managing Packaged Components. The default config file is found in /var/lib/rancher/k3s/server/manifests/traefik.yaml.

The traefik.yaml file should not be edited manually, as K3s will replace the file with defaults at startup. Instead, you should customize Traefik by creating an additional HelmChartConfig manifest in /var/lib/rancher/k3s/server/manifests. For more details and an example, see Customizing Packaged Components with HelmChartConfig. For more information on the possible configuration values, refer to the official Traefik Helm Configuration Parameters.
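
As a sketch, a HelmChartConfig along the following lines could be placed in /var/lib/rancher/k3s/server/manifests; the valuesContent keys shown here are illustrative only, so check the Traefik Helm chart shipped with your K3s release for the values it actually supports:

  apiVersion: helm.cattle.io/v1
  kind: HelmChartConfig
  metadata:
    name: traefik
    namespace: kube-system
  spec:
    valuesContent: |-
      # Illustrative chart values; consult the bundled Traefik chart for supported keys.
      additionalArguments:
        - "--log.level=DEBUG"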

To remove Traefik from your cluster, start all servers with the --disable=traefik flag.

K3s versions 1.20 and earlier include Traefik v1. K3s versions 1.21 and later install Traefik v2, unless an existing installation of Traefik v1 is found, in which case Traefik is not upgraded to v2. For more information on the specific version of Traefik included with K3s, consult the Release Notes for your version.

To migrate from an older Traefik v1 instance, please refer to the Traefik documentation and migration tool.

Network Policy Controller

K3s includes an embedded network policy controller. The underlying implementation is the netpol controller library from kube-router; no other kube-router functionality is included.

To disable it, start each server with the --disable-network-policy flag.

Note

Network policy iptables rules are not removed if the K3s configuration is changed to disable the network policy controller. To clean up the configured kube-router network policy rules after disabling the network policy controller, use the k3s-killall.sh script, or clean them using iptables-save and iptables-restore. These steps must be run manually on all nodes in the cluster.

  iptables-save | grep -v KUBE-ROUTER | iptables-restore
  ip6tables-save | grep -v KUBE-ROUTER | ip6tables-restore

Service Load Balancer

Any LoadBalancer controller can be deployed to your K3s cluster. By default, K3s provides a load balancer known as ServiceLB (formerly Klipper LoadBalancer) that uses available host ports.

Upstream Kubernetes allows Services of type LoadBalancer to be created, but does not include a default load balancer implementation, so these Services will remain Pending until one is installed. Hosted Kubernetes offerings typically rely on a cloud provider such as Amazon EC2 or Microsoft Azure to supply an external load balancer implementation. By contrast, the K3s ServiceLB makes it possible to use LoadBalancer Services without a cloud provider or any additional configuration.

How ServiceLB Works

The ServiceLB controller watches Kubernetes Services with the spec.type field set to LoadBalancer.

For each LoadBalancer Service, a DaemonSet is created in the kube-system namespace. This DaemonSet in turn creates Pods with a svc- prefix on each node. These Pods use iptables to forward traffic from the Pod's NodePort to the Service's ClusterIP address and port.
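
As a rough illustration of what this looks like on a running cluster (the exact Pod and DaemonSet names may differ between K3s releases):

  # List the per-Service DaemonSets and the Pods ServiceLB schedules on each node
  kubectl get daemonsets -n kube-system
  kubectl get pods -n kube-system -o wide | grep '^svc'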

If the ServiceLB Pod runs on a node that has an external IP configured, the node’s external IP is populated into the Service’s status.loadBalancer.ingress address list. Otherwise, the node’s internal IP is used.

If multiple LoadBalancer Services are created, a separate DaemonSet is created for each Service.

It is possible to expose multiple Services on the same node, as long as they use different ports.

If you try to create a LoadBalancer Service that listens on port 80, ServiceLB will try to find a node in the cluster with port 80 free. If no node with that port available exists, the LB will remain Pending.

Usage

Create a Service of type LoadBalancer in K3s.
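
A minimal sketch, assuming an existing workload selected by app: my-app and listening on port 8080 (both values are placeholders):

  apiVersion: v1
  kind: Service
  metadata:
    name: my-loadbalancer
  spec:
    type: LoadBalancer
    selector:
      app: my-app
    ports:
      - port: 80          # host port claimed by ServiceLB on each eligible node
        targetPort: 8080  # container port of the backing Pods
        protocol: TCP

After applying the manifest (for example with kubectl apply -f), the Service's status.loadBalancer.ingress list should be populated with node addresses once the ServiceLB Pods are running.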

Controlling ServiceLB Node Selection

Adding the svccontroller.k3s.cattle.io/enablelb=true label to one or more nodes switches the ServiceLB controller into allow-list mode, where only nodes with the label are eligible to host LoadBalancer pods. Nodes that remain unlabeled will be excluded from use by ServiceLB.
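
For example, to make a single node eligible (the node name is a placeholder):

  kubectl label node worker-1 svccontroller.k3s.cattle.io/enablelb=true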

Note

By default, nodes are not labeled. As long as all nodes remain unlabeled, all nodes with ports available will be used by ServiceLB.

Creating ServiceLB Node Pools

To select a particular subset of nodes to host pods for a LoadBalancer, add the enablelb label to the desired nodes, and set matching lbpool label values on the Nodes and Services. For example:

  1. Label Node A and Node B with svccontroller.k3s.cattle.io/lbpool=pool1 and svccontroller.k3s.cattle.io/enablelb=true
  2. Label Node C and Node D with svccontroller.k3s.cattle.io/lbpool=pool2 and svccontroller.k3s.cattle.io/enablelb=true
  3. Create one LoadBalancer Service on port 443 with the label svccontroller.k3s.cattle.io/lbpool=pool1. The DaemonSet for this Service will only deploy Pods to Node A and Node B.
  4. Create another LoadBalancer Service on port 443 with label svccontroller.k3s.cattle.io/lbpool=pool2. The DaemonSet will only deploy Pods to Node C and Node D.
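
A sketch of steps 1 and 3 above, using placeholder node and application names:

  # Step 1: assign Node A and Node B to pool1 and mark them eligible for ServiceLB
  kubectl label node node-a node-b \
    svccontroller.k3s.cattle.io/lbpool=pool1 \
    svccontroller.k3s.cattle.io/enablelb=true

The matching Service then carries the same lbpool label:

  apiVersion: v1
  kind: Service
  metadata:
    name: pool1-service
    labels:
      svccontroller.k3s.cattle.io/lbpool: pool1
  spec:
    type: LoadBalancer
    selector:
      app: my-app        # placeholder selector
    ports:
      - port: 443
        targetPort: 8443 # placeholder container port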

Disabling ServiceLB

To disable ServiceLB, configure all servers in the cluster with the --disable=servicelb flag.

This is necessary if you wish to run a different LB, such as MetalLB.

Deploying an External Cloud Controller Manager

In order to reduce binary size, K3s removes all “in-tree” (built-in) cloud providers. Instead, K3s provides an embedded Cloud Controller Manager (CCM) stub that does the following:

  • Sets node InternalIP and ExternalIP address fields based on the --node-ip and --node-external-ip flags.
  • Hosts the ServiceLB LoadBalancer controller.
  • Clears the node.cloudprovider.kubernetes.io/uninitialized taint that is present when the cloud-provider is set to external.

Before deploying an external CCM, you must start all K3s servers with the --disable-cloud-controller flag to disable the embedded CCM.
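
A minimal sketch of the corresponding entry in /etc/rancher/k3s/config.yaml on each server (the external CCM itself still has to be deployed separately):

  disable-cloud-controller: true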

Note

If you disable the built-in CCM and do not deploy and properly configure an external substitute, nodes will remain tainted and unschedulable.

Nodes Without a Hostname

Some cloud providers, such as Linode, will create machines with “localhost” as the hostname, and others may not set a hostname at all. This can cause problems with domain name resolution. To resolve this, run K3s with the --node-name flag or the K3S_NODE_NAME environment variable to pass a valid node name.
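
A sketch of both options, using a placeholder node name:

  # Set an explicit node name when installing via the install script
  curl -sfL https://get.k3s.io | K3S_NODE_NAME=k3s-node-1 sh -

  # or pass the flag when running the binary directly
  k3s server --node-name k3s-node-1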

Multicluster CIDR (Experimental)

Version Gate

Experimental as of v1.26.3+k3s1

Warning

The network policy controller may not work properly when this flag is enabled.

Kubernetes v1.26 introduced Multicluster CIDR as an alpha feature (https://github.com/kubernetes/enhancements/tree/master/keps/sig-network/2593-multiple-cluster-cidrs).

This feature can be enabled on K3s servers with the --multi-cluster-cidr flag. It makes it possible to define multiple cluster CIDRs from which each node's podCIDR is allocated, and to extend that pool on an already running cluster. The clustercidr resources are visible through the API and kubectl; the CIDR configured with --cluster-cidr is defined as the default one.
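
For example, once the feature is enabled the resources can be listed with kubectl (a sketch):

  # The default entry corresponds to the CIDR configured with --cluster-cidr
  kubectl get clustercidrs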

A new clustercidr can be defined as follows:

  apiVersion: networking.k8s.io/v1alpha1
  kind: ClusterCIDR
  metadata:
    name: new-cidr
  spec:
    nodeSelector:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - "worker2"
    perNodeHostBits: 8
    ipv4: 10.247.0.0/16

Nodes that match the nodeSelector will use a podCIDR from the newly defined resource.

Note

A node that already has a podCIDR assigned will not receive a new one; the node has to be removed and restarted.

Warning

A dual-stack ClusterCIDR can be defined with both ipv4 and ipv6 configuration, but the same perNodeHostBits value applies to both address families. When a dual-stack configuration is defined with --cluster-cidr, the --node-cidr-mask-size-ipv6 flag on the kube-controller-manager should be set so that the IPv6 per-node allocation has the same size as the IPv4 one.
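
A hypothetical dual-stack ClusterCIDR sketch; the node name and CIDR values are placeholders, and the single perNodeHostBits value applies to both families:

  apiVersion: networking.k8s.io/v1alpha1
  kind: ClusterCIDR
  metadata:
    name: new-dualstack-cidr
  spec:
    nodeSelector:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - "worker3"
    perNodeHostBits: 8
    ipv4: 10.248.0.0/16
    ipv6: 2001:db8:1::/64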