- Advanced Options / Configuration
- Certificate Management
- Token Management
- Configuring an HTTP proxy
- Using Docker as the Container Runtime
- Using etcdctl
- Configuring containerd
- NVIDIA Container Runtime Support
- Running Agentless Servers (Experimental)
- Running Rootless Servers (Experimental)
- Node Labels and Taints
- Starting the Service with the Installation Script
- Additional OS Preparations
- Running K3s in Docker
- SELinux Support
- Enabling Lazy Pulling of eStargz (Experimental)
- Additional Logging Sources
- Additional Network Policy Logging
Advanced Options / Configuration
This section contains advanced information describing the different ways you can run and manage K3s, as well as steps necessary to prepare the host OS for K3s use.
Certificate Management
Certificate Authority Certificates
K3s generates self-signed Certificate Authority (CA) Certificates during startup of the first server node. These CA certificates are valid for 10 years, and are not automatically renewed.
For information on using custom CA certificates, or renewing the self-signed CA certificates, see the k3s certificate rotate-ca command documentation.
Client and Server certificates
K3s client and server certificates are valid for 365 days from their date of issuance. Any certificates that are expired, or within 90 days of expiring, are automatically renewed every time K3s starts.
For information on manually rotating client and server certificates, see the k3s certificate rotate command documentation.
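For example, a minimal sketch of forcing a manual rotation on a server node (the service name is k3s-agent on agents, and the certificate subcommand is only present in releases that include it):
# Stop K3s, rotate the client/server certificates, then start the service again
sudo systemctl stop k3s
sudo k3s certificate rotate
sudo systemctl start k3s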
Token Management
By default, K3s uses a single static token for both servers and agents. This token cannot be changed once the cluster has been created. It is possible to enable a second static token that can only be used to join agents, or to create temporary kubeadm-style join tokens that expire automatically. For more information, see the k3s token command documentation.
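As a sketch, creating a temporary agent join token that expires after 24 hours (run on a server node):
sudo k3s token create --ttl 24h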
Configuring an HTTP proxy
If you are running K3s in an environment that only has external connectivity through an HTTP proxy, you can configure your proxy settings on the K3s systemd service. These proxy settings will then be used in K3s and passed down to the embedded containerd and kubelet.
The K3s installation script will automatically take the HTTP_PROXY, HTTPS_PROXY, and NO_PROXY, as well as the CONTAINERD_HTTP_PROXY, CONTAINERD_HTTPS_PROXY, and CONTAINERD_NO_PROXY variables from the current shell, if they are present, and write them to the environment file of your systemd service, usually:
- /etc/systemd/system/k3s.service.env
- /etc/systemd/system/k3s-agent.service.env
Of course, you can also configure the proxy by editing these files.
K3s will automatically add the cluster internal Pod and Service IP ranges and cluster DNS domain to the list of NO_PROXY entries. You should ensure that the IP address ranges used by the Kubernetes nodes themselves (i.e. the public and private IPs of the nodes) are included in the NO_PROXY list, or that the nodes can be reached through the proxy.
HTTP_PROXY=http://your-proxy.example.com:8888
HTTPS_PROXY=http://your-proxy.example.com:8888
NO_PROXY=127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16
If you want to configure the proxy settings for containerd without affecting K3s and the Kubelet, you can prefix the variables with CONTAINERD_:
CONTAINERD_HTTP_PROXY=http://your-proxy.example.com:8888
CONTAINERD_HTTPS_PROXY=http://your-proxy.example.com:8888
CONTAINERD_NO_PROXY=127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16
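After editing the environment file, restart the service so the new proxy settings take effect, for example:
# Use k3s-agent instead of k3s on agent nodes
sudo systemctl restart k3s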
Using Docker as the Container Runtime
K3s includes and defaults to containerd, an industry-standard container runtime. As of Kubernetes 1.24, the Kubelet no longer includes dockershim, the component that allows the kubelet to communicate with dockerd. K3s 1.24 and higher include cri-dockerd, which allows seamless upgrade from prior releases of K3s while continuing to use the Docker container runtime.
To use Docker instead of containerd:
Install Docker on the K3s node. One of Rancher’s Docker installation scripts can be used to install Docker:
curl https://releases.rancher.com/install-docker/20.10.sh | sh
Install K3s using the --docker option:
curl -sfL https://get.k3s.io | sh -s - --docker
Confirm that the cluster is available:
$ sudo k3s kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system local-path-provisioner-6d59f47c7-lncxn 1/1 Running 0 51s
kube-system metrics-server-7566d596c8-9tnck 1/1 Running 0 51s
kube-system helm-install-traefik-mbkn9 0/1 Completed 1 51s
kube-system coredns-8655855d6-rtbnb 1/1 Running 0 51s
kube-system svclb-traefik-jbmvl 2/2 Running 0 43s
kube-system traefik-758cd5fc85-2wz97 1/1 Running 0 43s
Confirm that the Docker containers are running:
$ sudo docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
3e4d34729602 897ce3c5fc8f "entry" About a minute ago Up About a minute k8s_lb-port-443_svclb-traefik-jbmvl_kube-system_d46f10c6-073f-4c7e-8d7a-8e7ac18f9cb0_0
bffdc9d7a65f rancher/klipper-lb "entry" About a minute ago Up About a minute k8s_lb-port-80_svclb-traefik-jbmvl_kube-system_d46f10c6-073f-4c7e-8d7a-8e7ac18f9cb0_0
436b85c5e38d rancher/library-traefik "/traefik --configfi…" About a minute ago Up About a minute k8s_traefik_traefik-758cd5fc85-2wz97_kube-system_07abe831-ffd6-4206-bfa1-7c9ca4fb39e7_0
de8fded06188 rancher/pause:3.1 "/pause" About a minute ago Up About a minute k8s_POD_svclb-traefik-jbmvl_kube-system_d46f10c6-073f-4c7e-8d7a-8e7ac18f9cb0_0
7c6a30aeeb2f rancher/pause:3.1 "/pause" About a minute ago Up About a minute k8s_POD_traefik-758cd5fc85-2wz97_kube-system_07abe831-ffd6-4206-bfa1-7c9ca4fb39e7_0
ae6c58cab4a7 9d12f9848b99 "local-path-provisio…" About a minute ago Up About a minute k8s_local-path-provisioner_local-path-provisioner-6d59f47c7-lncxn_kube-system_2dbd22bf-6ad9-4bea-a73d-620c90a6c1c1_0
be1450e1a11e 9dd718864ce6 "/metrics-server" About a minute ago Up About a minute k8s_metrics-server_metrics-server-7566d596c8-9tnck_kube-system_031e74b5-e9ef-47ef-a88d-fbf3f726cbc6_0
4454d14e4d3f c4d3d16fe508 "/coredns -conf /etc…" About a minute ago Up About a minute k8s_coredns_coredns-8655855d6-rtbnb_kube-system_d05725df-4fb1-410a-8e82-2b1c8278a6a1_0
c3675b87f96c rancher/pause:3.1 "/pause" About a minute ago Up About a minute k8s_POD_coredns-8655855d6-rtbnb_kube-system_d05725df-4fb1-410a-8e82-2b1c8278a6a1_0
4b1fddbe6ca6 rancher/pause:3.1 "/pause" About a minute ago Up About a minute k8s_POD_local-path-provisioner-6d59f47c7-lncxn_kube-system_2dbd22bf-6ad9-4bea-a73d-620c90a6c1c1_0
64d3517d4a95 rancher/pause:3.1 "/pause"
Using etcdctl
etcdctl provides a CLI for interacting with etcd servers. K3s does not bundle etcdctl.
If you would like to use etcdctl to interact with K3s’s embedded etcd, install etcdctl using the official documentation.
ETCD_VERSION="v3.5.5"
ETCD_URL="https://github.com/etcd-io/etcd/releases/download/${ETCD_VERSION}/etcd-${ETCD_VERSION}-linux-amd64.tar.gz"
curl -sL ${ETCD_URL} | sudo tar -zxv --strip-components=1 -C /usr/local/bin
You may then use etcdctl by configuring it to use the K3s-managed certificates and keys for authentication:
sudo etcdctl version \
--cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
--cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt \
--key=/var/lib/rancher/k3s/server/tls/etcd/client.key
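The same certificate flags can be reused for other etcdctl operations; for example, a sketch of checking the health of the embedded etcd members (paths assume the default data directory):
sudo etcdctl endpoint health --cluster -w table \
--cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
--cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt \
--key=/var/lib/rancher/k3s/server/tls/etcd/client.key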
Configuring containerd
K3s will generate config.toml for containerd in /var/lib/rancher/k3s/agent/etc/containerd/config.toml.
For advanced customization of this file, you can create another file called config.toml.tmpl in the same directory, and it will be used instead.
The config.toml.tmpl will be treated as a Go template file, and the config.Node structure is passed to the template. See this folder for Linux and Windows examples of how to use the structure to customize the configuration file. The config.Node golang struct is defined here.
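If you simply want to tweak the generated configuration, one approach (a sketch, not the only option) is to copy the rendered file to the template name, edit it, and restart K3s so containerd picks up the change:
# Copy the generated config as a starting point for a custom template (server node shown)
sudo cp /var/lib/rancher/k3s/agent/etc/containerd/config.toml /var/lib/rancher/k3s/agent/etc/containerd/config.toml.tmpl
# Edit config.toml.tmpl as needed, then restart K3s to regenerate containerd's configuration
sudo systemctl restart k3s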
NVIDIA Container Runtime Support
K3s will automatically detect and configure the NVIDIA container runtime if it is present when K3s starts.
- Install the nvidia-container package repository on the node by following the instructions at: https://nvidia.github.io/libnvidia-container/
- Install the nvidia container runtime packages. For example:
apt install -y nvidia-container-runtime cuda-drivers-fabricmanager-515 nvidia-headless-515-server
- Install K3s, or restart it if already installed:
curl -ksL get.k3s.io | sh -
- Confirm that the nvidia container runtime has been found by k3s:
grep nvidia /var/lib/rancher/k3s/agent/etc/containerd/config.toml
This will automatically add nvidia and/or nvidia-experimental runtimes to the containerd configuration, depending on what runtime executables are found. You must still add a RuntimeClass definition to your cluster, and deploy Pods that explicitly request the appropriate runtime by setting runtimeClassName: nvidia in the Pod spec:
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: nvidia
handler: nvidia
---
apiVersion: v1
kind: Pod
metadata:
  name: nbody-gpu-benchmark
  namespace: default
spec:
  restartPolicy: OnFailure
  runtimeClassName: nvidia
  containers:
  - name: cuda-container
    image: nvcr.io/nvidia/k8s/cuda-sample:nbody
    args: ["nbody", "-gpu", "-benchmark"]
    resources:
      limits:
        nvidia.com/gpu: 1
    env:
    - name: NVIDIA_VISIBLE_DEVICES
      value: all
    - name: NVIDIA_DRIVER_CAPABILITIES
      value: all
Note that the NVIDIA Container Runtime is also frequently used with the NVIDIA Device Plugin and GPU Feature Discovery, which must be installed separately, with modifications to ensure that pod specs include runtimeClassName: nvidia, as mentioned above.
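As a usage sketch, assuming the manifest above is saved as nbody-gpu-benchmark.yaml:
kubectl apply -f nbody-gpu-benchmark.yaml
# Once the Pod completes, the benchmark output should list the detected GPU
kubectl logs nbody-gpu-benchmark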
Running Agentless Servers (Experimental)
Warning: This feature is experimental.
When started with the --disable-agent flag, servers do not run the kubelet, container runtime, or CNI. They do not register a Node resource in the cluster, and will not appear in kubectl get nodes output. Because they do not host a kubelet, they cannot run pods or be managed by operators that rely on enumerating cluster nodes, including the embedded etcd controller and the system upgrade controller.
Running agentless servers may be advantageous if you want to obscure your control-plane nodes from discovery by agents and workloads, at the cost of increased administrative overhead caused by lack of cluster operator support.
By default, the apiserver on agentless servers will not be able to make outgoing connections to admission webhooks or aggregated apiservices running within the cluster. To remedy this, set the --egress-selector-mode server flag to either pod or cluster. If you are changing this flag on an existing cluster, you'll need to restart all nodes in the cluster for the option to take effect.
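For example, a sketch of installing an agentless server with the pod egress-selector mode:
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="server --disable-agent --egress-selector-mode=pod" sh -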
Running Rootless Servers (Experimental)
Warning: This feature is experimental.
Rootless mode allows running K3s servers as an unprivileged user, so as to protect the real root on the host from potential container-breakout attacks.
See https://rootlesscontaine.rs/ to learn more about Rootless Kubernetes.
Known Issues with Rootless mode
Ports
When running rootless, a new network namespace is created. This means that the K3s instance is running with networking fairly detached from the host. The only way to access Services running in K3s from the host is to set up port forwards to the K3s network namespace. Rootless K3s includes a controller that will automatically bind port 6443 and service ports below 1024 to the host with an offset of 10000.
For example, a Service on port 80 will become 10080 on the host, but 8080 will become 8080 without any offset. Currently, only LoadBalancer Services are automatically bound.
Cgroups
Cgroup v1 and Hybrid v1/v2 are not supported; only pure Cgroup v2 is supported. If K3s fails to start due to missing cgroups when running rootless, it is likely that your node is in Hybrid mode, and the “missing” cgroups are still bound to a v1 controller.
Multi-node/multi-process cluster
Multi-node rootless clusters, or multiple rootless k3s processes on the same node, are not currently supported. See #6488 for more details.
Starting Rootless Servers
- Enable cgroup v2 delegation, see https://rootlesscontaine.rs/getting-started/common/cgroup2/ . This step is required; the rootless kubelet will fail to start without the proper cgroups delegated.
- Download k3s-rootless.service from https://github.com/k3s-io/k3s/blob//k3s-rootless.service . Make sure to use the same version of k3s-rootless.service and k3s.
- Install k3s-rootless.service to ~/.config/systemd/user/k3s-rootless.service. Installing this file as a system-wide service (/etc/systemd/...) is not supported. Depending on the path of the k3s binary, you might need to modify the ExecStart=/usr/local/bin/k3s ... line of the file.
- Run systemctl --user daemon-reload
- Run systemctl --user enable --now k3s-rootless
- Run KUBECONFIG=~/.kube/k3s.yaml kubectl get pods -A, and make sure the pods are running.
Note: Don't try to run k3s server --rootless on a terminal, as terminal sessions do not allow cgroup v2 delegation. If you really need to try it on a terminal, use systemd-run --user -p Delegate=yes --tty k3s server --rootless to wrap it in a systemd scope.
Advanced Rootless Configuration
Rootless K3s uses rootlesskit and slirp4netns to communicate between host and user network namespaces. Some of the configuration used by rootlesskit and slirp4netns can be set by environment variables. The best way to set these is to add them to the Environment field of the k3s-rootless systemd unit.
| Variable | Default | Description |
|---|---|---|
| K3S_ROOTLESS_MTU | 1500 | Sets the MTU for the slirp4netns virtual interfaces. |
| K3S_ROOTLESS_CIDR | 10.41.0.0/16 | Sets the CIDR used by slirp4netns virtual interfaces. |
| K3S_ROOTLESS_ENABLE_IPV6 | autodetected | Enables slirp4netns IPv6 support. If not specified, it is automatically enabled if K3s is configured for dual-stack operation. |
| K3S_ROOTLESS_PORT_DRIVER | builtin | Selects the rootless port driver; either builtin or slirp4netns. Builtin is faster, but masquerades the original source address of inbound packets. |
| K3S_ROOTLESS_DISABLE_HOST_LOOPBACK | true | Controls whether or not access to the host's loopback address via the gateway interface is enabled. It is recommended that this not be changed, for security reasons. |
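For example, a sketch of setting one of these variables with a systemd user-unit drop-in (the MTU value is illustrative):
mkdir -p ~/.config/systemd/user/k3s-rootless.service.d
cat <<'EOF' > ~/.config/systemd/user/k3s-rootless.service.d/override.conf
[Service]
Environment="K3S_ROOTLESS_MTU=1400"
EOF
systemctl --user daemon-reload
systemctl --user restart k3s-rootless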
Troubleshooting Rootless
- Run systemctl --user status k3s-rootless to check the daemon status
- Run journalctl --user -f -u k3s-rootless to see the daemon log
- See also https://rootlesscontaine.rs/
Node Labels and Taints
K3s agents can be configured with the options --node-label and --node-taint, which add a label and taint to the kubelet. The two options only add labels and/or taints at registration time, so they can only be set when the node is first joined to the cluster.
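For example, a sketch of joining an agent with a label and a taint (the server URL and token are placeholders):
curl -sfL https://get.k3s.io | K3S_URL=https://<server>:6443 K3S_TOKEN=<token> \
sh -s - agent --node-label foo=bar --node-taint key1=value1:NoExecute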
All current versions of Kubernetes restrict nodes from registering with most labels with kubernetes.io and k8s.io prefixes, specifically including the kubernetes.io/role label. If you attempt to start a node with a disallowed label, K3s will fail to start. As stated by the Kubernetes authors:
Nodes are not permitted to assert their own role labels. Node roles are typically used to identify privileged or control plane types of nodes, and allowing nodes to label themselves into that pool allows a compromised node to trivially attract workloads (like control plane daemonsets) that confer access to higher privilege credentials.
See SIG-Auth KEP 279 for more information.
If you want to change node labels and taints after node registration, or add reserved labels, you should use kubectl. Refer to the official Kubernetes documentation for details on how to add taints and node labels.
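For example, a sketch of adjusting labels and taints after registration (the node name is a placeholder):
kubectl label nodes <node-name> foo=baz --overwrite
kubectl taint nodes <node-name> key1=value1:NoSchedule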
Starting the Service with the Installation Script
The installation script will auto-detect if your OS is using systemd or openrc and enable and start the service as part of the installation process.
- When running with openrc, logs will be created at /var/log/k3s.log.
- When running with systemd, logs will be created in /var/log/syslog and viewed using journalctl -u k3s (or journalctl -u k3s-agent on agents).
An example of disabling auto-starting and service enablement with the install script:
curl -sfL https://get.k3s.io | INSTALL_K3S_SKIP_START=true INSTALL_K3S_SKIP_ENABLE=true sh -
Additional OS Preparations
Old iptables versions
Several popular Linux distributions ship a version of iptables containing a bug that causes duplicate rules to accumulate, which negatively affects the performance and stability of the node. See Issue #3117 for information on how to determine if you are affected by this problem.
K3s includes a working version of iptables (v1.8.8) which functions properly. You can tell K3s to use its bundled version of iptables by starting K3s with the --prefer-bundled-bin option, or by uninstalling the iptables/nftables packages from your operating system.
Version Gate
The --prefer-bundled-bin flag is available starting with the 2022-12 releases (v1.26.0+k3s1, v1.25.5+k3s1, v1.24.9+k3s1, v1.23.15+k3s1).
Red Hat Enterprise Linux / CentOS / Fedora
It is recommended to turn off firewalld:
systemctl disable firewalld --now
If you wish to keep firewalld enabled, the following rules are required by default:
firewall-cmd --permanent --add-port=6443/tcp #apiserver
firewall-cmd --permanent --zone=trusted --add-source=10.42.0.0/16 #pods
firewall-cmd --permanent --zone=trusted --add-source=10.43.0.0/16 #services
firewall-cmd --reload
Additional ports may need to be opened depending on your setup. See Inbound Rules for more information. If you change the default CIDR for pods or services, you will need to update the firewall rules accordingly.
If nm-cloud-setup is installed and enabled, it must be disabled and the node rebooted:
systemctl disable nm-cloud-setup.service nm-cloud-setup.timer
reboot
Ubuntu / Debian
It is recommended to turn off ufw (uncomplicated firewall):
ufw disable
If you wish to keep ufw enabled, the following rules are required by default:
ufw allow 6443/tcp #apiserver
ufw allow from 10.42.0.0/16 to any #pods
ufw allow from 10.43.0.0/16 to any #services
Additional ports may need to be opened depending on your setup. See Inbound Rules for more information. If you change the default CIDR for pods or services, you will need to update the firewall rules accordingly.
Raspberry Pi
Raspberry Pi OS is Debian based, and may suffer from an old iptables version. See workarounds.
Standard Raspberry Pi OS installations do not start with cgroups enabled. K3s needs cgroups to start the systemd service. cgroups can be enabled by appending cgroup_memory=1 cgroup_enable=memory to /boot/cmdline.txt.
Example cmdline.txt:
console=serial0,115200 console=tty1 root=PARTUUID=58b06195-02 rootfstype=ext4 elevator=deadline fsck.repair=yes rootwait cgroup_memory=1 cgroup_enable=memory
Starting with Ubuntu 21.10, vxlan support on Raspberry Pi has been moved into a separate kernel module.
sudo apt install linux-modules-extra-raspi
Running K3s in Docker
There are several ways to run K3s in Docker:
- K3d
- Docker
k3d is a utility designed to easily run multi-node K3s clusters in Docker. It makes it very easy to create single- and multi-node K3s clusters, e.g. for local development on Kubernetes.
See the Installation documentation for more information on how to install and use k3d.
To use Docker, rancher/k3s images are also available to run the K3s server and agent. Using the docker run command:
sudo docker run \
--privileged \
--name k3s-server-1 \
--hostname k3s-server-1 \
-p 6443:6443 \
-d rancher/k3s:v1.24.10-k3s1 \
server
Note: You must specify a valid K3s version as the tag; the latest tag is not maintained. Docker images do not allow a + sign in tags; use a - in the tag instead.
Once K3s is up and running, you can copy the admin kubeconfig out of the Docker container for use:
sudo docker cp k3s-server-1:/etc/rancher/k3s/k3s.yaml ~/.kube/config
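To add an agent node, a sketch that reads the server's node token and joins a second container (the container names and --link networking are assumptions for this example):
K3S_TOKEN=$(sudo docker exec k3s-server-1 cat /var/lib/rancher/k3s/server/node-token)
sudo docker run \
--privileged \
--name k3s-agent-1 \
--hostname k3s-agent-1 \
--link k3s-server-1 \
-e K3S_URL=https://k3s-server-1:6443 \
-e K3S_TOKEN="$K3S_TOKEN" \
-d rancher/k3s:v1.24.10-k3s1 \
agent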
SELinux Support
Version Gate
Available as of v1.19.4+k3s1
If you are installing K3s on a system where SELinux is enabled by default (such as CentOS), you must ensure the proper SELinux policies have been installed.
Automatic Installation
The install script will automatically install the SELinux RPM from the Rancher RPM repository on compatible systems, unless performing an air-gapped install. Automatic installation can be skipped by setting INSTALL_K3S_SKIP_SELINUX_RPM=true.
Manual Installation
The necessary policies can be installed manually with the following commands:
yum install -y container-selinux selinux-policy-base
yum install -y https://rpm.rancher.io/k3s/latest/common/centos/7/noarch/k3s-selinux-0.2-1.el7_8.noarch.rpm
To force the install script to log a warning rather than fail, you can set the following environment variable: INSTALL_K3S_SELINUX_WARN=true.
Enabling SELinux Enforcement
To leverage SELinux, specify the --selinux flag when starting K3s servers and agents.
This option can also be specified in the K3s configuration file:
selinux: true
Using a custom --data-dir under SELinux is not supported. To customize it, you would most likely need to write your own custom policy. For guidance, you could refer to the containers/container-selinux repository, which contains the SELinux policy files for Container Runtimes, and the k3s-io/k3s-selinux repository, which contains the SELinux policy for K3s.
Enabling Lazy Pulling of eStargz (Experimental)
What’s lazy pulling and eStargz?
Pulling images is known to be one of the time-consuming steps in the container lifecycle. According to Harter, et al.:
pulling packages accounts for 76% of container start time, but only 6.4% of that data is read
To address this issue, k3s experimentally supports lazy pulling of image contents. This allows k3s to start a container before the entire image has been pulled. Instead, the necessary chunks of contents (e.g. individual files) are fetched on-demand. Especially for large images, this technique can shorten the container startup latency.
To enable lazy pulling, the target image needs to be formatted as eStargz, an OCI-alternative but 100% OCI-compatible image format for lazy pulling. Because of this compatibility, eStargz images can be pushed to standard container registries (e.g. ghcr.io) and remain runnable even on eStargz-agnostic runtimes.
eStargz is developed based on the stargz format proposed by Google CRFS project but comes with practical features including content verification and performance optimization. For more details about lazy pulling and eStargz, please refer to Stargz Snapshotter project repository.
Configure k3s for lazy pulling of eStargz
As shown in the following, the --snapshotter=stargz option is needed for the k3s server and agent:
k3s server --snapshotter=stargz
With this configuration, you can perform lazy pulling for eStargz-formatted images. The following example Pod manifest uses the eStargz-formatted node:13.13.0 image (ghcr.io/stargz-containers/node:13.13.0-esgz). When the stargz snapshotter is enabled, K3s performs lazy pulling for this image.
apiVersion: v1
kind: Pod
metadata:
  name: nodejs
spec:
  containers:
  - name: nodejs-estargz
    image: ghcr.io/stargz-containers/node:13.13.0-esgz
    command: ["node"]
    args:
    - -e
    - var http = require('http');
      http.createServer(function(req, res) {
        res.writeHead(200);
        res.end('Hello World!\n');
      }).listen(80);
    ports:
    - containerPort: 80
Additional Logging Sources
Rancher logging for K3s can be installed without using Rancher. The following instructions should be executed to do so:
helm repo add rancher-charts https://charts.rancher.io
helm repo update
helm install --create-namespace -n cattle-logging-system rancher-logging-crd rancher-charts/rancher-logging-crd
helm install --create-namespace -n cattle-logging-system rancher-logging --set additionalLoggingSources.k3s.enabled=true rancher-charts/rancher-logging
Additional Network Policy Logging
Packets dropped by network policies can be logged. The packet is sent to the iptables NFLOG action, which shows the packet details, including the network policy that blocked it.
If there is a lot of traffic, the number of log messages could be very high. To control the log rate on a per-policy basis, set the limit and limit-burst iptables parameters by adding the following annotations to the network policy in question:
kube-router.io/netpol-nflog-limit=<LIMIT-VALUE>
kube-router.io/netpol-nflog-limit-burst=<LIMIT-BURST-VALUE>
Default values are limit=10/minute and limit-burst=10. Check the iptables manual for more information on the format and possible values for these fields.
To convert NFLOG packets to log entries, install ulogd2 and configure [log1] to read on group=100. Then, restart the ulogd2 service for the new config to be committed. When a packet is blocked by network policy rules, a log message will appear in /var/log/ulog/syslogemu.log.
Packets sent to the NFLOG netlink socket can also be read by using command-line tools like tcpdump or tshark:
tcpdump -ni nflog:100
While more readily available, tcpdump will not show the name of the network policy that blocked the packet. Use wireshark's tshark command instead to display the full NFLOG packet header, including the nflog.prefix field that contains the policy name.
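For example, a sketch of printing just the policy name carried in the NFLOG prefix with tshark:
tshark -n -i nflog:100 -T fields -e nflog.prefix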