Service Account Issuer (SAI) migration
In the past, changing the Service Account Issuer was a disruptive process. However, since Kubernetes v1.22 you can specify multiple Service Account Issuers in the Kubernetes API Server (Docs here).
As noted in the Kubernetes Docs, when the --service-account-issuer flag is specified multiple times, the first is used to generate tokens and all are used to determine which issuers are accepted.
So with this feature we can migrate to a new Service Account Issuer without disruption to cluster operations.
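Because the order of the flags decides which issuer signs new tokens, it can help to list the issuers as they appear in the API server manifest. The following is a minimal sketch, not part of kOps; the function name `list_issuers` and the output labels are illustrative assumptions.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print every --service-account-issuer value found in a manifest, in file
# order. The first one is the issuer used to sign new tokens; all of them
# are accepted when validating tokens.
list_issuers() {
  local manifest="$1"
  grep -o -- '--service-account-issuer=[^[:space:]"]*' "$manifest" \
    | cut -d= -f2- \
    | awk 'NR == 1 { print "primary (signs new tokens): " $0; next }
           { print "also accepted: " $0 }'
}

# Run against a file only when a path is given on the command line.
if [ "$#" -gt 0 ]; then
  list_issuers "$1"
fi
```

On a control-plane node this could be run with `/etc/kubernetes/manifests/kube-apiserver.manifest` as the argument, assuming the script is saved locally.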
Note: There is official kOps support for this in the forthcoming feature - kubernetes/kops#16497.
Migrate using Instancegroup Hooks (prior kOps v1.28+)
Warning: This procedure is manual and involves some tricky modification of manifest files. We recommend testing this on a staging cluster before proceeding on a production cluster.
In this example we are switching from master.[cluster-name].[domain] to api.internal.[cluster-name].[domain].
Add the modify-kube-api-manifest hook (existing SAI as primary) to the control-plane instancegroups:

hooks:
- name: modify-kube-api-manifest
  before:
  - kubelet.service
  manifest: |
    User=root
    Type=oneshot
    ExecStart=/bin/bash -c "until [ -f /etc/kubernetes/manifests/kube-apiserver.manifest ];do sleep 5;done;sed -i '/- --service-account-issuer=https:\/\/api.internal.[cluster-name].[domain]/i\ \ \ \ - --service-account-issuer=https:\/\/master.[cluster-name].[domain]' /etc/kubernetes/manifests/kube-apiserver.manifest"

- Apply the changes to the cluster
- Roll the control-plane nodes
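
Before rolling nodes, the hook's sed expression can be previewed on a scratch copy of the manifest. The sketch below assumes GNU sed and uses hypothetical host names (`master.example.k8s.local`, `api.internal.example.k8s.local`) in place of the `[cluster-name].[domain]` placeholders; `preview_hook` is an illustrative name, not a kOps command.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical issuer hosts; replace with your own cluster name and domain.
OLD_HOST='master.example.k8s.local'
NEW_HOST='api.internal.example.k8s.local'

# Apply the hook's sed expression to a copy of the manifest and print the
# resulting issuer flags. The original file is left untouched.
preview_hook() {
  local manifest="$1" copy
  copy=$(mktemp)
  cp "$manifest" "$copy"
  # 'i' inserts the old issuer on a new line BEFORE the matching line, so
  # the old issuer becomes the first (signing) issuer. The '\ ' sequences
  # supply the leading indentation of the inserted line.
  sed -i "/- --service-account-issuer=https:\/\/${NEW_HOST}/i\ \ \ \ - --service-account-issuer=https:\/\/${OLD_HOST}" "$copy"
  grep -- '--service-account-issuer=' "$copy"
  rm -f "$copy"
}

# Run against a file only when a path is given on the command line.
if [ "$#" -gt 0 ]; then
  preview_hook "$1"
fi
```

Replacing `i` with `a` in the expression appends the old issuer after the matching line instead, which is the change made in the next step of the procedure.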
Update the modify-kube-api-manifest hook (switch the primary/secondary SAI) on the control-plane instancegroups:

hooks:
- name: modify-kube-api-manifest
  before:
  - kubelet.service
  manifest: |
    User=root
    Type=oneshot
    ExecStart=/bin/bash -c "until [ -f /etc/kubernetes/manifests/kube-apiserver.manifest ];do sleep 5;done;sed -i '/- --service-account-issuer=https:\/\/api.internal.[cluster-name].[domain]/a\ \ \ \ - --service-account-issuer=https:\/\/master.[cluster-name].[domain]' /etc/kubernetes/manifests/kube-apiserver.manifest"

- Apply the changes to the cluster
- Roll the control-plane nodes
- Roll all other nodes in the cluster
- Wait 24 hours until the dynamic SA tokens have refreshed
- Remove the modify-kube-api-manifest hook on the control-plane instancegroups
- Apply the changes to the cluster
- Roll the control-plane nodes
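
Before removing the hook, it may be worth confirming that refreshed tokens carry the new issuer. The sketch below decodes the `iss` claim from a service account token; `token_issuer` is an illustrative helper, and it assumes GNU `base64` and a compact JSON payload with no whitespace around the `iss` key.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the "iss" claim of a JWT passed as the first argument.
token_issuer() {
  local payload
  # The payload is the second dot-separated segment, base64url-encoded.
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the '=' padding that base64url encoding strips.
  while [ $(( ${#payload} % 4 )) -ne 0 ]; do
    payload="${payload}="
  done
  # Assumes compact JSON such as {"iss":"https://..."}.
  printf '%s' "$payload" | base64 -d | sed -n 's/.*"iss":"\([^"]*\)".*/\1/p'
}

# Decode a token only when one is given on the command line.
if [ "$#" -gt 0 ]; then
  token_issuer "$1"
fi
```

Inside a pod, the projected token is normally mounted at `/var/run/secrets/kubernetes.io/serviceaccount/token`, so its contents can be passed as the argument. A token that still reports the old issuer has not been refreshed yet.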
This procedure was originally posted in a GitHub issue here with inspiration from this comment.