InPlace Update
In-place Update is one of the key features provided by OpenKruise.
Workloads that support in-place update:
Currently CloneSet, Advanced StatefulSet, and Advanced DaemonSet reuse the same code package ./pkg/util/inplaceupdate and behave similarly during in-place update. This article introduces their usage and workflow.
Note that the in-place update workflow of SidecarSet is slightly different from that of the other workloads; for example, it does not set the Pod to not-ready before the update. So what is described below does not fully apply to SidecarSet.
What is in-place update?
When we want to update the image in an existing Pod, compare the ReCreate and InPlace update approaches:
With ReCreate, we have to delete the old Pod and create a new one:
- The Pod name and uid both change, because they are completely different Pod objects (as in a Deployment update)
- Or the Pod name stays the same but the uid changes, because they are still different Pod objects although they reuse the same name (as in a StatefulSet update)
- The node name of the Pod changes, because the new Pod is very unlikely to be scheduled to the previous node.
- The Pod IP changes, because the new Pod is very unlikely to be allocated the previous IP.
With InPlace update, we reuse the same Pod object and only modify fields in it, so that:
- We avoid the additional cost of scheduling, allocating an IP, and allocating and mounting volumes
- Image pulling is faster, because most of the image layers pulled for the old image can be reused and only a few new layers need to be pulled
- While one container is being updated in place, the other containers in the Pod are not affected and keep running.
Understand InPlaceIfPossible
The update type in Kruise workloads is named InPlaceIfPossible, which tells Kruise to update Pods in place whenever possible and to fall back to ReCreate update otherwise.
Which changes can be applied by in-place update?
- Updating spec.template.metadata.* in workloads, such as labels and annotations: Kruise only updates the metadata of the existing Pods without recreating them.
- Updating spec.template.spec.containers[x].image in workloads: Kruise updates the container image in the Pods in place without recreating them.
- Since Kruise v1.0 (including v1.0 alpha/beta), updating spec.template.metadata.labels/annotations when a container env is sourced from the changed labels/annotations: Kruise updates the affected containers in place to renew the env value.
Changes to any other fields, such as spec.template.spec.containers[x].env or spec.template.spec.containers[x].resources, fall back to ReCreate update.
Take the CloneSet YAML below as an example:
- Modifying the app-image:v1 image triggers in-place update.
- Modifying the value of app-config in annotations triggers in-place update (see the Requirements below).
- Modifying the two fields above together triggers in-place update of both image and environment.
- Directly modifying the value of APP_NAME in env, or adding a new env, triggers ReCreate update.
apiVersion: apps.kruise.io/v1alpha1
kind: CloneSet
metadata:
  ...
spec:
  replicas: 1
  template:
    metadata:
      annotations:
        app-config: "... the real env value ..."
    spec:
      containers:
      - name: app
        image: app-image:v1
        env:
        - name: APP_CONFIG
          valueFrom:
            fieldRef:
              fieldPath: metadata.annotations['app-config']
        - name: APP_NAME
          value: xxx
  updateStrategy:
    type: InPlaceIfPossible
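The rules above can be illustrated with a small sketch. It is not Kruise's implementation; `classify_update` and `_without_in_place_fields` are hypothetical names, and the Pod template is modelled as a plain dictionary. The sketch treats a change as in-place updatable only if nothing differs besides template metadata and container images, and it does not model the feature-gate requirement for env from metadata.

```python
import copy


def _without_in_place_fields(template: dict) -> dict:
    """Return the template spec with the in-place updatable fields removed.

    Template metadata is dropped entirely and each container loses its image,
    so anything that still differs afterwards needs a ReCreate update.
    """
    spec = dict(template.get("spec", {}))
    spec["containers"] = [
        {key: value for key, value in container.items() if key != "image"}
        for container in spec.get("containers", [])
    ]
    return spec


def classify_update(old_template: dict, new_template: dict) -> str:
    """Hypothetical helper: classify a template change per the rules above."""
    if old_template == new_template:
        return "NoChange"
    if _without_in_place_fields(old_template) == _without_in_place_fields(new_template):
        return "InPlace"
    return "ReCreate"


# The Pod template of the example CloneSet above, as a dictionary.
v1 = {
    "metadata": {"annotations": {"app-config": "config v1"}},
    "spec": {
        "containers": [
            {
                "name": "app",
                "image": "app-image:v1",
                "env": [
                    {
                        "name": "APP_CONFIG",
                        "valueFrom": {
                            "fieldRef": {"fieldPath": "metadata.annotations['app-config']"}
                        },
                    },
                    {"name": "APP_NAME", "value": "xxx"},
                ],
            }
        ]
    },
}

new_image = copy.deepcopy(v1)
new_image["spec"]["containers"][0]["image"] = "app-image:v2"

new_annotation = copy.deepcopy(v1)
new_annotation["metadata"]["annotations"]["app-config"] = "config v2"

new_env = copy.deepcopy(v1)
new_env["spec"]["containers"][0]["env"][1]["value"] = "yyy"

print(classify_update(v1, new_image))       # InPlace
print(classify_update(v1, new_annotation))  # InPlace
print(classify_update(v1, new_env))         # ReCreate
```

Comparing the templates after stripping the updatable fields keeps the rule in one place: any field not explicitly listed as in-place updatable falls through to ReCreate.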
Workflow overview
You can see the whole workflow of in-place update below (you may need to right click and open it in a new tab):
InPlace update with launch priorities
FEATURE STATE: Kruise v1.1.0
When you in-place update multiple containers at once and the containers have different launch priorities, Kruise updates the containers in order of priority.
- For Pods without container launch priorities, the execution order is not guaranteed when multiple containers are updated in place.
- For Pods with container launch priorities:
  - Containers with different priorities are updated in priority order.
  - The execution order among containers with the same priority is not guaranteed.
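The ordering rules above can be sketched as a grouping of containers into update batches. This is only an illustration, not Kruise's code: `launch_priority` and `update_batches` are hypothetical helpers, and the sketch assumes that a container without a KRUISE_CONTAINER_PRIORITY env has priority 0 and that higher priorities are updated first.

```python
from itertools import groupby

PRIORITY_ENV = "KRUISE_CONTAINER_PRIORITY"


def launch_priority(container: dict) -> int:
    """Read the launch priority from the container env.

    Assumption: a container without the env is treated as priority 0.
    """
    for env in container.get("env", []):
        if env.get("name") == PRIORITY_ENV:
            return int(env["value"])
    return 0


def update_batches(containers: list) -> list:
    """Group container names into batches, highest priority first.

    Containers sharing a priority land in the same batch; the order inside a
    batch carries no meaning (names are sorted only for stable output).
    """
    ordered = sorted(containers, key=launch_priority, reverse=True)
    return [
        sorted(container["name"] for container in group)
        for _, group in groupby(ordered, key=launch_priority)
    ]


# Hypothetical containers for illustration.
containers = [
    {"name": "main"},
    {"name": "sidecar", "env": [{"name": PRIORITY_ENV, "value": "10"}]},
    {"name": "logger", "env": [{"name": PRIORITY_ENV, "value": "10"}]},
]

print(update_batches(containers))  # [['logger', 'sidecar'], ['main']]
```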
For example, we have the CloneSet that includes two containers with different priorities:
apiVersion: apps.kruise.io/v1alpha1
kind: CloneSet
metadata:
  ...
spec:
  replicas: 1
  template:
    metadata:
      annotations:
        app-config: "... config v1 ..."
    spec:
      containers:
      - name: sidecar
        env:
        - name: KRUISE_CONTAINER_PRIORITY
          value: "10"
        - name: APP_CONFIG
          valueFrom:
            fieldRef:
              fieldPath: metadata.annotations['app-config']
      - name: main
        image: main-image:v1
  updateStrategy:
    type: InPlaceIfPossible
When we update the CloneSet to change the app-config annotation and the image of the main container, both the sidecar and main containers need to be updated. Kruise first in-place updates the Pods by recreating the sidecar container with the new env from the annotation.
At this moment, we can find the apps.kruise.io/inplace-update-state annotation on the updated Pod and see its value:
{
  "revision": "{CLONESET_NAME}-{HASH}", // the target revision name of this in-place update
  "updateTimestamp": "2022-03-22T09:06:55Z", // the start time of this whole update
  "nextContainerImages": {"main": "main-image:v2"}, // the next containers that should update images
  // "nextContainerRefMetadata": {...}, // the next containers that should update env from annotations/labels
  "preCheckBeforeNext": {"containersRequiredReady": ["sidecar"]}, // the pre-check that must be satisfied before the next containers can update
  "containerBatchesRecord": [
    {"timestamp": "2022-03-22T09:06:55Z", "containers": ["sidecar"]} // the first batch of containers that have been updated (this only means the spec of the containers has been updated, such as images in pod.spec.containers or annotations/labels; it does not mean the real containers on the node have finished updating)
  ]
}
Once the sidecar container has been updated successfully, Kruise updates the main container next. Finally, the apps.kruise.io/inplace-update-state annotation looks like:
{
  "revision": "{CLONESET_NAME}-{HASH}",
  "updateTimestamp": "2022-03-22T09:06:55Z",
  "lastContainerStatuses": {"main": {"imageID": "THE IMAGE ID OF OLD MAIN CONTAINER"}},
  "containerBatchesRecord": [
    {"timestamp": "2022-03-22T09:06:55Z", "containers": ["sidecar"]},
    {"timestamp": "2022-03-22T09:07:20Z", "containers": ["main"]}
  ]
}
Usually, users only need to check containerBatchesRecord to confirm that the containers were updated in separate batches. If a Pod is blocked during in-place update, check nextContainerImages/nextContainerRefMetadata and see whether the previous containers listed in preCheckBeforeNext have been updated successfully and are ready.
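The check described above can be sketched as a small helper that reads the value of the apps.kruise.io/inplace-update-state annotation. `summarize_inplace_state` is a hypothetical name, and the sketch assumes the annotation value is plain JSON with the fields shown in the examples above, with nextContainerRefMetadata keyed by container name like nextContainerImages.

```python
import json


def summarize_inplace_state(annotation_value: str) -> dict:
    """Summarize which containers were updated and which are still pending.

    Assumption: the annotation holds plain JSON shaped like the examples in
    this article (the // comments there are explanatory only).
    """
    state = json.loads(annotation_value)

    # Containers whose spec has already been updated, in batch order.
    updated = [
        name
        for batch in state.get("containerBatchesRecord") or []
        for name in batch.get("containers", [])
    ]

    # Containers still waiting for an image or env-from-metadata update.
    pending = sorted(
        set(state.get("nextContainerImages") or {})
        | set(state.get("nextContainerRefMetadata") or {})
    )

    # Containers that must be ready before the pending ones can proceed.
    pre_check = state.get("preCheckBeforeNext") or {}
    waiting_on = pre_check.get("containersRequiredReady", []) if pending else []

    return {"updated": updated, "pending": pending, "waiting_on_ready": waiting_on}


# State taken from the first example above; the revision is a placeholder.
mid_update = json.dumps({
    "revision": "demo-abc123",
    "updateTimestamp": "2022-03-22T09:06:55Z",
    "nextContainerImages": {"main": "main-image:v2"},
    "preCheckBeforeNext": {"containersRequiredReady": ["sidecar"]},
    "containerBatchesRecord": [
        {"timestamp": "2022-03-22T09:06:55Z", "containers": ["sidecar"]}
    ],
})

print(summarize_inplace_state(mid_update))
# {'updated': ['sidecar'], 'pending': ['main'], 'waiting_on_ready': ['sidecar']}
```

If `pending` stays non-empty for a long time, the containers listed in `waiting_on_ready` are the ones to inspect first.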
Requirements
To use in-place update for env from metadata, you have to enable kruise-daemon (enabled by default) and the InPlaceUpdateEnvFromMetadata feature-gate when installing or upgrading the Kruise chart.
Note that if you have some nodes of virtual-kubelet type, kruise-daemon may not work on them and in-place update for env from metadata will not be executed.