Upgrade Guide
This guide describes how to upgrade the Open Service Mesh (OSM) control plane.
How upgrades work
OSM’s control plane lifecycle is managed by Helm and can be upgraded with Helm’s upgrade functionality, which will patch or replace control plane components as needed based on changed values and resource templates.
Resource availability during upgrade
Since upgrades may include redeploying the osm-controller with the new version, there may be some downtime of the controller. While the osm-controller is unavailable, there will be a delay in processing new SMI resources, creating new pods to be injected with a proxy sidecar container will fail, and mTLS certificates will not be rotated.
Existing SMI resources will be unaffected, which means that the data plane (including the Envoy sidecar configs) will also be unaffected by upgrading.
Data plane interruptions are expected if the upgrade includes CRD changes. Streamlining data plane upgrades is being tracked in issue #512.
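Because the controller may be briefly unavailable, it can help to wait for it to become ready again before creating new pods or SMI resources. The following is a minimal sketch, not part of OSM itself: wait_for_controller is a hypothetical helper, and the osm-system namespace and the five-minute timeout are assumptions.

```shell
# Hypothetical helper: block until the osm-controller Deployment has finished
# rolling out. The namespace defaults to osm-system (an assumption; pass your
# own namespace as the first argument).
wait_for_controller() {
  ns=${1:-osm-system}
  kubectl rollout status deployment/osm-controller -n "$ns" --timeout=5m
}

# Usage: wait_for_controller osm-system
```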
Policy
Only certain upgrade paths are tested and supported.
Note: These plans are tentative and subject to change.
Breaking changes in this section refer to incompatible changes to the following user-facing components:
- osm CLI commands, flags, and behavior
- SMI CRDs and controllers
This implies the following are NOT user-facing and incompatible changes are NOT considered “breaking” as long as the incompatibility is handled by user-facing components:
- Chart values.yaml
- osm-mesh-config MeshConfig
- Internally-used labels and annotations (monitored-by, injection, metrics, etc.)
Upgrades are only supported between versions that do not include breaking changes, as described below.
For OSM versions 0.y.z:
- Breaking changes will not be introduced between 0.y.z and 0.y.z+1
- Breaking changes may be introduced between 0.y.z and 0.y+1.0
For OSM versions x.y.z where x >= 1:
- Breaking changes will not be introduced between x.y.z and x.y+1.0 or between x.y.z and x.y.z+1
- Breaking changes may be introduced between x.y.z and x+1.0.0
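The versioning rules above can be sketched as a small shell check. This is only an illustration: same_compat_range is a hypothetical helper, not an osm CLI feature. It generalizes the rules to any two versions that share a major version (and, for 0.y.z, a minor version), and it does not check whether a particular path is actually tested.

```shell
# Hypothetical helper: succeed (exit 0) when two versions fall in the same
# "no breaking changes" range described above, fail (exit 1) otherwise.
# Accepts versions with or without a leading "v", e.g. 1.0.0 or v1.0.0.
same_compat_range() {
  from=${1#v}
  to=${2#v}
  from_major=${from%%.*}
  to_major=${to%%.*}
  from_rest=${from#*.}
  to_rest=${to#*.}
  from_minor=${from_rest%%.*}
  to_minor=${to_rest%%.*}
  # A new major version may contain breaking changes.
  [ "$from_major" = "$to_major" ] || return 1
  # For 0.y.z, a new minor version may also contain breaking changes.
  if [ "$from_major" = "0" ] && [ "$from_minor" != "$to_minor" ]; then
    return 1
  fi
  return 0
}

# Usage: same_compat_range 0.9.1 0.9.2 && echo "no breaking changes expected"
```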
How to upgrade OSM
The recommended way to upgrade a mesh is with the osm CLI. For advanced use cases, helm may be used.
CRD Upgrades
Because Helm does not manage CRDs beyond the initial installation, OSM leverages an init-container on the osm-bootstrap pod to update existing CRDs and add new ones during an upgrade. If the new release contains updates to existing CRDs or adds new CRDs, the init-osm-bootstrap init-container on the osm-bootstrap pod will update the CRDs. The associated Custom Resources will remain as is, requiring no additional action prior to or immediately after the upgrade.
Please check the CRD Updates section of the release notes to see if any updates have been made to the CRDs used by OSM. If the versions of the Custom Resources are within the versions the updated CRD supports, no immediate action is required. OSM implements a conversion webhook for all of its CRDs, ensuring support for older versions and providing the flexibility to update Custom Resources at a later point in time.
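To compare what is installed against the release notes, the versions each CRD declares can be listed with kubectl. This is a minimal sketch: list_osm_crds is a hypothetical helper, and it assumes OSM's own CRDs live in API groups ending in openservicemesh.io, so SMI CRDs are not matched by this filter.

```shell
# Hypothetical helper: print each OSM CRD with the versions it declares.
# Assumes OSM CRD names end in openservicemesh.io; the header row and any
# CRDs from other API groups are filtered out by grep.
list_osm_crds() {
  kubectl get crds \
    -o custom-columns='NAME:.metadata.name,VERSIONS:.spec.versions[*].name' \
    | grep 'openservicemesh\.io'
}

# Usage: list_osm_crds
```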
Upgrading with the OSM CLI
Pre-requisites
- Kubernetes cluster with the OSM control plane installed
  - Ensure that the Kubernetes cluster has the minimum Kubernetes version required by the new OSM chart. This can be found in the Installation Pre-requisites.
- osm CLI installed
  - By default, the osm CLI will upgrade to the same chart version that it installs, e.g. v1.0.0 of the osm CLI will upgrade to v1.0.0 of the OSM Helm chart. Upgrading to any version of the Helm chart other than the one matching the CLI may work, but those scenarios are not tested and issues that arise may not get fixed even if reported.
The osm mesh upgrade command performs a helm upgrade of the existing Helm release for a mesh.
Basic usage requires no additional arguments or flags:
$ osm mesh upgrade
OSM successfully upgraded mesh osm
This command will upgrade the mesh with the default mesh name in the default OSM namespace. Values from the previous release will NOT carry over to the new release by default, but may be passed individually with the --set flag on osm mesh upgrade.
See osm mesh upgrade --help for more details.
Upgrading with Helm
Pre-requisites
- Kubernetes cluster with the OSM control plane installed
- The helm 3 CLI
OSM Configuration
When upgrading, custom settings used to install or run OSM may be reverted to their defaults. This applies only to metrics deployments. Please follow this guide carefully to prevent these values from being overwritten.
To preserve any changes you’ve made to the OSM configuration, use the helm --values flag. Create a copy of the values file (make sure to use the version for the upgraded chart) and change any values you wish to customize. You can omit all other values.
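One possible starting point for the override file is the set of user-supplied values recorded on the existing release, which helm get values can print. This is a sketch only: save_release_values is a hypothetical helper, the defaults osm and osm-system are assumptions, and keys may have been renamed in the new chart, so the result should still be checked against the upgraded chart's values file.

```shell
# Hypothetical helper: print the user-supplied values of the existing Helm
# release as YAML. Defaults (release "osm", namespace "osm-system") are
# assumptions; pass your own mesh name and namespace as arguments.
save_release_values() {
  mesh_name=${1:-osm}
  ns=${2:-osm-system}
  helm get values "$mesh_name" --namespace "$ns" --output yaml
}

# Usage: save_release_values osm osm-system > override.yaml
```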
Note: Any configuration changes that go into the MeshConfig will not be applied during upgrade; the values will remain as they were prior to the upgrade. If you wish to update any value in the MeshConfig, you can do so by patching the resource after an upgrade.
For example, if the logLevel field in the MeshConfig was set to info prior to upgrade, updating this in override.yaml during an upgrade will not cause any change.
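Patching the MeshConfig after the upgrade could look like the sketch below. The helper set_mesh_log_level is hypothetical, and the field path spec.observability.osmLogLevel is an assumption about where the log level lives in the MeshConfig; verify the path against the MeshConfig in your cluster before using it.

```shell
# Hypothetical helper: merge-patch the log level in the osm-mesh-config
# MeshConfig. The field path spec.observability.osmLogLevel is an assumption;
# the namespace defaults to osm-system.
set_mesh_log_level() {
  level=$1
  ns=${2:-osm-system}
  kubectl patch meshconfig osm-mesh-config -n "$ns" --type=merge \
    -p "{\"spec\":{\"observability\":{\"osmLogLevel\":\"$level\"}}}"
}

# Usage: set_mesh_log_level debug osm-system
```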
Warning: Do NOT change osm.meshName or osm.osmNamespace.
Helm Upgrade
Then run the following helm upgrade command.
$ helm upgrade <mesh name> osm --repo https://openservicemesh.github.io/osm --version <chart version> --namespace <osm namespace> --values override.yaml
Omit the --values flag if you prefer to use the default settings.
Run helm upgrade --help for more options.
Upgrading Third Party Dependencies
Envoy
The Envoy version can be updated by changing the value of the envoyImage variable in the osm-mesh-config. When doing so, it is recommended to specify the image digest associated with that Envoy version to avoid being vulnerable to supply chain attacks. For instance, to update the envoy-alpine image to v1.19.1, run the following commands:
export osm_namespace=osm-system # Replace osm-system with the namespace where OSM is installed
kubectl patch meshconfig osm-mesh-config -n $osm_namespace -p '{"spec":{"sidecar":{"envoyImage":"envoyproxy/envoy-alpine@sha256:6502a637c6c5fba4d03d0672d878d12da4bcc7a0d0fb3f1d506982dde0039abd"}}}' --type=merge
After the MeshConfig resource has been updated, all the pods and deployments that are part of the mesh must be restarted so that the newer version of the Envoy sidecar can be injected into the pods as part of the automatic sidecar injection that OSM performs. This can be done with the kubectl rollout restart deploy command.
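The restart can be scripted across every namespace that belongs to the mesh. The sketch below is an illustration under assumptions: restart_mesh_deployments is a hypothetical helper, it assumes meshed namespaces carry the label openservicemesh.io/monitored-by=<mesh name>, and it only restarts Deployments, so other workload kinds would need the same treatment.

```shell
# Hypothetical helper: restart all Deployments in every namespace labeled as
# monitored by the given mesh. The label key openservicemesh.io/monitored-by
# is an assumption; the mesh name defaults to "osm".
restart_mesh_deployments() {
  mesh_name=${1:-osm}
  for ns in $(kubectl get namespaces \
      -l "openservicemesh.io/monitored-by=$mesh_name" \
      -o jsonpath='{.items[*].metadata.name}'); do
    kubectl rollout restart deployment -n "$ns"
  done
}

# Usage: restart_mesh_deployments osm
```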
Prometheus, Grafana, and Jaeger
If enabled, OSM’s Prometheus, Grafana, and Jaeger services are deployed alongside other OSM control plane components. Though these third party dependencies cannot be updated through the MeshConfig like Envoy, their versions can still be updated in the deployment directly. For instance, to update Prometheus to v2.19.1, the user can run:
export osm_namespace=osm-system # Replace osm-system with the namespace where OSM is installed
kubectl set image deployment/osm-prometheus -n $osm_namespace prometheus="prom/prometheus:v2.19.1"
To update to Grafana 8.1.0, the command would look like:
kubectl set image deployment/osm-grafana -n $osm_namespace grafana="grafana/grafana:8.1.0"
And for Jaeger, the user would run the following to update to 1.26.0:
kubectl set image deployment/jaeger -n $osm_namespace jaeger="jaegertracing/all-in-one:1.26.0"
OSM Upgrade Troubleshooting Guide
OSM Mesh Upgrade Timing Out
Insufficient CPU
If the osm mesh upgrade command is timing out, it could be due to insufficient CPU.
- Check the pods to see if any of them aren’t fully up and running
# Replace osm-system with osm-controller's namespace if using a non-default namespace
kubectl get pods -n osm-system
- If there are any pods that are in Pending state, use kubectl describe to check the Events section
# Replace osm-system with osm-controller's namespace if using a non-default namespace
kubectl describe pod <pod-name> -n osm-system
If you see the following error, then please increase the number of CPUs Docker can use.
`Warning FailedScheduling 4s (x15 over 19m) default-scheduler 0/1 nodes are available: 1 Insufficient cpu.`
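The two checks above can be combined so that only Pending pods are described. This is a minimal sketch: describe_pending_pods is a hypothetical helper, and the osm-system default is an assumption to be replaced with osm-controller's namespace.

```shell
# Hypothetical helper: describe every pod in Pending state so the Events
# section (e.g. "Insufficient cpu") can be inspected. The namespace defaults
# to osm-system; pass osm-controller's namespace if it differs.
describe_pending_pods() {
  ns=${1:-osm-system}
  for pod in $(kubectl get pods -n "$ns" \
      --field-selector=status.phase=Pending -o name); do
    kubectl describe "$pod" -n "$ns"
  done
}

# Usage: describe_pending_pods osm-system
```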
Error Validating CLI Parameters
If the osm mesh upgrade command is still timing out, it could be due to a CLI/Image Version mismatch.
- Check the pods to see if any of them aren’t fully up and running
# Replace osm-system with osm-controller's namespace if using a non-default namespace
kubectl get pods -n osm-system
- If there are any pods that are in Pending state, use kubectl describe to check the Events section for Error Validating CLI parameters
# Replace osm-system with osm-controller's namespace if using a non-default namespace
kubectl describe pod <pod-name> -n osm-system
- If you find the error, please check the pod’s logs for any errors
kubectl logs -n osm-system <pod-name> | grep -i error
If you see the following error, then it’s due to a CLI/Image Version mismatch.
`"error":"Please specify the init container image using --init-container-image","reason":"FatalInvalidCLIParameters"`
The workaround is to set the container-registry and osm-image-tag flags when running osm mesh upgrade.
osm mesh upgrade --container-registry $CTR_REGISTRY --osm-image-tag $CTR_TAG --enable-egress=true
Other Issues
If you’re running into issues that are not resolved with the steps above, please open a GitHub issue.