Release notes for kOps 1.29 series

Significant changes

  • Add support to —emit-per-nodegroup-metrics on the cluster autoscaler addon (emitPerNodegroupMetrics) PR#16693

Deferred deletion / pruning phase

Some infrastructure changes are potentially disruptive to the continued operation of the cluster. For the most disruptive operations, particularly those that break rolling-update of the cluster, we have started to use deferred deletion to minimize the impact. For example, on AWS we create a second NLB during the kops update phase when we cannot change the NLB directly. kops update will report that a --prune is needed. To minimize disruption, we recommend you perform this after a rolling-update, for example:

  1. kops update $MYCLUSTER --yes --admin
  2. kops rolling-update $MYCLUSTER --yes
  3. kops update $MYCLUSTER --yes --admin --prune # NEW!

Deferred deletion is currently used to safely introduce security groups for NLBs on AWS, and to move to an internal load balancer for kops-controller on GCP.

Initial OpenTelemetry Support

We are starting to add (experimental) support for OpenTelemetry, in particular Tracing support. Setting OTEL_EXPORTER_OTLP_TRACES_FILE will write a trace file which can then be read by the traceserver program. More information and options are described in docs/opentelemetry.md. The tracing data is not expected to be particularly useful for end-users in this release; the (non-standard) recording approach is instead intended to work well with our Prow end-to-end testing system so that developers can optimize kOps.

Please note: this is not telemetry in the “phone-home” sense. The kOps project does not collect data from your machine. As an open-source project we do not even want to collect any of your data. Currently the only OpenTelemetry backend supported is writing to a filesystem (and it is opt-in). In future you will be able to configure other OpenTelemetry backends, but this data will only be sent if you enable OpenTelemetry, and only sent to where you configure.

AWS

  • Network Load Balancers in front of the Kubernetes API and bastion hosts now have a security group attached. These security groups are used for security group rules allowing incoming traffic to the NLBs as well as traffic between the NLBs and their target instances.

  • Posts event data to URL upon instance interruption action in aws-node-termination-handler with WEBHOOK_URL.

GCP

  • As of Kubernetes version 1.29, credentials for private GCR/AR repositories will be handled by the out-of-tree credential provider. This is an additional binary that each instance downloads from the assets repository.
  • Two additional StorageClasses are created on GCP clusters. These are called balanced-csi and ssd-csi and utilize the GCP Balanced and SSD Persistent Disk volume types respectively.
  • Breaking Change - the default StorageClass has been changed from standard-csi to balanced-csi.

  • We now use a private load-balancer for in-cluster traffic on GCP, which allows us to use network tags to restrict access only to the cluster nodes.

Breaking changes

Other breaking changes

  • kops toolbox dump limits the number of nodes dumped to 500 by default. Use --max-nodes to override.

  • Support for Kubernetes version 1.23 has been removed.

Known Issues

Deprecations

  • Support for Kubernetes version 1.24 is deprecated and will be removed in kOps 1.30.

  • Support for Kubernetes version 1.25 is deprecated and will be removed in kOps 1.31.

  • Support for AWS Classic Load Balancer for API is deprecated and should not be used for newly created clusters.

  • All legacy addons (under /addons) are deprecated in favor of managed addons, including the metrics server addon and the autoscaler addon.