Autopilot

Autopilot is a tool for updating your k0s controller and worker nodes using specialized plans. A public update server is hosted on the same domain as the documentation site; see the example below for how to use it. Only a single channel, edge_release, is available, and it exposes the latest released version.

How it works

  • Create a Plan YAML
    • Define the update payload (new version of k0s, URLs for platforms, etc.)
    • Add definitions for all the nodes that should receive the update.
      • Either statically, or dynamically using label/field selectors
  • Apply the Plan
    • Applying a Plan is a simple kubectl apply operation.
  • Monitor the progress
    • The applied Plan provides a status that details the progress.

Automatic updates

To enable automatic updates, create an UpdateConfig object:

    apiVersion: autopilot.k0sproject.io/v1beta2
    kind: UpdateConfig
    metadata:
      name: example
      namespace: default
    spec:
      channel: edge_release
      updateServer: https://updates.k0sproject.io/
      upgradeStrategy:
        type: periodic
        periodic:
          # The following fields configure updates to happen only on Tue or Wed at 13:00-15:00
          days: [Tuesday,Wednesday]
          startTime: "13:00"
          length: 2h
      planSpec: # This defines the plan to be created IF there are updates available
        ...

Safeguards

There are a number of safeguards in place to avoid breaking a cluster.

Stateless Component

  • The autopilot component was designed to not require any heavy state or massive synchronization. Controllers can disappear, and backup controllers can resume the autopilot operations.

Workers Update Only After Controllers

  • The versioning that Kubelet and the Kubernetes API server adhere to requires that Kubelets are never newer than the API server.

  • When a Plan that has both controller and worker nodes is applied, autopilot updates all of the controller nodes first. Only when all controllers have updated successfully do worker nodes receive their update instructions.

Plans are Immutable

  • When you apply a Plan, autopilot evaluates all of the controllers and workers that should be included in the Plan, and tracks them in the status. After this point, no additional changes to the Plan (other than status) will be recognized.
    • This helps in largely dynamic worker node environments where nodes that may have been matched by the selector discovery method no longer exist by the time the update is ready to be scheduled.

Controller Quorum Safety

  • Prior to scheduling a controller update, autopilot queries the API server of all controllers to ensure that they report a successful /ready.
  • Only once all controllers are /ready will the current controller be sent its update signal.
  • If any controller reports as not ready, the Plan transitions into an InconsistentTargets state, and the Plan execution ends.

Controllers Update Sequentially

  • Despite having the configuration options for controllers to set concurrency, only one controller will be updated at a time.

Update Payload Verification

  • Each update object payload can provide an optional sha256 hash of the update content (specified in url), which is compared against the update content after it downloads.

Configuration

Autopilot relies on a Plan object for its instructions on what to update.

Here is an arbitrary Autopilot plan:

    apiVersion: autopilot.k0sproject.io/v1beta2
    kind: Plan
    metadata:
      name: autopilot
    spec:
      id: id1234
      timestamp: now
      commands:
        - k0supdate:
            version: v1.31.3+k0s.0
            platforms:
              linux-amd64:
                url: https://github.com/k0sproject/k0s/releases/download/v1.31.3+k0s.0/k0s-v1.31.3+k0s.0-amd64
                sha256: '0000000000000000000000000000000000000000000000000000000000000000'
            targets:
              controllers:
                discovery:
                  static:
                    nodes:
                      - ip-172-31-44-131
                      - ip-172-31-42-134
                      - ip-172-31-39-65
              workers:
                limits:
                  concurrent: 5
                discovery:
                  selector:
                    labels: environment=staging
                    fields: metadata.name=worker2

Core Fields

apiVersion <string> (required)

  • The current version of the Autopilot API is v1beta2, with a full group-version of autopilot.k0sproject.io/v1beta2

metadata.name <string> (required)

  • The name of the plan should always be autopilot
    • Note: Plans will not execute if they don’t follow this convention.

Spec Fields

spec.id <string> (optional)

  • An identifier that can be provided by the creator for informational and tracking purposes.

spec.timestamp <string> (optional)

  • A timestamp value that can be provided by the creator for informational purposes. Autopilot does nothing with this information.

spec.commands[] (required)

  • The commands field contains all of the commands that should be performed as part of the plan.

k0supdate Command

spec.commands[].k0supdate.version <string> (required)

  • The version of the binary being updated. This version is used to compare against the installed version before and after update to ensure success.

spec.commands[].k0supdate.platforms.*.url <string> (required)

  • A URL from which the updated binary should be downloaded for this specific platform.
    • The naming of platforms is a combination of $GOOS and $GOARCH, separated by a hyphen (-)
      • e.g. linux-amd64, linux-arm64, linux-arm
    • Note: The main supported platform is linux. Autopilot may work on other platforms, but this has not been tested.

spec.commands[].k0supdate.platforms.*.sha256 <string> (optional)

  • If a SHA256 hash is provided for the binary, the completed download will be verified against it.
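
As an illustration of the platform fields above, a k0supdate command that covers two architectures might look like the following sketch. The linux-amd64 URL is taken from the example Plan. The linux-arm64 URL is an assumption that the release artifact follows the same naming pattern, and the all-zero sha256 values are placeholders, not real hashes. The targets section is omitted for brevity.

```yaml
commands:
  - k0supdate:
      version: v1.31.3+k0s.0
      platforms:
        # Keys follow the $GOOS-$GOARCH naming convention
        linux-amd64:
          url: https://github.com/k0sproject/k0s/releases/download/v1.31.3+k0s.0/k0s-v1.31.3+k0s.0-amd64
          # Placeholder hash; replace with the real SHA256 of the binary
          sha256: '0000000000000000000000000000000000000000000000000000000000000000'
        linux-arm64:
          # Assumed URL, mirroring the amd64 naming pattern
          url: https://github.com/k0sproject/k0s/releases/download/v1.31.3+k0s.0/k0s-v1.31.3+k0s.0-arm64
          # Placeholder hash; replace with the real SHA256 of the binary
          sha256: '0000000000000000000000000000000000000000000000000000000000000000'
```

Because sha256 is optional, leaving it out skips verification for that platform.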

spec.commands[].k0supdate.targets.controllers <object> (optional)

  • This object provides the details of how controllers should be updated.

spec.commands[].k0supdate.targets.controllers.limits.concurrent <int> (fixed as 1)

  • The plan spec allows a number of concurrent controller updates to be specified, but for controller targets this value is always fixed to 1.
  • By ensuring that only one controller updates at a time, we aim to avoid scenarios where quorum may be disrupted.

spec.commands[].k0supdate.targets.workers <object> (optional)

  • This object provides the details of how workers should be updated.

spec.commands[].k0supdate.targets.workers.limits.concurrent <int> (optional, default = 1)

  • Specifying a concurrent value for worker targets will allow for that number of workers to be updated at a time. If no value is provided, 1 is assumed.
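
A minimal sketch of a targets section that combines the two limits described above. The controller hostnames and the worker label are the illustrative values used elsewhere in this document, and the concurrency value of 3 is an arbitrary choice.

```yaml
targets:
  controllers:
    # No limits block: controllers are always updated one at a time
    discovery:
      static:
        nodes:
          - ip-172-31-44-131
          - ip-172-31-42-134
          - ip-172-31-39-65
  workers:
    limits:
      # Up to three workers are updated at a time; defaults to 1 if omitted
      concurrent: 3
    discovery:
      selector:
        labels: environment=staging
```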

airgapupdate Command

spec.commands[].airgapupdate.version <string> (required)

  • The version of the airgap bundle being updated.

spec.commands[].airgapupdate.platforms.*.url <string> (required)

  • A URL from which the updated airgap bundle should be downloaded for this specific platform.
    • The naming of platforms is a combination of $GOOS and $GOARCH, separated by a hyphen (-)
      • e.g. linux-amd64, linux-arm64, linux-arm
    • Note: The main supported platform is linux. Autopilot may work on other platforms, but this has not been tested.

spec.commands[].airgapupdate.platforms.*.sha256 <string> (optional)

  • If a SHA256 hash is provided for the airgap bundle, the completed download will be verified against it.

spec.commands[].airgapupdate.workers <object> (optional)

  • This object provides the details of how workers should be updated.

spec.commands[].airgapupdate.workers.limits.concurrent <int> (optional, default = 1)

  • Specifying a concurrent value for worker targets will allow for that number of workers to be updated at a time. If no value is provided, 1 is assumed.
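
Putting the airgapupdate fields together, a Plan that distributes an airgap bundle to all workers might look like the sketch below. The field layout follows the reference above, with workers placed directly under airgapupdate. The URL is a placeholder on example.com and the file name is hypothetical, so point it at wherever your bundle is actually hosted.

```yaml
apiVersion: autopilot.k0sproject.io/v1beta2
kind: Plan
metadata:
  name: autopilot
spec:
  commands:
    - airgapupdate:
        version: v1.31.3+k0s.0
        platforms:
          linux-amd64:
            # Placeholder URL and hypothetical file name
            url: https://example.com/k0s-airgap-bundle-v1.31.3+k0s.0-amd64
        workers:
          limits:
            # Two workers receive the bundle at a time
            concurrent: 2
          discovery:
            # Empty selector: all worker nodes
            selector: {}
```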

Static Discovery

This defines the static discovery method used for this set of targets (controllers, workers). The static discovery method relies on a fixed set of hostnames defined in .nodes.

It is expected that a Node (workers) or ControlNode (controllers) object exists with the same name.

    static:
      nodes:
        - ip-172-31-44-131
        - ip-172-31-42-134
        - ip-172-31-39-65

spec.commands[].k0supdate.targets.*.discovery.static.nodes[] <string> (required for static)

  • A list of hostnames that should be included in the target set (controllers, workers).

Selector Target Discovery

The selector target discovery method relies on a dynamic query to the Kubernetes API using labels and fields to produce a set of hosts that should be updated.

Providing both labels and fields in the selector definition will result in a logical AND of both operands.

    selector:
      labels: environment=staging
      fields: metadata.name=worker2

Specifying an empty selector will result in all nodes being selected for this target set.

    selector: {}

spec.commands[].k0supdate.targets.*.discovery.selector.labels <string> (optional)

  • A collection of name/value labels that should be used for finding the appropriate nodes for the update of this target set.

spec.commands[].k0supdate.targets.*.discovery.selector.fields <string> (optional)

  • A collection of name/value fields that should be used for finding the appropriate nodes for the update of this target set.
    • Note: Currently only the field metadata.name is available as a query field.
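
Since labels and fields are both optional, either one can be used on its own. The fragments below are a sketch of the two single-operand forms, reusing the illustrative label and node name from the example above. Each YAML document shows an alternative discovery block, not parts of one Plan.

```yaml
# Label query only: every node labelled environment=staging
selector:
  labels: environment=staging
---
# Field query only: the single node named worker2
# (metadata.name is currently the only supported field)
selector:
  fields: metadata.name=worker2
```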

Status Reporting

After a Plan has been applied, its progress can be viewed in the .status of the autopilot Plan.

    kubectl get plan autopilot -oyaml

An example of a Plan status:

    status:
      state: SchedulableWait
      commands:
        - state: SchedulableWait
          k0supdate:
            controllers:
              - lastUpdatedTimestamp: "2022-04-07T15:52:44Z"
                name: controller0
                state: SignalCompleted
              - lastUpdatedTimestamp: "2022-04-07T15:52:24Z"
                name: controller1
                state: SignalCompleted
              - lastUpdatedTimestamp: "2022-04-07T15:52:24Z"
                name: controller2
                state: SignalPending
            workers:
              - lastUpdatedTimestamp: "2022-04-07T15:52:24Z"
                name: worker0
                state: SignalPending
              - lastUpdatedTimestamp: "2022-04-07T15:52:24Z"
                name: worker1
                state: SignalPending
              - lastUpdatedTimestamp: "2022-04-07T15:52:24Z"
                name: worker2
                state: SignalPending

This status indicates that:

  • The overall status of the update is SchedulableWait, meaning that autopilot is waiting for the next opportunity to process a command.
  • There are three controller nodes
    • Two controllers have completed successfully (SignalCompleted)
    • One is waiting to be signalled (SignalPending)
  • There are also three worker nodes
    • All are awaiting signaling updates (SignalPending)

Plan Status

The Plan status at .status.state represents the overall status of the autopilot update operation. There are a number of statuses available:

| Status | Description | Ends Plan? |
|---|---|---|
| IncompleteTargets | There are nodes in the resolved Plan that do not have associated Node (worker) or ControlNode (controller) objects. | Yes |
| InconsistentTargets | A controller has reported itself as not-ready during the selection of the next controller to update. | Yes |
| Schedulable | Indicates that the Plan can be re-evaluated to determine which node to update next. | No |
| SchedulableWait | Scheduling operations are in progress, and no further update scheduling should occur. | No |
| Completed | The Plan has run successfully to completion. | Yes |
| Restricted | The Plan included node types (controller or worker) that violate the --exclude-from-plans restrictions. | Yes |

Node Status

Similar to the Plan Status, the individual nodes can have their own statuses:

| Status | Description |
|---|---|
| SignalPending | The node is available and awaiting an update signal. |
| SignalSent | Update signaling has been successfully applied to this node. |
| MissingPlatform | This node is a platform that an update has not been provided for. |
| MissingSignalNode | This node does not have an associated Node (worker) or ControlNode (controller) object. |

UpdateConfig

UpdateConfig Core Fields

apiVersion <string> (required field)

  • API version. The current version of the Autopilot API is v1beta2, with a full group-version of autopilot.k0sproject.io/v1beta2

metadata.name <string> (required field)

  • Name of the config.

Spec

spec.channel <string> (optional)

  • Update channel to use. Supported values: stable (default), unstable.

spec.updateServer <string> (optional)

  • Update server URL. Defaults to https://updates.k0sproject.io

spec.upgradeStrategy.type <enum:cron|periodic>

  • Select which update strategy to use.

spec.upgradeStrategy.cron <string> (optional) DEPRECATED

  • Schedule to check for updates in crontab format.

spec.upgradeStrategy.periodic <object>

Fields:

  • days: On which weekdays to check for updates
  • startTime: At which time of day to check updates
  • length: The length of the update window

spec.planSpec <object> (optional)

  • Describes the behavior of the autopilot-generated Plan

Example

    apiVersion: autopilot.k0sproject.io/v1beta2
    kind: UpdateConfig
    metadata:
      name: example
    spec:
      channel: stable
      updateServer: https://updates.k0sproject.io/
      upgradeStrategy:
        type: periodic
        periodic:
          # The following fields configure updates to happen only on Tue or Wed at 13:00-15:00
          days: [Tuesday,Wednesday]
          startTime: "13:00"
          length: 2h
      # Optional. Specifies a created Plan object
      planSpec:
        commands:
          - k0supdate: # optional
              forceupdate: true # optional
              targets:
                controllers:
                  discovery:
                    static:
                      nodes:
                        - ip-172-31-44-131
                        - ip-172-31-42-134
                        - ip-172-31-39-65
                workers:
                  limits:
                    concurrent: 5
                  discovery:
                    selector:
                      labels: environment=staging
                      fields: metadata.name=worker2
            airgapupdate: # optional
              workers:
                limits:
                  concurrent: 5
                discovery:
                  selector:
                    labels: environment=staging
                    fields: metadata.name=worker2

FAQ

Q: How do I apply the Plan and ControlNode CRDs?

A: These CRD definitions are embedded in the k0s binary and applied on startup. No additional action is needed.

Q: How will ControlNode instances get removed?

A: ControlNode instances are created by autopilot controllers as they start up. When controllers disappear, they do not remove their associated ControlNode instance. It is the responsibility of the operator/administrator to clean up stale instances.

Q: I upgraded my workers, and now Kubelets are no longer reporting. Why?

A: You probably upgraded your workers to a version newer than the one running on the API server. See the Kubernetes version skew policy:

https://kubernetes.io/releases/version-skew-policy/

Make sure that your controllers are at the desired version before upgrading workers.