Job Sidecar Terminator

FEATURE STATE: Kruise v1.4.0

In Kubernetes, one usually expects the Pods of a Job workload to enter the Completed phase once the main container exits. However, when these Pods have long-running sidecar containers, the sidecars keep running after the main containers have completed, which prevents the Pods from completing.

To solve this problem, the job sidecar terminator controller watches such job-type Pods and terminates their sidecar containers once the main containers have completed.

Note: If your Kubernetes version is 1.28 or later, it is recommended to use the Kubernetes native Sidecar Container feature to solve the above problem.

Requirements

  • Enable the SidecarTerminator feature gate when installing or upgrading Kruise (disabled by default).
  • Enable the KruiseDaemon feature gate when installing or upgrading Kruise (enabled by default).
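
As a rough illustration, assuming Kruise is installed with its Helm chart and that the chart exposes a `featureGates` value (check this against the chart version you use), the gate could be switched on through a values file. The file name is hypothetical, and KruiseDaemon is left out because it is enabled by default.

```yaml
# values-sidecar-terminator.yaml (hypothetical file name)
# Assumption: the Kruise Helm chart reads feature gates from a single
# comma-separated string value named `featureGates`.
featureGates: "SidecarTerminator=true"
```

Such a file would typically be passed to `helm install` or `helm upgrade` with the `-f` option.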

Usage

For Pods Running on Normal Nodes

This feature is very easy to use if your Pods run on normal nodes: just add a special env to each sidecar container you want to terminate, and Kruise will terminate those containers using CRR (ContainerRecreateRequest) at the right time:

    kind: Job
    spec:
      template:
        spec:
          containers:
          - name: sidecar
            env:
            - name: KRUISE_TERMINATE_SIDECAR_WHEN_JOB_EXIT
              value: "true"
          - name: main
            ... ...
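
For reference, a fuller manifest might look like the sketch below. The Job name, the images, and both commands are placeholders chosen for illustration; only the env name and value come from the snippet above. The sidecar's shell loop traps SIGTERM and exits 0, which matches the requirement listed under Notes.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: demo-job                 # hypothetical name
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: sidecar
        image: busybox:1.36      # placeholder image
        # Placeholder long-running process that exits 0 on SIGTERM.
        command: ["sh", "-c", "trap 'exit 0' TERM; while true; do sleep 1; done"]
        env:
        - name: KRUISE_TERMINATE_SIDECAR_WHEN_JOB_EXIT
          value: "true"
      - name: main
        image: busybox:1.36      # placeholder image
        # Placeholder workload: finishes after about ten seconds.
        command: ["sh", "-c", "echo working; sleep 10"]
```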

For Pods Running on Virtual Nodes

On certain serverless container platforms such as ECI and Fargate, Pods run on Virtual-Kubelet instead of normal nodes. Kruise cannot terminate their sidecars using CRR there, because Kruise Daemon cannot run on Virtual-Kubelet. However, this can be addressed with the Pod in-place update mechanism offered by native Kubernetes: when a sidecar container needs to be terminated, its original image is replaced with an image that exits as soon as it is pulled and started.

Step 1: Prepare a special image

  • This image should exit as soon as it is pulled and started.
  • This image should be compatible with the command and args of the original sidecar container.

Step 2: Configure your sidecar container

    kind: Job
    spec:
      template:
        spec:
          containers:
          - name: sidecar
            env:
            - name: KRUISE_TERMINATE_SIDECAR_WHEN_JOB_EXIT_WITH_IMAGE
              value: "example/quick-exit:v1.0.0"
          - name: main
            ... ...

Replace "example/quick-exit:v1.0.0" with your prepared image.
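
The compatibility requirement from Step 1 can be illustrated with a sketch like the following. All image names other than the quick-exit placeholder, the Job name, and the `/opt/agent/run` path are hypothetical, and the settings needed to schedule the Pod onto a virtual node are omitted. Because only the image is swapped, the sidecar's `command` and `args` stay in place, so the quick-exit image is assumed to ship an executable at the same path that accepts the same arguments and exits immediately.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: demo-job-on-virtual-node         # hypothetical name
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: sidecar
        image: example/log-agent:v1.0.0   # hypothetical original sidecar image
        # These stay unchanged when the image is replaced, so the
        # quick-exit image must provide /opt/agent/run and tolerate --follow.
        command: ["/opt/agent/run"]
        args: ["--follow"]
        env:
        - name: KRUISE_TERMINATE_SIDECAR_WHEN_JOB_EXIT_WITH_IMAGE
          value: "example/quick-exit:v1.0.0"
      - name: main
        image: example/batch-task:v1.0.0  # hypothetical main image
```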

Ignore sidecar container with non-zero exit code

FEATURE STATE: Kruise v1.6.0

In previous versions, the sidecar container was required to handle the SIGTERM signal and exit with code 0. If the sidecar container exited with a non-zero exit code, the Pod would end up with Phase=Failed.

As of Kruise 1.6.0, Kruise ignores sidecar containers with non-zero exit codes, and the Pod phase depends only on the success or failure of the main containers.

Notes

  • Your sidecar container must respond to the SIGTERM signal, and its entrypoint should exit with code 0 when it receives this signal.

  • This feature handles Pods with the Never or OnFailure restart policy, regardless of which type of job controller manages them.

  • Containers with the env KRUISE_TERMINATE_SIDECAR_WHEN_JOB_EXIT are treated as sidecars; all other containers are treated as main containers.

  • The sidecars are terminated when ALL main containers have completed.

    • With the Never restart policy, a main container is treated as completed once it exits.

    • With the OnFailure restart policy, a main container is treated as completed once it exits with exit code 0.

  • For Pods running on normal nodes, KRUISE_TERMINATE_SIDECAR_WHEN_JOB_EXIT takes priority over KRUISE_TERMINATE_SIDECAR_WHEN_JOB_EXIT_WITH_IMAGE.
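
To make the last few notes concrete, here is a sketch of a Pod template fragment that sets both variables on one sidecar and uses the OnFailure restart policy. The images are placeholders. Reading the notes above, Kruise would be expected to use CRR when such a Pod runs on a normal node; that the image variable then only matters on virtual nodes is an inference from this document, not something verified against the controller.

```yaml
# Fragment of a Job's spec.template.spec (images are placeholders).
restartPolicy: OnFailure          # main must exit with code 0 to count as completed
containers:
- name: sidecar                   # treated as a sidecar because of the env below
  image: example/log-agent:v1.0.0
  env:
  - name: KRUISE_TERMINATE_SIDECAR_WHEN_JOB_EXIT
    value: "true"                 # takes priority on normal nodes
  - name: KRUISE_TERMINATE_SIDECAR_WHEN_JOB_EXIT_WITH_IMAGE
    value: "example/quick-exit:v1.0.0"
- name: main                      # no special env, so treated as a main container
  image: example/batch-task:v1.0.0
```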