Configure Kubernetes runtime

The Kubernetes runtime works when a function worker generates and applies Kubernetes manifests. The manifests generated by a function worker include:

  • a StatefulSet. By default, the StatefulSet manifest defines a single pod whose number of replicas is determined by the parallelism of the function. Each pod downloads the function payload (via the function worker REST API) when it boots. The pod’s container image is configurable as part of the function runtime configuration.
  • a Service (used to communicate with the pod)
  • a Secret for authentication credentials (when applicable). The Kubernetes runtime supports secrets: you can create a Kubernetes secret and expose it as an environment variable in the pod.

Tip

For the rules of translating Pulsar object names into Kubernetes resource labels, see How to define Pulsar resource names when running Pulsar in Kubernetes.

Configure basic settings

To quickly configure a Kubernetes runtime, you can use the default settings of KubernetesRuntimeFactoryConfig in the conf/functions_worker.yml file.

If you have set up a Pulsar cluster on Kubernetes using the Helm chart, function workers are also running on Kubernetes, so you can use the serviceAccount associated with the pod where the function worker is running. Otherwise, you can configure function workers to communicate with a Kubernetes cluster by setting k8Uri under functionRuntimeFactoryConfigs.
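
As a minimal sketch, the relevant part of the function worker configuration might look like the fragment below. Only functionRuntimeFactoryConfigs and k8Uri are named in this section. The functionRuntimeFactoryClassName key, its value, and jobNamespace are not stated here and are assumptions based on the stock Pulsar configuration file, so check them against your Pulsar version. The URL and namespace values are placeholders.

```yaml
# Function worker configuration (fragment, illustrative values only)
functionRuntimeFactoryClassName: org.apache.pulsar.functions.runtime.kubernetes.KubernetesRuntimeFactory
functionRuntimeFactoryConfigs:
  # Needed only when the function worker cannot rely on the serviceAccount
  # of its own pod, for example when it runs outside the Kubernetes cluster.
  k8Uri: "https://kubernetes.example.com:6443"   # placeholder API server URL
  jobNamespace: "pulsar-func"                     # placeholder namespace for function pods
```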

Integrate Kubernetes secrets

A Secret in Kubernetes is an object that holds confidential data such as a password, a token, or a key. When you create a secret in the Kubernetes namespace where your functions are deployed, functions can safely reference and distribute it. To enable this feature, set secretsProviderConfiguratorClassName to org.apache.pulsar.functions.secretsproviderconfigurator.KubernetesSecretsProviderConfigurator in the conf/functions_worker.yml file.
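
For instance, the setting described above comes down to a single line in the function worker configuration file:

```yaml
# Lets functions reference Kubernetes secrets in their own namespace
secretsProviderConfiguratorClassName: org.apache.pulsar.functions.secretsproviderconfigurator.KubernetesSecretsProviderConfigurator
```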

For example, you deploy a function to the pulsar-func Kubernetes namespace, and you have a secret named database-creds with a field named password, which you want to expose in the pod as an environment variable named DATABASE_PASSWORD. The following configuration enables the function to reference the secret and mount its value as an environment variable in the pod.

    tenant: "mytenant"
    namespace: "mynamespace"
    name: "myfunction"
    inputs: [ "persistent://mytenant/mynamespace/myfuncinput" ]
    className: "com.company.pulsar.myfunction"
    secrets:
      # the secret will be mounted from the `password` field in the `database-creds` secret as an env var called `DATABASE_PASSWORD`
      DATABASE_PASSWORD:
        path: "database-creds"
        key: "password"
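
The function configuration above assumes that the database-creds secret already exists in the pulsar-func namespace. One way to create it is a standard Kubernetes Secret manifest, sketched below. The password value is a placeholder, and the file name in the comment is hypothetical.

```yaml
# database-creds.yaml (hypothetical file name); apply with: kubectl apply -f database-creds.yaml
apiVersion: v1
kind: Secret
metadata:
  name: database-creds      # referenced as `path` in the function configuration
  namespace: pulsar-func    # must match the namespace where the function pod runs
type: Opaque
stringData:
  password: "change-me"     # placeholder; referenced as `key` in the function configuration
```

Using stringData lets you write the value in plain text, and Kubernetes stores it base64-encoded under data.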

Enable token authentication

When you use token authentication, TLS encryption, or custom authentication to secure the communication with your Pulsar cluster, Pulsar passes your certificate authority (CA) to the client, so the client can authenticate the cluster with your signed certificate.

To enable the authentication for your Pulsar cluster, you need to specify a mechanism for the pod running your function to authenticate the broker, by implementing the org.apache.pulsar.functions.auth.KubernetesFunctionAuthProvider interface.

  • For token authentication, Pulsar includes an implementation of the above interface to distribute the CA. The function worker captures the token used to deploy (or update) the function, saves it as a secret, and mounts it into the pod.

    The configuration in the conf/functions_worker.yml file is as follows. functionAuthProviderClassName specifies the fully qualified class name of this implementation.

        functionAuthProviderClassName: org.apache.pulsar.functions.auth.KubernetesSecretsTokenAuthProvider

  • For TLS or custom authentication, you can either implement the org.apache.pulsar.functions.auth.KubernetesFunctionAuthProvider interface or use an alternative mechanism.

Note

If the token you use to deploy the function has an expiration date, you may need to deploy the function again after it expires.

Customize Kubernetes runtime

Customizing Kubernetes runtime allows you to customize Kubernetes resources created by the runtime, including how to generate manifests, how to pass authenticated data to pods, and how to integrate secrets.

To customize Kubernetes runtime, set runtimeCustomizerClassName in the conf/functions_worker.yml file to the fully qualified class name of your customizer.

The function API provides a flag named customRuntimeOptions, which is passed to the org.apache.pulsar.functions.runtime.kubernetes.KubernetesManifestCustomizer interface. To initialize KubernetesManifestCustomizer, you can set runtimeCustomizerConfig in the conf/functions_worker.yml file.

Note

runtimeCustomizerConfig is the same across all functions. If you provide both runtimeCustomizerConfig and customRuntimeOptions, you need to decide how to manage these two configurations in your implementation of the KubernetesManifestCustomizer interface.

Pulsar includes a built-in implementation initialized with runtimeCustomizerConfig. It enables you to pass a JSON document as customRuntimeOptions with certain properties to augment. To use this built-in implementation, set runtimeCustomizerClassName to org.apache.pulsar.functions.runtime.kubernetes.BasicKubernetesManifestCustomizer.

If both runtimeCustomizerConfig and customRuntimeOptions are provided and have conflicts, BasicKubernetesManifestCustomizer uses customRuntimeOptions to override runtimeCustomizerConfig.
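
As a sketch of the worker-side settings, the fragment below enables the built-in customizer and gives it cluster-wide defaults. It assumes that runtimeCustomizerConfig accepts the same property names as customRuntimeOptions, which this section implies but does not state outright. The label and annotation values are placeholders.

```yaml
# Function worker configuration (fragment, illustrative values only)
runtimeCustomizerClassName: org.apache.pulsar.functions.runtime.kubernetes.BasicKubernetesManifestCustomizer
runtimeCustomizerConfig:
  # Assumption: same property names as customRuntimeOptions.
  # These defaults apply to all functions unless a function overrides them.
  nodeSelectorLabels:
    customLabel: "value"
  extraAnnotations:
    extraAnnotation: "value"
```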

Below is an example of configuring customRuntimeOptions.

    {
      "jobName": "jobname", // the k8s pod name to run this function instance
      "jobNamespace": "namespace", // the k8s namespace to run this function in
      "extraLabels": { // extra labels to attach to the statefulSet, service, and pods
        "extraLabel": "value"
      },
      "extraAnnotations": { // extra annotations to attach to the statefulSet, service, and pods
        "extraAnnotation": "value"
      },
      "nodeSelectorLabels": { // node selector labels to add on to the pod spec
        "customLabel": "value"
      },
      "tolerations": [ // tolerations to add to the pod spec
        {
          "key": "custom-key",
          "value": "value",
          "effect": "NoSchedule"
        }
      ],
      "resourceRequirements": { // values for cpu and memory should be defined as described here: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container
        "requests": {
          "cpu": 1,
          "memory": "4G"
        },
        "limits": {
          "cpu": 2,
          "memory": "8G"
        }
      }
    }
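
To show where such a document might go, the sketch below passes a shortened version of it in a function configuration file. It assumes that customRuntimeOptions is accepted there as a single string containing JSON, so verify this against your Pulsar version. The explanatory // comments are dropped because strict JSON does not allow comments.

```yaml
tenant: "mytenant"
namespace: "mynamespace"
name: "myfunction"
inputs: [ "persistent://mytenant/mynamespace/myfuncinput" ]
className: "com.company.pulsar.myfunction"
# Assumption: the JSON document is supplied as one string value
customRuntimeOptions: '{"jobNamespace": "namespace", "nodeSelectorLabels": {"customLabel": "value"}}'
```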

How to define Pulsar resource names when running Pulsar in Kubernetes

If you run Pulsar Functions or connectors on Kubernetes, you need to follow the Kubernetes naming convention to define the names of your Pulsar resources, whichever admin interface you use.

Kubernetes requires a name that can be used as a DNS subdomain name as defined in RFC 1123. Pulsar allows more characters than the Kubernetes naming convention does. If you create a Pulsar resource name with special characters that Kubernetes does not support (for example, colons in a Pulsar namespace name), Kubernetes runtime translates the Pulsar object names into RFC 1123-compliant Kubernetes resource labels, so you can still run functions or connectors using Kubernetes runtime. The rules for translating Pulsar object names into Kubernetes resource labels are as follows:

  • Truncate to 63 characters

  • Replace the following characters with dashes (-):

    • Non-alphanumeric characters

    • Underscores (_)

    • Dots (.)

  • Replace beginning and ending non-alphanumeric characters with 0

Tip

  • If you get an error in translating Pulsar object names into Kubernetes resource labels (for example, you may have a naming collision if your Pulsar object name is too long) or want to customize the translation rules, see Customize Kubernetes runtime.
  • For how to configure Kubernetes runtime, see instructions.