Run function workers separately

The following diagram illustrates how function workers run as separate processes on separate machines.

Function workers run separately in Pulsar

Note

The Service URLs in the illustration represent Pulsar service URLs that Pulsar client and Pulsar admin use to connect to a Pulsar cluster.

To set up function workers that run separately, complete the following steps:

Step 1: Configure function workers to run separately

Note

To run function workers separately, keep functionsWorkerEnabled at its default value (false) in the conf/broker.conf file.

Configure worker parameters

Configure the required parameters for workers in the conf/functions_worker.yml file.

  • workerId: The identity of a worker node, which is unique across clusters. The type is string.
  • workerHostname: The hostname of the worker node.
  • workerPort: The port that the worker server listens on. Keep the default value unless you need a custom port. Set it to null to disable the plaintext port.
  • workerPortTls: The TLS port that the worker server listens on. Keep the default value unless you need a custom port. For more information about TLS encryption settings, refer to Enable TLS encryption.
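As a rough illustration, the worker parameters above might look like the following in conf/functions_worker.yml. The worker ID and hostname are hypothetical placeholders, and the port numbers are assumed to match the defaults shipped with Pulsar, so check them against your own file.

```yaml
# conf/functions_worker.yml (illustrative values only)
workerId: functions-worker-1            # hypothetical ID; must be unique across clusters
workerHostname: worker-1.example.com    # hypothetical hostname of this worker node
workerPort: 6750                        # plaintext port (assumed default)
workerPortTls: 6751                     # TLS port (assumed default)
```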

Note

When accessing function workers to manage functions, the pulsar-admin CLI or any of the clients should use the configured workerHostname and workerPort to generate an --admin-url.

Configure function package parameters

Configure the numFunctionPackageReplicas parameter in the conf/functions_worker.yml file. It indicates the number of replicas to store function packages.

Note

To ensure high availability in a production deployment, set numFunctionPackageReplicas to the number of bookies. The default value, 1, is suitable only for a one-node cluster deployment.
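For instance, assuming a hypothetical production cluster with three bookies, the setting could be written as follows.

```yaml
# conf/functions_worker.yml
# Assumes a cluster with three bookies (illustrative; adjust to your own bookie count).
numFunctionPackageReplicas: 3
```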

Configure function metadata parameters

Configure the required parameters for function metadata in the conf/functions_worker.yml file.

  • pulsarServiceUrl: The Pulsar service URL for your broker cluster.
  • pulsarWebServiceUrl: The Pulsar web service URL for your broker cluster.
  • pulsarFunctionsCluster: Set the value to your Pulsar cluster name (same as the clusterName setting in the conf/broker.conf file).
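A minimal sketch of these three settings is shown here. It assumes a hypothetical broker reachable at broker.example.com on the commonly used binary (6650) and HTTP (8080) ports, and a cluster named my-cluster. None of these values come from your deployment, so replace them with your own.

```yaml
# conf/functions_worker.yml (illustrative values only)
pulsarServiceUrl: pulsar://broker.example.com:6650     # hypothetical broker service URL
pulsarWebServiceUrl: http://broker.example.com:8080    # hypothetical broker web service URL
pulsarFunctionsCluster: my-cluster                     # must match clusterName in conf/broker.conf
```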

If authentication is enabled on your broker cluster, you must configure the following authentication settings for the function workers to communicate with the brokers.

  • brokerClientAuthenticationEnabled: Whether to enable the broker client authentication used by function workers to talk to brokers.
  • clientAuthenticationPlugin: The authentication plugin to be used by the Pulsar client used in worker service.
  • clientAuthenticationParameters: The authentication parameter to be used by the Pulsar client used in worker service.

Enable security settings

When you run a function worker separately in a cluster configured with authentication, your function worker needs to communicate with the broker and authenticate incoming requests. Thus you need to configure the properties that the broker requires for authentication and authorization.

Note

You must configure both sides of the function worker: server-side authentication and authorization, so that the worker can authenticate incoming requests, and client-side authentication, so that the worker can authenticate itself to the broker.

For example, if you use token authentication, you need to configure the following properties in the conf/functions_worker.yml file.

    brokerClientAuthenticationPlugin: org.apache.pulsar.client.impl.auth.AuthenticationToken
    brokerClientAuthenticationParameters: file:///etc/pulsar/token/admin-token.txt
    configurationMetadataStoreUrl: zk:zookeeper-cluster:2181 # auth requires a connection to zookeeper
    authenticationProviders:
      - "org.apache.pulsar.broker.authentication.AuthenticationProviderToken"
    authorizationEnabled: true
    authenticationEnabled: true
    superUserRoles:
      - superuser
      - proxy
    properties:
      tokenSecretKey: file:///etc/pulsar/jwt/secret # if using a secret token, key file must be DER-encoded
      tokenPublicKey: file:///etc/pulsar/jwt/public.key # if using public/private key tokens, key file must be DER-encoded

You can enable the following security settings on function workers.

Enable TLS encryption

To enable TLS encryption, configure the following settings.

    useTLS: true
    pulsarServiceUrl: pulsar+ssl://localhost:6651/
    pulsarWebServiceUrl: https://localhost:8443
    tlsEnabled: true
    tlsCertificateFilePath: /path/to/functions-worker.cert.pem
    tlsKeyFilePath: /path/to/functions-worker.key-pk8.pem
    tlsTrustCertsFilePath: /path/to/ca.cert.pem
    # The path to trusted certificates used by the Pulsar client to authenticate with Pulsar brokers
    brokerClientTrustCertsFilePath: /path/to/ca.cert.pem

For more details on TLS encryption, refer to Transport Encryption using TLS.

Enable authentication providers

To enable authentication providers on function workers, set the authenticationProviders parameter to the providers you want to enable.

    authenticationEnabled: true
    authenticationProviders: [provider1, provider2]

For the mTLS authentication provider, follow the example below to add the required settings.

    brokerClientAuthenticationPlugin: org.apache.pulsar.client.impl.auth.AuthenticationTls
    brokerClientAuthenticationParameters: tlsCertFile:/path/to/admin.cert.pem,tlsKeyFile:/path/to/admin.key-pk8.pem
    authenticationEnabled: true
    authenticationProviders: ['org.apache.pulsar.broker.authentication.AuthenticationProviderTls']

For the SASL authentication provider, add saslJaasClientAllowedIds and saslJaasServerSectionName under properties.

    properties:
      saslJaasClientAllowedIds: .*pulsar.*
      saslJaasServerSectionName: Broker

For the token authentication provider, add the required settings under properties.

    properties:
      tokenSecretKey: file://my/secret.key
      # If using public/private
      # tokenPublicKey: file://path/to/public.key

Note

Key files must be DER (Distinguished Encoding Rules)-encoded.

Enable authorization providers

To enable authorization on function workers, complete the following steps.

  1. Configure authorizationEnabled, authorizationProvider and configurationMetadataStoreUrl in the conf/functions_worker.yml file. The authorization provider connects to configurationMetadataStoreUrl to receive namespace policies.

         authorizationEnabled: true
         authorizationProvider: org.apache.pulsar.broker.authorization.PulsarAuthorizationProvider
         configurationMetadataStoreUrl: <meta-type>:<configuration-metadata-store-url>

  2. Configure a list of superuser roles. The superuser roles can access any admin API. The following configuration is an example.

         superUserRoles:
           - role1
           - role2
           - role3

Configure BookKeeper authentication

If authentication is enabled on the BookKeeper cluster, you need to configure the following BookKeeper authentication settings for your function workers.

  • bookkeeperClientAuthenticationPlugin: The authentication plugin name of the BookKeeper client.
  • bookkeeperClientAuthenticationParametersName: The authentication plugin parameters of the BookKeeper client, including names and values.
  • bookkeeperClientAuthenticationParameters: The authentication plugin parameters of the BookKeeper client.

Step 2: Start function workers

Note

Before starting function workers, make sure the function runtime is configured.

  • To start a function worker in the background, use the pulsar-daemon CLI tool:

        bin/pulsar-daemon start functions-worker

  • To start a function worker in the foreground, use the pulsar CLI:

        bin/pulsar functions-worker

Step 3: Configure proxies for standalone function workers

When you run function workers in a separate cluster, the admin REST endpoints are split across two clusters, as shown in the following figure. The functions, function-worker, source, and sink endpoints are served by the worker cluster, while all other endpoints are served by the broker cluster. As a result, you must use the correct service URL for each pulsar-admin CLI command. To avoid this inconvenience, you can start a proxy cluster that serves as the central entry point of the admin service and routes admin REST requests to the right cluster.

assets/functions-worker-separated-proxy.svg

Tip

If you haven’t set up a proxy cluster yet, follow the instructions to deploy one.

To enable a proxy for routing function-related admin requests to function workers, edit the conf/proxy.conf file and modify the following settings:

    functionWorkerWebServiceURL=<pulsar-functions-worker-web-service-url>
    functionWorkerWebServiceURLTLS=<pulsar-functions-worker-web-service-url>