Connecting to Kubeflow Pipelines using the SDK client
How to connect to Kubeflow Pipelines using the SDK client and configure the SDK client using environment variables
This guide demonstrates how to connect to Kubeflow Pipelines using the Kubeflow Pipelines SDK client, and how to configure the SDK client using environment variables.
Before you begin
- You need a Kubeflow Pipelines deployment using one of the installation options.
- Install the Kubeflow Pipelines SDK.
Connect to Kubeflow Pipelines using the SDK
The Kubeflow Pipelines REST API is available at the same endpoint as the Kubeflow Pipelines user interface (UI). The SDK client can send requests to this endpoint to upload pipelines, create pipeline runs, schedule recurring runs and more.
Connect to Kubeflow Pipelines from outside your cluster
Kubeflow distributions secure the Kubeflow Pipelines public endpoint with authentication and authorization. Since Kubeflow distributions can have different authentication and authorization requirements, the steps needed to connect to your Kubeflow Pipelines instance might be different depending on the Kubeflow distribution you installed. Refer to documentation for your Kubeflow distribution:
- Connecting to Kubeflow Pipelines on Google Cloud using the SDK
- Kubeflow Pipelines on AWS
- Authentication using OIDC in Azure
- Pipelines on IBM Cloud Kubernetes Service (IKS)
For Kubeflow Pipelines standalone and Google Cloud AI Platform Pipelines, you can also connect to the API via kubectl port-forward.
Kubeflow Pipelines standalone deploys a Kubernetes service named ml-pipeline-ui in your Kubernetes cluster without extra authentication.
You can use kubectl port-forward to forward a port on your local machine, outside of the cluster, to the Kubernetes service:
# Change the namespace if you deployed Kubeflow Pipelines in a different
# namespace.
$ kubectl port-forward svc/ml-pipeline-ui 3000:80 --namespace kubeflow
You can verify that port forwarding is working properly by visiting http://localhost:3000 in your browser. If port forwarding is working properly, the Kubeflow Pipelines UI appears.
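If you prefer to check the forwarded port from a script instead of a browser, the following is a minimal sketch that uses only the Python standard library. The helper name is_endpoint_reachable is hypothetical and is not part of the Kubeflow Pipelines SDK. It only checks that some HTTP server answers at the URL, not that the server is Kubeflow Pipelines.

```python
import urllib.error
import urllib.request


def is_endpoint_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if an HTTP server answers at `url`, whatever the status code.

    Hypothetical helper for illustration; not part of the kfp SDK.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar problems.
        return False
```

With the port forwarding from the previous step running, is_endpoint_reachable('http://localhost:3000') should return True. If it returns False, check that the kubectl port-forward command is still running.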
Run the following Python code to instantiate the kfp.Client:
import kfp
client = kfp.Client(host='http://localhost:3000')
print(client.list_experiments())
Note: In multi-user mode, you cannot access the Kubeflow Pipelines API through kubectl port-forward, because the API requires authentication. Refer to the distribution-specific documentation listed above.
Connect to Kubeflow Pipelines from the same cluster
Note: This is not currently supported for multi-user Kubeflow Pipelines. Refer to Multi-User Isolation for Pipelines – Current Limitations.
As mentioned above, the Kubeflow Pipelines API Kubernetes service is ml-pipeline-ui.
Using the standard Kubernetes service discovery mechanisms, you can access the ml-pipeline-ui service from a Pod in the same namespace by DNS name:
import kfp
client = kfp.Client(host='http://ml-pipeline-ui:80')
print(client.list_experiments())
Alternatively, you can access the ml-pipeline-ui service by using environment variables:
import kfp
import os
host = os.getenv('ML_PIPELINE_UI_SERVICE_HOST')
port = os.getenv('ML_PIPELINE_UI_SERVICE_PORT')
client = kfp.Client(host=f'http://{host}:{port}')
print(client.list_experiments())
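In the environment-variable example above, os.getenv returns None when a variable is not set, which would silently produce a host such as http://None:None. The following sketch fails early instead. The helper name service_url_from_env is hypothetical. It assumes the usual Kubernetes convention that service variables are named after the upper-cased service name, with dashes replaced by underscores.

```python
import os
from typing import Mapping, Optional


def service_url_from_env(
    service_name: str = "ml-pipeline-ui",
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Build an http:// URL from Kubernetes service environment variables.

    Hypothetical helper for illustration; not part of the kfp SDK.
    """
    env = os.environ if env is None else env
    # ml-pipeline-ui -> ML_PIPELINE_UI
    prefix = service_name.upper().replace("-", "_")
    host = env.get(f"{prefix}_SERVICE_HOST")
    port = env.get(f"{prefix}_SERVICE_PORT")
    if not host or not port:
        raise RuntimeError(
            f"{prefix}_SERVICE_HOST and {prefix}_SERVICE_PORT must both be set; "
            "they are normally only present inside a Pod in the service's namespace."
        )
    return f"http://{host}:{port}"
```

You could then pass the result to the client, for example kfp.Client(host=service_url_from_env()).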
When accessing Kubeflow Pipelines from a Pod in a different namespace, you must address the service by both its name and its namespace:
import kfp
namespace = 'kubeflow'  # or the namespace where you deployed Kubeflow Pipelines
client = kfp.Client(host=f'http://ml-pipeline-ui.{namespace}:80')
print(client.list_experiments())
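The same-namespace and cross-namespace host strings differ only in the namespace suffix, so you can build both with one small function. This is a sketch; the helper name pipelines_dns_host is hypothetical, and the defaults simply mirror the service name and port used in this guide.

```python
from typing import Optional


def pipelines_dns_host(
    namespace: Optional[str] = None,
    service: str = "ml-pipeline-ui",
    port: int = 80,
) -> str:
    """Return the in-cluster URL of the service.

    Hypothetical helper for illustration; not part of the kfp SDK.
    Without a namespace, the short name only resolves from a Pod in the
    same namespace as the service.
    """
    name = service if namespace is None else f"{service}.{namespace}"
    return f"http://{name}:{port}"
```

For example, pipelines_dns_host() gives the same-namespace form, and pipelines_dns_host('kubeflow') gives the form used in the preceding example.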
Configure the SDK client using environment variables
It’s usually beneficial to configure the Kubeflow Pipelines SDK client using Kubeflow Pipelines environment variables, so that you can instantiate kfp.Client instances without any explicit arguments.
For example, when the API endpoint is http://localhost:3000, run the following to configure environment variables in bash:
export KF_PIPELINES_ENDPOINT=http://localhost:3000
Or configure it in a Jupyter Notebook by using the IPython built-in %env magic command:
%env KF_PIPELINES_ENDPOINT=http://localhost:3000
Then you can use the SDK client without explicit arguments:
import kfp
# When not specified, host defaults to env var KF_PIPELINES_ENDPOINT.
# This is now equivalent to `client = kfp.Client(host='http://localhost:3000')`
client = kfp.Client()
print(client.list_experiments())
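The defaulting rule described in the comments above can be pictured with a short sketch. The function resolve_endpoint is hypothetical and is not the SDK's actual implementation. It only illustrates the precedence stated here: an explicit host wins, otherwise the value of KF_PIPELINES_ENDPOINT is used.

```python
import os
from typing import Mapping, Optional

ENDPOINT_ENV_VAR = "KF_PIPELINES_ENDPOINT"


def resolve_endpoint(
    host: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Pick the endpoint: an explicit `host` first, then the environment variable.

    Hypothetical helper for illustration; not the kfp SDK's own logic.
    Returns None when neither source provides a value.
    """
    env = os.environ if env is None else env
    if host:
        return host
    return env.get(ENDPOINT_ENV_VAR) or None
```

In a plain Python script, where neither export nor %env is convenient, you can likely get the same effect by setting os.environ['KF_PIPELINES_ENDPOINT'] before you create the kfp.Client instance.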
For more configurable environment variables, refer to the kfp.Client reference documentation.
Next Steps
- Using the Kubeflow Pipelines SDK
- Kubeflow Pipelines SDK Reference
- Experiment with the Kubeflow Pipelines API
Last modified 27.05.2021: doc(kfp): connecting KFP SDK client to API generic introduction (#2729) (4636ab5d)