Upgrading and Reinstalling
How to upgrade or reinstall your Kubeflow Pipelines deployment
Starting from Kubeflow v0.5, Kubeflow Pipelines persists the pipeline data in permanent storage volumes. Kubeflow Pipelines therefore supports the following capabilities:
- Reinstall: You can delete a cluster and create a new cluster, specifying the existing storage volumes to retrieve the original data in the new cluster.
- Upgrade (limited support): The full Kubeflow deployment currently supports upgrading in Alpha status with limited support. Check the following sources for progress updates:
Before you start
This guide tells you how to reinstall Kubeflow Pipelines as part of a full Kubeflow deployment. This guide therefore assumes that you want to use one of the options in the Kubeflow deployment guide to deploy Kubeflow Pipelines with Kubeflow.
Note the following alternatives:
- Instead of the full Kubeflow deployment, you can use Kubeflow Pipelines Standalone, which does support upgrading. See how to upgrade the Kubeflow Pipelines Standalone deployment.
- If you’re using Kubeflow Pipelines on Google Cloud Platform (GCP), see how to upgrade or reinstall Kubeflow Pipelines on GCP.
Kubeflow Pipelines data storage
Kubeflow Pipelines creates and manages the following data related to your machine learning pipeline:
- Metadata: Experiments, jobs, runs, etc. Kubeflow Pipelines stores the pipeline metadata in a MySQL database.
- Artifacts: Pipeline packages, metrics, views, etc. Kubeflow Pipelines stores the artifacts in a Minio server.
Kubeflow Pipelines uses the Kubernetes PersistentVolume (PV) subsystem to provision the MySQL database and the Minio server. You can specify your own preferred PV.
Deploying Kubeflow
This section describes how to deploy Kubeflow in a way that ensures you can use the Kubeflow Pipelines reinstallation capability.
If you don’t need custom storage and are happy with the default PVs that Kubeflow provides, you can follow the Kubeflow deployment guide without doing anything extra. The deployment process uses the Kubernetes default StorageClass to provision the PVs for you.
If you want to specify a custom PV:
- Create your Kubernetes cluster if you don’t already have one. See the Kubernetes documentation.
- Create two PVs in your Kubernetes cluster with your preferred storage type. See the Kubernetes guide to PVs.
- Follow the Kubeflow deployment guide, but note the following changes to the standard procedure. Before running the kfctl apply command:
  - Edit ${KF_DIR}/kustomize/minio/overlays/minioPd/params.env and specify the PV for the Minio server:
      ...
      minioPd=[YOUR-PRE-CREATED-MINIO-PV-NAME]
      ...
  - Edit ${KF_DIR}/kustomize/mysql/overlays/mysqlPd/params.env and specify the PV for the MySQL database:
      ...
      mysqlPd=[YOUR-PRE-CREATED-MYSQL-PV-NAME]
      ...
- Run the kfctl apply command to deploy Kubeflow as usual:
    kfctl apply -V -f ${CONFIG_FILE}
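The two params.env edits described above can also be scripted. The following is a minimal sketch and not part of Kubeflow or kfctl: set_param is a hypothetical helper, and it assumes that each params.env file holds one key=value entry per line. The PV names in the usage comments are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set KEY=VALUE in a params.env-style file.
# Assumes one key=value entry per line, and a value without "|" or "&"
# (true for Kubernetes PV names, which are DNS-style names).
set_param() {
  local file="$1" key="$2" value="$3"
  if grep -q "^${key}=" "$file"; then
    # Key already present: rewrite that line via a temporary file.
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "${file}.tmp" \
      && mv "${file}.tmp" "$file"
  else
    # Key missing: append a new entry.
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Example usage before running kfctl apply (placeholder PV names):
# set_param "${KF_DIR}/kustomize/minio/overlays/minioPd/params.env" minioPd my-minio-pv
# set_param "${KF_DIR}/kustomize/mysql/overlays/mysqlPd/params.env" mysqlPd my-mysql-pv
```

Rewriting through a temporary file avoids relying on sed -i, whose syntax differs between GNU and BSD sed.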
Reinstalling Kubeflow Pipelines
You can delete a Kubeflow cluster and create a new one, specifying your existing storage to retrieve the original data in the new cluster.
Notes:
- You must use command-line deployment. You cannot reinstall Kubeflow Pipelines using the web interface.
- When you run kfctl apply or kfctl build, you should use a different deployment name from your existing deployment name. Using a different name ensures that your data is safe in case of a deployment failure. This guide defines the deployment name in the ${KF_NAME} environment variable.
- The reinstallation steps are the same as for a standard Kubeflow installation, except that you must use the same PV definitions as in your previous deployment to create the PV in the new cluster.
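The deployment-name rule in the notes above can be enforced with a small guard before you run kfctl. This is only a sketch: require_new_name is a hypothetical function, and OLD_KF_NAME is an assumed variable that you would set yourself to the name of the previous deployment. Neither is defined by Kubeflow.

```shell
#!/usr/bin/env bash
# Hypothetical guard: succeed only if the new deployment name is non-empty
# and differs from the previous deployment name.
require_new_name() {
  local old="$1" new="$2"
  if [ -z "$new" ] || [ "$new" = "$old" ]; then
    echo "KF_NAME must be set and must differ from the previous deployment name" >&2
    return 1
  fi
}

# Example usage (OLD_KF_NAME is an assumed variable, not set by Kubeflow):
# require_new_name "${OLD_KF_NAME}" "${KF_NAME}" && kfctl apply -V -f "${CONFIG_FILE}"
```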
To reinstall Kubeflow Pipelines:
- Create two PVs in your Kubernetes cluster, using the same PV definitions as in your previous deployment. See the Kubernetes guide to PVs.
- Follow the Kubeflow deployment guide, but note the following changes to the standard procedure.
  - Set a different ${KF_NAME} name from your existing ${KF_NAME}.
  - Before running the kfctl apply command:
    - Edit ${KF_DIR}/kustomize/minio/overlays/minioPd/params.env and specify the PV for the Minio server:
        ...
        minioPd=[YOUR-PRE-CREATED-MINIO-PV-NAME]
        ...
    - Edit ${KF_DIR}/kustomize/mysql/overlays/mysqlPd/params.env and specify the PV for the MySQL database:
        ...
        mysqlPd=[YOUR-PRE-CREATED-MYSQL-PV-NAME]
        ...
- Run the kfctl apply command to deploy Kubeflow as usual:
    kfctl apply -V -f ${CONFIG_FILE}
You should now have a new Kubeflow deployment that uses the same pipelines data storage as your previous deployment. Follow any remaining steps in the Kubeflow deployment guide to check your deployment, depending on the deployment option you chose.
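If the directory of your previous deployment is still on disk, you can read the storage names back from its params.env files so that the new deployment reuses exactly the same values. The following is a sketch under that assumption: get_param is a hypothetical helper, OLD_KF_DIR is an assumed variable pointing at the previous deployment directory, and the files are assumed to hold one key=value entry per line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value stored for KEY in a params.env-style
# file. Prints nothing if the key is absent.
get_param() {
  local file="$1" key="$2"
  # Strip the "KEY=" prefix from matching lines and keep the first match.
  sed -n "s/^${key}=//p" "$file" | head -n 1
}

# Example usage (OLD_KF_DIR is an assumed variable, not set by Kubeflow):
# get_param "${OLD_KF_DIR}/kustomize/minio/overlays/minioPd/params.env" minioPd
# get_param "${OLD_KF_DIR}/kustomize/mysql/overlays/mysqlPd/params.env" mysqlPd
```

Comparing these values with the entries in the new ${KF_DIR} before running kfctl apply is one way to confirm that both deployments point at the same storage.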