Upgrading and Reinstalling
How to upgrade or reinstall your Kubeflow Pipelines deployment
Starting from Kubeflow version 0.5, Kubeflow Pipelines persists the pipeline data in a permanent storage volume. Kubeflow Pipelines therefore supports the following capability:
- Reinstall: You can delete a cluster and create a new cluster, specifying the storage to retrieve the original data in the new cluster.
Note that upgrade isn’t currently supported. Check this issue for progress.
Context
Kubeflow Pipelines creates and manages the following data related to your machine learning pipeline:
- Metadata: Experiments, jobs, runs, etc. Kubeflow Pipelines stores the pipeline metadata in a MySQL database.
- Artifacts: Pipeline packages, metrics, views, etc. Kubeflow Pipelines stores the artifacts in a Minio server.
The MySQL database and the Minio server are both backed by the Kubernetes PersistentVolume (PV) subsystem.
- If you are deploying to Google Cloud Platform (GCP), Kubeflow Pipelines creates a Compute Engine Persistent Disk (PD) and mounts it as a PV.
- If you are not deploying to GCP, you can specify your own preferred PV.
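To see which volumes currently back the MySQL database and the Minio server, you can list the claims and volumes with kubectl. The helper below is a minimal sketch: the function name is hypothetical, and the default namespace `kubeflow` is an assumption that may not match your deployment.

```shell
# Hypothetical helper (not part of Kubeflow): list the PersistentVolumeClaims
# in the given namespace, then all PersistentVolumes in the cluster.
# The default namespace "kubeflow" is an assumption; pass another if needed.
show_pipeline_storage() {
  namespace="${1:-kubeflow}"
  # Claims are namespaced, so a namespace is required here.
  kubectl get pvc --namespace "$namespace"
  # PersistentVolumes are cluster-scoped, so no namespace is given.
  kubectl get pv
}

# Example usage:
# show_pipeline_storage            # uses the assumed "kubeflow" namespace
# show_pipeline_storage my-ns      # uses a namespace you specify
```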
Deploying Kubeflow
This section describes how to deploy Kubeflow in a way that ensures you can use the Kubeflow Pipelines reinstallation capability.
Deploying Kubeflow on GCP
Follow the guide to deploying Kubeflow on GCP. You don’t need to do anything extra.
When the deployment has finished, you can see two entries in the GCP Deployment Manager, one for deploying the cluster and one for deploying the storage.
The entry suffixed with -storage creates one PD for the metadata store and one for the artifact store.
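The same information should also be available from the command line. The sketch below lists the Deployment Manager entries and then the resources created by the storage entry. The function name is hypothetical, and it assumes the storage entry is named after your deployment with -storage appended; check the first command's output if your names differ.

```shell
# Hypothetical helper: inspect the Deployment Manager entries for a Kubeflow
# deployment. Assumes the storage entry is named "<deployment-name>-storage".
show_kubeflow_deployments() {
  name="$1"
  # List all Deployment Manager entries in the current project.
  gcloud deployment-manager deployments list
  # List the resources (including the disks) created by the storage entry.
  gcloud deployment-manager resources list --deployment "${name}-storage"
}

# Example usage (the deployment name is a placeholder):
# show_kubeflow_deployments my-kubeflow
```

The disk names shown for the storage entry are the values needed later if you reinstall Kubeflow Pipelines on GCP.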
Deploying Kubeflow in other environments (non-GCP)
The steps below assume that you already have a Kubernetes cluster set up.
If you don’t need custom storage and are happy with the default PVs that Kubeflow provides, you can follow the Kubeflow quick start without doing anything extra. The deployment script uses the Kubernetes default StorageClass to provision the PVs for you.
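Because this path depends on a default StorageClass, it can help to confirm that the cluster has one before deploying. The check below is a sketch with a hypothetical function name; it relies on kubectl marking the default class with "(default)" in its table output.

```shell
# Hypothetical helper: succeed only if the cluster has a default StorageClass.
# kubectl appends "(default)" to the name of the default class in its listing.
has_default_storageclass() {
  kubectl get storageclass | grep -q '(default)'
}

# Example usage:
# if has_default_storageclass; then
#   echo "Default StorageClass found; default PVs can be provisioned."
# else
#   echo "No default StorageClass; consider creating custom PVs." >&2
# fi
```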
If you want to specify a custom PV:
Create two PVs in your Kubernetes cluster with your preferred storage type. See the Kubernetes guide to PVs.
Follow the Kubeflow quick start, but note the following change to the standard procedure:
Before running the apply command:
kfctl apply all -V
You should first edit the following files to specify your PVs:
${KFAPP}/kustomize/minio/overlays/minioPd/params.env
...
minioPd=[YOUR-PRE-CREATED-MINIO-PV-NAME]
...
${KFAPP}/kustomize/mysql/overlays/mysqlPd/params.env
...
mysqlPd=[YOUR-PRE-CREATED-MYSQL-PV-NAME]
...
- Then run the apply command as usual:
kfctl apply k8s
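The params.env edits above can also be scripted. The helper below is a minimal sketch, not part of kfctl: the function name is hypothetical, and it assumes each params.env file holds simple KEY=VALUE lines and that the PV names contain no characters special to sed (such as | or &).

```shell
# Hypothetical helper: set KEY=VALUE in a params.env file.
# If KEY already exists, its line is rewritten; otherwise the pair is appended.
set_env_param() {
  file="$1"; key="$2"; value="$3"
  tmp="${file}.tmp"
  if grep -q "^${key}=" "$file"; then
    # Rewrite only the matching line; all other lines are kept unchanged.
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp" && mv "$tmp" "$file"
  else
    # KEY is not present yet, so add it at the end of the file.
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Example usage (the PV names are placeholders for PVs you created beforehand):
# set_env_param "${KFAPP}/kustomize/minio/overlays/minioPd/params.env" minioPd my-minio-pv
# set_env_param "${KFAPP}/kustomize/mysql/overlays/mysqlPd/params.env" mysqlPd my-mysql-pv
```

Writing to a temporary file and moving it into place avoids relying on sed -i, whose syntax differs between GNU and BSD sed.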
Reinstalling Kubeflow Pipelines
You can delete a Kubeflow cluster and create a new one, specifying your existing storage to retrieve the original data in the new cluster.
Note: You must use command line deployment. You cannot reinstall Kubeflow Pipelines using the web interface.
Reinstalling Kubeflow Pipelines on GCP
To reinstall Kubeflow Pipelines, follow the command line deployment instructions, but note the following change in the procedure:
Warning: When you run kfctl init ${KFAPP} --other-flags, use a different ${KFAPP} name from your existing ${KFAPP}. Otherwise, the data in your existing PDs will be deleted during kfctl apply all -V.
Before running the following apply command:
kfctl apply all -V
You should first:
- Edit gcp_config/storage-kubeflow.yaml to skip creating new storage:
...
createPipelinePersistentStorage: false
...
- Edit the following files to specify the persistent disks created in a previous deployment:
${KFAPP}/kustomize/minio/overlays/minioPd/params.env
...
minioPd=[NAME-OF-ARTIFACT-STORAGE-DISK]
...
${KFAPP}/kustomize/mysql/overlays/mysqlPd/params.env
...
mysqlPd=[NAME-OF-METADATA-STORAGE-DISK]
...
- Then run the apply command:
kfctl apply all -V
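Two of the steps above lend themselves to small safety checks. The sketch below shows a guard against reusing the old ${KFAPP} name and a way to set createPipelinePersistentStorage to false without opening an editor. Both function names are hypothetical and not part of kfctl, and the guard compares names as plain strings, so it assumes both are given in the same form.

```shell
# Hypothetical guard: fail if the new app name equals the existing one,
# because reusing the name can delete the data in the existing PDs.
check_new_kfapp_name() {
  new_name="$1"; old_name="$2"
  if [ "$new_name" = "$old_name" ]; then
    echo "Refusing to reuse KFAPP name '$new_name': existing PDs could be deleted." >&2
    return 1
  fi
}

# Hypothetical helper: set createPipelinePersistentStorage to false in the
# given storage config file, keeping the line's original indentation.
disable_storage_creation() {
  file="$1"
  tmp="${file}.tmp"
  sed 's|^\([[:space:]]*createPipelinePersistentStorage:\).*|\1 false|' "$file" > "$tmp" \
    && mv "$tmp" "$file"
}

# Example usage (names are placeholders):
# check_new_kfapp_name "$NEW_KFAPP" "$OLD_KFAPP" || exit 1
# disable_storage_creation gcp_config/storage-kubeflow.yaml
```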
Reinstalling Kubeflow in other environments (non-GCP)
The steps are the same as for any non-GCP installation, except that you must use the same PV definitions as in your previous deployment to create the PV in the new cluster.
Create two PVs in your Kubernetes cluster, using the same PV definitions as in your previous deployment. See the Kubernetes guide to PVs.
Follow the Kubeflow quick start, but note the following change to the standard procedure:
Before running the apply command:
kfctl apply k8s
You should first edit the following files to specify your PVs:
${KFAPP}/kustomize/minio/overlays/minioPd/params.env
...
minioPd=[YOUR-PRE-CREATED-MINIO-PV-NAME]
...
${KFAPP}/kustomize/mysql/overlays/mysqlPd/params.env
...
mysqlPd=[YOUR-PRE-CREATED-MYSQL-PV-NAME]
...
- Then run the apply command:
kfctl apply k8s
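If you no longer have the PV definitions from your previous deployment, one way to recover them is to export them from the old cluster before deleting it. The helper below is a sketch with a hypothetical function name. The exported YAML includes cluster-specific fields (for example status, metadata.uid, metadata.resourceVersion and spec.claimRef) that you will likely need to remove before creating the PV in the new cluster.

```shell
# Hypothetical helper: save the definition of an existing PV to a YAML file
# so that it can be reviewed and reused in the new cluster.
export_pv_definition() {
  pv_name="$1"; out_dir="$2"
  mkdir -p "$out_dir"
  # PersistentVolumes are cluster-scoped, so no namespace is needed.
  kubectl get pv "$pv_name" -o yaml > "${out_dir}/${pv_name}.yaml"
}

# Example usage in the old cluster (PV names are placeholders):
# export_pv_definition my-minio-pv pv-backup
# export_pv_definition my-mysql-pv pv-backup
#
# In the new cluster, after reviewing and cleaning the files:
# kubectl apply -f pv-backup/my-minio-pv.yaml
# kubectl apply -f pv-backup/my-mysql-pv.yaml
```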