Deploy using CLI
Instructions for using the CLI to deploy Kubeflow on Google Cloud Platform (GCP)
This guide describes how to use the kfctl command line interface (CLI) to deploy Kubeflow on GCP. The command line deployment gives you more control over the deployment process and configuration than you get if you use the deployment UI. If you’re looking for a simpler deployment procedure, see how to deploy Kubeflow using the deployment UI.
Before you start
Before installing Kubeflow on the command line:
Ensure you have installed the following tools:
If you’re using Cloud Shell, enable boost mode.
If you want to use Cloud Identity-Aware Proxy (Cloud IAP) for access control, follow the guide to setting up OAuth credentials. Cloud IAP is recommended for production deployments or deployments with access to sensitive data. Alternatively, you can use basic authentication with a username and password.
Deploy Kubeflow
Follow these steps to deploy Kubeflow:
- Create user credentials. You only need to run this command once:
gcloud auth application-default login
- Create environment variables for your access control services:
# If using Cloud IAP, create environment variables from the
# OAuth client ID and secret that you obtained earlier:
export CLIENT_ID=<CLIENT_ID from OAuth page>
export CLIENT_SECRET=<CLIENT_SECRET from OAuth page>
# If using basic authentication, create environment variables for
# username and password:
export KUBEFLOW_USERNAME=<your username>
export KUBEFLOW_PASSWORD=<your password>
- Download a kfctl release from the Kubeflow releases page. Unpack the tarball:
tar -xvf kfctl_<release tag>_<platform>.tar.gz
- Run the following commands to set up and deploy Kubeflow. The code below includes an optional command to add the kfctl binary to your path. If you don’t add the binary to your path, you must use the full path to the kfctl binary each time you run it.
# The following command is optional, to make kfctl binary easier to use.
export PATH=$PATH:<path to your kfctl file>
# Set KFAPP to the name of your Kubeflow application. See detailed
# description in the text below this code snippet.
# For example, 'kubeflow-test' or 'kfw-test'.
export KFAPP=<your choice of application directory name>
export ZONE=<your target GCP zone> # where the deployment will be created
export PROJECT=<your GCP project ID>
# Run the following commands for the default installation which uses Cloud IAP:
export CONFIG="https://raw.githubusercontent.com/kubeflow/kubeflow/c54401e/bootstrap/config/kfctl_gcp_iap.0.6.2.yaml"
kfctl init ${KFAPP} --project=${PROJECT} --config=${CONFIG} -V
# Alternatively, run these commands if you want to use basic authentication:
export CONFIG="https://raw.githubusercontent.com/kubeflow/kubeflow/c54401e/bootstrap/config/kfctl_gcp_basic_auth.0.6.2.yaml"
kfctl init ${KFAPP} --project=${PROJECT} --config=${CONFIG} -V --use_basic_auth
cd ${KFAPP}
kfctl generate all -V --zone ${ZONE}
kfctl apply all -V
- ${KFAPP} - the name of a directory where you want Kubeflow configurations to be stored. This directory is created when you run kfctl init. If you want a custom deployment name, specify that name here. The value of this variable becomes the name of your deployment. The value of KFAPP must consist of lower case alphanumeric characters or ‘-’, and must start and end with an alphanumeric character. For example, ‘kubeflow-test’ or ‘kfw-test’. The value of this variable cannot be longer than 25 characters. It must contain just the directory name, not the full path to the directory. The content of this directory is described in the App layout section below.
- ${PROJECT} - the project ID of the GCP project where you want Kubeflow deployed.
- ${ZONE} - the GCP zone where you want the deployment to be created. You can see a list of zones in the GCP documentation. If you plan to use accelerators, make sure to pick a zone that supports the type you want.
- When you run kfctl init you need to choose to use either IAP or basic authentication, as described above.
- kfctl generate all attempts to fetch your email address from your credential. If it can’t find a valid email address, you need to pass a valid email address with the flag --email <your email address>. This email address becomes an administrator in the configuration of your Kubeflow deployment.
- The deployment process creates a separate deployment for your data storage. After running kfctl apply you should notice two new deployments:
  - {KFAPP}-storage: This deployment has persistent volumes for your pipelines.
  - {KFAPP}: This deployment has all the components of Kubeflow, including a GKE cluster named ${KFAPP} with Kubeflow installed.
- When the deployment finishes, check the resources installed in the namespace kubeflow in your new cluster. To do this from the command line, first set your kubectl credentials to point to the new cluster:
gcloud container clusters get-credentials ${KFAPP} --zone ${ZONE} --project ${PROJECT}
Then see what’s installed in the kubeflow namespace of your GKE cluster:
```
kubectl -n kubeflow get all
```
- Access the Kubeflow central dashboard at the following URI when it becomes available:
https://<KFAPP>.endpoints.<project-id>.cloud.goog/
- It can take 20 minutes for the URI to become available. Kubeflow needs to provision a signed SSL certificate and register a DNS name.
- If you own/manage the domain or a subdomain with Cloud DNS then you can configure this process to be much faster. See kubeflow/kubeflow#731.
- We recommend that you check in the contents of your ${KFAPP} directory into source control.
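The naming rules for KFAPP and the dashboard URI pattern described above can be checked with a short script before you deploy. The following is a minimal sketch in POSIX shell and is not part of kfctl; the function names validate_kfapp and dashboard_url are hypothetical helpers introduced here for illustration.

```shell
# Hypothetical helper (not part of kfctl): checks a candidate KFAPP value
# against the naming rules listed in this guide.
validate_kfapp() {
  name="$1"
  # Must be a bare directory name, not a path.
  case "$name" in
    */*) echo "invalid: use a directory name, not a path"; return 1 ;;
  esac
  # Must not exceed 25 characters.
  if [ "${#name}" -gt 25 ]; then
    echo "invalid: longer than 25 characters"; return 1
  fi
  # Lower case alphanumerics or '-', starting and ending with an alphanumeric.
  # Characters are listed explicitly so the check does not depend on locale.
  case "$name" in
    ""|-*|*-|*[!abcdefghijklmnopqrstuvwxyz0123456789-]*)
      echo "invalid: use lower case letters, digits, or '-'"; return 1 ;;
  esac
  echo "ok"
}

# Hypothetical helper: builds the dashboard URI from KFAPP and the project ID,
# following the URI pattern shown in the deployment steps.
dashboard_url() {
  printf 'https://%s.endpoints.%s.cloud.goog/\n' "$1" "$2"
}
```

For example, validate_kfapp kubeflow-test prints ok, while validate_kfapp Kubeflow_Test prints an error message and returns a non-zero status.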
Understanding the deployment process
The kfctl deployment process is controlled by the following commands:
- init - performs a one-time setup.
- generate - creates configuration files defining the various resources.
- apply - creates or updates the resources.
- delete - deletes the resources.
With the exception of init, all commands take an argument which describes the set of resources to apply the command to. This argument can be one of the following:
- platform - all GCP resources; that is, anything that doesn’t run on Kubernetes.
- k8s - all resources that run on Kubernetes.
- all - all GCP and Kubernetes resources.
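As an illustration of the resource argument, the single generate all and apply all calls from the deployment steps could be split into a platform stage followed by a k8s stage. This is a sketch only: the run and staged_deploy helpers are hypothetical, and reusing the -V and --zone flags from the earlier generate all call for the per-stage commands is an assumption. By default the script only prints the commands instead of running them.

```shell
# Hypothetical wrapper: prints the command when DRY_RUN is 1 (the default),
# and executes it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: deploys in two stages, from inside the ${KFAPP} directory.
# Assumption: the flags mirror the `kfctl generate all -V --zone ${ZONE}` call.
staged_deploy() {
  zone="$1"
  # Stage 1: GCP resources, that is, anything that doesn't run on Kubernetes.
  run kfctl generate platform -V --zone "$zone"
  run kfctl apply platform -V
  # Stage 2: resources that run on Kubernetes.
  run kfctl generate k8s -V
  run kfctl apply k8s -V
}
```

Running staged_deploy "${ZONE}" with DRY_RUN unset prints the four commands in order, so you can review them before setting DRY_RUN=0.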
App layout
Your Kubeflow app directory ${KFAPP} contains the following files and directories:
- app.yaml defines configurations related to your Kubeflow deployment.
  - The values are set when you run kfctl init.
  - The values are snapshotted inside app.yaml to make your app self-contained.
- gcp_config is a directory that contains Deployment Manager configuration files defining your GCP infrastructure.
  - The directory is created when you run kfctl generate platform.
  - You can modify these configurations to customize your GCP infrastructure.
- kustomize is a directory that contains the kustomize packages for Kubeflow applications. See how Kubeflow uses kustomize.
  - The directory is created when you run kfctl generate.
  - You can customize the Kubernetes resources by modifying the manifests and running kfctl apply again.
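To confirm that an app directory has the layout described above, for example before checking it in to source control, a small check along these lines may help. It is a minimal sketch; check_kfapp_layout is a hypothetical helper, not a kfctl command, and it only tests that the three entries exist.

```shell
# Hypothetical helper: reports whether the expected entries exist in an
# app directory. Returns non-zero if any entry is missing.
check_kfapp_layout() {
  dir="$1"
  rc=0
  # app.yaml is written by `kfctl init`.
  if [ -f "$dir/app.yaml" ]; then
    echo "found app.yaml"
  else
    echo "missing app.yaml (created by kfctl init)"
    rc=1
  fi
  # gcp_config and kustomize are written by `kfctl generate`.
  for sub in gcp_config kustomize; do
    if [ -d "$dir/$sub" ]; then
      echo "found $sub/"
    else
      echo "missing $sub/ (created by kfctl generate)"
      rc=1
    fi
  done
  return "$rc"
}
```

Call it as check_kfapp_layout "${KFAPP}" from the directory that contains your app directory.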
GCP service accounts
Creating a deployment using kfctl creates three service accounts in your GCP project. These service accounts are created using the principle of least privilege. The three service accounts are:
- ${KFAPP}-admin is used for some admin tasks like configuring the load balancers. The principle is that this account is needed to deploy Kubeflow but not needed to actually run jobs.
- ${KFAPP}-user is intended to be used by training jobs and models to access GCP resources (Cloud Storage, BigQuery, etc.). This account has a much smaller set of privileges compared to admin.
- ${KFAPP}-vm is used only for the virtual machine (VM) service account. This account has the minimal permissions needed to send metrics and logs to Stackdriver.
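To see which service account emails to expect for a given deployment, the names above can be expanded with a short script. This is a sketch under one assumption: that the accounts use the standard GCP email form <name>@<project-id>.iam.gserviceaccount.com. The helper name kfapp_service_accounts is hypothetical.

```shell
# Hypothetical helper: prints the expected service account emails for a
# deployment, assuming the standard <name>@<project>.iam.gserviceaccount.com form.
kfapp_service_accounts() {
  kfapp="$1"
  project="$2"
  for role in admin user vm; do
    echo "${kfapp}-${role}@${project}.iam.gserviceaccount.com"
  done
}

# To compare against what exists in the project (requires gcloud and access
# to the project):
#   gcloud iam service-accounts list --project "${PROJECT}" --format="value(email)"
```

For example, kfapp_service_accounts "${KFAPP}" "${PROJECT}" prints one line per account, which you can compare with the output of the gcloud command in the comment.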
Next steps
- Run a full ML workflow on Kubeflow, using the end-to-end MNIST tutorial or the GitHub issue summarization example.
- See how to delete your Kubeflow deployment using the CLI.
- See how to customize your Kubeflow deployment.
- See how to upgrade Kubeflow and how to upgrade or reinstall a Kubeflow Pipelines deployment.
- Troubleshoot any issues you may find.