Authenticating Kubeflow to GCP

Authentication and authorization to Google Cloud Platform (GCP)

This page describes in-cluster and local authentication for Kubeflow GCP deployments.

In-cluster authentication

Starting from Kubeflow v0.6, you consume Kubeflow from custom namespaces (that is, namespaces other than kubeflow). The kubeflow namespace is only for running Kubeflow system components. Individual jobs and model deployments run in separate namespaces. To do this, install GCP credentials into the new namespace.

Starting in Kubeflow v0.7: Google Kubernetes Engine (GKE) workload identity

Starting in v0.7, Kubeflow uses the new GKE feature: workload identity. This is the recommended way to access GCP APIs from your GKE cluster. You no longer have to download a GCP service account key. Instead, you can configure a Kubernetes service account (KSA) to act as a GCP service account (GSA).

If you deployed Kubeflow following the GCP instructions, then the profile controller automatically binds the “default-editor” service account for every profile namespace to a default GCP service account created during Kubeflow deployment. The Kubeflow deployment process also creates a default profile for the cluster admin.

For more information about profiles, see the Multi-user isolation page.

Here is an example profile spec:

    apiVersion: kubeflow.org/v1beta1
    kind: Profile
    spec:
      plugins:
      - kind: WorkloadIdentity
        spec:
          gcpServiceAccount: ${SANAME}@${PROJECT}.iam.gserviceaccount.com
    ...

You can verify that there is a KSA called default-editor and that it has an annotation of the corresponding GSA:

    kubectl -n ${PROFILE_NAME} describe serviceaccount default-editor
    ...
    Name: default-editor
    Annotations: iam.gke.io/gcp-service-account: ${KFNAME}-user@${PROJECT}.iam.gserviceaccount.com
    ...

You can double-check that the GSA is also properly set up:

    gcloud --project=${PROJECT} iam service-accounts get-iam-policy ${KFNAME}-user@${PROJECT}.iam.gserviceaccount.com

When a pod uses KSA default-editor, it can access GCP APIs with the role granted to the GSA.
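
The annotation check above can also be scripted. The following is a minimal sketch, assuming kubectl already points at your cluster; the function name check_workload_identity is hypothetical and not part of Kubeflow.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kubeflow): compares the GSA annotation on
# the default-editor KSA with the GSA you expect it to be bound to.
# Usage: check_workload_identity <profile-namespace> <expected-gsa-email>
check_workload_identity() {
  namespace="$1"
  expected="$2"
  # The dots in the annotation key are escaped so that jsonpath treats
  # "iam.gke.io/gcp-service-account" as a single key.
  actual=$(kubectl -n "$namespace" get serviceaccount default-editor \
    -o jsonpath='{.metadata.annotations.iam\.gke\.io/gcp-service-account}')
  if [ "$actual" = "$expected" ]; then
    echo "OK: default-editor is bound to $actual"
  else
    echo "MISMATCH: expected '$expected', found '$actual'"
    return 1
  fi
}

# Example (requires access to the cluster):
#   check_workload_identity "${PROFILE_NAME}" "${KFNAME}-user@${PROJECT}.iam.gserviceaccount.com"
```

The function returns a non-zero status on a mismatch, so you can use it as a guard in a larger script.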

Provisioning custom Google service accounts in namespaces: When creating a profile, you can specify a custom GCP service account for the namespace to control which GCP resources are accessible.

Prerequisite: you must have permission to edit your GCP project’s IAM policy and to create a profile custom resource (CR) in your Kubeflow cluster.

  • If you don’t already have a GCP service account you want to use, create a new one. For example, to create user1-gcp@<project-id>.iam.gserviceaccount.com (the create command takes the account name, not the full email address):
    gcloud iam service-accounts create user1-gcp --project=<project-id>
  • You can bind roles to the GCP service account to allow access to the desired GCP resources. For example, to run BigQuery jobs, you can grant access like so:
    gcloud projects add-iam-policy-binding <project-id> \
      --member='serviceAccount:user1-gcp@<project-id>.iam.gserviceaccount.com' \
      --role='roles/bigquery.jobUser'
  • Grant owner permission on the service account user1-gcp@<project-id>.iam.gserviceaccount.com to the cluster account <cluster-name>-admin@<project-id>.iam.gserviceaccount.com:
    gcloud iam service-accounts add-iam-policy-binding \
      user1-gcp@<project-id>.iam.gserviceaccount.com \
      --member='serviceAccount:<cluster-name>-admin@<project-id>.iam.gserviceaccount.com' \
      --role='roles/owner'
  • Manually create a profile for user1 and specify the GCP service account to bind in the plugins field:
    apiVersion: kubeflow.org/v1beta1
    kind: Profile
    metadata:
      name: profileName # replace with the name of the profile (the user's namespace name)
    spec:
      owner:
        kind: User
        name: user1@email.com # replace with the email of the user
      plugins:
      - kind: WorkloadIdentity
        spec:
          gcpServiceAccount: user1-gcp@<project-id>.iam.gserviceaccount.com
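
If you create profiles for several users, it can help to generate the manifest from a few values instead of editing it by hand. The following is a minimal sketch of that idea; the function name render_profile and its argument order are hypothetical, and the fields are the ones shown in the profile above.

```shell
#!/bin/sh
# Hypothetical helper: prints a Profile manifest that binds a namespace to a
# custom GCP service account through the WorkloadIdentity plugin.
# Usage: render_profile <profile-name> <owner-email> <gsa-name> <project-id>
render_profile() {
  cat <<EOF
apiVersion: kubeflow.org/v1beta1
kind: Profile
metadata:
  name: ${1}
spec:
  owner:
    kind: User
    name: ${2}
  plugins:
  - kind: WorkloadIdentity
    spec:
      gcpServiceAccount: ${3}@${4}.iam.gserviceaccount.com
EOF
}

# Example (requires permission to create Profile resources in the cluster):
#   render_profile user1 user1@email.com user1-gcp my-project | kubectl apply -f -
```

Printing the manifest first lets you review it before piping it to kubectl.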

Note: The profile controller currently doesn’t perform any access control checks to see whether the user creating the profile should be able to use the GCP service account. As a result, any user who can create a profile can get access to any service account for which the admin controller has owner permissions. We will improve this in subsequent releases.

You can find more details on workload identity in the GKE documentation.

Kubeflow v0.6 and before: GCP service account key as secret

When you set up Kubeflow for GCP, it automatically provisions three service accounts with different privileges in the kubeflow namespace. In particular, the ${KF_NAME}-user service account is meant to grant your user services access to GCP. The credentials for this service account can be accessed within the cluster as a Kubernetes secret called user-gcp-sa.

By default, this service account has basic access to a limited set of GCP services, but you can grant more roles through the GCP IAM console.

You can create a PodDefault object to attach the credentials to certain pods.

Credentials

You can add credentials to the new namespace by either copying them from an existing Kubeflow namespace or by creating a new service account.

To copy credentials from one namespace to another namespace, use the following CLI commands (Note: there is an issue filed to automate these commands):

    NAMESPACE=<new kubeflow namespace>
    SOURCE=kubeflow
    NAME=user-gcp-sa
    SECRET=$(kubectl -n ${SOURCE} get secrets ${NAME} -o jsonpath="{.data.${NAME}\.json}" | base64 -d)
    kubectl create -n ${NAMESPACE} secret generic ${NAME} --from-literal="${NAME}.json=${SECRET}"

To create a new service account instead of copying credentials, use the following steps:

  • Create a service account with the desired roles:
    export PROJECT_ID=<GCP project id>
    export NAMESPACE=<new kubeflow namespace>
    export SA_NAME=<service account name>
    export GCPROLES=roles/editor
    gcloud --project=${PROJECT_ID} iam service-accounts create $SA_NAME
    gcloud projects add-iam-policy-binding $PROJECT_ID \
      --member serviceAccount:$SA_NAME@$PROJECT_ID.iam.gserviceaccount.com \
      --role $GCPROLES
  • Set KEYPATH to the path where the key should be saved, then create and download the JSON service account key:
    export KEYPATH=some/path/${SA_NAME}.gcp.json
    gcloud --project=${PROJECT_ID} iam service-accounts keys create ${KEYPATH} \
      --iam-account $SA_NAME@$PROJECT_ID.iam.gserviceaccount.com
  • Upload the JSON service account key to the cluster as a secret:
    kubectl create secret generic user-gcp-sa -n $NAMESPACE \
      --from-file=user-gcp-sa.json=${KEYPATH}

PodDefault object

The PodDefault object is a way to centrally manage configurations that should be added to all pods.

The PodDefault matches all pods with the specified selector and modifies them to inject the volumes, secrets, and environment variables listed in the PodDefault manifest.

Create a pod default in a file called add-gcp-secret.yaml and apply it using kubectl apply -f add-gcp-secret.yaml -n $NAMESPACE:

    apiVersion: "kubeflow.org/v1alpha1"
    kind: PodDefault
    metadata:
      name: add-gcp-secret
    spec:
      selector:
        matchLabels:
          addgcpsecret: "true"
      desc: "add gcp credential"
      env:
      - name: GOOGLE_APPLICATION_CREDENTIALS
        value: /secret/gcp/user-gcp-sa.json
      volumeMounts:
      - name: secret-volume
        mountPath: /secret/gcp
      volumes:
      - name: secret-volume
        secret:
          secretName: user-gcp-sa
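
A pod opts in to this PodDefault by carrying the label named in the selector. The following is a minimal sketch of a labeled pod, generated from shell so that you can reuse it for different pods; the function name render_labeled_pod, the pod name, and the image are hypothetical placeholders.

```shell
#!/bin/sh
# Hypothetical helper: prints a Pod manifest carrying the label that the
# add-gcp-secret PodDefault selects on. Pod name and image are placeholders.
# Usage: render_labeled_pod <pod-name> <image>
render_labeled_pod() {
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${1}
  labels:
    # Must match spec.selector.matchLabels in the PodDefault.
    addgcpsecret: "true"
spec:
  containers:
  - name: ${1}
    image: ${2}
EOF
}

# Example (apply in the namespace where the PodDefault was created):
#   render_labeled_pod mypod myimage | kubectl apply -n "$NAMESPACE" -f -
```

The manifest itself contains no volumes or environment variables, because the PodDefault is expected to inject them when the pod is created.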

Authentication from a Pod

You must do two things to access a GCP service account from a Pod:

  • Mount the secret as a file. This gives your Pod access to your GCP account, so be careful which Pods you grant access to.
  • Set the GOOGLE_APPLICATION_CREDENTIALS environment variable to point to the service account key file. GCP libraries use this environment variable to find the service account and authenticate with GCP.

The following YAML describes a Pod that has access to the ${KF_NAME}-user service account:

    apiVersion: v1
    kind: Pod
    metadata:
      name: mypod
    spec:
      containers:
      - name: mypod
        image: myimage
        env:
        - name: GOOGLE_APPLICATION_CREDENTIALS
          # <mountPath>/<key in the secret>; user-gcp-sa.json is the key used when the secret was created
          value: "/var/secrets/user-gcp-sa.json"
        volumeMounts:
        - name: gcp-secret
          # a secret volume is mounted as a directory with one file per key
          mountPath: "/var/secrets"
          readOnly: true
      volumes:
      - name: gcp-secret
        secret:
          secretName: user-gcp-sa
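
If authentication fails from inside a Pod, a quick first check is whether the environment variable is set and points at a readable file. The following is a minimal sketch of such a check, meant to be run in the container's shell; the function name check_gcp_credentials is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run inside the container to confirm that the
# credentials file referenced by GOOGLE_APPLICATION_CREDENTIALS is usable.
check_gcp_credentials() {
  if [ -z "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
    echo "GOOGLE_APPLICATION_CREDENTIALS is not set"
    return 1
  fi
  # The variable must name a regular, readable file (not a directory).
  if [ ! -f "$GOOGLE_APPLICATION_CREDENTIALS" ] || [ ! -r "$GOOGLE_APPLICATION_CREDENTIALS" ]; then
    echo "not a readable file: $GOOGLE_APPLICATION_CREDENTIALS"
    return 1
  fi
  echo "credentials file found: $GOOGLE_APPLICATION_CREDENTIALS"
}
```

This only checks that the file is present; it does not validate the key or the roles granted to the service account.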

Authentication from Kubeflow Pipelines

Refer to Authenticating Pipelines to GCP.


Local authentication

gcloud

Use the gcloud tool to interact with Google Cloud Platform (GCP) on the command line. You can use the gcloud command to set up Google Kubernetes Engine (GKE) clusters and interact with other Google services.

Logging in

You have two options for authenticating the gcloud command:

  • You can use a user account to authenticate using a Google account (typically Gmail). You can register a user account using gcloud auth login, which brings up a browser window to start the familiar Google authentication flow.

  • You can create a service account within your GCP project. You can then download a .json key file associated with the account, and run the gcloud auth activate-service-account command to authenticate your gcloud session.
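
The two options can be wrapped in one small function. This is a minimal sketch, assuming gcloud is installed and on your PATH; the function name gcloud_login is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper around the two login options.
# Usage: gcloud_login                  -> interactive user account login
#        gcloud_login <key-file.json>  -> service account login
gcloud_login() {
  if [ "$#" -eq 0 ]; then
    # Opens the browser-based Google authentication flow.
    gcloud auth login
  elif [ -f "$1" ]; then
    # Authenticates the gcloud session with a downloaded service account key.
    gcloud auth activate-service-account --key-file="$1"
  else
    echo "key file not found: $1" >&2
    return 1
  fi
}
```

For example, gcloud_login some/path/key.json activates the service account whose key is stored at that path.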

You can find more information in the GCP docs.

Listing active accounts

You can run the following command to verify you are authenticating with the expected account:

    gcloud auth list

In the output of the command, an asterisk denotes your active account.

Viewing IAM roles

Permissions are handled in GCP using IAM roles. These roles define which resources your account can read or write to. Provided you have the necessary permissions, you can check which roles were assigned to your account using the following gcloud command:

    PROJECT_ID=your-gcp-project-id-here
    gcloud projects get-iam-policy $PROJECT_ID --flatten="bindings[].members" \
      --format='table(bindings.role)' \
      --filter="bindings.members:$(gcloud config list account --format 'value(core.account)')"

You can view and modify roles through the GCP IAM console.

You can find more information about IAM in the GCP docs.


kubectl

The kubectl tool is used for interacting with a Kubernetes cluster through the command line.

Connecting to a cluster using a GCP account

If you set up your Kubernetes cluster using GKE, you can authenticate with the cluster using a GCP account. The following commands fetch the credentials for your cluster and save them to your local kubeconfig file:

    CLUSTER_NAME=your-gke-cluster
    ZONE=your-gcp-zone
    gcloud container clusters get-credentials $CLUSTER_NAME --zone $ZONE

You can find more information in the GCP docs.

Changing active clusters

If you work with multiple Kubernetes clusters, you may have multiple contexts saved in your local kubeconfig file. You can view the contexts you have saved by running the following command:

    kubectl config get-contexts

You can change which cluster kubectl controls by switching to a different context:

    CONTEXT_NAME=your-new-context
    kubectl config use-context $CONTEXT_NAME

You can find more information in the Kubernetes docs.

Checking RBAC permissions

Like GCP IAM, Kubernetes permissions are typically handled with a “role-based access control” (RBAC) system. Each Kubernetes service account has a set of authorized roles associated with it. If your account doesn’t have the right roles assigned to it, certain tasks fail.

You can check if an account has the proper permissions to run a command by building a query structured as kubectl auth can-i [VERB] [RESOURCE] --namespace [NAMESPACE]. For example, the following command verifies that your account has permissions to create deployments in the kubeflow namespace:

    kubectl auth can-i create deployments --namespace kubeflow
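
To check several verbs at once, you can loop over the same query. The following is a minimal sketch; the function name check_permissions is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: checks several verbs against one resource type.
# Usage: check_permissions <namespace> <resource> <verb>...
check_permissions() {
  namespace="$1"
  resource="$2"
  shift 2
  for verb in "$@"; do
    # kubectl auth can-i prints "yes" or "no" for each query.
    answer=$(kubectl auth can-i "$verb" "$resource" --namespace "$namespace")
    echo "$verb $resource: $answer"
  done
}

# Example (requires access to the cluster):
#   check_permissions kubeflow deployments get list create delete
```

Each line of output pairs a verb with the answer, which makes missing permissions easy to spot.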

You can find more information in the Kubernetes docs.

Adding RBAC permissions

If you find you are missing a permission you need, you can grant the missing roles to your service account using Kubernetes resources.

  • Roles describe the permissions you want to assign. For example, verbs: ["create"], resources: ["deployments"].
  • RoleBindings define a mapping between the Role and a specific service account.

By default, Roles and RoleBindings apply only to resources in a specific namespace, but there are also ClusterRoles and ClusterRoleBindings that can grant access to resources cluster-wide.
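
As an illustration, the following sketch prints a Role that allows creating deployments and a RoleBinding that grants it to one service account. The function name render_deployment_creator and the resource names deployment-creator and deployment-creator-binding are hypothetical; the manifest fields follow the rbac.authorization.k8s.io/v1 API.

```shell
#!/bin/sh
# Hypothetical helper: prints a Role that allows creating deployments and a
# RoleBinding that grants it to one service account in the same namespace.
# Usage: render_deployment_creator <namespace> <service-account>
render_deployment_creator() {
  cat <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deployment-creator
  namespace: ${1}
rules:
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: deployment-creator-binding
  namespace: ${1}
subjects:
- kind: ServiceAccount
  name: ${2}
  namespace: ${1}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: deployment-creator
EOF
}

# Example (requires permission to create Roles and RoleBindings):
#   render_deployment_creator kubeflow default-editor | kubectl apply -f -
```

Because both objects are namespaced, the permission applies only inside the namespace you pass in.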

You can find more information in the Kubernetes docs.

Next steps

See the troubleshooting guide for help with diagnosing and fixing issues you may encounter with Kubeflow on GCP.