IBM Cloud Private for Kubeflow

Get Kubeflow running on IBM Cloud Private

This guide is a quick start for deploying Kubeflow on IBM Cloud Private 3.1.0 or later. IBM Cloud Private is an enterprise PaaS layer for developing and managing on-premises, containerized applications. It is an integrated environment for managing containers that includes the container orchestrator Kubernetes, a private image registry, a management console, and monitoring frameworks.

Prerequisites

  • Review the system requirements in the IBM Knowledge Center for IBM Cloud Private.

  • Set up an NFS server and export one or more paths for persistent volumes (see the sketch below).
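
A minimal sketch of exporting an NFS path on the server. The directory (/nfs/kubeflow) and client subnet (10.43.0.0/24) are assumptions for illustration; substitute values for your environment:

    # On the NFS server (directory and subnet are assumptions; adjust as needed)
    mkdir -p /nfs/kubeflow
    echo "/nfs/kubeflow 10.43.0.0/24(rw,sync,no_root_squash)" >> /etc/exports
    exportfs -ra            # re-export everything listed in /etc/exports
    showmount -e localhost  # verify the export is visible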

Installing IBM Cloud Private

Follow the installation steps in the IBM Knowledge Center to install IBM Cloud Private 3.1.0 or later with master, proxy, and worker nodes, plus optional management and vulnerability advisor nodes, in a standard or high-availability configuration.

This guide uses IBM Cloud Private 3.1.0 in the examples below. After installation, you can verify the cluster nodes:

    # kubectl get node
    NAME         STATUS    ROLES                AGE    VERSION
    10.43.0.38   Ready     management           11d    v1.11.1+icp-ee
    10.43.0.39   Ready     master,etcd,proxy    11d    v1.11.1+icp-ee
    10.43.0.40   Ready     va                   11d    v1.11.1+icp-ee
    10.43.0.44   Ready     worker               11d    v1.11.1+icp-ee
    10.43.0.46   Ready     worker               11d    v1.11.1+icp-ee
    10.43.0.49   Ready     worker               11d    v1.11.1+icp-ee

Creating image policy and persistent volume

  • Create the Kubernetes namespace.

        export K8S_NAMESPACE=kubeflow
        kubectl create namespace $K8S_NAMESPACE

  • K8S_NAMESPACE is the name of the namespace that Kubeflow will be installed in. By default it should be “kubeflow”.
  • Create an image policy for the namespace.

The image policy definition file (image-policy.yaml) is as follows:

    apiVersion: securityenforcement.admission.cloud.ibm.com/v1beta1
    kind: ImagePolicy
    metadata:
      name: image-policy
    spec:
      repositories:
      - name: docker.io/*
        policy: null
      - name: k8s.gcr.io/*
        policy: null
      - name: gcr.io/*
        policy: null
      - name: ibmcom/*
        policy: null
      - name: quay.io/*
        policy: null

Create the ImagePolicy in the specified namespace:

    kubectl create -n $K8S_NAMESPACE -f image-policy.yaml
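
To confirm the policy was created, a quick check — assuming the ImagePolicy CRD exposes the imagepolicies resource name:

    # List image policies in the namespace (the resource name is an assumption)
    kubectl get imagepolicies -n $K8S_NAMESPACE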
  • Create persistent volumes (PVs) for Kubeflow components.

Some Kubeflow components, such as minio, mysql, and katib, need PVs to store data. We need to create PVs for those pods in advance. The PV definition file (pv.yaml) is as follows:

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: kubeflow-pv1
    spec:
      capacity:
        storage: 20Gi
      accessModes:
      - ReadWriteOnce
      persistentVolumeReclaimPolicy: Retain
      nfs:
        path: ${NFS_SHARED_DIR}/pv1
        server: ${NFS_SERVER_IP}
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: kubeflow-pv2
    spec:
      capacity:
        storage: 20Gi
      accessModes:
      - ReadWriteOnce
      persistentVolumeReclaimPolicy: Retain
      nfs:
        path: ${NFS_SHARED_DIR}/pv2
        server: ${NFS_SERVER_IP}
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: kubeflow-pv3
    spec:
      capacity:
        storage: 20Gi
      accessModes:
      - ReadWriteOnce
      persistentVolumeReclaimPolicy: Retain
      nfs:
        path: ${NFS_SHARED_DIR}/pv3
        server: ${NFS_SERVER_IP}
  • NFS_SERVER_IP is the NFS server IP. It can be the management node IP, but the management node must support NFS mounting.
  • NFS_SHARED_DIR is the NFS shared path that can be mounted by the other nodes in the IBM Cloud Private cluster. Ensure that the sub-folders (pv1, pv2, pv3) in the definition above are created; a sketch follows the next command. Create the PVs by running:

        kubectl create -f pv.yaml
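
A minimal preparation-and-verification sketch, assuming shell access to the NFS server and that NFS_SHARED_DIR and the PV names match pv.yaml above:

    # On the NFS server: create the sub-directories that back the PVs
    # (run this before kubectl create -f pv.yaml)
    mkdir -p ${NFS_SHARED_DIR}/pv1 ${NFS_SHARED_DIR}/pv2 ${NFS_SHARED_DIR}/pv3

    # From a machine with kubectl access: confirm the PVs are Available
    kubectl get pv kubeflow-pv1 kubeflow-pv2 kubeflow-pv3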

Installing Kubeflow

Follow these steps to deploy Kubeflow:

  • Download a kfctl release for your platform from the Kubeflow releases page and unpack it:

        tar -xvf kfctl_<release tag>_<platform>.tar.gz

  • Run the following commands to set up and deploy Kubeflow. The code below includes an optional command to add the binary kfctl to your path. If you don’t add the binary to your path, you must use the full path to the kfctl binary each time you run it.
    # The following command is optional, to make the kfctl binary easier to use.
    export PATH=$PATH:<path to kfctl in your Kubeflow installation>
    export KFAPP=<your choice of application directory name>
    kfctl init ${KFAPP}
    cd ${KFAPP}
    kfctl generate all -V
    kfctl apply all -V
  • ${KFAPP} - the name of a directory where you want the Kubeflow configurations to be stored. This directory is created when you run kfctl init. If you want a custom deployment name, specify that name here; the value of this variable becomes the name of your deployment and cannot be greater than 25 characters. It must contain just the directory name, not the full path to the directory. The content of this directory is described in the next section.
  • Check the resources deployed in the namespace kubeflow:

        kubectl -n kubeflow get all

Access Kubeflow dashboard

Change the Ambassador service type to NodePort, then access the Kubeflow dashboard through Ambassador.

    kubectl -n kubeflow patch service ambassador -p '{"spec":{"type": "NodePort"}}'
    AMBASSADOR_PORT=$(kubectl -n kubeflow get service ambassador -o jsonpath='{.spec.ports[?(@.name=="ambassador")].nodePort}')

Then access the Kubeflow dashboard through the NodePort:

    http://${MANAGEMENT_IP}:${AMBASSADOR_PORT}/

  • MANAGEMENT_IP is the management node IP (see the sketch below).
  • AMBASSADOR_PORT is the Ambassador service NodePort obtained above.
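
A small sketch for assembling the URL. The node-role label used by the selector is an assumption based on the ROLES column shown by kubectl get node above; adjust it for your cluster, or simply read the IP from that output:

    # Derive the management node IP (the label selector is an assumption)
    export MANAGEMENT_IP=$(kubectl get nodes -l node-role.kubernetes.io/management \
      -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}')
    echo "http://${MANAGEMENT_IP}:${AMBASSADOR_PORT}/"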

As of Kubeflow 0.6.1, the Ambassador service has been dropped. The Kubeflow dashboard can be accessed via the istio-ingressgateway service instead. If a load balancer is not available in your environment, NodePort or port forwarding can be used to access the dashboard; see the sketch below and refer to the Ingress Gateway guide.
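
For that newer layout, a minimal port-forwarding sketch, assuming the gateway service runs in the istio-system namespace and listens on port 80 (the Kubeflow default):

    # Forward local port 8080 to the Istio ingress gateway, then open
    # http://localhost:8080/ in a browser
    kubectl port-forward -n istio-system svc/istio-ingressgateway 8080:80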

Delete Kubeflow

Run the following commands to delete your deployment and reclaim all resources:

    cd ${KFAPP}
    # If you want to delete all the resources, including storage:
    kfctl delete all --delete_storage
    # If you want to preserve storage, which contains metadata and
    # information from mlpipeline:
    kfctl delete all
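
Note that the PVs created earlier use persistentVolumeReclaimPolicy: Retain, so they and the data on the NFS shares survive deletion. A cleanup sketch if you no longer need them:

    # Remove the manually created PVs; data on the NFS shares must be
    # deleted separately on the NFS server
    kubectl delete pv kubeflow-pv1 kubeflow-pv2 kubeflow-pv3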