Kubeflow

Quickly get running with your ML Workflow

The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable. Our goal is not to recreate other services, but to provide a straightforward way to deploy best-of-breed open-source systems for ML to diverse infrastructures. Anywhere you are running Kubernetes, you should be able to run Kubeflow.

Getting started with Kubeflow

Follow the getting-started guide to set up your environment.

Then read the documentation to learn about the features of Kubeflow, including the following guides to Kubeflow components:

  • Kubeflow includes services for spawning and managing Jupyter notebooks. Project Jupyter is a non-profit, open source project that supports interactive data science and scientific computing across many programming languages.

  • Kubeflow Pipelines is a platform for building, deploying, and managing multi-step ML workflows based on Docker containers.

  • Kubeflow offers a number of components that you can use to build your ML training, hyperparameter tuning, and serving workloads across multiple platforms.
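
To make the idea of a multi-step workflow built from containers more concrete, here is a minimal sketch in plain Python. It does not use the Kubeflow Pipelines SDK; the `Step` class, the `execution_order` helper, the image names, and the commands are all hypothetical and exist only for illustration. It shows one way to describe steps as container images with dependencies and to derive an order in which they could run.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass(frozen=True)
class Step:
    """One workflow step: a container image plus the command it runs.

    Hypothetical structure for illustration; not a Kubeflow type.
    """
    name: str
    image: str                 # hypothetical image name
    command: tuple             # command run inside the container
    depends_on: tuple = ()     # names of steps that must finish first


def execution_order(steps):
    """Return step names in an order that respects every dependency."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {step.name: set(step.depends_on) for step in steps}
    return list(TopologicalSorter(graph).static_order())


# A three-step workflow: prepare data, train a model, serve predictions.
STEPS = [
    Step("preprocess", "example/preprocess:latest", ("python", "prep.py")),
    Step("train", "example/train:latest", ("python", "train.py"),
         depends_on=("preprocess",)),
    Step("serve", "example/serve:latest", ("python", "serve.py"),
         depends_on=("train",)),
]

if __name__ == "__main__":
    for name in execution_order(STEPS):
        print(name)
```

A real pipeline platform would also schedule each container on the cluster, pass data between steps, and track runs; this sketch covers only the dependency ordering.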

What is Kubeflow?

Kubeflow is the machine learning toolkit for Kubernetes.

The basic workflow for using Kubeflow is:

  • Download and run the Kubeflow deployment binary.
  • Customize the resulting configuration files.
  • Run the specified scripts to deploy your containers to your specific environment.

You can adapt the configuration to choose the platforms and services that you want to use for each stage of the ML workflow: data preparation, model training, prediction serving, and service management.

You can choose to deploy your Kubernetes workloads locally or to a cloud environment.

The Kubeflow mission

Our goal is to make scaling machine learning (ML) models and deploying them to production as simple as possible, by letting Kubernetes do what it’s great at:

  • Easy, repeatable, portable deployments on a diverse infrastructure (laptop <-> ML rig <-> training cluster <-> production cluster)
  • Deploying and managing loosely-coupled microservices
  • Scaling based on demand

Because ML practitioners use a diverse set of tools, one of the key goals is to customize the stack based on user requirements (within reason) and let the system take care of the “boring stuff”. While we have started with a narrow set of technologies, we are working with many different projects to include additional tooling.

Ultimately, we want to have a set of simple manifests that give you an easy-to-use ML stack anywhere Kubernetes is already running, and that can self-configure based on the cluster it deploys into.

History

Kubeflow started as an effort to open source the way Google ran TensorFlow internally, based on a pipeline called TensorFlow Extended. It began as just a simpler way to run TensorFlow jobs on Kubernetes, but has since expanded into a multi-architecture, multi-cloud framework for running entire machine learning pipelines.

Getting involved

There are many ways to contribute to Kubeflow, and we welcome contributions! Read the contributor’s guide to get started on the code, and get to know the community in the community guide.