Istio Integration (for TF Serving)

Using Istio for TF Serving

Istio provides much of the functionality we want, such as metrics, auth and quota, rollouts, and A/B testing.

Install Istio

We assume Kubeflow is already deployed in the kubeflow namespace.

  kubectl apply -f https://raw.githubusercontent.com/kubeflow/kubeflow/master/dependencies/istio/install/crds.yaml
  kubectl apply -f https://raw.githubusercontent.com/kubeflow/kubeflow/master/dependencies/istio/install/istio-noauth.yaml
  kubectl apply -f https://raw.githubusercontent.com/kubeflow/kubeflow/master/dependencies/istio/kf-istio-resources.yaml
  kubectl label namespace kubeflow istio-injection=enabled

The first command installs Istio’s CRDs.

The second command installs Istio’s core components (without mTLS), with some customization:

  • the sidecar injection configmap policy is changed from enabled to disabled
  • istio-ingressgateway is of type NodePort instead of LoadBalancer

The third command deploys some resources for Kubeflow.

The fourth command labels the kubeflow namespace for sidecar injection.

See this table for sidecar injection behavior. We want the configmap policy disabled and the namespace enabled, so that injection happens if and only if the pod has the annotation.
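With this combination, a pod opts into injection through an annotation on its own metadata. A minimal sketch of such a pod (the pod name and image here are illustrative, not from the generated manifests):

```yaml
# Illustrative fragment: with the configmap policy set to "disabled" and the
# namespace labeled istio-injection=enabled, only pods carrying this
# annotation get a sidecar injected.
apiVersion: v1
kind: Pod
metadata:
  name: example-tf-serving   # hypothetical name
  namespace: kubeflow
  annotations:
    sidecar.istio.io/inject: "true"
spec:
  containers:
  - name: main
    image: tensorflow/serving
```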

Kubeflow TF Serving with Istio

This section has not yet been converted to kustomize, please refer to kubeflow/manifests/issues/18.

After installing Istio, we can deploy the TF Serving component as in TensorFlow Serving, with additional params:

  ks param set ${MODEL_COMPONENT} injectIstio true

This will inject an Istio sidecar into the TF Serving deployment.

Routing with Istio vs Ambassador

With the Ambassador annotation, a TF Serving deployment can be accessed at HOST/tfserving/models/MODEL_NAME. However, in order to use Istio’s Gateway to do traffic splitting, we should use the path provided by Istio routing: HOST/istio/tfserving/models/MODEL_NAME
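The /istio/ prefix is handled by an Istio route rather than by Ambassador. A rough sketch of the kind of VirtualService involved (the resource name, gateway name, and service port are assumptions for illustration, not the manifest Kubeflow generates):

```yaml
# Sketch: route requests under the /istio/ prefix through the Istio ingress
# gateway to the TF Serving service, stripping the /istio prefix on the way.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: mnist-route          # hypothetical name
  namespace: kubeflow
spec:
  hosts:
  - "*"
  gateways:
  - kubeflow-gateway         # assumed gateway name
  http:
  - match:
    - uri:
        prefix: /istio/tfserving/models/mnist
    rewrite:
      uri: /tfserving/models/mnist
    route:
    - destination:
        host: mnist-service  # assumed service name
        port:
          number: 8000       # assumed service port
```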

Metrics

The Istio sidecar reports data to Mixer. To view it in Grafana, execute the command:

  kubectl -n istio-system port-forward $(kubectl -n istio-system get pod -l app=grafana -o jsonpath='{.items[0].metadata.name}') 3000:3000

Visit http://localhost:3000/dashboard/db/istio-mesh-dashboard in your web browser. Send some requests to the TF Serving service; you should then see some data (QPS, success rate, latency) in the Istio mesh dashboard.

Define and view metrics

See the Istio docs.

Expose Grafana dashboard behind ingress/IAP

Grafana needs to be configured to work properly behind a reverse proxy. We can override the default config using environment variables. Run kubectl edit deploy -n istio-system grafana, and add these env vars:

  - name: GF_SERVER_DOMAIN
    value: YOUR_HOST
  - name: GF_SERVER_ROOT_URL
    value: '%(protocol)s://%(domain)s:/grafana'

Rolling out a new model

A typical scenario is that we first deploy a model A. Then we develop another model B, and we want to deploy it and gradually move traffic from A to B. This can be achieved using Istio’s traffic routing.

  • Deploy the first model as described in TensorFlow Serving. Then you will have the service (Model) and the deployment (Version).

  • Deploy another version of the model, v2. This time, no need to deploy the service part.

  MODEL_COMPONENT2=mnist-v2
  KF_ENV=default
  ks generate tf-serving-deployment-gcp ${MODEL_COMPONENT2}
  ks param set ${MODEL_COMPONENT2} modelName mnist  # modelName should be the SAME as the previous one
  ks param set ${MODEL_COMPONENT2} versionName v2   # v2 !!
  ks param set ${MODEL_COMPONENT2} modelBasePath gs://kubeflow-examples-data/mnist
  ks param set ${MODEL_COMPONENT2} gcpCredentialSecretName user-gcp-sa
  ks param set ${MODEL_COMPONENT2} injectIstio true  # this is required
  ks apply ${KF_ENV} -c ${MODEL_COMPONENT2}

The KF_ENV environment variable represents a conceptual deployment environment such as development, test, staging, or production, as defined by ksonnet. For this example, we use the default environment. You can read more about Kubeflow’s use of ksonnet in the Kubeflow ksonnet component guide.

  • Update the traffic weight
  ks param set mnist-service trafficRule v1:90:v2:10  # routes 90% of traffic to v1 and 10% to v2
  ks apply ${KF_ENV} -c mnist-service
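Conceptually, a trafficRule like v1:90:v2:10 corresponds to weighted Istio routes over version subsets. A sketch of the equivalent hand-written Istio config (resource names, host, and subset labels are assumptions, not the exact manifests the component generates):

```yaml
# Sketch: split traffic 90/10 between two versions of the same service.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: mnist-service        # hypothetical name
spec:
  hosts:
  - mnist-service            # assumed service host
  http:
  - route:
    - destination:
        host: mnist-service
        subset: v1
      weight: 90
    - destination:
        host: mnist-service
        subset: v2
      weight: 10
---
# Subsets select the pods backing each version by label.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: mnist-service        # hypothetical name
spec:
  host: mnist-service
  subsets:
  - name: v1
    labels:
      version: v1            # assumed pod label
  - name: v2
    labels:
      version: v2
```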