TensorFlow Batch Predict

Batch prediction for TensorFlow models

Kubeflow Batch Predict

Kubeflow batch-predict allows users to run prediction jobs over a trained TensorFlow model in SavedModel format in batch mode. It is based on Apache Beam and currently runs with a local runner on a single node in a Kubernetes cluster.

Run a TensorFlow Batch Predict Job

Note: Before running a job, you must have deployed Kubeflow to your cluster.

To run batch prediction, we create a Kubernetes job that runs Beam. Kubeflow provides a ksonnet prototype that you can use to generate a component, which you can then customize for your jobs.

Create the component

```shell
MY_BATCH_PREDICT_JOB=my_batch_predict_job
GCP_CREDENTIAL_SECRET_NAME=user-gcp-sa
INPUT_FILE_PATTERNS=gs://my_data_bucket/my_file_patterns
MODEL_PATH=gs://my_model_bucket/my_model
OUTPUT_RESULT_PREFIX=gs://my_data_bucket/my_result_prefix
OUTPUT_ERROR_PREFIX=gs://my_data_bucket/my_error_prefix
BATCH_SIZE=4
INPUT_FILE_FORMAT=my_format

ks registry add kubeflow-git github.com/kubeflow/kubeflow/tree/${VERSION}/kubeflow
ks pkg install kubeflow-git/examples

ks generate tf-batch-predict ${MY_BATCH_PREDICT_JOB} \
  --gcpCredentialSecretName=${GCP_CREDENTIAL_SECRET_NAME} \
  --inputFilePatterns=${INPUT_FILE_PATTERNS} \
  --inputFileFormat=${INPUT_FILE_FORMAT} \
  --modelPath=${MODEL_PATH} \
  --outputResultPrefix=${OUTPUT_RESULT_PREFIX} \
  --outputErrorPrefix=${OUTPUT_ERROR_PREFIX} \
  --batchSize=${BATCH_SIZE}
```
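After generating the component, you can confirm what was recorded before running anything, using standard ksonnet commands:

```shell
# List the parameter values ksonnet stored for the generated component.
ks param list ${MY_BATCH_PREDICT_JOB}

# Render the Kubernetes manifest the component will produce, without applying it.
ks show default -c ${MY_BATCH_PREDICT_JOB}
```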

The supported parameters and their usage:

  • inputFilePatterns The list of input files or file patterns, separated by commas.

  • inputFileFormat One of the following values: json, tfrecord, and tfrecord_gzip.

  • modelPath The path containing the model files in SavedModel format.

  • batchSize Number of prediction instances in one batch. This largely depends on how many instances can be held and processed simultaneously in the memory of your machine.

  • outputResultPrefix Output path to save the prediction results.

  • outputErrorPrefix Output path to save the prediction errors.

  • numGpus Number of GPUs to use per machine.

  • gcpCredentialSecretName Secret name if used on GCP. Only needed when running the job on GKE in order to write results to GCS.
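As an illustration of the json input format, a plausible input file contains one JSON-encoded prediction instance per line. Note that the `images` key below is an assumption for illustration; use the input tensor alias from your own SavedModel's serving signature:

```shell
# Hypothetical sample input for inputFileFormat=json: newline-delimited JSON,
# one prediction instance per line. The "images" key is an assumed input
# alias; substitute the input name from your model's serving signature.
cat > instances.json <<'EOF'
{"images": [0.0, 0.1, 0.2, 0.3]}
{"images": [0.4, 0.5, 0.6, 0.7]}
EOF

# Each line is one instance; this file holds two.
wc -l < instances.json
```

A file like this would then be uploaded to the location matched by inputFilePatterns.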

You can set or update values for optional parameters after generating the component. For example, you can set modelPath to a new value (e.g., to test another model) or point the output to another GCS location (e.g., to avoid overwriting the results from previous runs):

```shell
ks param set --env=default ${MY_BATCH_PREDICT_JOB} modelPath gs://my_new_bucket/my_new_model
ks param set --env=default ${MY_BATCH_PREDICT_JOB} outputResultPrefix gs://my_new_bucket/my_new_output
```

Use GPUs

To use GPUs, your cluster must be configured to support them (GPU nodes with the appropriate device drivers installed).

Once that condition is satisfied, set the number of GPUs to a positive integer. For example:

```shell
ks param set --env=default ${MY_BATCH_PREDICT_JOB} numGpus 1
```

This way, the batch-predict job uses the GPU version of the Docker image and adds the appropriate configuration to start the Kubernetes job.

Submit the job

```shell
export KF_ENV=default
ks apply ${KF_ENV} -c ${MY_BATCH_PREDICT_JOB}
```

The KF_ENV environment variable represents a conceptual deployment environment such as development, test, staging, or production, as defined by ksonnet. For this example, we use the default environment. You can read more about Kubeflow’s use of ksonnet in the Kubeflow ksonnet component guide.

You should see that a job is started to provision the batch-predict Docker image. Then a pod starts to run the job.

```shell
kubectl get pods
kubectl logs -f ${POD_NAME}
```

You can check the state of the pod to determine whether the job is running, failed, or completed. Once it has completed, check the result output location to see whether sensible results were generated. If anything goes wrong, check the error output location, where the error messages are stored.
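Since the outputs in this example live on GCS, one way to inspect them after the job completes is with gsutil (assuming you have access to the bucket; the exact file names under each prefix are produced by the job):

```shell
# List whatever the job wrote under the result and error prefixes.
gsutil ls "${OUTPUT_RESULT_PREFIX}*"
gsutil ls "${OUTPUT_ERROR_PREFIX}*"

# Print the first few prediction results.
gsutil cat "${OUTPUT_RESULT_PREFIX}*" | head
```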

Delete the job

```shell
ks delete ${KF_ENV} -c ${MY_BATCH_PREDICT_JOB}
```