Deploy PMML model with InferenceService

PMML (Predictive Model Markup Language) is an XML-based format for describing data mining and statistical models, including the inputs to the models, the transformations used to prepare data for mining, and the parameters that define the models themselves. This example shows how to serve a PMML-format model on InferenceService.
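
Under the hood, a PMML document is plain XML. A minimal, heavily abridged skeleton for a decision-tree classifier like the iris model used below might look like this (illustrative only; the field names and PMML version are assumptions):

    <PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
      <Header description="Iris decision tree"/>
      <DataDictionary numberOfFields="4">
        <DataField name="sepal_length" optype="continuous" dataType="double"/>
        <!-- ... the remaining input fields ... -->
      </DataDictionary>
      <TreeModel modelName="DecisionTree" functionName="classification">
        <!-- MiningSchema, Output, and the tree Nodes go here -->
      </TreeModel>
    </PMML>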

Create the InferenceService

    apiVersion: "serving.kserve.io/v1beta1"
    kind: "InferenceService"
    metadata:
      name: "pmml-demo"
    spec:
      predictor:
        pmml:
          storageUri: gs://kfserving-examples/models/pmml

Create the InferenceService with the above YAML:

    kubectl apply -f pmml.yaml

Expected Output

    inferenceservice.serving.kserve.io/pmml-demo created
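
Before sending traffic, you can wait for the InferenceService to become ready; the READY column should show True once the predictor is up:

    kubectl get inferenceservice pmml-demo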

Warning

The pmmlserver is based on Py4J, which does not support multi-process mode, so spec.predictor.containerConcurrency cannot be set. If you want to scale the PMMLServer to improve prediction performance, set the InferenceService's resources.limits.cpu to 1 and increase the number of replicas, as sketched below.
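
Translated into a spec, that advice looks roughly like this (a sketch; minReplicas: 2 is an arbitrary illustration, tune the replica count to your load):

    apiVersion: "serving.kserve.io/v1beta1"
    kind: "InferenceService"
    metadata:
      name: "pmml-demo"
    spec:
      predictor:
        minReplicas: 2    # scale out across replicas instead of raising concurrency
        pmml:
          storageUri: gs://kfserving-examples/models/pmml
          resources:
            limits:
              cpu: "1"    # one CPU per replica, since Py4J is single-process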

Run a prediction

The first step is to determine the ingress IP and ports, and set INGRESS_HOST and INGRESS_PORT.
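
For example, with a cluster served through an Istio ingress gateway, the values can be read off the gateway service (a sketch assuming the service lives in the istio-system namespace; adjust for your environment):

    INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
    INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')

With those set, send a prediction request: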

    MODEL_NAME=pmml-demo
    INPUT_PATH=@./pmml-input.json
    SERVICE_HOSTNAME=$(kubectl get inferenceservice pmml-demo -o jsonpath='{.status.url}' | cut -d "/" -f 3)
    curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
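
The request body in pmml-input.json follows the KServe v1 prediction protocol. For this iris model, an input along these lines should work (illustrative; it mirrors the four iris features from the upstream example):

    {"instances": [[5.1, 3.5, 1.4, 0.2]]}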

Expected Output

    * TCP_NODELAY set
    * Connected to localhost (::1) port 8081 (#0)
    > POST /v1/models/pmml-demo:predict HTTP/1.1
    > Host: pmml-demo.default.example.com
    > User-Agent: curl/7.64.1
    > Accept: */*
    > Content-Length: 45
    > Content-Type: application/x-www-form-urlencoded
    >
    * upload completely sent off: 45 out of 45 bytes
    < HTTP/1.1 200 OK
    < content-length: 39
    < content-type: application/json; charset=UTF-8
    < date: Sun, 18 Oct 2020 15:50:02 GMT
    < server: istio-envoy
    < x-envoy-upstream-service-time: 12
    <
    * Connection #0 to host localhost left intact
    {"predictions": [{'Species': 'setosa', 'Probability_setosa': 1.0, 'Probability_versicolor': 0.0, 'Probability_virginica': 0.0, 'Node_Id': '2'}]}* Closing connection 0