Description

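Stream prediction operator for Generalized Linear Models (GLM). It applies a model trained by GlmTrainBatchOp to a data stream and appends the prediction to each record. Background: https://en.wikipedia.org/wiki/Generalized_linear_model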

Parameters

Name               Description                                          Type    Required?  Default Value
linkPredResultCol  Name of the output column for the link prediction.   String  No         null
predictionCol      Name of the output column for the prediction.        String  Yes        -
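
The difference between the two output columns can be illustrated with a small NumPy sketch. It assumes, as in standard GLM usage, that the link prediction is the linear predictor (eta) and the prediction is the mean obtained by applying the inverse link to it; the source does not spell this out. The coefficients and intercept below are hypothetical values chosen for illustration, not the output of any trained model.

```python
import numpy as np

def inverse_log_link(eta):
    """Map a link-scale value (linear predictor) to the response scale for a log link."""
    return np.exp(eta)

# Hypothetical coefficients and intercept, for illustration only.
coefficients = np.array([-0.01, -0.02])
intercept = 1.5

# Two (lot1, lot2) rows taken from the example data in this page.
features = np.array([[118.0, 69.0], [18.0, 12.0]])

# Link-scale prediction: the linear predictor eta = x . beta + intercept.
link_pred = features @ coefficients + intercept
# Response-scale prediction: mu = exp(eta) under a log link.
response_pred = inverse_log_link(link_pred)

print(link_pred)
print(response_pred)
```

With a log link the response-scale value is always positive, which is why this link is a common choice for the gamma family used in the example.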

Script Example

Code

    import numpy as np
    import pandas as pd
    from pyalink.alink import *

    # data: columns are u (label), lot1, lot2, offset, weights
    data = np.array([
        [1.6094, 118.0000, 69.0000, 1.0000, 2.0000],
        [2.3026, 58.0000, 35.0000, 1.0000, 2.0000],
        [2.7081, 42.0000, 26.0000, 1.0000, 2.0000],
        [2.9957, 35.0000, 21.0000, 1.0000, 2.0000],
        [3.4012, 27.0000, 18.0000, 1.0000, 2.0000],
        [3.6889, 25.0000, 16.0000, 1.0000, 2.0000],
        [4.0943, 21.0000, 13.0000, 1.0000, 2.0000],
        [4.3820, 19.0000, 12.0000, 1.0000, 2.0000],
        [4.6052, 18.0000, 12.0000, 1.0000, 2.0000]
    ])
    df = pd.DataFrame({"u": data[:, 0], "lot1": data[:, 1], "lot2": data[:, 2], "offset": data[:, 3], "weights": data[:, 4]})
    schema = 'u double, lot1 double, lot2 double, offset double, weights double'
    source = dataframeToOperator(df, schemaStr=schema, op_type='batch')
    featureColNames = ["lot1", "lot2"]
    labelColName = "u"

    # train
    train = GlmTrainBatchOp()\
        .setFamily("gamma")\
        .setLink("Log")\
        .setRegParam(0.3)\
        .setMaxIter(5)\
        .setFeatureCols(featureColNames)\
        .setLabelCol(labelColName)
    source.link(train)

    # batch predict
    predict = GlmPredictBatchOp()\
        .setPredictionCol("pred")
    predict.linkFrom(train, source)
    predict.print()

    # eval (named "evaluation" to avoid shadowing Python's built-in eval)
    evaluation = GlmEvaluationBatchOp()\
        .setFamily("gamma")\
        .setLink("Log")\
        .setRegParam(0.3)\
        .setMaxIter(5)\
        .setFeatureCols(featureColNames)\
        .setLabelCol(labelColName)
    evaluation.linkFrom(train, source)
    evaluation.print()

    # stream predict: the trained batch model is passed to the constructor
    source_stream = dataframeToOperator(df, schemaStr=schema, op_type='stream')
    predict_stream = GlmPredictStreamOp(train)\
        .setPredictionCol("pred")
    predict_stream.linkFrom(source_stream)
    predict_stream.print()
    # stream jobs only start running once execute() is called
    StreamOperator.execute()

Results

        u   lot1  lot2  offset  weights      pred
0  1.6094  118.0  69.0     1.0      2.0  0.378525
1  2.3026   58.0  35.0     1.0      2.0  0.970639
2  2.7081   42.0  26.0     1.0      2.0  1.126458
3  2.9957   35.0  21.0     1.0      2.0  1.227753
4  3.4012   27.0  18.0     1.0      2.0  1.258898
5  3.6889   25.0  16.0     1.0      2.0  1.305654
6  4.0943   21.0  13.0     1.0      2.0  1.367991
7  4.3820   19.0  12.0     1.0      2.0  1.383571
8  4.6052   18.0  12.0     1.0      2.0  1.375774
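
To read the table above quantitatively, the following sketch compares the pred column with the label column u. It only re-enters the values printed in the results and is not part of the Alink API; the variable names are chosen here for illustration.

```python
import numpy as np

# Label (u) and prediction (pred) columns copied from the results table.
u = np.array([1.6094, 2.3026, 2.7081, 2.9957, 3.4012, 3.6889, 4.0943, 4.3820, 4.6052])
pred = np.array([0.378525, 0.970639, 1.126458, 1.227753, 1.258898,
                 1.305654, 1.367991, 1.383571, 1.375774])

# Per-row absolute error and its mean over the nine rows.
abs_err = np.abs(u - pred)
mae = abs_err.mean()

print(abs_err)
print(mae)
```

Every prediction in this run is below its label, so the output is probably best read as a demonstration of the operator's interface rather than as a well-fitted model.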