Description

Ridge regression prediction stream operator. This operator predicts the regression value of each input row using a trained linear model.
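As a rough illustration of the prediction step, a linear model scores each row as the dot product of its feature values and the learned weights, plus an intercept. The sketch below uses plain NumPy rather than Alink. The function name `linear_predict` and the weight and intercept values are made up for illustration and do not come from a trained model.

```python
import numpy as np

def linear_predict(features, weights, intercept):
    """Return intercept + features @ weights for each row."""
    features = np.asarray(features, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return features @ weights + intercept

# Hypothetical coefficients, chosen only for illustration.
weights = [0.5, 0.1]
intercept = -0.2

# Two rows with the feature layout (f0, f1).
rows = [[2, 1], [5, 3]]
print(linear_predict(rows, weights, intercept))
```

The operator applies the same kind of computation to every row that arrives on the stream, using the coefficients stored in the model it was given.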

Parameters

| Name | Description | Type | Required? | Default Value |
| --- | --- | --- | --- | --- |
| reservedCols | Names of the columns to be retained in the output table | String[] | No | null |
| predictionCol | Name of the prediction column | String | Yes | |
| vectorCol | Name of a vector column | String | No | null |
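To make the effect of `predictionCol` and `reservedCols` on the output layout concrete, here is a small pandas sketch. It is not Alink code: `apply_prediction` is a hypothetical helper, the prediction values are placeholders, and treating a null `reservedCols` as "keep every input column" is an assumption about the operator's behavior.

```python
import pandas as pd

def apply_prediction(df, predictions, prediction_col, reserved_cols=None):
    """Mimic the output layout: retained input columns plus one prediction column."""
    # Assumption: None (null) means every input column is retained.
    kept = list(df.columns) if reserved_cols is None else list(reserved_cols)
    out = df[kept].copy()
    out[prediction_col] = list(predictions)
    return out

df = pd.DataFrame({"f0": [2, 3], "f1": [1, 2], "label": [1, 1]})
preds = [0.9, 1.4]  # placeholder values, not real model output

all_cols = apply_prediction(df, preds, prediction_col="pred")
label_only = apply_prediction(df, preds, prediction_col="pred",
                              reserved_cols=["label"])

print(list(all_cols.columns))    # ['f0', 'f1', 'label', 'pred']
print(list(label_only.columns))  # ['label', 'pred']
```

In PyAlink these parameters are set through chained setters on the operator, such as `setPredictionCol("pred")`.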

Script Example

Script

    # Imports the snippet relies on.
    import numpy as np
    import pandas as pd
    from pyalink.alink import *

    # Two feature columns (f0, f1) followed by the label column.
    data = np.array([
        [2, 1, 1],
        [3, 2, 1],
        [4, 3, 2],
        [2, 4, 1],
        [2, 2, 1],
        [4, 3, 2],
        [1, 2, 1],
        [5, 3, 3]])
    df = pd.DataFrame({"f0": data[:, 0],
                       "f1": data[:, 1],
                       "label": data[:, 2]})
    batchData = dataframeToOperator(df, schemaStr='f0 int, f1 int, label int', op_type='batch')
    streamData = dataframeToOperator(df, schemaStr='f0 int, f1 int, label int', op_type='stream')
    colnames = ["f0", "f1"]
    # Train a ridge regression model on the batch data.
    ridge = RidgeRegTrainBatchOp().setLambda(0.1).setFeatureCols(colnames).setLabelCol("label")
    model = batchData.link(ridge)
    # Score the stream with the trained model; predictions go to column "pred".
    predictor = RidgeRegPredictStreamOp(model).setPredictionCol("pred")
    predictor.linkFrom(streamData).print()
    StreamOperator.execute()

Result

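The sample output below has three feature columns (f0, f1, f2) and non-integer labels, so it does not correspond to the two-feature script above.

| f0 | f1 | f2 | label | pred |
| --- | --- | --- | --- | --- |
| 1.0 | 7.0 | 9.0 | 16.8 | 16.614452974656647 |
| 1.0 | 3.0 | 3.0 | 6.7 | 6.754928617036061 |
| 1.0 | 2.0 | 4.0 | 6.9 | 6.871072594920224 |
| 1.0 | 3.0 | 4.0 | 8.0 | 7.787338643951784 |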
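
For intuition about what the training step in the script fits, the following NumPy sketch solves a ridge regression in closed form on the script's eight rows and then predicts with the result. It is an independent illustration, not Alink's implementation: `ridge_fit` is a hypothetical helper, leaving the intercept unpenalized is a choice made here, and Alink's trainer may use a different optimizer and penalty scaling, so its numbers need not match.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge fit with an unpenalized intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Center the data so the intercept is not shrunk by the penalty.
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Solve (Xc^T Xc + lam * I) w = Xc^T yc for the weights.
    w = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b

# Same rows as the script: columns are f0, f1, label.
data = np.array([
    [2, 1, 1], [3, 2, 1], [4, 3, 2], [2, 4, 1],
    [2, 2, 1], [4, 3, 2], [1, 2, 1], [5, 3, 3]])
X, y = data[:, :2], data[:, 2]

w, b = ridge_fit(X, y, lam=0.1)
pred = X @ w + b
print("weights:", w, "intercept:", b)
print("predictions:", np.round(pred, 3))
```

A larger `lam` shrinks the weights toward zero, and `lam=0` reduces to ordinary least squares.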