Serializing with MLeap
Serializing and deserializing with MLeap is a simple task. You can choose to serialize to a directory on the file system or to a zip file that can easily be shipped around.
Create a Simple MLeap Pipeline
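import ml.combust.bundle.BundleFile
import ml.combust.bundle.serializer.SerializationFormat
// VectorAssemblerModel, NodeShape, ScalarShape, TensorShape and PipelineModel are used below;
// their package locations here are assumed and may differ between MLeap versions
import ml.combust.mleap.core.feature.{OneHotEncoderModel, StringIndexerModel, VectorAssemblerModel}
import ml.combust.mleap.core.regression.LinearRegressionModel
import ml.combust.mleap.core.types.{NodeShape, ScalarShape, TensorShape}
import ml.combust.mleap.runtime.transformer.{Pipeline, PipelineModel}
import ml.combust.mleap.runtime.transformer.feature.{OneHotEncoder, StringIndexer, VectorAssembler}
import ml.combust.mleap.runtime.transformer.regression.LinearRegression
import org.apache.spark.ml.linalg.Vectors
import ml.combust.mleap.runtime.MleapSupport._
import resource._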
// Create a sample pipeline that we will serialize
// and then deserialize using various formats
val stringIndexer = StringIndexer(
  shape = NodeShape.scalar(inputCol = "a_string", outputCol = "a_string_index"),
  model = StringIndexerModel(Seq("Hello, MLeap!", "Another row")))
val oneHotEncoder = OneHotEncoder(
  shape = NodeShape.vector(1, 2, inputCol = "a_string_index", outputCol = "a_string_oh"),
  model = OneHotEncoderModel(2, dropLast = false))
val featureAssembler = VectorAssembler(
  shape = NodeShape().withInput("input0", "a_string_oh").
    withInput("input1", "a_double").withStandardOutput("features"),
  model = VectorAssemblerModel(Seq(TensorShape(2), ScalarShape())))
val linearRegression = LinearRegression(
  shape = NodeShape.regression(3),
  model = LinearRegressionModel(Vectors.dense(2.0, 3.0, 6.0), 23.5))
val pipeline = Pipeline(
  shape = NodeShape(),
  model = PipelineModel(Seq(stringIndexer, oneHotEncoder, featureAssembler, linearRegression)))
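To make it easier to see what this pipeline is expected to compute, the following plain-Scala sketch reproduces the four stages by hand. It makes no MLeap calls. The object and method names are hypothetical, and it assumes that the string indexer assigns indices in label order and that the assembler appends a_double after the one-hot vector.

```scala
// Hand-written mirror of the sample pipeline; names are illustrative, not MLeap API.
object PipelineByHand {
  val labels: Seq[String] = Seq("Hello, MLeap!", "Another row")
  val coefficients: Seq[Double] = Seq(2.0, 3.0, 6.0)
  val intercept: Double = 23.5

  def predict(aString: String, aDouble: Double): Double = {
    // String indexing: position of the label in the label list (assumed ordering)
    val index = labels.indexOf(aString)
    require(index >= 0, s"unseen label: $aString")
    // One-hot encoding with size 2 and dropLast = false
    val oneHot = Seq.tabulate(labels.size)(i => if (i == index) 1.0 else 0.0)
    // Vector assembly: one-hot vector followed by the raw double
    val features = oneHot :+ aDouble
    // Linear regression: dot product plus intercept
    features.zip(coefficients).map { case (x, w) => x * w }.sum + intercept
  }
}
```

For the input ("Hello, MLeap!", 1.5) the features are (1.0, 0.0, 1.5), so the sketch yields 2.0 + 0.0 + 9.0 + 23.5 = 34.5.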
Serialize to Zip File
In order to serialize to a zip file, make sure the URI begins with jar:file and ends with .zip.
For example: jar:file:/tmp/mleap-bundle.zip
JSON Format
for (bundle <- managed(BundleFile("jar:file:/tmp/mleap-examples/simple-json.zip"))) {
  pipeline.writeBundle.format(SerializationFormat.Json).save(bundle)
}
Protobuf Format
for (bundle <- managed(BundleFile("jar:file:/tmp/mleap-examples/simple-protobuf.zip"))) {
  pipeline.writeBundle.format(SerializationFormat.Protobuf).save(bundle)
}
Serialize to Directory
In order to serialize to a directory, make sure the URI begins with file.
For example: file:/tmp/mleap-bundle-dir
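The two URI rules (jar:file with a .zip suffix for zip files, file for directories) can be checked before a bundle is opened. The sketch below is one possible way to do that with plain string checks. BundleUris, BundleLocation and classify are hypothetical names introduced for illustration and are not part of MLeap.

```scala
// Illustrative helper, not MLeap API: classify a bundle URI by the rules in this section.
object BundleUris {
  sealed trait BundleLocation
  case object ZipBundle extends BundleLocation
  case object DirectoryBundle extends BundleLocation

  def classify(uri: String): Option[BundleLocation] =
    if (uri.startsWith("jar:file:") && uri.endsWith(".zip")) Some(ZipBundle)
    else if (uri.startsWith("file:")) Some(DirectoryBundle)
    else None // neither rule matches, e.g. a bare path without a scheme
}
```

A bare path such as /tmp/mleap-bundle.zip matches neither rule, which makes a forgotten scheme prefix easy to catch early.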
JSON Format
for (bundle <- managed(BundleFile("file:/tmp/mleap-examples/simple-json-dir"))) {
  pipeline.writeBundle.format(SerializationFormat.Json).save(bundle)
}
Protobuf Format
for (bundle <- managed(BundleFile("file:/tmp/mleap-examples/simple-protobuf-dir"))) {
  pipeline.writeBundle.format(SerializationFormat.Protobuf).save(bundle)
}
Deserializing
Deserializing is just as easy as serializing. You do not need to know beforehand which format the MLeap Bundle was serialized in; you only need to know where the bundle is.
Zip Bundle
// Deserialize a zip bundle
// Use Scala ARM to make sure resources are managed properly
val zipBundle = (for (bundle <- managed(BundleFile("jar:file:/tmp/mleap-examples/simple-json.zip"))) yield {
  bundle.loadMleapBundle().get
}).opt.get
Directory Bundle
// Deserialize a directory bundle
// Use Scala ARM to make sure resources are managed properly
val dirBundle = (for (bundle <- managed(BundleFile("file:/tmp/mleap-examples/simple-json-dir"))) yield {
  bundle.loadMleapBundle().get
}).opt.get
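The snippets above call .get twice, so a missing or unreadable bundle surfaces as an exception. As a hedged alternative, the sketch below keeps the Try returned by loadMleapBundle() and the Option returned by .opt, then pattern matches on both. It is written in the same script style as the examples above and reuses only the calls shown in this section, with one assumption: that the loaded bundle exposes its top-level transformer as a field named root.

```scala
import ml.combust.bundle.BundleFile
import ml.combust.mleap.runtime.MleapSupport._
import resource._
import scala.util.{Failure, Success}

// Same zip bundle as in the example above
val uri = "jar:file:/tmp/mleap-examples/simple-json.zip"

// Keep the Try from loadMleapBundle() instead of calling .get on it
val attempt = (for (bundle <- managed(BundleFile(uri))) yield {
  bundle.loadMleapBundle()
}).opt

attempt match {
  case Some(Success(loaded)) =>
    // Assumption: `root` is the deserialized top-level transformer
    println(s"Loaded transformer: ${loaded.root}")
  case Some(Failure(error)) =>
    println(s"Bundle was opened but could not be read: ${error.getMessage}")
  case None =>
    println(s"Could not open a bundle at $uri")
}
```

Whether this is preferable to .opt.get depends on the caller: a batch job may be happy to fail fast, while a long-running service will usually want to report the failure and keep going.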