Tensorflow

AI inference is a computationally intensive task that could benefit greatly from the speed of Rust and WebAssembly. However, the standard WebAssembly sandbox provides very limited access to native OS and hardware resources such as multi-core CPUs, GPUs, and specialized AI inference chips, which makes it a poor fit for AI workloads.

The popular WebAssembly System Interface (WASI) provides a design pattern for sandboxed WebAssembly programs to securely access native host functions. The WasmEdge Runtime extends the WASI model to support access to native Tensorflow libraries from WebAssembly programs. The WasmEdge Tensorflow Rust SDK combines the security, portability, and ease of use of WebAssembly with native-speed Tensorflow execution.

If you are not familiar with Rust, you can try our experimental AI inference DSL or try our JavaScript examples.


A Rust example

Prerequisites

You need to install WasmEdge and Rust.
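
For reference, both can usually be set up with their standard install scripts. The commands below are a hedged sketch: the WasmEdge installer's extension flag is an assumption, so check the official installation guides for the current options.

# Install the Rust toolchain with the official rustup installer
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# Install WasmEdge; "-e all" (install all extensions, including the Tensorflow ones) is an assumption
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash -s -- -e all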

Build

Check out the example source code.

git clone https://github.com/second-state/wasm-learning/
cd wasm-learning/cli/tflite

Use Rust Cargo to build the WebAssembly target.

rustup target add wasm32-wasi
cargo build --target wasm32-wasi --release
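
The build depends on the wasmedge_tensorflow_interface crate used in the code walkthrough below. A minimal Cargo.toml entry looks roughly like this (the version is a placeholder; use whatever the example project itself pins):

[dependencies]
# Placeholder version; match the example project's own Cargo.toml
wasmedge_tensorflow_interface = "0.2"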

Run

The wasmedge-tensorflow-lite utility is the WasmEdge build that includes the Tensorflow and Tensorflow Lite extensions.

$ wasmedge-tensorflow-lite target/wasm32-wasi/release/classify.wasm < grace_hopper.jpg
It is very likely a <a href='https://www.google.com/search?q=military uniform'>military uniform</a> in the picture

Make it run faster

To make Tensorflow inference run much faster, you can AOT-compile the WebAssembly file to native machine code, and then run that native code inside the WasmEdge sandbox.

$ wasmedgec target/wasm32-wasi/release/classify.wasm classify.wasm
$ wasmedge-tensorflow-lite classify.wasm < grace_hopper.jpg
It is very likely a <a href='https://www.google.com/search?q=military uniform'>military uniform</a> in the picture

Code walkthrough

It is fairly straightforward to use the WasmEdge Tensorflow API. You can see the entire source code in main.rs.

First, it reads the trained TFLite model file (a MobileNet model trained on ImageNet) and its label file. The label file maps the numeric output from the model to English names for the classified objects.

    let model_data: &[u8] = include_bytes!("models/mobilenet_v1_1.0_224/mobilenet_v1_1.0_224_quant.tflite");
    let labels = include_str!("models/mobilenet_v1_1.0_224/labels_mobilenet_quant_v1_224.txt");

Next, it reads the image from STDIN and converts it to the size and RGB pixel arrangement required by the Tensorflow Lite model.

    // Requires: use std::io::{self, Read};
    let mut buf = Vec::new();
    io::stdin().read_to_end(&mut buf).unwrap();
    let flat_img = wasmedge_tensorflow_interface::load_jpg_image_to_rgb8(&buf, 224, 224);
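
Since load_jpg_image_to_rgb8 returns one byte per color channel, a quick sanity check (not part of the original example) is that the flattened buffer holds exactly 224 × 224 × 3 bytes:

    // Illustrative check: 224 x 224 pixels, 3 bytes (R, G, B) each
    assert_eq!(flat_img.len(), 224 * 224 * 3);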

Then, the program runs the TFLite model with its required input tensor (i.e., the flat image in this case), and receives the model output. In this case, the model output is an array of numbers. Each number corresponds to the probability of an object name in the label text file.

    let mut session = wasmedge_tensorflow_interface::Session::new(&model_data, wasmedge_tensorflow_interface::ModelType::TensorFlowLite);
    session.add_input("input", &flat_img, &[1, 224, 224, 3])
           .run();
    let res_vec: Vec<u8> = session.get_output("MobilenetV1/Predictions/Reshape_1");
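
Because this is the quantized MobileNet model, each output byte is an integer between 0 and 255 rather than a float. If you want an approximate probability, you can scale the score yourself; this is a convenience sketch under that assumption, not part of the original code:

    // Assume the quantized output maps 0..=255 linearly onto 0.0..=1.0
    let approx_prob = |score: u8| score as f32 / 255.0;
    println!("class 0 score: {:.3}", approx_prob(res_vec[0]));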

Let’s find the object with the highest probability, and then look up the name in the labels file.

    let mut i = 0;
    let mut max_index: i32 = -1;
    let mut max_value: u8 = 0;
    while i < res_vec.len() {
        let cur = res_vec[i];
        if cur > max_value {
            max_value = cur;
            max_index = i as i32;
        }
        i += 1;
    }
    let mut label_lines = labels.lines();
    for _i in 0..max_index {
        label_lines.next();
    }
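
The same lookup can also be written more compactly with iterators. The sketch below is an equivalent alternative, not the example's own code (on ties it keeps the last maximum rather than the first):

    // Find the index of the highest-scoring class, then take the matching label line
    let (max_index, max_value) = res_vec
        .iter()
        .enumerate()
        .max_by_key(|&(_, v)| *v)
        .map(|(i, &v)| (i, v))
        .unwrap_or((0, 0));
    let class_name = labels.lines().nth(max_index).unwrap_or("unknown");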

Finally, it prints the result to STDOUT.

    let class_name = label_lines.next().unwrap().to_string();
    // Turn the raw score into a confidence phrase; the thresholds here are
    // illustrative assumptions, not necessarily the exact values in main.rs.
    let confidence = if max_value > 200 {
        "is very likely"
    } else if max_value > 125 {
        "is likely"
    } else {
        "could be"
    };
    if max_value > 50 {
        println!("It {} a <a href='https://www.google.com/search?q={}'>{}</a> in the picture", confidence, class_name, class_name);
    } else {
        println!("It does not appear to be any recognizable object in the picture.");
    }

Deployment options

All the tutorials below use the WasmEdge Rust API for Tensorflow to create AI inference functions. Those Rust functions are then compiled to WebAssembly and deployed together with WasmEdge on the cloud.

Serverless functions

The following tutorials showcase how to deploy WebAssembly programs (written in Rust) on public cloud serverless platforms. The WasmEdge Runtime runs inside a Docker container on those platforms. Each serverless platform provides APIs to get data into and out of the WasmEdge runtime through STDIN and STDOUT.

Second State FaaS and Node.js

The following tutorials showcase how to deploy WebAssembly functions (written in Rust) on the Second State FaaS. Since the FaaS service is running on Node.js, you can follow the same tutorials for running those functions in your own Node.js server.

Service mesh

The following tutorials showcase how to deploy WebAssembly functions and programs (written in Rust) as sidecar microservices.

  • The Dapr template shows how to build and deploy Dapr sidecars in Go and Rust. The sidecars then use the WasmEdge SDK to start WebAssembly programs that process the workloads sent to the microservices.

Data streaming framework

The following tutorials showcase how to deploy WebAssembly functions (written in Rust) as embedded handler functions in data streaming frameworks for AIoT.

  • The YoMo template starts the WasmEdge Runtime to process image data as the data streams in from a camera in a smart factory.