Serving TensorFlow Models

Deploying TensorFlow models does not require writing an additional manifest for uploading, because ML Lambda infers model signatures automatically. TensorFlow models run on the hydrosphere/serving-runtime-tensorflow runtime. The only thing you need to take care of is exporting your model properly with tf.saved_model.

Suppose we already have a TensorFlow model that recognizes handwritten digits. We will cover serving for both the TensorFlow Basic API and the Estimator API.

Basic API

When serving a TensorFlow model, the only thing you have to do is correctly identify and prepare the input and output tensors. We’ve implemented a linear model using the low-level TensorFlow API. The full code is available in our GitHub repository. In the code you can find the following lines, which define the input and output tensors.

img, label = iterator.get_next()    # input image and label
pred = tf.nn.softmax(logits)        # output prediction
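
To make the output tensor concrete, here is a minimal pure-Python sketch of what a softmax computes and how a digit can be read from it. It is not the model's code: the `softmax` and `predicted_digit` helpers and the sample logits are hypothetical, chosen only for illustration.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing; the result is unchanged.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predicted_digit(probabilities):
    """Index of the most probable class, i.e. the recognized digit."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])

# Hypothetical logits for the ten digit classes 0..9.
logits = [0.1, 0.3, 2.5, 0.0, -1.2, 0.4, 0.2, 0.9, -0.5, 0.6]
pred = softmax(logits)
print(predicted_digit(pred))
```

The largest logit always maps to the largest probability, so the recognized digit is simply the index of the maximum entry in pred.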

Once the model has been trained, you can export the whole computational graph and the trained weights with tf.saved_model. To do so, define a signature definition, which declares a computation supported on the graph.

# Save model
signature_map = {
    "infer": tf.saved_model.signature_def_utils.predict_signature_def(
        inputs={"img": img}, outputs={"pred": pred})
}

After that, create a SavedModelBuilder object and save the model to the desired directory.

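builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
builder.add_meta_graph_and_variables(sess=sess,         # session, where the graph was initialized
    tags=[tf.saved_model.tag_constants.SERVING], signature_def_map=signature_map)
builder.save()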
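
Before uploading, it can help to confirm that the export directory actually contains a SavedModel. The sketch below assumes the usual layout written by the builder (a saved_model.pb or saved_model.pbtxt file next to a variables/ directory). The helper name `looks_like_saved_model` is hypothetical, and the check is a sanity test, not a full validation of the graph.

```python
import os
import tempfile

def looks_like_saved_model(export_dir):
    """Return True if export_dir has the files a SavedModel export normally contains."""
    has_graph = any(
        os.path.isfile(os.path.join(export_dir, name))
        for name in ("saved_model.pb", "saved_model.pbtxt"))
    has_variables = os.path.isdir(os.path.join(export_dir, "variables"))
    return has_graph and has_variables

# Self-check against a throwaway directory that mimics the layout.
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, "saved_model.pb"), "wb").close()
    os.mkdir(os.path.join(demo_dir, "variables"))
    demo_ok = looks_like_saved_model(demo_dir)
    # The empty variables/ directory itself is not a SavedModel.
    empty_ok = looks_like_saved_model(os.path.join(demo_dir, "variables"))
print(demo_ok, empty_ok)
```

In practice you would call looks_like_saved_model(export_dir) with the same export_dir that was passed to SavedModelBuilder.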

Serving the model

Upload the exported model to ML Lambda.

$ hs upload --name mnist

Now the model is uploaded to the serving service but is not yet available for invocation. Create an application to declare an endpoint for your model. You can create it manually via the web interface or by providing an application manifest. To use the web interface, open http://<host>/, where ML Lambda has been deployed, go to the Applications page, and create a new app that uses the mnist model. To use a manifest instead:

# application.yaml 

kind: Application
name: mnist_app
  model: mnist:1
  runtime: hydrosphere/serving-runtime-tensorflow:1.7.0-latest

$ hs apply -f application.yaml

That’s it. You can now request predictions.

$ curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{ "img": [ [ [ 1, 1, 1, ... 1, 1, 1 ] ] ] }' 'https://<host>/gateway/applications/mnist_app/infer'
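
The same request can be sketched in Python with only the standard library. The URL pattern and headers are copied from the curl call. The host serving.example.com is a placeholder, the helper name `build_infer_request` is hypothetical, and the payload key is assumed to be the input name declared in the signature ("img" in the signature map shown earlier).

```python
import json
import urllib.request

def build_infer_request(host, application, signature, inputs):
    """Build (but do not send) a POST request equivalent to the curl call."""
    url = "https://{}/gateway/applications/{}/{}".format(host, application, signature)
    body = json.dumps(inputs).encode("utf-8")
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")

# Placeholder image of 784 ones; the nesting depth is copied from the curl example.
image = [1.0] * 784
request = build_infer_request(
    "serving.example.com", "mnist_app", "infer", {"img": [[image]]})
print(request.full_url)
# To actually send it: urllib.request.urlopen(request).read()
```

Building the request separately from sending it makes it easy to inspect the URL and body before anything touches the network.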

Estimator API

Saving a model written with the Estimator API does not differ much from the Basic API. As before, you have to describe the model’s serving inputs and export the model in the tf.saved_model format. All code for this model is available in our GitHub repository.

imgs = tf.feature_column.numeric_column("imgs", shape=(784,))
estimator = tf.estimator.DNNClassifier(
    hidden_units=[256, 64],
    feature_columns=[imgs],    # the column declared above
    n_classes=10)              # ten digit classes, 0-9

Instead of declaring a signature map as in the Basic API example, you build a serving input receiver function with tf.estimator.export.build_raw_serving_input_receiver_fn and pass it to export_savedmodel. The imgs tensor in the serving input receiver function should match the tf.feature_column expected by the estimator: the same name and a compatible shape and type.

serving_input_receiver_fn = tf.estimator.export.build_raw_serving_input_receiver_fn({
    "imgs": tf.placeholder(tf.float32, shape=(None, 784))})
estimator.export_savedmodel(export_dir, serving_input_receiver_fn)
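
The shape (None, 784) means "any number of rows, each with exactly 784 values". As a rough illustration, the sketch below prepares such a batch in plain Python. It assumes 28x28 input images, which the text does not state (only the 784 values per image come from the code above), and the helper names `flatten_image` and `make_batch` are hypothetical.

```python
def flatten_image(image):
    """Flatten a 28x28 nested list of pixel values into a flat list of 784 floats."""
    flat = [float(pixel) for row in image for pixel in row]
    if len(flat) != 784:
        raise ValueError("expected 784 pixels, got {}".format(len(flat)))
    return flat

def make_batch(images):
    """Stack flattened images into rows compatible with shape (None, 784)."""
    return [flatten_image(image) for image in images]

# Three blank images; any batch size fits the None dimension.
blank = [[0] * 28 for _ in range(28)]
batch = make_batch([blank, blank, blank])
print(len(batch), len(batch[0]))
```

Checking the row length on the client side catches shape mistakes before the request reaches the serving runtime.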

Serving the model

Upload the exported model to ML Lambda.

$ hs upload --name mnist

The rest is the same as with the Basic API example. Create an application and infer predictions.

$ curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{ "imgs": [ [ [ 1, 1, 1, ... 1, 1, 1 ] ] ] }' 'https://<host>/gateway/applications/mnist_app/predict'

What’s next?