Serving Python Model

By Python models we mean any arbitrary logic that can be expressed in Python: adding a number to your inputs, processing images, serving models saved in a binary format (scikit-learn, Keras, gluon-cv), and so on. To keep this tutorial simple, we will do just the first of these: increment an input number by one.

Create a handler

Python models have to follow a specific directory structure in order to be used with the hydrosphere/serving-runtime-python runtime. Start by creating a directory named increment_model.

$ mkdir -p increment_model/src
$ cd increment_model
$ touch src/

The model directory must contain a src folder with the entry-point file inside. This is the main file used by the runtime. You may create any arbitrary Python application within your model; just keep in mind that the entry point has to be src/

# src/

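import tensorflow as tf
import hydro_serving_grpc as hs

def increment(number):
    # Convert the incoming TensorProto-like structure to a numpy ndarray
    request_number = tf.make_ndarray(number)
    response_number = request_number + 1

    # Rebuild the response shape from the dimensions of the input tensor
    response_tensor_shape = [hs.TensorShapeProto.Dim(size=dim.size) for dim in number.tensor_shape.dim]
    response_tensor = hs.TensorProto(
        int_val=response_number.flatten(),
        dtype=hs.DT_INT32,
        tensor_shape=hs.TensorShapeProto(dim=response_tensor_shape))

    return hs.PredictResponse(outputs={"number": response_tensor})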

There are different ways of handling incoming tensors. Here we’ve used the TensorFlow function tf.make_ndarray to create a NumPy ndarray from number, a raw TensorProto-like structure. We then added one to that ndarray, packed the result back into a response tensor, and returned it.
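The unpack, compute, repack flow described above can be sketched without TensorFlow or the Serving libraries. In this minimal sketch, FakeTensor is a hypothetical stand-in for a TensorProto-like message (it is not part of hydro_serving_grpc); it only illustrates that the values travel as a flat list alongside a shape.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a TensorProto-like message (illustration only)."""
    int_val: List[int]  # flat list of values
    shape: List[int]    # dimension sizes


def increment_sketch(number: FakeTensor) -> dict:
    # Unpack: read the flat values out of the incoming structure.
    values = list(number.int_val)
    # Compute: add one to every element.
    incremented = [v + 1 for v in values]
    # Repack: keep the input shape and return the result under the output field name.
    return {"number": FakeTensor(int_val=incremented, shape=list(number.shape))}


result = increment_sketch(FakeTensor(int_val=[1, 2, 3], shape=[3]))
print(result["number"].int_val)
```

The real handler does the same three steps, only with tf.make_ndarray for unpacking and hs.TensorProto for repacking.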

All model’s files are stored in the /model/files directory inside the container.

Note: If you use an external file (such as the model’s weights), you have to specify the absolute path to that file.
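One way to follow this rule is to build the absolute path from the /model/files directory mentioned above. A minimal sketch, assuming a hypothetical weights file named weights.bin shipped alongside the model (the file name and the helper function are illustrative, not part of Serving):

```python
import posixpath

# Directory where model files are stored inside the container (from the text above).
MODEL_FILES_DIR = "/model/files"


def model_file(relative_path: str) -> str:
    """Return the absolute in-container path of a file shipped with the model."""
    # posixpath is used because the path refers to the Linux container,
    # regardless of the operating system this snippet is run on.
    return posixpath.join(MODEL_FILES_DIR, relative_path)


# "weights.bin" is a hypothetical file name used for illustration.
weights_path = model_file("weights.bin")
print(weights_path)
```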

Define dependencies

By default, the container where this model will run does not have any scientific packages pre-installed. You have to provide a requirements.txt file listing the dependencies, so that Serving knows which Python packages to install when building the model. We’ve used the TensorFlow and NumPy packages in this example:

# requirements.txt

tensorflow
numpy

Write a manifest

Since we have defined a handler function, we have to tell Serving what to call from src/. In this case it’s the increment function. Create a serving.yaml manifest.

# serving.yaml

kind: Model
name: "increment_model"
model-type: "python:3.6"
payload:
  - "src/"
  - "requirements.txt"

contract:
  increment:                  # Signature function
    inputs:
      number:                 # Input field name
        shape: [-1]
        type: int32
        profile: numeric
    outputs:
      number:                 # Output field name
        shape: [-1]
        type: int32
        profile: numeric

This file describes the model: its name, type, payload files, and contract. The contract declares the model’s input and output fields along with their data types, shapes, and profile information.
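To make the contract concrete, the sketch below checks a JSON-style payload against the number input declared in the manifest: a one-dimensional list of any length (assuming that -1 in shape means the dimension size is not fixed) whose elements fit in int32. The function name is hypothetical, and this is not how Serving itself validates requests; it only illustrates what the contract expresses.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def matches_number_contract(payload: dict) -> bool:
    """Check that payload["number"] is a flat list of int32-compatible integers."""
    values = payload.get("number")
    if not isinstance(values, list):
        return False
    for v in values:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(v, bool) or not isinstance(v, int):
            return False
        if not INT32_MIN <= v <= INT32_MAX:
            return False
    return True


print(matches_number_contract({"number": [1, 2, 3]}))    # flat list of ints
print(matches_number_contract({"number": [[1], [2]]}))   # nested list, wrong shape
```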

That’s it, you’ve just created a simple model which you can use within your business applications.

Serve the model

Upload the model to Serving.

$ hs upload

Now the model is uploaded to Serving but is not yet available for invocation. Create an application to declare an endpoint for your model. You can create it manually via the web interface or by providing an application manifest. To use the web interface, open http://<host>/ where Serving has been deployed, go to the Applications page, and create a new app that uses increment_model. To use a manifest instead:

# application.yaml

kind: Application
name: increment_app
singular:
  model: increment_model:1
  runtime: hydrosphere/serving-runtime-python:3.6-latest

$ hs apply -f application.yaml

That’s it, now you can increment numbers.

$ curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{ "number": [1] }' 'https://<host>/gateway/applications/increment_app/increment'
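The same request can be prepared from Python with the standard library. This is a minimal sketch that mirrors the curl call above; the helper name is hypothetical and serving.example.com is a placeholder for your own host. The request is only built here, not sent.

```python
import json
import urllib.request


def build_increment_request(host: str, numbers: list) -> urllib.request.Request:
    """Build (but do not send) the POST request for the increment application."""
    url = f"https://{host}/gateway/applications/increment_app/increment"
    body = json.dumps({"number": numbers}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


# "serving.example.com" is a placeholder host, not a real deployment.
request = build_increment_request("serving.example.com", [1])
print(request.get_method(), request.full_url)
```

To actually send it against a running deployment, pass the request to urllib.request.urlopen and read the response body.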

What’s next?