Getting Started

This document presents a quick way to set everything up and deploy your first model.


To get started, you need Docker and Docker Compose on your machine. Make sure both are configured correctly and working. Note for Windows users: you need to fix volume paths.


Clone ML Lambda to the desired directory.

$ cd ~/
$ git clone

Now set up a Docker environment. You have two options:

  1. The lightweight version lets you manage your models in a continuous manner, version them, create applications that use your models and deploy all of this into production. To do that, just execute the following:

     $ cd ~/hydro-serving/
     $ docker-compose up
  2. The integrations version extends the lightweight version and also lets you integrate Kafka, Grafana and InfluxDB.

     $ cd ~/hydro-serving/integrations/
     $ docker-compose up

Note: If you’ve already installed one of the versions and want to install the other one, you’ll need to remove the containers that both versions define in docker-compose.yaml.

Open the web interface at http://localhost/. You are ready to go.


ML Lambda has a CLI tool that is used to upload your models to the server. It supports Python 3.4 and above. To install it, run:

$ pip install hs

To work with ML Lambda you need to set up a cluster that your models will be uploaded to.

$ hs cluster add --name local --server http://localhost

To learn more about clusters check the CLI page.

Uploading models

To get a feel for ML Lambda, we recommend going through the two tutorials below: uploading a demo model and uploading your own model. They show different aspects of working with model configuration, building multi-staged and single-staged applications, etc.

Demo model

We’ve already created a few examples that you can run to see how everything works. Let’s clone them and pick a model.

$ cd ~/
$ git clone
$ cd ~/hydro-serving-example/models/$MODEL_OF_YOUR_CHOICE

Fetching & uploading Stateful LSTM

For the purpose of this tutorial we chose Stateful LSTM. All you have to do is upload the model to the server.

$ cd ~/hydro-serving-example/models/stateful_lstm
$ hs upload 

Stateful LSTM is actually a multi-staged application. It includes additional parts: pre- and post-processing stages. So we upload them as well.

$ cd ~/hydro-serving-example/models/stateful_lstm_preprocessing
$ hs upload
$ cd ~/hydro-serving-example/models/stateful_lstm_postprocessing
$ hs upload 

Your models have now been uploaded to ML Lambda. You can find them at http://localhost/models.
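
The three uploads above can also be scripted. The sketch below is only an illustration: it assumes the hs CLI is on your PATH and that the examples were cloned to ~/hydro-serving-example. The helper names upload_plan and run_uploads are hypothetical and not part of ML Lambda.

```python
import subprocess
from pathlib import Path

# Directory names taken from the upload steps of this tutorial.
STAGES = (
    "stateful_lstm",
    "stateful_lstm_preprocessing",
    "stateful_lstm_postprocessing",
)


def upload_plan(models_root, stages=STAGES):
    """Return (working_directory, command) pairs, one per model to upload."""
    root = Path(models_root).expanduser()
    return [(root / name, ["hs", "upload"]) for name in stages]


def run_uploads(models_root):
    """Run `hs upload` inside every model directory, stopping on the first failure."""
    for cwd, command in upload_plan(models_root):
        subprocess.run(command, cwd=str(cwd), check=True)
```

Calling run_uploads("~/hydro-serving-example/models") would then perform the same three uploads in order.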

Creating an application

Let’s create an application that can use our models. Open the Applications page, press Add New and reproduce the following structure in the Models section.

Stage    Model                Runtime
Stage_1  demo_preprocessing   hydrosphere/serving-runtime-python:3.6:latest
Stage_2  stateful_lstm        hydrosphere/serving-runtime-tensorflow:1.7.0:latest
Stage_3  demo_postprocessing  hydrosphere/serving-runtime-python:3.6:latest

Note: There’s an option to add multiple models to one stage, which might be confusing because you could put all pipeline steps (pre, lstm, post) into a single stage. Make sure you add each new step via the Add New Stage button.

When creating an application, ML Lambda will automatically infer a contract for it, as well as for models. For a single-staged application, it will look up the model’s signatures and take one from there. For a multi-staged application, it will create a signature with signature_name equal to the application’s name.
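
The naming rule above fits in a few lines of Python. This is only an illustration of the rule as described here, not ML Lambda’s actual code. The function name, the data layout (a list of stages, each a list of model dicts) and the signature names of the demo stages are all assumptions made for the example.

```python
def application_signature_name(app_name, stages):
    """Pick the signature name for an application, following the rule above.

    `stages` is a list of stages; each stage is a list of model dicts
    with a "signatures" list (a hypothetical layout used only here).
    """
    if len(stages) == 1:
        # Single-staged: reuse a signature defined by the model itself.
        model = stages[0][0]
        return model["signatures"][0]
    # Multi-staged: the signature is named after the application.
    return app_name


single = [[{"name": "linear_regression", "signatures": ["infer"]}]]
multi = [
    [{"name": "demo_preprocessing", "signatures": ["preprocess"]}],   # made-up signature name
    [{"name": "stateful_lstm", "signatures": ["predict"]}],           # made-up signature name
    [{"name": "demo_postprocessing", "signatures": ["postprocess"]}], # made-up signature name
]

single_name = application_signature_name("linear_regression", single)
multi_name = application_signature_name("demo_lstm", multi)
```

The two results line up with the endpoint paths used in the HTTP examples later on: linear_regression/infer and demo_lstm/demo_lstm.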

Invoking applications is described in a section below.

Own model

In this section we will start from scratch, create a simple linear regression model, configure it, upload to the server and create an application for it.

Creating the model

Create a directory for the model and add the training script inside it.

$ mkdir ~/linear_regression
$ cd ~/linear_regression
$ touch

Inside the script we will define our model, train it and save it.

from keras.models import Sequential
from keras.layers import Dense
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# initialize data
n_samples = 1000
X, y = make_regression(n_samples=n_samples, n_features=2, noise=0.5, random_state=112)

# scale features and target to the [0, 1] range
scaler_x, scaler_y = MinMaxScaler(), MinMaxScaler()
scaler_x.fit(X)
scaler_y.fit(y.reshape(n_samples, 1))
X = scaler_x.transform(X)
y = scaler_y.transform(y.reshape(n_samples, 1))

# create a model
model = Sequential()
model.add(Dense(4, input_dim=2, activation='relu'))
model.add(Dense(4, activation='relu'))
model.add(Dense(1, activation='linear'))

model.compile(loss='mse', optimizer='adam')
model.fit(X, y, epochs=1000, verbose=0)

# save model
model.save('model.h5')
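
The training script relies on MinMaxScaler. As a rough sketch of what that step computes for a single column, here is the same arithmetic in plain Python. The function names are hypothetical and this is not scikit-learn’s implementation, just the formula (value - min) / (max - min) and its inverse.

```python
def min_max_scale(values):
    """Scale numbers linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Constant column: nothing to spread out, map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def min_max_restore(scaled, lo, hi):
    """Undo min_max_scale, given the original minimum and maximum."""
    return [s * (hi - lo) + lo for s in scaled]


scaled = min_max_scale([10, 20, 30])
restored = min_max_restore(scaled, 10, 30)
```

Because the target y is scaled as well, the model’s predictions live in the scaled range and can be mapped back the same way.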

There are actually two ways to run Keras models inside ML Lambda:

  1. If Keras is used with TensorFlow backend, then it’s possible to export tf.session and use it inside hydrosphere/serving-runtime-tensorflow runtime.
  2. Otherwise it’s possible to use hydrosphere/serving-runtime-python runtime and define all actions in a python script.

Note: For this tutorial we will use the second approach. If you want to get familiar with the first one, you can visit the Models page.

Creating a handler

For running this model we will use the Python runtime. The internal structure of a model for this runtime should have a src directory with the handler file inside it.

$ mkdir src 
$ cd src
$ touch

By default ML Lambda puts all of the model’s files inside the /model/files/ directory, so we load our model from there.

import hydro_serving_grpc as hs
import numpy as np
from keras.models import load_model

model = load_model('/model/files/model.h5')

def infer(x):
    # 1. Prepare data points
    data = np.array(x.double_val)
    data = data.reshape([dim.size for dim in x.tensor_shape.dim])

    # 2. Make prediction
    result = model.predict(data)
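    # assumption: pack the prediction as a flat double tensor of shape [-1],
    # matching the output y declared in the contract
    y = hs.TensorProto(
        dtype=hs.DT_DOUBLE,
        double_val=result.flatten().tolist(),
        tensor_shape=hs.TensorShapeProto(dim=[hs.TensorShapeProto.Dim(size=-1)]))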

    # 3. Return the result
    return hs.PredictResponse(outputs={"y": y})

Note: ML Lambda interacts with a runtime container via gRPC interface.
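
To see what step 1 of the handler does, it helps to run the same logic without the runtime. The sketch below uses a stand-in object instead of a real hs.TensorProto and plain lists instead of NumPy. rows_from_tensor is a hypothetical helper that covers only the two-dimensional [-1, n] case used in this tutorial.

```python
from types import SimpleNamespace


def rows_from_tensor(tensor):
    """Rebuild rows from a flat tensor whose shape is [-1, n_columns]."""
    dims = [dim.size for dim in tensor.tensor_shape.dim]
    n_columns = dims[-1]
    values = list(tensor.double_val)
    if len(values) % n_columns != 0:
        raise ValueError("number of values is not a multiple of the row length")
    # Values are stored row by row, so consecutive slices form the rows.
    return [values[i:i + n_columns] for i in range(0, len(values), n_columns)]


# Stand-in for hs.TensorProto, exposing only the two fields the handler reads.
fake_tensor = SimpleNamespace(
    double_val=[1.0, 2.0, 3.0, 4.0],
    tensor_shape=SimpleNamespace(
        dim=[SimpleNamespace(size=-1), SimpleNamespace(size=2)]
    ),
)
rows = rows_from_tensor(fake_tensor)
```

Four values with a row length of 2 give two data points, which is what the reshape call in the handler produces as well.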

Describing the model

Since our model will be running inside a raw Python container, the container won’t have any required dependencies pre-installed. Let’s create a requirements.txt for that.


Right now ML Lambda does not understand what our model expects. Let’s define the contract in serving.yaml.

kind: Model
name: linear_regression
model-type: python:3.6
payload:
  - "src/"
  - "requirements.txt"
  - "model.h5"

contract:
  infer:
    inputs:
      x:
        shape: [-1, 2]
        type: double
        profile: numerical
    outputs:
      y:
        shape: [-1]
        type: double
        profile: numerical

Here we’ve described the type of the model, its files and some additional artifacts, and defined the inputs and outputs of the model. For more information check the Models page.
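
To make the shapes in the contract concrete, here is a small checker that tests a nested list against a shape such as [-1, 2]. It is a sketch for illustration only and assumes the usual convention that -1 stands for a dimension of any length. matches_shape is a hypothetical helper, not something ML Lambda provides.

```python
def matches_shape(value, shape):
    """Check a nested list against a contract shape; -1 means "any length"."""
    if not shape:
        # No dimensions left: we expect a single number here.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if not isinstance(value, list):
        return False
    expected = shape[0]
    if expected != -1 and len(value) != expected:
        return False
    return all(matches_shape(item, shape[1:]) for item in value)


valid_input = matches_shape([[1, 1], [1, 1]], [-1, 2])
invalid_input = matches_shape([[1, 1, 1]], [-1, 2])
valid_output = matches_shape([0.5, 0.7], [-1])
```

With the contract above, any number of two-element rows is a valid x, while a row with three elements is not.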

Overall structure of our model now should look like this:

├── model.h5
├── requirements.txt
├── serving.yaml
└── src

Note: Although the training script also sits in the model directory, it won’t be uploaded to ML Lambda since we didn’t specify it in payload.

Uploading the model

Now we’re ready to upload our model.

$ hs upload

You can open the http://localhost/models page to see the uploaded model.

Creating an application

Open the Applications page and press the Add New button. Select your model, select hydrosphere/serving-runtime-python as the runtime, then create the application. Now you can invoke your app.

Note: If you cannot find your newly uploaded model even though it’s listed on your models page, it’s still in the building stage. Wait until the model changes its status to Released, then you can use it.
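
If you script this step, waiting for the Released status is a plain polling loop. The sketch below is generic. How the status is actually fetched is not covered in this guide, so get_status is a hypothetical callable you would have to supply yourself.

```python
import time


def wait_until_released(get_status, timeout=300.0, interval=5.0, sleep=time.sleep):
    """Poll get_status() until it returns "Released" or the timeout expires.

    get_status is a caller-supplied function (hypothetical here) that
    returns the current status of the model as a string.
    """
    waited = 0.0
    while True:
        if get_status() == "Released":
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
```

The sleep parameter exists only so the loop can be exercised without real delays.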

Invoking applications

Applications can be invoked via different interfaces.

Test request

You can perform a test request to the model from the web interface. Open the desired application and press the Test button. Internally it will generate arbitrary input data from the model’s contract and send an HTTP request to the API endpoint.


Send a POST request to ML Lambda.

  1. demo_lstm

     $ curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{
     "data": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1] 
     }' 'http://localhost/gateway/applications/demo_lstm/demo_lstm'
  2. linear_regression

     $ curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{
     "x": [[1, 1],[1, 1]]}' 'http://localhost/gateway/applications/linear_regression/infer'

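The same requests can be sent from Python with the standard library alone. This sketch mirrors the curl commands above: the URL layout and headers are taken from them, while the helper names and the assumption that the response body is JSON are ours.

```python
import json
import urllib.request

GATEWAY = "http://localhost/gateway/applications"


def build_request(application, signature, payload):
    """Build the POST request that the curl commands above send."""
    url = "{}/{}/{}".format(GATEWAY, application, signature)
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def invoke(application, signature, payload, timeout=10):
    """Send the request and decode the response, assuming it is JSON."""
    request = build_request(application, signature, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, invoke("linear_regression", "infer", {"x": [[1, 1], [1, 1]]}) corresponds to the second curl command.
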
gRPC API call

You can define a gRPC client on your side and make a call from it. Here we provide a Python example, but this can be done in any language.

import grpc 
import hydro_serving_grpc as hs

# connect to your ML Lambda instance
channel = grpc.insecure_channel("localhost")
stub = hs.PredictionServiceStub(channel)

# 1. define a model, that you'll use
model_spec = hs.ModelSpec(name="linear_regression", signature_name="infer")
# 2. define tensor_shape for Tensor instance
tensor_shape = hs.TensorShapeProto(dim=[hs.TensorShapeProto.Dim(size=-1), hs.TensorShapeProto.Dim(size=2)])
# 3. define tensor with needed data
tensor = hs.TensorProto(dtype=hs.DT_DOUBLE, tensor_shape=tensor_shape, double_val=[1,1,1,1])
# 4. create PredictRequest instance
request = hs.PredictRequest(model_spec=model_spec, inputs={"x": tensor})

# call Predict method
result = stub.Predict(request)

Note: For convenience we’ve already generated all our proto files into a Python library and published it on PyPI. You can install it via pip install hydro-serving-grpc
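
Note how the gRPC request carries the data as a flat double_val list plus a shape, while the HTTP request uses nested lists. A small helper can convert one into the other. flatten_rows is a hypothetical name and the sketch handles only two-dimensional input.

```python
def flatten_rows(rows):
    """Turn a list of equally sized rows into (flat_values, (n_rows, n_cols))."""
    if not rows:
        return [], (0, 0)
    n_cols = len(rows[0])
    if any(len(row) != n_cols for row in rows):
        raise ValueError("all rows must have the same length")
    # Row-major order: all values of the first row, then the second, and so on.
    flat = [float(value) for row in rows for value in row]
    return flat, (len(rows), n_cols)


flat, dims = flatten_rows([[1, 1], [1, 1]])
```

Here flat is the list you would pass as double_val, and dims gives the sizes for the two Dim entries of the tensor shape.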