Add custom monitoring metrics

Dozens of new models are introduced every month, and monitoring all of them is extremely hard due to the variability of the types and architectures they are built upon. To mitigate this issue, we introduced the ability to write custom monitoring metrics. In this article you will learn how to create one, with in-depth explanations.

Case

As an example, we chose to create a custom metric that tracks the reconstruction rate of your production data. It will let you know how similar your production data is to your training data.
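To make the idea concrete, here is a minimal sketch of a reconstruction-based drift score. The PCA choice, the generated data, and all names below are our own illustration, not part of the platform: a reconstruction model is fitted on training data, and the reconstruction error on a production sample shows how far that sample is from the training distribution.

```python
import numpy as np
from sklearn.decomposition import PCA

# Illustration only: fit PCA on (stand-in) training data and use the
# reconstruction error of a sample as its drift score.
rng = np.random.default_rng(42)
train = rng.normal(size=(500, 3))  # stand-in for the training features

pca = PCA(n_components=2).fit(train)

def reconstruction_error(sample: np.ndarray) -> float:
    """Squared error between a sample and its PCA reconstruction."""
    restored = pca.inverse_transform(pca.transform(sample.reshape(1, -1)))
    return float(np.sum((sample - restored) ** 2))

# A sample drawn from the training distribution scores low;
# an out-of-distribution sample scores noticeably higher.
in_dist = reconstruction_error(train[0])
out_dist = reconstruction_error(np.array([25.0, -30.0, 40.0]))
```

The higher the score, the less the production sample resembles the data the model was trained on.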

Monitored Model

The actual model that we want to monitor predicts, based on a set of features, whether a person will earn more than $50k per year. For the sake of the example, we define only a subset of the features.

kind: Model
name: adult-salary
payload:
  - "src/"
  - "requirements.txt"
  - "classification_model.joblib"
runtime: "hydrosphere/serving-runtime-python-3.6:2.1.0"
install-command: "pip install -r requirements.txt"
contract:
  name: "predict"
  inputs:
    age:
      shape: scalar
      type: int64
      profile: numerical
    workclass:
      shape: scalar
      type: int64
      profile: numerical
    education:
      shape: scalar
      type: int64
      profile: numerical
  outputs:
    classes:
      shape: scalar
      type: int64
      profile: numerical

You may ask yourself: what will the inputs of the metric model be? They are the inputs plus the outputs of the adult-salary model defined above. Why do we also need the outputs of the monitored model? In some cases you need the model's outputs as well to calculate a metric; if you don't need them right away, just leave them alone. We will cover cases with model outputs more deeply in upcoming releases.
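For orientation, the contract above can be backed by a handler inside the payload's src/ directory. The sketch below is our assumption of what such a handler might look like: we assume the runtime calls a function named after the contract ("predict") with the declared inputs as keyword arguments, and we use an inline stub instead of loading classification_model.joblib, purely for illustration.

```python
# Hypothetical sketch of a handler for the adult-salary model.
# The real handler would do: model = joblib.load("classification_model.joblib")

class _StubClassifier:
    """Stand-in for the joblib-loaded classifier, for illustration only."""
    def predict(self, rows):
        # Pretend rule: older, more educated people are classed as >$50k (1).
        return [int(age > 40 and education > 10) for age, _, education in rows]

model = _StubClassifier()

def predict(age: int, workclass: int, education: int) -> dict:
    # The keys of the returned dict must match the contract's outputs section.
    classes = model.predict([(age, workclass, education)])[0]
    return {"classes": classes}
```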

Metric Resource Definition

Because the monitoring model is served just like a regular model, it also needs a defined contract.

kind: Model
name: "adult-salary-metric"
payload:
  - "src/"
  - "requirements.txt"
  - "monitoring_model.joblib"
runtime: "hydrosphere/serving-runtime-python-3.6:2.1.0"
install-command: "pip install -r requirements.txt"
contract:
  name: "predict"
  inputs:
    age:
      shape: scalar
      type: int64
      profile: numerical
    workclass:
      shape: scalar
      type: int64
      profile: numerical
    education:
      shape: scalar
      type: int64
      profile: numerical
    classes:
      shape: scalar
      type: int64
      profile: numerical
  outputs:
    value:
      shape: scalar
      type: double
      profile: numerical
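Continuing the sketch from above, the metric model's handler receives the monitored model's inputs and outputs and must return a double named value, as the contract declares. Everything below is our illustration: the real handler would load monitoring_model.joblib, whereas here we inline a crude stand-in that scores a sample by its squared distance from a hypothetical training mean.

```python
import numpy as np

# Hypothetical handler for the adult-salary-metric model.
# Stand-in "reconstruction" reference: a made-up training mean vector
# for (age, workclass, education, classes).
_TRAIN_MEAN = np.array([38.0, 4.0, 10.0, 0.0])

def predict(age: int, workclass: int, education: int, classes: int) -> dict:
    sample = np.array([age, workclass, education, classes], dtype=float)
    # Squared distance from the training mean as a crude drift score.
    value = float(np.sum((sample - _TRAIN_MEAN) ** 2))
    # The key must match the contract's single output, "value".
    return {"value": value}
```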

Once the resource definition of the monitored model is updated, the next model upload will be supplied with the additional metric.

SDK

You can also add monitoring to a model using the Python SDK library. The library can be used within your automation pipeline to continuously deliver machine learning models to production.

First, let’s install the SDK.

pip install hydrosdk

Next, we declare the actual model definition and its deployment.

from hydrosdk import sdk

# First we define the monitoring metric and all other necessary attributes
monitoring = [
    (
        sdk.Monitoring('Custom Metric')
        .with_health()
        .with_spec(
            'CustomModelMetricSpec',
            application="adult-salary-metric-app",
            operator=">=",
            interval=15,
            threshold=10,
        )
    ),
]

name = ...
runtime = ...
payload = ...
...

# Next, we assemble the model 
model = (
    sdk.Model()
    .with_name(name)
    .with_runtime(runtime)
    .with_payload(payload)
    ...
    .with_monitoring(monitoring)
)

# Finally, we upload the model to the cluster
model.apply(platform_uri)

After executing this script, the model will be assembled and uploaded to the platform, just as when a regular serving.yaml is applied.

Note

To learn more about the SDK, visit this page.