Resource definitions describe the entities of a serving cluster: a model, an application, or a deployment configuration (also known as a HostSelector). The type of each definition is set by the kind field:

    kind: Model # or Application or HostSelector
    name: "example"
For a Model resource, the following fields are available:
- runtime defines the Docker image that will be used in the deployment;
- payload defines all files for the model;
- contract defines the prediction signature of the model;
- install-command defines an initialization command to be executed during the upload procedure;
- training-data defines a local file or a path to an S3 object where the training data is stored.
    kind: Model
    name: sample_model
    training-data: s3://bucket/train.csv
    runtime: hydrosphere/serving-runtime-dummy:dev
    install-command: "sudo apt install jq"
    payload:
      - "./*"
    contract:
      name: infer
      inputs:
        input_field_1:
          shape: [-1, 1]
          type: string
          profile: text
        input_field_2:
          shape: scalar
          type: int32
          profile: numerical
      outputs:
        output_field_1:
          shape: [-1, 2]
          type: int32
          profile: numerical
In the example above we've defined a signature named infer. Each signature has to have inputs and outputs. They define what kind of data the model will receive and what it will produce. Each input and output field has three properties: shape, type, and profile.
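The structure of a contract can be illustrated with a small check. The sketch below is only an illustration: it assumes the contract has already been loaded into a Python dictionary that mirrors the YAML, and that the three per-field properties are the shape, type, and profile keys seen in the example. The helper name check_contract is hypothetical and is not part of any Hydrosphere tool.

```python
# Hypothetical helper, not part of any Hydrosphere tool.
# Assumption: every field needs the three keys seen in the sample contract.
REQUIRED_PROPERTIES = ("shape", "type", "profile")


def check_contract(contract):
    """Return a list of problems found in a contract dictionary."""
    problems = []
    for section in ("inputs", "outputs"):
        fields = contract.get(section)
        if not fields:
            problems.append(f"missing section: {section}")
            continue
        for field_name, properties in fields.items():
            for prop in REQUIRED_PROPERTIES:
                if prop not in properties:
                    problems.append(f"{section}.{field_name}: missing {prop}")
    return problems


# The sample contract from the Model definition, written as a dictionary.
contract = {
    "name": "infer",
    "inputs": {
        "input_field_1": {"shape": [-1, 1], "type": "string", "profile": "text"},
        "input_field_2": {"shape": "scalar", "type": "int32", "profile": "numerical"},
    },
    "outputs": {
        "output_field_1": {"shape": [-1, 2], "type": "int32", "profile": "numerical"},
    },
}

print(check_contract(contract))  # prints [] for the sample contract
```

Returning a list of problems instead of raising on the first one makes it easier to report every issue in a definition at once.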
For an Application resource, you have to declare one of the following fields:
- singular defines a single-model application;
- pipeline defines the application as a pipeline of models.

singular applications usually require fewer definitions.
    kind: Application
    name: sample_application
    singular:
      model: sample_model:1
The singular field has a single model property. It is expected to be in the form model-name:model-version.
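A model reference in the model-name:model-version form can be split into its two parts as sketched below. The function parse_model_reference is a hypothetical helper written for this page, and treating the version as an integer is an assumption based on the versions used in the examples (1 and 2).

```python
def parse_model_reference(reference):
    """Split 'model-name:model-version' into a (name, version) tuple."""
    # Split on the last colon so the name itself may contain any other character.
    name, separator, version = reference.rpartition(":")
    # Assumption: versions are non-negative integers, as in the examples.
    if not separator or not name or not version.isdigit():
        raise ValueError(f"expected model-name:model-version, got {reference!r}")
    return name, int(version)


print(parse_model_reference("sample_model:1"))  # prints ('sample_model', 1)
```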
pipeline applications have more detailed definitions.
    kind: Application
    name: sample-claims-app
    pipeline:
      - model: claims-preprocessing:1
      - modelservices:
          - model: claims-model:1
            weight: 80
          - model: claims-model-old:2
            weight: 20
pipeline is a list of stages. Each item in the list can have the following attributes:
- model defines the model and its version to use. It is expected to be in the form model-name:model-version.
- modelservices lists the models of a stage that consists of multiple models. Each model in modelservices has to declare a weight attribute, and the weights have to sum up to 100 across all models in the stage. The weight defines the share of traffic that goes through the model.
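The weight rule can be illustrated with a short sketch. It assumes a stage's modelservices list has been loaded into a list of Python dictionaries, and it models traffic splitting as a weighted random choice per request. That routing model is an assumption made for illustration, not a description of how the serving cluster routes requests. Both function names are hypothetical.

```python
import random


def check_stage_weights(modelservices):
    """Raise ValueError unless the weights of one stage sum to 100."""
    total = sum(service["weight"] for service in modelservices)
    if total != 100:
        raise ValueError(f"weights sum to {total}, expected 100")


def pick_model(modelservices, rng=random):
    """Pick one model of the stage with probability proportional to its weight."""
    models = [service["model"] for service in modelservices]
    weights = [service["weight"] for service in modelservices]
    return rng.choices(models, weights=weights, k=1)[0]


# The second stage of sample-claims-app, written as a list of dictionaries.
stage = [
    {"model": "claims-model:1", "weight": 80},
    {"model": "claims-model-old:2", "weight": 20},
]

check_stage_weights(stage)
rng = random.Random(0)  # seeded so that the run is repeatable
picks = [pick_model(stage, rng) for _ in range(10_000)]
print(picks.count("claims-model:1") / len(picks))  # close to 0.8
```

With weights of 80 and 20, roughly four out of five simulated requests go to claims-model:1 and the rest to claims-model-old:2.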
HostSelector lets you set environment requirements for your model deployment. Does your model use a GPU? Do you want to try an experimental ARM64 version? Do you have other requirements? This resource is for you.
That said, it is not fully implemented yet, since it depends on the cluster infrastructure and the cloud provider.
We will let you know when it is ready. ;)