What is Hydrosphere.io?

Hydrosphere.io is an open source service that reduces the engineering overhead required to scale machine learning operations from proof of concept to production and ongoing maintenance.

Hydrosphere.io provides a cluster for serving machine learning models in real time, a serverless proxy for Spark clusters, and a quality monitoring service for data- and machine-learning-intensive applications.


You are here!

You have built an Apache Spark data pipeline and trained machine learning models in a hosted notebook environment, with the TensorFlow toolkit, or in scikit-learn scripts.

Now scale your machine learning research and development processes and operations to support multiple data science teams, hundreds of training pipelines, and thousands of machine learning models that serve predictions to real customers in real time.


Serverless proxy for Spark cluster

Make your Spark operations serverless for data scientists, engineers, and multi-tenant applications. Hydrosphere.io increases the reliability of your Spark jobs, which saves cluster resources and makes data scientists and engineers more productive. Unlock new revenue streams by exposing a Spark job server REST API to business users and tenants for interactive applications.

Get Started with Hydrosphere Mist



Realtime ML serving cluster

Deploy your machine learning zoo of scikit-learn, Spark ML, TensorFlow, fastText, and xgboost models as end-to-end prediction pipelines. Power smart applications for your users with a real-time serving REST API.

Hydrosphere.io reduces the engineering and operations burden and accelerates time to value for data science projects.
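
As a rough illustration of what calling a real-time serving REST API can look like from an application, here is a minimal Python sketch. The URL, the pipeline name `churn`, and the `{"features": ...}` payload shape are placeholders invented for this example. They are not taken from the Hydrosphere.io API and should be replaced with the values from your own deployment.

```python
import json
import urllib.request

# Hypothetical endpoint: host, path, and payload shape are placeholders,
# not a documented Hydrosphere.io API.
SERVING_URL = "http://localhost:8080/v1/pipelines/churn/predict"


def build_prediction_request(features: dict, url: str = SERVING_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a single prediction."""
    body = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features: dict, url: str = SERVING_URL, timeout: float = 2.0) -> dict:
    """Send the request and decode the JSON body returned by the endpoint."""
    request = build_prediction_request(features, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it keeps the payload easy to inspect and test without a running serving cluster.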

Get Started with ML Lambda


UDF runtime for ML

Deploy machine learning models as Elasticsearch, Spark SQL, Cassandra or Redshift User Defined Functions (UDFs), so web engineers can seamlessly integrate machine learning capabilities into existing applications.

Hydrosphere.io simplifies querying and scoring machine learning algorithms from the application stack.
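
To make the idea of a model exposed as a User Defined Function concrete, here is a small self-contained Python sketch. It uses SQLite from the standard library as a stand-in for the engines named above, and the function name `ml_score`, the table layout, and the toy model weights are all made up for illustration. It does not show how Hydrosphere.io itself deploys UDFs.

```python
import math
import sqlite3

# Toy logistic-regression "model": the weights and bias are made-up values.
WEIGHTS = (0.8, -0.5)
BIAS = 0.1


def score(monthly_spend: float, support_tickets: float) -> float:
    """Return a probability-like score from two numeric features."""
    z = BIAS + WEIGHTS[0] * monthly_spend + WEIGHTS[1] * support_tickets
    return 1.0 / (1.0 + math.exp(-z))


def connect_with_model() -> sqlite3.Connection:
    """Open an in-memory database and register the model as a SQL function."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("ml_score", 2, score)
    return conn


conn = connect_with_model()
conn.execute(
    "CREATE TABLE customers (id INTEGER, monthly_spend REAL, support_tickets REAL)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 2.0, 1.0), (2, 0.5, 3.0)],
)
# The application scores rows with plain SQL; the caller needs no model code.
rows = conn.execute(
    "SELECT id, ml_score(monthly_spend, support_tickets) FROM customers ORDER BY id"
).fetchall()
```

The point of the pattern is that web engineers keep writing ordinary queries while the scoring logic lives behind a function name.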

Get Started -> Contact Us


Data and ML QA as a service

Gain end-to-end quality assurance across your data transformation, training, and prediction pipelines to identify data quality issues, side effects, and model degradation trends before they start affecting your business.

Hydrosphere.io provides anomaly detection and pattern recognition components designed for monitoring data- and machine-learning-heavy applications. This improves the customer experience and the reliability of your data-driven business.
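
For readers new to data quality monitoring, the simplest form of anomaly detection is a z-score check of incoming values against a trusted baseline. The Python sketch below shows only that basic technique. The function name, the threshold of 3, and the sample numbers are assumptions for this example, and it is not the algorithm used by Hydrosphere.io.

```python
from statistics import mean, stdev


def find_anomalies(
    baseline: list[float], incoming: list[float], threshold: float = 3.0
) -> list[int]:
    """Return indices of incoming values that lie more than `threshold`
    baseline standard deviations away from the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation; needs >= 2 points
    if sigma == 0:
        # Constant baseline: flag anything that differs from it.
        return [i for i, x in enumerate(incoming) if x != mu]
    return [i for i, x in enumerate(incoming) if abs(x - mu) / sigma > threshold]


# Example: a feature that normally sits near 10 suddenly reports 25.
baseline_values = [10.0, 10.5, 9.5, 10.2, 9.8]
incoming_values = [10.1, 25.0, 9.9]
flagged = find_anomalies(baseline_values, incoming_values)
```

In practice a check like this would run on each batch or window of pipeline inputs and model outputs, so that drift is caught before it reaches customers.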

Get Started with Sonar


Infrastructure options

Hydrosphere.io is agnostic to your infrastructure, Apache Spark backend and machine learning frameworks.

Infrastructure Options

  • AWS
  • Google Cloud Engine
  • DC/OS
  • Kubernetes
  • On Premise

Machine Learning Frameworks

  • Spark MLLib
  • Scikit-learn
  • TensorFlow
  • xgboost
  • Deeplearning4j
  • Others

Serving Runtimes

  • AWS Lambda
  • Hydrosphere Dockerized Runtime
  • On-premise deb/apk packages

UDF deployment

Apache Spark Backends

  • EMR
  • Vanilla Spark
  • Hortonworks
  • MapR
  • Cloudera
  • Custom

Request Demo