  • Stars: 285
  • Rank 145,115 (Top 3 %)
  • Language: C++
  • License: Apache License 2.0
  • Created over 8 years ago
  • Updated almost 7 years ago

Repository Details

Deep Learning in H2O using Native GPU Backends

Deep Water

What it is

  • Native implementation of Deep Learning models for GPU-optimized backends (MXNet, Caffe, TensorFlow, etc.)
  • State-of-the-art Deep Learning models trained from the H2O Platform
  • Train user-defined or pre-defined Deep Learning models for image/text/H2OFrame classification from Flow, R, Python, Java, Scala or REST API
  • Behaves just like any other H2O model (Flow, cross-validation, early stopping, hyper-parameter search, etc.)
  • Deep Water is a legacy project (as of December 2017), which means that it is no longer under active development. The H2O.ai team has no current plans to add new features; however, contributions from the community (in the form of pull requests) are welcome.

Python/R Jupyter Notebooks

Check out a sample of cool Deep Learning Jupyter notebooks!

Pre-Release Downloads

This release of Deep Water is based on the latest H2O-3 release.

The downloadable packages below are built for the following system specifications:

  • Ubuntu 16.04 LTS
  • NVIDIA Display driver at least 367
  • CUDA 8.0.44 or later (we recommend the latest version) in /usr/local/cuda
  • cuDNN 5.1 (placed inside the lib and include directories in /usr/local/cuda/)
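
As a rough illustration of the list above, the following sketch compares installed versions against the stated minimums. The dictionary keys and the function names are hypothetical, the installed versions have to be supplied by the caller, and treating cuDNN 5.1 as a minimum (the list does not say "or later") is an assumption.

```python
def parse_version(text):
    """Turn a dotted version string such as '8.0.44' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Minimums copied from the list above; the keys are hypothetical labels.
MINIMUMS = {
    "driver": "367",
    "cuda": "8.0.44",
    "cudnn": "5.1",  # assumption: treated as a minimum, not an exact match
}


def unmet_requirements(installed):
    """Return the sorted names whose version is missing or below the minimum."""
    unmet = []
    for name, minimum in MINIMUMS.items():
        found = installed.get(name)
        if found is None or parse_version(found) < parse_version(minimum):
            unmet.append(name)
    return sorted(unmet)


# Example inputs are made up for illustration.
print(unmet_requirements({"driver": "375.26", "cuda": "8.0.61", "cudnn": "5.1"}))
print(unmet_requirements({"driver": "361.42", "cuda": "8.0.44"}))
```

The first call prints an empty list; the second reports the outdated driver and the missing cuDNN entry.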

To use the GPU, please set the following environment variables:

export CUDA_PATH=/usr/local/cuda
export LD_LIBRARY_PATH=$CUDA_PATH/lib64:$LD_LIBRARY_PATH
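
Before launching H2O from Python, it may help to confirm that these two variables are consistent. The sketch below is only a sanity check under the assumption that the CUDA libraries live in the lib64 directory under CUDA_PATH, as in the export lines above; the function name is hypothetical.

```python
import os


def cuda_env_problems(env):
    """Return a list of human-readable problems with the CUDA variables in env."""
    problems = []
    cuda_path = env.get("CUDA_PATH")
    if not cuda_path:
        problems.append("CUDA_PATH is not set")
        return problems
    expected_lib = cuda_path.rstrip("/") + "/lib64"
    # Split LD_LIBRARY_PATH on ':' and ignore empty entries and trailing slashes.
    entries = [e.rstrip("/") for e in env.get("LD_LIBRARY_PATH", "").split(":") if e]
    if expected_lib not in entries:
        problems.append("LD_LIBRARY_PATH does not contain " + expected_lib)
    return problems


# Check the environment of the current process.
for problem in cuda_env_problems(os.environ) or ["CUDA environment looks consistent"]:
    print(problem)
```

The check only inspects the variables; it does not verify that the directories exist or that a GPU is present.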

Python + Flow (most common)

R + Flow (R users)

Flow (Web UI)

  • To run from Flow only: H2O Standalone h2o.jar. Launch it via java -jar h2o.jar; for image tasks we recommend java -Xmx30g -jar h2o.jar.

If you are interested in running H2O Deep Water on a different infrastructure, see the DIY build instructions below.

Running GPU enabled Deep Water in H2O

(Optional) Launch H2O by hand and build Deep Water models from Flow (localhost:54321)

java -jar h2o.jar
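
If H2O is started from a script rather than by hand, the launch command can be assembled as an argument list. This is a minimal sketch; the function name and its parameters are hypothetical, and the 30 GB heap in the example is the value this document recommends for image tasks.

```python
def h2o_command(jar="h2o.jar", heap_gb=None):
    """Build the argument list for launching H2O, optionally with a -Xmx limit."""
    command = ["java"]
    if heap_gb is not None:
        command.append("-Xmx{}g".format(heap_gb))
    command += ["-jar", jar]
    return command


print(" ".join(h2o_command()))             # java -jar h2o.jar
print(" ".join(h2o_command(heap_gb=30)))   # java -Xmx30g -jar h2o.jar
```

The returned list can be passed directly to subprocess.Popen, which avoids shell quoting issues when the jar path contains spaces.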

Java example use cases

Example Java GPU-enabled unit tests.

Python example use cases

Example Python GPU-enabled unit tests. Check out a sample of cool Deep Learning Python Jupyter notebooks!

R example use cases

Example R GPU-enabled unit tests. Check out a sample of cool Deep Learning R Jupyter notebooks!

Scala / Sparkling Water example use cases

Coming soon.

Pre-Release Amazon AWS Image

We have a pre-built image for the Amazon Web Services EC2 environment:

  • AMI ID: ami-97591381
  • AMI Name: h2o-deepwater-ami-latest
  • AWS Region: US East (N. Virginia)
  • Recommended instance type: p2.xlarge

The AMI contains the Docker image described below. Once the instance has started, log in to the shell prompt. It's a good idea to update the Docker image with docker pull opsh2oai/h2o-deepwater to ensure that you have the most recent version. Then start the Docker image, either with the provided shell script or with nvidia-docker run -it --net host opsh2oai/h2o-deepwater.

Start H2O with java -Xmx30g -jar /opt/h2o.jar &. Connect to port 54321.

Start Jupyter with jupyter notebook --allow-root --ip=* &. Connect to the link shown, replacing localhost with the IP address of your instance.
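
One reading of the Jupyter instruction above is that the localhost in the printed link should be replaced by the address of the machine. Under that assumption, the substitution could be sketched as follows; the function name, the token, and the address (taken from a range reserved for documentation) are all made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def remote_notebook_url(printed_url, public_host):
    """Swap the host of the URL Jupyter prints for the machine's public address."""
    parts = urlsplit(printed_url)
    # Keep the port if the printed URL has one.
    if parts.port is None:
        netloc = public_host
    else:
        netloc = "{}:{}".format(public_host, parts.port)
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(remote_notebook_url("http://localhost:8888/?token=abc123", "203.0.113.10"))
```

This prints http://203.0.113.10:8888/?token=abc123, leaving the port and token untouched.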

Pre-Release Docker Image

We have a GPU-enabled Docker image and a CPU-only one. Both are available on Docker Hub.

For both images you need to install Docker; see http://www.docker.com

  • Optional Step. Make docker run without sudo. Instructions for Ubuntu 16.04:
    • sudo groupadd docker
    • sudo gpasswd -a ${USER} docker
    • sudo service docker restart
    • log out and log back in, or run newgrp docker

GPU-Enabled Docker Image (Recommended)

To use the GPU-enabled Docker image you need a Linux machine with at least one GPU, a GPU driver, and both docker and nvidia-docker installed.

An NVIDIA GPU with a Compute Capability of at least 3.5 is necessary. See https://developer.nvidia.com/cuda-gpus .

If you use Amazon Web Services (AWS), a good machine type to use is the P2 series. Note that G2 series machines have GPUs that are too old.
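
The Compute Capability threshold can be expressed as a simple comparison. In the sketch below the function name is hypothetical, and the two example values (3.7 for the Tesla K80 in P2 instances, 3.0 for the GRID K520 in G2 instances) come from NVIDIA's published tables rather than from this document.

```python
MIN_COMPUTE_CAPABILITY = (3, 5)  # threshold stated above


def gpu_is_supported(compute_capability):
    """Return True if a capability string such as '3.7' meets the threshold."""
    major, minor = (int(part) for part in compute_capability.split("."))
    return (major, minor) >= MIN_COMPUTE_CAPABILITY


print(gpu_is_supported("3.7"))  # Tesla K80 (P2): True
print(gpu_is_supported("3.0"))  # GRID K520 (G2): False
```

Comparing tuples of integers rather than floats or strings keeps a value like "3.10" from being ordered below "3.5".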

If you have used these docker images before, please run docker pull IMAGENAME to ensure that you have the latest version.

  1. Install nvidia-docker, see https://github.com/NVIDIA/nvidia-docker . Note that you can only use Linux machines with one or more NVIDIA GPUs:

    • GNU/Linux x86_64 with kernel version > 3.10
    • Docker >= 1.9 (official docker-engine, docker-ce or docker-ee only)
    • NVIDIA GPU with Architecture > Fermi (2.1) and Compute Capability >= 3.5
    • NVIDIA drivers >= 340.29 with binary nvidia-modprobe
  2. Download and run the H2O Docker image

    • nvidia-docker run -it --rm --net host -v $PWD:/host opsh2oai/h2o-deepwater
    • You now get a prompt in the image: # . The directory you started from is available as /host
    • Start H2O with java -jar /opt/h2o.jar
    • Python, R and Jupyter Notebooks are available
    • exit or ctrl-d closes the image

CPU-only Docker Image

To use the CPU-only Docker image you just need to have Docker installed. Note that this image is significantly slower than the GPU image, which is why we don't recommend it.

  • Download and run the H2O Docker image:
    • On Linux: docker run -it --rm --net host -v $PWD:/host opsh2oai/h2o-deepwater-cpu
    • On MacOS: docker run -it --rm -p 54321:54321 -p 8080:8080 -v $PWD:/host opsh2oai/h2o-deepwater-cpu
    • You now get a prompt in the image: # . The directory you started from is available as /host
    • Start H2O with java -jar /opt/h2o.jar
    • Python, R and Jupyter Notebooks are available
    • exit or ctrl-d closes the image
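
The three docker invocations in this document differ only in the binary, the networking flags and the image name. The sketch below assembles them from those pieces; the function and parameter names are hypothetical, and rejecting platforms other than Linux and macOS is an assumption, since no command is given for them.

```python
import os
import platform


def deepwater_docker_command(gpu, system=None, host_dir=None):
    """Build the docker argument list for the GPU or the CPU-only image."""
    system = system or platform.system()  # "Linux", "Darwin", ...
    host_dir = host_dir or os.getcwd()
    if gpu:
        if system != "Linux":
            raise ValueError("the GPU image requires a Linux host with nvidia-docker")
        binary, image = "nvidia-docker", "opsh2oai/h2o-deepwater"
    else:
        binary, image = "docker", "opsh2oai/h2o-deepwater-cpu"
    command = [binary, "run", "-it", "--rm"]
    if system == "Linux":
        command += ["--net", "host"]
    elif system == "Darwin":
        # macOS: publish the two ports used in the command above instead.
        command += ["-p", "54321:54321", "-p", "8080:8080"]
    else:
        raise ValueError("no command is documented for " + system)
    # Mount the starting directory as /host inside the container.
    command += ["-v", host_dir + ":/host", image]
    return command


print(" ".join(deepwater_docker_command(gpu=False, system="Darwin", host_dir="/work")))
```

Passing system and host_dir explicitly, as in the example, makes the result independent of the machine the script runs on.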

Roadmap, Architecture and Demo

Download the Deep Water overview slides.

[Images: four Deep Water architecture diagrams]

DIY Build Instructions

If you want to use Deep Water in H2O-3, you'll need to have a .jar file that includes backend support for at least one of MXNet, Caffe or TensorFlow.

1. Build MXNet

Instructions to build MXNet

2. Build TensorFlow

Instructions to build TensorFlow

3. Build Caffe

Coming soon.

4. Build H2O Backend Connectors

From the top-level of the deepwater repository, do

./gradlew build -x test

This will create the following file: build/libs/deepwater-all.jar

5. Add DeepWater support to H2O-3

You need to check out the h2o-3 repository. Copy the freshly created jar file build/libs/deepwater-all.jar from the previous step to H2O-3's library directory as h2o-3/lib/deepwater-all.jar (create the directory if it's not there) and you're done!
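
The copy step, including creating the lib directory when it is missing, might be scripted as in the sketch below. The function name is hypothetical, and the two arguments are assumed to be the roots of the deepwater and h2o-3 checkouts.

```python
import shutil
from pathlib import Path


def install_deepwater_jar(deepwater_root, h2o3_root):
    """Copy build/libs/deepwater-all.jar into h2o-3/lib, creating lib if needed."""
    source = Path(deepwater_root) / "build" / "libs" / "deepwater-all.jar"
    if not source.is_file():
        raise FileNotFoundError(
            "{} not found; run ./gradlew build -x test first".format(source)
        )
    lib_dir = Path(h2o3_root) / "lib"
    lib_dir.mkdir(parents=True, exist_ok=True)
    # copy2 preserves file metadata and returns the destination path.
    return Path(shutil.copy2(str(source), str(lib_dir / source.name)))
```

With both checkouts side by side, a call such as install_deepwater_jar("deepwater", "h2o-3") returns the path of the copied jar.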

Build H2O-3 as usual:

./gradlew build -x test

This H2O version will now have GPU Deep Learning support!

To use the GPU, please make sure to set your path to your CUDA installation:

export CUDA_PATH=/usr/local/cuda

Install the Python wheel:

sudo pip install h2o-3/h2o-py/dist/h2o-3.11.0.99999-py2.py3-none-any.whl

(Optional) Install the MXNet Python/R packages

If you want to build your own MXNet models from Python or R, install the MXNet Python egg and R package (both were built together with MXNet above):

sudo easy_install deepwater/thirdparty/mxnet/python/dist/mxnet-0.7.0-py2.7.egg
R CMD INSTALL deepwater/thirdparty/mxnet/mxnet_0.7.tar.gz

Releasing

The release process bundles all defined submodules and pushes them to Maven Central via the Sonatype repository provider. The released artifacts are Java 6 compatible.

The release can be invoked for all modules by:

./gradlew -PdoRelease -PbuildOnlyBackendApi -PdoJava6Bytecode=true -Prelease.useAutomaticVersion=true release

The process performs the following steps:

  • Updates gradle.properties: removes the SNAPSHOT suffix and increases the minor version (this can be changed)
  • Creates a new release commit and tags it with the release tag. (See the gradle/release.gradle file to override the default template.)
  • Builds the artifacts
  • Verifies that the API used is compatible with the Java 6 API
  • Rewrites the bytecode to be compatible with Java 6
  • Generates the artifact metadata
  • Pushes artifacts into staging area at https://oss.sonatype.org/
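
The version handling in the first step can be illustrated with a small sketch. The function names and the example version are hypothetical, and the sketch assumes that the component being increased is the last numeric one; the actual scheme is configured in the Gradle release settings and may differ.

```python
SUFFIX = "-SNAPSHOT"


def release_version(current):
    """Strip a trailing -SNAPSHOT suffix, e.g. '1.0.3-SNAPSHOT' -> '1.0.3'."""
    return current[:-len(SUFFIX)] if current.endswith(SUFFIX) else current


def next_snapshot(released):
    """Increase the last numeric component and re-append -SNAPSHOT (assumed scheme)."""
    parts = released.split(".")
    parts[-1] = str(int(parts[-1]) + 1)
    return ".".join(parts) + SUFFIX


released = release_version("1.0.3-SNAPSHOT")
print(released)                 # 1.0.3
print(next_snapshot(released))  # 1.0.4-SNAPSHOT
```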

The process needs to be finished manually by:

Note: The release process creates two new commits and a new tag with the release version. However, the process does not push them to a remote repository, so it is necessary either to update the remote manually using git push --tags, or to update the gradle/release.gradle settings and remove the --dry-run option from the pushOptions field.
