• This repository was archived on 15 Oct 2019
• Stars: 689
• Rank: 62,941 (Top 2%)
• Language: C++
• License: Other
• Created: over 9 years ago
• Updated: over 5 years ago



Minerva: a fast and flexible system for deep learning

Latest News

  • We've removed many of Minerva's dependencies and made it easier to build. In most cases, all you need is:

    ./build.sh

    Please see the wiki page for more information.

  • Minerva's tutorial and API documentation have been released!

  • Minerva has migrated to dmlc, where you can find many other awesome machine learning repositories!

  • Minerva now uses cuDNN v2. Please download and use the new library.

  • Minerva now supports the latest version of Caffe's network configuration protobuf format. If you are using an older version, errors may occur. Please use the provided tool to upgrade your configuration file.

Overview

Minerva is a fast and flexible tool for deep learning. It provides an ndarray programming interface, just like NumPy. Both Python and C++ bindings are available. The resulting code can run on CPU or GPU, and multi-GPU support is easy to use. Please refer to the examples to see how the multi-GPU setting is used.

Quick try

After building and installing Minerva and the Owl package (the Python binding) as described in Install Minerva, run ./run_owl_shell.sh in Minerva's root directory and enter:

>>> x = owl.ones([10, 5])
>>> y = owl.ones([10, 5])
>>> z = x + y
>>> z.to_numpy()

The result will be a 10x5 array filled with the value 2. Minerva supports many NumPy-style ndarray operations; please see the API documentation for more information.
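For reference, the session above computes the following; this pure-Python sketch (no Minerva required, for illustration only) shows the expected result:

```python
# Plain-Python equivalent of the owl session above: two 10x5 arrays
# of ones, added elementwise. owl runs the same computation on CPU or GPU.
rows, cols = 10, 5
x = [[1.0] * cols for _ in range(rows)]
y = [[1.0] * cols for _ in range(rows)]
z = [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]
print(all(v == 2.0 for row in z for v in row))  # every entry is 2.0
```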

Features

  • N-D array programming interface and easy integration with numpy

    >>> import numpy as np
    >>> x = np.array([1, 2, 3])
    >>> y = owl.from_numpy(x)
    >>> y += 1
    >>> y.to_numpy()
    array([ 2.,  3.,  4.], dtype=float32)

    More examples can be found in the API cheatsheet.

  • Automatic parallel execution

    >>> x = owl.zeros([256, 128])
    >>> y = owl.randn([1024, 32], 0.0, 0.01)

    The above x and y will be executed concurrently. How is this achieved?

    See Feature Highlight: Data-flow and lazy evaluation

  • Multi-GPU, multi-CPU support:

    >>> owl.set_device(gpu0)
    >>> x = owl.zeros([256, 128])
    >>> owl.set_device(gpu1)
    >>> y = owl.randn([1024, 32], 0.0, 0.01)

    The above x and y will be executed on two cards simultaneously. How is this achieved?

    See Feature Highlight: Multi GPU Training
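The lazy-evaluation idea behind the two features above can be illustrated with a toy sketch (pure Python, hypothetical classes; Minerva's real engine is written in C++ and schedules a full dependency dataflow graph across devices):

```python
class LazyArray:
    """Toy lazy ndarray: operations build a graph; nothing runs until eval()."""
    def __init__(self, op, inputs=(), value=None):
        self.op, self.inputs, self.value = op, inputs, value

    def __add__(self, other):
        # Building the graph is cheap; no arithmetic happens here.
        return LazyArray("add", (self, other))

    def eval(self):
        # Evaluate dependencies first. Independent subgraphs have no
        # edges between them, so a real engine can run them concurrently
        # (possibly on different GPUs) -- this is what Minerva exploits.
        if self.value is None:
            vals = [x.eval() for x in self.inputs]
            if self.op == "add":
                self.value = [a + b for a, b in zip(*vals)]
        return self.value

def constant(data):
    return LazyArray("const", value=data)

x = constant([1.0, 2.0, 3.0])
y = constant([10.0, 20.0, 30.0])
z = x + y          # no computation yet: just a node in the graph
print(z.eval())    # forces evaluation: [11.0, 22.0, 33.0]
```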

Tutorial and Documents

  • Tutorials and high-level concepts can be found on our wiki page.
  • A step-by-step walkthrough of the MNIST example can be found here.
  • We also built a tool to read Caffe's configuration file directly and train from it. See the document.
  • API documentation can be found here.

Performance

We will keep this section updated with the latest performance numbers we can achieve.

Training speed (images/second)

              AlexNet    VGGNet    GoogLeNet
  1 card       189.63     14.37        82.47
  2 cards      371.01     29.58       160.53
  4 cards      632.09     50.26       309.27
  • The performance is measured on a machine with 4 GTX Titan cards.
  • On each card, we use a minibatch size of 256, 24, and 120 for AlexNet, VGGNet, and GoogLeNet respectively. The total minibatch size therefore grows with the number of cards (for example, training AlexNet on 4 cards uses a minibatch size of 1024).
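As a sanity check on the table, the per-model speedup and parallel efficiency implied by the numbers can be computed directly (figures copied from the table above):

```python
# Speedup and parallel efficiency on 4 cards, from the training-speed table.
speed_1card = {"AlexNet": 189.63, "VGGNet": 14.37, "GoogLeNet": 82.47}
speed_4card = {"AlexNet": 632.09, "VGGNet": 50.26, "GoogLeNet": 309.27}

for net in speed_1card:
    speedup = speed_4card[net] / speed_1card[net]
    efficiency = speedup / 4
    print(f"{net}: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```

GoogLeNet scales best here (about 3.75x on 4 cards), consistent with its smaller per-card communication cost relative to compute.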

An end-to-end training

We also provide some end-to-end training code in the owl package, which can load Caffe's model files and perform training. Note that Minerva is not the same kind of tool as Caffe, and this part of the logic is not our focus. In fact, we implemented it mainly to exercise Minerva's powerful and flexible programming interface (a Caffe-like network trainer takes only around 700-800 lines of Python). Here is the training error over time compared with Caffe. Note that Minerva can finish GoogLeNet training in less than four days with four GPU cards.

Error curve
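The "Caffe-like network trainer" mentioned above is, at its core, a standard minibatch-SGD loop. A minimal pure-Python sketch of that loop (hypothetical, not Minerva's actual trainer; in owl, the parameters and gradients would be GPU-backed ndarrays):

```python
import random

def sgd_train(params, grad_fn, data, lr=0.01, epochs=5, batch_size=2):
    """Generic minibatch-SGD loop: the core of any Caffe-like trainer."""
    for _ in range(epochs):
        random.shuffle(data)
        for i in range(0, len(data), batch_size):
            batch = data[i:i + batch_size]
            grads = grad_fn(params, batch)
            # Gradient step; owl would do this with ndarray arithmetic.
            params = [p - lr * g for p, g in zip(params, grads)]
    return params

# Toy example: fit y = 2*x with a single weight w by least squares.
def grad_fn(params, batch):
    (w,) = params
    # d/dw of mean((w*x - y)^2) = mean(2*(w*x - y)*x)
    g = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
    return [g]

data = [(x, 2.0 * x) for x in range(1, 9)]
(w,) = sgd_train([0.0], grad_fn, data, lr=0.01, epochs=50)
print(round(w, 2))  # converges near 2.0
```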

Testing error rate

We trained several models from scratch using Minerva to demonstrate its correctness. The following table shows the error rates of different networks under different testing settings.

Testing error rate     AlexNet    VGGNet    GoogLeNet
  single view top-1     41.6%      31.6%      32.7%
  multi view top-1      39.7%      30.1%      31.3%
  single view top-5     18.8%      11.4%      11.8%
  multi view top-5      17.5%      10.8%      11.0%
  • AlexNet is trained with the solver, except that we didn't use multi-group convolution.
  • GoogLeNet is trained with the quick_solver.
  • We didn't train VGGNet from scratch; we only converted the model into Minerva's format and tested it.
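"Multi view" testing in the table above means averaging the network's class scores over several crops/mirrors of each test image before taking the prediction. A sketch of that averaging step (pure Python with hypothetical scores; the real pipeline would feed softmax outputs from the network):

```python
def multiview_predict(view_scores):
    """Average per-class scores across views, then take the argmax (top-1)."""
    n_views = len(view_scores)
    n_classes = len(view_scores[0])
    avg = [sum(v[c] for v in view_scores) / n_views for c in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__)

# Hypothetical softmax outputs for 3 views over 4 classes: individual
# views vary, but averaging recovers the consistently strong class (index 2).
views = [
    [0.10, 0.20, 0.40, 0.30],
    [0.05, 0.35, 0.45, 0.15],
    [0.20, 0.30, 0.35, 0.15],
]
print(multiview_predict(views))  # -> 2
```

Averaging smooths out crop-specific noise, which is why the multi-view rows in the table are consistently a bit lower than their single-view counterparts.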

The models can be found at the following links: AlexNet, GoogLeNet, VGGNet.

You can download the trained models and try them on your own machine using the net_tester script.

Next Plan

  • Get rid of boost library dependency by using Cython. (DONE)
  • Large scale LSTM example using Minerva.
  • Easy support for user-defined new operations.

License and support

Minerva is provided in the Apache V2 open source license.

You can use the "issues" tab on GitHub to report bugs. For non-bug issues, please send an email to [email protected]. You can also subscribe to the discussion group: https://groups.google.com/forum/#!forum/minerva-support.

Wiki

For more information on how to install, use or contribute to Minerva, please visit our wiki page: https://github.com/minerva-developers/minerva/wiki
