• This repository has been archived on 15/Oct/2019
  • Stars: 1,109
  • Rank: 41,870 (Top 0.9 %)
  • Language: Python
  • License: Other
  • Created about 9 years ago
  • Updated almost 7 years ago

Repository Details

NumPy interface with mixed backend execution

MinPy

Please try MXNet Gluon. Our project has merged with MXNet Gluon, which follows our vision. This repo is being deprecated.

This repository aims to provide a high-performing and flexible deep learning platform by prototyping a pure NumPy interface on top of the MXNet backend. In short, you get the following automatically with your NumPy code:

import minpy.numpy as np
  • Operators with GPU support will run on the GPU.
  • Graceful fallback for missing operations to NumPy on CPU.
  • Automatic gradient generation with Autograd support.
  • Seamless MXNet symbol integration.

Pure NumPy, purely imperative

Why the obsession with the NumPy interface? First of all, NumPy is an extension to the Python programming language that supports large, multi-dimensional arrays and matrices, along with a large library of high-level mathematical functions that operate on them. If you are just beginning to learn deep learning, you should absolutely start with NumPy to gain a firm grasp of its concepts (see, for example, Stanford's CS231n course). For quick prototyping of advanced deep learning algorithms, you may often start composing with NumPy as well.

Second, as an extension of Python, your implementation follows the intuitive imperative style. This is the only style, and there are no new syntax constructs to learn. To get a taste of this, let's look at some examples below.

Printing and Debugging

In symbolic programming, a control dependency before the print statement is required; otherwise the print operator will not appear on the critical dependency path and will therefore not be executed. In contrast, MinPy is simply NumPy, as straightforward as Python's hello world.
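
As a rough illustration of the imperative side of this comparison, the sketch below prints an intermediate value in the middle of a computation. It uses plain NumPy as a stand-in (with MinPy installed, the import would presumably be `import minpy.numpy as np`), and the function name `affine_with_debug` is made up for this example.

```python
import numpy as np  # stand-in; assumption: minpy.numpy exposes the same calls


def affine_with_debug(x, w, b):
    """Compute x.w + b and print the intermediate product on the way."""
    hidden = np.dot(x, w)
    # An ordinary print: the value already exists, so no control
    # dependency has to be wired into a graph for it to run.
    print("hidden =", hidden)
    return hidden + b


result = affine_with_debug(np.ones((1, 2)), np.ones((2, 2)), np.zeros(2))
```

The print runs exactly where it is written, because every line has already produced a concrete array.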

Data-dependent branches

In symbolic programming, a lambda is required in each branch to lazily expand the dataflow graph at runtime, which can be quite confusing. Again, MinPy is NumPy, and you can freely use the if statement any way you like.
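
A small sketch of a data-dependent branch written imperatively is shown here. Plain NumPy is again used as a stand-in for `minpy.numpy`, and `clip_norm` is a hypothetical helper chosen only for illustration.

```python
import numpy as np  # stand-in; assumption: minpy.numpy exposes the same calls


def clip_norm(v, max_norm):
    """Rescale v so that its Euclidean norm does not exceed max_norm."""
    norm = np.sqrt(np.sum(v * v))
    # A native Python `if` on a value that was just computed.
    if norm > max_norm:
        return v * (max_norm / norm)
    return v


clipped = clip_norm(np.array([3.0, 4.0]), 1.0)    # norm 5, gets rescaled
untouched = clip_norm(np.array([0.3, 0.4]), 1.0)  # norm 0.5, returned as is
```

The branch condition is an ordinary Python boolean, so no per-branch lambdas are needed.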

TensorFlow is just one typical example; many other packages (e.g. Theano, or even MXNet) have similar problems. The underlying reason is the trade-off between symbolic programming and imperative programming. Code in symbolic programs (TensorFlow and Theano) generates a dataflow graph instead of performing concrete computation. This enables extensive optimizations, but requires reinventing almost all language constructs (like if and loop). Imperative programs (NumPy) generate the dataflow graph along with the computation, letting you freely query or use the value just computed.
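
To make the trade-off concrete, here is a toy, pure-Python sketch of deferred (symbolic) evaluation next to the imperative equivalent. The `Node`, `placeholder` and `run` names are invented for this illustration and do not correspond to the API of TensorFlow, Theano or MXNet.

```python
class Node:
    """A vertex of a deferred dataflow graph: it records how to compute, not a value."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op
        self.inputs = inputs
        self.name = name

    def __add__(self, other):
        return Node("add", (self, other))

    def __mul__(self, other):
        return Node("mul", (self, other))


def placeholder(name):
    """Create a graph input whose value is supplied only when the graph is run."""
    return Node("input", name=name)


def run(node, feed):
    """Evaluate the graph recursively, reading input values from `feed`."""
    if node.op == "input":
        return feed[node.name]
    left, right = (run(child, feed) for child in node.inputs)
    if node.op == "add":
        return left + right
    if node.op == "mul":
        return left * right
    raise ValueError("unknown op: " + node.op)


# Symbolic style: y is only a graph here, so a Python `if` cannot inspect its value.
x = placeholder("x")
y = x * x + x
symbolic_result = run(y, {"x": 3.0})


# Imperative style: every line yields a concrete value that can be queried at once.
def imperative(value):
    return value * value + value


imperative_result = imperative(3.0)
```

In the symbolic half, nothing is computed until `run` is called, which is what gives an optimizer room to rewrite the graph, and also why constructs such as if and loops have to be re-expressed as graph operators.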

In MinPy, we use NumPy syntax to ease your programming, while simultaneously achieving good performance.

Dynamic automatic gradient computation

Automatic gradient computation has become essential in modern deep learning systems. In MinPy, we adopt Autograd's approach to compute gradients. Since the dataflow graph is generated along with the computation, all kinds of native control flow are supported during gradient computation. For example:

import minpy
from minpy.core import grad

def foo(x):
  if x >= 0:
    return x
  else:
    return 2 * x

foo_grad = grad(foo)
print(foo_grad(3))   # should print 1.0
print(foo_grad(-1))  # should print 2.0

Here, feel free to use the native if statement. A complete tutorial on automatic gradient computation can be found here.
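
The following is a minimal, self-contained sketch of why recording the graph during execution lets gradients pass through native control flow. It is a toy scalar reverse-mode differentiator written for this illustration, not MinPy's or Autograd's actual implementation; `Var` and `toy_grad` are hypothetical names.

```python
class Var:
    """A scalar that remembers which values it was computed from."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (parent Var, local derivative)
        self.grad = 0.0

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    __rmul__ = __mul__

    def __ge__(self, other):
        # Comparisons yield a plain bool, so a native `if` just works.
        other_value = other.value if isinstance(other, Var) else other
        return self.value >= other_value


def toy_grad(func):
    """Return a function that computes d func(x) / dx for a scalar x."""
    def grad_func(x):
        inp = Var(float(x))
        out = func(inp)  # runs the Python code and records the operations used

        # Order the recorded nodes so that parents come before children.
        order, seen = [], set()

        def visit(node):
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

        visit(out)
        out.grad = 1.0
        # Walk from the output back to the input, applying the chain rule.
        for node in reversed(order):
            for parent, local in node.parents:
                parent.grad += local * node.grad
        return inp.grad
    return grad_func


def foo(x):
    if x >= 0:
        return x
    else:
        return 2 * x


grad_at_3 = toy_grad(foo)(3)
grad_at_minus_1 = toy_grad(foo)(-1)
```

Only the branch that actually executed is recorded, so the backward pass never needs to know that an if statement was involved.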

Elegant fallback for missing operators

You never like NotImplementedError, and neither do we. NumPy is a very large library. In MinPy, we automatically fall back to NumPy if some operators have not been implemented in MXNet yet. For example, the following code runs smoothly and you don't need to worry about copying arrays back and forth between GPU and CPU; MinPy handles the fallback and its side effects transparently.

import minpy.numpy as np
x = np.zeros((2, 3))     # Use MXNet GPU implementation
y = np.ones((2, 3))      # Use MXNet GPU implementation
z = np.logaddexp(x, y)   # Use NumPy CPU implementation
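
One way to picture the fallback is a namespace that looks an operator up in a preferred backend first and only then in NumPy. The sketch below is an assumption-laden toy, not MinPy's real dispatch code: `FallbackNamespace` and `fake_gpu_backend` are hypothetical, and the "GPU backend" is simulated with two NumPy functions.

```python
import types

import numpy as real_np


class FallbackNamespace:
    """Resolve operators from a primary backend, falling back to a second one."""

    def __init__(self, primary, fallback):
        self._primary = primary
        self._fallback = fallback
        self.fallback_log = []  # names that had to be served by the fallback

    def __getattr__(self, name):
        # Called only for names not found on the instance itself.
        impl = getattr(self._primary, name, None)
        if impl is None:
            impl = getattr(self._fallback, name)
            self.fallback_log.append(name)
        return impl


# Hypothetical accelerated backend that implements only zeros and ones.
fake_gpu_backend = types.SimpleNamespace(zeros=real_np.zeros, ones=real_np.ones)

np = FallbackNamespace(fake_gpu_backend, real_np)
x = np.zeros((2, 3))    # served by the primary backend
y = np.ones((2, 3))     # served by the primary backend
z = np.logaddexp(x, y)  # missing in the primary backend, served by NumPy
```

A real implementation additionally has to move array data between devices when the two backends live on different hardware, which this toy omits.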

Seamless MXNet symbol support

Although we pick the imperative side, we understand that symbolic programming is necessary for operators like convolution. Therefore, MinPy allows you to "wrap" a symbol into a function that can be called together with other imperative calls. From a programmer's point of view, these functions look just like other NumPy calls, so we preserve the imperative style throughout:

import mxnet as mx
import minpy.numpy as np
from minpy.core import Function
# Create Function from symbol.
net = mx.sym.Variable('x')
net = mx.sym.Convolution(net, name='conv', kernel=(3, 3), num_filter=32, no_bias=True)
conv = Function(net, input_shapes={'x': (8, 3, 10, 10)})
# Call Function as normal function.
x = np.zeros((8, 3, 10, 10))
w = np.ones((32, 3, 3, 3))
y = np.exp(conv(x=x, conv_weight=w))

Is MinPy fast?

The imperative interface does raise many challenges; in particular, it forgoes some of the deep optimizations that are (currently) available only in symbolic programming. However, MinPy manages to retain performance, especially when the actual computation is intense. Our next target is to win back the lost performance with advanced system techniques.

Get Started

Installation Guide

MinPy depends on MXNet. In order to get up and running with MinPy you'll need to

  1. Install MXNet for Python;

  2. Install MinPy.

Please read the installation guide for more details.

MXNet version

Currently both MXNet and MinPy are going through rapid development. MinPy is not guaranteed to work with all MXNet versions. The minimum required MXNet version is 0.9.2. To achieve the best performance, we recommend that you download MXNet from the engine branch and build it from source. The following command may be useful:

git clone --recursive -b engine https://github.com/dmlc/mxnet.git

Then follow the instructions to build MXNet with the Python interface.

NumPy version

MinPy prototypes a pure NumPy interface. To keep the interface consistent, please make sure your NumPy version is >= 1.10.0 before installing MinPy.

MXNet and NumPy may run into version conflicts if you also use them in other projects. Our installation guide explains how to use virtualenv and virtualenvwrapper to resolve the issue.
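
The minimum versions mentioned above (MXNet 0.9.2, NumPy 1.10.0) could be checked before installing with a small helper like the one sketched here. `version_at_least` is a hypothetical function written for this example. It compares numeric components rather than raw strings, because a plain string comparison would wrongly rank "1.9.3" above "1.10.0".

```python
import re


def version_at_least(version, minimum):
    """Return True if dotted version string `version` is >= `minimum`."""
    def parse(text):
        parts = []
        for piece in text.split(".")[:3]:
            # Keep only the leading digits, so "0rc1" counts as 0.
            match = re.match(r"\d+", piece)
            parts.append(int(match.group()) if match else 0)
        while len(parts) < 3:
            parts.append(0)
        return tuple(parts)

    return parse(version) >= parse(minimum)


numpy_ok = version_at_least("1.10.0", "1.10.0")       # exactly the minimum
numpy_too_old = version_at_least("1.9.3", "1.10.0")   # older than required
mxnet_ok = version_at_least("0.9.2", "0.9.2")
```

In practice the first argument would come from the installed package, for example `numpy.__version__`.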

Easy installation

pip install minpy

We are still actively polishing the package. You can look at this tutorial to understand its concepts. Documentation is hosted here.
