  • Stars: 123
  • Rank: 288,696 (Top 6 %)
  • Language: Lua
  • License: BSD 2-Clause "Sim...
  • Created over 9 years ago
  • Updated about 8 years ago

clnn

OpenCL backend for Torch nn neural networks library.

Installation

Please see distro-cl for installation instructions.

What works

Parameterized Modules

  • nn.Linear

Basic Tensor methods

These mostly 'just work', since they are based on underlying tensor methods that are already implemented in cltorch. Tested with:

  • nn.Narrow

Miscellaneous modules

  • nn.Identity
  • nn.Dropout

Convolution layers

  • nn.SpatialConvolutionMM
  • nn.SpatialMaxPooling (including ceil mode)
  • nn.SpatialAveragePooling
  • nn.TemporalConvolution2: specific to clnn, but it also works on CPU and CUDA, not just on OpenCL. It is API-compatible with TemporalConvolution, and faster than TemporalConvolution on both CUDA and OpenCL.
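
Since nn.TemporalConvolution2 is described as API-compatible with TemporalConvolution, switching should need little more than a change of class name. The following is a minimal sketch, assuming the constructor takes the same (inputFrameSize, outputFrameSize, kW, dW) arguments as nn.TemporalConvolution and that requiring clnn registers the class. The sizes are arbitrary illustration values.

```lua
require 'nn'
require 'clnn'  -- assumed to register nn.TemporalConvolution2

-- Illustrative sizes only, not taken from the clnn documentation.
local inputFrameSize, outputFrameSize = 10, 16
local kW, dW = 3, 1        -- kernel width and step, as in nn.TemporalConvolution
local nInputFrame = 20

local conv = nn.TemporalConvolution2(inputFrameSize, outputFrameSize, kW, dW)
local input = torch.Tensor(nInputFrame, inputFrameSize):uniform()

-- Forward pass on the CPU.
local output = conv:forward(input)

-- Same module and input, moved to OpenCL.
local outputCl = conv:cl():forward(input:cl())

-- With TemporalConvolution semantics, both results should have
-- (nInputFrame - kW) / dW + 1 = 18 frames of outputFrameSize values each.
print(output:size())
print(outputCl:size())
```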

Transfer function layers

  • nn.Tanh
  • nn.Sigmoid
  • nn.ReLU
  • nn.ELU
  • nn.Exp
  • nn.Sqrt
  • nn.Square
  • nn.Abs
  • nn.LogSigmoid
  • nn.HardTanh
  • nn.LogSoftMax
  • nn.SoftMax (including spatial mode)

Table layers

These 'just work', since they are based on underlying torch operations, which are already implemented in cltorch. Tested with:

  • nn.CMulTable
  • nn.CAddTable
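
As a rough illustration of the table layers running on OpenCL tensors, the sketch below feeds two small ClTensors through nn.CAddTable and nn.CMulTable. It assumes clnn and cltorch are installed as per distro-cl; the tensor shapes and fill values are made up for the example.

```lua
require 'nn'
require 'clnn'

-- Two small OpenCL tensors with known contents.
local a = torch.Tensor(2, 3):fill(2):cl()
local b = torch.Tensor(2, 3):fill(5):cl()

-- Table layers take a Lua table of tensors as input.
local sum = nn.CAddTable():cl():forward({a, b})   -- elementwise 2 + 5
local prod = nn.CMulTable():cl():forward({a, b})  -- elementwise 2 * 5

-- Copy back to the CPU for printing.
print(sum:float())
print(prod:float())
```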

Criterions

  • nn.MSECriterion
  • nn.ClassNLLCriterion

Containers

Containers 'just work', since they just call standard operations on the contained modules. Tested with:

  • nn.Sequential
  • nngraph
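
A minimal sketch of a container on OpenCL: build an nn.Sequential as usual, then convert it with :cl(). The layer sizes and batch size are arbitrary, and the example assumes a working clnn install.

```lua
require 'nn'
require 'clnn'

-- Build the network exactly as for CPU torch.
local net = nn.Sequential()
net:add(nn.Linear(10, 5))
net:add(nn.Tanh())
net:add(nn.Linear(5, 2))
net:add(nn.LogSoftMax())

-- Convert the container and everything it contains to OpenCL.
net:cl()

-- A batch of 4 random inputs, also moved to OpenCL.
local input = torch.Tensor(4, 10):uniform():cl()
local output = net:forward(input)

print(output:size())  -- 4 rows, 2 log-probabilities per row
```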

Trainers

In theory, trainers 'just work', since they just call standard torch methods on the network. The following are good first choices:

  • nn.StochasticGradient
  • optim.lbfgs
  • optim.adam
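
The sketch below shows one way a training loop with optim.adam might look on OpenCL. It assumes the optim package is installed; the network shape, random stand-in data, learning rate and step count are all illustrative choices, not recommendations from clnn.

```lua
require 'nn'
require 'clnn'
require 'optim'

local net = nn.Sequential()
net:add(nn.Linear(4, 8))
net:add(nn.Tanh())
net:add(nn.Linear(8, 1))
net:cl()
local criterion = nn.MSECriterion():cl()

-- Fetch the flattened parameters after the move to OpenCL,
-- so that they refer to the OpenCL storage.
local params, gradParams = net:getParameters()

-- Random stand-in data, for illustration only.
local input = torch.Tensor(16, 4):uniform():cl()
local target = torch.Tensor(16, 1):uniform():cl()

-- Closure in the form optim expects: returns loss and gradient.
local function feval(x)
  if x ~= params then params:copy(x) end
  gradParams:zero()
  local output = net:forward(input)
  local loss = criterion:forward(output, target)
  net:backward(input, criterion:backward(output, target))
  return loss, gradParams
end

local state = {learningRate = 0.01}
for step = 1, 20 do
  local _, losses = optim.adam(feval, params, state)
  if step % 5 == 0 then print(step, losses[1]) end
end
```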

Timings

Soumith benchmark layers

Please see https://github.com/soumith/convnet-benchmarks#imagenet-winners-benchmarking

  • On a Titan X, OpenCL torch is about 3 times slower than CUDA torch
    • e.g. for VGG, cutorch takes 1100ms and cltorch takes 3400ms
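
For reference, the slowdown implied by the two VGG timings quoted above can be worked out directly. The variable names are just for illustration.

```lua
-- Timings quoted above for VGG on a Titan X, in milliseconds.
local cutorchMs = 1100
local cltorchMs = 3400

local slowdown = cltorchMs / cutorchMs
print(string.format("slowdown: %.2fx", slowdown))  -- prints "slowdown: 3.09x"
```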

Example networks

Porting guidelines

Porting guidelines for project maintainers are available in porting-guidelines.md.

Recent changes

  • 2nd May:
    • Re-applied:
      • 26th March:
        • add TemporalConvolution2: same API and usage as TemporalConvolution, but faster on GPUs
  • 1st May:
    • Re-applied:
      • 10th March:
        • @pawni (Nick Pawlowski) added SpatialUpSamplingNearest. Thank you Nick
      • 20th February:
        • @gloine (Jaehyung Lee) added support for non-batched input to ClassNLLCriterion. Thank you Jaehyung
  • 30th April:
    • rolled back to as-of 21st February, prior to lots of THNN changes in upstream Torch
    • additionally, installation procedure is now to use a specific torch distro, for stability
  • 1st Feb:
    • merged/ported THNN phase 3. If you hit any weird build issues, please update both nn and clnn.
  • 2nd January, 2016:
    • merged/ported THNN architecture across, and the implementation of Abs, so the unit-tests pass again now
  • 15th December:
  • 29th November:
    • added ELU
  • 25th September:
  • 23rd September:
    • ported latest cunn implementation of SpatialMaxPooling across, i.e. approximately Sergey's Deterministic max-pooling PR
      • this includes :ceil() implementation
  • 22nd September:
    • added non-batch implementation of LogSoftMax (previously only handled batched input)
    • added SoftMax, for both batched and non-batched
  • 20th September:
    • added non-batch implementation for SpatialMaxPooling (previously only handled batched input), for contiguous pools
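
The :ceil() mode mentioned under 23rd September changes how the pooled output size is rounded. The plain-Lua sketch below shows the difference along one dimension, assuming zero padding and the usual Torch convention of rounding (inputSize - kernelSize) / stride down in the default mode and up in ceil mode. The pooledSize helper is hypothetical and is not part of clnn.

```lua
-- Output length of a pooling layer along one dimension.
-- Assumes zero padding; pooledSize is an illustrative helper, not clnn API.
local function pooledSize(inputSize, kernelSize, stride, ceilMode)
  local steps = (inputSize - kernelSize) / stride
  local round = ceilMode and math.ceil or math.floor
  return round(steps) + 1
end

-- Odd input: the last, partial window is only kept in ceil mode.
print(pooledSize(7, 2, 2, false))  -- 3
print(pooledSize(7, 2, 2, true))   -- 4

-- Even input: the windows tile exactly, so both modes agree.
print(pooledSize(8, 2, 2, false))  -- 4
print(pooledSize(8, 2, 2, true))   -- 4
```

In nn itself, ceil mode is selected by calling :ceil() on the pooling module, for example nn.SpatialMaxPooling(2, 2, 2, 2):ceil().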

Older changes
