Deep Learning Workshop
This repo includes all scripts required to build a VirtualBox 'Appliance' (an easy-to-install pre-configured VM) that can be used by Deep Learning Workshop participants.
This workshop gives an introduction to deep learning : starting with single-layer networks in the browser, then using the VM/Jupyter setup to train networks with both Theano (plus Lasagne for model components) and TensorFlow (plus some sugar layers). The modules also include pretrained state-of-the-art networks, such as GoogLeNet, in various applications. It has been presented at :
- FOSSASIA 2016 : Deep Learning Workshop (2 hours)
- PyCon-SG 2016 : Deep Learning Workshop (1.5 hours)
- DataScienceSG MeetUp : 'Hardcore' session about Deep Learning (2.5 hours)
- Fifth Elephant, India : Deep Learning Workshop (6 hours : 4x 1.5hr classes in one day)
  - Application : Classifying unknown classes of images (~transfer learning)
  - Application : Generative art (~style transfer)
  - Application : RNN Tagger
  - Application : RNN Fun (work-in-progress)
  - Application : Anomaly Detection (mis-shaped MNIST digits)
  - Application : Reinforcement Learning
  - Slides for the talk are here, with an accompanying blog post
- PyDataSG MeetUp : Talk on RNNs and NLP (1.5 hours)
- TensorFlow & Deep Learning MeetUp : Talk on transfer learning (0.5 hours)
- FOSSASIA 2017 : Deep Learning Workshop (1 hour)
- TensorFlow & Deep Learning MeetUp : Talk on CNNs (0.5 hours)
  - Application : Speech Recognition using a CNN (non-workshop version)
  - Slides for the talk are [here](http://redcatlabs.com/2017-03-20_TFandDL_IntroToCNNs/#/), with an accompanying blog post, which includes a video link
- TensorFlow & Deep Learning MeetUp : Generative Art : Style-Transfer (0.5 hours)
  - Application : Generative Art (Style-Transfer)
  - Slides for the talk are here
- APAC Machine Learning & Data Science Community Summit : In the news : AlphaGo and Reinforcement Learning (0.75 hours)
- TensorFlow & Deep Learning MeetUp : Text : Embeddings, RNNs and NER (~1 hour)
- TensorFlow & Deep Learning MeetUp : Advanced Text and Language (0.75 hours)
- FOSSASIA 2018 : Deep Learning Workshop (1 hour)
NB : Ensure Conference Workshop announcement / blurb includes VirtualBox warning label
- Also : for the Art (and potentially other image-focussed) modules, having a few 'personal' images available might be entertaining
The VM itself includes :
- Jupyter (iPython's successor)
  - Running as a server available to the host machine's browser
- Data
  - MNIST training and test sets
  - Trained models from two of the 'big' ImageNet winners
  - Test images for the recognition, 'e-commerce' and style-transfer modules
  - Corpuses and pretrained GloVe embeddings for the language examples
- Locally-runnable versions of a CNN demonstrator, and OpenAI's '3-boxes' Reptile demo
- Tool chain(s) (Python-oriented)
  - Theano / Lasagne
  - TensorFlow and Keras
  - PyTorch (CPU version)
And this repo can itself be run in 'local mode', using scripts in `./local/` to :
- Set up the virtual environment correctly
- Run `jupyter` with the right flags, paths, etc.
Status : Workshop WORKS!
Currently working well
- Scripts to create working Fedora 25 installation inside VM
  - Has working Python 3.x `virtualenv` with `Jupyter` and `TensorFlow` / `TensorBoard`
- Script to transform the VM into a VirtualBox appliance
  - Exposing `Jupyter`, `TensorBoard` and `ssh` to host machine
- Locally hosted Convnet.js for :
  - Demonstration of gradient descent ('painting')
- Locally hosted TensorFlow Playground for :
  - Visualising hidden layers, the effect of features, etc.
- Locally hosted cnn demo for :
  - Demonstration of how a single CNN 3x3 filter works
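The single-filter idea in that demo can be sketched in plain NumPy (an illustrative toy, not the demo's actual code) : slide one 3x3 kernel over an image and sum the element-wise products at each position.

```python
import numpy as np

def conv2d_single(image, kernel):
    """Valid-mode 2D convolution (strictly, cross-correlation, as in most
    deep-learning libraries) of one image with one small filter."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Element-wise product of the kernel with the 3x3 patch, summed
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A 4x4 image with a vertical edge between dark (0) and bright (1) halves
image = np.array([[0., 0., 1., 1.]] * 4)

# A vertical-edge filter : responds where brightness rises left-to-right
kernel = np.array([[-1., 0., 1.],
                   [-1., 0., 1.],
                   [-1., 0., 1.]])

print(conv2d_single(image, kernel))  # -> [[3. 3.]
                                     #     [3. 3.]]
```

Every 3x3 window here straddles the edge, so the filter fires (value 3) at all four output positions; on a uniform patch it would output 0.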
- Existing workshop notebooks :
  - Basics
  - MNIST
  - MNIST CNN
  - ImageNet : GoogLeNet
  - ImageNet : Inception 3
  - CNN for simple Voice Recognition
  - 'Anomaly Detection' - identifying mis-shaped MNIST digits
  - 'Commerce' - repurpose a trained network to classify our stuff
  - 'Art' - Style transfer with Lasagne, but using GoogLeNet features for speed
  - 'Reinforcement Learning' - learning to play "Bubble Breaker"
  - 'RNN-Tagger' - Processing text, and learning to do case-less Named Entity Recognition
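The repurposing idea behind the 'Commerce' notebook - keep a trained network's features frozen and fit only a light classifier on top - can be sketched with random stand-in 'features' (the real notebook uses actual CNN activations; everything below is an illustrative simplification using a nearest-centroid classifier) :

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for frozen penultimate-layer CNN features of two product classes
class_a = rng.normal(loc=0.0, scale=1.0, size=(20, 64))
class_b = rng.normal(loc=3.0, scale=1.0, size=(20, 64))

# 'Training' is just averaging the frozen features per class
centroids = np.stack([class_a.mean(axis=0), class_b.mean(axis=0)])

def classify(feature_vec):
    # Assign the item to whichever class centroid is nearest in feature space
    dists = np.linalg.norm(centroids - feature_vec, axis=1)
    return int(np.argmin(dists))

new_item = rng.normal(loc=3.0, scale=1.0, size=64)  # drawn like class 1
print(classify(new_item))  # -> 1
```

The point of the transfer-learning trick is that no gradient updates touch the big network : only the tiny classifier on top is fit to "our stuff".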
- Notebook Extras
  - U - VM Upgrade tool
  - X - BLAS configuration fiddle tool
  - Z - GPU chooser (needs Python's `BeautifulSoup`)
- Create rsync-able image containing :
  - VirtualBox appliance image, including data sets and pre-trained models
  - VirtualBox binaries for several likely platforms
- Write to thumb-drives for actual workshop, and/or upload to DropBox
- Workshop presentation materials
Still Work-in-Progress
- Create sync-to-latest-workbooks script to update existing (taken-home) VMs
- Create additional 'applications' modules (see 'ideas.md')
- Monitor TensorBoard - to see whether it reduces its memory footprint enough to switch from Theano...
- 'RNN-Fun' - Discriminative and Generative RNNs
Notes
Running the environment locally
See the local/README file.
Also worth investigating : Google Colab, which allows the Free (as in Beer) use of a K40 GPU in a Jupyter-notebook-like interface. In fact, there is also the possibility of pulling up GitHub-based notebooks directly using the url :
https://colab.research.google.com/github/USER/REPO/blob/master/NOTEBOOK
For a concrete example, look at this link to the recent revamped Reptile code from OpenAI that is in the MetaLearning folder of this repo.
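The Colab URL pattern above can be generated mechanically; a small helper (hypothetical, not part of this repo) might look like :

```python
def colab_url(user, repo, notebook, branch="master"):
    """Build a Google Colab link for a notebook hosted on GitHub,
    following the colab.research.google.com/github/... URL pattern."""
    return ("https://colab.research.google.com/github/"
            f"{user}/{repo}/blob/{branch}/{notebook}")

print(colab_url("USER", "REPO", "NOTEBOOK.ipynb"))
# -> https://colab.research.google.com/github/USER/REPO/blob/master/NOTEBOOK.ipynb
```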
Git-friendly iPython Notebooks
Using the code from : http://pascalbugnion.net/blog/ipython-notebooks-and-git.html (and https://gist.github.com/pbugnion/ea2797393033b54674af ), you can enable this kind of feature just on one repository, rather than installing it globally, as follows...
Within the repository, run :
# Set the permissions for execution :
chmod 754 ./bin/ipynb_optional_output_filter.py
git config filter.dropoutput_ipynb.smudge cat
git config filter.dropoutput_ipynb.clean ./bin/ipynb_optional_output_filter.py
This will add suitable entries to `./.git/config`.

Or, alternatively, create the entries manually by ensuring that your `.git/config` includes the lines :
[filter "dropoutput_ipynb"]
smudge = cat
    clean = ./bin/ipynb_optional_output_filter.py
Note also that this repo includes a `<REPO>/.gitattributes` file containing the following :

*.ipynb filter=dropoutput_ipynb

Doing this causes git to run `ipynb_optional_output_filter.py` in the `<REPO>/bin` directory, which only uses `import json` to parse the notebook files (and so can be executed as a plain script).
To disable the output-cleansing feature on a per-notebook basis, simply add the following to its metadata (Edit-Metadata) as a first-level entry (`true` is the default) :

"git" : { "suppress_outputs" : false },
Git-friendly iPython Notebooks (Looks promising, but...)
nbstripout seems to do what we want, and can be installed more easily.
Within the local python environment (or do this globally, as root, if you're committed) :
pip install nbstripout