study-group
We are a diverse group of software engineering and data science professionals from industry and academia.
Inspired by the collaborative culture of artificial neural network research, we hold regular, chilled-beverage-enhanced study sessions in midtown Manhattan. At these meetings, we summarise prescribed preparatory material and leverage our individual strengths in computer science, mathematics, statistics, neuroscience, and venture capital to cement our comprehension of concepts and to implement effective deep learning models.
Over the course of our sessions, we follow three parallel paths:
- Theory: We study academic textbooks, exercises, and coursework so that we command strong theoretical foundations for neural networks and deep learning. Broadly, we cover calculus, algebra, probability, and computer science, with a focus on their intersection in machine learning.
- Application: We practice deep learning in the real world. We typically commence by collectively following tutorials, then move on to solving novel and illustrative data problems involving a broad range of techniques. In addition to incorporating deep learning into our respective academic and commercial applications, we commit code to the present, public repository where possible.
- Presentations: Study group members regularly share their progress on Deep Learning projects and their areas of expertise. This elicits novel discourse outside of the relatively formal Theory and Application paths, playfully encouraging serendipity.
Theory
Theory coverage has been led by:
- Jon Krohn
  - Artificial Neural Networks with Michael Nielsen's introductory text Neural Networks and Deep Learning (covered in sessions I through V)
  - Machine Vision with Fei-Fei Li, Andrej Karpathy and Justin Johnson's CS231n on Convolutional Neural Networks for Visual Recognition (sessions VI through VIII)
  - Natural Language Processing with Richard Socher and Christopher Manning's CS224D (2016)/CS224N (2017) on Deep Learning for Natural Language Processing (sessions IX through XIII)
  - Reinforcement Learning with Sergey Levine's CS294-112 on Deep Reinforcement Learning (session XIV)
- Laura Graesser and Wah Loon Keng on Deep Reinforcement Learning
  - Deep Q-Learning with their OpenAI Lab (session XV)
  - Policy Gradients with their SLM Lab (session XVI)
If you're looking to get a handle on the fundamentals of Deep Learning, check out Jon Krohn's:
- Comprehensive book Deep Learning Illustrated, based on the topics covered in the study group
- Seven-hour introduction to Deep Learning in general
- Five-hour introduction to Deep Learning for Natural Language Processing specifically
- Six-hour introduction to Machine Vision, Generative Adversarial Networks, and Deep Reinforcement Learning
Application
Our applications have involved a broad range of neural network architectures built largely in Python with NumPy, TensorFlow, and PyTorch.
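For a rough sense of the kind of model we build together, here is a minimal sketch of a small dense classifier in the Keras API bundled with TensorFlow. It is illustrative only and not taken from this repository; the layer sizes and data shape (784-dimensional inputs, 10 classes) are assumptions made purely for the example.

```python
# Illustrative sketch only (not from this repository): a minimal dense
# network for 10-class classification of 784-dimensional inputs,
# built with the Keras API that ships with TensorFlow.
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Input(shape=(784,)),              # e.g. flattened 28x28 images
    layers.Dense(64, activation="relu"),     # single hidden layer
    layers.Dense(10, activation="softmax"),  # class probabilities
])

model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
model.summary()
# Training would then be a single call, e.g. model.fit(x_train, y_train, epochs=5)
```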
Presentations
In chronological order, we have experienced the joy of being enlightened by:
- Katya Vasilaky on the mathematics of deep learning (session II) and on regularization (session VIII)
- Thomas Balestri on countless machine-learning underpinnings (sessions III and IV), LSTM gates (session XII), and Reinforcement Learning (session XIV)
- Gabe Rives-Corbett on Keras implementations of deep learning deployed at untapt (session III)
- Dmitri Nesterenko on his NumPy implementation of k-nearest neighbours (session VI) and capsule networks (session XVII)
- Raphaela Sapire on the deep learning start-up investment atmosphere (session VIII)
- Grant Beyleveld on his U-Net convolutional network (session IX), GANs (session XII), and transformer architectures (session XVII)
- Jessica Graves on applications of deep learning to the fashion industry (session IX)
- V.T. Rajan on deriving the word2vec algorithm (session X)
- Karl Habermas on his NumPy implementation of the word2vec algorithm (session X)
- David Epstein on generative adversarial networks (session X)
- Claudia Perlich on predictability and how it creates biases when your target is created by mixtures (session XI)
- Brian Dalessandro on generating text with Keras LSTM models (session XI)
- Marianne Monteiro on TensorFlow Recurrent Neural Network implementations (session XIII)
- Druce Vertes on predicting which financial stories go viral on social media (session XIII)
Session Notes
Click through for detailed summary notes from each session:
- August 17th, 2016: Perceptrons and Sigmoid Neurons
- September 6th, 2016: The Backpropagation Algorithm
- September 28th, 2016: Improving Neural Networks
- October 20th, 2016: Proofs of Key Properties
- November 10th, 2016: Deep (Conv)Nets
- November 30th, 2016: Convolutional Neural Networks for Visual Recognition
- January 12th, 2017: Implementing Convolutional Nets
- February 7th, 2017: Unsupervised Learning, Regularisation, and Venture Capital
- March 6th, 2017: Word Vectors, AI x Fashion, and U-Net
- March 27th, 2017: word2vec Mania + GANs
- April 19th, 2017: Recurrent Neural Networks, including GRUs and LSTMs
- July 1st, 2017: Translation, Attention, more LSTMs, Speech-to-Text, and TreeRNNs
- August 5th, 2017: Model Architectures for Answering Questions and Overcoming NLP Limits
- October 17th, 2017: Reinforcement Learning
- December 9th, 2017: Deep Reinforcement Learning (Deep Q-Learning and OpenAI Lab)
- February 17th, 2018: Deep Reinforcement Learning (Policy Gradients and SLM-Lab)
- October 16th, 2019: Deep Learning Illustrated Book Launch, Transformer Architectures, and Capsule Networks
Acknowledgements
Thank you to untapt and its visionary, neural net-loving founder Ed Donner for hosting and subsidising all meetings of the Deep Learning Study Group.
With a desire to remain intimately sized, our study group has reached its capacity. If you'd like to be added to our waiting list, please contact the organiser, Jon Krohn, describing your relevant experience as well as your interest in deep learning. We don't necessarily expect you to be a deep learning expert already :)