Awesome Reinforcement Learning
Icon descriptions:
- 🚀 - state-of-the-art agent/technique at the time of paper publication.
- ⭐ - valuable paper.
- - Model-based RL.
- - Multi-Agent RL.
- - Self-Play.
- - Evolutionary & Genetic Algorithms.
- - Generalization on unseen environments.
- - Auto ML - Architecture search.
- - Manipulation tasks.
- - Locomotion: MuJoCo, Roboschool, etc.
- - Navigation tasks.
- - Strategy Planning Problems.
- - Transfer learning.
- - Inverse Reinforcement Learning.
- - Meta-Learning.
- - Curiosity Learning, Advanced Exploration.
- - Table games (Table).
- - Atari game (Atari).
- - Doom game (Doom).
- - Starcraft game (Starcraft).
- - Go game (Go).
Table of Contents
- Frameworks
- Benchmarks
- Policy-Based Generic Agents
- Value-Based Generic Agents
- Model-Based Generic Agents
- Evolutionary & Genetic Algorithms
- Exploration
- Self-Play
- Meta-Learning
- Multi-Agent RL
- Inverse RL
- Navigation
- Manipulation
- Locomotion
- Auto ML
- Other Domains
- Books
- Search for new Papers
- Misc
RL Frameworks & Implementations
[Baselines @ OpenAI] TensorFlow: PPO, A2C, DQN, TRPO, ACKTR, DDPG, HER, GAIL, etc
[Baselines @ DLR-RM] PyTorch: Custom envs, custom policies
[RLlib @ Ray] PyTorch / TensorFlow
[Dopamine @ Google] TensorFlow: Rainbow, C51, IQN, DQN, etc
[TensorForce] TensorFlow: A3C, PPO, TRPO, DQN, etc
[pytorch-a2c-ppo-acktr] PyTorch: A2C, ACKTR, PPO, GAIL, etc
RL Benchmarks
[OpenAI Benchmarks for PPO, A2C, ACKTR, ACER]
[OpenAI Benchmarks for DQN, Double DQN, Dueling DQN, Prioritized DQN]
[Google Benchmarks for Rainbow, C51, IQN, DQN]
Policy-Based Generic Agents
[High-dimensional continuous control using generalized advantage estimation (GAE)] 2015 @ Berkeley
Value-Based Generic Agents
Model-Based Generic Agents
[Model-Based Reinforcement Learning for Atari] 2019 @ Google Brain, etc
[Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning] [blog] [code] 2017 @ Berkeley
[Learning model-based planning from scratch] [blog] 2017 @ Google DeepMind
[The Predictron: End-To-End Learning and Planning] 2016 @ Google DeepMind
Evolutionary Algorithms
[Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari] 2018 @ Univ. of Freiburg
[Evolving Large-Scale Neural Networks for Vision-Based Reinforcement Learning, pdf] 2013 @ IDSIA, USI-SUPSI
Exploration
[Exploration by Random Network Distillation (RND)] [blog] [code] 2018 @ OpenAI
[Large-Scale Study of Curiosity-Driven Learning] [blog] 2018 @ OpenAI, Berkeley, Univ. of Edinburgh
[Deep Curiosity Search] 2018 @ Univ. of Wyoming
[Parameter Space Noise for Exploration] 2017 @ OpenAI, Karlsruhe Inst. of Tech.
Self-Play
[Mastering the game of Go with deep neural networks and tree search (AlphaGo Master)] [reddit] Silver et al., 2017 @ DeepMind, Google
Meta-Learning
[Meta Learning Shared Hierarchies] [blog] Frans et al., 2017 @ OpenAI, Berkeley
[Hybrid Reward Architecture for Reinforcement Learning (HRA)] van Seijen et al., 2017 @ Microsoft Maluuba, McGill Univ.
Multi-Agent RL
[Learning with Opponent-Learning Awareness (LOLA)] [blog] Foerster et al., 2017 @ OpenAI, Oxford, Berkeley, CMU
Inverse RL
[SFV: Reinforcement Learning of Physical Skills from Videos] [blog] Peng et al., 2018 @ Berkeley
[One-Shot Imitation from Observing Humans via Domain-Adaptive Meta-Learning] Finn et al., 2018 @ UC Berkeley
[One-Shot Visual Imitation Learning via Meta-Learning] Finn et al., 2017 @ UC Berkeley, OpenAI
Navigation
[Learning to Navigate in Cities Without a Map] Mirowski et al., 2019 @ DeepMind
[Human-level performance in first-person multiplayer games with population-based deep reinforcement learning] [blog] Jaderberg et al., 2018 @ DeepMind
[Building Generalizable Agents with a Realistic and Rich 3D Environment] Wu et al., 2018 @ Berkeley, FAIR
[Distral: Robust Multitask Reinforcement Learning] Teh et al., 2017 @ DeepMind
[RL2: Fast Reinforcement Learning via Slow Reinforcement Learning] Duan et al., 2016 @ Berkeley, OpenAI
[Playing FPS Games with Deep Reinforcement Learning (VizDoom 2016 Limited DM 2nd place)] Lample, Chaplot, 2016 @ CMU
Manipulation
[Learning Dexterous In-Hand Manipulation] [blog] Andrychowicz et al., 2018 @ OpenAI
[Asymmetric Actor Critic for Image-Based Robot Learning] [blog] Pinto et al., 2017 @ OpenAI, CMU
[Sim-to-Real Transfer of Robotic Control with Dynamics Randomization] [blog] Peng et al., 2017 @ OpenAI, Berkeley
Locomotion
[Emergence of Locomotion Behaviours in Rich Environments] [blog] Heess et al., 2017 @ DeepMind
[Programmable Agents] Denil et al., 2017 @ Google DeepMind
Auto ML
[AutoAugment: Learning Augmentation Policies from Data] Cubuk et al., 2018 @ Google Brain
[Neural Optimizer Search with Reinforcement Learning, pdf] Bello et al., 2017 @ Google Brain
[Neural Architecture Search with Reinforcement Learning] B. Zoph and Quoc V. Le, 2016 @ Google Brain
Other Domains
[A Deep Reinforcement Learning Chatbot] Serban et al., 2017 @ MILA
Books
Search for new Papers
[A Brief Survey of Deep Reinforcement Learning] Arulkumaran et al., 2017
Another Awesome Deep RL list: https://github.com/tigerneil/awesome-deep-rl
ArXiv Sanity Preserver: http://www.arxiv-sanity.com/
GitXiv: http://www.gitxiv.com/
Misc
[How to Read a Paper] S. Keshav, 2007 @ Univ. of Waterloo
[Transformers: Attention is all you need] Vaswani et al., 2017 @ Google Brain/Research