XinJingHao/Deep-Reinforcement-Learning-Algorithms-with-Pytorch

Stars: 833
Rank: 54,305 (Top 2%)
Language: Python
Created almost 3 years ago
Updated 3 months ago


Clean, Robust, and Unified PyTorch implementation of popular DRL Algorithms (Q-learning, Duel DDQN, PER, C51, Noisy DQN, PPO, DDPG, TD3, SAC, ASL)

Clean, Robust, and Unified implementation of classical Deep Reinforcement Learning Algorithms

Links to my code:

Recommended Resources for DRL

Books:

Reinforcement Learning: An Introduction -- Richard S. Sutton
Introduction to Deep Learning: Theory and Implementation with Python (深度学习入门：基于Python的理论与实现) -- Koki Saito (斋藤康毅)

Online Courses:

RL Courses (bilibili) -- Hung-yi Lee (李宏毅)
RL Courses (YouTube) -- Hung-yi Lee (李宏毅)
UCL Course on RL -- David Silver
Hands-on Reinforcement Learning (动手强化学习) -- Shanghai Jiao Tong University

Blogs:

Simulation Environments:

Important Papers

DQN: Mnih V, Kavukcuoglu K, Silver D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529-533.

Double DQN: Van Hasselt H, Guez A, Silver D. Deep reinforcement learning with double q-learning[C]//Proceedings of the AAAI conference on artificial intelligence. 2016, 30(1).

PER: Schaul T, Quan J, Antonoglou I, et al. Prioritized experience replay[J]. arXiv preprint arXiv:1511.05952, 2015.

PPO: Schulman J, Wolski F, Dhariwal P, et al. Proximal policy optimization algorithms[J]. arXiv preprint arXiv:1707.06347, 2017.

DDPG: Lillicrap T P, Hunt J J, Pritzel A, et al. Continuous control with deep reinforcement learning[J]. arXiv preprint arXiv:1509.02971, 2015.

TD3: Fujimoto S, Hoof H, Meger D. Addressing function approximation error in actor-critic methods[C]//International conference on machine learning. PMLR, 2018: 1587-1596.

SAC: Haarnoja T, Zhou A, Abbeel P, et al. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor[C]//International conference on machine learning. PMLR, 2018: 1861-1870.

ASL: Train a Real-world Local Path Planner in One Hour via Partially Decoupled Reinforcement Learning and Vectorized Diversity
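
As a quick illustration of what separates the first two papers above, here is a minimal pure-Python sketch of the one-step bootstrap targets used by DQN and Double DQN. It is not taken from this repository's code: the function names and the example Q-values are hypothetical, and plain lists stand in for network outputs.

```python
def dqn_target(reward, done, gamma, q_target_next):
    """DQN target: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward, done, gamma, q_online_next, q_target_next):
    """Double DQN target: the online network selects the next action,
    and the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Hypothetical Q-value estimates for the next state (three actions).
q_online_next = [1.0, 2.0, 0.5]
q_target_next = [3.0, 1.5, 0.2]

y_dqn = dqn_target(1.0, False, 0.9, q_target_next)
y_ddqn = double_dqn_target(1.0, False, 0.9, q_online_next, q_target_next)
print(y_dqn, y_ddqn)
```

With these made-up numbers the Double DQN target is the smaller of the two, because the action chosen by the online estimates is not the one the target estimates score highest. Decoupling selection from evaluation in this way is how Double DQN reduces overestimation.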

Training Curves of my Code:

Q-learning:

DQN/DDQN on Classic Control:

DQN/DDQN on Atari Game:

Pong	Enduro

Prioritized DQN/DDQN on Classic Control:

CartPole	LunarLander

PPO Discrete:

PPO Continuous:

DDPG:

Pendulum	LunarLanderContinuous

TD3:

SAC Continuous:

SAC Discrete:

Actor-Sharer-Learner (ASL):

PPO-Continuous-Pytorch

A clean and robust Pytorch implementation of PPO on continuous action space.

TD3-BipedalWalkerHardcore-v2

Solve BipedalWalkerHardcore-v2 with TD3

PPO-Discrete-Pytorch

A clean and robust Pytorch implementation of PPO on discrete action space

SAC-Continuous-Pytorch

A clean and robust Pytorch implementation of SAC on continuous action space

Duel-Double-DQN-Pytorch

A clean and robust implementation of Duel Double DQN

OkayPlan

OkayPlan: A real-time global path planning algorithm for dynamic environments

SAC-Discrete-Pytorch

A clean and robust Pytorch implementation of SAC on discrete action space

Actor-Sharer-Learner

Actor-Sharer-Learner training framework for off-policy DRL algorithms

TD3-Pytorch

A clean and robust Pytorch implementation of TD3 on continuous action space

Prioritized-Experience-Replay-DDQN-Pytorch

A clean and robust implementation of Prioritized DQN and Prioritized Double DQN

Sparrow-V0

A Reinforcement Learning Friendly Simulator for Mobile Robot

Real-time-Path-planning-with-SEPSO

Efficient Real-time Path Planning with SEPSO in Dynamic Scenarios

Color

Color: Train a Real-world Local Path Planner in One Hour via Partially Decoupled Reinforcement Learning and Vectorized Diversity

DDPG-Pytorch

A clean Pytorch implementation of DDPG on continuous action space.

Noisy-Duel-DDQN-Atari-Pytorch

A clean and robust implementation of Noisy-Duel-DDQN on Atari games

okayplan_ros

Real-time global path planning algorithm for dynamic environments

Q-learning

An implementation of Q-learning

Sparrow-V1

A Reinforcement Learning Friendly Simulator for Mobile Robot

C51-Categorical-DQN-Pytorch

A clean and robust Pytorch implementation of Categorical DQN (C51)
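
For readers new to the simplest entry in the list above (Q-learning), the following is a minimal sketch of the tabular Q-learning update in pure Python. It is an illustration under stated assumptions, not the code from the Q-learning repository: the table layout, the `N_ACTIONS` constant, the state names, and the hyperparameter values are all hypothetical.

```python
from collections import defaultdict

N_ACTIONS = 2  # hypothetical number of discrete actions


def make_q_table():
    """Q-table mapping each state to a list of action values, initialised to zero."""
    return defaultdict(lambda: [0.0] * N_ACTIONS)


def q_learning_update(q_table, state, action, reward, next_state, done,
                      alpha=0.5, gamma=0.9):
    """Apply one Q-learning step and return the TD error.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    The bootstrap term is dropped when the transition ends the episode.
    """
    bootstrap = 0.0 if done else gamma * max(q_table[next_state])
    td_error = reward + bootstrap - q_table[state][action]
    q_table[state][action] += alpha * td_error
    return td_error


q_table = make_q_table()
# A terminal transition out of "s1" with reward 1.0.
q_learning_update(q_table, "s1", 0, 1.0, "terminal", True)
# A non-terminal transition from "s0" to "s1" with reward 0.0.
q_learning_update(q_table, "s0", 1, 0.0, "s1", False)
print(dict(q_table))
```

The second update shows value propagating backwards: "s0" receives no reward itself but gains value from the estimate already stored for "s1".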