jcwleo/Reinforcement_Learning

Stars
117
Rank 301,828 (Top 6 %)
Language
Python
Created over 7 years ago
Updated about 6 years ago

jcwleo

Implementations of basic reinforcement learning algorithms

Reinforcement Learning

Reinforcement learning examples applied to a variety of environments (currently being ported to PyTorch)

Here is my new Repo for Policy Gradient!!

[Breakout / Use DQN(Nature2015)]

1. Q-Learning / SARSA

FrozenLake(Gridworld)
WindyGridWorld(in Sutton's book)
- Q-Learning / SARSA
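
The difference between the two tabular methods listed above comes down to which next-state value the update bootstraps from. The following is a minimal sketch, not the repository's code: the function names, the `Q` dictionary keyed by `(state, action)`, and the default `alpha` and `gamma` values are assumptions made for illustration.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.99, done=False):
    """Off-policy TD update: bootstrap from the greedy action in s_next."""
    best_next = 0.0 if done else max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99, done=False):
    """On-policy TD update: bootstrap from the action actually taken in s_next."""
    next_value = 0.0 if done else Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (r + gamma * next_value - Q[(s, a)])


# Hypothetical two-action example: state 1 has values 1.0 and 2.0.
Q = defaultdict(float)
Q[(1, 0)], Q[(1, 1)] = 1.0, 2.0

# Q-learning uses max(1.0, 2.0); SARSA uses the value of the chosen a_next=0.
q_learning_update(Q, s=0, a=0, r=0.0, s_next=1, actions=[0, 1], alpha=0.5, gamma=1.0)
sarsa_update(Q, s=0, a=1, r=0.0, s_next=1, a_next=0, alpha=0.5, gamma=1.0)
```

In an environment such as WindyGridWorld, the SARSA target depends on the exploratory action the agent will really take, which is why the two methods can learn different paths under the same epsilon-greedy policy.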

2. Q-Network (Action-Value Function Approximation)

3. DQN

DQN(NIPS2013) uses Experience Replay Memory and a CNN.

CartPole(Classic Control) - For CartPole, no CNN is used; the agent learns from the sensor readings instead

DQN(Nature2015) uses Experience Replay Memory, a Target Network, and a CNN.

CartPole(Classic Control)
Breakout(atari)
Breakout(atari)
- this version is written in PyTorch and is more efficient in memory use and training
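
The two ingredients named above, Experience Replay Memory and the Target Network, can be sketched without any deep learning framework. This is an illustrative outline under stated assumptions, not the code from this repository: `ReplayMemory`, `td_target`, and the transition layout are hypothetical names, and the Q-values are passed in as plain lists rather than produced by a CNN.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer; the oldest transitions are dropped when it is full."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling breaks the correlation between consecutive frames.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_q_values, done, gamma=0.99):
    """r + gamma * max_a' Q(s', a'), with no bootstrapping at episode end."""
    return reward if done else reward + gamma * max(next_q_values)


memory = ReplayMemory(capacity=3)
for step in range(5):
    memory.push(state=step, action=0, reward=1.0, next_state=step + 1, done=False)
batch = memory.sample(2)

# In the Nature 2015 variant, next_q_values would come from the target network,
# a periodically synchronized copy of the online network.
target = td_target(reward=1.0, next_q_values=[0.5, 2.0], done=False, gamma=0.9)
```

In the NIPS 2013 variant the same `next_q_values` would be computed by the online network itself, since that version has no separate target network.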

5. Vanilla Policy Gradient(REINFORCE)
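
As a rough illustration of what REINFORCE optimizes, the sketch below computes the discounted return for every step of a finished episode and the corresponding loss. It is not taken from the repository: the function names are hypothetical, and the log-probabilities are plain floats standing in for the output of a policy network.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + ... for every step of one episode."""
    returns = []
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def reinforce_loss(log_probs, returns):
    """Negative of sum_t log pi(a_t | s_t) * G_t; minimizing it ascends the policy gradient."""
    return -sum(lp * g for lp, g in zip(log_probs, returns))


returns = discounted_returns([1.0, 1.0, 1.0], gamma=0.5)  # [1.75, 1.5, 1.0]
loss = reinforce_loss([-1.0, -2.0, -1.0], returns)
```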

6. Advantage Actor Critic

episodic
- CartPole(Classic Control)
- Pong(atari)
one-step
- CartPole(Classic Control)
n-step
- CartPole(Classic Control)
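
The episodic, one-step, and n-step variants listed above differ mainly in how far the rollout runs before the critic's value estimate is used as a bootstrap. The sketch below shows one way to express all three with a single function. It is an assumption-laden illustration, not the repository's implementation, and `n_step_targets` and `advantages` are hypothetical names.

```python
def n_step_targets(rewards, bootstrap_value, gamma=0.99):
    """Targets for a rollout r_0..r_{n-1} followed by the estimate V(s_n).

    G_t = r_t + gamma * r_{t+1} + ... + gamma^(n-t) * V(s_n)
    """
    targets = []
    g = bootstrap_value
    for r in reversed(rewards):
        g = r + gamma * g
        targets.append(g)
    targets.reverse()
    return targets


def advantages(targets, values):
    """Advantage estimate A_t = G_t - V(s_t), used to weight the policy gradient."""
    return [g - v for g, v in zip(targets, values)]


# n-step: two rewards, then bootstrap from V(s_2) = 2.0.
targets = n_step_targets([1.0, 1.0], bootstrap_value=2.0, gamma=0.5)
adv = advantages(targets, values=[1.5, 2.5])

# one-step: a single reward plus the bootstrapped next-state value.
one_step = n_step_targets([1.0], bootstrap_value=2.0, gamma=0.5)

# episodic: run to the end of the episode, so the bootstrap value is 0.
episodic = n_step_targets([1.0, 1.0, 1.0], bootstrap_value=0.0, gamma=0.5)
```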

7. Deep Deterministic Policy Gradient

Pendulum(Classic Control)
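
DDPG, as described in the original paper, keeps slowly moving target copies of the actor and critic instead of copying weights in one jump. A minimal sketch of that soft update is shown below, with parameters represented as plain lists of floats. The name `soft_update` and the default `tau` are assumptions for illustration, not this repository's code.

```python
def soft_update(target_params, online_params, tau=0.005):
    """Polyak averaging: target <- (1 - tau) * target + tau * online."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]


target_params = [0.0, 1.0]
online_params = [1.0, 1.0]
target_params = soft_update(target_params, online_params, tau=0.1)
```

A small `tau` makes the targets change slowly, which tends to stabilize the critic's bootstrapped updates.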

8. Parallel Advantage Actor Critic(called 'A2C' by OpenAI)

CartPole(Classic Control)(uses a single thread instead of multiple threads)
CartPole(Classic Control)(used multiprocessing in pytorch)
Super Mario Bros(used multiprocessing in pytorch)

9. C51(Distributional RL)

DDQN
- CartPole(Classic Control)

10. PPO(Proximal Policy Optimization)

CartPole(Classic Control)
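
The core of PPO is its clipped surrogate objective, which limits how far a single update can move the policy away from the one that collected the data. The sketch below evaluates that objective for one sample. It is an illustration only: `ppo_clip_objective` is a hypothetical name, the inputs are plain floats rather than network outputs, and `clip_eps=0.2` is simply a commonly used value.

```python
import math


def ppo_clip_objective(new_log_prob, old_log_prob, advantage, clip_eps=0.2):
    """min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A) for a single sample."""
    ratio = math.exp(new_log_prob - old_log_prob)
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped_ratio * advantage)


# The probability ratio is 1.5 in both calls below.
# Positive advantage: the gain is capped at (1 + eps) * A.
capped = ppo_clip_objective(math.log(1.5), 0.0, advantage=1.0)
# Negative advantage: the unclipped, more pessimistic term is kept.
uncapped = ppo_clip_objective(math.log(1.5), 0.0, advantage=-1.0)
```

In training, the mean of this quantity over a batch is maximized (or its negative minimized), usually alongside a value loss and an entropy bonus.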

random-network-distillation-pytorch

Random Network Distillation pytorch

curiosity-driven-exploration-pytorch

Curiosity-driven Exploration by Self-supervised Prediction

mario_rl

awr-pytorch

Advantage-Weighted Regression

DPLL-Algorithm

implementing DPLL Algorithm

python-web-application-base

Application kits that can be developed with Python

Tensorflow

Basic tutorials on Tensorflow

Machine-Learning

implementing ML algorithm