  • Stars: 117
  • Rank 301,828 (Top 6 %)
  • Language: Python
  • Created over 7 years ago
  • Updated about 6 years ago

Repository Details

Implementations of basic reinforcement learning algorithms

Reinforcement Learning

Reinforcement learning examples applied to various environments (currently being ported to PyTorch)

Here is my new repo for Policy Gradient!


[Figure: Breakout, played with DQN (Nature 2015)]

1. Q-Learning / SARSA

2. Q-Network (Action-Value Function Approximation)

3. DQN

DQN (NIPS 2013) uses Experience Replay Memory and a CNN.

DQN (Nature 2015) uses Experience Replay Memory, a Target Network, and a CNN.

5. Vanilla Policy Gradient (REINFORCE)

6. Advantage Actor Critic

7. Deep Deterministic Policy Gradient

8. Parallel Advantage Actor Critic (called 'A2C' by OpenAI)

9. C51 (Distributional RL)

10. PPO (Proximal Policy Optimization)
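
As a rough illustration of the two components listed for DQN above, here is a minimal sketch of an Experience Replay Memory and a Target Network sync. It is not the code from this repository: `ReplayMemory`, `td_target`, and `sync_target` are hypothetical names, and a plain dict of lists stands in for the CNN parameters so that the example runs with the Python standard library alone.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement reduces correlation between
        # consecutive transitions in a training batch.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_q_values, done, gamma=0.99):
    """Target r + gamma * max_a' Q_target(s', a'); no bootstrapping at episode end."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def sync_target(online_params, target_params):
    """Hard update: copy the online parameters into the target network."""
    target_params.clear()
    # Copy each parameter list so later online updates do not leak into the target.
    target_params.update({name: list(values) for name, values in online_params.items()})
```

In this sketch, `next_q_values` would come from the target network rather than the online one, and `sync_target` would be called every fixed number of steps. In a PyTorch port, the usual equivalent of the hard update is `target_net.load_state_dict(online_net.state_dict())`.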