• Stars: 273
• Rank: 149,903 (Top 3%)
• Language: Python
• License: MIT License
• Created: almost 5 years ago
• Updated: about 2 months ago


Repository Details

PyTorch implementation of SAC-Discrete.

SAC-Discrete in PyTorch

This is a PyTorch implementation of SAC-Discrete[1]. I tried to make the code easy for readers to understand. Please let me know if you have any questions.
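
For reference, the core of the discrete-action policy update from [1] can be sketched as follows (a minimal sketch assuming a twin-Q critic; the function and tensor names are illustrative, not this repository's actual API):

import torch
import torch.nn.functional as F

def actor_loss(logits, q1, q2, alpha):
    # logits: (batch, n_actions) from the policy network.
    # q1, q2: (batch, n_actions) from the twin Q-networks.
    log_probs = F.log_softmax(logits, dim=1)
    probs = log_probs.exp()
    # With discrete actions, the expectation over actions is an exact sum,
    # so no reparameterization trick is needed.
    q = torch.min(q1, q2)
    entropy = -(probs * log_probs).sum(dim=1)
    expected_q = (probs * q).sum(dim=1)
    # Maximize E_a[Q(s, a)] + alpha * H(pi(.|s)).
    return (-expected_q - alpha * entropy).mean()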

UPDATE

  • 2020.5.10
    • Refactored the code and fixed a bug in the SAC-Discrete algorithm.
    • Implemented Prioritized Experience Replay[2], N-step returns (see the sketch below) and Dueling Networks[3].
    • Tested them.
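
As a reference for the N-step return mentioned above, here is a minimal sketch (illustrative only; the replay buffer in this repository may compute it differently):

def n_step_return(rewards, gamma):
    # rewards: [r_t, ..., r_{t+n-1}], the n consecutive rewards.
    # Returns sum_{k=0}^{n-1} gamma^k * r_{t+k}; the bootstrap term
    # gamma^n * Q(s_{t+n}, .) is added separately by the learner.
    ret = 0.0
    for r in reversed(rewards):
        ret = r + gamma * ret
    return ret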

Setup

If you are using Anaconda, first create the virtual environment.

conda create -n sacd python=3.7 -y
conda activate sacd

You can install the required Python libraries using pip.

pip install -r requirements.txt

If you're using a CUDA version other than 10.2, you may need to install the PyTorch build that matches your CUDA version. See the official PyTorch installation instructions for more details.
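
For example, a CUDA 10.1 build could be installed like this (the version numbers are illustrative; pick the ones matching your setup from the PyTorch site):

pip install torch==1.5.0+cu101 torchvision==0.6.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html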

Examples

You can train a SAC-Discrete agent as in the example below.

python train.py --config config/sacd.yaml --env_id MsPacmanNoFrameskip-v4 --cuda --seed 0

If you want to use Prioritized Experience Replay (PER), N-step returns or Dueling Networks, set use_per, multi_step or dueling_net in the config respectively.
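
For example, the relevant part of the config might look like this (a hypothetical sketch; the key names are assumed to match the options above, so check config/sacd.yaml for the actual schema):

use_per: true      # enable Prioritized Experience Replay
multi_step: 3      # N of the N-step return (1 disables it)
dueling_net: true  # use the dueling network architecture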

Results

I evaluated vanilla SAC-Discrete, and variants with PER, N-step returns or Dueling Networks, on MsPacmanNoFrameskip-v4. The graph below shows test returns against environment steps (environment frames divided by 4, due to frame skipping). Note that the curves are smoothed by an exponential moving average with weight=0.5 for visualization.
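
The smoothing mentioned above can be reproduced roughly as follows (a sketch; the actual plotting code may differ):

def smooth(values, weight=0.5):
    # Exponential moving average over a list of test returns.
    smoothed, last = [], values[0]
    for v in values:
        last = weight * last + (1 - weight) * v
        smoothed.append(last)
    return smoothed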

N-step returns and PER seem helpful for making better use of RL signals (e.g. sparse rewards).

References

[1] Christodoulou, Petros. "Soft Actor-Critic for Discrete Action Settings." arXiv preprint arXiv:1910.07207 (2019).

[2] Schaul, Tom, et al. "Prioritized experience replay." arXiv preprint arXiv:1511.05952 (2015).

[3] Wang, Ziyu, et al. "Dueling network architectures for deep reinforcement learning." arXiv preprint arXiv:1511.06581 (2015).

More Repositories

1. gail-airl-ppo.pytorch (Python, 182 stars)
   PyTorch implementation of GAIL and AIRL based on PPO.
2. fqf-iqn-qrdqn.pytorch (Python, 158 stars)
   PyTorch implementation of FQF, IQN and QR-DQN.
3. soft-actor-critic.pytorch (Python, 94 stars)
   PyTorch implementation of Soft Actor-Critic (SAC).
4. rljax (Python, 92 stars)
   A collection of RL algorithms written in JAX.
5. slac.pytorch (Python, 87 stars)
   PyTorch implementation of Stochastic Latent Actor-Critic (SLAC).
6. discor.pytorch (Python, 38 stars)
   PyTorch implementation of Distribution Correction (DisCor) based on Soft Actor-Critic.
7. alfred-aws-icons (Go, 27 stars)
   Alfred Workflow for quickly pasting AWS architecture icons onto PowerPoint.
8. rltorch (Python, 16 stars)
   A simple framework for distributed reinforcement learning in PyTorch.
9. vae.pytorch (Python, 12 stars)
   PyTorch implementation of Deep Feature Consistent Variational Autoencoder.
10. simple-rl.pytorch (Python, 9 stars)
    Simple implementation of model-free RL algorithms written in PyTorch.
11. wappo.pytorch (Python, 6 stars)
    PyTorch implementation of Wasserstein Adversarial Proximal Policy Optimization (WAPPO).
12. slac-discrete.pytorch (Python, 2 stars)
    PyTorch implementation of Stochastic Latent Actor-Critic (SLAC) extended for discrete action settings.
13. gec-app (Python, 2 stars)
    Frontend/backend application code and infrastructure for grammatical error correction.
14. dmm-schedule-checker (Go, 1 star)
    Continuously monitors the schedule of your favorite teachers and notifies you via LINE whenever new slots are available.
15. sagemaker-tutorial (Jupyter Notebook, 1 star)
    Amazon SageMaker tutorial.
16. ssm-enforcement-tool (Go, 1 star)
    A set of infrastructure implemented in Terraform to monitor your "not-managed-by-SSM" instances across all regions.