• Stars
    star
    178
  • Rank 214,989 (Top 5 %)
  • Language
    Python
  • License
    MIT License
  • Created over 7 years ago
  • Updated almost 7 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Noisy Networks for Exploration

NoisyNet-A3C

MIT License

NoisyNet [1] (LSTM) asynchronous advantage actor-critic (A3C) [2] on the CartPole-v1 environment. This repo has a minimalistic design and a classic control environment to enable quick investigation of different hyperparameters.

Run with python main.py <options>. Entropy regularisation can still be added by setting --entropy-weight <value>, but it is 0 by default. Run with --no-noise to run normal A3C (without noisy linear layers).

Requirements

To install all dependencies with Anaconda run conda env create -f environment.yml and use source activate noisynet to activate the environment.

Results

NoisyNet-A3C

On the whole, NoisyNet-A3C tends to be better than A3C (with or without entropy regularisation). There seems to be more variance, with both good and poor runs, probably due to "deep" exploration.

Good-NoisyNet-A3C

Bad-NoisyNet-A3C

NoisyNet-A3C is perhaps even more prone to performance collapses than normal A3C. Many deep reinforcement learning algorithms are still prone to this.

Collapse-NoisyNet-A3C

A3C (no entropy regularisation)

A3C without entropy regularisation usually performs poorly.

A3C

A3C (entropy regularisation with ฮฒ = 0.01)

A3C with entropy regularisation usually performs a bit better than A3C without entropy regularisation, and also poor runs of NoisyNet-A3C. The performance tends to be significantly worse than the best NoisyNet-A3C runs.

A3C-entropy

Note that due to the nondeterminism introduced by asynchronous agents, different runs on even the same seed can produce different results, and hence the results presented are only single samples of the performance of these algorithms. Interestingly, the general observations above seem to hold even when increasing the number of processes (experiments were repeated with 16 processes). These algorithms are still sensitive to the choice of hyperparameters, and will need to be tuned extensively to get good performance on other domains.

Acknowledgements

References

[1] Noisy Networks for Exploration
[2] Asynchronous Methods for Deep Reinforcement Learning

More Repositories

1

Rainbow

Rainbow: Combining Improvements in Deep Reinforcement Learning
Python
1,424
star
2

grokking-pytorch

The Hitchiker's Guide to PyTorch
1,020
star
3

dockerfiles

Compilation of Dockerfiles with automated builds enabled on the Docker Registry
Dockerfile
503
star
4

Autoencoders

Torch implementations of various types of autoencoders
Lua
455
star
5

PlaNet

Deep Planning Network: Control from pixels by latent planning with learned dynamics
Python
337
star
6

imitation-learning

Imitation learning algorithms
Python
297
star
7

Atari

Persistent advantage learning dueling double DQN for the Arcade Learning Environment
Lua
263
star
8

ACER

Actor-critic with experience replay
Python
251
star
9

FGLab

Future Gadget Laboratory
HTML
223
star
10

spinning-up-basic

Basic versions of agents from Spinning Up in Deep RL written in PyTorch
Python
197
star
11

FCN-semantic-segmentation

Fully convolutional networks for semantic segmentation
Python
185
star
12

nninit

Weight initialisation schemes for Torch7 neural network modules
Lua
100
star
13

rlenvs

Reinforcement learning environments for Torch7
Lua
93
star
14

FGMachine

Future Gadget Machine
JavaScript
68
star
15

malmo-challenge

Malmo Collaborative AI Challenge - Team Pig Catcher
Python
65
star
16

torch-pastalog

A Torch interface for pastalog - simple, realtime visualization of neural network training performance
Lua
45
star
17

GUDRL

Generalised UDRL
Python
37
star
18

Dist-A3C

Distributed A3C
Python
35
star
19

EC

Episodic Control
Python
19
star
20

human-level-control

Presentation on Human-Level Control Through Deep Reinforcement Learning
HTML
13
star
21

Easy21

Reinforcement Learning Assignment: Easy21
Lua
11
star
22

end-to-end

Presentation on End-to-End Training of Deep Visuomotor Policies
HTML
9
star
23

docker-torch-mega

Docker image for Torch with CUDA support + extra Torch libraries
7
star
24

cuda-workshop

CUDA Workshop
Cuda
6
star
25

SARCOS

ML models trained on the SARCOS dataset
Python
6
star
26

IncSFA

Incremental Slow Feature Analysis
Lua
4
star
27

sybilsystem

MATLAB Deep Learning Library
MATLAB
1
star
28

MCAC

Minimal Criterion Artist Collective
Python
1
star
29

GlassMate

Team Inforaptor's project for IC Hack '14
Java
1
star
30

bakapunk

A tool for finding similar songs in your music library
JavaScript
1
star