  • Stars: 590
  • Rank: 75,794 (Top 2 %)
  • Language: Python
  • License: Apache License 2.0
  • Created over 8 years ago
  • Updated over 6 years ago

Repository Details

Asynchronous Methods for Deep Reinforcement Learning

async_deep_reinforce

Asynchronous deep reinforcement learning

About

An attempt to reproduce Google DeepMind's paper "Asynchronous Methods for Deep Reinforcement Learning."

http://arxiv.org/abs/1602.01783

The Asynchronous Advantage Actor-Critic (A3C) method is implemented with TensorFlow to play "Atari Pong". Both the feed-forward (A3C-FF) and LSTM (A3C-LSTM) variants are implemented.
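
As a rough illustration of the "advantage" part of A3C, the sketch below computes n-step discounted returns and advantages for one short rollout, following the scheme described in the paper. It is not taken from this repository: the function names, the discount factor, and the sample numbers are all assumptions made for the example.

```python
import numpy as np


def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Discounted returns R_t = r_t + gamma * R_{t+1} over one rollout.

    bootstrap_value is the critic's estimate V(s) for the state after the
    last step, or 0.0 if the rollout ended in a terminal state.
    """
    returns = np.zeros(len(rewards), dtype=np.float64)
    running = bootstrap_value
    # Walk the rollout backwards so each step reuses the return after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def advantages(rewards, values, bootstrap_value, gamma=0.99):
    """Advantage A_t = R_t - V(s_t), used to weight the policy gradient."""
    returns = n_step_returns(rewards, bootstrap_value, gamma)
    return returns - np.asarray(values, dtype=np.float64)


# Hypothetical 3-step rollout that ends with a point scored (terminal state).
rewards = [0.0, 0.0, 1.0]
values = [0.5, 0.6, 0.7]  # made-up critic outputs V(s_t)
adv = advantages(rewards, values, bootstrap_value=0.0, gamma=0.9)
print(adv)
```

In this scheme the rollout length plays the role that the LOCAL_T_MAX setting appears to play here: a longer rollout means more real rewards enter each return before bootstrapping from the critic.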

The learned game play after 26 hours of training (A3C-FF) looks like this.

[Figure: learning result after 26 hours]

Any advice or suggestions are very welcome in the issues thread.

#1

How to build

First, build the multi-thread-ready version of the Arcade Learning Environment. I modified it so that it can run in a multi-threaded environment.

$ git clone https://github.com/miyosuda/Arcade-Learning-Environment.git
$ cd Arcade-Learning-Environment
$ cmake -DUSE_SDL=ON -DUSE_RLGLUE=OFF -DBUILD_EXAMPLES=OFF .
$ make -j 4

$ pip install .

I recommend installing it in a virtualenv environment.

How to run

To train:

$ python a3c.py

To display the result with game play:

$ python a3c_disp.py

Using GPU

To enable the GPU, change the "USE_GPU" flag in "constants.py".
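
A flag like this is usually turned into a TensorFlow device string. The sketch below shows one way that could look. It is an assumption for illustration, not the code in this repository: `select_device` is a hypothetical helper, and the local `USE_GPU` variable only stands in for the flag in "constants.py".

```python
USE_GPU = False  # stand-in for the flag defined in constants.py


def select_device(use_gpu):
    """Return a TensorFlow device string for the given flag (hypothetical helper)."""
    return "/gpu:0" if use_gpu else "/cpu:0"


device = select_device(USE_GPU)
print(device)
# With TensorFlow r1.0, the network ops could then be created inside
# a `with tf.device(device):` block so they are placed on that device.
```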

When running with 8 parallel game environments, the speeds of the GPU (GTX980Ti) and the CPU (Core i7 6700) were as follows (recorded with the LOCAL_T_MAX=20 setting).

type  A3C-FF              A3C-LSTM
GPU   1722 steps per sec  864 steps per sec
CPU   1077 steps per sec  540 steps per sec
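
To make the comparison concrete, the short sketch below divides the GPU throughput by the CPU throughput for each variant, using only the numbers in the table above. The dictionary layout and the `gpu_speedup` helper are just illustrative choices.

```python
# Throughput in steps per second, copied from the table above.
steps_per_sec = {
    "A3C-FF": {"GPU": 1722, "CPU": 1077},
    "A3C-LSTM": {"GPU": 864, "CPU": 540},
}


def gpu_speedup(model):
    """Ratio of GPU to CPU throughput for one model variant."""
    row = steps_per_sec[model]
    return row["GPU"] / row["CPU"]


for model in steps_per_sec:
    print(f"{model}: {gpu_speedup(model):.2f}x")
```

For both variants the GPU comes out at roughly 1.6 times the CPU throughput in this setup.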

Result

Score plots of the local threads for Pong were as follows (with a GTX980Ti).

A3C-LSTM LOCAL_T_MAX = 5

[Figure: score plot, A3C-LSTM T=5]

A3C-LSTM LOCAL_T_MAX = 20

[Figure: score plot, A3C-LSTM T=20]

Unlike in the original paper, the scores are not averaged using the global network.

Requirements

  • TensorFlow r1.0
  • numpy
  • cv2
  • matplotlib

References

This project uses the settings described in muupan's wiki: muupan/async-rl (https://github.com/muupan/async-rl/wiki)

Acknowledgements
