  • Stars: 2,467
  • Rank 18,638 (Top 0.4 %)
  • Language: Python
  • License: MIT License
  • Created over 8 years ago
  • Updated over 5 years ago

Human-Level Control through Deep Reinforcement Learning

TensorFlow implementation of Human-Level Control through Deep Reinforcement Learning.

[Image: model]

This implementation contains:

  1. Deep Q-network and Q-learning
  2. Experience replay memory
    • to reduce the correlations between consecutive updates
  3. A separate network for Q-learning targets, held fixed for intervals between updates
    • to reduce the correlations between target and predicted Q-values
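
The three components listed above can be sketched in plain Python. This is a minimal illustration, not the repository's code: the names `ReplayMemory`, `sync_target`, `q_learning_targets` and `target_q` are hypothetical, and weights are stood in for by a plain dict rather than TensorFlow variables.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer of (state, action, reward, next_state, terminal) tuples."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest transition once full.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, terminal):
        self.buffer.append((state, action, reward, next_state, terminal))

    def sample(self, batch_size):
        # Uniform sampling breaks up the correlation between consecutive steps.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def sync_target(online_weights, target_weights, step, interval):
    """Copy online weights into the target network every `interval` steps.

    Between copies the target network stays fixed. Both arguments are plain
    dicts here (an assumption made for illustration).
    """
    if step % interval == 0:
        target_weights.update(online_weights)


def q_learning_targets(batch, target_q, gamma=0.99):
    """Compute y = r + gamma * max_a' Q_target(s', a'), or y = r if terminal.

    `target_q` is a hypothetical callable mapping a state to a list of
    Q-values, standing in for the fixed target network.
    """
    targets = []
    for _, _, reward, next_state, terminal in batch:
        bootstrap = 0.0 if terminal else gamma * max(target_q(next_state))
        targets.append(reward + bootstrap)
    return targets
```

In a training loop, transitions would be added to the memory at every step, a mini-batch sampled for each update, and `sync_target` called with the current step counter.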

Requirements

Usage

First, install prerequisites with:

$ pip install tqdm gym[all]

To train a model for Breakout:

$ python main.py --env_name=Breakout-v0 --is_train=True
$ python main.py --env_name=Breakout-v0 --is_train=True --display=True

To test and record the screen with gym:

$ python main.py --is_train=False
$ python main.py --is_train=False --display=True

Results

Results of training for 24 hours on a GTX 980 Ti.

[Image: best]

Simple Results

Details of Breakout with model m2 (red) for 30 hours on a GTX 980 Ti.

[Image: tensorboard]

Details of Breakout with model m3 (red) for 30 hours on a GTX 980 Ti.

[Image: tensorboard]

Detailed Results

[1] Action-repeat (frame-skip) of 1, 2, and 4 without learning rate decay

[Image: A1_A2_A4_0.00025lr]
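
For readers unfamiliar with the term, action-repeat (frame-skip) means that each chosen action is applied for several consecutive frames and the rewards are summed. A minimal sketch follows; `env_step` is a hypothetical callable returning `(observation, reward, done)`, not the repository's environment wrapper.

```python
def step_with_action_repeat(env_step, action, repeat):
    """Apply `action` for up to `repeat` consecutive frames and sum the rewards.

    `env_step` is a hypothetical callable: action -> (observation, reward, done).
    """
    total_reward = 0.0
    observation, done = None, False
    for _ in range(repeat):
        observation, reward, done = env_step(action)
        total_reward += reward
        if done:
            # Stop early when the episode ends in the middle of a repeat.
            break
    return observation, total_reward, done
```

With `repeat=1` the agent chooses an action at every frame; with `repeat=4` it chooses once every four frames.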

[2] Action-repeat (frame-skip) of 1, 2, and 4 with learning rate decay

[Image: A1_A2_A4_0.0025lr]
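
The text does not say which decay schedule was used. As one possible reading, the sketch below assumes an exponential decay from 0.0025 down to a floor of 0.00025 (values guessed from the figure names); `decay_rate` and `decay_steps` are purely illustrative.

```python
def decayed_learning_rate(step, initial_lr=0.0025, min_lr=0.00025,
                          decay_rate=0.96, decay_steps=50000):
    """Exponentially decay the learning rate, never going below `min_lr`.

    All default values are assumptions made for illustration only.
    """
    lr = initial_lr * decay_rate ** (step / decay_steps)
    return max(min_lr, lr)
```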

[1] & [2]

[Image: A1_A2_A4_0.00025lr_0.0025lr]

[3] Action-repeat of 4 for DQN (dark blue), Dueling DQN (dark green), DDQN (brown), and Dueling DDQN (turquoise)

The current hyperparameters and gradient clipping are not implemented as they are in the paper.

[Image: A4_duel_double]
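
The variants compared in [3] differ in how the learning target and the Q-values are formed. The sketch below follows the standard definitions from the Double DQN and Dueling DQN literature, using plain Python lists; the function names are hypothetical and the repository's implementation may differ in detail.

```python
def dqn_target(reward, terminal, next_q_target, gamma=0.99):
    """DQN: the target network both selects and evaluates the next action."""
    if terminal:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, terminal, next_q_online, next_q_target, gamma=0.99):
    """Double DQN: the online network selects, the target network evaluates."""
    if terminal:
        return reward
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]


def dueling_q_values(value, advantages):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_advantage = sum(advantages) / len(advantages)
    return [value + a - mean_advantage for a in advantages]
```

Dueling DDQN combines the two: Q-values come from a dueling head and the target is computed as in `double_dqn_target`.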

[4] Distributed action-repeat (frame-skip) of 1 without learning rate decay

[Image: A1_0.00025lr_distributed]

[5] Distributed action-repeat (frame-skip) of 4 without learning rate decay

[Image: A4_0.00025lr_distributed]

License

MIT License.
