  • Stars
    2,430
  • Rank 18,170 (Top 0.4 %)
  • Language
    Python
  • License
    MIT License
  • Created almost 8 years ago
  • Updated about 5 years ago


Human-Level Control through Deep Reinforcement Learning

TensorFlow implementation of Human-Level Control through Deep Reinforcement Learning.

model

This implementation contains:

  1. Deep Q-network and Q-learning
  2. Experience replay memory
    • to reduce the correlations between consecutive updates
  3. Target network for Q-learning that is held fixed between periodic updates
    • to reduce the correlations between target and predicted Q-values
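The three components listed above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the code used in this repository: the names `ReplayMemory`, `q_learning_target`, and `sync_target`, the default `gamma`, and the dict-of-weights representation are all hypothetical.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest transition once it is full.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Uniform sampling breaks up the correlation between consecutive steps.
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def q_learning_target(reward, next_q_values, done, gamma=0.99):
    """y = r if the episode ended, else r + gamma * max_a' Q_target(s', a').

    `next_q_values` are assumed to come from the fixed target network.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def sync_target(online_weights, step, target_weights, interval):
    """Return a copy of the online weights every `interval` steps.

    Between syncs the target weights are returned unchanged, which keeps
    the Q-learning targets fixed for that interval.
    """
    if step % interval == 0:
        return dict(online_weights)
    return target_weights
```

In a training loop, each environment transition would be stored with `add`, a minibatch drawn with `sample`, and `sync_target` called once per step with the chosen update interval.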

Requirements

Usage

First, install prerequisites with:

$ pip install tqdm gym[all]

To train a model for Breakout:

$ python main.py --env_name=Breakout-v0 --is_train=True
$ python main.py --env_name=Breakout-v0 --is_train=True --display=True

To test and record the screen with gym:

$ python main.py --is_train=False
$ python main.py --is_train=False --display=True

Results

Result of training for 24 hours on a GTX 980 Ti.

best

Simple Results

Details of Breakout with model m2 (red), trained for 30 hours on a GTX 980 Ti.

tensorboard

Details of Breakout with model m3 (red), trained for 30 hours on a GTX 980 Ti.

tensorboard

Detailed Results

[1] Action-repeat (frame-skip) of 1, 2, and 4 without learning rate decay

A1_A2_A4_0.00025lr

[2] Action-repeat (frame-skip) of 1, 2, and 4 with learning rate decay

A1_A2_A4_0.0025lr

[1] & [2]

A1_A2_A4_0.00025lr_0.0025lr
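For readers unfamiliar with the action-repeat (frame-skip) setting compared in [1] and [2], the sketch below shows the general idea: the agent picks an action once, and that action is applied for several consecutive frames while the rewards are summed. It assumes an environment whose `step(action)` returns the classic gym-style tuple `(observation, reward, done, info)`. `CountingEnv` and `step_with_action_repeat` are hypothetical names for illustration and do not come from this repository.

```python
class CountingEnv:
    """Toy stand-in for a gym environment (hypothetical, illustration only)."""

    def __init__(self, episode_length=10):
        self.episode_length = episode_length
        self.t = 0

    def step(self, action):
        # Classic gym-style return value: (observation, reward, done, info).
        self.t += 1
        done = self.t >= self.episode_length
        return self.t, 1.0, done, {}


def step_with_action_repeat(env, action, repeat):
    """Apply `action` for up to `repeat` consecutive frames, summing rewards."""
    total_reward = 0.0
    obs, done, info = None, False, {}
    for _ in range(repeat):
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:  # stop early if the episode ends in the middle of a repeat
            break
    return obs, total_reward, done, info
```

With `repeat=1` the agent decides on every frame; with `repeat=4` it decides once every four frames, which reduces the number of network evaluations per frame of game play.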

[3] Action-repeat of 4 for DQN (dark blue), Dueling DQN (dark green), DDQN (brown), and Dueling DDQN (turquoise)

The current hyperparameters and gradient clipping are not implemented as they are in the paper.

A4_duel_double
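The variants compared in [3] differ in how the learning target and the Q-values are formed. The sketch below follows the commonly published formulations of Double DQN and the dueling architecture. It is an assumption-laden illustration rather than this repository's code, and the function names, list-based inputs, and default `gamma` are hypothetical.

```python
def dqn_target(reward, next_q_target, done, gamma=0.99):
    """Standard DQN: the target network both selects and evaluates the action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, done, gamma=0.99):
    """Double DQN: the online network selects the action,
    and the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def dueling_q_values(value, advantages):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean over actions of A(s, a)."""
    mean_advantage = sum(advantages) / len(advantages)
    return [value + a - mean_advantage for a in advantages]
```

When the online and target networks disagree about the best next action, the two targets differ: with online values `[1.0, 3.0]` and target values `[5.0, 2.0]`, standard DQN bootstraps from 5.0 while Double DQN bootstraps from 2.0.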

[4] Distributed action-repeat (frame-skip) of 1 without learning rate decay

A1_0.00025lr_distributed

[5] Distributed action-repeat (frame-skip) of 4 without learning rate decay

A4_0.00025lr_distributed

References

License

MIT License.
