KEras Reinforcement Learning gYM agents

THIS REPO IS DEPRECATED!!!

Please use something that is actually kept up to date and properly debugged, such as rllab: https://github.com/openai/rllab

KEras Reinforcement Learning gYM agents, KeRLym

This repo is intended to host a handful of reinforcement learning agents implemented using the Keras (http://keras.io/) deep learning library for Theano and TensorFlow. It is intended to make it easy to run, measure, and experiment with different learning configurations and underlying value-function approximation networks across a variety of OpenAI Gym environments (https://gym.openai.com/).
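
For context, every Gym environment exposes the same reset/step interface that these agents train against. The short loop below is an illustration only (not kerlym code), assuming the original 2016-era gym API this repo was written against:

import gym

env = gym.make("Pong-v0")                        # any Gym environment id works here
obs = env.reset()                                # initial observation
done, total_reward = False, 0.0
while not done:
    action = env.action_space.sample()           # random policy, for illustration only
    obs, reward, done, info = env.step(action)   # classic 4-tuple step API
    total_reward += reward
print("episode reward:", total_reward)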

[Screenshot]

Agents

  • pg: policy gradient method with Keras NN policy network
  • dqn: q-learning agent with Keras NN Q-fn approximation (with concurrent actor-learners); see the sketch below
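
As a rough sketch of what the dqn agent optimizes (standard one-step Q-learning targets, not necessarily the exact code in this repo): the Q-network is regressed toward r + discount * max_a' Q(s', a'), with the bootstrap term dropped on terminal transitions.

import numpy as np

def q_targets(q_next, rewards, terminals, discount=0.99):
    # q_next: (batch, n_actions) Q-values predicted for the next states
    # rewards, terminals: (batch,) arrays sampled from replay memory
    # the bootstrap term is zeroed on terminal transitions
    return rewards + discount * (1.0 - terminals) * q_next.max(axis=1)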

Installation

sudo python setup.py install

Usage

./run_pong.sh

or

Example: kerlym -e Go9x9-v0 -n simple_dnn -P

Usage: kerlym [options]

Options:
  -h, --help            show this help message and exit
  -e ENV, --env=ENV     Which GYM Environment to run [Pong-v0]
  -n NET, --net=NET     Which NN Architecture to use for Q-Function
                        approximation [simple_dnn]
  -b BS, --batch_size=BS
                        Batch size during NN training [32]
  -o DROPOUT, --dropout=DROPOUT
                        Dropout rate in Q-Fn NN [0.5]
  -p EPSILON, --epsilon=EPSILON
                        Exploration(1.0) vs Exploitation(0.0) action
                        probability [0.1]
  -D EPSILON_DECAY, --epsilon_decay=EPSILON_DECAY
                        Rate of epsilon decay: epsilon*=(1-decay) [1e-06]
  -s EPSILON_MIN, --epsilon_min=EPSILON_MIN
                        Min epsilon value after decay [0.05]
  -d DISCOUNT, --discount=DISCOUNT
                        Discount rate for future rewards [0.99]
  -t NFRAMES, --num_frames=NFRAMES
                        Number of Sequential observations/timesteps to store
                        in a single example [2]
  -m MAXMEM, --max_mem=MAXMEM
                        Max number of samples to remember [100000]
  -P, --plots           Plot learning statistics while running [False]
  -F PLOT_RATE, --plot_rate=PLOT_RATE
                        Plot update rate in episodes [10]
  -a AGENT, --agent=AGENT
                        Which learning algorithm to use [dqn]
  -i, --difference      Compute Difference Image for Training [False]
  -r LEARNING_RATE, --learning_rate=LEARNING_RATE
                        Learning Rate [0.0001]
  -E PREPROCESSOR, --preprocessor=PREPROCESSOR
                        Preprocessor [none]
  -R, --render          Render game progress [False]
  -c NTHREADS, --concurrency=NTHREADS
                        Number of Worker Threads [1]
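
The -p, -D, and -s options together describe an epsilon-greedy exploration schedule: epsilon starts at --epsilon, is repeatedly multiplied by (1 - decay), and never falls below --epsilon_min. A standalone sketch with the default values (illustrative; the exact point in training where kerlym applies the update may differ):

def decay_epsilon(epsilon, decay=1e-6, epsilon_min=0.05):
    # epsilon *= (1 - decay), floored at epsilon_min, as described in the help text
    return max(epsilon_min, epsilon * (1 - decay))

epsilon = 0.1
for step in range(1000000):
    epsilon = decay_epsilon(epsilon)
print(epsilon)   # 0.1 * (1 - 1e-6)**1e6 is about 0.037, so the floor gives 0.05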

or, from Python:

from gym import envs
env = lambda: envs.make("SpaceInvaders-v0")

import kerlym
agent = kerlym.agents.DQN(
                    env=env, 
                    nframes=1, 
                    epsilon=0.5, 
                    discount=0.99, 
                    modelfactory=kerlym.dqn.networks.simple_cnn,
                    batch_size=32, 
                    dropout=0.1, 
                    enable_plots = True, 
                    epsilon_schedule=lambda episode,epsilon: max(0.1, epsilon*(1-1e-4)),                
                    difference_obs = True,
                    preprocessor = kerlym.preproc.karpathy_preproc,
                    learning_rate = 1e-4, 
                    render=True
                    )
agent.train()
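
In the example above, preprocessor points at kerlym.preproc.karpathy_preproc and difference_obs asks the agent to train on frame differences rather than raw frames. The function below illustrates this style of Atari preprocessing (the well-known recipe from Karpathy's "Pong from Pixels" write-up); kerlym's own implementation may differ in detail:

import numpy as np

def pong_preprocess(frame):
    # crop the playing field, downsample by 2, and keep a single colour channel
    f = frame[35:195:2, ::2, 0].astype(np.float32)
    f[f == 144] = 0      # erase one background colour
    f[f == 109] = 0      # erase the other background colour
    f[f != 0] = 1        # paddles and ball become 1
    return f.ravel()     # flatten to a feature vector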

Custom Action-Value Function Network

from keras.layers import Input, Reshape, TimeDistributed, Dense, Dropout, LSTM
from keras.models import Model
from keras.optimizers import RMSprop

def custom_Q_nn(agent, env, dropout=0, h0_width=8, h1_width=8, **args):
    # Maps a flat observation (nframes stacked timesteps) to one Q-value per action
    S = Input(shape=[agent.input_dim])
    h = Reshape([agent.nframes, agent.input_dim // agent.nframes])(S)
    h = TimeDistributed(Dense(h0_width, activation='relu', init='he_normal'))(h)
    h = Dropout(dropout)(h)
    h = LSTM(h1_width, return_sequences=True)(h)
    h = Dropout(dropout)(h)
    h = LSTM(h1_width)(h)
    h = Dropout(dropout)(h)
    V = Dense(env.action_space.n, activation='linear', init='zero')(h)
    model = Model(S, V)
    model.compile(loss='mse', optimizer=RMSprop(lr=0.01))
    return model

agent = kerlym.agents.D2QN(env, modelfactory=custom_Q_nn)
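
Any callable with this signature can serve as a model factory: it receives the agent (for input sizing and frame stacking) and the environment (for the number of actions) and must return a compiled Keras model mapping an observation to one Q-value per action. A minimal fully-connected sketch, assuming the same Keras 1.x conventions as the example above:

from keras.layers import Input, Dense
from keras.models import Model
from keras.optimizers import RMSprop

def tiny_dnn_Q(agent, env, hidden=32, **args):
    # Smallest reasonable Q-network: one hidden layer, linear Q outputs
    S = Input(shape=[agent.input_dim])
    h = Dense(hidden, activation='relu')(S)
    V = Dense(env.action_space.n, activation='linear')(h)
    model = Model(S, V)
    model.compile(loss='mse', optimizer=RMSprop(lr=1e-4))
    return model

Pass it in exactly as above, e.g. modelfactory=tiny_dnn_Q.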

Citation

If you use this work in your research, a citation of our publication introducing this platform would be greatly appreciated! The arXiv paper is available at https://arxiv.org/abs/1605.09221, and a simple BibTeX entry is provided below.

@misc{1605.09221,
Author = {Timothy J. O'Shea and T. Charles Clancy},
Title = {Deep Reinforcement Learning Radio Control and Signal Detection with KeRLym, a Gym RL Agent},
Year = {2016},
Eprint = {arXiv:1605.09221},
}

Acknowledgements

Many thanks to the projects below for their inspiration and contributions.

-Tim

Note

Install gym with pip install gym and pip install gym[atari]. If gym[atari] fails to install, install cmake first (apt-get install cmake).
