  • Stars: 110
  • Rank 316,770 (Top 7 %)
  • Language: Python
  • License: Apache License 2.0
  • Created over 8 years ago
  • Updated over 5 years ago

TD-Gammon

Implementation of TD-Gammon in TensorFlow.

Before DeepMind tackled Atari games or built AlphaGo, there was TD-Gammon, the first algorithm to reach an expert level of play in backgammon. In 1992, Gerald Tesauro published a paper describing TD-Gammon, a neural network trained with reinforcement learning. It is referenced in both the Atari and AlphaGo research papers and helped lay the groundwork for many of the advances made in the last few years.

The code features eligibility traces on the gradients, which are an elegant way to assign credit to actions made in the past.
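
To illustrate the idea, here is a minimal sketch of TD(lambda) with eligibility traces for a linear value function in plain Python. It is not the code from this repository, which applies the traces to the gradients of a TensorFlow network. The function name `td_lambda_episode` and its parameters are assumptions made for the example, and no discounting is applied. For a linear model the gradient of the value with respect to the weights is simply the feature vector, so the trace is a decaying sum of past feature vectors.

```python
def td_lambda_episode(weights, states, final_reward, alpha=0.1, lam=0.7):
    """Run one episode of TD(lambda) updates for a linear value function.

    states: list of feature vectors (lists of floats), one per time step.
    final_reward: outcome observed after the last state (e.g. 1.0 for a win).
    Returns the updated weights as a new list.
    """
    w = list(weights)
    trace = [0.0] * len(w)  # one eligibility trace entry per weight

    def value(x):
        return sum(wi * xi for wi, xi in zip(w, x))

    for t, x in enumerate(states):
        v = value(x)
        # The target is the next state's value, or the final outcome at the end.
        if t + 1 < len(states):
            target = value(states[t + 1])
        else:
            target = final_reward
        delta = target - v  # TD error
        # Decay the old trace and add the current gradient (x for a linear model).
        trace = [lam * e + xi for e, xi in zip(trace, x)]
        # Every weight moves in proportion to its accumulated trace.
        w = [wi + alpha * delta * e for wi, e in zip(w, trace)]
    return w
```

With `lam=0` the trace equals the current gradient, so only the most recent state receives credit for a surprising outcome. With `lam > 0`, earlier states share the credit, weighted by how long ago they occurred.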

Training

  1. Install TensorFlow.
  2. Clone the repo: git clone https://github.com/fomorians/td-gammon.git && cd td-gammon
  3. Run training: python main.py

Play

To play against a trained model: python main.py --play --restore

Things to try

  • Compare with and without eligibility traces by replacing the trace with the unmodified gradient.
  • Try different activation functions on the hidden layer.
  • Expand the board representation. Currently it uses the "compact" representation from the paper. A full board representation should remove some ambiguity between board states.
  • Increase the number of turns the agent looks ahead before making a move. The paper used 2-ply and 3-ply search, while this implementation uses only 1-ply.
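
As a starting point for the deeper search suggested in the last item, the following is a rough sketch of a 2-ply move chooser. It assumes that 2-ply means evaluating each of our candidate moves by averaging over every opponent dice roll and assuming the opponent picks the reply that is worst for us. The callbacks `successors` and `evaluate` are hypothetical placeholders, not functions from this repository; in practice they would wrap the game's move generator and the trained network.

```python
from itertools import combinations_with_replacement

# The 21 distinct rolls of two dice with their probabilities:
# doubles occur with probability 1/36, all other rolls with 2/36.
ROLLS = [((a, b), (1 if a == b else 2) / 36.0)
         for a, b in combinations_with_replacement(range(1, 7), 2)]


def choose_move_2ply(state, roll, successors, evaluate):
    """Return (best_state, expected_value) for the player to move.

    successors(state, roll, our_turn) -> list of resulting states (hypothetical)
    evaluate(state) -> estimated probability that we win (hypothetical)
    Returns (None, -inf) if we have no legal move.
    """
    best_state, best_value = None, float("-inf")
    for after_us in successors(state, roll, True):
        expected = 0.0
        for opp_roll, prob in ROLLS:
            replies = successors(after_us, opp_roll, False)
            # The opponent picks the reply that is worst for us.
            # If the opponent has no legal reply, the position stands.
            worst = min((evaluate(s) for s in replies),
                        default=evaluate(after_us))
            expected += prob * worst
        if expected > best_value:
            best_state, best_value = after_us, expected
    return best_state, best_value
```

Each extra ply multiplies the number of evaluations by the 21 possible rolls times the number of legal replies, so a deeper search is considerably slower than the 1-ply baseline.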
