  • Stars: 113
  • Rank 310,115 (Top 7 %)
  • Language: Python
  • License: MIT License
  • Created about 7 years ago
  • Updated over 3 years ago

Repository Details

Connect4 reinforcement learning by AlphaGo Zero methods.

This project is based on two main resources:

  1. DeepMind's October 19th publication: Mastering the Game of Go without Human Knowledge.
  2. @mokemokechicken's great Reversi implementation of the DeepMind ideas: https://github.com/mokemokechicken/reversi-alpha-zero

Environment

  • Python 3.6.3
  • tensorflow-gpu: 1.3.0
  • Keras: 2.0.8

Modules

Reinforcement Learning

This AlphaGo Zero implementation consists of three workers: self, opt and eval.

  • self is Self-Play, which generates training data by having BestModel play against itself.
  • opt is Trainer, which trains the model and generates next-generation models.
  • eval is Evaluator, which checks whether the next-generation model is better than BestModel. If it is, it replaces BestModel.

Evaluation

For evaluation, you can play Connect4 against the BestModel.

  • play_gui lets you play a game against BestModel on a board drawn with ASCII characters.

Data

  • data/model/model_best_*: BestModel.
  • data/model/next_generation/*: next-generation models.
  • data/play_data/play_*.json: generated training data.
  • logs/main.log: log file.

If you want to train the model from the beginning, delete the above directories.
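
Before deleting anything, it can help to see which of these artifacts actually exist. The following is a minimal sketch, not part of the project: the helper name `find_artifacts` is hypothetical, and the patterns are simply copied from the list above and resolved relative to a project root that you pass in.

```python
from pathlib import Path

# Glob patterns copied from the Data list above, relative to the project root.
ARTIFACT_PATTERNS = [
    "data/model/model_best_*",
    "data/model/next_generation/*",
    "data/play_data/play_*.json",
    "logs/main.log",
]


def find_artifacts(project_root="."):
    """Return existing paths that match the patterns above (hypothetical helper)."""
    root = Path(project_root)
    found = []
    for pattern in ARTIFACT_PATTERNS:
        # Path.glob yields only paths that exist on disk.
        found.extend(root.glob(pattern))
    return sorted(found)


if __name__ == "__main__":
    # Dry run: print what would need to be removed to start from scratch.
    for path in find_artifacts():
        print(path)
```

The script only lists paths; removing them is left to you so that nothing is deleted by accident.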

How to use

Setup

install libraries

pip install -r requirements.txt

If you want to use a GPU,

pip install tensorflow-gpu

set environment variables

Create a .env file containing the following line.

KERAS_BACKEND=tensorflow
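
How the project itself reads the .env file is not described here, so the following is only a standard-library sketch of the general idea: parse KEY=VALUE lines and copy them into the process environment before Keras is imported. The function name `load_env_file` is hypothetical.

```python
import os


def load_env_file(path=".env"):
    """Copy KEY=VALUE pairs from a file into os.environ (hypothetical helper).

    Values already present in the environment are kept, not overridden.
    """
    loaded = {}
    if not os.path.exists(path):
        return loaded
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            # Skip blank lines, comments and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            key, value = key.strip(), value.strip()
            os.environ.setdefault(key, value)
            loaded[key] = os.environ[key]
    return loaded


if __name__ == "__main__":
    print(load_env_file())
```

With the .env file above in place, `os.environ["KERAS_BACKEND"]` would be `tensorflow` after the call, unless the variable was already set to something else.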

Basic Usages

To train the model, run Self-Play, Trainer and Evaluator.

Self-Play

python src/connect4_zero/run.py self

When executed, Self-Play starts using BestModel. If BestModel does not exist, a new random model is created and becomes BestModel.

options

  • --new: create a new BestModel
  • --type mini: use the mini config for testing (see src/connect4_zero/configs/mini.py)

Trainer

python src/connect4_zero/run.py opt

When executed, training starts. The base model is loaded from the latest saved next-generation model; if none exists, BestModel is used. The trained model is saved every 2000 steps (mini-batches) after an epoch.

options

  • --type mini: use the mini config for testing (see src/connect4_zero/configs/mini.py)
  • --total-step: specify the total number of steps (mini-batches). The total step count affects the learning rate during training.

Evaluator

python src/connect4_zero/run.py eval

When executed, evaluation starts. It pits BestModel against the latest next-generation model over about 200 games. If the next-generation model wins, it becomes the new BestModel.

options

  • --type mini: use the mini config for testing (see src/connect4_zero/configs/mini.py)
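
The exact rule the Evaluator uses to decide that the next-generation model "wins" is not stated here. As an illustration only, the sketch below gates promotion on a win-rate threshold, in the spirit of the 55% gate described in the AlphaGo Zero paper. The function name `should_promote`, the default threshold and the choice to ignore draws are all assumptions, not this project's confirmed behaviour.

```python
def should_promote(wins, losses, draws=0, threshold=0.55):
    """Decide whether a challenger replaces the best model (hypothetical rule).

    wins/losses/draws are counted from the challenger's point of view.
    Draws are ignored here, which is an assumption of this sketch.
    """
    decided = wins + losses
    if decided == 0:
        # No decisive games: keep the current best model.
        return False
    return wins / decided >= threshold


if __name__ == "__main__":
    # Example: 200 evaluation games, 120 won and 80 lost by the challenger.
    print(should_promote(120, 80))
```

A threshold above 50% makes it less likely that a model is promoted on noise alone, at the cost of sometimes rejecting a slightly better model.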

Play Game

python src/connect4_zero/run.py play_gui

When executed, a standard Connect4 board is displayed in ASCII characters and you can play against BestModel.
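
To give an idea of what a Connect4 board drawn in ASCII can look like, here is a small self-contained sketch. It is not the project's renderer: the function names, the `X`/`O` symbols and the layout are assumptions chosen for illustration.

```python
ROWS, COLS = 6, 7  # standard Connect4 board size


def new_board():
    """Create an empty board; row 0 is the top row."""
    return [[" "] * COLS for _ in range(ROWS)]


def drop_piece(board, col, piece):
    """Put a piece in the lowest empty cell of a column.

    Returns the row index used, or None if the column is full.
    """
    for row in range(ROWS - 1, -1, -1):
        if board[row][col] == " ":
            board[row][col] = piece
            return row
    return None


def render(board):
    """Return the board as a multi-line ASCII string with column numbers."""
    lines = ["|" + "|".join(row) + "|" for row in board]
    lines.append("+" + "-" * (2 * COLS - 1) + "+")
    lines.append(" " + " ".join(str(c) for c in range(COLS)))
    return "\n".join(lines)


if __name__ == "__main__":
    board = new_board()
    drop_piece(board, 3, "X")  # first player opens in the center column
    drop_piece(board, 3, "O")
    print(render(board))
```

Column numbers are printed under the board so that a player can pick a move by typing a single digit.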

Tips and Memo

GPU Memory

A lack of memory usually causes warnings rather than errors. If an error does occur, try changing per_process_gpu_memory_fraction in src/worker/{evaluate.py,optimize.py,self_play.py}:

tf_util.set_session_config(per_process_gpu_memory_fraction=0.2)

A smaller batch_size reduces the memory usage of opt. Try changing TrainerConfig#batch_size in NormalConfig.

Model Performance

The following table records the best models.

| Best model generation | Winning percentage vs. best model | Time spent (hours) | Note |
|---|---|---|---|
| 1 | - | - | |
| 2 | 100% | 1 | |
| 3 | 84.6% | 1 | |
| 4 | 78.6% | 2 | This model is good enough to avoid naive losing moves |
| 5 | 100% | 1 | The NN learns to always play in the center when it moves first |
| 6 | 100% | 4 | The model is now able to beat every online Connect4 game with a classic AI that I've found |
