• This repository has been archived on 11/Mar/2021
  • Language
    C++
  • License
    Apache License 2.0
  • Created almost 7 years ago
  • Updated almost 4 years ago

Repository Details

An open-source implementation of the AlphaGoZero algorithm

Minigo: A minimalist Go engine modeled after AlphaGo Zero, built on MuGo

This is an implementation of a neural-network based Go AI, using TensorFlow. While inspired by DeepMind's AlphaGo algorithm, this project is not a DeepMind project nor is it affiliated with the official AlphaGo project.

This is NOT an official version of AlphaGo

Repeat, this is not the official AlphaGo program by DeepMind. This is an independent effort by Go enthusiasts to replicate the results of the AlphaGo Zero paper ("Mastering the Game of Go without Human Knowledge," Nature), with some resources generously made available by Google.

Minigo is based on Brian Lee's "MuGo" -- a pure Python implementation of the first AlphaGo paper "Mastering the Game of Go with Deep Neural Networks and Tree Search" published in Nature. This implementation adds features and architecture changes present in the more recent AlphaGo Zero paper, "Mastering the Game of Go without Human Knowledge". More recently, this architecture was extended for Chess and Shogi in "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm". These papers will often be abridged in Minigo documentation as AG (for AlphaGo), AGZ (for AlphaGo Zero), and AZ (for AlphaZero) respectively.

Goals of the Project

  1. Provide a clear set of learning examples using TensorFlow, Kubernetes, and Google Cloud Platform for establishing Reinforcement Learning pipelines on various hardware accelerators.

  2. Reproduce the methods of the original DeepMind AlphaGo papers as faithfully as possible, through an open-source implementation and open-source pipeline tools.

  3. Provide our data, results, and discoveries in the open to benefit the Go, machine learning, and Kubernetes communities.

An explicit non-goal of the project is to produce a competitive Go program that establishes itself as the top Go AI. Instead, we strive for a readable, understandable implementation that can benefit the community, even if that means our implementation is not as fast or efficient as possible.

While this project might produce such a strong model, we hope to focus on the process. Remember, getting there is half the fun. :)

We hope this project is an accessible way for interested developers to get a strong Go model together with an easy-to-understand platform of Python code available for extension, adaptation, etc.

If you'd like to read about our experiences training models, see RESULTS.md.

To see our guidelines for contributing, see CONTRIBUTING.md.

Getting Started

This project assumes you have Python 3 with virtualenv and virtualenvwrapper available; Bazel and TensorFlow are installed in the steps below.

The Hitchhiker's Guide to Python has a good intro to Python development and virtualenv usage. The instructions after this point haven't been tested in environments that are not using virtualenv.

pip3 install virtualenv
pip3 install virtualenvwrapper

Install Bazel

BAZEL_VERSION=0.24.1
wget https://github.com/bazelbuild/bazel/releases/download/${BAZEL_VERSION}/bazel-${BAZEL_VERSION}-installer-linux-x86_64.sh
chmod 755 bazel-${BAZEL_VERSION}-installer-linux-x86_64.sh
sudo ./bazel-${BAZEL_VERSION}-installer-linux-x86_64.sh

Install TensorFlow

First set up and enter your virtualenv, then install the shared requirements:

pip3 install -r requirements.txt

Then, you'll need to choose to install the GPU or CPU tensorflow requirements:

  • GPU: pip3 install "tensorflow-gpu==1.15.0".
    • Note: You must install CUDA 10.0 for TensorFlow 1.13.0+.
  • CPU: pip3 install "tensorflow==1.15.0".

Setting up the Environment

You may want to use a cloud project for resources. If so, set:

PROJECT=foo-project

Then, running

source cluster/common.sh

will set up defaults for the other environment variables.

Running unit tests

./test.sh

To run individual test modules:

BOARD_SIZE=9 python3 tests/run_tests.py test_go
BOARD_SIZE=19 python3 tests/run_tests.py test_mcts

Automated Tests

Test Dashboard

To automatically test PRs, Minigo uses Prow, a test framework created by the Kubernetes team for testing changes in a hermetic environment. We use Prow for running unit tests, linting our code, and launching our test Minigo Kubernetes clusters.

You can see the status of our automated tests by looking at the Prow and Testgrid UIs.

Basics

All commands are compatible with either Google Cloud Storage as a remote file system, or your local file system. The examples here use GCS, but local file paths will work just as well.

To use GCS, set the BUCKET_NAME variable and authenticate with gcloud auth application-default login. Otherwise, all commands fetching files from GCS will hang.

For instance, this would set a bucket, authenticate, and then look for the most recent model.

# When you first start we recommend using our minigo-pub bucket.
# Later you can set up your own bucket and store data there.
export BUCKET_NAME=minigo-pub/v9-19x19
gcloud auth application-default login
gsutil ls gs://$BUCKET_NAME/models | tail -4

Which might look like:

gs://$BUCKET_NAME/models/000737-fury.data-00000-of-00001
gs://$BUCKET_NAME/models/000737-fury.index
gs://$BUCKET_NAME/models/000737-fury.meta
gs://$BUCKET_NAME/models/000737-fury.pb

These four files comprise the model. Commands that take a model as an argument usually need the path to the model basename, e.g. gs://$BUCKET_NAME/models/000737-fury
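As a rough illustration of the basename convention, the sketch below strips the four file suffixes shown in the listing above from a path. The helper name model_basename is hypothetical and is not part of Minigo.

```python
# Hypothetical helper (not part of Minigo): map any of a model's four files
# back to the basename that flags such as --load_file expect.
MODEL_SUFFIXES = (".data-00000-of-00001", ".index", ".meta", ".pb")


def model_basename(path):
    """Return the path with a known model-file suffix removed, if present."""
    for suffix in MODEL_SUFFIXES:
        if path.endswith(suffix):
            return path[: -len(suffix)]
    return path  # no known suffix, so treat it as a basename already


print(model_basename("gs://minigo-pub/v9-19x19/models/000737-fury.index"))
```

The same function works for local paths, since it only inspects the end of the string.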

You'll need to copy the model files to your local disk. This fragment copies the files associated with $MODEL_NAME to the directory specified by MINIGO_MODELS:

MODEL_NAME=000737-fury
MINIGO_MODELS=$HOME/minigo-models
mkdir -p $MINIGO_MODELS/models
gsutil ls gs://$BUCKET_NAME/models/$MODEL_NAME.* | \
       gsutil cp -I $MINIGO_MODELS/models

Selfplay

To watch Minigo play a game, you need to specify a model. Here's an example that plays using the model you copied to your local disk:

python3 selfplay.py \
  --verbose=2 \
  --num_readouts=400 \
  --load_file=$MINIGO_MODELS/models/$MODEL_NAME

where --num_readouts is how many searches to make per move. Timing information and statistics will be printed at each move. Setting --verbose to 3 or higher will print a board at each move.

Playing Against Minigo

Minigo speaks GTP (the Go Text Protocol), and you can use any GTP-compliant program with it.

# Latest model should look like: /path/to/models/000123-something
LATEST_MODEL=$(ls -d $MINIGO_MODELS/models/* | tail -1 | cut -f 1 -d '.')
# Number of searches per move; 400 matches the selfplay example above.
READOUTS=400
python3 gtp.py --load_file=$LATEST_MODEL --num_readouts=$READOUTS --verbose=3

After some loading messages, it will display GTP engine ready, at which point it can receive commands. GTP cheatsheet:

genmove [color]             # Asks the engine to generate a move for a side
play [color] [coordinate]   # Tells the engine that a move should be played for `color` at `coordinate`
showboard                   # Asks the engine to print the board.
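To make the command and response format concrete, here is a minimal sketch of GTP message handling. It is independent of Minigo's gtp.py, and the function names format_command and parse_response are illustrative. It relies only on the general GTP convention that a successful response starts with "=" and a failed one with "?", optionally followed by a numeric id.

```python
# A minimal sketch of GTP message handling, independent of Minigo's gtp.py.
# format_command and parse_response are illustrative names.


def format_command(name, *args):
    """Build one GTP command line, e.g. 'play black D4'."""
    return " ".join([name, *args]) + "\n"


def parse_response(text):
    """Split a GTP response into (ok, message).

    Successful responses start with '=', failures with '?'; an optional
    numeric id may follow the status character.
    """
    text = text.strip()
    if not text or text[0] not in "=?":
        raise ValueError("not a GTP response: %r" % text)
    ok = text[0] == "="
    # Drop an optional id directly after the status character.
    body = text[1:].lstrip("0123456789").strip()
    return ok, body


print(format_command("genmove", "black"), end="")
print(parse_response("= D4\n\n"))
print(parse_response("? illegal move\n\n"))
```

A real client would write the formatted command to the engine's stdin and read lines from its stdout until it sees the blank line that ends a response.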

One way to play via GTP is to use gogui-display, which implements a UI that speaks GTP. You can download the gogui set of tools at http://gogui.sourceforge.net/. See also documentation on interesting ways to use GTP.

gogui-twogtp -black "python3 gtp.py --load_file=$LATEST_MODEL" -white 'gogui-display' -size 19 -komi 7.5 -verbose -auto

Another way to use GTP is to have Minigo play against GnuGo while you spectate the games:

BLACK="gnugo --mode gtp"
WHITE="python3 gtp.py --load_file=$LATEST_MODEL"
TWOGTP="gogui-twogtp -black \"$BLACK\" -white \"$WHITE\" -games 10 \
  -size 19 -alternate -sgffile gnugo"
gogui -size 19 -program "$TWOGTP" -computer-both -auto

Training Minigo

Overview

The following sequence of commands will allow you to do one iteration of reinforcement learning on 9x9. These are the basic commands used to produce the models and games referenced above.

The commands are:

  • bootstrap: initializes a random model
  • selfplay: plays games with the latest model, producing data used for training
  • train: trains a new model with the selfplay results from the most recent N generations.
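The "most recent N generations" window can be pictured with a small sketch. It assumes generation names carry a zero-padded numeric prefix (as in 000737-fury), so that sorting by name is the same as sorting by age. The function name training_window is hypothetical.

```python
# Illustrative only: pick the n most recent generations by name.
# Assumes zero-padded numeric prefixes such as "000001-first_generation".


def training_window(generations, n):
    """Return the n most recent generation names, oldest first."""
    if n <= 0:
        return []
    return sorted(generations)[-n:]


print(training_window(
    ["000002-third", "000000-bootstrap", "000001-first_generation"], 2))
```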

Training works via tf.Estimator; a working directory manages checkpoints and training logs, and the latest checkpoint is periodically exported to GCS, where it gets picked up by selfplay workers.

Configuration for things like "where do debug SGFs get written", "where does training data get written", and "where do the latest models get published" is managed by the helper scripts in the rl_loop directory. Those helper scripts execute the same commands as demonstrated below. Configuration for things like "what size network is being used?" or "how many readouts during selfplay" can be passed in as flags. The mask_flags.py utility helps ensure all parts of the pipeline are using the same network configuration.

All local paths in the examples can be replaced with gs:// GCS paths, and the Kubernetes-orchestrated version of the reinforcement learning loop uses GCS.
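The sketch below strings the three steps together as plain command lists, using the script names and flags from the examples in the following sections. It only prints the commands and does not run Minigo. The function name iteration_commands and its default arguments are assumptions made for illustration.

```python
# A sketch of one reinforcement-learning iteration as a list of commands.
# The commands are only printed; nothing here runs Minigo.


def iteration_commands(outputs="outputs", work_dir="estimator_working_dir",
                       old_model="000000-bootstrap",
                       new_model="000001-first_generation"):
    """Return the bootstrap, selfplay, and train commands as argv lists."""
    models = f"{outputs}/models"
    selfplay_dir = f"{outputs}/data/selfplay"
    return [
        ["python3", "bootstrap.py",
         f"--work_dir={work_dir}",
         f"--export_path={models}/{old_model}"],
        ["python3", "selfplay.py",
         f"--load_file={models}/{old_model}",
         f"--selfplay_dir={selfplay_dir}",
         f"--holdout_dir={outputs}/data/holdout",
         f"--sgf_dir={outputs}/sgf"],
        # The trailing "*" is a shell glob; expand it yourself before
        # passing this list to subprocess without a shell.
        ["python3", "train.py", f"{selfplay_dir}/*",
         f"--work_dir={work_dir}",
         f"--export_path={models}/{new_model}"],
    ]


for argv in iteration_commands():
    print(" ".join(argv))
```

Keeping the paths in one place like this mirrors what the rl_loop helper scripts are described as doing, though their actual implementation may differ.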

Bootstrap

This command initializes your working directory for the trainer and a random model. This random model is also exported to --export_path so that selfplay can immediately start playing with this random model.

If these directories don't exist, bootstrap will create them for you.

export MODEL_NAME=000000-bootstrap
python3 bootstrap.py \
  --work_dir=estimator_working_dir \
  --export_path=outputs/models/$MODEL_NAME

Self-play

This command starts self-playing, writing its raw game data as tf.Examples to --selfplay_dir and --holdout_dir, and in SGF form to --sgf_dir.

python3 selfplay.py \
  --load_file=outputs/models/$MODEL_NAME \
  --num_readouts 10 \
  --verbose 3 \
  --selfplay_dir=outputs/data/selfplay \
  --holdout_dir=outputs/data/holdout \
  --sgf_dir=outputs/sgf

Training

This command takes a directory of tf.Example files from selfplay and trains a new model, starting from the latest model weights in the working directory given by --work_dir (estimator_working_dir in this example).

Run the training job:

python3 train.py \
  outputs/data/selfplay/* \
  --work_dir=estimator_working_dir \
  --export_path=outputs/models/000001-first_generation

At the end of training, the latest checkpoint will be exported to the path given by --export_path. Additionally, you can follow along with the training progress with TensorBoard. If you point TensorBoard at the estimator working directory, it will find the training log files and display them.

tensorboard --logdir=estimator_working_dir

Validation

It can be useful to set aside some games as a 'validation set' for tracking whether the model is overfitting. One way to do this is with the validate command.

Validating on holdout data

By default, Minigo will hold out 5% of selfplay games for validation. This can be changed by adjusting the holdout_pct flag on the selfplay command.

With this setup, rl_loop/train_and_validate.py will validate on the same window of games that were used to train, writing TensorBoard logs to the estimator working directory.
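The holdout split can be thought of as a per-game coin flip. The sketch below is illustrative only and Minigo's actual logic may differ; it treats holdout_pct as a fraction (0.05 for 5%), which is an assumption, and the function name choose_output_dir is hypothetical.

```python
# Illustrative only: route each finished game to the training set or the
# holdout set with probability holdout_pct. Minigo's actual logic may differ.
import random


def choose_output_dir(rng, selfplay_dir, holdout_dir, holdout_pct=0.05):
    """Pick holdout_dir with probability holdout_pct, else selfplay_dir."""
    return holdout_dir if rng.random() < holdout_pct else selfplay_dir


rng = random.Random(0)  # fixed seed so the sketch is repeatable
picks = [
    choose_output_dir(rng, "outputs/data/selfplay", "outputs/data/holdout")
    for _ in range(10000)
]
# Fraction of games that landed in the holdout directory (close to 0.05).
print(picks.count("outputs/data/holdout") / len(picks))
```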

Validating on a different set of data

This might be useful if you have some known set of 'good data' to test your network against, e.g., a set of pro games. Assuming you've got a set of .sgf files with the proper komi and board sizes, you'll want to preprocess them into .tfrecord.zz files by running something similar to:

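import glob

import preprocessing  # Minigo's preprocessing module

# Collect the SGF files to convert; adjust this example pattern to your data.
filenames = glob.glob("validation_files/*.sgf")
for f in filenames:
    try:
        preprocessing.make_dataset_from_sgf(f, f.replace(".sgf", ".tfrecord.zz"))
    except Exception:
        # Print the names of files that fail to convert and keep going.
        print(f)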

Once you've collected all the files in a directory, running validation is as easy as:

python3 validate.py \
  validation_files/ \
  --work_dir=estimator_working_dir \
  --validation_name=pro_dataset

validate.py will glob all the .tfrecord.zz files under the directories given as positional arguments and compute the validation error for the positions from those files.
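For intuition, the file discovery step might look roughly like the sketch below. This is not validate.py's code; the function name find_validation_files is hypothetical, and the sketch assumes only the top level of each directory is searched.

```python
# Illustrative only: collect .tfrecord.zz files from a list of directories.
# Assumes a non-recursive search; validate.py itself may behave differently.
import glob
import os


def find_validation_files(directories):
    """Return the sorted .tfrecord.zz paths found directly in each directory."""
    files = []
    for directory in directories:
        pattern = os.path.join(directory, "*.tfrecord.zz")
        files.extend(sorted(glob.glob(pattern)))
    return files


# Prints an empty list if the directory does not exist.
print(find_validation_files(["validation_files/"]))
```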

Retraining a model

The training data for most of Minigo's models up to v13 is publicly available in the minigo-pub Cloud Storage bucket, e.g.:

gsutil ls gs://minigo-pub/v13-19x19/data/golden_chunks/

For models v14 and onwards, we started using Cloud Bigtable and are still working on making that data public.

Here's how to retrain your own model from this source data using a Cloud TPU:

# I wrote these notes using our existing TPU-enabled project, so they're missing
# a few preliminary steps, like setting up a Cloud account, creating a project,
# etc. New users will also need to enable Cloud TPU on their project using the
# TPUs panel.

###############################################################################

# Note that you will be billed for any storage you use and also while you have
# VMs running. Remember to shut down your VMs when you're not using them!

# To use a Cloud TPU on GCE, you need to create a special TPU-enabled VM using
# the `ctpu` tool. First, set up some environment variables:
#   GCE_PROJECT=<your project name>
#   GCE_VM_NAME=<your VM's name>
#   GCE_ZONE=<the zone in which you want to bring up your VM, e.g. us-central1-f>

# In this example, we will use the following values:
GCE_PROJECT=example-project
GCE_VM_NAME=minigo-etpu-test
GCE_ZONE=us-central1-f

# Create the Cloud TPU enabled VM.
ctpu up \
  --project="${GCE_PROJECT}" \
  --zone="${GCE_ZONE}" \
  --name="${GCE_VM_NAME}" \
  --tf-version=1.13

# This will take a few minutes and you should see output similar to the
# following:
#   ctpu will use the following configuration values:
#         Name:                 minigo-etpu-test
#         Zone:                 us-central1-f
#         GCP Project:          example-project
#         TensorFlow Version:   1.13
#  OK to create your Cloud TPU resources with the above configuration? [Yn]: y
#  2019/04/09 10:50:04 Creating GCE VM minigo-etpu-test (this may take a minute)...
#  2019/04/09 10:50:04 Creating TPU minigo-etpu-test (this may take a few minutes)...
#  2019/04/09 10:50:11 GCE operation still running...
#  2019/04/09 10:50:12 TPU operation still running...

# Once the Cloud TPU is created, `ctpu` will have SSHed you into the machine.

# Remember to set the same environment variables on your VM.
GCE_PROJECT=example-project
GCE_VM_NAME=minigo-etpu-test
GCE_ZONE=us-central1-f

# Clone the Minigo GitHub repository:
git clone --depth 1 https://github.com/tensorflow/minigo
cd minigo

# Install virtualenv.
pip3 install virtualenv virtualenvwrapper

# Create a virtual environment
virtualenv -p /usr/bin/python3 --system-site-packages "${HOME}/.venvs/minigo"

# Activate the virtual environment.
source "${HOME}/.venvs/minigo/bin/activate"

# Install Minigo dependencies (TensorFlow for Cloud TPU is already installed as
# part of the VM image).
pip install -r requirements.txt

# When training on a Cloud TPU, the training work directory must be on Google Cloud Storage.
# You'll need to choose your own globally unique bucket name.
# The bucket location should be close to your VM.
GCS_BUCKET_NAME=minigo_test_bucket
GCE_BUCKET_LOCATION=us-central1
gsutil mb -p "${GCE_PROJECT}" -l "${GCE_BUCKET_LOCATION}" "gs://${GCS_BUCKET_NAME}"

# Run the training script and note the location of the training work_dir
# it reports, e.g.
#    Writing to gs://minigo_test_bucket/train/2019-04-25-18
./oneoffs/train.sh "${GCS_BUCKET_NAME}"

# Launch tensorboard, pointing it at the work_dir reported by the train.sh script.
tensorboard --logdir=gs://minigo_test_bucket/train/2019-04-25-18

# After a few minutes, TensorBoard should start updating.
# Interesting graphs to look at are value_cost_normalized, policy_cost and policy_entropy.

Running Minigo on a Kubernetes Cluster

See more at cluster/README.md
