yhyu13/AlphaGOZero-python-tensorflow

Stars: 337
Rank: 124,560 (Top 3%)
Language: Python
License: MIT License
Created almost 7 years ago
Updated almost 2 years ago


Congratulations to DeepMind! This is a re-engineered implementation (drawing on many other Git repositories collected in /support/) of DeepMind's October 19th publication: [Mastering the Game of Go without Human Knowledge]. The supervised learning approach is more practical for individuals. (This repository is for educational purposes only.)

AlphaGOZero (python tensorflow implementation)

This is a trial implementation of DeepMind's October 19th publication: Mastering the Game of Go without Human Knowledge.

DeepMind has released AlphaZero Teaching Go. It's a lot of fun!

From Paper

Pure RL outperformed the supervised learning + RL agent

SL evaluation

Download trained model

https://drive.google.com/drive/folders/1Xs8Ly3wjMmXjH2agrz25Zv2e5-yqQKaP?usp=sharing
Place the downloaded files under ./savedmodels/large20/

Set up

Install requirement

Python 3.6 and tensorflow or tensorflow-gpu version 1.4 (versions >= 1.5 can't load the trained models)

pip install -r requirement.txt
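
Since the trained models depend on the TensorFlow version, a quick check before loading them can save some confusion. The following is a minimal sketch, assuming the 1.4 constraint stated above; the helper name supports_trained_models is hypothetical and not part of this repository.

```python
def supports_trained_models(tf_version):
    """Return True if tf_version (for example '1.4.0') is a 1.4.x release.

    Assumption: only the 1.4 series is treated as compatible, because the
    README states that versions >= 1.5 can't load the trained models.
    """
    # Compare only the major and minor components of the version string.
    major, minor = (int(part) for part in tf_version.split(".")[:2])
    return (major, minor) == (1, 4)
```

With TensorFlow installed, the check could be run as supports_trained_models(tf.__version__) after import tensorflow as tf.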

Download Dataset (kgs 4dan)

From the repository's root directory:

cd data/download
chmod +x download.sh
./download.sh

Preprocess Data

This is only an example; feel free to point to your own local dataset directory.

python preprocess.py preprocess ./data/SGFs/kgs-*

Train A Model

python main.py --mode=train

Play Against An A.I.

python main.py --mode=gtp --gtp_policy=greedypolicy --model_path='./savedmodels/your_model.ckpt'
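
For readers who want to script against the engine instead of using a GUI, the sketch below shows how a single Go Text Protocol (GTP version 2) response could be parsed. It assumes the engine started with --mode=gtp follows the standard GTP response format ("=" for success, "?" for failure, an optional numeric id, then the payload); the function name parse_gtp_response is hypothetical and not part of this repository.

```python
def parse_gtp_response(raw):
    """Parse one GTP version 2 response into (ok, command_id, payload).

    ok is True for '=' responses and False for '?' responses.
    command_id is an int if the response echoes an id, otherwise None.
    """
    text = raw.strip()
    if not text or text[0] not in "=?":
        raise ValueError("not a GTP response: %r" % (raw,))
    ok = text[0] == "="
    rest = text[1:]
    # An optional numeric id follows the status character directly.
    i = 0
    while i < len(rest) and rest[i].isdigit():
        i += 1
    command_id = int(rest[:i]) if i else None
    payload = rest[i:].strip()
    return ok, command_id, payload
```

For example, a reply to genmove b such as "= Q16" followed by a blank line would parse as a successful response with no id and the payload "Q16".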

Play in Sabaki

In console:

which python

Add the result as the first line of main.py, prefixed with #! (a shebang line).

In Sabaki's Manage Engines dialog, add the path of main.py with the argument --mode=gtp
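
The shebang step can also be scripted. This is a minimal sketch, not part of the repository: the helper name add_shebang is hypothetical, and it assumes that sys.executable (the interpreter running the helper) is the same path that which python would print.

```python
import sys


def add_shebang(script_path, interpreter=sys.executable):
    """Prepend '#!<interpreter>' to script_path unless it already has a shebang.

    Returns True if the file was modified, False if it was left unchanged.
    """
    with open(script_path, "r", encoding="utf-8") as f:
        source = f.read()
    if source.startswith("#!"):
        return False  # already has a shebang line; do nothing
    with open(script_path, "w", encoding="utf-8") as f:
        f.write("#!" + interpreter + "\n" + source)
    return True
```

Calling add_shebang("main.py") from the repository root would cover the manual edit. On Unix-like systems the script usually also needs to be executable (chmod +x main.py) for the shebang to take effect.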

TODO:

Credits (in no particular order):

* Brain Lee
* Ritchie Ng
* Samuel Graván
* 森下健
* yuanfengpang

C51-DDPG

This is a TensorFlow implementation of DeepMind's A Distributional Perspective on Reinforcement Learning (C51-DDPG).

CapsNet-Gravitational-Lensing

Estimating parameters of strong gravitational lenses with Capsule networks

Feedback-Alignment

Feedback alignment (FA) is a modification of backpropagation in which the next layer's weights are replaced by a fixed random matrix in the backward pass. [Lillicrap et al](https://www.nature.com/articles/ncomms13276) show that FA acts as a regularizer: the next layer's weights must learn to orient within 90 degrees of the fixed feedback weights for training to be effective. The main open problem for FA is proving general convergence under nonlinear dynamics.
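
To make the idea concrete, here is a minimal pure-Python sketch of one feedback-alignment update on a two-layer linear network. It is an illustration under stated assumptions, not code from the Feedback-Alignment repository: the function names, the squared-error loss, and the plain SGD update are all chosen for this example. The only difference from ordinary backpropagation is the line that sends the error back through the fixed matrix feedback instead of the transpose of w2.

```python
import random


def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def outer(u, v):
    """Outer product of vectors u and v as a list of rows."""
    return [[a * b for b in v] for a in u]


def random_matrix(rows, cols, rng, scale=0.5):
    """Matrix with entries drawn uniformly from [-scale, scale]."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]


def sgd_update(weights, grads, lr):
    """Return weights - lr * grads without modifying the inputs."""
    return [[w - lr * g for w, g in zip(w_row, g_row)]
            for w_row, g_row in zip(weights, grads)]


def fa_step(w1, w2, feedback, x, target, lr=0.05):
    """One SGD step on h = w1 x, y = w2 h using feedback alignment.

    Shapes: w1 is hidden x input, w2 is output x hidden, and feedback is a
    fixed hidden x output matrix that is never updated.
    Returns (new_w1, new_w2, loss), where loss is 0.5 * ||y - target||^2
    measured before the update.
    """
    h = matvec(w1, x)
    y = matvec(w2, h)
    err = [yi - ti for yi, ti in zip(y, target)]
    # Backpropagation would use transpose(w2) here; FA uses the fixed matrix.
    delta_h = matvec(feedback, err)
    new_w2 = sgd_update(w2, outer(err, h), lr)
    new_w1 = sgd_update(w1, outer(delta_h, x), lr)
    loss = 0.5 * sum(e * e for e in err)
    return new_w1, new_w2, loss
```

A training loop would draw feedback once, for example with random_matrix(hidden, output, random.Random(0)), and then call fa_step repeatedly while keeping that matrix unchanged.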

Matrix-CapsNet-EM-routing-tensorflow

This is a trial implementation of the Hinton group's [MATRIX CAPSULES WITH EM ROUTING](https://openreview.net/pdf?id=HJWLfGWRb) in TensorFlow and the Python programming language. (For discussion and learning purposes only.)

CapsNet-python-tensorflow

This is a Python TensorFlow implementation of [Dynamic Routing Between Capsules](https://arxiv.org/pdf/1710.09829.pdf). (For discussion and learning purposes only.)

Emotivoice_TTS

Vulkan

My Vulkan Renderer

Engine2021

Custom game engine made in 2021, inspired by what I learned at DigiPen Institute of Technology

AIResearchVault

neural-combinatorial-optimization-rl-tensorflow

Galaxy_Zoo_Capsule

Uses a capsule network to detect the presence of a spiral galaxy, an elliptical galaxy, or both.

CS380_3D_Nav_UE4

FPSCppTemplate-4.21

This project aims to create a KZ Jump-like game in UE4! glhf!

InterpreterProject

Paper-I-read

This repo lists the scientific papers I have read, as a reminder to myself. I hope it is helpful to you too.

HolodeckNavigationTask

CA_MODs

PyTorch-YOLOv3-Overlapping-Galaxy