  • Stars
    282
  • Rank 142,003 (Top 3 %)
  • Language
    Python
  • License
    MIT License
  • Created almost 4 years ago
  • Updated 2 months ago

Repository Details

HandyRL is a handy and simple framework based on Python and PyTorch for distributed reinforcement learning that is applicable to your own environments.

HandyRL

Quick to Start, Easy to Win

  • Prepare your own environment
  • Let’s start large-scale distributed reinforcement learning
  • Get your strong AI agent!

HandyRL is a handy and simple framework based on Python and PyTorch for distributed reinforcement learning that is applicable to your own environments. HandyRL focuses on practical algorithms and implementation for creating a strong, winning AI in competitive games. For large-scale training, HandyRL provides a high degree of parallelism that you can control to suit your environment.

HandyRL is updated at the beginning of every month, apart from important updates, which are released as needed. We appreciate all contributions. If you find a bug or have a suggestion, please let us know by creating an issue or a PR.

More About HandyRL

HandyRL mainly provides a policy gradient algorithm with off-policy correction. In terms of stability and performance, the off-policy version of policy gradient works well in practice, so it is a good first choice for creating a baseline AI model. You can choose among several off-policy variants of the update methods (targets of policy and value), from traditional ones (Monte Carlo, TD(λ)) to novel ones (V-Trace, UPGO). These items can be changed in config.yaml.
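To make the idea of a value target concrete, here is a minimal sketch of TD(λ) targets in plain Python. It is not HandyRL's implementation: the function name td_lambda_targets and its parameters are hypothetical, and it leaves out the off-policy correction that methods such as V-Trace add.

```python
def td_lambda_targets(rewards, values, bootstrap_value, gamma=1.0, lam=0.7):
    """Compute TD(lambda) targets for one trajectory (illustrative only).

    rewards[t]       reward received after acting in state s_t
    values[t]        value estimate V(s_t)
    bootstrap_value  value estimate of the state after the last step
                     (0.0 if the episode terminated)
    """
    num_steps = len(rewards)
    # V(s_{t+1}) for every step t, ending with the bootstrap value.
    next_values = list(values[1:]) + [bootstrap_value]
    targets = [0.0] * num_steps
    g = bootstrap_value
    # Work backwards: G_t = r_t + gamma * ((1 - lam) * V(s_{t+1}) + lam * G_{t+1})
    for t in reversed(range(num_steps)):
        g = rewards[t] + gamma * ((1.0 - lam) * next_values[t] + lam * g)
        targets[t] = g
    return targets


if __name__ == "__main__":
    rewards = [0.0, 0.0, 1.0]
    values = [0.5, 0.5, 0.5]
    print(td_lambda_targets(rewards, values, bootstrap_value=0.0, lam=0.7))
```

With lam=1.0 the recursion reduces to the Monte Carlo return, and with lam=0.0 it reduces to the one-step TD target, which is why a single parameter can span both traditional choices.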

For its training architecture, HandyRL adopts a learner-worker design like IMPALA. The learner is the brain of training: it updates the model and controls the workers. The workers have two roles: they asynchronously generate episodes (trajectories) and evaluate trained models. In episode generation, self-play is conducted by default.
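The learner-worker pattern can be sketched with a shared queue. The toy below uses threads in a single process, whereas a real distributed setup would use separate processes or machines. All names (worker, learner, run) and the dummy episode format are assumptions for illustration, not HandyRL's API.

```python
import queue
import random
import threading


def worker(worker_id, episode_queue, num_episodes):
    """Generate dummy episodes and send them to the learner."""
    rng = random.Random(worker_id)
    for _ in range(num_episodes):
        # Stand-in for a real trajectory: a short list of random actions.
        episode = [rng.choice([0, 1]) for _ in range(5)]
        episode_queue.put((worker_id, episode))


def learner(episode_queue, total_episodes, update_every):
    """Collect episodes and 'update the model' once per full batch."""
    model_version = 0
    buffer = []
    for _ in range(total_episodes):
        buffer.append(episode_queue.get())
        if len(buffer) >= update_every:
            model_version += 1  # placeholder for a gradient update
            buffer.clear()
    return model_version


def run(num_workers=4, episodes_per_worker=8, update_every=4):
    episode_queue = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(i, episode_queue, episodes_per_worker))
        for i in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    version = learner(episode_queue, num_workers * episodes_per_worker, update_every)
    for thread in threads:
        thread.join()
    return version


if __name__ == "__main__":
    print("model updates performed:", run())
```

The workers never wait for the learner to finish an update, which is the asynchronous property the paragraph above describes.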

Installation

Install dependencies

HandyRL supports Python 3.7+. First, copy or fork the HandyRL repository to your environment. If you want to use it in a private project, just copy the files to your project directory and modify them there.

git clone https://github.com/DeNA/HandyRL.git
cd HandyRL

Then install the additional libraries (e.g. NumPy, PyTorch). You can also do this inside a virtual environment or container (e.g. Docker).

pip3 install -r requirements.txt
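Before training, it can help to confirm that the interpreter and libraries are in place. The following is a minimal sketch, assuming NumPy and PyTorch are importable under the module names numpy and torch. The helper check_environment is hypothetical and not part of HandyRL.

```python
import importlib.util
import sys


def check_environment(required=("numpy", "torch"), min_python=(3, 7)):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info < min_python:
        found = ".".join(str(part) for part in sys.version_info[:3])
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ required, found {found}")
    for name in required:
        # find_spec looks a module up without importing it.
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing package: {name}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```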

To use games from Kaggle environments (e.g. Hungry Geese), you can also install additional dependencies.

pip3 install -r handyrl/envs/kaggle/requirements.txt

Getting Started

Train AI Model for Tic-Tac-Toe

This section shows how to train a model for Tic-Tac-Toe. Tic-Tac-Toe is a very simple game. You can play it by googling "Tic-Tac-Toe".

Step 1: Set up configuration

Set your training configuration in config.yaml. To train on Tic-Tac-Toe with a batch size of 64, set it as follows:

env_args:
    env: 'TicTacToe'

train_args:
    ...
    batch_size: 64
    ...

NOTE: Here is the list of games implemented in HandyRL. All parameters are shown in Config Parameters.
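The "..." lines in the snippet appear to stand for parameters that keep their existing values. The sketch below illustrates that idea in Python with a nested-dictionary merge. It does not read config.yaml and is not how HandyRL loads its configuration; the helper merge_config and the placeholder base values are assumptions for illustration.

```python
import copy


def merge_config(base, overrides):
    """Return a copy of base with overrides applied, recursing into dicts."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Placeholder base values, invented for this example only.
base = {
    "env_args": {"env": "PLACEHOLDER_ENV"},
    "train_args": {"batch_size": 0, "other_param": "kept"},
}

# The two settings shown in the configuration above.
overrides = {
    "env_args": {"env": "TicTacToe"},
    "train_args": {"batch_size": 64},
}

config = merge_config(base, overrides)

if __name__ == "__main__":
    print(config)
```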

Step 2: Train!

After creating the configuration, you can start training by running the following command. The trained models are saved in the models folder at the interval set by update_episodes in config.yaml.

python main.py --train
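As checkpoints accumulate in the models folder, a small helper can pick the most recent one. This is a minimal sketch, assuming checkpoints are named by epoch number as in models/1.pth. The helper latest_model is hypothetical and not part of HandyRL.

```python
from pathlib import Path


def latest_model(models_dir="models"):
    """Return the path of the highest-numbered .pth file, or None."""
    numbered = []
    for path in Path(models_dir).glob("*.pth"):
        # Only consider files whose name is a plain epoch number.
        if path.stem.isdigit():
            numbered.append((int(path.stem), path))
    if not numbered:
        return None
    return max(numbered, key=lambda item: item[0])[1]


if __name__ == "__main__":
    print(latest_model())
```

Comparing the epoch as an integer matters here: a plain string sort would place 10.pth before 2.pth.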

Step 3: Evaluate

After training, you can evaluate the model against any other model. The command below evaluates the model from epoch 1 for 100 games with 4 processes.

python main.py --eval models/1.pth 100 4

NOTE: The default opponent AI is a random agent implemented in evaluation.py. You can replace it with any of your own agents.
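To show what evaluating against a random agent involves, here is a self-contained sketch that plays Tic-Tac-Toe matches and tallies the results. It is not the code in evaluation.py: the board encoding, the agent signature and the names random_agent, first_free_agent and evaluate are all assumptions for illustration, and it runs in a single process.

```python
import random

WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def winner(board):
    """Return 1 or -1 for the winning player, or 0 if nobody has won."""
    for a, b, c in WIN_LINES:
        if board[a] != 0 and board[a] == board[b] == board[c]:
            return board[a]
    return 0


def random_agent(board, rng):
    """Pick a uniformly random empty cell."""
    return rng.choice([i for i, cell in enumerate(board) if cell == 0])


def first_free_agent(board, rng):
    """Trivial stand-in for a trained model: take the first empty cell."""
    return board.index(0)


def play_game(agents, rng):
    """Play one game; agents[0] moves first. Returns 1, -1 or 0 (draw)."""
    board = [0] * 9
    for turn in range(9):
        player = 1 if turn % 2 == 0 else -1
        agent = agents[0] if player == 1 else agents[1]
        board[agent(board, rng)] = player
        result = winner(board)
        if result != 0:
            return result
    return 0


def evaluate(agent, opponent, num_games, seed=0):
    """Tally wins, draws and losses for agent, alternating who moves first."""
    rng = random.Random(seed)
    tally = {"win": 0, "draw": 0, "loss": 0}
    for game in range(num_games):
        if game % 2 == 0:
            result = play_game((agent, opponent), rng)
        else:
            result = -play_game((opponent, agent), rng)
        key = "win" if result > 0 else "loss" if result < 0 else "draw"
        tally[key] += 1
    return tally


if __name__ == "__main__":
    print(evaluate(first_free_agent, random_agent, num_games=100))
```

Alternating the first player keeps the first-move advantage from skewing the tally.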

Documentation

Use Cases

More Repositories

1

HandlerSocket-Plugin-for-MySQL

HandlerSocket is a NoSQL plugin for MySQL, working as a daemon inside the mysqld process, to accept tcp connections, and execute requests from clients. HandlerSocket does not support SQL queries; instead it supports simple CRUD operations on tables.
C++
1,132
star
2

PyTorch_YOLOv3

Implementation of YOLOv3 in PyTorch
Python
433
star
3

Chainer_Realtime_Multi-Person_Pose_Estimation

Chainer version of Realtime Multi-Person Pose Estimation
Python
431
star
4

PacketProxy

A local proxy written in Java
Java
429
star
5

SRCNNKit

CoreML and Keras implementation of Super-Resolution Convolutional Neural Network (SRCNN)
Python
387
star
6

DeClang

An anti-hacking compiler forked from the ollvm (https://github.com/obfuscator-llvm/obfuscator)
379
star
7

Chainer_Mask_R-CNN

Implementation of Mask R-CNN in Chainer
Python
140
star
8

nota

Web application for image and video labeling and annotation
JavaScript
112
star
9

Anjin

Autopilot tool for games made with Unity
C#
99
star
10

unity-meta-check

A tool to check problems about meta files of Unity
Go
81
star
11

techcon_app

TechCon App
Dart
57
star
12

HEVCPlayerView

C++
46
star
13

android-modern-architecture-test-handson

Kotlin
30
star
14

codelabs

Codelabs created and published by DeNA.
Kotlin
28
star
15

cocoa-checker

COCOA (Covid-19 Exposure Notification System in Japan) Signal Checker
HTML
25
star
16

aelog

App Engine Logger
Go
24
star
17

ChainerPruner

ChainerPruner: Channel Pruning framework for Chainer
Python
21
star
18

devfarm

Tools to control iOS and Android mobile apps across several device farms
Go
20
star
19

Face2Speech

20
star
20

tflite-runtime-builder

Build TensorFlow Lite runtime with GitHub Actions
20
star
21

thrush

Some useful additions to bluebird for Node.js
JavaScript
16
star
22

punctual

Redis-backed Node.js task queue for delayed job processing
JavaScript
15
star
23

setup-job-workspace-action

An action creating a virtual workspace directory for each job
TypeScript
11
star
24

dworker

Distributed worker system.
JavaScript
11
star
25

capistrano-net_storage

Capistrano Plugin for Fast Deployment via Remote Storage
Ruby
10
star
26

cloud-datastore-interceptor

Interceptors for Cloud Datastore
Go
10
star
27

Dena.CodeAnalysis.Testing

TDD friendly test helpers for Microsoft.CodeAnalysis.Diagnostics.DiagnosticAnalyzer
C#
9
star
28

mysql_rewinder

Ruby
8
star
29

mobilize-server

Mobilize-Server includes deployment scripts via Capistrano and scheduling via whenever.
Shell
7
star
30

aehcl

App Engine Http Client
Go
7
star
31

FBStackableURLCache

A more pluggable version of Apple's NSURLCache. Implement a filtering webbrowser, or even your own version of Amazon Silk…
Objective-C
6
star
32

digdag-operator-bq-wait

Java
6
star
33

m_logger

Ruby
6
star
34

mobilize-base

Mobilize is a script deployment and data visualization framework with a Google Spreadsheets UI. Mobilize uses Resque for parallelization and queueing, MongoDB for caching, and Google Drive for hosting, user input and display.
Ruby
6
star
35

FBFramedScrollableView

UIView subclass that manages any type of UIKit scrollable view, automatically animating a header and footer as you scroll.
Objective-C
6
star
36

rubycf

Ruby bindings for native Property List read/writing using Core Foundation or CFLite
C
5
star
37

asyncgraph

asyncgraph is a very simple module for controlling flow between asynchronous code.
JavaScript
5
star
38

IsarTutorial

Isabelle
4
star
39

PacketProxyPlugin

Plugins for PacketProxy
Java
4
star
40

ommonitor

Open Match Ticket Monitor
Go
4
star
41

mobilize-ssh

Mobilize-Ssh adds the power of ssh to mobilize-base.
Ruby
4
star
42

ubuntu22-mysql-q4m

Dockerfile
4
star
43

RoslynAnalyzerTemplate

C#
3
star
44

capistrano-deploy_locker

Capistrano Plugin to Lock Deployment
Ruby
3
star
45

redis_info

A Scout plugin to monitor redis by using the redis-cli info command
3
star
46

PacketProxyHub

Web service for sharing configs of PacketProxy
Java
3
star
47

Login-Toboggan-Android

Java
2
star
48

mobilize-hdfs

Adds hdfs support for mobilize-ssh
Ruby
2
star
49

kobold_ruby

Tools for working with and writing tests in Ruby, Rails and Sinatra
Ruby
2
star
50

capistrano-net_storage_demo

Example application for Capistrano::NetStorage
Ruby
1
star
51

aemw

App Engine Middleware
1
star
52

mobilize-hive

adds hive support to mobilize-hdfs
Ruby
1
star
53

unity-meta-check-bins

Pre-built binaries of unity-meta-check for Windows/Linux/macOS
Shell
1
star
54

capistrano-net_storage-s3

Capistrano::NetStorage Plugin for Deployment via Amazon S3
Ruby
1
star
55

Login-Toboggan-iOS

Objective-C
1
star
56

mono-login-sample

C#
1
star