• Stars
    305
  • Rank 136,879 (Top 3 %)
  • Language
    Python
  • License
    MIT License
  • Created over 4 years ago
  • Updated over 1 year ago

Repository Details

xingtian is a componentized library for the development and verification of reinforcement learning algorithms

Introduction

License: MIT

XingTian (刑天) is a componentized library for the development and verification of reinforcement learning algorithms. It supports multiple algorithms, including DQN, DDPG, PPO, and IMPALA, which can train agents in multiple environments such as Gym, Atari, Torcs, and StarCraftII. To meet users' requirements for quick verification and for solving RL problems, four modules are abstracted: Algorithm, Model, Agent, and Environment. They fit together much like Lego building blocks. For details about the architecture, please see the Architecture introduction.

Dependencies

# ubuntu 18.04
sudo apt-get install python3-pip libopencv-dev -y
pip3 install opencv-python

# run with tensorflow 1.15.0 or tensorflow 2.3.1
pip3 install zmq h5py gym[atari] tqdm imageio matplotlib==3.0.3 Ipython pyyaml tensorflow==1.15.0 pyarrow lz4 fabric2 absl-py psutil tensorboardX setproctitle

Alternatively, run pip3 install -r requirements.txt.

If you want to use PyTorch as the backend, please install it yourself. See PyTorch for instructions.

Installation

# cd PATH/TO/XingTian 
pip3 install -e .

After installation, you can run import xt; print(xt.__version__) to check whether the installation was successful.

In [1]: import xt

In [2]: xt.__version__
Out[2]: '0.3.0'
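
The same check can be scripted, for example before launching a long training job. The snippet below is only a sketch: the helper name installed_version is hypothetical, and it relies on nothing from xt beyond the __version__ attribute shown above.

```python
import importlib
from typing import Optional


def installed_version(module_name: str = "xt") -> Optional[str]:
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Fall back to "unknown" for modules that do not define __version__.
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("xt is not installed; run `pip3 install -e .` in the XingTian directory")
    else:
        print(f"xt {version} is installed")
```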

Quick Start


Setup configuration

The following configuration shows a minimal example with the CartPole environment. More detailed descriptions of the agent, algorithm, and environment parameters can be found in the User guide.

alg_para:
  alg_name: PPO
  alg_config:
    process_num: 1
    save_model: True  # default False
    save_interval: 100

env_para:
  env_name: GymEnv
  env_info:
    name: CartPole-v0
    vision: False

agent_para:
  agent_name: PPO
  agent_num: 1
  agent_config:
    max_steps: 200
    complete_step: 1000000
    complete_episode: 3550

model_para:
  actor:
    model_name: PpoMlp
    state_dim: [4]
    action_dim: 2
    input_dtype: float32
    model_config:
      BATCH_SIZE: 200
      CRITIC_LOSS_COEF: 1.0
      ENTROPY_LOSS: 0.01
      LR: 0.0003
      LOSS_CLIPPING: 0.2
      MAX_GRAD_NORM: 5.0
      NUM_SGD_ITER: 8
      SUMMARY: False
      VF_SHARE_LAYERS: False
      activation: tanh
      hidden_sizes: [64, 64]

env_num: 10

In addition, you can find more configuration sets in the examples directory.
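
Before starting a run, it can help to confirm that a configuration contains one section per module. The sketch below works on the dict that a YAML loader would produce. Treating alg_para, env_para, agent_para, and model_para as required is an assumption drawn from the example above, and missing_sections is a hypothetical helper, not part of XingTian.

```python
from typing import Any, Dict, List

# Top-level sections used by the minimal CartPole example (assumed to be required).
REQUIRED_SECTIONS = ("alg_para", "env_para", "agent_para", "model_para")


def missing_sections(config: Dict[str, Any]) -> List[str]:
    """Return the required top-level sections that are absent from config."""
    return [name for name in REQUIRED_SECTIONS if name not in config]


# A trimmed copy of the example, written as the dict a YAML loader would produce.
cartpole_ppo = {
    "alg_para": {"alg_name": "PPO"},
    "env_para": {"env_name": "GymEnv", "env_info": {"name": "CartPole-v0", "vision": False}},
    "agent_para": {"agent_name": "PPO", "agent_num": 1},
    "model_para": {"actor": {"model_name": "PpoMlp", "state_dim": [4], "action_dim": 2}},
    "env_num": 10,
}

print(missing_sections(cartpole_ppo))      # []
print(missing_sections({"alg_para": {}}))  # ['env_para', 'agent_para', 'model_para']
```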

Start training task

python3 xt/main.py -f examples/cartpole_ppo.yaml -t train

Evaluate a locally trained model

Set benchmark.eval.model_path in YOUR_CONFIG_FILE.yaml to point to the models to evaluate.

benchmark:
  eval:
    model_path: /YOUR/PATH/TO/EVAL/models
    gap: 10           # index gap of eval model
    evaluator_num: 1  # number of evaluator instances

# run command
python3 xt/main.py -f examples/cartpole_ppo.yaml -t evaluate

NOTE: XingTian runs with -t train by default.
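
If the training configuration is already loaded as a dict, the evaluation section can be added programmatically instead of by hand. This is a minimal sketch: with_eval_section is a hypothetical helper, and its default values simply mirror the YAML fragment above.

```python
import copy
from typing import Any, Dict


def with_eval_section(config: Dict[str, Any], model_path: str,
                      gap: int = 10, evaluator_num: int = 1) -> Dict[str, Any]:
    """Return a copy of config with benchmark.eval filled in; the input is left untouched."""
    updated = copy.deepcopy(config)
    eval_section = updated.setdefault("benchmark", {}).setdefault("eval", {})
    eval_section.update(model_path=model_path, gap=gap, evaluator_num=evaluator_num)
    return updated


train_config = {"alg_para": {"alg_name": "PPO"}}
eval_config = with_eval_section(train_config, "/YOUR/PATH/TO/EVAL/models")
print(eval_config["benchmark"]["eval"])
```

Copying the dict first keeps the training configuration reusable for a later train run.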

Run with CLI

# You can replace `python3 xt/main.py` with the `xt_main` command!
xt_main -f examples/cartpole_ppo.yaml -t train

# train with evaluation
xt_main -f examples/cartpole_ppo.yaml -t train_with_evaluate
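
When training is launched from another script, the command line can be assembled in Python. The sketch below uses only the -f and -t flags and the three task names that appear in this README. build_command and KNOWN_TASKS are hypothetical names, and rejecting other task values is a precaution on the script's side, not documented XingTian behavior.

```python
import shlex
from typing import List

# Task names that appear in this README; other values are rejected as a precaution.
KNOWN_TASKS = ("train", "evaluate", "train_with_evaluate")


def build_command(config_file: str, task: str = "train") -> List[str]:
    """Build the xt_main argument list for a config file and task."""
    if task not in KNOWN_TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {KNOWN_TASKS}")
    return ["xt_main", "-f", config_file, "-t", task]


command = build_command("examples/cartpole_ppo.yaml", "train_with_evaluate")
print(shlex.join(command))
# To launch it from Python: subprocess.run(command, check=True)
```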

Develop with a custom case

  1. Write a custom module and register it. More detailed guidance on custom modules can be found in the Developer Guide.
  2. Add the name of YOUR-CUSTOM-MODULE to your_train_configure.yaml.
  3. Start training with xt_main -f path/to/your_train_configure.yaml :)
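
The register-then-refer-by-name workflow in the steps above follows a common registry pattern. The sketch below illustrates that pattern in general terms only. It is not XingTian's registration API (see the Developer Guide for that), and MODEL_REGISTRY, register_model, create_model, and MyMlp are all hypothetical names.

```python
from typing import Dict, Type

# Hypothetical registry, used only for illustration; not XingTian's own API.
MODEL_REGISTRY: Dict[str, Type] = {}


def register_model(cls: Type) -> Type:
    """Class decorator that records cls under its class name."""
    if cls.__name__ in MODEL_REGISTRY:
        raise KeyError(f"{cls.__name__} is already registered")
    MODEL_REGISTRY[cls.__name__] = cls
    return cls


@register_model
class MyMlp:
    """Stand-in for a custom model; a real one would build a network."""

    def __init__(self, model_info: dict):
        self.hidden_sizes = model_info.get("hidden_sizes", [64, 64])


def create_model(model_name: str, model_info: dict):
    """Look up the class named in the config and instantiate it."""
    return MODEL_REGISTRY[model_name](model_info)


model = create_model("MyMlp", {"hidden_sizes": [32, 32]})
print(type(model).__name__, model.hidden_sizes)
```

With a registry like this, the configuration file only needs the class name as a string, which is why step 2 amounts to adding a name to the YAML file.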

Reference Results

Episode Reward Average

  1. DQN Reward after 10M time-steps (40M frames).

    env           | XingTian Basic DQN | RLlib Basic DQN | Hessel et al. DQN
    BeamRider     | 6706               | 2869            | ~2000
    Breakout      | 352                | 287             | ~150
    QBert         | 14087              | 3921            | ~4000
    SpaceInvaders | 947                | 650             | ~500
  2. PPO Reward after 10M time-steps (40M frames).

    env           | XingTian PPO | RLlib PPO | Baselines PPO
    BeamRider     | 4877         | 2807      | ~1800
    Breakout      | 341          | 104       | ~250
    QBert         | 14771        | 11085     | ~14000
    SpaceInvaders | 1025         | 671       | ~800
  3. IMPALA Reward after 10M time-steps (40M frames).

    env           | XingTian IMPALA | RLlib IMPALA
    BeamRider     | 2313            | 2071
    Breakout      | 334             | 385
    QBert         | 12205           | 4068
    SpaceInvaders | 742             | 719

Throughput

  1. DQN

    env           | XingTian Basic DQN | RLlib Basic DQN
    BeamRider     | 129                | 109
    Breakout      | 117                | 113
    QBert         | 111                | 90
    SpaceInvaders | 115                | 100
  2. PPO

    env           | XingTian PPO | RLlib PPO
    BeamRider     | 2422         | 1618
    Breakout      | 2497         | 1535
    QBert         | 2436         | 1617
    SpaceInvaders | 2438         | 1608
  3. IMPALA

    env           | XingTian IMPALA | RLlib IMPALA
    BeamRider     | 8756            | 3637
    Breakout      | 8814            | 3525
    QBert         | 8249            | 3471
    SpaceInvaders | 8463            | 3555

Experiment conditions: 72 Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz with a single Tesla V100.

Ray's reward data comes from https://github.com/ray-project/rl-experiments, and its throughput was measured with ray 0.8.6 under the same machine conditions.
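
To read the throughput tables as relative speedups, divide each XingTian figure by the corresponding RLlib figure. The sketch below does this with the numbers copied from the tables above. Averaging the per-environment ratios is just one reasonable way to summarize them, and the names THROUGHPUT and speedups are illustrative.

```python
# Throughput figures copied from the tables above: (XingTian, RLlib) per environment.
THROUGHPUT = {
    "DQN": {"BeamRider": (129, 109), "Breakout": (117, 113),
            "QBert": (111, 90), "SpaceInvaders": (115, 100)},
    "PPO": {"BeamRider": (2422, 1618), "Breakout": (2497, 1535),
            "QBert": (2436, 1617), "SpaceInvaders": (2438, 1608)},
    "IMPALA": {"BeamRider": (8756, 3637), "Breakout": (8814, 3525),
               "QBert": (8249, 3471), "SpaceInvaders": (8463, 3555)},
}


def speedups(table):
    """Map each environment to the XingTian / RLlib throughput ratio."""
    return {env: xingtian / rllib for env, (xingtian, rllib) in table.items()}


for alg, table in THROUGHPUT.items():
    ratios = speedups(table)
    mean = sum(ratios.values()) / len(ratios)
    print(f"{alg}: mean speedup {mean:.2f}x")
```

On these numbers the mean ratio is about 1.15x for DQN, 1.54x for PPO, and 2.42x for IMPALA.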

Acknowledgement

XingTian refers to the following projects: DeepMind/scalable_agent, baselines, ray.

License

The MIT License (MIT)
