An extension of the PyMARL codebase that includes additional algorithms and environment support

Extended Python MARL framework - EPyMARL

EPyMARL is an extension of PyMARL, and includes

  • Additional algorithms (IA2C, IPPO, MADDPG, MAA2C and MAPPO)
  • Support for Gym environments (on top of the existing SMAC support)
  • Option for no-parameter sharing between agents (original PyMARL only allowed for parameter sharing)
  • Flexibility with extra implementation details (e.g. hard/soft updates, reward standardisation, and more)
  • Consistency of implementations between different algorithms (fair comparisons)

See our blog post here: https://agents.inf.ed.ac.uk/blog/epymarl/

Installation & Run instructions

For information on installing and using this codebase with SMAC, we suggest visiting and reading the original PyMARL README. Here, we maintain information on using the extra features EPyMARL offers. To install the codebase, clone this repo and install the requirements.txt.

Installing LBF, RWARE, and MPE

In Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks we introduce and benchmark algorithms in Level-Based Foraging, Multi-Robot Warehouse and Multi-agent Particle environments. To install these, please visit the repository of each environment.

Example of using LBF:

python3 src/main.py --config=qmix --env-config=gymma with env_args.time_limit=25 env_args.key="lbforaging:Foraging-8x8-2p-3f-v1"

Example of using RWARE:

python3 src/main.py --config=qmix --env-config=gymma with env_args.time_limit=500 env_args.key="rware:rware-tiny-2ag-v1"

For MPE, our fork is needed. Other than fixing some Gym compatibility issues, all it essentially does is i) register the environments with the Gym interface when imported as a package, ii) seed the environments correctly, and iii) make the action space compatible with Gym (I think MPE originally does a weird one-hot encoding of the actions).

The environment names in MPE are:

...
    "multi_speaker_listener": "MultiSpeakerListener-v0",
    "simple_adversary": "SimpleAdversary-v0",
    "simple_crypto": "SimpleCrypto-v0",
    "simple_push": "SimplePush-v0",
    "simple_reference": "SimpleReference-v0",
    "simple_speaker_listener": "SimpleSpeakerListener-v0",
    "simple_spread": "SimpleSpread-v0",
    "simple_tag": "SimpleTag-v0",
    "simple_world_comm": "SimpleWorldComm-v0",
...

After installing them, you can run an experiment using:

python3 src/main.py --config=qmix --env-config=gymma with env_args.time_limit=25 env_args.key="mpe:SimpleSpeakerListener-v0"
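The names listed above appear to follow one pattern: the snake_case scenario name is converted to CamelCase and suffixed with -v0. A minimal sketch of that mapping follows. The helper names mpe_env_id and mpe_env_key are hypothetical and not part of EPyMARL or the MPE fork, and since the list above is elided, other scenarios may not follow the same pattern.

```python
def mpe_env_id(scenario: str, version: int = 0) -> str:
    """Convert a snake_case scenario name into a CamelCase Gym ID.

    Hypothetical helper: it only reproduces the pattern visible in the
    names listed in this README.
    """
    camel = "".join(word.capitalize() for word in scenario.split("_"))
    return f"{camel}-v{version}"


def mpe_env_key(scenario: str) -> str:
    """Build the value passed as env_args.key, prefixed with the mpe module."""
    return f"mpe:{mpe_env_id(scenario)}"


print(mpe_env_key("simple_speaker_listener"))
```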

The pretrained agents are included in this repo. You can use them with:

python3 src/main.py --config=qmix --env-config=gymma with env_args.time_limit=25 env_args.key="mpe:SimpleAdversary-v0" env_args.pretrained_wrapper="PretrainedAdversary"

and

python3 src/main.py --config=qmix --env-config=gymma with env_args.time_limit=25 env_args.key="mpe:SimpleTag-v0" env_args.pretrained_wrapper="PretrainedTag"

Using A Custom Gym Environment

EPyMARL supports environments that have been registered with Gym. The only difference from the standard Gym interface is that the returned rewards should be a tuple (one reward for each agent). In this cooperative framework, these rewards are summed together.
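As a rough illustration of the reward convention, the sketch below collapses a per-agent reward tuple into a single shared reward. The function name team_reward and the example values are hypothetical; this is not EPyMARL's wrapper code, only the summation it describes.

```python
from typing import Sequence, Tuple


def team_reward(agent_rewards: Sequence[float]) -> float:
    """Sum per-agent rewards into one shared (cooperative) reward."""
    return float(sum(agent_rewards))


# Hypothetical rewards returned by one step of a two-agent environment.
rewards: Tuple[float, float] = (0.5, 0.25)
print(team_reward(rewards))  # 0.75
```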

Environments that are supported out of the box are the ones that are registered in Gym automatically. Examples are: Level-Based Foraging and RWARE.

To register a custom environment with Gym, use the template below (taken from Level-Based Foraging).

from gym.envs.registration import registry, register, make, spec

register(
    id="Foraging-8x8-2p-3f-v1",                     # Environment ID.
    entry_point="lbforaging.foraging:ForagingEnv",  # The entry point for the environment class.
    kwargs={
        ...                                         # Arguments that go to ForagingEnv's __init__ function.
    },
)

Run an experiment on a Gym environment

python3 src/main.py --config=qmix --env-config=gymma with env_args.time_limit=50 env_args.key="lbforaging:Foraging-8x8-2p-3f-v1"

In the above command, --env-config=gymma (in contrast to sc2) will use a Gym-compatible wrapper. env_args.time_limit=50 sets the maximum episode length to 50, and env_args.key="..." provides the Gym environment ID. In the ID, the lbforaging: part is the module name (i.e. import lbforaging will run automatically).
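The module:ID convention can be sketched as follows. This is only an illustration of the idea, assuming the part before the colon is an importable module whose import registers the environment. The helpers split_env_key and import_env_module are hypothetical names, not functions from EPyMARL or Gym.

```python
import importlib
from typing import Optional, Tuple


def split_env_key(key: str) -> Tuple[Optional[str], str]:
    """Split 'module:EnvID' into (module, EnvID); module is None if absent."""
    module_name, sep, env_id = key.partition(":")
    if not sep:
        return None, key
    return module_name, env_id


def import_env_module(key: str) -> str:
    """Import the module named in the key (if any) and return the bare ID.

    Importing is assumed to trigger the environment registration as a
    side effect of the package's import.
    """
    module_name, env_id = split_env_key(key)
    if module_name is not None:
        importlib.import_module(module_name)
    return env_id


print(split_env_key("lbforaging:Foraging-8x8-2p-3f-v1"))
```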

The config files act as defaults for an algorithm or environment.

They are all located in src/config. --config refers to the config files in src/config/algs, and --env-config refers to the config files in src/config/envs.
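One way to picture how defaults and command-line values interact is a layered dictionary merge, where later layers win. The sketch below is an assumption about the general idea, not EPyMARL's actual loading code. The helper names merge_config and apply_override, and every default value shown, are hypothetical.

```python
import ast
import copy
from typing import Any, Dict


def merge_config(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of base with override merged in recursively."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def apply_override(config: Dict[str, Any], assignment: str) -> Dict[str, Any]:
    """Apply one 'a.b=value' assignment, as written after 'with'."""
    dotted_key, _, raw_value = assignment.partition("=")
    try:
        value: Any = ast.literal_eval(raw_value)  # numbers, booleans, ...
    except (ValueError, SyntaxError):
        value = raw_value  # anything else is kept as a plain string
    nested: Any = value
    for part in reversed(dotted_key.split(".")):
        nested = {part: nested}
    return merge_config(config, nested)


# Hypothetical defaults standing in for the files under src/config.
env_defaults = {"env_args": {"time_limit": 100, "key": None}}
alg_defaults = {"lr": 0.0005}

config = merge_config(env_defaults, alg_defaults)
config = apply_override(config, "env_args.time_limit=50")
print(config)
```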

All results will be stored in the Results folder.

Run a hyperparameter search

We include a script named search.py which reads a search configuration file (e.g. the included search.config.example.yaml) and runs a hyperparameter search in one or more tasks. The script can be run using:

python search.py run --config=search.config.example.yaml --seeds 5 locally

In a cluster environment where one run should go to a single process, it can also be called in a batch script like:

python search.py run --config=search.config.example.yaml --seeds 5 single 1

where the 1 is an index to the particular hyperparameter configuration and can take values from 1 to the number of different combinations.
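To make the 1-based index concrete, the sketch below enumerates every combination of a small grid and picks one by index. It is an illustration only: the grid contents, the function names expand_grid and select_configuration, and the ordering of combinations are assumptions and may differ from what search.py does.

```python
import itertools
from typing import Any, Dict, List


def expand_grid(grid: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """List every combination of the grid values in a fixed order."""
    names = sorted(grid)
    value_lists = [grid[name] for name in names]
    return [dict(zip(names, values)) for values in itertools.product(*value_lists)]


def select_configuration(grid: Dict[str, List[Any]], index: int) -> Dict[str, Any]:
    """Return the configuration for a 1-based index."""
    combinations = expand_grid(grid)
    if not 1 <= index <= len(combinations):
        raise IndexError(f"index must be between 1 and {len(combinations)}")
    return combinations[index - 1]


# Hypothetical search space with 2 x 2 = 4 combinations.
grid = {"lr": [0.0001, 0.0005], "hidden_dim": [64, 128]}
print(len(expand_grid(grid)))
print(select_configuration(grid, 1))
```

With a layout like this, a batch system can launch one process per index, from 1 up to the number of combinations.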

Saving and loading learnt models

Saving models

You can save the learnt models to disk by setting save_model = True, which is set to False by default. The frequency of saving models can be adjusted using the save_model_interval configuration. Models will be saved in the result directory, under the folder called models. The directory corresponding to each run will contain the models saved throughout the experiment, each within a folder corresponding to the number of timesteps passed since the start of the learning process.

Loading models

Learnt models can be loaded using the checkpoint_path parameter, after which the learning will proceed from the corresponding timestep.
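When choosing which saved model to resume from, it can help to list the per-timestep folders of a run. The sketch below assumes each folder is named with the integer timestep at which it was saved. That naming, and the helper latest_checkpoint, are assumptions for illustration rather than EPyMARL's own loading logic.

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(run_dir: Path) -> Optional[Path]:
    """Return the sub-folder with the largest numeric name, or None."""
    step_dirs = [p for p in run_dir.iterdir() if p.is_dir() and p.name.isdigit()]
    return max(step_dirs, key=lambda p: int(p.name), default=None)
```

For example, calling latest_checkpoint on a run directory containing folders 50000 and 100000 would return the path ending in 100000. Comparing names as integers avoids the wrong answer that plain string ordering would give.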

Citing EPyMARL and PyMARL

The Extended PyMARL (EPyMARL) codebase was used in Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks.

Georgios Papoudakis, Filippos Christianos, Lukas Schäfer, & Stefano V. Albrecht. Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks, Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks (NeurIPS), 2021

In BibTeX format:

@inproceedings{papoudakis2021benchmarking,
   title={Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks},
   author={Georgios Papoudakis and Filippos Christianos and Lukas Schäfer and Stefano V. Albrecht},
   booktitle = {Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks (NeurIPS)},
   year={2021},
   url = {http://arxiv.org/abs/2006.07869},
   openreview = {https://openreview.net/forum?id=cIrPX-Sn5n},
   code = {https://github.com/uoe-agents/epymarl},
}

If you use the original PyMARL in your research, please cite the SMAC paper.

M. Samvelyan, T. Rashid, C. Schroeder de Witt, G. Farquhar, N. Nardelli, T.G.J. Rudner, C.-M. Hung, P.H.S. Torr, J. Foerster, S. Whiteson. The StarCraft Multi-Agent Challenge, CoRR abs/1902.04043, 2019.

In BibTeX format:

@article{samvelyan19smac,
  title = {{The} {StarCraft} {Multi}-{Agent} {Challenge}},
  author = {Mikayel Samvelyan and Tabish Rashid and Christian Schroeder de Witt and Gregory Farquhar and Nantas Nardelli and Tim G. J. Rudner and Chia-Man Hung and Philip H. S. Torr and Jakob Foerster and Shimon Whiteson},
  journal = {CoRR},
  volume = {abs/1902.04043},
  year = {2019},
}

License

All the source code that has been taken from the PyMARL repository was licensed (and remains so) under the Apache License v2.0 (included in the LICENSE file). Any new code is also licensed under the Apache License v2.0.
