  • Stars: 600
  • Rank 74,640 (Top 2 %)
  • Language: Python
  • Created about 6 years ago
  • Updated over 2 years ago

Multi-Agent-Learning-Environments

Hello, I pushed some Python environments for multi-agent reinforcement learning. Some are single-agent versions that can be used for algorithm testing. Each environment is documented; see the PDF file in its directory. These are toy problems, though some of them are still hard to solve. Example environments include:

  • Multi Agent Soccer Game
  • Multi Agent Rescue
  • Multi Agent Cleaner
  • Multi Agent Move Box
  • Multi Agent Catching Pig
  • Multi Drones Monitoring
  • Multi Agent Maze Running
  • Multi Agent Find Treasure
  • Firefighters
  • Go Together
  • Warehouse
  • Opposite

Dependencies

OpenCV, SWIG

Multi-Agent Environment Standard

Assumption:

Each agent works synchronously.

Member Functions

reset()

reward_list, done = step(action_list)

obs_list = get_obs()

reward_list records the single-step reward for each agent. It should be a list like [reward1, reward2, ...] whose length equals the number of agents. Each element should be an integer.

done is a boolean (True/False) that marks when an episode finishes.

action_list records the single-step action instruction for each agent. It should be a list like [action1, action2, ...] whose length equals the number of agents. Each element should be a non-negative integer.

obs_list records the single-step observation for each agent. It should be a list like [obs1, obs2, ...] whose length equals the number of agents. Each element can be any form of data, but all elements should have the same dimensions; an element is usually a list of variables or an image.
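As an illustration of the interface described above, here is a minimal sketch of a toy environment that follows the standard. The class name ToyLineEnv, its constructor arguments, the line-world dynamics, the action encoding, and the reward values are all hypothetical and are not taken from any environment in this repository; only reset(), step(action_list), and get_obs() come from the standard.

```python
class ToyLineEnv:
    """Hypothetical toy environment (not part of the repository).

    Each agent walks along a line of cells; the episode finishes
    once every agent stands on the last cell.
    """

    def __init__(self, agent_num=2, length=5):
        self.agent_num = agent_num
        self.length = length
        self.positions = [0] * agent_num

    def reset(self):
        # Put every agent back on the first cell.
        self.positions = [0] * self.agent_num

    def get_obs(self):
        # One observation per agent, all with the same dimension:
        # [own position, index of the goal cell].
        return [[pos, self.length - 1] for pos in self.positions]

    def step(self, action_list):
        # One non-negative integer action per agent
        # (assumed encoding: 0 = stay, 1 = move left, 2 = move right).
        assert len(action_list) == self.agent_num
        goal = self.length - 1
        reward_list = []
        for i, action in enumerate(action_list):
            if action == 1:
                self.positions[i] = max(0, self.positions[i] - 1)
            elif action == 2:
                self.positions[i] = min(goal, self.positions[i] + 1)
            # Integer reward: 1 while the agent is on the goal cell, else 0.
            reward_list.append(1 if self.positions[i] == goal else 0)
        done = all(pos == goal for pos in self.positions)
        return reward_list, done


env = ToyLineEnv(agent_num=2, length=3)
env.reset()
reward_list, done = env.step([2, 0])  # agent 0 moves right, agent 1 stays
obs_list = env.get_obs()
```

Both reward_list and obs_list have one entry per agent, which is the property the standard relies on.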

Typical Monte Carlo Procedures

reset environment by calling reset()
get initial observation get_obs()
for i in range(max_MC_iter):
    get action_list from controller
    apply action by step()
    record returned reward list
    record new observation by get_obs()
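The procedure above can be sketched as runnable Python. This is only a sketch under stated assumptions: CountdownEnv is a hypothetical stand-in environment, random_controller and run_episode are illustrative helper names, the number of actions (3) is arbitrary, and the loop stops early when done is True, which the procedure above does not state explicitly.

```python
import random


class CountdownEnv:
    """Hypothetical stand-in environment: the episode ends after `horizon` steps."""

    def __init__(self, agent_num=2, horizon=4):
        self.agent_num = agent_num
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0

    def get_obs(self):
        # Every agent observes the current time step.
        return [[self.t] for _ in range(self.agent_num)]

    def step(self, action_list):
        self.t += 1
        # Arbitrary integer reward: 1 for action 0, otherwise 0.
        reward_list = [1 if a == 0 else 0 for a in action_list]
        return reward_list, self.t >= self.horizon


def random_controller(obs_list, action_num, rng):
    # Placeholder controller: one uniformly random action per agent.
    return [rng.randrange(action_num) for _ in obs_list]


def run_episode(env, max_MC_iter, action_num=3, seed=0):
    rng = random.Random(seed)
    env.reset()                      # reset environment
    obs_list = env.get_obs()         # get initial observation
    reward_history, obs_history = [], [obs_list]
    for i in range(max_MC_iter):
        action_list = random_controller(obs_list, action_num, rng)
        reward_list, done = env.step(action_list)   # apply action
        reward_history.append(reward_list)          # record reward list
        obs_list = env.get_obs()                    # record new observation
        obs_history.append(obs_list)
        if done:  # assumption: stop once the episode finishes
            break
    return reward_history, obs_history


reward_history, obs_history = run_episode(CountdownEnv(), max_MC_iter=10)
```

In practice the random controller would be replaced by the learning algorithm under test, and the recorded rewards and observations would feed its update step.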

Citation

If you use these environments, please cite the following paper:

@inproceedings{jiang2021multi,
 title={Multi-agent reinforcement learning with directed exploration and selective memory reuse},
 author={Jiang, Shuo and Amato, Christopher},
 booktitle={Proceedings of the 36th Annual ACM Symposium on Applied Computing},
 pages={777--784},
 year={2021}
}