iassael/learning-to-communicate

Stars: 435
Rank: 100,085 (Top 2 %)
Language: Lua
License: Apache License 2.0
Created over 8 years ago
Updated almost 6 years ago

Learning to Communicate with Deep Multi-Agent Reinforcement Learning

Jakob N. Foerster, Yannis M. Assael, Nando de Freitas, Shimon Whiteson

PyTorch

- PyTorch Implementation by @minqi

- Simplified PyTorch implementation in a colab by @JainMoksh

Abstract

We consider the problem of multiple agents sensing and acting in environments with the goal of maximising their shared utility. In these environments, agents must learn communication protocols in order to share information that is needed to solve the tasks. By embracing deep neural networks, we are able to demonstrate end-to-end learning of protocols in complex environments inspired by communication riddles and multi-agent computer vision problems with partial observability. We propose two approaches for learning in these domains: Reinforced Inter-Agent Learning (RIAL) and Differentiable Inter-Agent Learning (DIAL). The former uses deep Q-learning, while the latter exploits the fact that, during learning, agents can backpropagate error derivatives through (noisy) communication channels. Hence, this approach uses centralised learning but decentralised execution. Our experiments introduce new environments for studying the learning of communication protocols and present a set of engineering innovations that are essential for success in these domains.
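To make the DIAL idea more concrete, here is a minimal Python sketch of a noisy communication channel that stays differentiable during learning and becomes discrete at execution time. It is not the repository's Lua code: the function names `channel` and `channel_grad`, the logistic squashing, and the default noise level `sigma=2.0` are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def channel(message, training, sigma=2.0, rng=random):
    """Hypothetical channel between two agents.

    Centralised learning: add Gaussian noise, then squash, so the
    output is a smooth function of the sender's real-valued message.
    Decentralised execution: threshold to a single discrete bit.
    """
    if training:
        return sigmoid(message + rng.gauss(0.0, sigma))
    return 1.0 if message > 0 else 0.0


def channel_grad(message, noise):
    """Derivative of the training-time channel with respect to the
    message, for a fixed noise sample. This is the factor an error
    signal from the receiver is multiplied by on its way to the sender.
    """
    s = sigmoid(message + noise)
    return s * (1.0 - s)
```

Because the training-time output is smooth in `message`, an error computed by the receiving agent can be passed back to the sender through the chain rule, which is what distinguishes this setup from learning the message purely by Q-learning as in RIAL. At execution time only the thresholded bit is sent, so agents can act without access to each other's gradients.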

Links

- Montreal Deep Learning Summer School 2016 talk

Execution

$ # Requirements: nvidia-docker
$ # Build docker instance (takes a while)
$ ./build.sh
$ # Run docker instance
$ ./run.sh
$ # Run experiment e.g.
$ ./run_switch_3-dial.sh

Bibtex

@inproceedings{foerster2016learning,
    title={Learning to communicate with deep multi-agent reinforcement learning},
    author={Foerster, Jakob and Assael, Yannis M and de Freitas, Nando and Whiteson, Shimon},
    booktitle={Advances in Neural Information Processing Systems},
    pages={2137--2145},
    year={2016} 
}

License

Code licensed under the Apache License v2.0

torch-bnlstm

Batch-Normalized LSTM (Recurrent Batch Normalization) implementation in Torch.

torch-policy-gradient

Deterministic Policy Gradient using torch7

Jupyter Notebook

torch-ddcnn

From Pixels to Torques: Policy Learning using Deep Dynamical Convolutional Neural Networks (DDCNN)

Jupyter Notebook

torch-e2c

Torch7 implementation of: Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images

Jupyter Notebook

torch-bootstrapped-dqn

Torch implementation of "Deep Exploration via Bootstrapped DQN"

torch-dropconnect

Torch7 implementation of "Regularization of Neural Networks using DropConnect"

torch-dqn

Simple PuddleWorld DQN example using torch7

torch-decomposition

Component Analysis using Torch7 (PCA, Whitened PCA, LDA, LPP, NPP, FastICA)

torch-linearo

Torch Linear Unit with Orthogonal Weight Initialization

cuda-aho-corasick-wu-manber

A Hybrid Parallel Implementation of the Aho-Corasick and Wu-Manber Algorithms Using NVIDIA CUDA and MPI Evaluated on a Biological Sequence Database. Charalampos S. Kouzinopoulos, Yannis M. Assael, Themistoklis K. Pyrgiotis, and Konstantinos G. Margaritis

tax-evasion-torch

torch-elu

Torch implementation of "Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)"

google-drive-trash-cleaner

Automatically clean your Google Drive's trash.

csoxcal

Oxford University, Computer Science Calendar Filter for Google Calendar use

findtheword

Find the Greek word from the given letters. An application for cheating at and solving games like "Τηλεκύβος" and "Βρες τη λέξη"

bo-benchmark-rkhs

RKHS 1D Function for Bayesian Optimization tasks

DEARanking

Proposing a hybrid DEA/Polynomial Interpolation (DEA/PI) algorithm for the ranking of protected areas: An application in Greece