  • Stars: 272
  • Rank: 151,235 (Top 3%)
  • Language: Python
  • License: MIT License
  • Created: almost 6 years ago
  • Updated: over 2 years ago


Repository Details

Find best-response to a fixed policy in multi-agent RL


Codebase to train, evaluate and analyze adversarial policies: policies attacking a fixed victim agent in a multi-agent system. See the paper for more information.

Installation

The easiest way to install the code is to build the Docker image from the Dockerfile, which will install all the necessary binary and Python dependencies. Build the image with:

docker build .

You can also pull a Docker image for the latest master commit from humancompatibleai/adversarial_policies:latest. Once you have built or pulled the image, run it with:

docker run -it --env MUJOCO_KEY=URL_TO_YOUR_MUJOCO_KEY \
       humancompatibleai/adversarial_policies:latest /bin/bash  # change tag if built locally

If you want to run outside of Docker (for example, for ease of development), read on.

This codebase uses Python 3.7. The main binary dependency is MuJoCo (version 1.31 for the gym_compete environments, and 2.0 for the others). You may also need to install some other libraries, such as OpenMPI.

Create a virtual environment by running ci/build_venv.sh. Activate it by . ./venv/bin/activate. Finally, run pip install -e . to install an editable version of this package.
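
Concretely, assuming (as the activation path above suggests) that the script creates the environment in ./venv:

ci/build_venv.sh          # create the virtual environment in ./venv
. ./venv/bin/activate     # activate it
pip install -e .          # install this package in editable mode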

Reproducing Results

Note that we use Sacred for experiment configuration.

Training adversarial policies

aprl.train trains a single adversarial policy. By default it will train on SumoAnts for a brief period of time. You can override any of the config parameters, defined in train_config, at the command line. For example, to replicate one of the experiments in the paper, run:

# Train on Sumo Humans for 20M timesteps
python -m aprl.train with env_name=multicomp/SumoHumans-v0 paper
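
Because these are Sacred experiments, you can also inspect the configuration without training anything via Sacred's built-in print_config command (assuming the scripts expose the standard Sacred command line, which the with syntax above suggests):

python -m aprl.train print_config               # show the default configuration
python -m aprl.train print_config with paper    # show the configuration after applying the paper named config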

aprl.multi.train trains multiple adversarial policies, using Ray (see below) for parallelization. To replicate the results in the paper (there may be slight differences due to randomness not captured in the seeding), run python -m aprl.multi.train with paper. To run the hyperparameter sweep, run python -m aprl.multi.train with hyper.
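
Both invocations, exactly as described above:

python -m aprl.multi.train with paper    # replicate the paper's adversarial policies
python -m aprl.multi.train with hyper    # hyperparameter sweep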

You can find results from our training run on s3://adversarial-policies-public/multi_train/paper. This includes TensorBoard logs, final model weights, checkpoints, and individual policy configs. Run experiments/pull_public_s3.sh to sync this and other data to data/aws-public/.

Evaluating adversarial policies

aprl.score_agent evaluates a pair of policies, for example an adversary and a victim. It outputs the win rate for each agent and the number of ties. It can also render to the screen or produce videos.
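
For illustration only, a hypothetical invocation; the parameter names below are assumptions, so check the experiment's own config (for example via Sacred's print_config) for the real keys:

python -m aprl.score_agent with env_name=multicomp/SumoHumans-v0 render=True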

We similarly use aprl.multi.score to evaluate multiple pairs of policies in parallel. To reproduce all the evaluations used in the paper, run the following bash scripts, which call aprl.multi.score internally:

  • experiments/modelfree/baselines.sh: fixed baselines (no adversarial policies).
  • experiments/modelfree/attack_transfer.sh <path-to-trained-adversaries>. To use our pre-trained policies, pass the path data/aws-public/multi_train/paper/20190429_011349 after syncing against S3, as in the example below.
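
For example, to run both scripts, using the pre-trained adversaries synced from S3 for the transfer evaluation:

experiments/modelfree/baselines.sh
experiments/modelfree/attack_transfer.sh data/aws-public/multi_train/paper/20190429_011349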

Visualizing Results

Most of the visualization code lives in the aprl.visualize package. To reproduce the figures in the paper, use paper_config; for those in the appendix, use supplementary_config. For example:

  python -m aprl.visualize.scores with paper_config  # heatmaps in the paper
  python -m aprl.visualize.training with supplementary_config  # training curves in appendix

To re-generate all the videos, use aprl.visualize.make_videos. We recommend running it in Docker, in which case it will render using Xdummy, avoiding rendering issues with many graphics drivers.

Note that you will likely need to change the default paths in the config to point at your evaluation results from the previous section and your desired output directory. For example:

python -m aprl.visualize.scores with tb_dir=<path/to/trained/models> \
                                     transfer_path=<path/to/multi_score/output>
python -m aprl.visualize.make_videos with adversary_path=<path/to/best_adversaries.json>

Additional Analysis

The density modeling can be run with experiments/aprl/density.sh, or with custom configurations via aprl.density.pipeline.

The t-SNE visualizations can be replicated with aprl.tsne.pipeline.
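
For example, using the default Sacred configurations (add with overrides for custom settings):

experiments/aprl/density.sh       # density modeling via the provided script
python -m aprl.density.pipeline   # or run the density pipeline directly
python -m aprl.tsne.pipeline      # t-SNE visualizations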

Using Ray

Many of the experiments are computationally intensive. You can run them on a single machine, but it might take several weeks. We use Ray to run distributed experiments.

We include example configs in src/aprl/configs/ray/. To use aws.yaml you will need to, at a minimum, edit the config to use your own AMI (anything with Docker should work) and private key. Then run ray up <path-to-config> and it will start a cluster. SSH into the head node, start a shell in Docker, and then follow the instructions above. The script should automatically detect it is part of a Ray cluster and run on the existing Ray server, rather than starting a new one.
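
For example, with the AWS config (the path assumes aws.yaml lives in the src/aprl/configs/ray/ directory mentioned above; edit in your own AMI and key pair first):

ray up src/aprl/configs/ray/aws.yaml      # launch the cluster
ray attach src/aprl/configs/ray/aws.yaml  # one way to open an SSH session to the head node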

Contributions

The codebase follows PEP8, with a 100-column maximum line width. Docstrings should be in reST.

Please run ci/code_checks.sh before committing; it runs several linting steps that are also checked in continuous integration.

I like to use Git commit hooks to prevent bad commits from happening in the first place:

ln -s ../../ci/code_checks.sh .git/hooks/pre-commit

More Repositories

1. imitation (Python, 1,294 stars): Clean PyTorch implementations of imitation and reward learning algorithms
2. overcooked_ai (Jupyter Notebook, 706 stars): A benchmark environment for fully cooperative human-AI performance.
3. human_aware_rl (Python, 107 stars): Code for "On the Utility of Learning about Humans for Human-AI Coordination"
4. evaluating-rewards (Python, 61 stars): Library to compare and evaluate reward functions
5. overcooked-demo (JavaScript, 55 stars): Web application where humans can play Overcooked with AI agents.
6. seals (Python, 44 stars): Benchmark environments for reward modelling and imitation learning algorithms.
7. rlsp (Python, 43 stars): Reward Learning by Simulating the Past
8. tensor-trust (Python, 40 stars): A prompt injection game to collect data for robust ML research
9. eirli (Python, 36 stars): An Empirical Investigation of Representation Learning for Imitation (EIRLI), NeurIPS'21
10. tensor-trust-data (Jupyter Notebook, 31 stars): Dataset for the Tensor Trust project
11. go_attack (Python, 31 stars)
12. ranking-challenge (Python, 27 stars): Testing ranking algorithms to improve social cohesion
13. atari-irl (Python, 26 stars)
14. deep-rlsp (Python, 26 stars): Code accompanying "Learning What To Do by Simulating the Past", ICLR 2021.
15. population-irl (Python, 25 stars): (Experimental) Inverse reinforcement learning from trajectories generated by multiple agents with different (but correlated) rewards
16. learning_biases (Jupyter Notebook, 22 stars): Infer how suboptimal agents are suboptimal while planning, for example if they are hyperbolic time discounters.
17. human_ai_robustness (Python, 21 stars)
18. learning-from-human-preferences (Python, 21 stars): Reproduction of OpenAI and DeepMind's "Deep Reinforcement Learning from Human Preferences"
19. overcooked-hAI-exp (JavaScript, 12 stars): Overcooked-AI Experiment Psiturk Demo (for MTurk experiments)
20. leela-interp (Jupyter Notebook, 11 stars): Code for "Evidence of Learned Look-Ahead in a Chess-Playing Neural Network"
21. better-adversarial-defenses (Python, 11 stars): Training in bursts for defending against adversarial policies
22. interpreting-rewards (Jupyter Notebook, 9 stars): Experiments in applying interpretability techniques to learned reward functions.
23. nn-clustering-pytorch (Python, 6 stars): Checking the divisibility of neural networks, and investigating the nature of the pieces networks can be divided into.
24. reward-preprocessing (Python, 5 stars): Preprocessing reward functions to make them more interpretable
25. recon-email (HTML, 5 stars): Script for automatically creating the reconnaissance email.
26. assistance-games (Python, 3 stars): Supporting code for Assistance Games as a Framework paper
27. KataGo-custom (C++, 3 stars): Child repository of https://github.com/HumanCompatibleAI/go_attack.
28. reducing-exploitability (Python, 3 stars)
29. KataGoVisualizer (Jupyter Notebook, 2 stars)
30. multi-agent (Python, 2 stars)
31. derail (Python, 2 stars): Supporting code for diagnostic seals paper
32. epic (Python, 1 star): Implements the Equivalent-Policy Invariant Comparison (EPIC) distance for reward functions.
33. cs294-149-fa18-notes (TeX, 1 star): LaTeX Notes from the Fall 2018 version of CS294-149: AGI Safety and Control
34. simulation-awareness (Python, 1 star): (experimental) RL agents should be more aligned if they do not know whether they are in simulation or in the real world
35. logical-active-classification (Python, 1 star): Use active learning to classify data represented as boundaries of regions in parameter space where a parametrised logical formula holds.
36. reward-function-interpretability (Jupyter Notebook, 1 star)