• Stars: 43
  • Rank: 645,449 (Top 13%)
  • Language: Python
  • License: MIT License
  • Created: almost 6 years ago
  • Updated: over 5 years ago

Repository Details

Reward Learning by Simulating the Past

More Repositories

1. imitation - Clean PyTorch implementations of imitation and reward learning algorithms (Python, 1,294 stars)
2. overcooked_ai - A benchmark environment for fully cooperative human-AI performance. (Jupyter Notebook, 706 stars)
3. adversarial-policies - Find the best response to a fixed policy in multi-agent RL (Python, 272 stars)
4. human_aware_rl - Code for "On the Utility of Learning about Humans for Human-AI Coordination" (Python, 107 stars)
5. evaluating-rewards - Library to compare and evaluate reward functions (Python, 61 stars)
6. overcooked-demo - Web application where humans can play Overcooked with AI agents. (JavaScript, 55 stars)
7. seals - Benchmark environments for reward modelling and imitation learning algorithms. (Python, 44 stars)
8. tensor-trust - A prompt injection game to collect data for robust ML research (Python, 40 stars)
9. eirli - An Empirical Investigation of Representation Learning for Imitation (EIRLI), NeurIPS '21 (Python, 36 stars)
10. tensor-trust-data - Dataset for the Tensor Trust project (Jupyter Notebook, 31 stars)
11. go_attack (Python, 31 stars)
12. ranking-challenge - Testing ranking algorithms to improve social cohesion (Python, 27 stars)
13. atari-irl (Python, 26 stars)
14. deep-rlsp - Code accompanying "Learning What To Do by Simulating the Past", ICLR 2021 (Python, 26 stars)
15. population-irl - (Experimental) Inverse reinforcement learning from trajectories generated by multiple agents with different (but correlated) rewards (Python, 25 stars)
16. learning_biases - Infer how suboptimal agents are suboptimal while planning, for example whether they are hyperbolic time discounters. (Jupyter Notebook, 22 stars)
17. human_ai_robustness (Python, 21 stars)
18. learning-from-human-preferences - Reproduction of OpenAI and DeepMind's "Deep Reinforcement Learning from Human Preferences" (Python, 21 stars)
19. overcooked-hAI-exp - Overcooked-AI experiment psiTurk demo (for MTurk experiments) (JavaScript, 12 stars)
20. leela-interp - Code for "Evidence of Learned Look-Ahead in a Chess-Playing Neural Network" (Jupyter Notebook, 11 stars)
21. better-adversarial-defenses - Training in bursts for defending against adversarial policies (Python, 11 stars)
22. interpreting-rewards - Experiments in applying interpretability techniques to learned reward functions. (Jupyter Notebook, 9 stars)
23. nn-clustering-pytorch - Checking the divisibility of neural networks, and investigating the nature of the pieces networks can be divided into. (Python, 6 stars)
24. reward-preprocessing - Preprocessing reward functions to make them more interpretable (Python, 5 stars)
25. recon-email - Script for automatically creating the reconnaissance email. (HTML, 5 stars)
26. assistance-games - Supporting code for the "Assistance Games as a Framework" paper (Python, 3 stars)
27. KataGo-custom - Child repository of https://github.com/HumanCompatibleAI/go_attack. (C++, 3 stars)
28. reducing-exploitability (Python, 3 stars)
29. KataGoVisualizer (Jupyter Notebook, 2 stars)
30. multi-agent (Python, 2 stars)
31. derail - Supporting code for the diagnostic seals paper (Python, 2 stars)
32. epic - Implements the Equivalent-Policy Invariant Comparison (EPIC) distance for reward functions. (Python, 1 star)
33. cs294-149-fa18-notes - LaTeX notes from the Fall 2018 version of CS294-149: AGI Safety and Control (TeX, 1 star)
34. simulation-awareness - (Experimental) RL agents should be more aligned if they do not know whether they are in simulation or in the real world (Python, 1 star)
35. logical-active-classification - Use active learning to classify data represented as boundaries of regions in parameter space where a parametrised logical formula holds. (Python, 1 star)
36. reward-function-interpretability (Jupyter Notebook, 1 star)