  • Stars
    6
  • Rank 2,526,808 (Top 51 %)
  • Language
    Python
  • Created almost 4 years ago
  • Updated over 1 year ago

Repository Details

Checking the divisibility of neural networks, and investigating the nature of the pieces networks can be divided into.

More Repositories

1. imitation (Python, 1,264 stars)
   Clean PyTorch implementations of imitation and reward learning algorithms

2. overcooked_ai (Jupyter Notebook, 685 stars)
   A benchmark environment for fully cooperative human-AI performance.

3. adversarial-policies (Python, 272 stars)
   Find best-response to a fixed policy in multi-agent RL

4. human_aware_rl (Python, 107 stars)
   Code for "On the Utility of Learning about Humans for Human-AI Coordination"

5. evaluating-rewards (Python, 61 stars)
   Library to compare and evaluate reward functions

6. overcooked-demo (JavaScript, 55 stars)
   Web application where humans can play Overcooked with AI agents.

7. seals (Python, 44 stars)
   Benchmark environments for reward modelling and imitation learning algorithms.

8. rlsp (Python, 43 stars)
   Reward Learning by Simulating the Past

9. tensor-trust (Python, 39 stars)
   A prompt injection game to collect data for robust ML research

10. eirli (Python, 36 stars)
    An Empirical Investigation of Representation Learning for Imitation (EIRLI), NeurIPS'21

11. go_attack (Python, 31 stars)

12. tensor-trust-data (Jupyter Notebook, 29 stars)
    Dataset for the Tensor Trust project

13. atari-irl (Python, 26 stars)

14. deep-rlsp (Python, 26 stars)
    Code accompanying "Learning What To Do by Simulating the Past", ICLR 2021.

15. ranking-challenge (Python, 25 stars)
    Testing ranking algorithms to improve social cohesion

16. population-irl (Python, 25 stars)
    (Experimental) Inverse reinforcement learning from trajectories generated by multiple agents with different (but correlated) rewards

17. learning_biases (Jupyter Notebook, 22 stars)
    Infer how suboptimal agents are suboptimal while planning, for example if they are hyperbolic time discounters.

18. human_ai_robustness (Python, 21 stars)

19. learning-from-human-preferences (Python, 21 stars)
    Reproduction of OpenAI and DeepMind's "Deep Reinforcement Learning from Human Preferences"

20. overcooked-hAI-exp (JavaScript, 12 stars)
    Overcooked-AI Experiment Psiturk Demo (for MTurk experiments)

21. leela-interp (Jupyter Notebook, 11 stars)
    Code for "Evidence of Learned Look-Ahead in a Chess-Playing Neural Network"

22. better-adversarial-defenses (Python, 11 stars)
    Training in bursts for defending against adversarial policies

23. interpreting-rewards (Jupyter Notebook, 9 stars)
    Experiments in applying interpretability techniques to learned reward functions.

24. reward-preprocessing (Python, 5 stars)
    Preprocessing reward functions to make them more interpretable

25. recon-email (HTML, 5 stars)
    Script for automatically creating the reconnaissance email.

26. assistance-games (Python, 3 stars)
    Supporting code for Assistance Games as a Framework paper

27. KataGo-custom (C++, 3 stars)
    Child repository of https://github.com/HumanCompatibleAI/go_attack.

28. reducing-exploitability (Python, 3 stars)

29. KataGoVisualizer (Jupyter Notebook, 2 stars)

30. multi-agent (Python, 2 stars)

31. derail (Python, 2 stars)
    Supporting code for diagnostic seals paper

32. epic (Python, 1 star)
    Implements the Equivalent-Policy Invariant Comparison (EPIC) distance for reward functions.

33. cs294-149-fa18-notes (TeX, 1 star)
    LaTeX Notes from the Fall 2018 version of CS294-149: AGI Safety and Control

34. simulation-awareness (Python, 1 star)
    (experimental) RL agents should be more aligned if they do not know whether they are in simulation or in the real world

35. logical-active-classification (Python, 1 star)
    Use active learning to classify data represented as boundaries of regions in parameter space where a parametrised logical formula holds.

36. reward-function-interpretability (Jupyter Notebook, 1 star)