Center for Human-Compatible AI (@HumanCompatibleAI)

Top repositories

1. imitation (Python, 1,294 stars): Clean PyTorch implementations of imitation and reward learning algorithms.
2. overcooked_ai (Jupyter Notebook, 706 stars): A benchmark environment for fully cooperative human-AI performance.
3. adversarial-policies (Python, 272 stars): Find the best response to a fixed policy in multi-agent RL.
4. human_aware_rl (Python, 107 stars): Code for "On the Utility of Learning about Humans for Human-AI Coordination".
5. evaluating-rewards (Python, 61 stars): Library to compare and evaluate reward functions.
6. overcooked-demo (JavaScript, 55 stars): Web application where humans can play Overcooked with AI agents.
7. seals (Python, 44 stars): Benchmark environments for reward modelling and imitation learning algorithms.
8. rlsp (Python, 43 stars): Reward Learning by Simulating the Past.
9. tensor-trust (Python, 40 stars): A prompt injection game to collect data for robust ML research.
10. eirli (Python, 36 stars): An Empirical Investigation of Representation Learning for Imitation (EIRLI), NeurIPS 2021.
11. tensor-trust-data (Jupyter Notebook, 31 stars): Dataset for the Tensor Trust project.
12. go_attack (Python, 31 stars)
13. ranking-challenge (Python, 27 stars): Testing ranking algorithms to improve social cohesion.
14. atari-irl (Python, 26 stars)
15. deep-rlsp (Python, 26 stars): Code accompanying "Learning What To Do by Simulating the Past", ICLR 2021.
16. population-irl (Python, 25 stars): (Experimental) Inverse reinforcement learning from trajectories generated by multiple agents with different (but correlated) rewards.
17. learning_biases (Jupyter Notebook, 22 stars): Infer how suboptimal agents are suboptimal while planning, e.g. whether they are hyperbolic time discounters.
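The hyperbolic-discounting bias that learning_biases mentions is easy to illustrate. The sketch below is generic, not code from the repository: a hyperbolic discounter weights a reward t steps away by 1/(1 + k·t), which falls faster than the standard exponential γ^t in the near term but more slowly in the far future, producing the time-inconsistent preferences the repo tries to infer.

```python
# Illustrative comparison of exponential vs. hyperbolic discount weights
# (hypothetical example; not taken from the learning_biases repository).

def exponential_weight(t, gamma=0.9):
    """Standard RL discounting: a reward t steps away is worth gamma**t."""
    return gamma ** t

def hyperbolic_weight(t, k=0.5):
    """Hyperbolic discounting: a reward t steps away is worth 1 / (1 + k*t)."""
    return 1.0 / (1.0 + k * t)

# Near-term rewards: the hyperbolic agent discounts more steeply.
# Far-future rewards: the hyperbolic weight eventually dominates.
for t in [0, 1, 10, 50]:
    print(t, round(exponential_weight(t), 4), round(hyperbolic_weight(t), 4))
```

With these (arbitrary) parameters, the hyperbolic weight is below the exponential one at t = 1 but above it by t = 50; that crossover is what makes hyperbolic discounters reverse their preferences as a delayed reward draws closer.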
18. human_ai_robustness (Python, 21 stars)
19. learning-from-human-preferences (Python, 21 stars): Reproduction of OpenAI and DeepMind's "Deep Reinforcement Learning from Human Preferences".
20. overcooked-hAI-exp (JavaScript, 12 stars): Overcooked-AI experiment PsiTurk demo (for MTurk experiments).
21. leela-interp (Jupyter Notebook, 11 stars): Code for "Evidence of Learned Look-Ahead in a Chess-Playing Neural Network".
22. better-adversarial-defenses (Python, 11 stars): Training in bursts for defending against adversarial policies.
23. interpreting-rewards (Jupyter Notebook, 9 stars): Experiments in applying interpretability techniques to learned reward functions.
24. nn-clustering-pytorch (Python, 6 stars): Checking the divisibility of neural networks, and investigating the nature of the pieces networks can be divided into.
25. reward-preprocessing (Python, 5 stars): Preprocessing reward functions to make them more interpretable.
26. recon-email (HTML, 5 stars): Script for automatically creating the reconnaissance email.
27. assistance-games (Python, 3 stars): Supporting code for the "Assistance Games as a Framework" paper.
28. KataGo-custom (C++, 3 stars): Child repository of https://github.com/HumanCompatibleAI/go_attack.
29. reducing-exploitability (Python, 3 stars)
30. KataGoVisualizer (Jupyter Notebook, 2 stars)
31. multi-agent (Python, 2 stars)
32. derail (Python, 2 stars): Supporting code for the diagnostic seals paper.
33. epic (Python, 1 star): Implements the Equivalent-Policy Invariant Comparison (EPIC) distance for reward functions.
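The metric at the core of the epic repository can be sketched briefly. EPIC compares two reward functions by computing sqrt((1 - rho) / 2), where rho is the Pearson correlation between their values on a common batch of transitions; the full algorithm first canonicalizes each reward to strip out potential-based shaping, a step omitted in this minimal sketch.

```python
import numpy as np

def pearson_distance(x, y):
    """Pearson distance used by EPIC: sqrt((1 - rho) / 2), where rho is the
    Pearson correlation between two reward vectors evaluated on the same
    transitions. 0 means equivalent up to positive affine rescaling;
    1 means perfectly anti-correlated."""
    rho = np.clip(np.corrcoef(x, y)[0, 1], -1.0, 1.0)  # clip guards against float error
    return np.sqrt((1.0 - rho) / 2.0)

rng = np.random.default_rng(0)
r1 = rng.normal(size=1000)   # reward samples from one function
r2 = 3.0 * r1 + 2.0          # positive affine transform: same ordering of behaviors
r3 = -r1                     # sign-flipped reward

print(pearson_distance(r1, r2))  # ~0: affine rescaling does not change the distance
print(pearson_distance(r1, r3))  # 1.0: maximally different
```

The invariance to positive affine transforms is the point of using correlation rather than, say, mean squared error: two rewards that induce the same optimal policy should be close even if their scales differ.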
34. cs294-149-fa18-notes (TeX, 1 star): LaTeX notes from the Fall 2018 version of CS294-149: AGI Safety and Control.
35. simulation-awareness (Python, 1 star): (Experimental) RL agents should be more aligned if they do not know whether they are in simulation or in the real world.
36. logical-active-classification (Python, 1 star): Use active learning to classify data represented as boundaries of regions in parameter space where a parametrised logical formula holds.
37. reward-function-interpretability (Jupyter Notebook, 1 star)