imitation: Clean PyTorch implementations of imitation and reward learning algorithms
overcooked_ai: A benchmark environment for fully cooperative human-AI performance.
adversarial-policies: Find best responses to a fixed policy in multi-agent RL
human_aware_rl: Code for "On the Utility of Learning about Humans for Human-AI Coordination"
evaluating-rewards: Library to compare and evaluate reward functions
overcooked-demo: Web application where humans can play Overcooked with AI agents.
seals: Benchmark environments for reward modelling and imitation learning algorithms.
rlsp: Reward Learning by Simulating the Past
eirli: An Empirical Investigation of Representation Learning for Imitation (EIRLI), NeurIPS '21
tensor-trust-data: Dataset for the Tensor Trust project
go_attack
ranking-challenge: Testing ranking algorithms to improve social cohesion
atari-irl
deep-rlsp: Code accompanying "Learning What To Do by Simulating the Past", ICLR 2021
population-irl: (Experimental) Inverse reinforcement learning from trajectories generated by multiple agents with different (but correlated) rewards
learning_biases: Infer how suboptimal agents are suboptimal while planning, e.g. whether they are hyperbolic time discounters.
human_ai_robustness
learning-from-human-preferences: Reproduction of OpenAI and DeepMind's "Deep Reinforcement Learning from Human Preferences"
overcooked-hAI-exp: Overcooked-AI experiment psiTurk demo (for MTurk experiments)
leela-interp: Code for "Evidence of Learned Look-Ahead in a Chess-Playing Neural Network"
better-adversarial-defenses: Training in bursts for defending against adversarial policies
interpreting-rewards: Experiments in applying interpretability techniques to learned reward functions.
nn-clustering-pytorch: Checking the divisibility of neural networks, and investigating the nature of the pieces networks can be divided into.
reward-preprocessing: Preprocessing reward functions to make them more interpretable
recon-email: Script for automatically creating the reconnaissance email.
assistance-games: Supporting code for the "Assistance Games as a Framework" paper
KataGo-custom: Child repository of https://github.com/HumanCompatibleAI/go_attack.
reducing-exploitability
KataGoVisualizer
multi-agent
derail: Supporting code for the diagnostic seals paper
epic: Implements the Equivalent-Policy Invariant Comparison (EPIC) distance for reward functions.
cs294-149-fa18-notes: LaTeX notes from the Fall 2018 version of CS294-149: AGI Safety and Control
simulation-awareness: (Experimental) RL agents should be more aligned if they do not know whether they are in simulation or in the real world
logical-active-classification: Use active learning to classify data represented as boundaries of regions in parameter space where a parametrised logical formula holds.
reward-function-interpretability
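The epic entry above computes a distance between reward functions. As a rough illustration of the idea (not the library's actual API), EPIC canonically shapes each reward to remove potential-based shaping, then takes the Pearson distance between the results. A minimal tabular sketch, assuming uniform state and action distributions:

```python
import numpy as np

def canonicalize(R: np.ndarray, gamma: float) -> np.ndarray:
    """Canonically shape a tabular reward R[s, a, s'] under uniform
    state/action distributions, removing potential-based shaping terms."""
    m = R.mean(axis=(1, 2))   # m[x] = E_{A,S'}[R(x, A, S')]
    mbar = m.mean()           # E_S[m(S)]
    # C(R)(s,a,s') = R(s,a,s') + gamma*m(s') - m(s) - gamma*E[m(S)]
    return R + gamma * m[None, None, :] - m[:, None, None] - gamma * mbar

def epic_distance(R1: np.ndarray, R2: np.ndarray, gamma: float = 0.99) -> float:
    """Pearson distance sqrt((1 - rho)/2) between canonically shaped rewards."""
    c1 = canonicalize(R1, gamma).ravel()
    c2 = canonicalize(R2, gamma).ravel()
    rho = np.corrcoef(c1, c2)[0, 1]
    return float(np.sqrt(max(0.0, (1.0 - rho) / 2.0)))
```

With this canonicalization, a reward that differs from another only by positive scaling and potential shaping (R2 = a*R1 + gamma*phi(s') - phi(s)) gets distance near zero, which is the invariance EPIC is designed to provide.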