Discover HumanCompatibleAI/nn-clustering-pytorch Open Source project

Stars: 6
Rank: 2,539,965 (Top 51%)
Language: Python
Created: about 4 years ago
Updated: over 1 year ago

HumanCompatibleAI

Checking the divisibility of neural networks, and investigating the nature of the pieces networks can be divided into.

imitation

Clean PyTorch implementations of imitation and reward learning algorithms

Python

1,294

overcooked_ai

A benchmark environment for fully cooperative human-AI performance.

Jupyter Notebook

706

adversarial-policies

Find the best response to a fixed policy in multi-agent RL

Python

272

human_aware_rl

Code for "On the Utility of Learning about Humans for Human-AI Coordination"

Python

107

evaluating-rewards

Library to compare and evaluate reward functions

Python

overcooked-demo

Web application where humans can play Overcooked with AI agents.

JavaScript

seals

Benchmark environments for reward modelling and imitation learning algorithms.

Python

rlsp

Reward Learning by Simulating the Past

Python

tensor-trust

A prompt injection game to collect data for robust ML research

Python

eirli

An Empirical Investigation of Representation Learning for Imitation (EIRLI), NeurIPS'21

Python

tensor-trust-data

Dataset for the Tensor Trust project

Jupyter Notebook

go_attack

Python

ranking-challenge

Testing ranking algorithms to improve social cohesion

Python

atari-irl

Python

deep-rlsp

Code accompanying "Learning What To Do by Simulating the Past", ICLR 2021.

Python

population-irl

(Experimental) Inverse reinforcement learning from trajectories generated by multiple agents with different (but correlated) rewards

Python

learning_biases

Infer how suboptimal agents are suboptimal while planning, for example if they are hyperbolic time discounters.

Jupyter Notebook

human_ai_robustness

Python

learning-from-human-preferences

Reproduction of OpenAI and DeepMind's "Deep Reinforcement Learning from Human Preferences"

Python

overcooked-hAI-exp

Overcooked-AI Experiment Psiturk Demo (for MTurk experiments)

JavaScript

leela-interp

Code for "Evidence of Learned Look-Ahead in a Chess-Playing Neural Network"

Jupyter Notebook

better-adversarial-defenses

Training in bursts for defending against adversarial policies

Python

interpreting-rewards

Experiments in applying interpretability techniques to learned reward functions.

Jupyter Notebook

reward-preprocessing

Preprocessing reward functions to make them more interpretable

Python

recon-email

Script for automatically creating the reconnaissance email.

HTML

assistance-games

Supporting code for Assistance Games as a Framework paper

Python

KataGo-custom

Child repository of https://github.com/HumanCompatibleAI/go_attack.

C++

reducing-exploitability

Python

KataGoVisualizer

Jupyter Notebook

multi-agent

Python

derail

Supporting code for the diagnostic seals paper

Python

epic

Implements the Equivalent-Policy Invariant Comparison (EPIC) distance for reward functions.

Python

cs294-149-fa18-notes

LaTeX Notes from the Fall 2018 version of CS294-149: AGI Safety and Control

TeX

simulation-awareness

(experimental) RL agents should be more aligned if they do not know whether they are in simulation or in the real world

Python

logical-active-classification

Use active learning to classify data represented as boundaries of regions in parameter space where a parametrised logical formula holds.

Python

reward-function-interpretability

Jupyter Notebook
