mila-iqia/spr

Language: Python
License: MIT License


Code for "Data-Efficient Reinforcement Learning with Self-Predictive Representations"

Data-Efficient Reinforcement Learning with Self-Predictive Representations

Max Schwarzer*, Ankesh Anand*, Rishab Goel, R Devon Hjelm, Aaron Courville, Philip Bachman

This repo provides code implementing the method from the SPR paper.

📦 Install -- Install relevant dependencies and the project
🔧 Usage -- Commands to run different experiments from the paper

Install

To install the project and its requirements, follow these steps:

# Clone the project first; requirements.txt lives in the repository root
git clone https://github.com/mila-iqia/spr
cd spr

# PyTorch
conda install pytorch torchvision -c pytorch
export LC_ALL=C.UTF-8
export LANG=C.UTF-8

# Install requirements
pip install -r requirements.txt

Usage

The default branch for the latest and stable changes is release.

To run SPR with augmentation:

python -m scripts.run --public --game pong --momentum-tau 1.

To run SPR without augmentation:

python -m scripts.run --public --game pong --augmentation none --target-augmentation 0 --momentum-tau 0.01 --dropout 0.5
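
The two commands above differ notably in --momentum-tau (1. versus 0.01). This README does not define the flag, so the following is only a minimal sketch of what such a coefficient commonly controls. It assumes the rlpyt-style soft-update convention, in which tau is the weight given to the online parameters when refreshing a target network. Under that assumption, tau = 1 copies the online weights outright, while tau = 0.01 makes the target a slowly moving average. The function name, parameter names, and toy values are illustrative and not taken from this repo.

```python
def soft_update(target, online, tau):
    """Blend online parameters into target parameters.

    ASSUMPTION: tau weights the online parameters (rlpyt-style convention),
    so tau = 1.0 is a hard copy and a small tau is a slow moving average.
    """
    return {
        name: tau * online[name] + (1.0 - tau) * target[name]
        for name in target
    }


# Toy scalar "parameters" standing in for network weights (hypothetical values).
online = {"w": 2.0, "b": -1.0}
target = {"w": 0.0, "b": 0.0}

hard = soft_update(target, online, tau=1.0)   # target becomes a copy of online
slow = soft_update(target, online, tau=0.01)  # target moves 1% of the way toward online

print(hard)
print(slow)
```

With real networks the same blend would be applied to each tensor in the state dict instead of to plain floats.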

When reporting scores, we average across 10 seeds.
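
Averaging across seeds can be sketched as follows. The per-seed scores are placeholder values for illustration only, not results from the paper, and summarize is a hypothetical helper, not part of this repo.

```python
from statistics import mean, stdev


def summarize(scores_by_seed):
    """Return (mean, sample standard deviation) of final scores across seeds."""
    scores = list(scores_by_seed.values())
    spread = stdev(scores) if len(scores) > 1 else 0.0
    return mean(scores), spread


# Placeholder scores keyed by seed; these are NOT real results.
placeholder_scores = {seed: float(seed) for seed in range(10)}

avg, spread = summarize(placeholder_scores)
print(f"mean over {len(placeholder_scores)} seeds: {avg:.2f} (std {spread:.2f})")
```

In practice, the dictionary would be filled with the final evaluation score from each of the 10 runs of the same game and configuration.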

What does each file do?

.
├── scripts
│   └── run.py                # The main runner script to launch jobs.
├── src                     
│   ├── agent.py              # Implements the Agent API for action selection 
│   ├── algos.py              # Distributional RL loss
│   ├── models.py             # Network architecture and forward passes.
│   ├── rlpyt_atari_env.py    # Slightly modified Atari env from rlpyt
│   ├── rlpyt_utils.py        # Utility methods that we use to extend rlpyt's functionality
│   └── utils.py              # Command line arguments and helper functions 
│
└── requirements.txt          # Dependencies
