  • Stars: 1,256
  • Rank: 37,419 (Top 0.8%)
  • Language: Python
  • License: Apache License 2.0
  • Created: over 4 years ago
  • Updated: 9 months ago

Repository Details

A collection of reference environments for offline reinforcement learning

D4RL is an open-source benchmark for offline reinforcement learning. It provides standardized environments and datasets for training and benchmarking algorithms. A supplementary whitepaper and website are also available.

The current maintenance plan for this library is:

  1. Pull the majority of the environments out of D4RL, fix the long-standing bugs, and have them depend on the new MuJoCo bindings. Most of the environments housed in D4RL were already maintained projects in Farama, and the ones that aren't will be moving into Gymnasium-Robotics, a standard library housing many different robotics environments. There are some environments we don't plan to maintain, notably the PyBullet ones (MuJoCo is now maintained and open source, and PyBullet is no longer maintained) and Flow (it was never widely used, and the original authors don't view it as especially valuable).
  2. Recreate all the datasets in D4RL using the revised versions of the environments, and host them in Minari, a standard offline RL dataset repository we're working on.

Setup

D4RL can be installed by cloning the repository as follows:

git clone https://github.com/Farama-Foundation/d4rl.git
cd d4rl
pip install -e .

Or, alternatively:

pip install git+https://github.com/Farama-Foundation/d4rl@master#egg=d4rl

The control environments require MuJoCo as a dependency. You may need to obtain a license and follow the setup instructions for mujoco_py. This mostly involves copying the key to your MuJoCo installation folder.
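A quick way to verify that mujoco_py is set up correctly is to create any MuJoCo-based Gym task and reset it. A minimal sketch (Hopper-v2 here is just an example Gym task, not a d4rl dataset environment):

import gym
import mujoco_py  # fails here if the MuJoCo binaries or license key cannot be found

env = gym.make('Hopper-v2')  # any MuJoCo-based Gym task exercises the bindings
env.reset()
print('mujoco_py is working')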

The Flow and CARLA tasks also require additional installation steps:

  • Instructions for installing CARLA can be found here
  • Instructions for installing Flow can be found here. Make sure to install using the SUMO simulator, and add the flow repository to your PYTHONPATH once finished.
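If you prefer to extend the path from inside Python rather than by exporting PYTHONPATH, a minimal sketch (the flow checkout path below is a placeholder):

import sys

# Placeholder path to your local flow checkout; adjust to wherever you cloned it.
sys.path.insert(0, '/path/to/flow')

import flow  # should now import without errors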

Using d4rl

d4rl uses the OpenAI Gym API. Tasks are created via the gym.make function. A full list of all tasks is available here.

Each task is associated with a fixed offline dataset, which can be obtained with the env.get_dataset() method. This method returns a dictionary with:

  • observations: An N × dim_observation Numpy array of observations.
  • actions: An N × dim_action Numpy array of actions.
  • rewards: An N-dimensional array of rewards.
  • terminals: An N-dimensional array of episode termination flags. A flag is true when the episode ends due to a termination condition, such as the agent falling over.
  • timeouts: An N-dimensional array of timeout flags. A flag is true when the episode ends because it reached the maximum episode length.
  • infos: Contains optional task-specific debugging information.

You can also load data using d4rl.qlearning_dataset(env), which formats the data for use by typical Q-learning algorithms by adding a next_observations key.

import gym
import d4rl # Import required to register environments; you may also need to import the relevant submodule

# Create the environment
env = gym.make('maze2d-umaze-v1')

# d4rl abides by the OpenAI gym interface
env.reset()
env.step(env.action_space.sample())

# Each task is associated with a dataset
# dataset contains observations, actions, rewards, terminals, and infos
dataset = env.get_dataset()
print(dataset['observations']) # An N x dim_observation Numpy array of observations

# Alternatively, use d4rl.qlearning_dataset which
# also adds next_observations.
dataset = d4rl.qlearning_dataset(env)
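The terminals and timeouts flags described above mark episode boundaries, so per-episode returns can be recovered from the raw dataset. A minimal sketch (this snippet is illustrative and not part of the d4rl API):

import numpy as np
import gym
import d4rl

env = gym.make('maze2d-umaze-v1')
dataset = env.get_dataset()

# An episode ends either at a true terminal state or when the time limit is reached.
episode_ends = np.logical_or(dataset['terminals'], dataset['timeouts'])
end_indices = np.where(episode_ends)[0]

# Sum the rewards between consecutive episode boundaries.
returns, start = [], 0
for end in end_indices:
    returns.append(dataset['rewards'][start:end + 1].sum())
    start = end + 1
print('episodes:', len(returns), 'mean return:', float(np.mean(returns)))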

Datasets are automatically downloaded to the ~/.d4rl/datasets directory when get_dataset() is called. If you would like to change the location of this directory, you can set the $D4RL_DATASET_DIR environment variable to the directory of your choosing, or pass in the dataset filepath directly into the get_dataset method.
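For example, to override the download location (a sketch; the cache path is arbitrary, and the environment variable should be set before d4rl is imported so the override is picked up):

import os

# Arbitrary example path for the dataset cache; set before importing d4rl.
os.environ['D4RL_DATASET_DIR'] = '/data/d4rl_cache'

import gym
import d4rl

env = gym.make('maze2d-umaze-v1')
dataset = env.get_dataset()  # the HDF5 file is now stored under /data/d4rl_cache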

Normalizing Scores

You can use the env.get_normalized_score(returns) function to compute a normalized score for an episode, where returns is the undiscounted total sum of rewards accumulated during an episode.

The individual min and max reference scores are stored in d4rl/infos.py for reference.
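Continuing the usage example above, a minimal sketch of computing a normalized score from one episode's undiscounted return:

import gym
import d4rl

env = gym.make('maze2d-umaze-v1')

# Roll out a random policy for one episode and accumulate the undiscounted return.
obs = env.reset()
done, total_reward = False, 0.0
while not done:
    obs, reward, done, info = env.step(env.action_space.sample())
    total_reward += reward

# A score of 0.0 corresponds to a random policy and 1.0 to an expert policy on this task.
print('normalized score:', env.get_normalized_score(total_reward))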

Algorithm Implementations

We have aggregated implementations of various offline RL algorithms in a separate repository.

Off-Policy Evaluations

D4RL currently has limited support for off-policy evaluation methods, on a select few locomotion tasks. We provide trained reference policies and a set of performance metrics. Additional details can be found in the wiki.

Recent Updates

2-12-2020

  • Added new Gym-MuJoCo datasets (labeled v2) which fixed Hopper's performance and the qpos/qvel fields.
  • Added additional wiki documentation on generating datasets.

Acknowledgements

D4RL builds on top of several excellent domains and environments built by various researchers, and we would like to thank their authors.

Citation

Please use the following BibTeX entry for citations:

@misc{fu2020d4rl,
    title={D4RL: Datasets for Deep Data-Driven Reinforcement Learning},
    author={Justin Fu and Aviral Kumar and Ofir Nachum and George Tucker and Sergey Levine},
    year={2020},
    eprint={2004.07219},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

Licenses

Unless otherwise noted, all datasets are licensed under the Creative Commons Attribution 4.0 License (CC BY), and code is licensed under the Apache 2.0 License.

More Repositories

  1. Gymnasium (Python, 6,383 stars): An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
  2. PettingZoo (Python, 2,553 stars): An API standard for multi-agent reinforcement learning environments, with popular reference environments and related utilities
  3. HighwayEnv (Python, 2,506 stars): A minimalist environment for decision-making in autonomous driving
  4. Arcade-Learning-Environment (C++, 2,106 stars): The Arcade Learning Environment (ALE), a platform for AI research
  5. Minigrid (Python, 2,051 stars): Simple and easily configurable grid world environments for reinforcement learning
  6. ViZDoom (C++, 1,723 stars): Reinforcement learning environments based on the 1993 game Doom
  7. chatarena (Python, 1,344 stars): ChatArena (or Chat Arena) is a collection of multi-agent language game environments for LLMs, aimed at developing the communication and collaboration capabilities of AIs
  8. Metaworld (Python, 1,178 stars): Collections of robotics environments geared towards benchmarking multi-task and meta reinforcement learning
  9. Miniworld (Python, 683 stars): Simple and easily configurable 3D FPS-game-like environments for reinforcement learning
  10. Gymnasium-Robotics (Python, 489 stars): A collection of robotics simulation environments for reinforcement learning
  11. SuperSuit (Python, 449 stars): A collection of wrappers for Gymnasium and PettingZoo environments (being merged into gymnasium.wrappers and pettingzoo.wrappers)
  12. MO-Gymnasium (Python, 282 stars): Multi-objective Gymnasium environments for reinforcement learning
  13. miniwob-plusplus (HTML, 276 stars): MiniWoB++, a web interaction benchmark for reinforcement learning
  14. MicroRTS (Java, 271 stars): A simple and highly efficient RTS-game-inspired environment for reinforcement learning
  15. Minari (Python, 268 stars): A standard format for offline reinforcement learning datasets, with popular reference datasets and related utilities
  16. MicroRTS-Py (Python, 219 stars): A simple and highly efficient RTS-game-inspired environment for reinforcement learning (formerly Gym-MicroRTS)
  17. MAgent2 (C++, 202 stars): An engine for high-performance multi-agent environments with very large numbers of agents, along with a set of reference environments
  18. D4RL-Evaluations (Python, 187 stars)
  19. stable-retro (C, 146 stars): Retro games for reinforcement learning
  20. Shimmy (Python, 129 stars): An API conversion tool for popular external reinforcement learning environments
  21. AutoROM (Python, 75 stars): A tool to automate installing Atari ROMs for the Arcade Learning Environment
  22. gym-examples (Python, 68 stars): Example code for the Gym documentation
  23. momaland (Python, 58 stars): Benchmarks for multi-objective multi-agent decision making
  24. Jumpy (Python, 45 stars): On-the-fly conversions between Jax and NumPy tensors
  25. gym-docs (41 stars): Code for the Gym documentation website
  26. Procgen2 (C++, 27 stars): Fast and procedurally generated side-scroller-game-like graphical environments (formerly Procgen)
  27. CrowdPlay (Jupyter Notebook, 26 stars): A web-based platform for collecting human actions in reinforcement learning environments
  28. TinyScaler (C, 19 stars): A small and fast image rescaling library with SIMD support
  29. minari-dataset-generation-scripts (Python, 15 stars): Scripts to recreate the D4RL datasets with Minari
  30. rlay (Rust, 8 stars): A relay between Gymnasium and any software
  31. gymnasium-env-template (Jinja, 7 stars): A template gymnasium environment for users to build upon
  32. A2Perf (Python, 4 stars): A benchmark for evaluating agents on sequential decision problems that are relevant to the real world; this repository contains code for running and evaluating participants' submissions on the benchmark platform
  33. farama.org (HTML, 2 stars)
  34. gym-notices (Python, 1 star)
  35. Celshast (Sass, 1 star)
  36. MPE2 (Python, 1 star): A set of communication-oriented environments
  37. Farama-Notifications (Python, 1 star): Allows for providing notifications on import to all Farama packages
  38. a2perf-circuit-training (Python, 1 star)
  39. a2perf-benchmark-submission (Python, 1 star)
  40. a2perf-web-nav (HTML, 1 star)
  41. a2perf-quadruped-locomotion (Python, 1 star)
  42. a2perf-reliability-metrics (Python, 1 star)
  43. a2perf-code-carbon (Jupyter Notebook, 1 star)