  • Stars: 603
  • Rank 73,782 (Top 2 %)
  • Language: Python
  • License: MIT License
  • Created over 1 year ago
  • Updated about 1 year ago

Repository Details

Mastering Diverse Domains through World Models

A reimplementation of DreamerV3, a scalable and general reinforcement learning algorithm that masters a wide range of applications with fixed hyperparameters.

DreamerV3 Tasks

If you find this code useful, please cite it in your paper:

@article{hafner2023dreamerv3,
  title={Mastering Diverse Domains through World Models},
  author={Hafner, Danijar and Pasukonis, Jurgis and Ba, Jimmy and Lillicrap, Timothy},
  journal={arXiv preprint arXiv:2301.04104},
  year={2023}
}

DreamerV3

DreamerV3 learns a world model from experiences and uses it to train an actor critic policy from imagined trajectories. The world model encodes sensory inputs into categorical representations and predicts future representations and rewards given actions.
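The data flow described above can be illustrated with a toy sketch. Everything in it is a hypothetical stand-in rather than the repository's code: `encode`, `predict_next`, `predict_reward`, and `policy` are made-up functions, and the latent state is a small tuple of category indices instead of a learned categorical representation. The sketch only shows that a single real observation is encoded once and the rest of the trajectory is imagined by the model, without stepping the environment.

```python
import random

NUM_CLASSES = 4  # hypothetical number of categories per latent variable


def encode(observation):
    """Toy encoder: map each input value to a category index."""
    return tuple(int(x) % NUM_CLASSES for x in observation)


def predict_next(latent, action):
    """Toy dynamics: shift every category index by the action."""
    return tuple((z + action) % NUM_CLASSES for z in latent)


def predict_reward(latent):
    """Toy reward head: larger category indices give a larger reward."""
    return sum(latent) / (len(latent) * (NUM_CLASSES - 1))


def policy(latent, rng):
    """Toy actor: pick a random action from {0, 1}."""
    return rng.choice([0, 1])


def imagine(observation, horizon, rng):
    """Roll the model forward from one real observation."""
    latent = encode(observation)
    trajectory = []
    for _ in range(horizon):
        action = policy(latent, rng)
        latent = predict_next(latent, action)  # no environment step here
        reward = predict_reward(latent)
        trajectory.append((latent, action, reward))
    return trajectory


rng = random.Random(0)
trajectory = imagine(observation=[5, 2, 7], horizon=3, rng=rng)
for latent, action, reward in trajectory:
    print(latent, action, round(reward, 2))
```

In the real agent, the imagined latents and predicted rewards are what the actor critic policy is trained on; the toy policy here is random and is not updated.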

DreamerV3 Method Diagram

DreamerV3 masters a wide range of domains with a fixed set of hyperparameters, outperforming specialized methods. Removing the need for tuning reduces the amount of expert knowledge and computational resources needed to apply reinforcement learning.

DreamerV3 Benchmark Scores

Due to its robustness, DreamerV3 shows favorable scaling properties. Notably, using larger models consistently increases not only its final performance but also its data efficiency. Increasing the number of gradient steps further increases data efficiency.

DreamerV3 Scaling Behavior

Instructions

Package

If you just want to run DreamerV3 on a custom environment, you can pip install dreamerv3 and copy example.py from this repository as a starting point.

Docker

If you want to make modifications to the code, you can either use the provided Dockerfile that contains instructions or follow the manual instructions below.

Manual

Install JAX and then the other dependencies:

pip install -r requirements.txt

Simple training script:

python example.py

Flexible training script:

python dreamerv3/train.py \
  --logdir ~/logdir/$(date "+%Y%m%d-%H%M%S") \
  --configs crafter --batch_size 16 --run.train_ratio 32
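When launching several runs, it can be convenient to assemble the same command from Python. The following is a minimal sketch, not part of the repository: `build_train_command` is a hypothetical helper, and it only uses the flags shown in the command above (`--logdir`, `--configs`, and dotted overrides such as `--run.train_ratio`).

```python
import datetime
import os
import shlex


def build_train_command(configs, overrides, logdir_root="~/logdir", now=None):
    """Build the argument list for dreamerv3/train.py (hypothetical helper)."""
    now = now or datetime.datetime.now()
    # Same timestamp format as $(date "+%Y%m%d-%H%M%S") in the shell command.
    logdir = os.path.join(
        os.path.expanduser(logdir_root), now.strftime("%Y%m%d-%H%M%S"))
    command = ["python", "dreamerv3/train.py", "--logdir", logdir]
    command += ["--configs", *configs]
    for key, value in overrides.items():
        command += [f"--{key}", str(value)]
    return command


command = build_train_command(
    configs=["crafter"],
    overrides={"batch_size": 16, "run.train_ratio": 32},
)
print(shlex.join(command))
# To launch the run: subprocess.run(command, check=True)
```

Keeping the command as a list avoids shell quoting issues; `shlex.join` is used only to print a copyable version.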

Tips

  • All config options are listed in configs.yaml and you can override them from the command line.
  • The debug config block reduces the network size, batch size, duration between logs, and so on for fast debugging (but does not learn a good model).
  • By default, the code tries to run on GPU. You can switch to CPU or TPU using the --jax.platform cpu flag. Note that multi-GPU support is untested.
  • You can run with multiple config blocks that will override defaults in the order they are specified, for example --configs crafter large.
  • By default, metrics are printed to the terminal, appended to a JSON lines file, and written as TensorBoard summaries. Other outputs like WandB can be enabled in the training script.
  • If you get a Too many leaves for PyTreeDef error, it means you're reloading a checkpoint that is not compatible with the current config. This often happens when reusing an old logdir by accident.
  • If you are getting CUDA errors, scroll up because the cause is often just an error that happened earlier, such as out of memory or incompatible JAX and CUDA versions.
  • You can use the small, medium, large config blocks to reduce memory requirements. The default is xlarge. See the scaling graph above to see how this affects performance.
  • Many environments are included, some of which require installing additional packages. See the installation scripts in scripts and the Dockerfile for reference.
  • When running on custom environments, make sure to specify the observation keys the agent should be using via encoder.mlp_keys, encoder.cnn_keys, decoder.mlp_keys and decoder.cnn_keys.
  • To log metrics from environments without showing them to the agent or storing them in the replay buffer, return them as observation keys with a log_ prefix and enable logging via the run.log_keys_... options.
  • To continue stopped training runs, simply run the same command line again and make sure that the --logdir points to the same directory.
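The override order mentioned in the tips (defaults first, then each config block in the order given, then command-line flags) can be sketched as a sequence of nested dictionary updates. This is an illustration under assumptions, not the repository's config loader: the helper names, the `model_size` and `task` keys, and all values in `blocks` are hypothetical and are not taken from configs.yaml.

```python
import copy


def deep_update(base, updates):
    """Return a copy of base with updates applied recursively."""
    result = copy.deepcopy(base)
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_update(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


def set_dotted(config, dotted_key, value):
    """Apply an override such as 'run.train_ratio' to a nested dict."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        node = node.setdefault(name, {})
    node[leaf] = value


def resolve(blocks, names, overrides):
    """Merge defaults, then named blocks in order, then flag overrides."""
    config = copy.deepcopy(blocks["defaults"])
    for name in names:
        config = deep_update(config, blocks[name])
    for key, value in overrides.items():
        set_dotted(config, key, value)
    return config


# Hypothetical config blocks, for illustration only.
blocks = {
    "defaults": {
        "batch_size": 16,
        "model_size": "xlarge",
        "run": {"train_ratio": 32, "log_every": 300},
    },
    "crafter": {"task": "crafter_reward", "run": {"log_every": 120}},
    "large": {"model_size": "large"},
}

# Mirrors: --configs crafter large --run.train_ratio 64
config = resolve(blocks, ["crafter", "large"], {"run.train_ratio": 64})
print(config)
```

Because later blocks win, swapping the order of two blocks that set the same key changes the result, which is worth keeping in mind when combining a task block with a size block.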

Disclaimer

This repository contains a reimplementation of DreamerV3 based on the open source DreamerV2 code base. It is unrelated to Google or DeepMind. The implementation has been tested to reproduce the official results on a range of environments.
