miyosuda/unreal

Language
Python
License
Other
Created about 8 years ago
Updated almost 6 years ago


Reinforcement learning with unsupervised auxiliary tasks

UNREAL

About

A replication of the UNREAL algorithm described in Google DeepMind's paper "Reinforcement learning with unsupervised auxiliary tasks."

https://arxiv.org/pdf/1611.05397.pdf

Implemented with TensorFlow and the DeepMind Lab environment.

Preview

seekavoid_arena_01

stairway_to_melon

nav_maze_static_01

Network

All weights of the convolution layers and the LSTM layer are shared.
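To make the idea of shared weights concrete, here is a minimal sketch in plain Python. It assumes "shared" means that several output heads read from one and the same encoder object. The class names `SharedEncoder` and `Head`, the layer sizes, and the two example heads are hypothetical and are not taken from this repository's code; the single dense layer only stands in for the convolution and LSTM stack.

```python
import random


class SharedEncoder:
    """Stand-in for the shared convolution + LSTM stack (hypothetical)."""

    def __init__(self, in_size, hidden_size, seed=0):
        rng = random.Random(seed)
        # One weight matrix plays the role of all shared layers.
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_size)]
                        for _ in range(hidden_size)]

    def encode(self, x):
        # Linear layer followed by ReLU.
        return [max(0.0, sum(w * v for w, v in zip(row, x)))
                for row in self.weights]


class Head:
    """Output head that owns its own weights but borrows the encoder."""

    def __init__(self, encoder, out_size, seed):
        self.encoder = encoder  # a reference to the shared object, not a copy
        rng = random.Random(seed)
        hidden_size = len(encoder.weights)
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
                        for _ in range(out_size)]

    def forward(self, x):
        hidden = self.encoder.encode(x)
        return [sum(w * v for w, v in zip(row, hidden))
                for row in self.weights]


encoder = SharedEncoder(in_size=4, hidden_size=8)
policy_head = Head(encoder, out_size=3, seed=1)     # illustrative main-task head
auxiliary_head = Head(encoder, out_size=1, seed=2)  # illustrative auxiliary head
```

Because both heads hold a reference to the same `encoder`, any update to its weights changes the features seen by every head, which is the practical effect of sharing.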

Requirements

TensorFlow (Tested with r1.0)
DeepMind Lab
numpy
cv2
pygame
matplotlib
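
A quick way to see which of these requirements are importable is sketched below. The import names are assumptions (in particular `tensorflow` for TensorFlow and `deepmind_lab` for DeepMind Lab), and the helper `find_missing` is hypothetical, not part of this repository. The DeepMind Lab module may only be importable from inside a Bazel run, so a "missing" result for it outside Bazel is not necessarily a problem.

```python
import importlib.util

# Import names assumed for each requirement listed above.
REQUIRED_MODULES = {
    "TensorFlow": "tensorflow",
    "DeepMind Lab": "deepmind_lab",
    "numpy": "numpy",
    "cv2": "cv2",
    "pygame": "pygame",
    "matplotlib": "matplotlib",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the labels of requirements whose module cannot be located."""
    missing = []
    for label, module_name in modules.items():
        try:
            spec = importlib.util.find_spec(module_name)
        except (ImportError, ValueError):
            spec = None
        if spec is None:
            missing.append(label)
    return missing


if __name__ == "__main__":
    missing = find_missing()
    print("Missing:", ", ".join(missing) if missing else "none")
```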

Result

"seekavoid_arena_01" Level

"nav_maze_static_01" Level

How to train

First, download and install DeepMind Lab:

$ git clone https://github.com/deepmind/lab.git

Then build it by following the build instructions: https://github.com/deepmind/lab/blob/master/docs/build.md

Clone this repo into the lab directory:

$ cd lab
$ git clone https://github.com/miyosuda/unreal.git

Add this Bazel instruction at the end of the lab/BUILD file:

package(default_visibility = ["//visibility:public"])
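
If you prefer to script this step, the edit can be made idempotent so that running it twice does not duplicate the line. This is only a sketch: the helper name `ensure_public_visibility` is hypothetical, and the path you pass is assumed to be the BUILD file at the root of your DeepMind Lab checkout.

```python
from pathlib import Path

# The exact line the instructions above ask you to append.
VISIBILITY_LINE = 'package(default_visibility = ["//visibility:public"])'


def ensure_public_visibility(build_file):
    """Append the visibility line to build_file unless it is already present.

    Returns True if the file was modified, False if it was left alone.
    """
    path = Path(build_file)
    text = path.read_text()
    if VISIBILITY_LINE in text:
        return False
    # Start on a fresh line if the file does not end with a newline.
    separator = "" if (not text or text.endswith("\n")) else "\n"
    path.write_text(text + separator + VISIBILITY_LINE + "\n")
    return True


# Example (path is an assumption about your checkout layout):
# ensure_public_visibility("lab/BUILD")
```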

Then start training with this Bazel command:

bazel run //unreal:train --define headless=glx

--define headless=glx uses GPU rendering, which requires that the display does not go to sleep, so display sleep must be disabled.

If you have any trouble with GPU rendering, use software rendering instead with the --define headless=osmesa option.
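
For scripted runs, the choice between the two renderers can be wrapped in a small helper that builds the command line. This is a sketch under the assumption that `glx` and `osmesa` are the only values you want to allow; the function names `bazel_run_command` and `as_shell` are hypothetical and not part of this repository.

```python
import shlex

# Renderer values mentioned above: "glx" (GPU) and "osmesa" (software).
RENDERERS = ("glx", "osmesa")


def bazel_run_command(target="//unreal:train", renderer="glx"):
    """Build the argument list for a bazel run with the chosen renderer."""
    if renderer not in RENDERERS:
        raise ValueError(f"renderer must be one of {RENDERERS}, got {renderer!r}")
    return ["bazel", "run", target, "--define", f"headless={renderer}"]


def as_shell(command):
    """Render an argument list as a single shell-safe string."""
    return " ".join(shlex.quote(part) for part in command)


if __name__ == "__main__":
    # Software-rendering variant of the training command.
    print(as_shell(bazel_run_command(renderer="osmesa")))
```

The returned list can be passed to `subprocess.run` from inside the lab directory if you want to launch the run from Python rather than copy the string into a terminal.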

How to show result

To show the result after training, run this command:

bazel run //unreal:display --define headless=glx
