rllab++
rllab++ is a framework for developing and evaluating reinforcement learning algorithms, built on rllab. In addition to the algorithms implemented in rllab, it includes implementations of Q-Prop and Interpolated Policy Gradient (see Citations).
The code is experimental and may require tuning or modification to reach the best reported performance.
Installation
Please follow the basic installation instructions in the rllab documentation.
Examples
From the launchers directory, run the following, with optional additional flags defined in launcher_utils.py:
python algo_gym_stub.py --exp=<exp_name>
Flags include:
- algo_name: algorithm to run, e.g. trpo (TRPO), vpg (vanilla policy gradient), ddpg (DDPG), qprop (Q-Prop with TRPO). See launcher_utils.py for more variants.
- env_name: OpenAI Gym environment name, e.g. HalfCheetah-v1.
The experiment will be saved in /data/local/<exp_name>.
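As a rough illustration of combining the flags above, the following sketch assembles a launch command in Python. It assumes that algo_name and env_name are passed in the same --name=value form as --exp, which this README does not state explicitly; check launcher_utils.py for the actual flag definitions. The helper build_launch_command and the experiment name qprop_cheetah are hypothetical and not part of rllab++.

```python
import shlex


def build_launch_command(exp_name, algo_name=None, env_name=None):
    """Build the argument list for launching algo_gym_stub.py.

    Hypothetical helper for illustration; not part of rllab++.
    """
    cmd = ["python", "algo_gym_stub.py", f"--exp={exp_name}"]
    # Assumption: optional flags use the same --name=value form as --exp.
    if algo_name is not None:
        cmd.append(f"--algo_name={algo_name}")
    if env_name is not None:
        cmd.append(f"--env_name={env_name}")
    return cmd


if __name__ == "__main__":
    # "qprop_cheetah" is a made-up experiment name.
    cmd = build_launch_command("qprop_cheetah", "qprop", "HalfCheetah-v1")
    # Print the command as it would be typed from the launchers directory.
    print(shlex.join(cmd))
```

The printed command can be pasted into a shell in the launchers directory, or the list can be passed to subprocess.run with cwd set to that directory.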
Citations
If you use rllab++ for academic research, you are highly encouraged to cite the following papers:
- Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Bernhard Schoelkopf, Sergey Levine. "Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning". arXiv:1706.00387 [cs.LG], 2017.
- Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Sergey Levine. "Q-Prop: Sample-Efficient Policy Gradient with an Off-Policy Critic". Proceedings of the International Conference on Learning Representations (ICLR), 2017.
- Yan Duan, Xi Chen, Rein Houthooft, John Schulman, Pieter Abbeel. "Benchmarking Deep Reinforcement Learning for Continuous Control". Proceedings of the 33rd International Conference on Machine Learning (ICML), 2016.