Normalized Advantage Functions (NAF) in TensorFlow
TensorFlow implementation of Continuous Deep Q-Learning with Model-based Acceleration.
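As a rough illustration of the idea behind NAF (not code taken from this repository), the Q-function in the paper is split as Q(x, u) = V(x) + A(x, u), where the advantage A(x, u) = -1/2 (u - mu(x))^T P(x) (u - mu(x)) is quadratic in the action and P(x) = L(x) L(x)^T is built from a lower-triangular matrix L(x) with an exponentiated diagonal. The sketch below uses plain NumPy instead of TensorFlow, and the function name `naf_q_value` and its arguments are hypothetical, standing in for the outputs of the network heads.

```python
import numpy as np


def naf_q_value(v, mu, l_entries, u):
    """Return (Q, A) for one state, given hypothetical network outputs.

    v         -- scalar state value V(x)
    mu        -- action that maximizes Q, shape (n,)
    l_entries -- n * (n + 1) / 2 raw values filling the lower triangle of L
    u         -- action to evaluate, shape (n,)
    """
    mu = np.asarray(mu, dtype=float)
    u = np.asarray(u, dtype=float)
    n = mu.shape[0]

    # Fill the lower triangle of L row by row.
    L = np.zeros((n, n))
    L[np.tril_indices(n)] = l_entries

    # Exponentiate the diagonal so that P = L L^T is positive definite.
    diag = np.diag_indices(n)
    L[diag] = np.exp(L[diag])

    P = np.dot(L, L.T)
    diff = u - mu

    # Quadratic advantage: zero at u == mu and negative everywhere else.
    advantage = -0.5 * np.dot(diff, np.dot(P, diff))
    return v + advantage, advantage


if __name__ == "__main__":
    q, a = naf_q_value(v=1.0, mu=[0.5, -0.5], l_entries=[0.0, 0.0, 0.0], u=[1.5, -0.5])
    print(q, a)
```

Because the advantage is never positive and vanishes at u = mu(x), the greedy action is simply mu(x), which is what makes Q-learning tractable in a continuous action space.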
Requirements
- Python 2.7
- gym
- TensorFlow 0.9+
Usage
First, install prerequisites with:
$ pip install tqdm gym[all]
To train a model for an environment with a continuous action space:
$ python main.py --env_name=Pendulum-v0 --is_train=True
$ python main.py --env_name=Pendulum-v0 --is_train=True --display=True
To test a trained model and record the screen with gym:
$ python main.py --env_name=Pendulum-v0 --is_train=False
$ python main.py --env_name=Pendulum-v0 --is_train=False --display=True
Results
Training details of Pendulum-v0 with different hyperparameters:
$ python main.py --env_name=Pendulum-v0 # dark green
$ python main.py --env_name=Pendulum-v0 --action_fn=tanh # light green
$ python main.py --env_name=Pendulum-v0 --use_batch_norm=True # yellow
$ python main.py --env_name=Pendulum-v0 --use_seperate_networks=True # green
Author
Taehoon Kim / @carpedm20