Deep Deterministic Policy Gradient on PyTorch
Overview
This is an implementation of Deep Deterministic Policy Gradient (DDPG) using PyTorch. Some of the utility functions, such as the replay buffer and the random process, are taken from the keras-rl repo. Contributions are very welcome.
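For readers new to DDPG, the replay buffer mentioned above stores past transitions so the agent can train on randomly drawn minibatches instead of consecutive, highly correlated steps. The following is a minimal sketch of that idea using only the standard library. It is not the keras-rl code used in this repo; the class name `ReplayBuffer` and its methods are hypothetical and chosen for illustration.

```python
import random
from collections import deque


class ReplayBuffer(object):
    """Illustrative fixed-size FIFO store of transitions (hypothetical API)."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen drops the oldest transition once it is full.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def append(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement from the stored transitions.
        batch = self.rng.sample(list(self.buffer), batch_size)
        # Regroup into per-field tuples: (states, actions, rewards, next_states, dones).
        return tuple(zip(*batch))

    def __len__(self):
        return len(self.buffer)


if __name__ == "__main__":
    buf = ReplayBuffer(capacity=3, seed=0)
    for t in range(5):
        buf.append(t, 0.1 * t, -1.0, t + 1, False)
    states, actions, rewards, next_states, dones = buf.sample(2)
    print(len(buf), states)
```

In a training loop, each environment step would be appended to the buffer, and updates would begin once the buffer holds at least one batch of transitions.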
Dependencies
- Python 3.4
- PyTorch 0.1.9
- OpenAI Gym
Run
Training: results for two environments and their training curves:
- Pendulum-v0
$ ./main.py --debug
- MountainCarContinuous-v0
$ ./main.py --env MountainCarContinuous-v0 --validate_episodes 100 --max_episode_length 2500 --ou_sigma 0.5 --debug
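The `--ou_sigma` flag in the command above presumably sets the noise scale of the random process used for exploration, which in DDPG is typically an Ornstein-Uhlenbeck (OU) process. As a rough sketch of what that parameter controls, the class below implements a scalar OU process with a simple Euler-Maruyama step. The class name `OUNoise`, its arguments, and the default values are assumptions for illustration, not this repo's actual code or defaults.

```python
import math
import random


class OUNoise(object):
    """Illustrative scalar Ornstein-Uhlenbeck process (hypothetical API)."""

    def __init__(self, theta=0.15, mu=0.0, sigma=0.2, dt=1.0, x0=0.0, seed=None):
        self.theta = theta  # strength of the pull back toward mu
        self.mu = mu        # long-run mean of the process
        self.sigma = sigma  # noise scale; larger values explore more widely
        self.dt = dt
        self.x0 = x0
        self.x = x0
        self.rng = random.Random(seed)

    def sample(self):
        # Euler-Maruyama step: x += theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
        drift = self.theta * (self.mu - self.x) * self.dt
        diffusion = self.sigma * math.sqrt(self.dt) * self.rng.gauss(0.0, 1.0)
        self.x += drift + diffusion
        return self.x

    def reset(self):
        # Restart the process, e.g. at the beginning of each episode.
        self.x = self.x0


if __name__ == "__main__":
    noise = OUNoise(sigma=0.5, seed=0)
    print([round(noise.sample(), 3) for _ in range(5)])
```

Because successive samples are correlated, the noise added to the actor's action drifts smoothly rather than jumping independently at every step. A larger sigma, such as the 0.5 used for MountainCarContinuous-v0, gives wider exploration.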
Testing:
$ ./main.py --mode test --debug