PPO
PPO implementation for OpenAI Gym environments, based on Unity ML-Agents: https://github.com/Unity-Technologies/ml-agents
Notable changes include:
- Ability to continuously display progress during training using a deterministic (non-stochastic) policy
- Works with OpenAI Gym environments
- Option to record episodes
- State normalization for a given number of frames
- Frame skip
- Faster reward discounting, among other changes
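
To make three of the items above more concrete, here is a minimal sketch of reward discounting, state normalization over a fixed number of frames, and frame skip. It is an illustration under stated assumptions, not the code used in this repository: the names `discount_rewards`, `RunningNormalizer`, `step_with_frame_skip`, `max_frames` and `skip` are hypothetical, the normalizer is assumed to freeze its statistics once the frame budget is reached, and the environment is assumed to follow the classic Gym `step` API that returns `(obs, reward, done, info)`.

```python
import numpy as np


def discount_rewards(rewards, gamma=0.99):
    """Discounted returns G_t = r_t + gamma * G_{t+1} for a single episode."""
    returns = np.zeros(len(rewards), dtype=np.float64)
    running = 0.0
    # Walk backwards so each return reuses the one computed after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


class RunningNormalizer:
    """Per-feature mean/variance, updated for the first `max_frames` states only.

    Assumption: after `max_frames` states the statistics are frozen and
    later states are normalized with the frozen values.
    """

    def __init__(self, shape, max_frames):
        self.max_frames = max_frames
        self.count = 0
        self.mean = np.zeros(shape, dtype=np.float64)
        self.m2 = np.zeros(shape, dtype=np.float64)  # sum of squared deviations

    def __call__(self, state):
        state = np.asarray(state, dtype=np.float64)
        if self.count < self.max_frames:
            # Welford's online update of mean and squared deviations.
            self.count += 1
            delta = state - self.mean
            self.mean += delta / self.count
            self.m2 += delta * (state - self.mean)
        var = self.m2 / max(self.count - 1, 1)
        return (state - self.mean) / np.sqrt(var + 1e-8)


def step_with_frame_skip(env, action, skip=4):
    """Repeat `action` for up to `skip` (>= 1) steps and sum the rewards.

    Assumes the classic Gym API: env.step(action) -> (obs, reward, done, info).
    Stops early if the episode ends.
    """
    total_reward = 0.0
    for _ in range(skip):
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            break
    return obs, total_reward, done, info
```

A typical loop would call `step_with_frame_skip` in place of `env.step`, pass each observation through the normalizer before feeding it to the policy, and call `discount_rewards` once per finished episode.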