uvipen/Contra-PPO-pytorch

Stars: 132
Rank: 274,205 (Top 6 %)
Language: Python
Created over 5 years ago
Updated about 1 year ago

uvipen

Proximal Policy Optimization (PPO) algorithm for Contra

[PYTORCH] Proximal Policy Optimization (PPO) for Contra NES

Introduction

Here is my Python source code for training an agent to play Contra (NES), using the Proximal Policy Optimization (PPO) algorithm introduced in the paper Proximal Policy Optimization Algorithms.

For your information, PPO is the algorithm proposed by OpenAI and used for training OpenAI Five, which is the first AI to beat the world champions in an esports game. Specifically, in August 2018 OpenAI Five defeated a team of casters and ex-pros with MMR rankings in the 99.95th percentile of Dota 2 players.
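
For readers unfamiliar with PPO, the clipped surrogate objective from the paper can be sketched in plain Python as below. This is only a minimal illustration under stated assumptions and not the code used in this repository: the function name ppo_clipped_loss, its arguments, and the default epsilon of 0.2 (a commonly used value) are all chosen here for the example, and a real implementation would operate on PyTorch tensors rather than lists.

```python
import math


def ppo_clipped_loss(new_log_probs, old_log_probs, advantages, epsilon=0.2):
    """Return the negative clipped surrogate objective, averaged over samples.

    All names here are illustrative assumptions, not the repository's API.
    """
    if not (len(new_log_probs) == len(old_log_probs) == len(advantages)):
        raise ValueError("all inputs must have the same length")
    if not advantages:
        raise ValueError("inputs must not be empty")

    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - epsilon, 1 + epsilon].
        clipped_ratio = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
        # Pessimistic (minimum) choice between unclipped and clipped terms.
        total += min(ratio * adv, clipped_ratio * adv)

    # The objective is maximized, so the loss to minimize is its negative.
    return -total / len(advantages)
```

The clipping removes the incentive to move the new policy far away from the old one within a single update: once the ratio leaves the interval around 1, a positive advantage no longer increases the objective.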

Sample result

Motivation

It has been a while since I released my A3C implementation (A3C code) and PPO implementation (PPO code) for training an agent to play Super Mario Bros. Since PPO outperforms A3C in the number of levels completed, as a next step I want to see how the former performs in another famous NES game: Contra.

How to use my code

With my code, you can:

Train your model by running python train.py. For example: python train.py --level 1 --lr 1e-4
Test your trained model by running python test.py. For example: python test.py --level 1
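
To illustrate how flags such as --level and --lr from the examples above could be read, here is a hypothetical sketch using argparse from the Python standard library. It is not the actual content of train.py: the function name get_args, the default values, and the help texts are assumptions made for this example, and the real script may accept more options.

```python
import argparse


def get_args(argv=None):
    """Parse the two flags shown in the examples above (hypothetical sketch)."""
    parser = argparse.ArgumentParser(
        description="Illustrative command-line parser, not the real train.py"
    )
    # Defaults below are assumptions chosen to match the example commands.
    parser.add_argument("--level", type=int, default=1, help="game level to play")
    parser.add_argument("--lr", type=float, default=1e-4, help="learning rate")
    return parser.parse_args(argv)


# Equivalent to running: python train.py --level 1 --lr 1e-4
args = get_args(["--level", "1", "--lr", "1e-4"])
print(args.level, args.lr)
```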

Docker

For convenience, I provide a Dockerfile that can be used for both the training and test phases.

Assume that the Docker image's name is ppo, that you only want to use the first GPU, and that you have already cloned this repository and cd'ed into it.

Build:

sudo docker build --network=host -t ppo .

Run:

docker run --runtime=nvidia -it --rm --volume="$PWD"/../Contra-PPO-pytorch:/Contra-PPO-pytorch --gpus device=0 ppo

Then, inside the Docker container, you can simply run the train.py or test.py scripts as mentioned above.

Note: There is a rendering bug when using Docker. Therefore, when you train or test inside Docker, please comment out the line env.render() in src/process.py for training or in test.py for testing. You will then no longer see the visualization window pop up. This is not a big problem, since the training process will still run and the test process will still produce an output mp4 file for visualization.
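
As an alternative to commenting the line out by hand, the render call could be guarded by a switch. The sketch below is only an assumption about how that might look and is not part of this repository: the environment variable CONTRA_RENDER and the helper names rendering_enabled and maybe_render are hypothetical.

```python
import os


def rendering_enabled(environ=None):
    """Return False when the hypothetical CONTRA_RENDER variable is "0"."""
    environ = os.environ if environ is None else environ
    return environ.get("CONTRA_RENDER", "1") != "0"


def maybe_render(env, enabled):
    """Call env.render() only when rendering is enabled."""
    if enabled:
        env.render()
```

With such a guard in place of the bare env.render() call, rendering could be switched off inside the container by passing -e CONTRA_RENDER=0 to docker run, while runs outside Docker would keep the visualization window.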

ASCII-generator

ASCII generator (image to text, image to image, video to video)

Super-mario-bros-PPO-pytorch

Proximal Policy Optimization (PPO) algorithm for Super Mario Bros

Super-mario-bros-A3C-pytorch

Asynchronous Advantage Actor-Critic (A3C) algorithm for Super Mario Bros

QuickDraw

Implementation of Quickdraw - an online game developed by Google

Flappy-bird-deep-Q-learning-pytorch

Deep Q-learning for playing flappy bird game

Tetris-deep-Q-learning-pytorch

Deep Q-learning for playing tetris game

AirGesture

Play games without touching keyboard

Hierarchical-attention-networks-pytorch

Hierarchical Attention Networks for document classification

Yolo-v2-pytorch

YOLO for object detection tasks

Photomosaic-generator

photomosaic generator (image to image, video to video)

SSD-pytorch

SSD: Single Shot MultiBox Detector pytorch implementation focusing on simplicity

Street-fighter-A3C-ICM-pytorch

Curiosity-driven Exploration by Self-supervised Prediction for Street Fighter III Third Strike

Lego-generator

QuickDraw-AirGesture-tensorflow

Implementation of QuickDraw - an online game developed by Google, combined with AirGesture - a simple gesture recognition application

Chrome-dino-deep-Q-learning-pytorch

Deep Q-learning for playing chrome dino game

Deeplab-pytorch

Deeplab for semantic segmentation tasks

Character-level-cnn-pytorch

Character-level CNN for text classification

Very-deep-cnn-pytorch

Very deep CNN for text classification

Character-level-cnn-tensorflow

Character-level CNN for text classification

Sonic-PPO-pytorch

Proximal Policy Optimization (PPO) algorithm for Sonic the Hedgehog

uvipen

Very-deep-cnn-tensorflow

Very deep CNN for text classification

Color-lines-deep-Q-learning-pytorch

MathFun

The-beauty-of-Math

Detectors

Vietnam-time-use-visualization