vwxyzjn/invalid-action-masking

Stars: 88
Rank: 375,465 (Top 8%)
Language: Python
License: MIT License
Created over 4 years ago
Updated over 1 year ago

Source code for the paper "A Closer Look at Invalid Action Masking in Policy Gradient Algorithms"

cleanrl

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

portwarden

Create Encrypted Backups of Your Bitwarden Vault with Attachments

ppo-implementation-details

The source code for the blog post "The 37 Implementation Details of Proximal Policy Optimization"

lm-human-preference-details

RLHF implementation details of OAI's 2019 codebase

cleanba

CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL

summarize_from_feedback_details

PPO-Implementation-Deep-Dive

DEPRECATED - please visit https://github.com/vwxyzjn/ppo-implementation-details

gym-microrts-paper

The source code for the gym-microrts paper.

a2c_is_a_special_case_of_ppo

A2C is a special case of PPO!

SC2AI

Integrates Tensorforce and OpenAI Gym to train StarCraft II game agents.

Jupyter Notebook

jupyter_disqus

Add Disqus to your Jupyter notebook.

gym-pysc2

Gym wrapper for pysc2

envpool-cleanrl

action-guidance

ppo-atari-metrics

vectorized-value-methods

[WIP] Vectorized architecture for value-based methods such as DQN and DDPG

entity-ppo-demo

CS583FinalProject

Resume-master

minimal-adam-layer-norm-bug-repro

embedding_projector

RLControlSkipFrames

launcha

Launcha is a simple Docker-based cloud job launcher.

gym_minigrid

CS618

Jupyter Notebook

validate-new-gym-mujoco-envs

vuetify-parallax-starter2

envpool-xla-cleanrl

cleanba-test

envpool_bug

Sentiment-Analysis-LSTM

Uses a neural network to classify movie reviews by sentiment

Jupyter Notebook

aws-sagemaker-example

Jupyter Notebook

LP_optimization_python

Linear Programming for Optimal Scheduling by Using Gurobipy

CS583