cleanrl
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
portwarden
Create Encrypted Backups of Your Bitwarden Vault with Attachments
ppo-implementation-details
The source code for the blog post The 37 Implementation Details of Proximal Policy Optimization
lm-human-preference-details
RLHF implementation details of OAI's 2019 codebase
cleanba
CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL
invalid-action-masking
Source Code for A Closer Look at Invalid Action Masking in Policy Gradient Algorithms
summarize_from_feedback_details
PPO-Implementation-Deep-Dive
DEPRECATED - please visit https://github.com/vwxyzjn/ppo-implementation-details
gym-microrts-paper
The source code for the gym-microrts paper.
a2c_is_a_special_case_of_ppo
A2C is a special case of PPO!
SC2AI
Integrated Tensorforce and OpenAI Gym to train SC II game agents.
jupyter_disqus
Add Disqus to your Jupyter notebook.
gym-pysc2
Gym wrapper for pysc2
envpool-cleanrl
action-guidance
ppo-atari-metrics
vectorized-value-methods
[WIP] Vectorized architecture for value-based methods such as DQN and DDPG
entity-ppo-demo
CS583FinalProject
Resume-master
minimal-adam-layer-norm-bug-repro
embedding_projector
RLControlSkipFrames
launcha
Launcha is a simple Docker-based cloud job launcher.
gym_minigrid
CS618
validate-new-gym-mujoco-envs
vuetify-parallax-starter2
envpool-xla-cleanrl
cleanba-test
envpool_bug
Sentiment-Analysis-LSTM
Uses a neural network to classify movie reviews based on sentiment
aws-sagemaker-example
LP_optimization_python
Linear Programming for Optimal Scheduling by Using Gurobipy