deep_rl_zoo
A collection of Deep Reinforcement Learning algorithms implemented with PyTorch to solve Atari games and classic control tasks like CartPole, LunarLander, and MountainCar.
alpha_zero
A PyTorch implementation of DeepMind's AlphaZero agent to play Go and Gomoku board games.
InstructLLaMA
Implements pre-training, supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF) to train and fine-tune the LLaMA 2 model to follow human instructions, similar to InstructGPT or ChatGPT but on a much smaller scale.
muzero
A PyTorch implementation of DeepMind's MuZero agent.
SAP-UI5-Development-Re-Introduction
This is the official source code for the Udemy course SAP UI5 Development Re-Introduction.
miniGPT
An attempt to implement pre-training and fine-tuning of the GPT-2 model for research and educational purposes.
QLoRA-LLM
A simple custom QLoRA implementation for fine-tuning a large language model (LLM) with basic tools such as PyTorch and Bitsandbytes, completely decoupled from Hugging Face.
DPO-LLaMA
A clean implementation of direct preference optimization (DPO) to train the LLaMA 2 model to align with human preferences.