deep_rl_zoo
A collection of Deep Reinforcement Learning algorithms implemented with PyTorch to solve Atari games and classic control tasks like CartPole, LunarLander, and MountainCar.
InstructLLaMA
Implements pre-training, supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF) to train and fine-tune the LLaMA2 model to follow human instructions, similar to InstructGPT or ChatGPT, but on a much smaller scale.
muzero
A PyTorch implementation of DeepMind's MuZero agent.
SAP-UI5-Development-Re-Introduction
This is the official source code for the Udemy course SAP UI5 Development Re-Introduction.
miniGPT
An attempt to implement pre-training and fine-tuning of the GPT-2 model for research and educational purposes.
MM-LLaMA
Bring multimodality to the LLaMA model by leveraging ImageBind as the modal encoder. This project supports vision input (both images and short videos) to the LLaMA model, with text output generated by LLaMA.
QLoRA-LLM
A simple custom QLoRA implementation for fine-tuning a large language model (LLM) with basic tools such as PyTorch and Bitsandbytes, completely decoupled from Hugging Face.
DPO-LLaMA
A clean implementation of direct preference optimization (DPO) to train the LLaMA 2 model to align with human preferences.