  • Stars: 2
  • Language: Python
  • License: MIT License
  • Created 11 months ago
  • Updated 9 months ago

Repository Details

Fused 4-bit AdamW in CUDA

More Repositories

1

Ranger-Deep-Learning-Optimizer

Ranger - a synergistic optimizer using RAdam (Rectified Adam), Gradient Centralization and LookAhead in one codebase
Python
1,154
star
2

Ranger21

Ranger deep learning optimizer rewrite to use newest components
Python
320
star
3

Best-Deep-Learning-Optimizers

Collection of the latest and greatest deep learning optimizers (for PyTorch), suitable for CNN and NLP
Jupyter Notebook
201
star
4

mish

Mish Deep Learning Activation Function for PyTorch / FastAI
Jupyter Notebook
161
star
5

res2net-plus

Res2Net architecture with improved stem and Mish activation function
Python
136
star
6

Ranger-Mish-ImageWoof-5

Repo to build on / reproduce the record-breaking Ranger-Mish-SelfAttention setup on the FastAI ImageWoof dataset (5 epochs)
Jupyter Notebook
116
star
7

training-detr

Unofficial Colab on how to train DETR, the intelligent object detector, with your own dataset. DETR = Detection Transformer
Jupyter Notebook
40
star
8

transformer_central

Various transformers for FSDP research
Jupyter Notebook
31
star
9

FAdam_PyTorch

an implementation of FAdam (Fisher Adam) in PyTorch
Python
31
star
10

Ranger22

Testing various improvements to Ranger21 for 2022
Python
18
star
11

mrnet-fastai

Deep Learning CNN using FastAI for the Stanford MRNet Knee MRI diagnosis challenge
Jupyter Notebook
16
star
12

triton_kernels_for_fun_and_profit

Custom kernels in Triton language for accelerating LLMs
Python
15
star
13

Thunder-Detr

(Unofficial) customized fork of DETR, optimized for intelligent object detection on 'real world' custom datasets
Jupyter Notebook
12
star
14

fsdp_llm

FSDP optimizations for LLM training
Python
6
star
15

t5_11

Houses our model example of fine-tuning an 11B T5 with FSDP
Python
6
star
16

transformer_framework

framework for plug and play of various transformers (vision and nlp) with FSDP
Python
6
star
17

FTSwishPlus

FTSwish with mean shifting added to increase performance
Python
6
star
18

LightRelu

Customized PyTorch implementation of LiSHT (linear scaled hyperbolic tangent) activation function for deep learning
Python
5
star
19

hyper_efficient_optimizers

Development of hyper efficient optimizers that can match/exceed AdamW, while using reduced memory
Python
5
star
20

fsdp_review

Some eval and profile routines for fsdp
4
star
21

auto-adaptive-ai

auto adaptive framework for intrinsic hyperparameter selection, adaptive padding, normalized weights
Jupyter Notebook
4
star
22

TRelu

An improved activation function for deep learning - Threshold Relu, or TRelu
Python
4
star
23

sigma_reparam

Sigma Reparam for Transformers (based on Apple's paper)
Python
3
star
24

EfficientNet-PyTorch

Unofficial port of Google's new EfficientNet to PyTorch and FastAI
Jupyter Notebook
3
star
25

RangerQH-Testing

Repo for running RangerQH + Res2NetPlus with LIP Pooling
Jupyter Notebook
3
star
26

facial-keypoint-detection

Facial keypoint detection CNN - custom architecture using partial convolution padding
Jupyter Notebook
3
star
27

AutoOpt-for-FastAI

Integrate Ebay's AutoOpt Deep Learning Optimizer into the FastAI framework
3
star
28

skycraft2

Minecraft in the sky, written in Python
Python
2
star
29

perception_tools

additional utils for working with Unity perception package
Jupyter Notebook
2
star
30

PolarBearLLM

Testing new Transformer, MoE, and TransNormer features
Python
2
star
31

unet-seg

Jupyter Notebook
2
star
32

FTSwish

Flattened Threshold Swish Activation function - PyTorch implementation
Python
2
star
33

coordinate_clipped_Optimizers

coordinate wise clipped Optimizers in PyTorch
Python
2
star
34

snowfall

helpful image handling utils - abstracts various file and opencv and pil features into result oriented functions
Python
2
star
35

style-transfer-vgg

Artistic Style transfer using VGG19
Jupyter Notebook
2
star
36

cuda-kernel-dev

In-progress CUDA kernels
Cuda
2
star
37

Curriculum-Learning-Dropout

Implementation of Curriculum Learning Dropout for FastAI framework
Jupyter Notebook
2
star
38

medExam

Training an AI with FSDP to take the US medical exam
1
star
39

5D-Compiler

Auto-Parallelization Compiler using 4D Parallel + Checkpointing (5D)
Python
1
star
40

aot_fsdp

When AOT Autograd meets FSDP = large models train faster
1
star
41

alibi_positional_embeddings

Alibi in PyTorch
Python
1
star
42

optimal-lr-finder

Automated optimal learning rate finder for PyTorch deep learning with FastAI
1
star
43

ft_linen

experiments with flax re-design to interop with pytorch
Python
1
star
44

linear-graph-slam

Linear Graph SLAM
Jupyter Notebook
1
star
45

bfloat_optimizer

Pure bfloat AdamW+ tweaks
Python
1
star
46

snake-id

FastAI deep learning classifier for snakes
1
star
47

Thunder

AI framework for flexible training and results review (pytorch, vision and tabular)
1
star
48

t5_finetuning

T5 and ExT5 fine tuning
Jupyter Notebook
1
star
49

pretrainer

FSDP codebase for pretraining large language models (LLM)
Python
1
star
50

Fusion

Advanced yet low code framework for fully sharded distributed training
Python
1
star
51

hsdp_demo

Tutorial repo for PyTorch FSDP running HSDP on single node.
Python
1
star
52

image-captioning-cnn-lstm

Image captioning system combining CNN + LSTM for caption generation
Jupyter Notebook
1
star
53

self-tuning-ai

implementation of self tuning networks in pytorch, based on https://arxiv.org/pdf/1903.03088v1.pdf
1
star
54

triton_flashv2_alibi

working repo for Triton based Flash2 supporting alibi pos embeddings
Python
1
star
55

Pytorch_train_test_split

Function to randomize and split training data into train/test, from same directory
Python
1
star