  • Stars 202
  • Rank 192,579 (Top 4 %)
  • Language Cuda
  • Created over 2 years ago
  • Updated 4 months ago

Repository Details

The CUDA version of the RWKV language model ( https://github.com/BlinkDL/RWKV-LM )

RWKV-CUDA

Towards RWKV-4 (see the wkv folder)

I have a basic RWKV-4 kernel in the wkv folder. Let's optimize it.

Experiment 1 - depthwise_conv1d - 20x faster than pytorch

The formula:

w.shape = (C, T)
k.shape = (B, C, T)
out.shape = (B, C, T)
out[b][c][t] = sum_u{ w[c][(T-1)-(t-u)] * k[b][c][u] }
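The formula can be checked against a slow reference before trusting any optimized kernel. The sketch below is a pure-Python version written only for illustration; the function name depthwise_conv1d_reference is hypothetical and not part of this repository. It assumes u runs over 0..t, since that is the only range for which the index (T-1)-(t-u) stays inside w[c].

```python
def depthwise_conv1d_reference(w, k):
    """Naive reference for out[b][c][t] = sum_u w[c][(T-1)-(t-u)] * k[b][c][u].

    w is a C x T nested list and k is a B x C x T nested list.
    Assumption: u runs over 0..t, the only range that keeps the
    index (T-1)-(t-u) within the bounds of w[c].
    """
    B, C, T = len(k), len(k[0]), len(k[0][0])
    out = [[[0.0] * T for _ in range(C)] for _ in range(B)]
    for b in range(B):
        for c in range(C):
            for t in range(T):
                acc = 0.0
                for u in range(t + 1):
                    # The weight index grows with u, so the last entry of
                    # w[c] always multiplies the current position (u == t).
                    acc += w[c][(T - 1) - (t - u)] * k[b][c][u]
                out[b][c][t] = acc
    return out
```

Each channel c uses only its own row of w, which is what makes the convolution depthwise, and position t only reads k at positions up to t. This version takes O(B * C * T^2) time, so it is only suitable for comparing outputs on small shapes.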

pytorch = fwd 94ms bwd 529ms

CUDA kernel v0 = fwd 45ms bwd 84ms (simple)

CUDA kernel v1 = fwd 17ms bwd 43ms (shared memory)

CUDA kernel v2 = fwd 13ms bwd 31ms (float4)

CUDA kernel v3 = fwd 3.4ms bwd 23ms (B-group)

More tests on an RTX 3090:

pytorch = fwd 14ms bwd 65ms

CUDA kernel v3 = fwd 0.8ms bwd 5.5ms
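To see how the "20x faster" headline relates to the numbers, the speedups can be recomputed from the timings listed above. This is a small arithmetic sketch; the dictionary keys and the speedups helper are names chosen here for illustration, and "first benchmark" is just a label for the first set of timings, whose GPU is not named in the text.

```python
# Timings in milliseconds as (forward, backward), copied from the lists above.
timings_ms = {
    "first benchmark": {"pytorch": (94.0, 529.0), "cuda_v3": (3.4, 23.0)},
    "RTX3090": {"pytorch": (14.0, 65.0), "cuda_v3": (0.8, 5.5)},
}


def speedups(entry):
    """Return (forward, backward, forward+backward) speedup of CUDA v3 over PyTorch."""
    pt_fwd, pt_bwd = entry["pytorch"]
    cu_fwd, cu_bwd = entry["cuda_v3"]
    combined = (pt_fwd + pt_bwd) / (cu_fwd + cu_bwd)
    return pt_fwd / cu_fwd, pt_bwd / cu_bwd, combined


for name, entry in timings_ms.items():
    fwd, bwd, total = speedups(entry)
    print(f"{name}: fwd {fwd:.1f}x, bwd {bwd:.1f}x, fwd+bwd {total:.1f}x")
```

On these figures the v3 kernel is roughly 27.6x faster forward and 23x faster backward in the first benchmark, and roughly 17.5x and 11.8x on the RTX 3090.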

How to use: run python run.py, which compiles everything for you (pip install ninja first if you don't have Ninja).

More Repositories

1. RWKV-LM (Python, 11,940 stars)
   RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.

2. ChatRWKV (Python, 9,333 stars)
   ChatRWKV is like ChatGPT but powered by the RWKV (100% RNN) language model, and open source.

3. AI-Writer (Python, 2,791 stars)
   AI novel writing: generates fantasy (xuanhuan) and romance web fiction, and more. A Chinese pretrained generative model that uses my RWKV model, similar to GPT-2. RWKV for Chinese novel generation.

4. Hua (351 stars)
   Hua is an AI image editor with Stable Diffusion (and more).

5. BlinkDL.github.io (88 stars)
   A collection of State of the Art results in AI / ML / DL / RL / CV / NLP.

6. BlinkDL (JavaScript, 82 stars)
   A minimalist deep learning library in Javascript using WebGL + asm.js. Run convolutional neural networks in your browser.

7. YYDZ (79 stars)
   The Ding Zhen universe: a collection of "Yi Yan Ding Zhen" images, with more than two thousand pictures so far. The YYDZ (Yi Yan Ding Zhen / One Eye Ding Zhen) dataset.

8. RWKV-v2-RNN-Pile (Python, 65 stars)
   RWKV-v2-RNN trained on the Pile. See https://github.com/BlinkDL/RWKV-LM for details.

9. BookCNN (Jupyter Notebook, 55 stars)
   The book "Deep Convolutional Networks: Principles and Practice" (《深度卷积网络:原理与实践》) is now on sale on Taobao, Tmall, JD, and Dangdang. The code from the book can be downloaded here.

10. LinearAttentionArena (Python, 50 stars)
    Here we will test various linear attention designs.

11. SmallInitEmb (Python, 45 stars)
    LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence

12. WorldModel (40 stars)
    Let us make Psychohistory (as in Asimov) a reality, and accessible to everyone. Useful for LLM grounding and games / fiction / business / finance / governance, and can align agents with humans too.

13. LM-Trick-Questions (31 stars)
    Here we collect trick questions and failed tasks for open source LLMs to improve them.

14. Basis (Python, 27 stars)
    The Basis Programming Language

15. BlinkToDo (JavaScript, 25 stars)
    A minimalist ToDo.txt page. If your to-do list has more than a hundred items, try this minimalist txt-based task manager.

16. AntiAging (11 stars)
    List of Anti-aging Research

17. RWKV.com (HTML, 10 stars)

18. Nala (9 stars)
    The Nala markup, to turn a "Natural Language" sentence into a code-like statement.

19. PathTracingJS (JavaScript, 7 stars)
    Path tracing demo with JS in your web browser.

20. BlinkColorTheme (CSS, 4 stars)
    A colorful theme for HTML+JS+CSS.

21. Model_Leaderboard (HTML, 3 stars)
    Leaderboard of AI models.

22. MathBook (2 stars)
    A fairly systematic set of mathematics notes (graduate level).

23. BasisLang.com (HTML, 1 star)
    BasisLang.com