  • Stars
    2
  • Language
    Python
  • License
    GNU General Publi...
  • Created 6 months ago
  • Updated 6 months ago

Repository Details

Byte-Pair Encoding tokenizer for training large language models on huge datasets

More Repositories

1

bulma-helpers

Library with Functional / Atomic CSS classes for Bulma framework
CSS
277
star
2

gdansk-ai

Full stack voice chatbot
TypeScript
191
star
3

basic-vue-chat

💬 Easy to use Vue chat
Vue
73
star
4

pff

Examine your Internet connection quality and status in terminal
Rust
16
star
5

text-to-ml

Homebrew AutoML using natural text. Like HuggingGPT + LangChain + type inference
Python
12
star
6

asr-dysarthria

Research on Automatic Speech Recognition for dysarthric speech
Jupyter Notebook
6
star
7

0x6b73746b

🐱 Tree-Walk Interpreter
Rust
4
star
8

spellbook

🪄 Shell and Powershell scripts registry
TypeScript
3
star
9

tinyconv

🌸 Image processing with kernel and convolution
Python
3
star
10

xiexie

🎐 Static site generator
Rust
2
star
11

openai-polling

Polling library for OpenAI Threads API
TypeScript
2
star
12

funkcja

🖇️ Log function body and name to browser's console
TypeScript
1
star
13

repeat-sh

🕰️ Repeat any shell command with interval
Shell
1
star
14

mac-addresses-scanner

🔦 Find MAC addresses of devices connected to a network
Python
1
star
15

mlp-classifier

🤗 Multi-Layer Perceptron artificial neural network
Python
1
star
16

deep-learning-pytorch

Deep Learning architectures implemented in PyTorch Lightning
Python
1
star
17

csv-to-ml

🧌 Upload a CSV file and get an ML model
TypeScript
1
star
18

bpe.c

Byte-Pair Encoding tokenizer for training large language models on huge datasets. I don't know C, so most of the code comes from AI :D I hope to learn by rewriting it and making changes, fixes, etc.
C
1
star