  • Stars: 10,796
  • Rank: 3,155 (Top 0.07 %)
  • Language: C++
  • License: MIT License
  • Created: about 2 years ago
  • Updated: 3 months ago

Repository Details

Tensor library for machine learning

ggml

Note that this project is under active development.
Some of the development is currently happening in the llama.cpp and whisper.cpp repositories.

Features

  • Written in C
  • 16-bit float support
  • Integer quantization support (4-bit, 5-bit, 8-bit, etc.)
  • Automatic differentiation
  • ADAM and L-BFGS optimizers
  • Optimized for Apple Silicon
  • Uses AVX / AVX2 intrinsics on x86 architectures
  • Uses VSX intrinsics on ppc64 architectures
  • No third-party dependencies
  • Zero memory allocations during runtime
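
To make the integer quantization feature listed above more concrete, here is a minimal sketch of block-wise 4-bit quantization. It assumes a simple symmetric scheme (one float scale per block of 32 values, each value rounded to an integer in [-8, 7]). This illustrates the general idea only and is not ggml's actual storage format; the function names are hypothetical.

```python
def quantize_block(values):
    """Quantize a block of floats to signed 4-bit integers plus one scale.

    Assumption: symmetric scheme, scale chosen so the largest magnitude maps to 7.
    """
    amax = max(abs(v) for v in values)
    scale = amax / 7.0 if amax > 0 else 1.0
    # Round to the nearest integer and clamp to the signed 4-bit range.
    q = [max(-8, min(7, round(v / scale))) for v in values]
    return scale, q


def dequantize_block(scale, q):
    """Reconstruct approximate floats from the scale and the 4-bit integers."""
    return [scale * x for x in q]


# 32 sample "weights" (hypothetical data for illustration).
block = [0.1 * i - 1.6 for i in range(32)]
scale, q = quantize_block(block)
restored = dequantize_block(scale, q)
max_err = max(abs(a - b) for a, b in zip(block, restored))
print(f"scale={scale:.4f} max_err={max_err:.4f}")
```

In this scheme each block stores one scale plus 32 small integers, and the reconstruction error per value is at most half the scale.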

Updates

Whisper inference (example)

With ggml you can efficiently run Whisper inference on the CPU.

Memory requirements:

Model    Disk      Mem
tiny     75 MB     ~280 MB
base     142 MB    ~430 MB
small    466 MB    ~1.0 GB
medium   1.5 GB    ~2.6 GB
large    2.9 GB    ~4.7 GB
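
As a small worked example of reading the table above, the sketch below picks the largest Whisper model whose approximate memory requirement fits a given RAM budget. The figures are copied from the table, treating 1 GB as 1000 MB (an assumption); the helper name and the budgets are hypothetical.

```python
# Approximate runtime memory in MB, copied from the table above
# (assumption: 1 GB is treated as 1000 MB).
WHISPER_MEM_MB = {
    "tiny": 280,
    "base": 430,
    "small": 1000,
    "medium": 2600,
    "large": 4700,
}


def largest_model_that_fits(budget_mb):
    """Return the name of the largest model that fits the budget, or None."""
    fitting = [(mem, name) for name, mem in WHISPER_MEM_MB.items() if mem <= budget_mb]
    return max(fitting)[1] if fitting else None


for budget in (256, 2048, 8192):
    print(budget, "MB ->", largest_model_that_fits(budget))
```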

GPT inference (example)

With ggml you can efficiently run GPT-2 and GPT-J inference on the CPU.

Here is how to run the example programs:

# Build ggml + examples
git clone https://github.com/ggerganov/ggml
cd ggml
mkdir build && cd build
cmake ..
make -j4 gpt-2 gpt-j

# Run the GPT-2 small 117M model
../examples/gpt-2/download-ggml-model.sh 117M
./bin/gpt-2 -m models/gpt-2-117M/ggml-model.bin -p "This is an example"

# Run the GPT-J 6B model (requires 12GB disk space and 16GB CPU RAM)
../examples/gpt-j/download-ggml-model.sh 6B
./bin/gpt-j -m models/gpt-j-6B/ggml-model.bin -p "This is an example"

# Install Python dependencies
python3 -m pip install -r ../requirements.txt

# Run the Cerebras-GPT 111M model
# Download from: https://huggingface.co/cerebras
python3 ../examples/gpt-2/convert-cerebras-to-ggml.py /path/to/Cerebras-GPT-111M/
./bin/gpt-2 -m /path/to/Cerebras-GPT-111M/ggml-model-f16.bin -p "This is an example"
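
If you want to call one of the example binaries from a script, the commands above can be wrapped roughly as follows. This is a sketch that only uses the `-m` and `-p` flags shown above; the helper names are hypothetical, and it assumes you run it from the `build` directory after building and downloading a model.

```python
import os
import subprocess


def build_command(binary, model_path, prompt):
    """Assemble the argument list using the -m and -p flags shown above."""
    return [binary, "-m", model_path, "-p", prompt]


def run_example(binary, model_path, prompt):
    """Run the example binary and return its stdout, or None if it is not built yet."""
    if not os.path.isfile(binary):
        return None
    result = subprocess.run(
        build_command(binary, model_path, prompt),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


cmd = build_command("./bin/gpt-2", "models/gpt-2-117M/ggml-model.bin", "This is an example")
print(" ".join(cmd))
```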

The inference speeds that I get for the different models on my 32GB MacBook M1 Pro are as follows:

Model    Size     Time / Token
GPT-2    117M     5 ms
GPT-2    345M     12 ms
GPT-2    774M     23 ms
GPT-2    1558M    42 ms
GPT-J    6B       125 ms
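
To put these timings in perspective, the short sketch below converts time per token into tokens per second and estimates how long a 200-token completion would take. The timings are copied from the table above; the 200-token length and the function names are illustrative assumptions.

```python
# Time per token in milliseconds, copied from the table above.
MS_PER_TOKEN = {
    "GPT-2 117M": 5,
    "GPT-2 345M": 12,
    "GPT-2 774M": 23,
    "GPT-2 1558M": 42,
    "GPT-J 6B": 125,
}


def tokens_per_second(ms_per_token):
    """Throughput is the reciprocal of the per-token latency."""
    return 1000.0 / ms_per_token


def seconds_for(n_tokens, ms_per_token):
    """Estimated wall-clock time to generate n_tokens."""
    return n_tokens * ms_per_token / 1000.0


for model, ms in MS_PER_TOKEN.items():
    print(f"{model}: {tokens_per_second(ms):.1f} tokens/s, "
          f"200 tokens in {seconds_for(200, ms):.1f} s")
```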

For more information, check out the corresponding programs in the examples folder.

Using cuBLAS

# fix the path to point to your CUDA compiler
cmake -DGGML_CUBLAS=ON -DCMAKE_CUDA_COMPILER=/usr/local/cuda-12.1/bin/nvcc ..

Using CLBlast

cmake -DGGML_CLBLAST=ON ..

Resources

More Repositories

1. llama.cpp (C++, 66,953 stars): LLM inference in C/C++
2. whisper.cpp (C++, 33,134 stars): Port of OpenAI's Whisper model in C/C++
3. kbd-audio (C++, 7,569 stars): 🎤⌨️ Acoustic keyboard eavesdropping
4. imtui (C++, 2,421 stars): ImTui: Immediate Mode Text-based User Interface C++ Library
5. wave-share (C++, 1,955 stars): Serverless, peer-to-peer, local file sharing through sound
6. ggwave (C++, 1,496 stars): Tiny data-over-sound library
7. imgui-ws (C++, 373 stars): Dear ImGui over WebSockets
8. dot-to-ascii (HTML, 371 stars): Graphviz to ASCII converter using Graph::Easy
9. hnterm (C++, 144 stars): 📃 Hacker News in the terminal
10. ggmorse (C++, 136 stars): Morse code decoding library
11. whisper.spm (C, 120 stars): whisper.cpp package for the Swift Package Manager
12. tweet2doom (Shell, 83 stars): Tweet to play Doom
13. wave-gui (C++, 70 stars): Yet another data-over-sound tool
14. wtf-tui (C++, 66 stars): Text-based UI tool for configuring the WTF terminal dashboard
15. incppect (C++, 63 stars): Inspect C++ memory in the browser
16. ggweb (C, 54 stars): Template for C++ GUI apps that can run in the browser
17. ggterm (Vim Script, 34 stars): Terminal configuration for C++ development with Vim
18. diff-challenge (Shell, 27 stars): Is this even possible?
19. wordle-bg (C++, 27 stars): 🇧🇬 Wordle clone in Bulgarian
20. hnreplies (C++, 22 stars): Scrape Hacker News replies
21. intervals (C++, 21 stars): Downsampling array of intervals
22. wave-em (C++, 20 stars): Data over sound in the browser
23. typing-battles (C++, 18 stars): A multiplayer typing game (server: C++/WebSockets, client: JS)
24. morse-meme (Shell, 16 stars): Meme generator in Bash
25. tweet2doom-data (JavaScript, 16 stars): @tweet2doom data and tools
26. ggwave-java (Java, 14 stars): Minimal Java app for Android using ggwave
27. ggwave-arduino (C++, 14 stars): Mirror of ggwave used in the Arduino Library Manager
28. imgui-em (C++, 14 stars): Emscripten port of Dear ImGui (not maintained)
29. ggwave-objc (Objective-C, 10 stars): Minimal Objective-C app for iOS using ggwave
30. ggwords (C++, 10 stars): Generate language n-gram statistics
31. ggerganov.github.io (JavaScript, 10 stars)
32. imtui-template (C++, 9 stars): Template repo for simple ImTui apps
33. ggwave-spm (C++, 8 stars): ggwave package for the Swift Package Manager
34. asteroid-generator (C++, 7 stars): The demo generates and renders asteroids floating in space. The shape and the texture of the generated asteroids are procedurally generated. The space background is procedurally generated as well.
35. ggint (C++, 6 stars): Poor man's big integer arithmetic operations
36. ggsock (C++, 5 stars): Non-blocking sockets wrapper
37. hnguessr (C++, 5 stars): Guess the Hacker News titles
38. ggimg (C++, 4 stars): Poor man's 2d and 3d image operations
39. load-em (HTML, 4 stars): Load a local file in C++ Emscripten program
40. the-story (HTML, 4 stars): Collaborative storytelling experiment
41. tweet2btc (Shell, 3 stars): BTC price predictions via Twitter Polls
42. site-wave-share (JavaScript, 3 stars): A dedicated page for the wave-share tool:
43. ocl-lights (C++, 3 stars): Pixel perfect 2D shadows on the GPU
44. font-rasterizer (C, 3 stars): Simple TTF rasterizer
45. toto-check (HTML, 3 stars): See how many times you could have won the lottery
46. puzzle-solver (C++, 2 stars)
47. OSSRH-94491 (Jira, 1 star)
48. homebrew-ggerganov (Ruby, 1 star): My Homebrew tap for projects that are not notable enough