• Stars
    star
    5,940
  • Rank 6,447 (Top 0.2 %)
  • Language
    C++
  • License
    Other
  • Created about 10 years ago
  • Updated 3 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

A fast compressor/decompressor

Snappy, a fast compressor/decompressor.

Build Status

Introduction

Snappy is a compression/decompression library. It does not aim for maximum compression, or compatibility with any other compression library; instead, it aims for very high speeds and reasonable compression. For instance, compared to the fastest mode of zlib, Snappy is an order of magnitude faster for most inputs, but the resulting compressed files are anywhere from 20% to 100% bigger. (For more information, see "Performance", below.)

Snappy has the following properties:

  • Fast: Compression speeds at 250 MB/sec and beyond, with no assembler code. See "Performance" below.
  • Stable: Over the last few years, Snappy has compressed and decompressed petabytes of data in Google's production environment. The Snappy bitstream format is stable and will not change between versions.
  • Robust: The Snappy decompressor is designed not to crash in the face of corrupted or malicious input.
  • Free and open source software: Snappy is licensed under a BSD-type license. For more information, see the included COPYING file.

Snappy has previously been called "Zippy" in some Google presentations and the like.

Performance

Snappy is intended to be fast. On a single core of a Core i7 processor in 64-bit mode, it compresses at about 250 MB/sec or more and decompresses at about 500 MB/sec or more. (These numbers are for the slowest inputs in our benchmark suite; others are much faster.) In our tests, Snappy usually is faster than algorithms in the same class (e.g. LZO, LZF, QuickLZ, etc.) while achieving comparable compression ratios.

Typical compression ratios (based on the benchmark suite) are about 1.5-1.7x for plain text, about 2-4x for HTML, and of course 1.0x for JPEGs, PNGs and other already-compressed data. Similar numbers for zlib in its fastest mode are 2.6-2.8x, 3-7x and 1.0x, respectively. More sophisticated algorithms are capable of achieving yet higher compression rates, although usually at the expense of speed. Of course, compression ratio will vary significantly with the input.

Although Snappy should be fairly portable, it is primarily optimized for 64-bit x86-compatible processors, and may run slower in other environments. In particular:

  • Snappy uses 64-bit operations in several places to process more data at once than would otherwise be possible.
  • Snappy assumes unaligned 32 and 64-bit loads and stores are cheap. On some platforms, these must be emulated with single-byte loads and stores, which is much slower.
  • Snappy assumes little-endian throughout, and needs to byte-swap data in several places if running on a big-endian platform.

Experience has shown that even heavily tuned code can be improved. Performance optimizations, whether for 64-bit x86 or other platforms, are of course most welcome; see "Contact", below.

Building

You need the CMake version specified in CMakeLists.txt or later to build:

git submodule update --init
mkdir build
cd build && cmake ../ && make

Usage

Note that Snappy, both the implementation and the main interface, is written in C++. However, several third-party bindings to other languages are available; see the home page for more information. Also, if you want to use Snappy from C code, you can use the included C bindings in snappy-c.h.

To use Snappy from your own C++ program, include the file "snappy.h" from your calling file, and link against the compiled library.

There are many ways to call Snappy, but the simplest possible is

snappy::Compress(input.data(), input.size(), &output);

and similarly

snappy::Uncompress(input.data(), input.size(), &output);

where "input" and "output" are both instances of std::string.

There are other interfaces that are more flexible in various ways, including support for custom (non-array) input sources. See the header file for more information.

Tests and benchmarks

When you compile Snappy, the following binaries are compiled in addition to the library itself. You do not need them to use the compressor from your own library, but they are useful for Snappy development.

  • snappy_benchmark contains microbenchmarks used to tune compression and decompression performance.
  • snappy_unittests contains unit tests, verifying correctness on your machine in various scenarios.
  • snappy_test_tool can benchmark Snappy against a few other compression libraries (zlib, LZO, LZF, and QuickLZ), if they were detected at configure time. To benchmark using a given file, give the compression algorithm you want to test Snappy against (e.g. --zlib) and then a list of one or more file names on the command line.

If you want to change or optimize Snappy, please run the tests and benchmarks to verify you have not broken anything.

The testdata/ directory contains the files used by the microbenchmarks, which should provide a reasonably balanced starting point for benchmarking. (Note that baddata[1-3].snappy are not intended as benchmarks; they are used to verify correctness in the presence of corrupted data in the unit test.)

Contributing to the Snappy Project

In addition to the aims listed at the top of the README Snappy explicitly supports the following:

  1. C++11
  2. Clang (gcc and MSVC are best-effort).
  3. Low level optimizations (e.g. assembly or equivalent intrinsics) for:
  4. x86
  5. x86-64
  6. ARMv7 (32-bit)
  7. ARMv8 (AArch64)
  8. Supports only the Snappy compression scheme as described in format_description.txt.
  9. CMake for building

Changes adding features or dependencies outside of the core area of focus listed above might not be accepted. If in doubt post a message to the Snappy discussion mailing list.

We are unlikely to accept contributions to the build configuration files, such as CMakeLists.txt. We are focused on maintaining a build configuration that allows us to test that the project works in a few supported configurations inside Google. We are not currently interested in supporting other requirements, such as different operating systems, compilers, or build systems.

Contact

Snappy is distributed through GitHub. For the latest version and other information, see https://github.com/google/snappy.

More Repositories

1

material-design-icons

Material Design icons by Google
49,605
star
2

guava

Google core libraries for Java
Java
48,313
star
3

zx

A tool for writing better scripts
JavaScript
37,928
star
4

styleguide

Style guides for Google-originated open-source projects
HTML
36,487
star
5

leveldb

LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.
C++
33,564
star
6

material-design-lite

Material Design Components in HTML/CSS/JS
HTML
32,283
star
7

googletest

GoogleTest - Google Testing and Mocking Framework
C++
32,215
star
8

jax

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Python
27,869
star
9

python-fire

Python Fire is a library for automatically generating command line interfaces (CLIs) from absolutely any Python object.
Python
26,112
star
10

comprehensive-rust

This is the Rust course used by the Android team at Google. It provides you the material to quickly teach Rust.
Rust
25,973
star
11

mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
C++
25,320
star
12

gson

A Java serialization/deserialization library to convert Java Objects into JSON and back
Java
22,856
star
13

flatbuffers

FlatBuffers: Memory Efficient Serialization Library
C++
21,883
star
14

iosched

The Google I/O Android App
Kotlin
21,792
star
15

ExoPlayer

An extensible media player for Android
Java
21,465
star
16

eng-practices

Google's Engineering Practices documentation
19,741
star
17

web-starter-kit

Web Starter Kit - a workflow for multi-device websites
HTML
18,434
star
18

flexbox-layout

Flexbox for Android
Kotlin
18,141
star
19

fonts

Font files available from Google Fonts, and a public issue tracker for all things Google Fonts
HTML
17,563
star
20

filament

Filament is a real-time physically based rendering engine for Android, iOS, Windows, Linux, macOS, and WebGL2
C++
16,946
star
21

cadvisor

Analyzes resource usage and performance characteristics of running containers.
Go
16,273
star
22

libphonenumber

Google's common Java, C++ and JavaScript library for parsing, formatting, and validating international phone numbers.
C++
15,728
star
23

gvisor

Application Kernel for Containers
Go
15,047
star
24

WebFundamentals

Former git repo for WebFundamentals on developers.google.com
JavaScript
13,842
star
25

yapf

A formatter for Python files
Python
13,560
star
26

tink

Tink is a multi-language, cross-platform, open source library that provides cryptographic APIs that are secure, easy to use correctly, and hard(er) to misuse.
Java
13,318
star
27

deepdream

13,212
star
28

brotli

Brotli compression format
TypeScript
12,921
star
29

guetzli

Perceptual JPEG encoder
C++
12,863
star
30

guice

Guice (pronounced 'juice') is a lightweight dependency injection framework for Java 11 and above, brought to you by Google.
Java
12,342
star
31

wire

Compile-time Dependency Injection for Go
Go
12,222
star
32

blockly

The web-based visual programming editor.
TypeScript
12,067
star
33

sanitizers

AddressSanitizer, ThreadSanitizer, MemorySanitizer
C
10,754
star
34

grumpy

Grumpy is a Python to Go source code transcompiler and runtime.
Go
10,464
star
35

or-tools

Google's Operations Research tools:
C++
10,405
star
36

dopamine

Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
Jupyter Notebook
10,367
star
37

auto

A collection of source code generators for Java.
Java
10,234
star
38

go-github

Go library for accessing the GitHub v3 API
Go
9,941
star
39

oss-fuzz

OSS-Fuzz - continuous fuzzing for open source software.
Shell
9,859
star
40

go-cloud

The Go Cloud Development Kit (Go CDK): A library and tools for open cloud development in Go.
Go
9,314
star
41

sentencepiece

Unsupervised text tokenizer for Neural Network-based text generation.
C++
8,657
star
42

re2

RE2 is a fast, safe, thread-friendly alternative to backtracking regular expression engines like those used in PCRE, Perl, and Python. It is a C++ library.
C++
8,190
star
43

traceur-compiler

Traceur is a JavaScript.next-to-JavaScript-of-today compiler
JavaScript
8,182
star
44

tsunami-security-scanner

Tsunami is a general purpose network security scanner with an extensible plugin system for detecting high severity vulnerabilities with high confidence.
Java
8,086
star
45

trax

Trax β€” Deep Learning with Clear Code and Speed
Python
7,943
star
46

skia

Skia is a complete 2D graphic library for drawing Text, Geometries, and Images.
C++
7,874
star
47

benchmark

A microbenchmark support library
C++
7,812
star
48

android-classyshark

Android and Java bytecode viewer
Java
7,468
star
49

pprof

pprof is a tool for visualization and analysis of profiling data
Go
7,408
star
50

closure-compiler

A JavaScript checker and optimizer.
Java
7,245
star
51

agera

Reactive Programming for Android
Java
7,227
star
52

accompanist

A collection of extension libraries for Jetpack Compose
Kotlin
7,190
star
53

magika

Detect file content types with deep learning
Python
7,171
star
54

flutter-desktop-embedding

Experimental plugins for Flutter for Desktop
C++
7,109
star
55

latexify_py

A library to generate LaTeX expression from Python code.
Python
6,953
star
56

diff-match-patch

Diff Match Patch is a high-performance library in multiple languages that manipulates plain text.
Python
6,918
star
57

lovefield

Lovefield is a relational database for web apps. Written in JavaScript, works cross-browser. Provides SQL-like APIs that are fast, safe, and easy to use.
JavaScript
6,847
star
58

glog

C++ implementation of the Google logging module
C++
6,797
star
59

jsonnet

Jsonnet - The data templating language
Jsonnet
6,742
star
60

error-prone

Catch common Java mistakes as compile-time errors
Java
6,690
star
61

model-viewer

Easily display interactive 3D models on the web and in AR!
TypeScript
6,473
star
62

gops

A tool to list and diagnose Go processes currently running on your system
Go
6,375
star
63

draco

Draco is a library for compressing and decompressing 3D geometric meshes and point clouds. It is intended to improve the storage and transmission of 3D graphics.
C++
6,188
star
64

automl

Google Brain AutoML
Jupyter Notebook
6,150
star
65

gopacket

Provides packet processing capabilities for Go
Go
6,082
star
66

physical-web

The Physical Web: walk up and use anything
Java
6,017
star
67

j2objc

A Java to iOS Objective-C translation tool and runtime.
Java
5,976
star
68

grafika

Grafika test app
Java
5,964
star
69

ios-webkit-debug-proxy

A DevTools proxy (Chrome Remote Debugging Protocol) for iOS devices (Safari Remote Web Inspector).
C
5,848
star
70

osv-scanner

Vulnerability scanner written in Go which uses the data provided by https://osv.dev
Go
5,821
star
71

seesaw

Seesaw v2 is a Linux Virtual Server (LVS) based load balancing platform.
Go
5,599
star
72

seq2seq

A general-purpose encoder-decoder framework for Tensorflow
Python
5,577
star
73

EarlGrey

🍡 iOS UI Automation Test Framework
Objective-C
5,570
star
74

flax

Flax is a neural network library for JAX that is designed for flexibility.
Python
5,493
star
75

google-java-format

Reformats Java source code to comply with Google Java Style.
Java
5,366
star
76

wireit

Wireit upgrades your npm/pnpm/yarn scripts to make them smarter and more efficient.
TypeScript
5,280
star
77

battery-historian

Battery Historian is a tool to analyze battery consumers using Android "bugreport" files.
Go
5,249
star
78

clusterfuzz

Scalable fuzzing infrastructure.
Python
5,201
star
79

bbr

5,156
star
80

gumbo-parser

An HTML5 parsing library in pure C99
HTML
5,141
star
81

syzkaller

syzkaller is an unsupervised coverage-guided kernel fuzzer
Go
5,111
star
82

git-appraise

Distributed code review system for Git repos
Go
5,090
star
83

google-authenticator

Open source version of Google Authenticator (except the Android app)
Java
5,077
star
84

gemma.cpp

lightweight, standalone C++ inference engine for Google's Gemma models.
C++
5,076
star
85

uuid

Go package for UUIDs based on RFC 4122 and DCE 1.1: Authentication and Security Services.
Go
4,994
star
86

gemma_pytorch

The official PyTorch implementation of Google's Gemma models
Python
4,920
star
87

gts

β˜‚οΈ TypeScript style guide, formatter, and linter.
TypeScript
4,890
star
88

closure-library

Google's common JavaScript library
JavaScript
4,837
star
89

cameraview

[DEPRECATED] Easily integrate Camera features into your Android app
Java
4,734
star
90

grr

GRR Rapid Response: remote live forensics for incident response
Python
4,641
star
91

liquidfun

2D physics engine for games
C++
4,559
star
92

pytype

A static type analyzer for Python code
Python
4,528
star
93

gxui

An experimental Go cross platform UI library.
Go
4,450
star
94

bloaty

Bloaty: a size profiler for binaries
C++
4,386
star
95

clasp

πŸ”— Command Line Apps Script Projects
TypeScript
4,336
star
96

ko

Build and deploy Go applications on Kubernetes
Go
4,329
star
97

santa

A binary authorization and monitoring system for macOS
Objective-C
4,288
star
98

google-ctf

Google CTF
Go
4,246
star
99

tamperchrome

Tamper Dev is an extension that allows you to intercept and edit HTTP/HTTPS requests and responses as they happen without the need of a proxy. Works across all operating systems (including Chrome OS).
TypeScript
4,148
star
100

end-to-end

End-To-End is a crypto library to encrypt, decrypt, digital sign, and verify signed messages (implementing OpenPGP)
JavaScript
4,126
star