  • Stars: 1,831
  • Rank: 25,349 (Top 0.5 %)
  • Language: C++
  • License: MIT License
  • Created over 7 years ago
  • Updated 3 months ago

Repository Details

The Universal Storage Engine

TileDB is a powerful engine for storing and accessing dense and sparse multi-dimensional arrays, which can help you model any complex data efficiently. It is an embeddable C++ library that works on Linux, macOS, and Windows. It is open-sourced under the permissive MIT License and is developed and maintained by TileDB, Inc. To distinguish this project from other TileDB offerings, we often refer to it as TileDB Embedded.

TileDB includes the following features:

  • Support for both dense and sparse arrays
  • Support for dataframes and key-value stores (via sparse arrays)
  • Cloud storage (AWS S3, Google Cloud Storage, Azure Blob Storage)
  • Chunked (tiled) arrays
  • Multiple compression, encryption and checksum filters
  • Fully multi-threaded implementation
  • Parallel IO
  • Data versioning (rapid updates, time traveling)
  • Array metadata
  • Array groups
  • Numerous APIs on top of the C++ library
  • Numerous integrations (Spark, Dask, MariaDB, GDAL, etc.)
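
To make the "chunked (tiled) arrays" feature more concrete, here is a minimal, library-independent sketch of how a dense array domain can be divided into fixed-size tiles, so that a slice only needs to touch the tiles it overlaps. It does not use the TileDB API. The function names `tile_of` and `tiles_for_slice` are hypothetical, and the 0-based coordinates and half-open ranges are assumptions made for illustration only.

```python
from itertools import product


def tile_of(coord, tile_extent):
    """Return the per-dimension tile index containing a 0-based cell coordinate."""
    return tuple(c // t for c, t in zip(coord, tile_extent))


def tiles_for_slice(start, stop, tile_extent):
    """List the tile indices that overlap the half-open box [start, stop)."""
    ranges = [
        range(lo // t, (hi - 1) // t + 1)
        for lo, hi, t in zip(start, stop, tile_extent)
    ]
    return list(product(*ranges))


# A 4x4 array with 2x2 tiles has four tiles: (0, 0), (0, 1), (1, 0), (1, 1).
print(tile_of((3, 1), (2, 2)))                   # (1, 0)
print(tiles_for_slice((0, 0), (2, 3), (2, 2)))   # [(0, 0), (0, 1)]
```

In this toy model, reading rows 0-1 and columns 0-2 touches two of the four tiles, which is the basic reason chunking can reduce the amount of data fetched for a slice.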

You can use TileDB to store data for a variety of applications, such as genomics, geospatial, finance, and more. The power of TileDB stems from the fact that any data can be modeled efficiently as either a dense or a sparse multi-dimensional array, which is the format used internally by most data science tooling. By storing your data and metadata in TileDB arrays, you abstract away the pains of data storage and management while still accessing the data efficiently with your favorite data science tool.
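
As a rough illustration of the dense versus sparse distinction, the sketch below models both kinds of array with plain Python containers. It is not the TileDB API: the helpers `slice_dense` and `slice_sparse` are hypothetical, and the (day, ticker) dimensions are an invented example of how a dataframe-like table might map onto a sparse array.

```python
# Dense array: every cell in the domain holds a value (here, a 2x3 grid).
dense = [
    [1, 2, 3],
    [4, 5, 6],
]

# Sparse array: only non-empty cells are stored, keyed by their coordinates.
# A dataframe fits this model when some columns act as dimensions; here the
# hypothetical dimensions are (day, ticker) and the attribute is a price.
sparse = {
    (1, "AAA"): 10.0,
    (1, "BBB"): 20.5,
    (3, "AAA"): 11.2,
}


def slice_dense(array, rows, cols):
    """Return the sub-grid covering the half-open row and column ranges."""
    return [row[cols[0]:cols[1]] for row in array[rows[0]:rows[1]]]


def slice_sparse(cells, day_range, tickers):
    """Return stored cells whose day is in [start, stop) and whose ticker matches."""
    start, stop = day_range
    return {
        coord: value
        for coord, value in cells.items()
        if start <= coord[0] < stop and coord[1] in tickers
    }


print(slice_dense(dense, (0, 2), (1, 3)))      # [[2, 3], [5, 6]]
print(slice_sparse(sparse, (1, 3), {"AAA"}))   # {(1, 'AAA'): 10.0}
```

The dense slice always returns a full sub-grid, while the sparse slice returns only the cells that exist, which is why sparse arrays suit tables and key-value data where most coordinates are empty.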

Quickstart

You can install the TileDB C++ library as follows:

# Conda (macOS, Linux, Windows):
$ conda install -c conda-forge tiledb

(see links below for Python, R, and other API installation instructions)

Alternatively, you can use the Docker image we provide:

$ docker pull tiledb/tiledb
$ docker run -it tiledb/tiledb

We include several examples to help you get started.

Documentation

You can find the detailed TileDB documentation at https://docs.tiledb.com.

Building from source

Please see building from source in the documentation.

Format Specification

The TileDB data format is open-source and can be found here.

Application-specific Packages

APIs

The TileDB team maintains a variety of APIs built on top of the C++ library:

Integrations

TileDB is also integrated with several popular databases and data science tools:

Get involved

TileDB Embedded is an open-source project and welcomes all forms of contributions. Contributors to the project should read over the contribution docs for more information.

We'd love to hear from you. Drop us a line at [email protected], visit our forum or contact form, or follow us on Twitter to stay informed of updates and news.

More Repositories

1. TileDB-Py (Python, 184 stars): Python interface to the TileDB storage engine
2. TileDB-R (R, 101 stars): R interface to TileDB: The Modern Database
3. TileDB-VCF (C++, 84 stars): Efficient variant-call data storage and retrieval library using the TileDB storage library.
4. TileDB-Go (Go, 51 stars): Go interface to the TileDB storage manager
5. TileDB-Vector-Search (Jupyter Notebook, 51 stars): Cloud-native vector similarity search and storage with efficient, serverless scale-out
6. TileDB-Trino (Java, 29 stars): TileDB Connector for TrinoDB
7. TileDB-MariaDB (C++, 28 stars): MyTile is a MariaDB storage engine for accessing TileDB arrays
8. TileDB-Java (Java, 26 stars): Java JNI interface to the TileDB storage engine
9. TileDB-ML (Python, 23 stars): TileDB integrations for machine learning data and model I/O (PyTorch, TensorFlow, Scikit-Learn)
10. TileDB-Examples (Jupyter Notebook, 19 stars): Notebooks which are dedicated examples for TileDB
11. TileDB-CF-Py (Python, 19 stars): TileDB interface with awareness of the CF metadata conventions
12. TileDB-CSharp (C#, 15 stars): C# API for TileDB Embedded
13. TileDB-Spark (Java, 15 stars): Spark interface to the TileDB storage manager
14. TileDB-BioImaging (Python, 15 stars): Package providing bioimaging functionality using TileDB. Source of the tiledb-bioimg Python package.
15. TileDB-Cloud-Py (Python, 14 stars): Python interface to TileDB Cloud REST API
16. shinybg (R, 14 stars): Run and manage Shiny applications as background processes.
17. tiledbsc (R, 14 stars): Single-cell data structures in TileDB
18. TileDB-xarray (Python, 13 stars): A TileDB backend for xarray.
19. TileDB-Cloud-JS (TypeScript, 10 stars): TileDB Cloud JavaScript Client
20. TileDB-Viz (TypeScript, 9 stars): A collection of packages to help create beautiful visualizations from TileDB arrays
21. napari-tiledb-bioimg (Python, 9 stars): TileDB BioImaging plugin for Napari
22. TileDB-Presto (Java, 8 stars): TileDB Connector for PrestoDB
23. tiledb-benchmarks (Jupyter Notebook, 8 stars): TileDB benchmark scripts and output
24. TileDB-Geospatial (Dockerfile, 7 stars): TileDB geospatial docker image
25. tiledb-rs (Rust, 7 stars): Rust Bindings for TileDB [early wip]
26. TileDB-PyBabylonJS (Python, 7 stars): BabylonJS based viewer for three dimensional TileDB arrays
27. TileDB-CLI (Python, 6 stars): CLI to the TileDB array storage manager
28. scverse-ml-workshop-2024 (Jupyter Notebook, 6 stars): Scripts/Notebooks for "Training models on atlas-scale single-cell datasets" at scverse Conference 2024
29. tiledb-vcf-feedstock (Shell, 5 stars)
30. TileDB-FOSS4G-2021 (Jupyter Notebook, 5 stars): Resources for the 2021 FOSS4G Conference
31. rwinlib-tiledb (C++, 4 stars)
32. TileDB-Docker (Dockerfile, 4 stars): Docker files for TileDB
33. TileDB-Cloud-JDBC (Java, 4 stars): JDBC driver for TileDB-Cloud
34. TileDB-NYSE-Ingestor (C++, 3 stars): Ingestor for loading NYSE data into tiledb
35. archived-TileDB-FastQ (C++, 3 stars): FastQ ingestor using TileDB storage format
36. tiledbsoma-feedstock (Shell, 3 stars): A conda-smithy repository for tiledbsoma.
37. 7k (C, 3 stars): Teledyne Marine 7k protocol for marine sensor interfacing.
38. TileDB-Cloud-JS-demo (JavaScript, 3 stars)
39. htslib-feedstock (Shell, 2 stars)
40. cellxgene-census-feedstock (2 stars)
41. TileDB-Cloud-PythonDB (Python, 2 stars): TileDB Connector for Python DB API 2.0
42. TileDB-Cloud-Jupyter-Contents (Python, 2 stars): Jupyter notebook (lab) content manager based on TileDB Cloud
43. TileDB-Cloud-Java (Java, 2 stars): TileDB-Cloud-Java contains the Java client for the TileDB Cloud Service
44. homebrew-stable (Ruby, 2 stars): Homebrew tap for TileDB
45. conda-forge-nightly-controller (Shell, 2 stars): Centralized nightly CI builds for TileDB conda feedstocks
46. TileDB-Cloud-API-Spec (2 stars): TileDB Cloud REST Specification
47. TileDB-MariaDB-Embedded-Example (C++, 2 stars): Instructions and examples of using MariaDB embedded with the MyTile storage engine
48. tiledb-vector-search-feedstock (2 stars)
49. pyread7k (Python, 2 stars): Pyread7k is a library for reading 7k files. It provides a high-level interface to the data in a file, with an API that balances ergonomics against being easy to correlate with the Data Format Definition.
50. TileDB-Cloud-R (R, 1 star): TileDB-Cloud-R contains the R client for the TileDB Cloud Service
51. pydata-demo (Jupyter Notebook, 1 star): Notebook from PyData NYC
52. MariaDB-ServerFest2021-Benchmarks (Python, 1 star)
53. github-actions (Python, 1 star)
54. TileDB-Notebooks (Jupyter Notebook, 1 star): TileDB Jupyter Notebooks
55. centralized-tiledb-nightlies (Shell, 1 star): Centralized nightly builds of TileDB stack
56. pysam-feedstock (Batchfile, 1 star)
57. TileDB-SAR (Python, 1 star): SAR processing with TileDB
58. TileDB-Cloud-CSharp (C#, 1 star): This repository contains the CSharp client for the TileDB Cloud Service.
59. TileDB-Tableau-Connector (JavaScript, 1 star): Custom Tableau connector for the TileDB-Cloud JDBC driver
60. bioimg-compression-demo (Python, 1 star): Demonstration and side-by-side comparison of histopathology image conversion from SVS to TileDB