  • Stars: 1,739
  • Rank: 25,697 (Top 0.6 %)
  • Language: C++
  • License: MIT License
  • Created about 7 years ago
  • Updated about 2 months ago

Repository Details

The Universal Storage Engine

TileDB is a powerful engine for storing and accessing dense and sparse multi-dimensional arrays, which can help you model any complex data efficiently. It is an embeddable C++ library that works on Linux, macOS, and Windows. It is open-sourced under the permissive MIT License, developed and maintained by TileDB, Inc. To distinguish this project from other TileDB offerings, we often refer to it as TileDB Embedded.

TileDB includes the following features:

  • Support for both dense and sparse arrays
  • Support for dataframes and key-value stores (via sparse arrays)
  • Cloud storage (AWS S3, Google Cloud Storage, Azure Blob Storage)
  • Chunked (tiled) arrays
  • Multiple compression, encryption and checksum filters
  • Fully multi-threaded implementation
  • Parallel IO
  • Data versioning (rapid updates, time traveling)
  • Array metadata
  • Array groups
  • Numerous APIs on top of the C++ library
  • Numerous integrations (Spark, Dask, MariaDB, GDAL, etc.)
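
To make the "chunked (tiled) arrays" feature above more concrete, here is a minimal sketch of how a cell of a 2D dense array could be mapped to a tile and to an offset inside that tile. It uses only the C++ standard library. The names `TileLocation` and `locate_cell` are hypothetical, and the row-major ordering and evenly dividing tile extents are assumptions made for illustration; this is not TileDB's API or its actual on-disk layout.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type for illustration; not part of the TileDB API.
struct TileLocation {
    std::size_t tile_id;       // which tile the cell falls into (row-major over tiles)
    std::size_t cell_in_tile;  // position of the cell inside that tile (row-major)
};

// Maps cell (row, col) of a rows x cols dense array, chunked into
// tile_rows x tile_cols tiles, to its tile and its offset within the tile.
// Assumption: the tile extents divide the array extents evenly.
TileLocation locate_cell(std::size_t row, std::size_t col,
                         std::size_t rows, std::size_t cols,
                         std::size_t tile_rows, std::size_t tile_cols) {
    assert(tile_rows > 0 && tile_cols > 0);
    assert(rows % tile_rows == 0 && cols % tile_cols == 0);
    assert(row < rows && col < cols);

    const std::size_t tiles_per_row = cols / tile_cols;
    const std::size_t tile_id =
        (row / tile_rows) * tiles_per_row + (col / tile_cols);
    const std::size_t cell_in_tile =
        (row % tile_rows) * tile_cols + (col % tile_cols);
    return {tile_id, cell_in_tile};
}
```

For a 4x4 array with 2x2 tiles, `locate_cell(1, 3, 4, 4, 2, 2)` places the cell in tile 1 at offset 3. Grouping cells this way lets a reader fetch only the tiles that overlap a requested region.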

You can use TileDB to store data in a variety of applications, such as genomics, geospatial, finance, and more. The power of TileDB stems from the fact that any data can be modeled efficiently as either a dense or a sparse multi-dimensional array, which is the format used internally by most data science tooling. By storing your data and metadata in TileDB arrays, you abstract away the pains of data storage and management while still accessing the data efficiently with your favorite data science tool.
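
The difference between the dense and sparse models can be sketched with two small standard-library containers. In the dense case every cell of the domain holds a value. In the sparse case only non-empty cells are stored, keyed by their coordinates, which is also how a key-value store can be expressed. `DenseGrid` and `SparseGrid` are hypothetical names introduced for this illustration; they are not TileDB classes and say nothing about how TileDB stores data internally.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical illustration types; these are not TileDB classes.

// Dense model: every cell of the rows x cols domain holds a value.
class DenseGrid {
public:
    DenseGrid(std::size_t rows, std::size_t cols)
        : cols_(cols), cells_(rows * cols, 0) {}

    // Row-major access to a cell; the cell always exists.
    int& at(std::size_t row, std::size_t col) {
        assert(col < cols_ && row * cols_ + col < cells_.size());
        return cells_[row * cols_ + col];
    }

    std::size_t stored_cells() const { return cells_.size(); }

private:
    std::size_t cols_;
    std::vector<int> cells_;
};

// Sparse model: only non-empty cells are stored, keyed by their coordinates.
class SparseGrid {
public:
    void set(std::size_t row, std::size_t col, int value) {
        cells_[std::make_pair(row, col)] = value;
    }

    // Returns a pointer to the stored value, or nullptr if the cell is empty.
    const int* find(std::size_t row, std::size_t col) const {
        auto it = cells_.find(std::make_pair(row, col));
        return it == cells_.end() ? nullptr : &it->second;
    }

    std::size_t stored_cells() const { return cells_.size(); }

private:
    std::map<std::pair<std::size_t, std::size_t>, int> cells_;
};
```

A 4x4 `DenseGrid` always stores 16 cells, whereas a `SparseGrid` with a single value at coordinates (1000000, 2000000) stores exactly one. That is why very large, mostly empty domains suit the sparse model.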

Quickstart

You can install the TileDB C++ library as follows:

# Conda (macOS, Linux, Windows):
$ conda install -c conda-forge tiledb

(see links below for Python, R, and other API installation instructions)

Alternatively, you can use the Docker image we provide:

$ docker pull tiledb/tiledb
$ docker run -it tiledb/tiledb

We include several examples to help you get started.

Documentation

You can find the detailed TileDB documentation at https://docs.tiledb.com.

Building from source

Please see building from source in the documentation.

Format Specification

The TileDB data format is open-source and can be found here.

Application-specific Packages

APIs

The TileDB team maintains a variety of APIs built on top of the C++ library.

Integrations

TileDB is also integrated with several popular databases and data science tools.

Get involved

TileDB Embedded is an open-source project and welcomes all forms of contributions. Contributors to the project should read over the contribution docs for more information.

We'd love to hear from you. Drop us a line by email, visit our forum or contact form, or follow us on Twitter to stay informed of updates and news.

More Repositories

1. TileDB-Py (Python, 178 stars): Python interface to the TileDB storage engine
2. TileDB-R (R, 97 stars): R interface to TileDB: The Modern Database
3. TileDB-VCF (C++, 79 stars): Efficient variant-call data storage and retrieval library using the TileDB storage library.
4. TileDB-Go (Go, 46 stars): Go interface to the TileDB storage manager
5. TileDB-Vector-Search (Jupyter Notebook, 40 stars): Cloud-native vector similarity search and storage with efficient, serverless scale-out
6. TileDB-Trino (Java, 28 stars): TileDB Connector for TrinoDB
7. TileDB-MariaDB (C++, 23 stars): MyTile is a MariaDB storage engine for accessing TileDB arrays
8. TileDB-Java (Java, 23 stars): Java JNI interface to the TileDB storage engine
9. TileDB-ML (Python, 21 stars): All machine-learning-oriented functionality TileDB supports.
10. TileDB-Examples (Jupyter Notebook, 18 stars): Notebooks which are dedicated examples for TileDB
11. TileDB-CF-Py (Python, 17 stars): TileDB interface with awareness of the CF metadata conventions
12. TileDB-Spark (Java, 15 stars): Spark interface to the TileDB storage manager
13. TileDB-CSharp (C#, 14 stars): C# API for TileDB Embedded
14. shinybg (R, 14 stars): Run and manage Shiny applications as background processes.
15. tiledbsc (R, 14 stars): Single-cell data structures in TileDB
16. TileDB-BioImaging (Python, 14 stars): Package providing bioimaging functionality using TileDB. Source of the tiledb-bioimg Python package.
17. TileDB-xarray (Python, 13 stars): A TileDB backend for xarray.
18. TileDB-Cloud-Py (Python, 12 stars): Python interface to TileDB Cloud REST API
19. TileDB-Cloud-JS (TypeScript, 11 stars): TileDB Cloud JavaScript Client
20. tiledb-benchmarks (Jupyter Notebook, 8 stars): TileDB benchmark scripts and output
21. TileDB-Geospatial (Dockerfile, 7 stars): TileDB geospatial Docker image
22. TileDB-Presto (Java, 7 stars): TileDB Connector for PrestoDB
23. TileDB-Viz (TypeScript, 7 stars): A collection of packages to help users create beautiful visualizations from TileDB arrays
24. napari-tiledb-bioimg (Python, 7 stars): TileDB BioImaging plugin for Napari
25. TileDB-CLI (Python, 6 stars): CLI to the TileDB array storage manager
26. TileDB-PyBabylonJS (Python, 6 stars): BabylonJS-based viewer for three-dimensional TileDB arrays
27. tiledb-vcf-feedstock (Shell, 5 stars)
28. TileDB-FOSS4G-2021 (Jupyter Notebook, 5 stars): Resources for the 2021 FOSS4G Conference
29. rwinlib-tiledb (C++, 4 stars)
30. TileDB-Cloud-JDBC (Java, 4 stars): JDBC driver for TileDB-Cloud
31. TileDB-NYSE-Ingestor (C++, 3 stars): Ingestor for loading NYSE data into TileDB
32. TileDB-FastQ (C++, 3 stars): FastQ ingestor using TileDB storage format
33. tiledbsoma-feedstock (Shell, 3 stars): A conda-smithy repository for tiledbsoma.
34. 7k (C, 3 stars): Teledyne Marine 7k protocol for marine sensor interfacing.
35. TileDB-Cloud-JS-demo (JavaScript, 3 stars)
36. TileDB-Cloud-Jupyter-Contents (Python, 2 stars): Jupyter notebook (lab) content manager based on TileDB Cloud
37. cellxgene-census-feedstock (2 stars)
38. TileDB-Cloud-PythonDB (Python, 2 stars): TileDB Connector for Python DB API 2.0
39. TileDB-Cloud-Java (Java, 2 stars): TileDB-Cloud-Java contains the Java client for the TileDB Cloud Service
40. TileDB-Docker (Dockerfile, 2 stars): Docker files for TileDB
41. conda-forge-nightly-controller (Shell, 2 stars): Centralized nightly CI builds for TileDB conda feedstocks
42. TileDB-Cloud-API-Spec (2 stars): TileDB Cloud REST Specification
43. pyread7k (Python, 2 stars): Pyread7k is a library for reading 7k files. It provides a high-level interface to the data in a file, with an API that balances being ergonomic against being easy to correlate with the Data Format Definition.
44. tiledb-vector-search-feedstock (2 stars)
45. htslib-feedstock (Shell, 1 star)
46. TileDB-Cloud-R (R, 1 star): TileDB-Cloud-R contains the R client for the TileDB Cloud Service
47. pydata-demo (Jupyter Notebook, 1 star): Notebook from PyData NYC
48. github-actions (Python, 1 star)
49. TileDB-Notebooks (Jupyter Notebook, 1 star): TileDB Jupyter Notebooks
50. pysam-feedstock (Batchfile, 1 star)
51. homebrew-stable (Ruby, 1 star): Homebrew tap for TileDB
52. TileDB-SAR (Python, 1 star): SAR processing with TileDB
53. TileDB-Cloud-CSharp (C#, 1 star): This repository contains the CSharp client for the TileDB Cloud Service.
54. TileDB-MariaDB-Embedded-Example (C++, 1 star): Instructions and examples of using MariaDB embedded with the MyTile storage engine