License: MIT License

PuffinDB

Serverless HTAP cloud data platform powered by Arrow × DuckDB × Iceberg

Accelerate DuckDB with 10,000 AWS Lambda functions running on your own VPC

Note: This repository only contains preliminary design documents (Cf. Roadmap)

Kickoff meetup: Rovinj, Croatia, March 29-31, 2023

Introduction

Architecture

If you are using DuckDB client-side with any client application, adding the PuffinDB extension will let you:

PuffinDB is an initiative of STOIC, and not DuckDB Labs or the DuckDB Foundation.

DuckDB and the DuckDB logo are trademarks of the DuckDB Foundation.

PuffinDB and the PuffinDB logo are trademarks of STOIC (Sutoiku, Inc.).

STOIC is a member of the DuckDB Foundation.

Beliefs

Rationale

Many excellent distributed SQL engines are available today. Why do we need yet another one?

Outline

Features

Deployment

PuffinDB will support four incremental deployment options:

Philosophy

  • Developer-first: no nonsense, zero friction
  • Lowest latency: every millisecond counts
  • Elastic design: from kilobytes to petabytes

FAQ

Please check our Frequently Asked Questions.

Roadmap

Please check our Roadmap.

Sponsors

This project was initiated and is currently funded by STOIC.

Please check our sponsors page for sponsorship opportunities.

Credits

This project leverages several DuckDB features implemented by DuckDB Labs and funded by STOIC:

  • Support for Apache Arrow streaming when using Node.js deployment (released)
  • Support for user-defined functions when using Node.js deployment (released)
  • Support for map-reduced queries with binary map results using new COMBINE function (released)
  • Support for import of Hive partitions (released)
  • Support for partitioned exports with COPY ... TO ... PARTITION_BY (released)
  • Support for SQL query parsing and stringifying through standard query API (under development)
  • Support for Azure Blob Storage (development starting soon)
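
To illustrate the idea behind map-reduced queries with binary map results, here is a minimal Python sketch. It does not use DuckDB or its COMBINE function; the binary layout and the names `map_partition`, `combine`, and `finalize` are assumptions made up for this example. Each worker reduces its partition to a small binary state, and the states are merged without revisiting the raw rows.

```python
import struct
from functools import reduce

# Hypothetical binary layout for a partial AVG state: (sum: float64, count: int64).
# This is an illustration only, not DuckDB's actual state format.
STATE = struct.Struct("<dq")


def map_partition(values):
    """Map step: aggregate one partition into a compact binary state."""
    return STATE.pack(float(sum(values)), len(values))


def combine(left, right):
    """Combine step: merge two binary states without touching raw rows."""
    sum_left, count_left = STATE.unpack(left)
    sum_right, count_right = STATE.unpack(right)
    return STATE.pack(sum_left + sum_right, count_left + count_right)


def finalize(state):
    """Final step: turn the merged state into the average (None if empty)."""
    total, count = STATE.unpack(state)
    return total / count if count else None


# Three partitions, as if each had been processed by a separate worker.
partitions = [[1, 2, 3], [4, 5], [6]]
states = [map_partition(p) for p in partitions]
average = finalize(reduce(combine, states))
print(average)
```

Because `combine` is associative, the states can be merged in any grouping, which is what would let many workers reduce their results in a tree rather than through a single coordinator.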

We are also considering funding the following projects:

  • Support for SELECT * THROUGH 'https://myPuffinDB.com/' FROM remoteTable syntax (Cf. EDDI)
  • Support for FIXED fixed-length character strings (Cf. #3)
  • Support for C and S tpch-dbgen options in the tpch extension

This project was initially inspired by this excellent article from Alon Agmon.

Discussions

Most discussions about this project are currently taking place on the @ghalimi Twitter account.

For a lower-frequency alternative, please follow @PuffinDB.

Notes

PuffinDB should not be confused with the Puffin file format.

Be stoic, be kind, be cool. Like a puffin...

Sutoiku, Inc.