• Stars
    star
    130
  • Rank 277,575 (Top 6 %)
  • Language
    Go
  • License
    MIT License
  • Created over 3 years ago
  • Updated over 1 year ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Go sqlite3 http vfs: query sqlite databases over http with range headers

sqlite3vfshttp: a Go sqlite VFS for querying databases over http(s)

sqlite3vfshttp is a sqlite3 VFS for querying remote databases over http(s). This allows you to perform queries without needing to download the complete database first.

Your database must be hosted on a webserver that supports HTTP range requests (such as Amazon S3).

Example

See sqlitehttpcli/sqlitehttpcli.go for a simple CLI tool that is able to query a remotely hosted sqlite database.

Usage

	vfs := sqlite3vfshttp.New(*url)

	err := sqlite3vfs.RegisterVFS("httpvfs", vfs)
	if err != nil {
		log.Fatalf("register vfs err: %s", err)
	}

	db, err := sql.Open("sqlite3", "not_a_read_name.db?vfs=httpvfs&mode=ro")
	if err != nil {
		log.Fatalf("open db err: %s", err)
	}

Querying a database in S3

The original purpose of this library was to allow an AWS Lambda function to be able to query a sqlite database stored in S3 without downloading the entire database.

This is possible even for private files stored in S3 by generating a presigned URL and passing that to this library. That allows the client to make HTTP Get range requests without it needing to know how to sign S3 requests.

Building a loadable extension for the sqlite3 cli

The sqlite3 cli supports runtime loadable extensions. We can build sqlite3vfshttp as a shared library, and then load it from the sqlite3 cli to interactively query sqlite databases over http connections. The shared library code is located in the sqlite3http-ext directory. See the sqlite3http-ext/README.md for more details.

Demo

I've uploaded a 30MB sqlite database to a publicly accessible webserver for testing, based on "Balance of payments international investment position: March 2021 quarter – CSV" from https://www.stats.govt.nz/large-datasets/csv-files-for-download/. The schema is:

CREATE TABLE csv (series_reference,
period,
data_value,
suppressed,
status,
units,
magntude,
subject,
grp,
series_title_1);

You can query this dataset from the sqlite3 cli tool using the shared library extension:

$ cd sqlite3http-ext

# build httpvfs.so shared library
$ make
go build -tags SQLITE3VFS_LOADABLE_EXT -o sqlite3http_ext.a -buildmode=c-archive sqlite3http_ext.go
gcc -g -fPIC -shared -o httpvfs.so sqlite3http_ext.c sqlite3http_ext.a

# set url of sqlite3 db as environment variable SQLITE3VFSHTTP_URL:
$ export SQLITE3VFSHTTP_URL='https://www.sanford.io/demo.db'

$ sqlite3
SQLite version 3.31.1 2020-01-27 19:55:54
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> -- load extention
sqlite> .load ./httpvfs
sqlite> -- open db using vfs=httpvfs, note you must use the sqlite uri syntax which starts with file://
sqlite> .open file:///foo.db?vfs=httpvfs
sqlite> -- query from remote db
sqlite> SELECT * from csv where period > '2010' limit 10;
series_reference      period      data_value  suppressed  status      units       magntude    subject                    grp                                                   series_title_1
--------------------  ----------  ----------  ----------  ----------  ----------  ----------  -------------------------  ----------------------------------------------------  --------------
BOPQ.S06AC000000000A  2010.03     17463                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2010.06     17260                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2010.09     15419                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2010.12     17088                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2011.03     18516                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2011.06     18835                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2011.09     16390                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2011.12     18748                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2012.03     18477                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2012.06     18270                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual

Alternatively, you can query this dataset using the sqlitehttpcli example tool:

# query the sqlite schema table
$ ./sqlitehttpcli -url 'https://www.sanford.io/demo.db' -query 'select * from main.sqlite_master'
row: [table csv csv 2 CREATE TABLE csv (series_reference,
period,
data_value,
suppressed,
status,
units,
magntude,
subject,
grp,
series_title_1)]

# get 10 rows from the dataset
./sqlitehttpcli -url 'https://www.sanford.io/demo.db' -query "select * from csv limit 10"
row: [BOPQ.S06AC000000000A 1971.06 426  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1971.09 435  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1971.12 360  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1972.03 417  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1972.06 528  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1972.09 471  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1972.12 437  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1973.03 607  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1973.06 666  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1973.09 578  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]

# get 10 rows where the period is after 2010
$ ./sqlitehttpcli -url 'https://www.sanford.io/demo.db' -query "select * from csv where period > '2010' limit 10"
row: [BOPQ.S06AC000000000A 2010.03 17463  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2010.06 17260  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2010.09 15419  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2010.12 17088  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2011.03 18516  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2011.06 18835  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2011.09 16390  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2011.12 18748  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2012.03 18477  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2012.06 18270  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]

More Repositories

1

wormhole-william

End-to-end encrypted file transfer. A magic wormhole CLI and API in Go (golang).
Go
730
star
2

tpm-fido

A WebAuthn/U2F token protected by a TPM (Go/Linux)
Go
179
star
3

donutdb

Store and query a sqlite db directly backed by DynamoDB.
Go
134
star
4

wormhole-william-mobile

End-to-end encrypted file transfer for Android and iOS. A Magic Wormhole Mobile client.
Go
104
star
5

memfs

In-memory implementation of Go's `io/fs.FS` interface
Go
85
star
6

emacs-oauth

An oauth library for emacs
Emacs Lisp
42
star
7

sqlite3vfs

Go sqlite3 vfs
Go
29
star
8

cloudtrail-tattletail

AWS Cloudtrail event alerting lambda function. Send alerts to Slack, Email, or SNS.
Go
19
star
9

node-proxy

HTTP and SSL Proxy Using Node.js
JavaScript
16
star
10

node-mjpeg-test-server

Example of an mjpeg server written in node.js
JavaScript
15
star
11

emacs-yammer

A simple yammer client for emacs
Emacs Lisp
14
star
12

csv2sqlite

Go
9
star
13

cassandra-visual-ring

Visualization and planning tool for Cassandra rings
JavaScript
7
star
14

mirabox

Globalscale Mirabox Info
6
star
15

uhid

Linux uhid api in Go.
Go
6
star
16

tpm-ssh-ca

Go
5
star
17

ctapkey

Go
5
star
18

lencode

Go (golang) length prefix encoder and decoder package
Go
4
star
19

lambda-email

SES Lambda email forwarding and programmatic routing service
Go
3
star
20

ruby-mode

Emacs ruby-mode with modified indentation rules.
Emacs Lisp
3
star
21

awsip

Go package to check if ip address belongs to AWS
Go
3
star
22

getlogin

go implementation of getlogin(3)
Go
3
star
23

slack-channel-history

Go
3
star
24

github-stars

CLI tool to list all starred repos by user
Go
3
star
25

ec2price

EC2 price comparison cli tool
Go
2
star
26

wg-captive-browser

Connect to captive portals without disabling wireguard on linux
Shell
2
star
27

what-the-fido

https://what-the-fido.sanford.io Identify FIDO key by its attestation certificate
Go
2
star
28

awsso-agent

awsso is a credential agent for caching aws sso credentials (similar to ssh-agent)
Go
2
star
29

lambdahttp

Go
2
star
30

pinephoneproxy

Go
1
star
31

goversions

CLI tool to list Go releases
Go
1
star
32

awsv4signer

Go aws v4 signer implementation with pluggable hmac function
Go
1
star
33

json2csv

Go
1
star
34

gopherfest-2016-slides

Go
1
star
35

android-media-backup

Android application that uploads your media files to a webserver in the background
Go
1
star
36

nft-to-beanie-baby

Replace 'NFT' with 'Beanie Baby'
JavaScript
1
star
37

github-recent-activity

CLI tool that shows recent github activity for a user.
Go
1
star
38

door-awesomer-chrome

Chrome extension for Nearbuy's door awesomer
1
star
39

remarkablecloud

Go API to the ReMarkable cloud
Go
1
star
40

photo-backup-lambda

Go
1
star
41

git-code-review-el

Emacs Lisp
1
star
42

systray-inbox

Show systray icon when files appear in directory
Go
1
star
43

git-time-machine-el

Easy file history viewing in emacs
Emacs Lisp
1
star
44

dnsforward

Simple dns forwarding server (stub resolver)
Go
1
star
45

btf

BPF Type Format (BTF) in Go
Go
1
star
46

door-awesomer-arduino

C
1
star
47

ubuntuami

Go
1
star
48

parquet-buddy

Parquet-buddy is a CLI tool for inspecting parquet files written in Go
Go
1
star
49

cloudflareip

Go
1
star