• Stars: 230
• Rank: 174,053 (top 4%)
• Language: Python
• License: Apache License 2.0
• Created: almost 2 years ago
• Updated: about 2 months ago

Repository Details

A Postgres Proxy Server in Python

Buena Vista: A Programmable Postgres Proxy Server

Buena Vista is a Python library that provides a socketserver-based implementation of the Postgres wire protocol.
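To give a feel for what "speaking the wire protocol" involves, here is a minimal sketch of the first message a Postgres client sends: the protocol 3.0 StartupMessage, which is a 4-byte length, a 4-byte protocol version, and a list of null-terminated key/value pairs. This is not Buena Vista's code; the function names build_startup_message and parse_startup_message are hypothetical and chosen only for illustration.

```python
import struct

# Protocol version 3.0 is encoded as (major << 16) | minor.
PROTOCOL_VERSION_3_0 = (3 << 16) | 0  # 196608


def build_startup_message(params: dict) -> bytes:
    """Encode a StartupMessage the way a client such as psql would send it."""
    body = struct.pack("!i", PROTOCOL_VERSION_3_0)
    for key, value in params.items():
        body += key.encode("utf-8") + b"\x00" + value.encode("utf-8") + b"\x00"
    body += b"\x00"  # a lone null byte terminates the parameter list
    # The length prefix counts itself (4 bytes) plus the body.
    return struct.pack("!i", len(body) + 4) + body


def parse_startup_message(data: bytes):
    """Decode a StartupMessage, as a proxy server has to do on connect."""
    length, version = struct.unpack("!ii", data[:8])
    if length != len(data):
        raise ValueError("length prefix does not match message size")
    # Drop the final terminator, then split on the remaining null bytes.
    fields = data[8:-1].split(b"\x00")[:-1]
    keys, values = fields[0::2], fields[1::2]
    params = {k.decode("utf-8"): v.decode("utf-8") for k, v in zip(keys, values)}
    return version, params
```

A server built on socketserver would read this message from the accepted socket before answering with authentication and ready-for-query messages; those later steps are omitted here.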

I started this project to address a common problem that users of another of my Python projects, dbt-duckdb, kept running into. When a long-running Python process is operating on a DuckDB database, you cannot connect to the DuckDB file with the CLI or with a database query tool like DBeaver to examine the state of the database, because each DuckDB file may only be opened by a single process at a time. The Buena Vista library makes it possible to work with a DuckDB database from multiple processes over the Postgres wire protocol. To see the idea in action locally, run:

pip3 install buenavista
python3 -m buenavista.examples.duckdb_postgres <optional_duckdb_file>

This starts a Postgres server on localhost:5433 backed by the DuckDB database file that you passed on the command line (or by an in-memory DuckDB database if you did not pass one). You should then be able to query the database from another terminal window by running psql -h localhost -p 5433 (no database, username, or password arguments required) or by connecting with DBeaver's Postgres client.
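The same connection can probably be made from Python as well. The sketch below assumes the third-party psycopg2 driver is installed and that the example server is running on localhost:5433. The text above only mentions psql and DBeaver, so compatibility with psycopg2 is an assumption, and the helper names connection_params and run_query are hypothetical.

```python
def connection_params(host="localhost", port=5433):
    """Connection settings matching the psql command above.

    No database, user, or password is given; libpq-based clients fall
    back to their defaults, just as psql does.
    """
    return {"host": host, "port": port}


def run_query(sql, params=None):
    """Run one statement against the proxy and return all rows."""
    # Imported lazily so the module loads even without the driver installed.
    import psycopg2  # assumption: installed via `pip3 install psycopg2-binary`

    conn = psycopg2.connect(**(params or connection_params()))
    try:
        # Autocommit avoids wrapping the statement in an implicit transaction.
        conn.autocommit = True
        with conn.cursor() as cur:
            cur.execute(sql)
            return cur.fetchall()
    finally:
        conn.close()
```

With the example server running in another window, calling run_query("SELECT 42") should return the result rows as a list of tuples.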

More Repositories

1. exhibit (Java, 54 stars): A prototype of Hive UDFs/UDTFs that execute nested SQL queries within rows.
2. duckdbt (Python, 45 stars): The Modern Data Stack in a Python package
3. de4ml (Python, 37 stars): Supporting materials/code examples for my course in data engineering for machine learning.
4. avro-json (Java, 29 stars): Utilities for converting to and from JSON from Avro records via Hadoop streaming or Hive.
5. geojson (Scala, 28 stars): Scala library for working with GeoJSON records using Esri's Geometry API for Java
6. target-duckdb (Python, 17 stars): A Singer.io target for DuckDB
7. driskill (Scala, 13 stars): Either[Hotel in Austin, Prototype of a Scala Distributed Collections API]
8. nba_monte_carlo (Python, 11 stars): The Modern Data Stack in a (Smaller) Box
9. lineage (R, 9 stars): An R package for tracking the transformations applied to the vectors in a data frame.
10. supernova (9 stars): A starter kit for working with supernova schemas.
11. mz-fastapi (Python, 8 stars): A FastAPI utility for building HTTP endpoints powered by Materialize TAIL queries
12. dbt-buenavista (Python, 6 stars): The dbt adapter for a Buena Vista database proxy server
13. hive-scd (Java, 6 stars): A new kind of slowly changing dimension pattern for Apache Hive.
14. crunch-demo (Java, 4 stars): A demo application for getting started with Apache Crunch.
15. dbt-mysql (Python, 3 stars): MySQL plugin for dbt
16. saferdd (Scala, 3 stars): Tools for working with dirty data in Apache Spark.
17. attribution (Java, 3 stars): MapReduce job for creating multitouch attribution models.
18. avroplay (Java, 3 stars): Me messing around with some Avro stuff
19. s3-demo (Dockerfile, 3 stars): Demo dbt-duckdb against localstack w/the new fsspec config options in version 1.4.1
20. hanukkahofdata (Python, 3 stars): My solutions to the 2023 Hanukkah of Data
21. cdh-mapreduce-ext (Java, 2 stars): Classes in the new mapreduce.* API that are not part of CDH3 yet.
22. avro-json-serde (Java, 1 star): A wrapper that uses the Hive AvroSerDe to deserialize data as JSON for use with Hive Streaming
23. hosprunner (R, 1 star)