• Stars
    star
    252
  • Rank 160,894 (Top 4 %)
  • Language
    C++
  • License
    Apache License 2.0
  • Created over 5 years ago
  • Updated 6 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

ClickHouse FDW for PostgreSQL

clickhouse_fdw - ClickHouse Foreign Data Wrapper for PostgreSQL

The clickhouse_fdw is open-source. It is a Foreign Data Wrapper (FDW) for ClickHouse column oriented database.

Supported PostgreSQL Versions

PostgreSQL 11-14

Installation

Prerequisite

The clickhouse_fdw uses an HTTP interface provided by ClickHouse. libcurl and uuid libraries should be installed in the system. Make sure that ClickHouse uses UTC timezone.

Installing clickhouse_fdw
git clone [email protected]:ildus/clickhouse_fdw.git
cd clickhouse_fdw
mkdir build && cd build
cmake ..
make && make install
Centos 7 Notes

You should be using modern compiler, available in devtoolset-7. Once installed, activate it with the command:

source scl_source enable devtoolset-7

Minimal libcurl compatible with clickhouse-fdw is 7.43.0. It is not available in the official Centos repo. You can download recent RPMs from here

You can upgrade libcurl with command:

sudo rpm -Uvh  *curl*rpm --nodeps
sudo yum install libmetalink

uuid can be installed with:

yum install libuuid-devel

Usage

You need to set up the sample database and tables in the ClickHouse database. Here we create a sample database name test_database and two sample tables tax_bills_nyc and tax_bills:

CREATE DATABASE test_database;
USE test_database;
CREATE TABLE tax_bills_nyc
(
    bbl Int64,
    owner_name String,
    address String,
    tax_class String,
    tax_rate String,
    emv Float64,
    tbea Float64,
    bav Float64,
    tba String,
    property_tax String,
    condonumber String,
    condo String,
    insertion_date DateTime MATERIALIZED now()
)
ENGINE = MergeTree PARTITION BY tax_class ORDER BY (owner_name);

CREATE TABLE tax_bills
(
    bbl Int64,
    owner_name String
)
ENGINE = MergeTree
PARTITION BY bbl
ORDER BY bbl;

Download the sample data from the taxbills.nyc website and put the data in the table:

curl -X GET 'http://taxbills.nyc/tax_bills_june15_bbls.csv' | \
	clickhouse-client --input_format_allow_errors_num=10 \
	--query="INSERT INTO test_database.tax_bills_nyc FORMAT CSV"

Now the data is ready in the ClickHouse, the next step is to set up the PostgreSQL side. First we need to create a FDW extension and a ClickHouse foreign server:

CREATE EXTENSION clickhouse_fdw;
CREATE SERVER clickhouse_svr FOREIGN DATA WRAPPER clickhouse_fdw OPTIONS(dbname 'test_database');

By default the server will use http protocol. But we could use binary protocol:

CREATE SERVER clickhouse_svr FOREIGN DATA WRAPPER clickhouse_fdw
	OPTIONS(dbname 'test_database', driver 'binary');

Available parameters:

* dbname
* host
* port
* driver

Now create user mapping and foreign tables:

CREATE USER MAPPING FOR CURRENT_USER SERVER clickhouse_svr OPTIONS (user 'default', password '');
IMPORT FOREIGN SCHEMA "test_database" FROM SERVER clickhouse_svr INTO public;

SELECT bbl,tbea,bav,insertion_date FROM tax_bills_nyc LIMIT 5;
    bbl     | tbea  |  bav   |   insertion_date
------------+-------+--------+---------------------
 1001200009 | 72190 | 648900 | 2019-08-03 11:04:38
 4000620001 |  8961 |  80550 | 2019-08-03 11:04:38
 4157860094 | 13317 | 119700 | 2019-08-03 11:04:38
 4123850237 |    50 |    450 | 2019-08-03 11:04:38
 4103150163 |  2053 |  18450 | 2019-08-03 11:04:38

INSERT INTO tax_bills SELECT bbl, tbea from tax_bills_nyc LIMIT 100;

EXPLAIN VERBOSE SELECT bbl,tbea,bav,insertion_date FROM tax_bills_nyc LIMIT 5;
                                     QUERY PLAN
--------------------------------------------------------------------------------------------
Limit  (cost=0.00..0.00 rows=1 width=32)
Output: bbl, tbea, bav, insertion_date
 ->  Foreign Scan on public.tax_bills_nyc  (cost=0.00..0.00 rows=0 width=32)
         Output: bbl, tbea, bav, insertion_date
         Remote SQL: SELECT bbl, tbea, bav, insertion_date FROM test_database.tax_bills_nyc
(5 rows)

AggregatingMergeTree

For columns where you need Merge prefix in aggregations just add AggregateFunction option with aggregation function name. Example:

ALTER TABLE table ALTER COLUMN b OPTIONS (SET AggregateFunction 'count');

Or use IMPORT SCHEMA for automatic definitions.

More Repositories

1

decoder_json

PostgreSQL logical decoder to JSON
C
28
star
2

owly

Guitar Effects Pedal based on STM32 and WM8731 audio codec
C
22
star
3

django-history

Full history for django models
Python
20
star
4

jabber_bot

Jabber bot for Telegram
Go
13
star
5

TDA7313

TDA7313 library for Arduino
Eagle
11
star
6

pangaea_cp16

Guitar cabinet simulator (on impulse responses) based on AMT Pangaea CP-16
6
star
7

ansible-logs

Ansible configuration for centralized logging
Shell
5
star
8

mio

MIO FUSE desktop client (OSX & Linux)
Go
5
star
9

kicad-library-custom

my kicad library
4
star
10

etables

Simple tables editor on erlang and mochiweb
JavaScript
3
star
11

django-ilib

Old library with some widgets and fields
JavaScript
3
star
12

pg_zstd

C
2
star
13

car_radio

Arduino
2
star
14

pqt

Utility for postgres instances management.
Go
2
star
15

pg_kll_sketch

PostgreSQL extension for calculating quantiles using KLL sketch data structure
C
2
star
16

usi_bitbang

USI <-> GPIO communcation using bit banging through /sys/class/gpio interface
C
2
star
17

personal

Simple blog engine
JavaScript
2
star
18

calc

Инженерный калькулятор
Component Pascal
2
star
19

small_bootstrap

bootstrap with smaller elements
JavaScript
1
star
20

midi_fpga

MIDI footswitch controller on iCE40HX1K-EVB
SystemVerilog
1
star
21

sway

`sway` configuration
1
star
22

nvim

My vim configuration
Vim Script
1
star
23

pg_dict_compression

Custom dictionary compression for PostgreSQL
C
1
star
24

remote_elgen

Control an electric generator with ESP32
C
1
star
25

avrisp

schematic and pcb for http://www.fischl.de/usbasp/
KiCad Layout
1
star
26

midi_foot_ctrl

Bluetooth MIDI footswitch controller
C
1
star
27

greatiful_client

JavaScript
1
star
28

time_tracker

Simple cross-platform time tracker
Go
1
star
29

pwr_controller

Power controller for Orange PI, based on Attiny24A
C
1
star
30

ch

ch project
Python
1
star
31

esp32_zigbee_temp_sensor

ESP32C6 based temperature monitor
C
1
star