• Stars
    star
    255
  • Rank 159,729 (Top 4 %)
  • Language
    C
  • License
    Other
  • Created about 10 years ago
  • Updated 9 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Similar images search for PostgreSQL

Build Status codecov GitHub license

ImgSmlr – similar images search for PostgreSQL

Introduction

ImgSmlr – is a PostgreSQL extension which implements similar images searching functionality.

ImgSmlr method is based on Haar wavelet transform. The goal of ImgSmlr is not to provide most advanced state of art similar images searching methods. ImgSmlr was written as sample extension which illustrate how PostgreSQL extendability could cover such untypical tasks for RDBMS as similar images search.

Authors

Availability

ImgSmlr is released as an extension and not available in default PostgreSQL installation. It is available from github under the same license as PostgreSQL and supports PostgreSQL 9.1+.

Installation

Before build and install ImgSmlr you should ensure following:

  • PostgreSQL version is 9.1 or higher.
  • You have development package of PostgreSQL installed or you built PostgreSQL from source.
  • You have gd2 library installed on your system.
  • Your PATH variable is configured so that pg_config command available.

Typical installation procedure may look like this:

$ git clone https://github.com/postgrespro/imgsmlr.git
$ cd imgsmlr
$ make USE_PGXS=1
$ sudo make USE_PGXS=1 install
$ make USE_PGXS=1 installcheck
$ psql DB -c "CREATE EXTENSION imgsmlr;"

Usage

ImgSmlr offers two datatypes: pattern and signature.

Datatype Storage length Description
pattern 16388 bytes Result of Haar wavelet transform on the image
signature 64 bytes Short representation of pattern for fast search using GiST indexes

There is set of functions *2pattern(bytea) which converts bynary data in given format into pattern. Convertion into pattern consists of following steps.

  • Decompress image.
  • Make image black&white.
  • Resize image to 64x64 pixels.
  • Apply Haar wavelet transform to the image.

Pattern could be converted into signature and shuffled for less sensitivity to image shift.

Function Return type Description
jpeg2pattern(bytea) pattern Convert jpeg image into pattern
png2pattern(bytea) pattern Convert png image into pattern
gif2pattern(bytea) pattern Convert gif image into pattern
pattern2signature(pattern) signature Create signature from pattern
shuffle_pattern(pattern) pattern Shuffle pattern for less sensitivity to image shift

Both pattern and signature datatypes supports <-> operator for eucledian distance. Signature also supports GiST indexing with KNN on <-> operator.

Operator Left type Right type Return type Description
<-> pattern pattern float8 Eucledian distance between two patterns
<-> signature signature float8 Eucledian distance between two signatures

The idea is to find top N similar images by signature using GiST index. Then find top n (n < N) similar images by pattern from top N similar images by signature.

Example

Let us assume we have an image table with columns id and data where data column contains binary jpeg data. We can create pat table with patterns and signatures of given images using following query.

CREATE TABLE pat AS (
	SELECT
		id,
		shuffle_pattern(pattern) AS pattern, 
		pattern2signature(pattern) AS signature 
	FROM (
		SELECT 
			id, 
			jpeg2pattern(data) AS pattern 
		FROM 
			image
	) x 
);

Then let's create primary key for pat table and GiST index for signatures.

ALTER TABLE pat ADD PRIMARY KEY (id);
CREATE INDEX pat_signature_idx ON pat USING gist (signature);

Prelimimary work is done. Now we can search for top 10 similar images to given image with specified id using following query.

SELECT
	id,
	smlr
FROM
(
	SELECT
		id,
		pattern <-> (SELECT pattern FROM pat WHERE id = :id) AS smlr
	FROM pat
	WHERE id <> :id
	ORDER BY
		signature <-> (SELECT signature FROM pat WHERE id = :id)
	LIMIT 100
) x
ORDER BY x.smlr ASC 
LIMIT 10

Inner query selects top 100 images by signature using GiST index. Outer query search for top 10 images by pattern from images found by inner query. You can adjust both of number to achieve better search results on your images collection.

More Repositories

1

rum

RUM access method - inverted index with additional information in posting lists
C
725
star
2

pg_probackup

Backup and recovery manager for PostgreSQL
Python
711
star
3

jsquery

JsQuery – json query language with GIN indexing support
C
702
star
4

pg_pathman

Partitioning tool for PostgreSQL
C
583
star
5

zson

ZSON is a PostgreSQL extension for transparent JSONB compression
C
539
star
6

aqo

Adaptive query optimization for PostgreSQL
C
428
star
7

mamonsu

Python
186
star
8

vops

C
165
star
9

postgres_cluster

Various experiments with PostgreSQL clustering
C
151
star
10

pg_query_state

Tool for query progress monitoring in PostgreSQL
C
150
star
11

pg_wait_sampling

Sampling based statistics of wait events
C
144
star
12

testgres

Testing framework for PostgreSQL and its extensions
Python
141
star
13

hunspell_dicts

Hunspell dictionaries for PostgreSQL
TSQL
63
star
14

pg_credereum

Prototype of PostgreSQL extension bringing some properties of blockchain to the relational DBMS
C
62
star
15

sr_plan

Save and restore query plans in PostgreSQL
C
61
star
16

mmts

multimaster
C
57
star
17

raft

Raft protocol implementation in C
C
49
star
18

ptrack

Block-level incremental backup engine for PostgreSQL
C
45
star
19

pg_trgm_pro

C
44
star
20

sqljson

C
38
star
21

postgresql.pthreads

Port of postgresql for pthreads
C
31
star
22

postgresql.builtin_pool

Version of PostgreSQL with built-in connection pooling
C
29
star
23

pg_dtm

Distributed transaction manager
C
27
star
24

postgrespro

Postgres Professional fork of PostgreSQL
C
27
star
25

lsm3

LSM tree implementation based on standard B-Tree
C
26
star
26

lsm

RocksDB FDW for PostgreSQL
C
24
star
27

tsvector2

Extended tsvector type for PostgreSQL
C
20
star
28

pg_backtrace

Show backtrace for errors and signals
C
20
star
29

pgwininstall

PostgreSQL Windows installer
Roff
19
star
30

monq

MonQ - PostgreSQL extension for MongoDB-like queries to jsonb data
C
17
star
31

pg_tsparser

pg_tsparser - parser for text search
C
16
star
32

pgsphere

PgSphere provides spherical data types, functions, operators, and indexing for PostgreSQL.
C
16
star
33

hstore_ops

Better operator class for hstore: smaller index and faster @> queries.
C
16
star
34

undam

Undo storage implementation
C
15
star
35

pg_logging

PostgreSQL logging interface
C
15
star
36

pg_ycsb

YCSB-like benchmark for pgbench
PLpgSQL
15
star
37

tsexample

Example of custom postgresql full text search parser, dictionaries and configuration
C
14
star
38

libblobstamper

Framework for Structure Aware Fuzzing. Allows to build own stamps that would convert pulp-data that came from fuzzer to data with structure you need
C++
14
star
39

pg_oltp_bench

Extension and scripts to run analogue of sysbench OLTP test using pgbench
PLpgSQL
13
star
40

pg_grab_statement

PostgreSQL extension for recoding workload of specific database
C
12
star
41

tsexact

PostgreSQL fulltext search addon
C
11
star
42

jsonbd

JSONB compression method for PostgreSQL
C
10
star
43

rusmorph

Russian morphological dictionary (rusmorph) for Postgres based on libmorph library: https://github.com/big-keva/libmorph
C++
10
star
44

pg_parallizator

C
9
star
45

memstat

C
9
star
46

plantuner

C
8
star
47

pg_pageprep

PostgreSQL extension which helps to prepare heap pages for migration to 64bit XID page format (PostgresPro Enterprise)
C
8
star
48

wildspeed

C
7
star
49

pgbouncer

C
6
star
50

bztree

C++
6
star
51

pg_pathman_build

Prerequisites for pg_pathman building
Shell
5
star
52

snapfs

Fast recoverry and snapshoting
C
4
star
53

pq2jdbc

Java
4
star
54

jsonb_schema

Store jsonb schema separately from data
C
4
star
55

postgrespro-os-templates

Packer templates for building minimal baseboxes
Shell
3
star
56

pg_variables

Session wide variables for PostgreSQL
C
3
star
57

pg_hint_plan

C
2
star
58

pgpro_redefinition

PLpgSQL
2
star
59

snowball_ext

The Snowball dictionary template extension for PostgreSQL
C
2
star
60

jsonb_plpython

PLpgSQL
1
star
61

dict_regex

C
1
star
62

pg-mark

Postgres benchmarking framework
R
1
star
63

anyarray

contrib package for working with 1-D arrays
C
1
star
64

libpq_compression

C
1
star