  • Stars: 2,830
  • Rank: 16,073 (Top 0.4 %)
  • Language: C
  • License: Apache License 2.0
  • Created over 3 years ago
  • Updated 3 months ago

Repository Details

A cloud-native database based on PostgreSQL developed by Alibaba Cloud.

PolarDB for PostgreSQL

A cloud-native database developed by Alibaba Cloud

Overview

PolarDB for PostgreSQL (hereafter PolarDB) is a cloud-native database service independently developed by Alibaba Cloud. It is 100% compatible with PostgreSQL and uses a shared-storage architecture in which computing is decoupled from storage. It offers flexible scalability, millisecond-level latency, and hybrid transactional/analytical processing (HTAP) capabilities.

  1. Flexible scalability: You can use the service to scale out a compute cluster or a storage cluster based on your business requirements.
    • If the computing power is insufficient, you can scale out only the compute cluster.
    • If the storage capacity or the storage I/O is insufficient, you can scale out the storage cluster without interrupting your service.
  2. Millisecond-level latency:
    • Write-ahead log (WAL) records are stored in the shared storage. Only the metadata of WAL records is replicated from the read-write node to read-only nodes.
    • The LogIndex technology provided by PolarDB features two record replay modes: lazy replay and parallel replay. The technology can be used to minimize the record replication latency from the read-write node to read-only nodes.
  3. HTAP: HTAP is implemented by using a shared-storage-based massively parallel processing (MPP) architecture. The architecture is used to accelerate online analytical processing (OLAP) queries in online transaction processing (OLTP) scenarios. PolarDB supports a complete suite of data types that are used in OLTP scenarios. PolarDB supports two computing engines that can process these types of data:
    • Standalone execution: processes OLTP queries that feature high concurrency.
    • Distributed execution: processes large OLAP queries.
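
The latency points above (only WAL metadata is replicated, and records can be replayed lazily) can be illustrated with a toy model. This is a minimal sketch, not PolarDB's implementation: `shared_wal`, `shared_pages`, `ReadOnlyNode`, and `primary_write` are hypothetical names, pages hold a single integer, and the primary never flushes in this model.

```python
from collections import defaultdict

# Hypothetical stand-ins for shared storage (not PolarDB structures).
shared_wal = {}                    # lsn -> (page_id, delta): full WAL records
shared_pages = defaultdict(int)    # page_id -> value as last flushed to storage


class ReadOnlyNode:
    """Toy read-only node that replays WAL lazily, page by page."""

    def __init__(self):
        self.log_index = defaultdict(list)  # page_id -> LSNs not yet replayed
        self.buffer = {}                    # page_id -> in-memory page value
        self.replayed = 0                   # number of records replayed so far

    def receive_metadata(self, lsn, page_id):
        # Only metadata (LSN and page id) is sent over the network.
        self.log_index[page_id].append(lsn)

    def read_page(self, page_id):
        # Lazy replay: apply pending records only when the page is accessed.
        value = self.buffer.get(page_id, shared_pages[page_id])
        for lsn in sorted(self.log_index.pop(page_id, [])):
            _, delta = shared_wal[lsn]      # fetch the record from shared storage
            value += delta
            self.replayed += 1
        self.buffer[page_id] = value
        return value


def primary_write(lsn, page_id, delta, replicas):
    """Toy read-write node: store the record once, broadcast only metadata."""
    shared_wal[lsn] = (page_id, delta)
    for replica in replicas:
        replica.receive_metadata(lsn, page_id)


ro = ReadOnlyNode()
primary_write(1, "A", 5, [ro])
primary_write(2, "B", 7, [ro])
primary_write(3, "A", 1, [ro])
value_a = ro.read_page("A")   # replays LSN 1 and LSN 3 for page A only
```

Reading page A replays only the two records that touch it; the record for page B stays pending until B is read, which is the idea behind lazy replay.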

PolarDB provides a wide range of innovative multi-model database capabilities to help you process, analyze, and search for different types of data, such as spatio-temporal, geographic information system (GIS), image, vector, and graph data.

Branch Introduction

POLARDB_11_STABLE is the stable branch. It is based on PostgreSQL 11.9 and supports the compute-storage separation architecture. The distributed branch supports the distributed architecture.

Architecture and Roadmap

PolarDB uses a shared-storage architecture in which computing is decoupled from storage, replacing the conventional shared-nothing architecture. Instead of N copies of data in the compute cluster and N copies in the storage cluster, there are N copies in the compute cluster and a single copy in the storage cluster. Although the shared storage holds only one copy of the data, the in-memory state differs from node to node, so WAL records must be synchronized from the primary node to read-only nodes to keep data consistent. In addition, dirty-page flushing on the primary node must be controlled so that read-only nodes do not read future pages, and read-only nodes must be prevented from reading outdated pages that have not been correctly replayed in memory. To resolve these issues, PolarDB provides LogIndex, an index structure that maintains the page replay history and can be used to synchronize data from the primary node to read-only nodes.
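
The "future page" constraint can be made concrete with a small sketch. It assumes one plausible rule, which the text above does not spell out: the primary flushes a dirty page only if the page's newest change is at or below the lowest LSN that every read-only node has already received. The function `flushable_pages` and its parameters are hypothetical.

```python
def flushable_pages(dirty_pages, replica_lsns):
    """Return the dirty pages that are safe to flush under the assumed rule.

    dirty_pages:  dict of page_id -> LSN of the newest change on that page
    replica_lsns: LSNs up to which each read-only node has received WAL metadata
    """
    if not replica_lsns:
        # No read-only nodes: nothing can observe a future page.
        return sorted(dirty_pages)
    safe_lsn = min(replica_lsns)
    return sorted(page for page, lsn in dirty_pages.items() if lsn <= safe_lsn)


dirty = {"A": 90, "B": 120, "C": 100}
safe = flushable_pages(dirty, [100, 150])   # slowest replica is at LSN 100
```

In this example page B is held back because its newest change (LSN 120) is ahead of the slowest read-only node, while pages A and C can be written to shared storage.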

After computing is decoupled from storage, both the I/O latency and the throughput increase. When a single read-only node processes analytical queries, the CPUs, memory, and I/O of the other read-only nodes and the large I/O bandwidth of the storage cannot be fully utilized. To resolve this issue, PolarDB provides a shared-storage-based MPP engine. The engine can use the CPUs of multiple read-only nodes to accelerate analytical queries at the SQL level and supports a mix of OLAP and OLTP workloads for HTAP.
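
The idea of a shared-storage MPP scan can be sketched as follows: every node sees the same data, so each one scans a disjoint slice and a coordinator merges the partial results. This is an illustrative toy, not the PolarDB engine; `shared_table`, `scan_partition`, and `distributed_sum_even` are hypothetical names, and threads stand in for read-only nodes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical table in shared storage, visible to every node: rows 1..100.
shared_table = list(range(1, 101))


def scan_partition(node_id, node_count):
    """One node scans its own slice and returns a partial aggregate."""
    rows = shared_table[node_id::node_count]    # disjoint slice per node
    return sum(row for row in rows if row % 2 == 0)


def distributed_sum_even(node_count):
    """Coordinator: fan the scan out to all nodes, then merge partial sums."""
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(
            pool.map(lambda node_id: scan_partition(node_id, node_count),
                     range(node_count))
        )
    return sum(partials)


total = distributed_sum_even(4)   # same answer as a single-node scan
```

Because the slices are disjoint and together cover the whole table, the merged result matches a single-node scan regardless of how many nodes take part.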

For more information, see Architecture.

Quick Start with PolarDB

If you already have Docker installed, you can pull the local-storage instance image of PolarDB for PostgreSQL. Create, run, and enter the container, then use the PolarDB instance directly:

# pull the instance image from DockerHub
docker pull polardb/polardb_pg_local_instance:single
# create, run and enter the container
docker run -it --cap-add=SYS_PTRACE --privileged=true --name polardb_pg_single polardb/polardb_pg_local_instance:single bash
# check
psql -h 127.0.0.1 -c 'select version();'
            version
--------------------------------
 PostgreSQL 11.9 (POLARDB 11.9)
(1 row)

For more advanced deployment options, see Advanced Deployment. Before you deploy, we recommend that you familiarize yourself with the architecture of PolarDB for PostgreSQL.

Documentation

See the Online Documentation Website for the complete documentation.

If you want to explore or develop documentation locally, see Document Contribution.

Contributing

Contributions to PolarDB are welcome, whether code or documentation.

Software License

PolarDB code is released under the Apache License (Version 2.0). It is developed based on PostgreSQL, which is released under the PostgreSQL License. This product contains various third-party components under other open source licenses.

See the LICENSE and NOTICE files for more information.

Acknowledgments

Some code and design ideas are based on other open source projects, such as PG-XC/XL (pgxc_ctl), TBase (timestamp-based vacuum and MVCC), Greenplum, and Citus (pg_cron). We thank the contributors to these projects.

Copyright © Alibaba Group, Inc.

More Repositories

  1. galaxysql (Java, 943 stars): PolarDB-X is a cloud-native distributed SQL database designed for high-concurrency, massive-storage, and complex-query scenarios.
  2. PolarDB-FileSystem (C++, 192 stars)
  3. galaxyengine (C++, 170 stars): GalaxyEngine is a MySQL branch that originated at Alibaba Group and is designed to support large-scale distributed database systems.
  4. galaxycdc (Java, 45 stars): GalaxyCDC is a core component of PolarDB-X that is responsible for global binary log generation, publication, and subscription.
  5. galaxykube (Go, 44 stars): PolarDB-X Operator is a Kubernetes extension that aims to create and manage PolarDB-X clusters on Kubernetes.
  6. PolarDB-X (Makefile, 37 stars): PolarDB-X is a cloud-native distributed SQL database designed for high-concurrency, massive-storage, and complex-query scenarios.
  7. galaxyglue (Java, 27 stars): GalaxyGlue is an extension to MySQL Connector/J 8.0.
  8. PolarDB-Stack-Operator (Go, 24 stars): PolarDB Stack is a DBaaS implementation for PolarDB-for-Postgres. As an operator, it creates and manages PolarDB/PostgreSQL clusters running in Kubernetes. It provides rebuild, failover and switchover, scale-up/scale-out, and high-availability capabilities for each cluster.
  9. PolarDB-NodeAgent (Go, 15 stars): PolarDB-NodeAgent is a lightweight and flexible data collection agent that collects performance data from hosts and instances. It runs as a plug-in process on physical or virtual machines and collects per-second performance data and real-time logs for all instances on the machine, both containerized and non-containerized.
  10. PolarDB-Stack-Workflow (Go, 13 stars)
  11. PolarDB-Stack-Storage (Go, 11 stars)
  12. PolarDB-ClusterManager (Go, 11 stars): PolarDB Cluster Manager is the cluster management component of PolarDB for PostgreSQL, responsible for topology management, high availability, configuration management, and plugin extensions.
  13. PolarDB-Stack-Daemon (Go, 10 stars): PolarStack-Daemon is a daemon process in DBaaS PolarStack. It runs on all hosts and is responsible for port status collection, database log cleanup, database engine image availability collection, and host network status collection. It provides the basic host information and status needed to create, migrate, run, and recognize the state of database clusters.
  14. PolarDB-Stack-Common (Go, 9 stars)
  15. PolarDB-Hands-On (C, 6 stars)
  16. polardbx-backup (C++, 3 stars): polardbx-backup is a hot backup tool for PolarDB-X.
  17. polardb-pg-docker-images (Shell, 3 stars)
  18. PolarDB-Hackathon-2023 (3 stars)
  19. PolarDB-ImageBuilder (Python, 3 stars)
  20. galaxysql-tools (Java, 3 stars): Tools for PolarDB-X, such as data migration, CSV file import, and benchmarking.
  21. learn-some-polardb-x (Java, 1 star)
  22. PolarDB-BackupAgent (C, 1 star): A distributed, high-performance, and highly available backup agent for PolarDB for PostgreSQL that is feature-rich and easy to extend with storage plugins.