pg_auto_failover
pg_auto_failover is an extension and service for PostgreSQL that monitors and manages automated failover for a Postgres cluster. It is optimized for simplicity and correctness and supports Postgres 10 and newer.
pg_auto_failover supports several Postgres architectures and implements a safe automated failover for your Postgres service. It is possible to get started with only two data nodes which will be given the roles of primary and secondary by the monitor.
The pg_auto_failover Monitor implements a state machine and relies on
in-core PostgreSQL facilities to deliver HA. For example, when the
secondary node is detected to be unavailable, or when its lag is too
much, then the Monitor removes it from the synchronous_standby_names
setting on the primary node. Until the secondary is back to being
monitored healthy, failover and switchover operations are not allowed,
preventing data loss.
pg_auto_failover consists of the following parts:
- a PostgreSQL extension named
pgautofailover
- a PostgreSQL service to operate the pg_auto_failover monitor
- a pg_auto_failover keeper to operate your PostgreSQL instances, see
pg_autoctl run
Multiple Standbys
It is possible to implement a production architecture with any number of Postgres nodes, for better data availability guarantees.
By default, pg_auto_failover uses synchronous replication and every node that reaches the secondary state is added to synchronous_standby_names on the primary. With pg_auto_failover 1.4 it is possible to remove a node from the replication quorum of Postgres.
Citus HA
Starting with pg_auto_failover 2.0 it's now possible to also implement High Availability for a Citus cluster.
Documentation
Please check out project documentation for tutorial, manual pages, detailed design coverage, and troubleshooting information.
Installing pg_auto_failover from packages
Note that pg_auto_failover packages are also found in Postgres PGDG package repositories. If you're using those repositories already, you can install the packages from there.
Ubuntu or Debian:
Binary packages for debian and derivatives (ubuntu) are available from apt.postgresql.org repository, install by following the linked documentation and then::
$ sudo apt-get install pg-auto-failover-cli
$ sudo apt-get install postgresql-14-auto-failover
When using debian, two packages are provided for pg_auto_failover: the
monitor Postgres extension is packaged separately and depends on the
Postgres version you want to run for the monitor itself. The monitor's
extension package is named postgresql-14-auto-failover
when targeting
Postgres 14.
Then another package is prepared that contains the pg_autoctl
command, and
the name of the package is pg-auto-failover-cli
. That's the package that
is needed for the Postgres nodes.
To avoid debian creating a default Postgres data directory and service, follow these steps before installing the previous packages.
$ curl https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add -
$ echo "deb http://apt.postgresql.org/pub/repos/apt buster-pgdg main" > /etc/apt/sources.list.d/pgdg.list
# bypass initdb of a "main" cluster
$ echo 'create_main_cluster = false' | sudo tee -a /etc/postgresql-common/createcluster.conf
$ apt-get update
$ apt-get install -y --no-install-recommends postgresql-14
Other installation methods
Please see our extended documentation chapter Installing pg_auto_failover for details.
Trying pg_auto_failover on your local computer
The main documentation for pg_auto_failover includes the following 3 tutorial:
-
The main pg_auto_failover Tutorial uses docker-compose on your local computer to start multiple Postgres nodes and implement your first failover.
-
The complete pg_auto_failover Azure VM Tutorial guides you into creating an Azure network and then Azure VMs in that network, to then provisioning those VMs, and then running Postgres nodes with pg_auto_failover and then introducing hard failures and witnessing an automated failover.
-
The Citus Cluster Quick Start tutorial uses docker-compose to create a full Citus cluster and guide you to a worker failover and then a coordinator failover.
Reporting Security Issues
Security issues and bugs should be reported privately, via email, to the Microsoft Security Response Center (MSRC) at [email protected]. You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Further information, including the MSRC PGP key, can be found in the Security TechCenter.
Authors
- Dimitri Fontaine
- Nils Dijk
- Marco Slot
- Louise Grandjonc
- Joe Nelson
- Hadi Moshayedi
- Lukas Fittl
- Murat Tuncer
- Jelte Fennema
License
Copyright (c) Microsoft Corporation. All rights reserved.
This project is licensed under the PostgreSQL License, see LICENSE file for details.
This project includes bundled third-party dependencies, see NOTICE file for details.