• This repository has been archived on 05/Dec/2022
  • Language
    Python
  • License
    Other
  • Created about 12 years ago
  • Updated over 4 years ago

Repository Details

A battle-tested, flexible & comprehensive monitoring solution for your PostgreSQL databases

PGObserver

PGObserver is a battle-tested monitoring solution for your PostgreSQL databases. It covers almost all the metrics provided by the database engine's internal statistics collector, and works out of the box with all PostgreSQL versions (beginning with 9.0) as well as AWS RDS. You don’t have to install any non-standard, server-side database extensions to take advantage of its core functionality, nor do you need to register any privileged users.

Monitored Metrics Include:

  • Stored procedure data: number of calls, run time per procedure, self time per procedure
  • All executed statements data: query runtimes, call counts, time spent on IO
    • based on the pg_stat_statements module, which must be enabled on the DB
  • Table IO statistics: number of sequential and index scans, number of inserts, (hot) updates, deletes, table and index size, heap hits vs disk hits
  • General database indicators: number of backends, exceptions, deadlocks, temporary files written
  • Schema usage: procedure calls, IUD
  • CPU load: needs a plpythonu stored procedure from the sql/data_collection_helpers folder (plpythonu isn't available on RDS)
  • WAL (XLOG) volumes
  • Index usage

For some metrics you must install data-gathering wrapper functions — also known as stored procedures — on the server being monitored. This will enable you to circumvent the superuser requirements.
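To make the "heap hits vs disk hits" item in the list above concrete, here is a minimal Python sketch. It is not PGObserver's own code. It assumes the two counters are read from PostgreSQL's pg_statio_user_tables view (columns heap_blks_hit and heap_blks_read); the function name is hypothetical.

```python
def heap_hit_ratio(heap_blks_hit, heap_blks_read):
    """Return the fraction of heap block requests served from shared buffers.

    heap_blks_hit  -- blocks found in PostgreSQL's buffer cache
    heap_blks_read -- blocks that had to be read from outside the buffer cache
    Returns None when the table has seen no heap block requests yet.
    """
    total = heap_blks_hit + heap_blks_read
    if total == 0:
        return None
    return heap_blks_hit / float(total)


# Example with made-up counter values for a single table
print(heap_hit_ratio(9500, 500))  # prints 0.95
```

A ratio that stays well below 1.0 for a frequently used table suggests that its working set does not fit into shared buffers.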

Go here for some PGObserver screenshots and a nice illustration by Zalando Tech resident artist Kolja Wilcke.

Additional Features

With some extra setup (see instructions), you can also:

  • monitor blocked processes (needs a cron script on the host DB)
  • monitor pg_stat_statements (needs an enabled pg_stat_statements extension)
  • run cron aggregations to speed up the sproc load and database size graphs; these are useful when monitoring tens of instances
  • export metrics data to InfluxDB for custom charting/dashboarding with Grafana or some other tool

Status

PGObserver is still in use but no longer receives active attention or development, as stored procedure usage has dropped in new projects.

How PGObserver Works

A Java application gathers metrics by querying PostgreSQL performance views (pg_stat_*). You can configure gathering intervals for the different metrics on a per-host, per-metric basis. This enables you to gather more details for critical systems and provide fewer details for less-important systems — thereby reducing the amount of data stored.
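The idea of per-host, per-metric intervals can be sketched as follows. This is an illustration in Python, not the Java gatherer's actual logic. The host names, metric names, interval values, and the due_metrics function are all hypothetical.

```python
# Hypothetical per-host, per-metric gathering intervals in minutes.
# Host and metric names are illustrative, not PGObserver's real settings.
INTERVALS = {
    "critical-db": {"table_io": 5, "sproc": 5, "table_size": 30},
    "archive-db": {"table_io": 60, "table_size": 240},
}


def due_metrics(host, last_run, now, intervals=INTERVALS):
    """Return the sorted metric names whose interval has elapsed for a host.

    last_run maps metric name -> minute of the last gathering; metrics
    never gathered before are always due. Metrics without a configured
    interval for the host are not gathered at all.
    """
    due = []
    for metric, every in intervals.get(host, {}).items():
        previous = last_run.get(metric)
        if previous is None or now - previous >= every:
            due.append(metric)
    return sorted(due)


# At minute 6 only table_io (last gathered at minute 0, interval 5) is due.
print(due_metrics("critical-db", {"table_io": 0, "sproc": 3, "table_size": 0}, now=6))
```

Here the less important "archive-db" host is polled far less often and skips stored procedure statistics entirely, which is how the amount of stored data is kept down.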

Additionally, you can configure sets of hosts to monitor from different Java processes — for example, when deploying to multiple locations with limited accessibility.

PGObserver’s frontend is a standalone Python + CherryPy application; the "screenshots" folder includes basic examples. Charts are rendered with the JS Flot library.

To help you generate minimalistic test data for a local test setup, we’ve included this script: https://github.com/zalando/PGObserver/blob/master/frontend/src/testdata.py

Quick Test Run Using Vagrant

Make sure you've installed the latest version of Vagrant. Clone PGObserver to the machine where you want to run it, then start Vagrant from the PGObserver base directory:

git clone https://github.com/zalando/PGObserver.git
cd PGObserver
vagrant up

This last step will take a while, as PGObserver performs the following inside the virtual machine:

  • Fetches and starts an official PostgreSQL 9.3 Docker image
  • Compiles the gatherer for you, creates a Docker image, and starts it inside the VM
  • Creates a Docker image for the frontend and starts it inside the VM
  • Exposes ports 38080 and 38081 for the frontend and the gatherer, respectively. You can then open the frontend on port 38080 and configure a database cluster to monitor — e.g., http://localhost:38080/hosts/

The easiest way to run it somewhere else is to change the config files and create your own Docker images to deploy. Just point it to the PostgreSQL cluster where you created the PGObserver database.
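Because provisioning takes a while, it can help to poll the frontend until it answers. The following Python sketch assumes the Vagrant setup described above, with the frontend exposed on port 38080. The function name and its parameters are hypothetical and not part of PGObserver.

```python
import time

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2.7


def wait_for_frontend(url="http://localhost:38080/hosts/", attempts=30,
                      delay=10, fetch=urlopen, sleep=time.sleep):
    """Poll url until it answers with HTTP 200 or the attempts run out.

    Returns True as soon as the page responds, False otherwise.
    fetch and sleep are parameters so the loop can be tried without a VM.
    """
    for attempt in range(attempts):
        try:
            response = fetch(url)
            if response.getcode() == 200:
                return True
        except IOError:
            # Connection refused or HTTP error (URLError subclasses IOError)
            pass
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling wait_for_frontend() after "vagrant up" returns True once the /hosts/ page is reachable, at which point you can add a cluster to monitor.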

Setup

Install:

  • Python 2.7 (to run PGObserver’s frontend)
  • Pip (to prepare Python dependencies)
pip install -r frontend/requirements.txt
  • the PostgreSQL contrib modules pg_trgm and btree_gist. These should come with your operating system distribution in a package named postgresql-contrib (or similar).

Create a schema by executing the SQL files from your sql/schema folder on a Postgres database where you want to store monitoring data:

cat sql/schema/*.sql | psql -1 -f - -d my_pgobserver_db

Configuration

Start by preparing your configuration files for gatherer and frontend; the provided examples are good starting points.

  • set your database connection parameters: name, host and port
  • configure the usernames and passwords for gatherer and frontend; find defaults here
  • set gather_group (important for gatherer only; enables many gatherer processes)
  • create an unprivileged user on the database you want to monitor; it is used to run selects against the system catalogs

Configuring Hosts to Monitor

You can either:

  • insert an entry into the monitor_data.hosts table with the connection details and to-be-monitored features of the cluster you want to monitor (use the same password that you set in the previous step); OR
  • use the "frontend" web application's (next step) /hosts page, inserting all needed data and pressing "add", followed by "reload" to refresh menus
    • set host_gather_group to decide which gatherer monitors which cluster
    • to decide which schemas are scanned for sprocs statistics, review the table sproc_schemas_monitoring_configuration. Defaults are provided.
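How host_gather_group and gather_group fit together can be sketched like this. The sketch assumes that a gatherer picks up exactly those hosts whose host_gather_group equals its own gather_group; the dictionary layout, the "name" key, the group values, and the function name are hypothetical.

```python
def hosts_for_gatherer(hosts, gather_group):
    """Return the names of the hosts assigned to the given gather_group."""
    return [h["name"] for h in hosts if h["host_gather_group"] == gather_group]


# Made-up host entries; only host_gather_group is a name taken from the text.
hosts = [
    {"name": "shop-db", "host_gather_group": "eu"},
    {"name": "stock-db", "host_gather_group": "eu"},
    {"name": "reporting-db", "host_gather_group": "us"},
]

print(hosts_for_gatherer(hosts, "eu"))
```

With this split, one gatherer process configured with the "eu" group and another configured with the "us" group would each monitor only the clusters they can reach.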

Some features require you to create the corresponding helper functions on the databases being monitored:

  • CPU load monitoring requires a stored procedure from cpu_load.sql. This is a plpythonu function, so a superuser is needed.
  • For pg_stat_statements monitoring, you need this file.
  • For table & index bloat query, you need this.
  • Blocking processes monitoring requires setup from this folder.

Run the frontend by going into the "frontend" folder and running run.sh, which starts "python src/web.py" and puts it in the background:

frontend$ ./run.sh --config frontend.yaml

Build the data gatherer single jar, including dependencies, by going to the "gatherer" folder and running:

mvn clean verify assembly:single

Start data monitoring daemons by running run.sh.

Troubleshooting Hint


You might have to change your PostgreSQL server configuration to gather certain types of statistics. Please refer to the Postgres documentation on The Statistics Collector and pg_stat_statements.
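As a rough illustration of what to look for, the Python sketch below checks a few standard PostgreSQL settings that control statistics gathering. It is not part of PGObserver. It assumes the settings were fetched beforehand, for example with "SELECT name, setting FROM pg_settings", and passed in as a dictionary; the function name is hypothetical, and the defaults used for missing keys are PostgreSQL's documented defaults.

```python
def check_stats_settings(settings):
    """Return hints for settings that would leave some metrics empty.

    settings maps a PostgreSQL setting name to its value as a string.
    """
    hints = []
    if settings.get("track_counts", "on") != "on":
        hints.append("track_counts is off: table and index statistics are not collected")
    if settings.get("track_functions", "none") == "none":
        hints.append("track_functions is none: stored procedure statistics are not collected")
    if settings.get("track_io_timing", "off") != "on":
        hints.append("track_io_timing is off: time spent on IO is not measured")
    preloaded = [lib.strip() for lib in settings.get("shared_preload_libraries", "").split(",")]
    if "pg_stat_statements" not in preloaded:
        hints.append("pg_stat_statements is not listed in shared_preload_libraries")
    return hints


# Made-up example: only stored procedure tracking is missing.
example = {
    "track_counts": "on",
    "track_functions": "none",
    "track_io_timing": "on",
    "shared_preload_libraries": "pg_stat_statements",
}
for hint in check_stats_settings(example):
    print(hint)
```

Note that track_io_timing is not available on the oldest PostgreSQL versions PGObserver supports, so a hint for it may simply reflect the server version.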

Contributions

PGObserver welcomes contributions from the community. Please go to the Issues page to learn more about planned project enhancements and noted bugs. Feel free to make a pull request and we'll take a look.

Thank You

Thank you to our Zalando contributors, as well as Fabian Genter.

License

Copyright 2012 Zalando GmbH

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
