• Stars
    star
    400
  • Rank 105,619 (Top 3 %)
  • Language
    Java
  • License
    Other
  • Created almost 3 years ago
  • Updated 27 days ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Multi-DBMS SQL Benchmarking Framework via JDBC

BenchBase

BenchBase (Java with Maven)

BenchBase (formerly OLTPBench) is a Multi-DBMS SQL Benchmarking Framework via JDBC.

Table of Contents


Quickstart

To clone and build BenchBase using the postgres profile,

git clone --depth 1 https://github.com/cmu-db/benchbase.git
cd benchbase
./mvnw clean package -P postgres

This produces artifacts in the target folder, which can be extracted,

cd target
tar xvzf benchbase-postgres.tgz
cd benchbase-postgres

Inside this folder, you can run BenchBase. For example, to execute the tpcc benchmark,

java -jar benchbase.jar -b tpcc -c config/postgres/sample_tpcc_config.xml --create=true --load=true --execute=true

A full list of options can be displayed,

java -jar benchbase.jar -h

Description

Benchmarking is incredibly useful, yet endlessly painful. This benchmark suite is the result of a group of PhDs/post-docs/professors getting together and combining their workloads/frameworks/experiences/efforts. We hope this will save other people's time, and will provide an extensible platform, that can be grown in an open-source fashion.

BenchBase is a multi-threaded load generator. The framework is designed to be able to produce variable rate, variable mixture load against any JDBC-enabled relational database. The framework also provides data collection features, e.g., per-transaction-type latency and throughput logs.

The BenchBase framework has the following benchmarks:

This framework is design to allow for easy extension. We provide stub code that a contributor can use to include a new benchmark, leveraging all the system features (logging, controlled speed, controlled mixture, etc.)


Usage Guide

How to Build

Run the following command to build the distribution for a given database specified as the profile name (-P). The following profiles are currently supported: postgres, mysql, mariadb, sqlite, cockroachdb, phoenix, and spanner.

./mvnw clean package -P <profile name>

The following files will be placed in the ./target folder:

  • benchbase-<profile name>.tgz
  • benchbase-<profile name>.zip

How to Run

Once you build and unpack the distribution, you can run benchbase just like any other executable jar. The following examples assume you are running from the root of the expanded .zip or .tgz distribution. If you attempt to run benchbase outside of the distribution structure you may encounter a variety of errors including java.lang.NoClassDefFoundError.

To bring up help contents:

java -jar benchbase.jar -h

To execute the tpcc benchmark:

java -jar benchbase.jar -b tpcc -c config/postgres/sample_tpcc_config.xml --create=true --load=true --execute=true

For composite benchmarks like chbenchmark, which require multiple schemas to be created and loaded, you can provide a comma separated list:

java -jar benchbase.jar -b tpcc,chbenchmark -c config/postgres/sample_chbenchmark_config.xml --create=true --load=true --execute=true

The following options are provided:

usage: benchbase
 -b,--bench <arg>               [required] Benchmark class. Currently
                                supported: [tpcc, tpch, tatp, wikipedia,
                                resourcestresser, twitter, epinions, ycsb,
                                seats, auctionmark, chbenchmark, voter,
                                sibench, noop, smallbank, hyadapt, otmetrics]
 -c,--config <arg>              [required] Workload configuration file
    --clear <arg>               Clear all records in the database for this
                                benchmark
    --create <arg>              Initialize the database for this benchmark
 -d,--directory <arg>           Base directory for the result files,
                                default is current directory
    --dialects-export <arg>     Export benchmark SQL to a dialects file
    --execute <arg>             Execute the benchmark workload
 -h,--help                      Print this help
 -im,--interval-monitor <arg>   Throughput Monitoring Interval in
                                milliseconds
    --load <arg>                Load data using the benchmark's data
                                loader
 -s,--sample <arg>              Sampling window

How to Run with Maven

Instead of first building, packaging and extracting before running benchbase, it is possible to execute benchmarks directly against the source code using Maven. Once you have the project cloned you can run any benchmark from the root project directory using the Maven exec:java goal. For example, the following command executes the tpcc benchmark against postgres:

mvn clean compile exec:java -P postgres -Dexec.args="-b tpcc -c config/postgres/sample_tpcc_config.xml --create=true --load=true --execute=true"

this is equivalent to the steps above but eliminates the need to first package and then extract the distribution.

How to Enable Logging

To enable logging, e.g., for the PostgreSQL JDBC driver, add the following JVM property when starting...

-Djava.util.logging.config.file=src/main/resources/logging.properties

To modify the logging level you can update logging.properties and/or log4j.properties.

How to Release

./mvnw -B release:prepare
./mvnw -B release:perform

How use with Docker

  • Build or pull a dev image to help building from source:

    ./docker/benchbase/build-dev-image.sh
    ./docker/benchbase/run-dev-image.sh

    or

    docker run -it --rm --pull \
      -v /path/to/benchbase-source:/benchbase \
      -v $HOME/.m2:/home/containeruser/.m2 \
      benchbase.azure.cr.io/benchbase-dev
  • Build the full image:

    # build an image with all profiles
    ./docker/benchbase/build-full-image.sh
    
    # or if you only want to build some of them
    BENCHBASE_PROFILES='postgres mysql' ./docker/benchbase/build-full-image.sh
  • Run the image for a given profile:

    BENCHBASE_PROFILE='postgres' ./docker/benchbase/run-full-image.sh --help # or other benchbase args as before

    or

    docker run -it --rm --env BENCHBASE_PROFILE='postgres' \
      -v results:/benchbase/results benchbase.azurecr.io/benchbase --help # or other benchbase args as before

See the docker/benchbase/README.md for further details.

Github Codespaces and VSCode devcontainer support is also available.

How to Add Support for a New Database

Please see the existing MySQL and PostgreSQL code for an example.


Contributing

We welcome all contributions! Please open a pull request. Common contributions may include:

  • Adding support for a new DBMS.
  • Adding more tests of existing benchmarks.
  • Fixing any bugs or known issues.

Known Issues

Please use GitHub's issue tracker for all issues.

Credits

BenchBase is the official modernized version of the original OLTPBench.

The original OLTPBench code was largely written by the authors of the original paper, OLTP-Bench: An Extensible Testbed for Benchmarking Relational Databases, D. E. Difallah, A. Pavlo, C. Curino, and P. Cudré-Mauroux. In VLDB 2014. Please see the citation guide below.

A significant portion of the modernization was contributed by Tim Veil @ Cockroach Labs, including but not limited to:

  • Built with and for Java 11 17.
  • Migration from Ant to Maven.
    • Reorganized project to fit Maven structure.
    • Removed static lib directory and dependencies.
    • Updated required dependencies and removed unused or unwanted dependencies.
    • Moved all non .java files to standard Maven resources directory.
    • Shipped with Maven Wrapper.
  • Improved packaging and versioning.
    • Moved to Calendar Versioning (https://calver.org/).
    • Project is now distributed as a .tgz or .zip with an executable .jar.
    • All code updated to read resources from inside .jar instead of directory.
  • Moved from direct dependence on Log4J to SLF4J.
  • Reorganized and renamed many files for clarity and consistency.
  • Applied countless fixes based on "Static Analysis".
    • JDK migrations (boxing, un-boxing, etc.).
    • Implemented try-with-resources for all java.lang.AutoCloseable instances.
    • Removed calls to printStackTrace() or System.out.println in favor of proper logging.
  • Reformatted code and cleaned up imports.
  • Removed all calls to assert.
  • Removed various forms of dead code and stale configurations.
  • Removed calls to commit() during Loader operations.
  • Refactored Worker and Loader usage of Connection objects and cleaned up transaction handling.
  • Introduced Dependabot to keep Maven dependencies up to date.
  • Simplified output flags by removing most of them, generally leaving the reporting functionality enabled by default.
  • Provided an alternate Catalog that can be populated directly from the configured Benchmark database. The old catalog was proxied through HSQLDB -- this remains an option for DBMSes that may have incomplete catalog support.

Citing This Repository

If you use this repository in an academic paper, please cite this repository:

D. E. Difallah, A. Pavlo, C. Curino, and P. Cudré-Mauroux, "OLTP-Bench: An Extensible Testbed for Benchmarking Relational Databases," PVLDB, vol. 7, iss. 4, pp. 277-288, 2013.

The BibTeX is provided below for convenience.

@article{DifallahPCC13,
  author = {Djellel Eddine Difallah and Andrew Pavlo and Carlo Curino and Philippe Cudr{\'e}-Mauroux},
  title = {OLTP-Bench: An Extensible Testbed for Benchmarking Relational Databases},
  journal = {PVLDB},
  volume = {7},
  number = {4},
  year = {2013},
  pages = {277--288},
  url = {http://www.vldb.org/pvldb/vol7/p277-difallah.pdf},
}

More Repositories

1

bustub

The BusTub Relational Database Management System (Educational)
C++
3,755
star
2

peloton

The Self-Driving Database Management System
C++
2,026
star
3

noisepage

Self-Driving Database Management System from Carnegie Mellon University
C++
1,735
star
4

ottertune

The automatic DBMS configuration tool
Python
1,202
star
5

15445-bootcamp

A basic introduction to coding in modern C++.
C++
601
star
6

dbdb.io

The On-line Database of Databases
Python
465
star
7

optd

CMU-DB's Cascades optimizer framework
Rust
324
star
8

mongodb-d4

Automatic MongoDB database designer
Python
54
star
9

cmdbac

CMDBAC - Carnegie Mellon Database Application Catalog
Python
35
star
10

libfixeypointy

Fixed-Point Decimal Library from Carnegie Mellon University
C++
32
star
11

noisepage-stats

DBMS Performance & Correctness Testing Framework
Python
30
star
12

peloton-design

Peloton Design Docs
27
star
13

peloton-test

SQL Testing Framework for the Peloton DBMS
Java
20
star
14

pgextmgrext

A Postgres Extension to Manage Extensions! (As well as some random stuff)
Rust
13
star
15

15721-s24-cache1

15-721 Spring 2024 - Cache #1
Rust
12
star
16

noisepage-pilot

Because "pilot" was a better name than "brain"
Jupyter Notebook
8
star
17

dbgym

Python
7
star
18

15721-s24-scheduler1

15-721 Spring 2024 - Scheduler #1
Rust
5
star
19

noisepage-control

NoisePage Autonomous Control Plane Infrastructure
4
star
20

terrier-dashboard

JavaScript
4
star
21

noisepage-forecast

Python
4
star
22

pgext-analyzer

PostgreSQL Extensions Analyzer
C++
2
star
23

15721-s24-cache2

15-721 Spring 2024 - Cache #2
Rust
1
star
24

15721-s24-catalog2

15-721 Spring 2024 - Catalog #2
Rust
1
star
25

benchpress

Benchpress Demo (SIGMOD 2015)
JavaScript
1
star
26

noisepage-testfiles

Test Files & Data Sets for NoisePage DBMS Project
TSQL
1
star
27

15721-s24-scheduler2

Rust
1
star