  • Stars: 318
  • Rank: 126,947 (Top 3 %)
  • Language: C++
  • License: BSD 3-Clause "New...
  • Created over 8 years ago
  • Updated about 1 month ago

Repository Details

Caliper is an instrumentation and performance profiling library

Caliper: A Performance Analysis Toolbox in a Library

Caliper is a performance instrumentation and profiling library for HPC (high-performance computing) programs. It provides source-code annotation APIs for marking regions of interest in C, C++, and Fortran code, as well as a set of built-in performance measurement recipes for a wide range of performance engineering use cases, such as lightweight always-on profiling, event tracing, or performance monitoring. Alternatively, users can create custom measurement configurations for specialized use cases.

Caliper can either generate simple human-readable reports or machine-readable JSON or .cali files for automated data processing with user-provided scripts or analysis frameworks like Hatchet and Thicket. It can also generate detailed event traces for timeline visualizations with Perfetto and the Google Chrome trace viewer.

Features include:

  • Low-overhead source-code annotation API
  • Configuration API to control performance measurements from within an application
  • Recording program metadata for analyzing collections of runs
  • Flexible key:value data model to capture application-specific features for performance analysis
  • Fully thread-safe implementation with support for parallel programming models like MPI
  • Event-based as well as sample-based performance measurements
  • Trace and profile recording
  • Connection to third-party tools, e.g. NVIDIA's Nsight tools, AMD ROCProf, or Intel(R) VTune(tm)
  • Measurement and profiling functionality such as timers, PAPI hardware counters, and Linux perf_events
  • Memory annotations to associate performance measurements with memory regions

Documentation

Extensive documentation is available here: https://software.llnl.gov/Caliper/

Usage examples of the C++, C, and Fortran annotation and ConfigManager APIs are provided in the examples directory.

See the "Getting started" section below for a brief tutorial.

Building and installing

You can install Caliper with the Spack package manager:

$ spack install caliper

To build Caliper manually, you need CMake 3.12+ and a current C++11-compatible compiler. Clone Caliper from GitHub and proceed as follows:

$ git clone https://github.com/LLNL/Caliper.git
$ cd Caliper
$ mkdir build && cd build
$ cmake -DCMAKE_INSTALL_PREFIX=<path to install location> ..
$ make
$ make install

Link Caliper to a program by adding libcaliper:

$ g++ -o app app.o -L<path to install location>/lib64 -lcaliper

There are many build flags to enable optional features, such as -DWITH_MPI for MPI support. See the "Build and install" section in the documentation for further information.

Getting started

Typically, we integrate Caliper into a program by marking source-code sections of interest with descriptive annotations. Performance profiling can then be enabled through the Caliper ConfigManager API or environment variables. Alternatively, third-party tools can connect to Caliper and access information provided by the source-code annotations.

Source-code annotations

Caliper's source-code annotation API allows you to mark source-code regions of interest in your program. Much of Caliper's functionality depends on these region annotations.

Caliper provides macros and functions for C, C++, and Fortran to mark functions, loops, or sections of source-code. For example, use CALI_CXX_MARK_FUNCTION to mark a function in C++:

#include <caliper/cali.h>

void foo()
{
    CALI_CXX_MARK_FUNCTION;
    // ...
}
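
One way to picture what a function-marking macro does is a scope guard: an object whose constructor opens a region and whose destructor closes it, so the region ends on every exit path, including early returns. The sketch below is illustrative only and contains no Caliper code; `ScopedRegion` and `event_log` are hypothetical names, and Caliper's actual implementation may differ.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event log standing in for a profiler backend.
inline std::vector<std::string>& event_log()
{
    static std::vector<std::string> log;
    return log;
}

// Hypothetical scope guard: records "begin <name>" on construction
// and "end <name>" on destruction.
class ScopedRegion
{
public:
    explicit ScopedRegion(std::string name) : m_name(std::move(name))
    {
        assert(!m_name.empty()); // a region needs a descriptive name
        event_log().push_back("begin " + m_name);
    }

    ~ScopedRegion() { event_log().push_back("end " + m_name); }

    // Non-copyable so each region is closed exactly once.
    ScopedRegion(const ScopedRegion&) = delete;
    ScopedRegion& operator=(const ScopedRegion&) = delete;

private:
    std::string m_name;
};

// The region closes on both return paths without an explicit end call.
int foo(int x)
{
    ScopedRegion region("foo");
    if (x < 0)
        return -1; // early return: the destructor still records "end foo"
    return 2 * x;
}
```

Calling `foo` twice appends two begin/end pairs to the log, regardless of which return path each call takes.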

You can mark arbitrary code regions with the CALI_MARK_BEGIN and CALI_MARK_END macros or the corresponding cali_begin_region() and cali_end_region() functions:

#include <caliper/cali.h>

// ...
CALI_MARK_BEGIN("my region");
// ...
CALI_MARK_END("my region");
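
Begin/end annotations pair up by name, so it helps to think of open regions as a stack. The following sketch assumes regions are expected to close in reverse order of opening and shows how a mismatched end call could be detected. `RegionStack` is a hypothetical class written for this illustration; it is not part of Caliper.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical tracker for nested begin/end regions.
class RegionStack
{
public:
    void begin(const std::string& name)
    {
        assert(!name.empty()); // a region needs a descriptive name
        m_stack.push_back(name);
    }

    // Returns false (and leaves the stack unchanged) if 'name' is not
    // the innermost open region.
    bool end(const std::string& name)
    {
        if (m_stack.empty() || m_stack.back() != name)
            return false;
        m_stack.pop_back();
        return true;
    }

    // Current nesting as a path such as "main/mainloop/foo".
    std::string path() const
    {
        std::string result;
        for (const std::string& name : m_stack) {
            if (!result.empty())
                result += '/';
            result += name;
        }
        return result;
    }

private:
    std::vector<std::string> m_stack;
};
```

With regions "main", "mainloop", and "foo" open, `end("mainloop")` is rejected because "foo" is still the innermost region; closing "foo" first succeeds.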

The cxx-example, c-example, and fortran-example example apps show how to use Caliper in C++, C, and Fortran, respectively.

Recording performance data

With the source-code annotations in place, we can run performance measurements. By default, Caliper does not record data; performance profiling must be activated at runtime. An easy way to do this is to use one of Caliper's built-in measurement recipes. For example, the runtime-report config prints the time spent in the annotated regions. You can activate built-in measurement configurations with the ConfigManager API or with the CALI_CONFIG environment variable. Let's try this on Caliper's cxx-example program:

$ cd Caliper/build
$ make cxx-example
$ CALI_CONFIG=runtime-report ./examples/apps/cxx-example
Path       Min time/rank Max time/rank Avg time/rank Time %
main            0.000119      0.000119      0.000119  7.079120
  mainloop      0.000067      0.000067      0.000067  3.985723
    foo         0.000646      0.000646      0.000646 38.429506
  init          0.000017      0.000017      0.000017  1.011303

The runtime-report config works for MPI and non-MPI programs. It reports the minimum, maximum, and average exclusive time (seconds) spent in each marked code region across MPI ranks (the values are identical in non-MPI programs).
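
To make the report columns concrete, the sketch below computes an exclusive time by subtracting the time of nested regions from a region's inclusive time, then reduces one value per rank to the minimum, maximum, and average. Because the times are exclusive, a nested region such as foo can show more time than its parent mainloop, as in the output above. This is a plain C++ illustration with hypothetical names (`exclusive_time`, `RankStats`, `reduce_ranks`), not Caliper code.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Exclusive time: time spent in a region itself, excluding time spent
// in the regions nested directly inside it.
double exclusive_time(double inclusive, const std::vector<double>& child_inclusive)
{
    double children = std::accumulate(child_inclusive.begin(), child_inclusive.end(), 0.0);
    return inclusive - children;
}

struct RankStats {
    double min;
    double max;
    double avg;
};

// Reduce one value per rank to min/max/avg. With a single rank
// (a non-MPI run), all three values are identical.
RankStats reduce_ranks(const std::vector<double>& per_rank)
{
    assert(!per_rank.empty()); // at least one rank is required
    auto mm = std::minmax_element(per_rank.begin(), per_rank.end());
    double sum = std::accumulate(per_rank.begin(), per_rank.end(), 0.0);
    return RankStats { *mm.first, *mm.second, sum / static_cast<double>(per_rank.size()) };
}
```

For example, a region with 10 seconds of inclusive time and two children taking 3 and 2 seconds has 5 seconds of exclusive time.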

You can customize the report with additional options. Some options enable additional Caliper functionality, such as profiling MPI and CUDA functions in addition to the user-defined regions, or additional metrics like memory usage. Other measurement configurations besides runtime-report include:

  • loop-report: Print summary and time-series information for loops.
  • mpi-report: Print time spent in MPI functions.
  • callpath-sample-report: Print time spent in functions using call-path sampling.
  • event-trace: Record a trace of region enter/exit events in .cali format.
  • hatchet-region-profile: Record a region time profile for processing with Hatchet or cali-query.

See the "Builtin configurations" section in the documentation to learn more about different configurations and their options.

You can also create entirely custom measurement configurations by selecting and configuring Caliper services manually. See the "Manual configuration" section in the documentation to learn more.

ConfigManager API

A distinctive Caliper feature is the ability to enable performance measurements programmatically with the ConfigManager API. For example, we often let users activate performance measurements with a command-line argument.

With the C++ ConfigManager API, built-in performance measurement and reporting configurations can be activated within a program using a short configuration string. This configuration string can be hard-coded in the program or provided by the user in some form, e.g. as a command-line parameter or in the program's configuration file.

To use the ConfigManager API, create a cali::ConfigManager object, add a configuration string with add(), start the requested configuration channels with start(), and trigger output with flush():

#include <caliper/cali-manager.h>
// ...
cali::ConfigManager mgr;
mgr.add("runtime-report");
// ...
mgr.start(); // start requested performance measurement channels
// ... (program execution)
mgr.flush(); // write performance results

The cxx-example program uses the ConfigManager API to let users specify a Caliper configuration with the -P command-line argument, e.g. -P runtime-report:

$ ./examples/apps/cxx-example -P runtime-report
Path       Min time/rank Max time/rank Avg time/rank Time %
main            0.000129      0.000129      0.000129  5.952930
  mainloop      0.000080      0.000080      0.000080  3.691740
    foo         0.000719      0.000719      0.000719 33.179511
  init          0.000021      0.000021      0.000021  0.969082
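
A program that accepts its configuration on the command line has to extract the string before handing it to the ConfigManager. The helper below is a Caliper-free sketch of that parsing step, assuming a `-P <config>` convention like the one described for cxx-example. `find_profile_config` is a hypothetical name, and this is not the example program's actual parsing code.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: return the argument following "-P", or an empty
// string if "-P" is absent or has no value. An empty result means no
// measurement configuration was requested.
std::string find_profile_config(int argc, const char* const argv[])
{
    assert(argc >= 0);
    for (int i = 1; i + 1 < argc; ++i) {
        if (std::string(argv[i]) == "-P")
            return argv[i + 1];
    }
    return std::string();
}

// Possible use with the ConfigManager calls from this section
// (requires Caliper, so it is left as a comment here):
//
//   std::string config = find_profile_config(argc, argv);
//   cali::ConfigManager mgr;
//   if (!config.empty())
//       mgr.add(config.c_str());
//   mgr.start();
//   // ... program execution ...
//   mgr.flush();
```

Keeping the parsing separate from the Caliper calls means the program runs unchanged, with no measurements, when the option is omitted.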

See the Caliper documentation for more examples and the full API and configuration reference.

Authors

Caliper was created by David Boehme.

A complete list of contributors is available on GitHub.

Major contributors include:

Citing Caliper

To reference Caliper in a publication, please cite the following paper:

On GitHub, you can copy this citation in APA or BibTeX format via the "Cite this repository" button. Or, see the comments in CITATION.cff for the raw BibTeX.

Release

Caliper is released under a BSD 3-clause license. See LICENSE for details.

LLNL-CODE-678900
