mpiP 3.5

A light-weight MPI profiler.

Introduction

mpiP is a light-weight profiling library for MPI applications. Because it only collects statistical information about MPI functions, mpiP generates considerably less overhead and much less data than tracing tools. All the information captured by mpiP is task-local. It only uses communication during report generation, typically at the end of the experiment, to merge results from all of the tasks into one output file.

Downloading

The current version of mpiP can be accessed at https://github.com/LLNL/mpiP/releases/latest.

New Features & Bug Fixes

Version 3.5 includes several new features, including

  • Multi-threaded support
  • Additional MPI-IO functions
  • Various updates including
    • New configuration options and tests
    • Updated test suite
    • Updated build behavior

Please see the ChangeLog for additional changes.

Configuring and Building mpiP

Dependencies

  • MPI installation
  • libunwind: for collecting stack traces
  • binutils: for address-to-source translation
  • glibc backtrace() can also be used for stack tracing, but source line numbers may be inconsistent

Configuration

Several mpiP-specific configuration flags are available; run ./configure -h to list them. Standard configure variables, such as CC, can be used to specify the MPI compiler wrapper scripts.

Build Make Targets

Target    | Effect
[default] | Build libmpiP.so
all       | Build shared library and all tests
check     | Use dejagnu to run and evaluate tests
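
As a minimal sketch of the configuration and build steps above, the following assumes it is run from the top of an unpacked mpiP source tree and that the MPI compiler wrapper is named mpicc. The MPICC variable is a hypothetical convenience for this example, not an mpiP setting.

```shell
# Assumption: the MPI C compiler wrapper is called mpicc.
# Set MPICC beforehand if your MPI installation uses a different name.
MPICC="${MPICC:-mpicc}"

if [ -x ./configure ]; then
    # CC selects the MPI compiler wrapper; ./configure -h lists the other flags.
    # "make" builds the default target (libmpiP.so).
    # "make check" runs the test suite and needs dejagnu installed.
    ./configure CC="$MPICC" && make && make check
else
    echo "No ./configure here; run this from the mpiP source directory." >&2
fi
```

Drop the final "make check" if dejagnu is not available on the build machine.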

Using mpiP

Using mpiP is very simple. Because it gathers MPI information through the MPI profiling layer, mpiP is a link-time library; that is, you don't have to recompile your application to use mpiP. You might, however, have to recompile with the '-g' option if you want mpiP to decode each PC (program counter) to a source code filename and line number automatically. mpiP will work without -g, but callsites may then be reported without source file and line information.

Instrumentation

Link Time Instrumentation

Link the mpiP library with an executable. The dependent libraries may need to be specified as well. If the link command includes the MPI library, order the mpiP library before the MPI library, as in -lmpiP -lmpi.
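
The link order described above might look like the following sketch. The install path, the object file app.o, and the output name app are hypothetical, and the dependent libraries shown (-lbfd from binutils, -lunwind from libunwind) are an assumption based on the dependency list; the set you need depends on how mpiP was configured.

```shell
# Hypothetical locations and names; adjust to your installation.
MPIP_LIB_DIR="${MPIP_LIB_DIR:-$HOME/mpiP/lib}"
MPICC="${MPICC:-mpicc}"

# -g lets mpiP translate callsite addresses to source files and lines.
# -lmpiP comes first so that it precedes the MPI library, which the
# compiler wrapper typically appends at the end of the link line.
LINK_CMD="$MPICC -g -o app app.o -L$MPIP_LIB_DIR -lmpiP -lbfd -lunwind"

# Print the command for inspection; run it once the paths are correct.
echo "$LINK_CMD"
```

If you link with a plain compiler rather than a wrapper, add the MPI library explicitly after -lmpiP, as in -lmpiP -lmpi.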

Run Time Instrumentation

An uninstrumented executable may be able to be instrumented at run time by setting the LD_PRELOAD environment variable, as in export LD_PRELOAD=[path to mpiP]/libmpiP.so. Preloading libmpiP can interfere with the launcher, in which case the variable may need to be set on the launch command instead, such as srun -n 2 --export=LD_PRELOAD=[path to mpiP]/libmpiP.so [executable].

mpiP Run Time Flags

The behavior of mpiP can be set at run time through the use of the following flags. Multiple flags can be delimited with spaces or commas.

Option | Description | Default
-c     | Generate concise version of report, omitting callsite process-specific detail. |
-d     | Suppress printing of callsite detail sections. |
-e     | Print report data using floating-point format. |
-f dir | Record output file in directory <dir>. | .
-g     | Enable mpiP debug mode. | disabled
-k n   | Set callsite stack traceback depth to <n>. | 1
-l     | Use less memory to generate the report by using MPI collectives to generate callsite information on a callsite-by-callsite basis. |
-n     | Do not truncate full pathname of filename in callsites. |
-o     | Disable profiling at initialization. Application must enable profiling with MPI_Pcontrol(). |
-p     | Point-to-point histogram reporting on message size and communicator used. |
-r     | Generate the report by aggregating data at a single task. | default
-s n   | Set hash table size to <n>. | 256
-t x   | Set print threshold for report, where <x> is the MPI percentage of time for each callsite. | 0.0
-v     | Generate both concise and verbose report output. |
-x exe | Specify the full path to the executable. |
-y     | Collective histogram reporting on message size and communicator used. |
-z     | Suppress printing of the report at MPI_Finalize. |

For example, to set the callsite stack walking depth to 2 and the report print threshold to 10%, define the MPIP environment variable, as in any of the following examples:

$ export MPIP="-t 10.0 -k 2" (bash)

$ export MPIP=-t10.0,-k2 (bash)

$ setenv MPIP "-t 10.0 -k 2" (csh)

mpiP prints a message at initialization if it successfully finds the MPIP variable.
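
The flags can also be combined with -f to collect reports in one place. The sketch below assumes a bash-like shell; the directory name mpip_reports and the REPORT_DIR variable are hypothetical and used only for illustration.

```shell
# Hypothetical report directory; create it before the run.
REPORT_DIR="${REPORT_DIR:-./mpip_reports}"
mkdir -p "$REPORT_DIR"

# -f: write the report into REPORT_DIR
# -k: callsite stack traceback depth of 2
# -t: only print callsites at or above 10% of MPI time
export MPIP="-f $REPORT_DIR -k 2 -t 10.0"

# Show the value that the application will see.
echo "MPIP=$MPIP"

# Then launch the instrumented application as usual,
# for example: srun -n 2 [executable]
```

After the run, check the initialization message in the job output to confirm that mpiP picked up the MPIP variable.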

For more information on mpiP, please see the User Guide in the mpiP distribution.

License

Copyright (c) 2006, The Regents of the University of California. Produced at the Lawrence Livermore National Laboratory Written by Jeffery Vetter and Christopher Chambreau. UCRL-CODE-223450. All rights reserved.

This file is part of mpiP. For details, see http://llnl.github.io/mpiP.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  • Redistributions of source code must retain the above copyright notice, this list of conditions and the disclaimer below.

  • Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the disclaimer (as noted below) in the documentation and/or other materials provided with the distribution.

  • Neither the name of the UC/LLNL nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OF THE UNIVERSITY OF CALIFORNIA, THE U.S. DEPARTMENT OF ENERGY OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Additional BSD Notice

  1. This notice is required to be provided under our contract with the U.S. Department of Energy (DOE). This work was produced at the University of California, Lawrence Livermore National Laboratory under Contract No. W-7405-ENG-48 with the DOE.

  2. Neither the United States Government nor the University of California nor any of their employees, makes any warranty, express or implied, or assumes any liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately-owned rights.

  3. Also, reference herein to any specific commercial products, process, or services by trade name, trademark, manufacturer or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or the University of California. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or the University of California, and shall not be used for advertising or product endorsement purposes.
