  • Stars: 122
  • Rank: 292,031 (Top 6 %)
  • Language: C++
  • License: Apache License 2.0
  • Created about 7 years ago
  • Updated 11 months ago

Repository Details

Scalable High-performance Algorithms and Data-structures

SHAD

SHAD is the Scalable High-Performance Algorithms and Data-structures C++ library. SHAD is designed as a software stack, composed of three main layers:

  • Abstract Runtime Interface: SHAD adopts a shared-memory, task-based programming model, whose main tasking primitives are defined in its runtime abstraction layer. This component is an interface to the underlying runtime systems, which implement tasking and threading. For portability, SHAD can interface with multiple runtime systems.
  • General Purpose Data-structures: SHAD data-structures offer a shared-memory abstraction and provide APIs for parallel access and update. Data-structures include arrays, vectors, maps and sets.
  • Extensions: SHAD extensions are custom libraries built using the underlying SHAD components and/or other extensions. SHAD currently includes graph data-structures and algorithms.

SHAD is written in C++, and requires compiler support for (at least) C++ 11. When building with GCC, version >=8 is required. To enable all of SHAD's features, please review its Install Dependencies and Runtime Systems requirements.
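The GCC requirement above can be checked before configuring. The following is a minimal sketch, assuming a POSIX shell; gcc_version_ok is a hypothetical helper name and is not part of SHAD.

```shell
#!/bin/sh
# Hypothetical helper (not part of SHAD): succeed when the major component
# of a version string such as "9.4.0" or "11" is at least 8.
gcc_version_ok() {
  major=${1%%.*}               # keep the text before the first dot
  case "$major" in
    ''|*[!0-9]*) return 2 ;;   # empty or non-numeric input
  esac
  [ "$major" -ge 8 ]
}

# Example usage with the compiler found on PATH:
#   gcc_version_ok "$(gcc -dumpversion)" || echo "GCC >= 8 is required" >&2
```

The helper only inspects the version string it is given, so it can also be pointed at a non-default compiler by calling that compiler's -dumpversion instead.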

How to cite SHAD

In publications SHAD can be cited as [SHAD].

[SHAD] V. G. Castellana and M. Minutoli, "SHAD: The Scalable High-Performance Algorithms and Data-Structures Library," 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), Washington, DC, USA, 2018.

Quickstart with Docker

$ git clone https://github.com/pnnl/SHAD.git shad
$ cd shad
$ docker-compose -f docker/docker-compose.yml pull head worker
$ docker-compose -f docker/docker-compose.yml up -d --scale worker=2
$ docker exec -u mpi -it docker_head_1 /bin/bash
$ cd $HOME/shad
$ mkdir build && cd build
$ cmake .. -DCMAKE_BUILD_TYPE=Release -DSHAD_RUNTIME_SYSTEM=GMT
$ make

To run the array unit test on the Docker cluster:

$ mpiexec -np 2 -ppn 1 --hosts docker_worker_1,docker_worker_2 \
    test/unit_tests/core/shad_array_test
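When the cluster is scaled to a different number of workers, the host list can be generated rather than typed by hand. This is a minimal sketch: worker_hosts is a hypothetical helper, and it assumes the containers keep the docker_worker_<i> naming used in the command above and that one MPI process is started per worker.

```shell
#!/bin/sh
# Hypothetical helper: print docker_worker_1,...,docker_worker_N for N workers.
worker_hosts() {
  n=$1
  hosts=""
  i=1
  while [ "$i" -le "$n" ]; do
    # Prepend a comma only when the list is already non-empty.
    hosts="${hosts:+$hosts,}docker_worker_$i"
    i=$((i + 1))
  done
  printf '%s\n' "$hosts"
}

# Example usage with three workers:
#   mpiexec -np 3 -ppn 1 --hosts "$(worker_hosts 3)" \
#       test/unit_tests/core/shad_array_test
```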

Build Instructions

Install Dependencies

GPerftools

GPerftools is an optional dependency. Of the whole GPerftools framework, SHAD currently uses only tcmalloc, when available. We have seen significant performance improvements when using tcmalloc over the standard allocator, so we recommend its use. If it is not available through your package manager, you can follow the basic instructions below to build and install GPerftools. Please refer to the project page for more detailed information.

$ git clone https://github.com/gperftools/gperftools.git
$ cd gperftools
$ ./autogen.sh
$ mkdir build && cd build
$ ../configure --prefix=$GPERFTOOLSROOT
$ make && make install

where $GPERFTOOLSROOT is the directory where you want the library to be installed.

GTest

The Google Test framework is only required if you want to run the unit tests. On some systems, GTest is not available through the package manager. In those cases, you can install it by following these instructions:

$ git clone https://github.com/google/googletest.git
$ cd googletest
$ mkdir build && cd build && cmake .. -DCMAKE_INSTALL_PREFIX=$GTESTROOT
$ make && make install

where $GTESTROOT is the directory where you want the library to be installed.

Runtime Systems

To fully exploit its features, SHAD requires a supported runtime system or threading library to be installed. SHAD currently supports:

  • Intel Threading Building Blocks (TBB)
  • Global Memory and Threading (GMT)

If such software is not available on the system, SHAD can be compiled and used with its default (single-threaded) C++ backend.

GMT

SHAD uses the Global Memory and Threading (GMT) Runtime System as its backend for commodity clusters. GMT requires a Linux OS, a C compiler, and MPI. It can be installed using the following commands:

$ git clone https://github.com/pnnl/gmt.git
$ cd gmt
$ mkdir build && cd build
$ cmake .. -DCMAKE_INSTALL_PREFIX=$GMT_ROOT \
    -DCMAKE_BUILD_TYPE=Release
$ make -j <SOMETHING_REASONABLE> && make install

where $GMT_ROOT is the directory where you want the library to be installed.

Build SHAD

Before attempting to build SHAD, please review the requirements in Install Dependencies. If GTest is not available, compilation of the unit tests can be disabled by setting SHAD_ENABLE_UNIT_TEST to off. SHAD currently has full support for the TBB and GMT runtime systems; future releases will provide additional backends. The target runtime system is selected via the SHAD_RUNTIME_SYSTEM option, whose valid values are GMT, TBB, and CPP_SIMPLE.

$ git clone <url-to-SHAD-repo>  # or untar the SHAD source code.
$ cd shad
$ mkdir build && cd build
# Runtime-specific option: pass -DTBB_ROOT=$TBBROOT when using TBB, or
# -DGMT_ROOT=$GMT_ROOT (shown here) when using GMT.
$ cmake .. -DCMAKE_INSTALL_PREFIX=$SHADROOT        \
    -DCMAKE_BUILD_TYPE=Release                     \
    -DSHAD_RUNTIME_SYSTEM=<SupportedRuntimeSystem> \
    -DGMT_ROOT=$GMT_ROOT                           \
    -DGTEST_ROOT=$GTESTROOT                        \
    -DGPERFTOOLS_ROOT=$GPERFTOOLSROOT
$ make -j <SOMETHING_REASONABLE> && make install
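The runtime-dependent part of the configuration can be assembled by a small shell function, which also rejects values that SHAD_RUNTIME_SYSTEM does not accept. This is only a sketch: shad_cmake_args is a hypothetical helper, and it covers just the build type, the runtime selection, and the matching TBB_ROOT or GMT_ROOT option named in this section.

```shell
#!/bin/sh
# Hypothetical helper: shad_cmake_args RUNTIME [ROOT]
# Prints the cmake options for the chosen runtime system, or fails for
# values other than GMT, TBB, and CPP_SIMPLE.
shad_cmake_args() {
  runtime=$1
  root=${2:-}
  case "$runtime" in
    TBB)        extra="-DTBB_ROOT=$root" ;;
    GMT)        extra="-DGMT_ROOT=$root" ;;
    CPP_SIMPLE) extra="" ;;   # no runtime-specific root option
    *)
      echo "unsupported SHAD_RUNTIME_SYSTEM: $runtime" >&2
      return 1
      ;;
  esac
  printf '%s\n' "-DCMAKE_BUILD_TYPE=Release -DSHAD_RUNTIME_SYSTEM=$runtime${extra:+ $extra}"
}

# Example usage (left unquoted on purpose so the options are split into words):
#   cmake .. $(shad_cmake_args GMT "$GMT_ROOT")
```

Because the example relies on word splitting, it assumes the install path contains no spaces.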

If you have multiple compilers (or compiler versions) available on your system, you may want to indicate a specific one using the -DCMAKE_CXX_COMPILER=<compiler> option.
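One way to choose among several installed compilers is to probe a preference list and pass the first match to CMake. The sketch below assumes a POSIX shell; pick_cxx is a hypothetical helper, and the candidate names in the usage example (g++-8, g++, clang++) are illustrative assumptions, not a list taken from SHAD.

```shell
#!/bin/sh
# Hypothetical helper: print the first candidate command that exists on PATH.
# Returns non-zero when none of the candidates is available.
pick_cxx() {
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Example usage:
#   CXX_BIN=$(pick_cxx g++-8 g++ clang++) || { echo "no C++ compiler found" >&2; exit 1; }
#   cmake .. -DCMAKE_CXX_COMPILER="$CXX_BIN"
```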

Build the Documentation

SHAD's documentation is written entirely using Doxygen. You can obtain a copy of Doxygen through your package manager or by following the installation instructions on its website. To build SHAD's documentation, run:

$ cd shad/build  # cd into your build directory.
$ cmake .. -DSHAD_ENABLE_DOXYGEN=1
$ make doxygen

Once the documentation is built, you can open its first page in your favorite web browser with:

$ open docs/doxygen/html/index.html  # From your build directory

SHAD Team

Disclaimer Notice

This material was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor the United States Department of Energy, nor Battelle, nor any of their employees, nor any jurisdiction or organization that has cooperated in the development of these materials, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, software, or process disclosed, or represents that its use would not infringe privately owned rights.

Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof, or Battelle Memorial Institute. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.

PACIFIC NORTHWEST NATIONAL LABORATORY
operated by
BATTELLE
for the
UNITED STATES DEPARTMENT OF ENERGY
under Contract DE-AC05-76RL01830
