Vector Particle-In-Cell (VPIC) Project

Welcome to the legacy version of VPIC! The new version of VPIC, based on the Kokkos performance portable framework, is available here: https://github.com/lanl/vpic-kokkos. This legacy version is no longer under active development, and new users are encouraged to use the Kokkos version.

VPIC is a general-purpose particle-in-cell simulation code for modeling kinetic plasmas in one, two, or three spatial dimensions. It employs a second-order, explicit, leapfrog algorithm to update charged particle positions and velocities in order to solve the relativistic kinetic equation for each species in the plasma, along with a full Maxwell description for the electric and magnetic fields evolved via a second-order finite-difference-time-domain (FDTD) solve.

The VPIC code has been optimized for modern computing architectures and uses Message Passing Interface (MPI) calls for multi-node application as well as data parallelism using threads. VPIC employs a variety of short-vector, single-instruction-multiple-data (SIMD) intrinsics for high performance and has been designed so that the data structures align with cache boundaries.

The current feature set for VPIC includes a flexible input deck format capable of treating a wide variety of problems. These include: the ability to treat electromagnetic materials (scalar and tensor dielectric, conductivity, and diamagnetic material properties); multiple emission models, including user-configurable models; arbitrary, user-configurable boundary conditions for particles and fields; user-definable simulation units; a suite of "standard" diagnostics, as well as user-configurable diagnostics; a Monte-Carlo treatment of collisional processes capable of treating binary and unary collisions and secondary particle generation; and flexible checkpoint-restart semantics enabling VPIC checkpoint files to be read as input for subsequent simulations. VPIC has a native I/O format that interfaces with the high-performance visualization software EnSight and ParaView. While the common use cases for VPIC employ low-order particles on rectilinear meshes, a framework exists to treat higher-order particles and curvilinear meshes, as well as more advanced field solvers.

Attribution

Researchers who use the VPIC code for scientific research are asked to cite the papers by Kevin Bowers listed below.

  1. K.J. Bowers, B.J. Albright, B. Bergen, L. Yin, K.J. Barker and D.J. Kerbyson, "0.374 Pflop/s Trillion-Particle Kinetic Modeling of Laser Plasma Interaction on Roadrunner," Proc. 2008 ACM/IEEE Conf. Supercomputing (Gordon Bell Prize Finalist Paper). http://dl.acm.org/citation.cfm?id=1413435

  2. K.J. Bowers, B.J. Albright, B. Bergen and T.J.T. Kwan, "Ultrahigh performance three-dimensional electromagnetic relativistic kinetic plasma simulation," Phys. Plasmas 15, 055703 (2008). http://dx.doi.org/10.1063/1.2840133

  3. K.J. Bowers, B.J. Albright, L. Yin, W. Daughton, V. Roytershteyn, B. Bergen and T.J.T. Kwan, "Advances in petascale kinetic simulations with VPIC and Roadrunner," Journal of Physics: Conference Series 180, 012055 (2009).

Getting the Code

To check out the VPIC source, do the following:

    git clone https://github.com/lanl/vpic.git

Branches

The stable release of VPIC is on master, the default branch.

For more cutting-edge features, consider using the devel branch.

User contributions should target the devel branch.

Requirements

The primary requirements to build VPIC are a C++11 capable compiler and an up-to-date version of MPI.

Build Instructions

    cd vpic 

VPIC uses the CMake build system. To configure a build, do the following from the top-level source directory:

    mkdir build
    cd build

The ./arch directory also contains various cmake scripts (including specific build options) which can help with building, but the user is left to select which compiler they wish to use. The scripts are largely organized into folders by compiler, with specific flags and options set to match the target compiler.

Any of the arch scripts can be invoked by specifying the file name from inside a build directory:

    ../arch/reference-Debug

After configuration, simply type:

    make

Three scripts in the ./arch directory are of particular note: lanl-ats1-hsw, lanl-ats1-knl and lanl-cts1. These scripts provide a default way to build VPIC on LANL ATS-1 clusters, such as Trinity and Trinitite, and on LANL CTS-1 clusters. The LANL ATS-1 clusters are the first generation of DOE Advanced Technology Systems and consist of a partition of dual-socket Intel Haswell nodes and a partition of single-socket Intel Knights Landing nodes. The LANL CTS-1 clusters are the first generation of DOE Commodity Technology Systems and consist of dual-socket Intel Broadwell nodes running the TOSS 3.3 operating system.

The lanl-ats1-hsw, lanl-ats1-knl and lanl-cts1 scripts are heavily documented and can be configured to provide a large variety of custom builds for their respective platform types. These scripts could also serve as a good starting point for developing a build script for other platform types. Because these scripts also configure the user's build environment through module commands, they run both the cmake and make commands.

From the user-created build directory, these scripts can be invoked as follows:

    ../arch/lanl-ats1-hsw

or

    ../arch/lanl-ats1-knl

or

    ../arch/lanl-cts1

Advanced users may choose to instead invoke cmake directly and hand select options. Documentation on valid ways to select these options may be found in the lanl-ats1 and lanl-cts1 build scripts mentioned above.

GCC users should ensure the -fno-strict-aliasing compiler flag is set (as shown in ./arch/generic-gcc-sse).
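A direct invocation for a GCC build might look like the following dry-run sketch, which only prints the configure line instead of running it. It assumes the standard CMake variables CMAKE_BUILD_TYPE and CMAKE_CXX_FLAGS, combined with the USE_V4_SSE option described under Vectorization. The shell variable names (build_type, cxx_flags, configure_cmd) are illustrative only.

```shell
# Dry-run sketch: print a direct CMake configure line for a GCC build.
# CMAKE_BUILD_TYPE and CMAKE_CXX_FLAGS are standard CMake variables;
# USE_V4_SSE is one of the VPIC options listed under "Vectorization".
build_type="Release"
cxx_flags="-fno-strict-aliasing"   # a single flag, so no extra quoting is needed

configure_cmd="cmake -DCMAKE_BUILD_TYPE=${build_type} -DCMAKE_CXX_FLAGS=${cxx_flags} -DUSE_V4_SSE=ON .."

# Printed only; run the printed line from the build directory to configure.
echo "${configure_cmd}"
```

Running the printed command from the build directory, followed by make, corresponds to the manual route. The arch scripts remain the reference for the full set of options.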

Building an example input deck

After you have successfully built VPIC, you should have an executable in the bin directory called vpic (./bin/vpic). To build an executable from one of the sample input decks (found in ./sample), simply run:

    ./bin/vpic input_deck

where input_deck is the name of your sample deck. For example, to build the harris input deck in the sample subdirectory (assuming that your build directory is located in the top-level source directory):

    ./bin/vpic ../sample/harris

Beginners are advised to read the harris deck thoroughly, as it provides many examples of common use cases.

Command Line Arguments

Note: Users of earlier VPIC versions should be aware that the format of command line arguments changed in the first open source release. The equals symbol is no longer accepted, and two dashes are mandatory.

In general, command line arguments take the form --command value, in which two dashes are followed by a keyword, with a space delimiting the command and the value.
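The rule can be illustrated with a small pre-flight check. This is a sketch only: arg_style is a hypothetical helper, not part of VPIC, and -tpp=2 stands in for any argument written with a single dash and an equals sign.

```shell
# Hypothetical helper (not part of VPIC): classify a keyword token as
# "current" (two dashes, no equals sign) or "rejected" (anything else).
arg_style() {
  case "$1" in
    *=*)  echo "rejected" ;;   # the equals symbol is no longer accepted
    --?*) echo "current" ;;    # two dashes followed by a keyword
    *)    echo "rejected" ;;   # a single dash, or no dashes at all
  esac
}

echo "--tpp 2 : $(arg_style --tpp)"
echo "-tpp=2  : $(arg_style -tpp=2)"
```

A launch script that still uses the older form could run such a check before calling the VPIC binary.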

The following specific syntax is available to the users:

Threading

Threading (per MPI rank) can be enabled using the following syntax:

    ./binary.Linux --tpp n

where n specifies the number of threads.

Example:

    mpirun -n 2 ./binary.Linux --tpp 2

This runs VPIC with two threads per MPI rank.

Checkpoint Restart

VPIC can restart from a checkpoint dump file, using the following syntax:

    ./binary.Linux --restore <path to file>

Example:

    ./binary.Linux --restore ./restart/restart0 

This restarts VPIC using the restart file ./restart/restart0.

Compile Time Arguments

Currently, the following options are exposed at compile time for the user's consideration:

Particle Array Resizing

  • DISABLE_DYNAMIC_RESIZING (default OFF): Set to ON to disable dynamic resizing of the particle arrays
  • SET_MIN_NUM_PARTICLES (default 128 [4 KB]): Set the minimum number of particles allowed when dynamically resizing
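The size quoted for the default minimum suggests a simple way to estimate the footprint of other values. The sketch below assumes 32 bytes per particle, a figure inferred only from the default of 128 particles being listed as 4 KB. The helper min_array_bytes is hypothetical, not part of VPIC.

```shell
# Assumption: 32 bytes per particle, inferred from the default of
# 128 particles being listed as 4 KB.
bytes_per_particle=32

# Hypothetical helper (not part of VPIC): bytes needed for N particles.
min_array_bytes() {
  echo $(( $1 * bytes_per_particle ))
}

min_array_bytes 128    # the default minimum
min_array_bytes 1024   # a larger, user-chosen minimum
```

Under this assumption the default minimum works out to 4096 bytes, which matches the 4 KB noted for SET_MIN_NUM_PARTICLES.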

Threading Model

  • USE_PTHREADS: Use Pthreads for the threading model (default ON)
  • USE_OPENMP: Use OpenMP for the threading model

Vectorization

The following CMake variables are used to control the vector implementation that VPIC uses for each SIMD width. Currently, 128-bit, 256-bit and 512-bit SIMD widths are supported. By default, each of these CMake variables is disabled, which means that an unvectorized reference implementation of the functions is used.

  • USE_V4_SSE: Enable 4 wide (128-bit) SSE

  • USE_V4_AVX: Enable 4 wide (128-bit) AVX

  • USE_V4_AVX2: Enable 4 wide (128-bit) AVX2

  • USE_V4_ALTIVEC: Enable 4 wide (128-bit) Altivec

  • USE_V4_PORTABLE: Enable 4 wide (128-bit) portable implementation

  • USE_V8_AVX: Enable 8 wide (256-bit) AVX

  • USE_V8_AVX2: Enable 8 wide (256-bit) AVX2

  • USE_V8_PORTABLE: Enable 8 wide (256-bit) portable implementation

  • USE_V16_AVX512: Enable 16 wide (512-bit) AVX512

  • USE_V16_PORTABLE: Enable 16 wide (512-bit) portable implementation

Several functions in VPIC have vector implementations for each of the three SIMD widths. Others have a vector implementation for only one width. An example of the latter is move_p, which has only a reference implementation and a V4 implementation.

It is possible to have a single CMake vector variable configured as ON for each of the three supported SIMD vector widths. It is recommended to always have a CMake variable configured as ON for the 128 bit SIMD vector width so that move_p will be vectorized. In addition, it is recommended to configure as ON the CMake variable that is associated with the native SIMD vector width of the processor that VPIC is targeting. If a CMake variable is configured as ON for each of the three available SIMD vector widths, then for a given function in VPIC, the implementation which supports the largest SIMD vector length will be chosen. If a V16 implementation exists, it will be chosen. If a V16 implementation does not exist but V8 and V4 implementations exist, the V8 implementation will be chosen. If V16 and V8 implementations do not exist but a V4 implementation does, it will be chosen. If no SIMD vector implementation exists, the unvectorized reference implementation will be chosen.
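The selection rule described above can be sketched as a small shell function. This models the behavior described in the text and is not VPIC's actual build logic. The helper select_impl and its two arguments are hypothetical.

```shell
# Hypothetical helper (not part of VPIC): mimic the selection rule above.
#   $1: space-separated SIMD widths configured ON, e.g. "4 8"
#   $2: space-separated widths a function has vector code for, e.g. "4"
# Prints V16, V8, V4, or "reference".
select_impl() {
  for width in 16 8 4; do            # try the widest vector length first
    case " $1 " in
      *" $width "*)                  # this width is configured ON
        case " $2 " in
          *" $width "*) echo "V$width"; return 0 ;;   # and it is implemented
        esac ;;
    esac
  done
  echo "reference"                   # no usable vector implementation
}

select_impl "4 8" "4 8 16"   # V16 code exists but is not configured ON
select_impl "4 16" "4"       # a function like move_p with only a V4 version
select_impl "16" "4"         # V4 code exists but V4 is not configured ON
```

The last call shows why keeping a 128-bit variable ON is recommended: without it, a function such as move_p falls back to the reference implementation.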

In summary, when using vector versions on a machine with 256 bit SIMD, the V4 and V8 implementations should be configured as ON. When using a machine with 512 bit SIMD, V4 and V16 implementations should be configured as ON. When choosing a vector implementation for a given SIMD vector length, the implementation that is closest to the SIMD instruction set for the targeted processor should be chosen. The portable versions are most commonly used for debugging the implementation of new intrinsics versions. However, the portable versions are generally more performant than the unvectorized reference implementation. So, one might consider using the V4_PORTABLE version on ARM processors until a V4_NEON implementation becomes available.
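As a rough illustration of this summary, the sketch below maps an instruction set name to a set of CMake options. The helper vector_flags is hypothetical and not part of VPIC. Pairing USE_V4_AVX2 with USE_V16_AVX512 on 512-bit hardware is an assumption about which V4 variant is closest, so the arch scripts should be consulted for a specific platform.

```shell
# Hypothetical helper (not part of VPIC): map an instruction set name to
# the CMake options suggested by the summary above.
vector_flags() {
  case "$1" in
    sse)     echo "-DUSE_V4_SSE=ON" ;;
    avx)     echo "-DUSE_V4_AVX=ON -DUSE_V8_AVX=ON" ;;
    avx2)    echo "-DUSE_V4_AVX2=ON -DUSE_V8_AVX2=ON" ;;
    # Assumption: AVX2 is the closest V4 variant on AVX-512 hardware.
    avx512)  echo "-DUSE_V4_AVX2=ON -DUSE_V16_AVX512=ON" ;;
    altivec) echo "-DUSE_V4_ALTIVEC=ON" ;;
    # Fallback for other processors: the portable V4 implementation.
    *)       echo "-DUSE_V4_PORTABLE=ON" ;;
  esac
}

# Dry run: print the configure line rather than executing it.
echo "cmake $(vector_flags avx2) .."
```

The printed line can be run from the build directory, or the options can be merged into an existing arch script.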

Output

  • VPIC_PRINT_MORE_DIGITS: Enable more digits in timing output of status reports

Particle sorting implementation

The CMake variable below allows building VPIC to use the legacy, thread serial implementation of the particle sort algorithm.

  • USE_LEGACY_SORT: Use legacy thread serial particle sort (default OFF)

The legacy particle sort implementation is the thread serial particle sort implementation from the legacy v407 version of VPIC. It supports both in-place and out-of-place sorting of the particles. It is very competitive with the thread parallel sort implementation for a small number of threads per MPI rank (4 or fewer), especially on KNL, because sorting the particles in-place allows the fraction of particles stored in High Bandwidth Memory (HBM) to remain in HBM. In addition, the memory footprint of VPIC is reduced by the size of one particle array, which can be significant for particle-dominated problems.

The default particle sort implementation is a thread parallel implementation. Currently, it can only perform out-of-place sorting of the particles. It will be more performant than the legacy implementation when using many threads per MPI rank but uses more memory because of the out-of-place sort.
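One way to turn this guidance into a starting point is sketched below. The helper sort_flag is hypothetical and not part of VPIC, and the threshold of 4 threads per MPI rank is taken from the description above as a rule of thumb only. Benchmarking both settings on the target problem is the more reliable guide.

```shell
# Hypothetical helper (not part of VPIC): suggest a sort option from the
# number of threads per MPI rank, using the "4 or fewer" guideline above.
sort_flag() {
  if [ "$1" -le 4 ]; then
    echo "-DUSE_LEGACY_SORT=ON"    # few threads: legacy sort is competitive
  else
    echo "-DUSE_LEGACY_SORT=OFF"   # many threads: default parallel sort
  fi
}

sort_flag 2
sort_flag 16
```

When memory is the main constraint, the legacy sort may be worth trying even with more threads, since its in-place mode avoids the extra particle array.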

Workflow

Contributors are asked to be aware of the following workflow:

  1. Pull requests are accepted into devel upon tests passing
  2. master should reflect the stable state of the code
  3. Periodic releases will be made from devel into master

Feedback

Feedback, comments, or issues can be raised through GitHub issues.

A mailing list for open collaboration can also be found here

Versioning

Version release summary:

V1.2 (October 2020)

  • Improved Neon intrinsics support
  • Added Takizuka-Abe collision operator
  • Threaded hydro_p pipelines
  • Added unit documentation

V1.1 (March 2019)

  • Added V8 and V16 functionality
  • Improved documentation and build processes
  • Significantly improved testing and correctness capabilities

V1.0

Initial release

Release

This software has been approved for open source release and has been assigned LA-CC-15-109.

Copyright

© (or copyright) 2020. Triad National Security, LLC. All rights reserved. This program was produced under U.S. Government contract 89233218CNA000001 for Los Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC for the U.S. Department of Energy/National Nuclear Security Administration. All rights in the program are reserved by Triad National Security, LLC, and the U.S. Department of Energy/National Nuclear Security Administration. The Government is granted for itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide license in this material to reproduce, prepare derivative works, distribute copies to the public, perform publicly and display publicly, and to permit others to do so.

License

VPIC is distributed under a BSD license.
