aCe - Data-Centric Parallel Programming
Decoupling domain science from performance optimization.
DaCe is a fast parallel programming framework that takes code in Python/NumPy and other programming languages, and maps it to high-performance CPU, GPU, and FPGA programs, which can be optimized to achieve state-of-the-art. Internally, DaCe uses the Stateful DataFlow multiGraph (SDFG) data-centric intermediate representation: A transformable, interactive representation of code based on data movement. Since the input code and the SDFG are separate, it is possible to optimize a program without changing its source, so that it stays readable. On the other hand, transformations are customizable and user-extensible, so they can be written once and reused in many applications. With data-centric parallel programming, we enable direct knowledge transfer of performance optimization, regardless of the application or the target processor.
DaCe generates high-performance programs for:
- Multi-core CPUs (tested on Intel, IBM POWER9, and ARM with SVE)
- NVIDIA GPUs and AMD GPUs (with HIP)
- Xilinx and Intel FPGAs
DaCe can be written inline in Python and transformed in the command-line/Jupyter Notebooks or SDFGs can be interactively modified using our Visual Studio Code extension.
For more information, see the documentation
Quick Start
Install DaCe with pip: pip install dace
Having issues? See our full Installation and Troubleshooting Guide.
Using DaCe in Python is as simple as adding a @dace
decorator:
import dace
import numpy as np
@dace
def myprogram(a):
for i in range(a.shape[0]):
a[i] += i
return np.sum(a)
Calling myprogram
with any NumPy array or GPU array (e.g., PyTorch, Numba, CuPy) will
generate data-centric code, compile, and run it. From here on out, you can
optimize (interactively or automatically), instrument, and distribute
your code. The code creates a shared library (DLL/SO file) that can readily
be used in any C ABI compatible language (C/C++, FORTRAN, etc.).
For more information on how to use DaCe, see the samples or tutorials below:
- Getting Started
- Benchmarks, Instrumentation, and Performance Comparison with Other Python Compilers
- Explicit Dataflow in Python
- NumPy API Reference
- SDFG API
- Using and Creating Transformations
- Extending the Code Generator
Publication
The paper for the SDFG IR can be found here. Other DaCe-related publications are available on our website.
If you use DaCe, cite us:
@inproceedings{dace,
author = {Ben-Nun, Tal and de~Fine~Licht, Johannes and Ziogas, Alexandros Nikolaos and Schneider, Timo and Hoefler, Torsten},
title = {Stateful Dataflow Multigraphs: A Data-Centric Model for Performance Portability on Heterogeneous Architectures},
year = {2019},
booktitle = {Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis},
series = {SC '19}
}
Contributing
DaCe is an open-source project. We are happy to accept Pull Requests with your contributions! Please follow the contribution guidelines before submitting a pull request.
License
DaCe is published under the New BSD license, see LICENSE.