  • Language: C++
  • License: Apache License 2.0
  • Created almost 8 years ago
  • Updated about 3 years ago

Gemini

A computation-centric distributed graph processing system.

Quick Start

Gemini uses MPI for inter-process communication and libnuma for NUMA-aware memory allocation. A compiler supporting OpenMP and C++11 features (such as lambda expressions and multi-threading) is required.

Implementations of five graph analytics applications (PageRank, Connected Components, Single-Source Shortest Paths, Breadth-First Search, Betweenness Centrality) are included in the toolkits/ directory.

To build:

make

The input parameters of these applications are as follows:

./toolkits/pagerank [path] [vertices] [iterations]
./toolkits/cc [path] [vertices]
./toolkits/sssp [path] [vertices] [root]
./toolkits/bfs [path] [vertices] [root]
./toolkits/bc [path] [vertices] [root]

[path] gives the path of an input graph, i.e. a file stored on a shared file system, consisting of |E| <source vertex id, destination vertex id, edge data> tuples in binary. [vertices] gives the number of vertices |V|. Vertex IDs are represented with 32-bit integers and edge data can be omitted for unweighted graphs (e.g. the above applications except SSSP). Note: CC makes the input graph undirected by adding a reversed edge to the graph for each loaded one; SSSP uses float as the type of weights.
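The input format described above can be produced with a few lines of C++. The following is only a minimal sketch under stated assumptions, not code from Gemini itself: it assumes records are stored back to back with no header or padding, in the machine's native byte order, and that vertex IDs are unsigned. The names Edge, edge_record_size, write_binary_edge_list and count_edges are hypothetical helpers introduced for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// One edge as a <source vertex id, destination vertex id, edge data> tuple.
// (Hypothetical helper type, not part of Gemini.)
struct Edge {
    std::uint32_t src;
    std::uint32_t dst;
    float weight;  // only written for weighted graphs (e.g. SSSP input)
};

// Bytes per record: two 32-bit vertex IDs plus an optional float weight.
constexpr std::size_t edge_record_size(bool weighted) {
    return 2 * sizeof(std::uint32_t) + (weighted ? sizeof(float) : 0);
}

// Write the edges as consecutive binary records.
// Assumption: native byte order, no header, no padding between fields.
bool write_binary_edge_list(const std::string& path,
                            const std::vector<Edge>& edges,
                            bool weighted) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    for (const Edge& e : edges) {
        // Fields are written one by one so struct padding never reaches the file.
        out.write(reinterpret_cast<const char*>(&e.src), sizeof(e.src));
        out.write(reinterpret_cast<const char*>(&e.dst), sizeof(e.dst));
        if (weighted) {
            out.write(reinterpret_cast<const char*>(&e.weight), sizeof(e.weight));
        }
    }
    out.flush();
    return out.good();
}

// Derive |E| from the file size. Returns -1 if the file cannot be opened
// or its size is not a whole number of records.
long long count_edges(const std::string& path, bool weighted) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return -1;
    const long long bytes = static_cast<long long>(in.tellg());
    const long long rec = static_cast<long long>(edge_record_size(weighted));
    return (bytes >= 0 && bytes % rec == 0) ? bytes / rec : -1;
}
```

Under these assumptions an unweighted file occupies 8 bytes per edge and a weighted one 12 bytes per edge, so checking the file size against |E| is a quick sanity test before launching a job.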

If Slurm is installed on the cluster, you can launch jobs with srun. For example, to run 20 iterations of PageRank on the twitter-2010 graph across 8 nodes:

srun -N 8 ./toolkits/pagerank /path/to/twitter-2010.binedgelist 41652230 20

Resources

Xiaowei Zhu, Wenguang Chen, Weimin Zheng, and Xiaosong Ma. Gemini: A Computation-Centric Distributed Graph Processing System. 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI '16).
