• Stars
    star
    128
  • Rank 281,044 (Top 6 %)
  • Language
    C++
  • License
    Apache License 2.0
  • Created almost 10 years ago
  • Updated over 6 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Out-of-core graph processing on a single machine.

GridGraph

A large scale graph processing framework on a single machine.

Compilation

Compilers supporting basic C++11 features (lambdas, threads, etc.) and OpenMP are required.

To compile:

make

Preprocessing

Before running applications on a graph, GridGraph needs to partition the original edge list into the grid format.

Two types of edge list files are supported:

  • Unweighted. Edges are tuples of <4 byte source, 4 byte destination>.
  • Weighted. Edges are tuples of <4 byte source, 4 byte destination, 4 byte float typed weight>.

To partition the edge list:

./bin/preprocess -i [input path] -o [output path] -v [vertices] -p [partitions] -t [edge type: 0=unweighted, 1=weighted]

For example, we want to partition the unweighted LiveJournal graph into a 4x4 grid:

./bin/preprocess -i /data/LiveJournal -o /data/LiveJournal_Grid -v 4847571 -p 4 -t 0

You may need to raise the limit of maximum open file descriptors (./tools/raise_ulimit_n.sh).

Running Applications

To run the applications, just give the path of the grid format and the memory budge (unit in GB), as well as other necessary program parameters (e.g. the starting vertex of BFS, the number of iterations of PageRank, etc.):

BFS

./bin/bfs [path] [start vertex id] [memory budget]

WCC

./bin/wcc [path] [memory budget]

SpMV

./bin/spmv [path] [memory budget]

PageRank

./bin/pagerank [path] [number of iterations] [memory budget]

For example, to run 20 iterations of PageRank on the (grid partitioned) LiveJournal graph using a machine with 8 GB RAM:

./bin/pagerank /data/LiveJournal_Grid 20 8

Resources

Xiaowei Zhu, Wentao Han and Wenguang Chen. GridGraph: Large-Scale Graph Processing on a Single Machine Using 2-Level Hierarchical Partitioning. Proceedings of the 2015 USENIX Annual Technical Conference, pages 375-386.

To cite GridGraph, you can use the following BibTeX entry:

@inproceedings {zhu2015gridgraph,
author = {Xiaowei Zhu and Wentao Han and Wenguang Chen},
title = {GridGraph: Large-Scale Graph Processing on a Single Machine Using 2-Level Hierarchical Partitioning},
booktitle = {2015 USENIX Annual Technical Conference (USENIX ATC 15)},
year = {2015},
month = Jul,
isbn = {978-1-931971-225},
address = {Santa Clara, CA},
pages = {375--386},
url = {https://www.usenix.org/conference/atc15/technical-session/presentation/zhu},
publisher = {USENIX Association},
}

More Repositories

1

GeminiGraph

A computation-centric distributed graph processing system.
C++
310
star
2

PET

PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections
C++
114
star
3

TriCache

A User-Transparent Block Cache Enabling High-Performance Out-of-Core Processing with In-Memory Programs
C++
74
star
4

FasterMoE

Python
65
star
5

gscholar-citations-crawler

Crawl all your citations from Google Scholar
Python
54
star
6

LiveGraph

LiveGraph: a transactional graph storage system with purely sequential adjacency list scans
C++
51
star
7

HyQuas

A hybrid partitioner based quantum circuit simulation system on GPU
C++
46
star
8

SmartMoE-AE

ATC23 AE
Python
42
star
9

GraphPi

C++
36
star
10

RisGraph

RisGraph: A Real-Time Streaming System for Evolving Graphs to Support Sub-millisecond Per-update Analysis at Millions Ops/s
C++
34
star
11

Spindle

C++
31
star
12

lab-guide

Everything about PACMAN!
11
star
13

VAPRO

Light-weight Performance Variance Detection for Production-run Parallel Applications
C
10
star
14

self-checkpoint

An in-memory checkpoint method using less space.
C
6
star
15

AIPerf

Python
6
star
16

mpi-profiler

A simple and easy-to-use profiler for MPI programs. It profiles CPU time and MPI time for each process. No source code modification is need, just re-link the program with this library.
C
5
star
17

LiveGraph-Binary

LiveGraph: a transactional graph storage system with purely sequential adjacency list scans
C++
4
star
18

CYPRESS

CYPRESS: Combining Static and Dynamic Analysis for Top-Down Communication Trace Compression
C
3
star
19

AIPerf-MoE

MoE Model Benchmark of AIPerf
Python
3
star
20

Mat2Stencil

A Modular Matrix-Based DSL for Explicit and Implicit Matrix-Free PDE Solvers on Structured Grid.
2
star
21

tprint

tprint is a printing library specially designed for SW architecture. Currently providing C and fortran API.
C
2
star