GraphZoom: A Multi-level Spectral Approach for Accurate and Scalable Graph Embedding

GraphZoom

GraphZoom is a framework that improves both the performance and scalability of graph embedding techniques. As shown in the figure below, GraphZoom consists of four kernels: Graph Fusion, Spectral Coarsening, Graph Embedding, and Embedding Refinement. More details are available in our paper: https://openreview.net/forum?id=r1lGO0EKDH

Overview of the GraphZoom framework
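To make the flow of the four kernels concrete, here is a self-contained toy sketch of the multi-level idea. Every helper below (fuse_graph, coarsen, base_embed, refine) is a deliberately naive stand-in written for illustration only, not GraphZoom's actual algorithms or API:

```python
import numpy as np
import scipy.sparse as sp

def fuse_graph(A, X):
    # Graph Fusion (toy): just return the topology; GraphZoom additionally
    # mixes in a feature-similarity graph built from X.
    return A

def coarsen(A):
    # Spectral Coarsening (toy): merge consecutive node pairs (0,1), (2,3), ...
    n = A.shape[0]
    groups = np.arange(n) // 2
    P = sp.csr_matrix((np.ones(n), (np.arange(n), groups)))
    return (P.T @ A @ P).tocsr(), P

def base_embed(A, dim=8):
    # Graph Embedding (toy): random projection of the adjacency rows.
    rng = np.random.default_rng(0)
    return A @ rng.standard_normal((A.shape[1], dim))

def refine(E, A):
    # Embedding Refinement (toy): one step of neighborhood averaging.
    deg = np.asarray(A.sum(axis=1)).ravel() + 1e-12
    return (A @ E) / deg[:, None]

def graphzoom_pipeline(A, X, levels=2):
    graphs, projections = [fuse_graph(A, X)], []
    for _ in range(levels):                      # coarsen level by level
        coarse, P = coarsen(graphs[-1])
        graphs.append(coarse)
        projections.append(P)
    E = base_embed(graphs[-1])                   # embed the smallest graph
    for P, G in zip(reversed(projections), reversed(graphs[:-1])):
        E = refine(P @ E, G)                     # project up, then smooth
    return E
```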

Citation

If you use GraphZoom in your research, please cite our preliminary work published in ICLR'20.

@inproceedings{deng2020graphzoom,
title={GraphZoom: A Multi-level Spectral Approach for Accurate and Scalable Graph Embedding},
author={Chenhui Deng and Zhiqiang Zhao and Yongyu Wang and Zhiru Zhang and Zhuo Feng},
booktitle={International Conference on Learning Representations},
year={2020},
url={https://openreview.net/forum?id=r1lGO0EKDH}
}

Spectral Coarsening Options

  • lamg-based coarsening: the spectral coarsening algorithm used in the original paper; it requires downloading the Matlab Compiler Runtime (MCR).
  • simple coarsening: a simpler spectral coarsening implemented in pure Python, so MCR is not needed. It adopts a similar spectrum-preserving idea to coarsen the graph, but may compromise performance compared to lamg-based coarsening (especially in run-time speedup). A toy sketch of this idea follows below.
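The following is a compact, hedged illustration of one level of spectrum-aware coarsening: greedily match each node with its heaviest unmatched neighbor and form the coarse Laplacian as P^T L P. It is not the repository's simple-coarsening code, only a sketch of the general idea:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import laplacian

def coarsen_one_level(A):
    """A: symmetric sparse adjacency matrix (scipy CSR).
    Returns (coarse adjacency, aggregation matrix P of shape n x n_coarse)."""
    A = A.tocsr()
    n = A.shape[0]
    assign = np.full(n, -1)          # coarse-cluster id of each fine node
    n_coarse = 0
    for u in range(n):
        if assign[u] >= 0:
            continue
        start, end = A.indptr[u], A.indptr[u + 1]
        nbrs, wts = A.indices[start:end], A.data[start:end]
        # heaviest still-unmatched neighbor, if any
        cand = [(w, v) for w, v in zip(wts, nbrs) if assign[v] < 0 and v != u]
        assign[u] = n_coarse
        if cand:
            assign[max(cand)[1]] = n_coarse
        n_coarse += 1
    P = sp.csr_matrix((np.ones(n), (np.arange(n), assign)), shape=(n, n_coarse))
    L_coarse = P.T @ laplacian(A) @ P                      # Galerkin coarse operator
    A_coarse = sp.diags(L_coarse.diagonal()) - L_coarse    # Laplacian -> adjacency
    return A_coarse.tocsr(), P
```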

Requirements

  • Matlab Compiler Runtime (MCR) 2018a (Linux), a standalone set of shared libraries that enables the execution of compiled MATLAB applications and does not require a license to install (only needed if you run lamg-based coarsening).
  • python 3.5/3.6/3.7 (We suggest Conda to manage package dependencies.)
  • numpy
  • networkx
  • scipy
  • scikit-learn
  • gensim, only required by deepwalk, node2vec
  • tensorflow, only required by graphsage
  • torch, ogb, pytorch_geometric, only required by Open Graph Benchmark (OGB) examples

Installation

  • install Matlab Compiler Runtime (only required if you run lamg-based coarsening)

1. wget https://ssd.mathworks.com/supportfiles/downloads/R2018a/deployment_files/R2018a/installers/glnxa64/MCR_R2018a_glnxa64_installer.zip
2. unzip MCR_R2018a_glnxa64_installer.zip -d YOUR_SAVE_PATH
3. cd YOUR_SAVE_PATH
4. ./install -mode silent -agreeToLicense yes -destinationFolder YOUR_MCR_PATH

  • install PyTorch Geometric (only required if you run the OGB examples)
  • create a virtual environment (optional)

1. conda create -n graphzoom python=3.6
2. conda activate graphzoom

  • install the Python packages for GraphZoom

pip install -r requirements.txt

Directory Structure

GraphZoom/
│   README.md
│   requirements.txt
│   ...
│
└───graphzoom/
│   │   graphzoom.py
│   │   cora.sh
│   │   ...
│   │
│   └───dataset/
│   │   │    cora
│   │   │    citeseer
│   │   │    pubmed
│   │
│   └───embed_methods/
│       │    DeepWalk
│       │    node2vec
│       │    GraphSAGE
│
└───mat_coarsen/
│   │   make.m
│   │   LamgSetup.m
│   │   ...
│
└───ogb/
│   │   ...
│   └───ogbn-arxiv/
│   │    │   main.py
│   │    │   mlp.py
│   │    │   arxiv.sh
│   │    │   ...
│   │
│   └───ogbn-products/
│        │   main.py
│        │   mlp.py
│        │   products.sh
│        │   ...

Usage

Note: if you run lamg-based coarsening, you have to pass the root directory of the Matlab Compiler Runtime to the argument --mcr_dir when running graphzoom.py.

Example Usage

  1. cd graphzoom

  2. python graphzoom.py --mcr_dir YOUR_MCR_PATH --dataset citeseer --search_ratio 12 --num_neighs 10 --embed_method deepwalk --coarse lamg

--coarse: choose the coarsening algorithm, [lamg, simple]

--reduce_ratio: reduction ratio when using the lamg-based coarsening method

--level: coarsening level when using the simple coarsening method

--mcr_dir: root directory of the Matlab Compiler Runtime

--dataset: input dataset; currently supports the "json" format

--embed_method: choose the basic embedding algorithm

--search_ratio: controls the search space of graph fusion

--num_neighs: controls the number of edges in the feature graph

Full Command List: the full list of command-line options is available via python graphzoom.py --help
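For example, to run GraphZoom on Cora without MCR, using the Python-only simple coarsening (this particular flag combination is an illustration assembled from the options listed above):

python graphzoom.py --dataset cora --embed_method deepwalk --coarse simple --level 2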

Highlight in Flexibility

You can easily plug a new unsupervised graph embedding model into GraphZoom: simply implement a new function in graphzoom/embed_methods that takes a graph as input and outputs an embedding matrix. A minimal sketch follows.
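For instance, here is a toy plug-in sketch. The module path, function name, and the random-projection "embedding" are illustrative assumptions, not part of the repository:

```python
# graphzoom/embed_methods/random_proj.py  (hypothetical file)
import numpy as np
import networkx as nx

def randproj_embed(G, dim=128):
    """Toy plug-in: takes a networkx graph and returns a
    (num_nodes x dim) numpy embedding matrix, the contract GraphZoom
    expects from a basic embedding kernel. Here the 'embedding' is just
    a random projection of each node's adjacency row."""
    A = nx.adjacency_matrix(G)                      # scipy sparse adjacency
    rng = np.random.default_rng(0)
    R = rng.standard_normal((A.shape[1], dim)) / np.sqrt(dim)
    return A @ R
```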

The current version of GraphZoom supports the following basic embedding models:

  • DeepWalk
  • node2vec
  • GraphSAGE

Dataset

  • Cora
  • Citeseer
  • Pubmed

You can add your own dataset following the JSON format in graphzoom/dataset.
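As a rough sketch, assuming the node-link JSON convention used by GraphSAGE-style loaders (the file name below is only an assumption; check graphzoom/dataset for the exact files and schema expected):

```python
import json
import networkx as nx
from networkx.readwrite import json_graph

# Serialize a toy graph to node-link JSON; the "-G.json" suffix mirrors the
# bundled datasets but is an assumption about the expected layout.
G = nx.karate_club_graph()
with open("mydata-G.json", "w") as f:
    json.dump(json_graph.node_link_data(G), f)
```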

Experimental Results

Here we evaluate GraphZoom on the Cora dataset with DeepWalk as the basic embedding model and the lamg-based coarsening method. GraphZoom-i denotes GraphZoom applied with the i-th coarsening level.

| Method | Accuracy (%) | Speedup | Graph Size (#nodes) |
|---|---|---|---|
| DeepWalk | 71.4 | 1x | 2708 |
| GraphZoom-1 | 76.9 | 2.5x | 1169 |
| GraphZoom-2 | 77.3 | 6.3x | 519 |
| GraphZoom-3 | 75.1 | 40.8x | 218 |

We also evaluate GraphZoom on the ogbn-arxiv and ogbn-products datasets with the lamg-based coarsening method; GraphZoom-1 achieves higher accuracy with far fewer parameters than the Node2vec baseline.

ogbn-arxiv

| Method | Accuracy (%) | #Params |
|---|---|---|
| Node2vec | 70.07 ± 0.13 | 21,818,792 |
| GraphZoom-1 | 71.18 ± 0.18 | 8,963,624 |

ogbn-products

| Method | Accuracy (%) | #Params |
|---|---|---|
| Node2vec | 72.49 ± 0.10 | 313,612,207 |
| GraphZoom-1 | 74.06 ± 0.26 | 120,251,183 |

LAMG Coarsening Code

The MATLAB version of the lamg-based spectral coarsening code is available in mat_coarsen/.
