Language: Jupyter Notebook
License: MIT License

Graph Information Bottleneck (GIB) for learning minimal sufficient structural and feature information using GNNs

GIB

This repository reproduces the results of the paper Graph Information Bottleneck (Tailin Wu *, Hongyu Ren *, Pan Li, Jure Leskovec, NeurIPS 2020). The paper's objective is to learn minimal sufficient structural and feature information with GNNs, which improves their robustness.

Representation learning on graphs with graph neural networks (GNNs) is a challenging task. Previous work has shown that GNNs are susceptible to adversarial attacks. We here introduce the Graph Information Bottleneck (GIB), which learns a representation that is maximally informative about the target to predict while using minimal sufficient information from the input data. Concretely, the GIB principle regularizes the representation of both the node features and the graph structure, which increases the robustness of GNNs. For more information, see our paper Graph Information Bottleneck (Wu et al. 2020) and our project website at http://snap.stanford.edu/gib/.
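The trade-off described above (maximally informative about the target, minimally informative about the input) can be written as an information bottleneck objective. The form below follows the paper's formulation up to notation. The symbols are not defined in this README and are introduced here for illustration: D = (A, X) is the input graph with structure A and node features X, Y is the target, Z is the learned representation, I(·;·) is mutual information, and β > 0 weights the compression term.

```latex
\min_{\mathbb{P}(Z \mid \mathcal{D})} \; \mathrm{GIB}_{\beta}(\mathcal{D}, Y; Z)
  \;=\; -\, I(Y; Z) \;+\; \beta \, I(\mathcal{D}; Z),
  \qquad \mathcal{D} = (A, X)
```

The first term rewards representations that predict Y, and the second penalizes information about the input graph retained in Z.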

[Figure: GIB principle]

Installation

First clone the repository. Then run the following command to initialize the submodules:

git submodule init; git submodule update

(If this fails with a permission error, first add a new SSH key to your GitHub account.)

The repository also has the following dependencies; please refer to their respective pages for installation instructions:

Additional requirements are in requirements.txt, which can be installed via pip install -r requirements.txt.

After installing the dependencies, cd into the "DeepRobust/" directory and install it by running:

pip install -e .

Usage

The main experiment files are:

which can be run from the command line or in a Jupyter notebook.

The result files are saved under the "results/" folder.

The definitions of GIB-GAT, GAT, and GCN are in experiments/GIB_node_model.ipynb.

The analysis script is experiments/GIB_node_analysis.ipynb.

Run adversarial attack experiments

To run multiple attack experiments, each with a different hyperparameter combination, use "run_exp/run_nettack_grid.py", e.g.

python run_exp/run_nettack_grid.py ${Assign_ID} ${GPU_ID}

where each integer ${Assign_ID} (0 to M-1) maps to one hyperparameter setting (M is the total number of settings), and ${GPU_ID} is the ID (e.g. 0, 1, 2) of the CUDA device (set it to False to use the CPU).
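As an illustration of how an integer index can select one setting out of M, here is a minimal sketch that enumerates a Cartesian product of hyperparameter values. The grid contents and the names GRID and setting_for are hypothetical. The actual grid and indexing scheme are defined in run_exp/run_nettack_grid.py and may differ.

```python
from itertools import product

# Hypothetical grid for illustration only; the real values live in
# run_exp/run_nettack_grid.py and are not reproduced here.
GRID = {
    "data_type": ["Cora-bool", "Pubmed-bool", "citeseer-bool"],
    "beta1": [0.001, 0.01],
    "seed": [0, 1, 2, 3, 4],
}


def setting_for(assign_id, grid=GRID):
    """Return the hyperparameter dict selected by the index assign_id."""
    keys = list(grid)
    # Enumerate every combination; the last key varies fastest.
    combos = list(product(*(grid[k] for k in keys)))
    if not 0 <= assign_id < len(combos):
        raise IndexError(f"assign_id must be in [0, {len(combos) - 1}]")
    return dict(zip(keys, combos[assign_id]))


# M is the total number of settings (3 * 2 * 5 for this grid).
M = len(list(product(*GRID.values())))
```

With this grid, valid indices run from 0 to M-1, so one process can be launched per index, each on its own GPU.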

Alternatively, to run a single attack experiment, use "run_exp/run_nettack.py". Below are the commands that produce the adversarial attack results in the paper (for node feature attacks, see the README in run_exp/). Among the args, "exp_id" and "date_time" are used to name the folder "{}_{}".format(exp_id, date_time) in which the results are saved. "gpuid" can also be set as needed. Each experiment needs to be run with seeds 0, 1, 2, 3, and 4 before performing the analysis; for brevity, the commands below only show --seed=0. Also note that every "data_type" below has the suffix "-bool", which makes the features Boolean, as required by Nettack. After running each experiment, use the script experiments/GIB_node_analysis.ipynb (Section 2) to perform the analysis and obtain the results.
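The seed loop described above can be scripted. The sketch below only builds the command strings for seeds 0 to 4 and the name of the results folder; it does not run anything. The helper names nettack_commands and result_folder are hypothetical, and the date_time value in the usage note is a placeholder, since this README does not specify its format. The flags in the usage note are those of the Cora GCN baseline command in this README.

```python
import shlex


def result_folder(exp_id, date_time):
    """Folder name used for the results, as described in the README."""
    return "{}_{}".format(exp_id, date_time)


def nettack_commands(exp_id, data_type, model_type, seeds=range(5), gpuid=0, **extra):
    """Build one run_nettack.py command line per seed (nothing is executed)."""
    commands = []
    for seed in seeds:
        args = {
            "exp_id": exp_id,
            "data_type": data_type,
            "model_type": model_type,
            **extra,  # e.g. beta1, beta2
            "seed": seed,
            "gpuid": gpuid,
        }
        argv = ["python", "run_exp/run_nettack.py"]
        argv += [f"--{key}={value}" for key, value in args.items()]
        # shlex.join quotes any argument that the shell would otherwise split.
        commands.append(shlex.join(argv))
    return commands


if __name__ == "__main__":
    for cmd in nettack_commands("Cora-GCN", "Cora-bool", "GCN", beta1=-1, beta2=-1):
        print(cmd)
```

For example, result_folder("Cora-GCN", "2020-10-01") yields "Cora-GCN_2020-10-01", where the date string is only a placeholder. Each printed command can then be pasted into a shell or passed to a job scheduler.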

Cora with GIB-Cat:

python run_exp/run_nettack.py --exp_id=Cora-GIB-Cat --data_type=Cora-bool --model_type=GAT --beta1=0.001 --beta2=0.01 --struct_dropout_mode='\("DNsampling","multi-categorical-sum",1,3,2\)' --seed=0 --gpuid=0

Cora with GIB-Bern:

python run_exp/run_nettack.py --exp_id=Cora-GIB-Bern --data_type=Cora-bool --model_type=GAT --beta1=0.001 --beta2=0.01 --struct_dropout_mode='\("DNsampling","Bernoulli",0.1,0.5,"norm",2\)' --seed=0 --gpuid=0

Pubmed with GIB-Cat:

python run_exp/run_nettack.py --exp_id=Pubmed-GIB-Cat --data_type=Pubmed-bool --model_type=GAT --beta1=0.001 --beta2=0.01 --struct_dropout_mode='\("DNsampling","multi-categorical-sum",1,3,2\)' --seed=0 --gpuid=0

Pubmed with GIB-Bern:

python run_exp/run_nettack.py --exp_id=Pubmed-GIB-Bern --data_type=Pubmed-bool --model_type=GAT --beta1=0.001 --beta2=0.01 --struct_dropout_mode='\("DNsampling","Bernoulli",0.1,0.5,"norm",2\)' --seed=0 --gpuid=0

Citeseer with GIB-Cat:

python run_exp/run_nettack.py --exp_id=Citeseer-GIB-Cat --data_type=citeseer-bool --model_type=GAT --beta1=0.001 --beta2=0.01 --struct_dropout_mode='\("DNsampling","multi-categorical-sum",0.1,2,2\)' --seed=0 --gpuid=0

Citeseer with GIB-Bern:

python run_exp/run_nettack.py --exp_id=Citeseer-GIB-Bern --data_type=citeseer-bool --model_type=GAT --beta1=0.001 --beta2=0.01 --struct_dropout_mode='\("DNsampling","Bernoulli",0.05,0.5,"norm",2\)' --seed=0 --gpuid=0

Other baselines:

Cora with GAT:

python run_exp/run_nettack.py --exp_id=Cora-GAT --data_type=Cora-bool --model_type=GAT --beta1=-1 --beta2=-1 --struct_dropout_mode='\("standard",0.6\)' --seed=0 --gpuid=0

Cora with GCN:

python run_exp/run_nettack.py --exp_id=Cora-GCN --data_type=Cora-bool --model_type=GCN --beta1=-1 --beta2=-1 --seed=0 --gpuid=0

Cora with GCNJaccard:

python run_exp/run_nettack.py --exp_id=Cora-GCNJaccard --data_type=Cora-bool --model_type=GCNJaccard --beta1=-1 --beta2=-1 --latent_size=16 --lr=1e-2 --weight_decay=5e-4 --threshold=0.05 --seed=0 --gpuid=0

Cora with RGCN:

python run_exp/run_nettack.py --exp_id=Cora-RGCN --data_type=Cora-bool --model_type=RGCN --beta1=5e-4 --beta2=-1 --latent_size=64 --lr=1e-2 --weight_decay=5e-4 --gamma=0.3 --seed=0 --gpuid=0

Pubmed with GAT:

python run_exp/run_nettack.py --exp_id=Pubmed-GAT --data_type=Pubmed-bool --model_type=GAT --beta1=-1 --beta2=-1 --struct_dropout_mode='\("standard",0.6\)' --seed=0 --gpuid=0

Pubmed with GCN:

python run_exp/run_nettack.py --exp_id=Pubmed-GCN --data_type=Pubmed-bool --model_type=GCN --beta1=-1 --beta2=-1 --seed=0 --gpuid=0

Pubmed with GCNJaccard:

python run_exp/run_nettack.py --exp_id=Pubmed-GCNJaccard --data_type=Pubmed-bool --model_type=GCNJaccard --beta1=-1 --beta2=-1 --latent_size=16 --lr=1e-2 --weight_decay=5e-4 --threshold=0.05 --seed=0 --gpuid=0

Pubmed with RGCN:

python run_exp/run_nettack.py --exp_id=Pubmed-RGCN --data_type=Pubmed-bool --model_type=RGCN --beta1=5e-4 --beta2=-1 --latent_size=16 --lr=1e-2 --weight_decay=5e-4 --gamma=0.1 --seed=0 --gpuid=0

Citeseer with GAT:

python run_exp/run_nettack.py --exp_id=Citeseer-GAT --data_type=citeseer-bool --model_type=GAT --beta1=-1 --beta2=-1 --struct_dropout_mode='\("standard",0.6\)' --seed=0 --gpuid=0

Citeseer with GCN:

python run_exp/run_nettack.py --exp_id=Citeseer-GCN --data_type=citeseer-bool --model_type=GCN --beta1=-1 --beta2=-1 --seed=0 --gpuid=0

Citeseer with GCNJaccard:

python run_exp/run_nettack.py --exp_id=Citeseer-GCNJaccard --data_type=citeseer-bool --model_type=GCNJaccard --beta1=-1 --beta2=-1 --latent_size=16 --lr=1e-2 --weight_decay=5e-4 --threshold=0.05 --seed=0 --gpuid=0

Citeseer with RGCN:

python run_exp/run_nettack.py --exp_id=Citeseer-RGCN --data_type=citeseer-bool --model_type=RGCN --beta1=5e-4 --beta2=-1 --latent_size=64 --lr=1e-2 --weight_decay=5e-4 --gamma=0.3 --seed=0 --gpuid=0

Ablation study:

Cora with XIB:

python run_exp/run_nettack.py --exp_id=Cora-XIB --data_type=Cora-bool --model_type=GAT --beta1=0.001 --beta2=-1 --struct_dropout_mode='\("standard",0.6,2\)' --seed=0 --gpuid=0

Cora with AIB-Cat:

python run_exp/run_nettack.py --exp_id=Cora-AIB-Cat --data_type=Cora-bool --model_type=GAT --beta1=-1 --beta2=0.01 --struct_dropout_mode='\("DNsampling","multi-categorical-sum",1,3,2\)' --seed=0 --gpuid=0

Cora with AIB-Bern:

python run_exp/run_nettack.py --exp_id=Cora-AIB-Bern --data_type=Cora-bool --model_type=GAT --beta1=-1 --beta2=0.01 --struct_dropout_mode='\("DNsampling","Bernoulli",0.1,0.5,"norm",2\)' --seed=0 --gpuid=0
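The GAT-based Cora commands above differ mainly in their beta1 and beta2 values. The sketch below collects those values, copied from the commands, and lists which coefficients are non-negative for each variant. Reading a value of -1 as "term switched off" is an assumption inferred from the pattern of the commands, not something this README states. The names CORA_VARIANTS and active_terms are hypothetical.

```python
# beta1/beta2 values copied from the Cora commands in this README.
CORA_VARIANTS = {
    "GIB-Cat": {"beta1": 0.001, "beta2": 0.01},
    "GIB-Bern": {"beta1": 0.001, "beta2": 0.01},
    "XIB": {"beta1": 0.001, "beta2": -1},
    "AIB-Cat": {"beta1": -1, "beta2": 0.01},
    "AIB-Bern": {"beta1": -1, "beta2": 0.01},
    "GAT": {"beta1": -1, "beta2": -1},
}


def active_terms(betas):
    """List coefficients with a non-negative value.

    Assumption: a negative value (here -1) disables the corresponding term.
    """
    return [name for name, value in betas.items() if value >= 0]


if __name__ == "__main__":
    for variant, betas in CORA_VARIANTS.items():
        print(f"{variant:9s} active: {active_terms(betas)}")
```

Under that assumption, the full GIB variants use both coefficients, XIB uses only beta1, the AIB variants use only beta2, and the plain GAT baseline uses neither.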

Citation

If you compare with, build on, or use aspects of the Graph Information Bottleneck, please cite the following:

@inproceedings{wu2020graph,
title={Graph Information Bottleneck},
author={Wu, Tailin and Ren, Hongyu and Li, Pan and Leskovec, Jure},
booktitle={Neural Information Processing Systems},
year={2020},
}
