  • Stars: 132
  • Rank: 274,205 (Top 6%)
  • Language: Python
  • Created: about 5 years ago
  • Updated: about 4 years ago

Repository Details

Source code from the NeurIPS 2019 workshop article "Keep It Simple: Graph Autoencoders Without Graph Convolutional Networks" (G. Salha, R. Hennequin, M. Vazirgiannis), together with an implementation of the k-core framework from the IJCAI 2019 article "A Degeneracy Framework for Scalable Graph Autoencoders" (G. Salha, R. Hennequin, V.A. Tran, M. Vazirgiannis).

Linear Graph Autoencoders

This repository provides Python (Tensorflow) code to reproduce experiments from the article Keep It Simple: Graph Autoencoders Without Graph Convolutional Networks presented at the NeurIPS 2019 Workshop on Graph Representation Learning.

Update: an extended conference version of this article is now available here: Simple and Effective Graph Autoencoders with One-Hop Linear Models (accepted at ECML-PKDD 2020).

Update 2: do you prefer PyTorch? An implementation of Linear Graph AE and VAE is now available in the pytorch_geometric project! See the example here.

Introduction

We release Tensorflow implementations of the following two graph embedding models from the paper:

  • Linear Graph Autoencoders
  • Linear Graph Variational Autoencoders

together with standard Graph Autoencoder (AE) and Graph Variational Autoencoder (VAE) models (with 2-layer or 3-layer Graph Convolutional Network encoders) from Kipf and Welling (2016).
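For intuition, here is a minimal NumPy sketch of the linear models (an illustration only, not the repository's Tensorflow implementation): the encoder is the one-hop linear map Z = ÃXW, where Ã denotes the symmetrically normalized adjacency matrix with self-loops, and the decoder reconstructs edges from inner products of embeddings. All variable names are illustrative.

import numpy as np
import scipy.sparse as sp

def normalize_adjacency(A):
    # A_tilde = D^(-1/2) (A + I) D^(-1/2), as in Kipf and Welling (2016)
    A_hat = A + sp.eye(A.shape[0])
    deg = np.asarray(A_hat.sum(axis=1)).flatten()
    d_inv_sqrt = sp.diags(1.0 / np.sqrt(deg))
    return d_inv_sqrt @ A_hat @ d_inv_sqrt

# Toy 4-node graph in the featureless setting (X = identity), embedding dimension 2
A = sp.csr_matrix(np.array([[0., 1., 1., 0.],
                            [1., 0., 1., 0.],
                            [1., 1., 0., 1.],
                            [0., 0., 1., 0.]]))
X = np.eye(4)                    # one-hot node "features" when none are available
W = 0.1 * np.random.randn(4, 2)  # the single weight matrix to be learned

Z = np.asarray(normalize_adjacency(A) @ X @ W)  # linear encoder: one-hop aggregation only
A_rec = 1.0 / (1.0 + np.exp(-Z @ Z.T))          # inner product decoder with sigmoid
print(A_rec.round(2))

In the variational (VAE) variants, two such linear maps parameterize the mean and standard deviation of the Gaussian posteriors, as described in section 3 of the paper.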

We evaluate all models on the link prediction and node clustering tasks introduced in the paper. We provide the Cora, Citeseer and Pubmed datasets in the data folder, and refer to section 4 of the paper for direct links to the additional datasets used in our experiments.

Our code builds upon Thomas Kipf's original Tensorflow implementation of standard Graph AE/VAE.

Linear AE and VAE

Scaling-Up Graph AE and VAE

Standard Graph AE and VAE models suffer from scalability issues. In order to scale them to large graphs with millions of nodes and edges, we also provide an implementation of our framework from the article A Degeneracy Framework for Scalable Graph Autoencoders (IJCAI 2019). In this paper, we propose to train the graph AE/VAE only on a dense subset of nodes, namely the k-core or k-degenerate subgraph. Embedding representations are then propagated to the remaining nodes using faster heuristics, as sketched below.
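The sketch below illustrates this two-step idea with networkx. It is an illustration of the principle only: embed_fn stands in for any trained encoder, and the outward propagation shown here (a plain average of already-embedded neighbors) is a simplified version of the heuristics described in the IJCAI paper.

import networkx as nx
import numpy as np

def embed_with_kcore(G, k, embed_fn, dim):
    # Train an embedding on the k-core only, then propagate outwards.
    core = nx.k_core(G, k)
    Z = embed_fn(core)  # expensive training restricted to the dense core
    remaining = set(G.nodes()) - set(Z)
    while remaining:
        # Embed every node that already has at least one embedded neighbor
        frontier = [v for v in remaining
                    if any(u in Z for u in G.neighbors(v))]
        if not frontier:  # disconnected leftovers: fall back to random vectors
            frontier = list(remaining)
        for v in frontier:
            nbrs = [Z[u] for u in G.neighbors(v) if u in Z]
            Z[v] = np.mean(nbrs, axis=0) if nbrs else np.random.randn(dim)
        remaining -= set(frontier)
    return Z

# Toy usage, with random vectors standing in for a trained graph AE encoder:
G = nx.karate_club_graph()
Z = embed_with_kcore(G, k=3,
                     embed_fn=lambda H: {v: np.random.randn(16) for v in H},
                     dim=16)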

Update: in this other repository, we provide an implementation of FastGAE, a new (and more effective) method from our group to scale Graph AE and VAE.

Degeneracy Framework

Installation

python setup.py install

Requirements: tensorflow (1.X), networkx, numpy, scikit-learn, scipy
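For instance, assuming a Python version supported by Tensorflow 1.X (e.g. Python 3.7, with 1.15 being the last 1.X release), the dependencies could be installed with:

pip install "tensorflow>=1.15,<2.0" networkx numpy scikit-learn scipy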

Run Experiments

cd linear_gae
python train.py --model=gcn_vae --dataset=cora --task=link_prediction
python train.py --model=linear_vae --dataset=cora --task=link_prediction

The above commands will train a standard Graph VAE with 2-layer GCN encoders (line 2) and a Linear Graph VAE (line 3) on the Cora dataset, and will evaluate the embeddings on the link prediction task, with all parameters set to their default values.

python train.py --model=gcn_vae --dataset=cora --task=link_prediction --kcore=True --k=2
python train.py --model=gcn_vae --dataset=cora --task=link_prediction --kcore=True --k=3
python train.py --model=gcn_vae --dataset=cora --task=link_prediction --kcore=True --k=4

By adding --kcore=True, the model is trained on the k-core subgraph only, instead of the entire graph. Here, k (an integer from 0 to the maximal core number of the graph) is specified via the --k flag.
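If you are unsure which values of k make sense for your own graph, its maximal core number can be checked with networkx (the edgelist path in the comment is a hypothetical example):

import networkx as nx

# Replace with your own graph, e.g. G = nx.read_edgelist("my_graph.edgelist")
G = nx.karate_club_graph()
print(max(nx.core_number(G).values()))  # largest k for which the k-core is non-empty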

Complete list of parameters

Each parameter below is listed with its type, description and default value:

  • model (string, default: gcn_ae): name of the model, among:
    - gcn_ae: Graph AE from Kipf and Welling (2016), with 2-layer GCN encoder and inner product decoder
    - gcn_vae: Graph VAE from Kipf and Welling (2016), with Gaussian distributions, 2-layer GCN encoders for mu and sigma, and inner product decoder
    - linear_ae: Linear Graph AE, as introduced in section 3 of the NeurIPS workshop paper, with linear encoder and inner product decoder
    - linear_vae: Linear Graph VAE, as introduced in section 3 of the NeurIPS workshop paper, with Gaussian distributions, linear encoders for mu and sigma, and inner product decoder
    - deep_gcn_ae: deeper version of Graph AE, with 3-layer GCN encoder and inner product decoder
    - deep_gcn_vae: deeper version of Graph VAE, with Gaussian distributions, 3-layer GCN encoders for mu and sigma, and inner product decoder
  • dataset (string, default: cora): name of the dataset, among:
    - cora: scientific publications citation network
    - citeseer: scientific publications citation network
    - pubmed: scientific publications citation network
    We provide the preprocessed versions from the tkipf/gae repository; please check the LINQS website for the raw data. You can also specify any additional graph dataset, in edgelist format, by editing input_data.py (see the sketch after this list).
  • task (string, default: link_prediction): name of the machine learning evaluation task, among:
    - link_prediction: link prediction
    - node_clustering: node clustering
    See section 4 and the supplementary material of the NeurIPS 2019 workshop paper for details about both tasks.
  • dropout (float, default: 0.): dropout rate
  • epochs (int, default: 200): number of epochs in model training
  • features (boolean, default: False): whether to include node features in the encoder
  • learning_rate (float, default: 0.01): initial learning rate (with the Adam optimizer)
  • hidden (int, default: 32): number of units in the GCN encoder hidden layer(s)
  • dimension (int, default: 16): dimension of the encoder output, i.e. the embedding dimension
  • kcore (boolean, default: False): whether to run k-core decomposition and use the degeneracy framework from the IJCAI paper; if False, the AE/VAE is trained on the entire graph
  • k (int, default: 2): which k-core to use; a higher k means a smaller subgraph and faster (but possibly less accurate) training
  • nb_run (int, default: 1): number of model runs + tests
  • prop_val (float, default: 5.): proportion of edges in the validation set, in % (for link prediction)
  • prop_test (float, default: 10.): proportion of edges in the test set, in % (for link prediction)
  • validation (boolean, default: False): whether to report validation results at each epoch (for link prediction)
  • verbose (boolean, default: True): whether to print full training details
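As a rough illustration of the custom-dataset route mentioned above, here is the kind of loader you might plug into input_data.py, assuming the tkipf/gae-style pipeline ultimately consumes a scipy sparse adjacency matrix plus an optional feature matrix (the function name and file path are hypothetical):

import networkx as nx
import scipy.sparse as sp

def load_custom_graph(path):
    # Read an undirected "u v" edgelist and return the sparse adjacency
    # matrix, plus identity features for the featureless case.
    G = nx.read_edgelist(path, nodetype=int)
    adj = nx.adjacency_matrix(G)           # scipy sparse adjacency
    features = sp.identity(adj.shape[0])   # placeholder one-hot features
    return adj, features

# Hypothetical usage: adj, features = load_custom_graph("my_graph.edgelist")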

Models from the paper

Cora

python train.py --dataset=cora --model=linear_ae --task=link_prediction --epochs=200 --learning_rate=0.01 --dimension=16 --nb_run=5
python train.py --dataset=cora --model=linear_vae --task=link_prediction --epochs=200 --learning_rate=0.01 --dimension=16 --nb_run=5
python train.py --dataset=cora --model=gcn_ae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5
python train.py --dataset=cora --model=gcn_vae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5
python train.py --dataset=cora --model=deep_gcn_ae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5
python train.py --dataset=cora --model=deep_gcn_vae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5

Cora - with features

python train.py --dataset=cora --features=True --model=linear_ae --task=link_prediction --epochs=200 --learning_rate=0.01 --dimension=16 --nb_run=5
python train.py --dataset=cora --features=True --model=linear_vae --task=link_prediction --epochs=200 --learning_rate=0.01 --dimension=16 --nb_run=5
python train.py --dataset=cora --features=True --model=gcn_ae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5
python train.py --dataset=cora --features=True --model=gcn_vae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5
python train.py --dataset=cora --features=True --model=deep_gcn_ae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5
python train.py --dataset=cora --features=True --model=deep_gcn_vae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5

Citeseer

python train.py --dataset=citeseer --model=linear_ae --task=link_prediction --epochs=200 --learning_rate=0.01 --dimension=16 --nb_run=5
python train.py --dataset=citeseer --model=linear_vae --task=link_prediction --epochs=200 --learning_rate=0.01 --dimension=16 --nb_run=5
python train.py --dataset=citeseer --model=gcn_ae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5
python train.py --dataset=citeseer --model=gcn_vae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5
python train.py --dataset=citeseer --model=deep_gcn_ae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5
python train.py --dataset=citeseer --model=deep_gcn_vae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5

Citeseer - with features

python train.py --dataset=citeseer --features=True --model=linear_ae --task=link_prediction --epochs=200 --learning_rate=0.01 --dimension=16 --nb_run=5
python train.py --dataset=citeseer --features=True --model=linear_vae --task=link_prediction --epochs=200 --learning_rate=0.01 --dimension=16 --nb_run=5
python train.py --dataset=citeseer --features=True --model=gcn_ae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5
python train.py --dataset=citeseer --features=True --model=gcn_vae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5
python train.py --dataset=citeseer --features=True --model=deep_gcn_ae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5
python train.py --dataset=citeseer --features=True --model=deep_gcn_vae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5

Pubmed

python train.py --dataset=pubmed --model=linear_ae --task=link_prediction --epochs=200 --learning_rate=0.01 --dimension=16 --nb_run=5
python train.py --dataset=pubmed --model=linear_vae --task=link_prediction --epochs=200 --learning_rate=0.01 --dimension=16 --nb_run=5
python train.py --dataset=pubmed --model=gcn_ae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5
python train.py --dataset=pubmed --model=gcn_vae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5
python train.py --dataset=pubmed --model=deep_gcn_ae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5
python train.py --dataset=pubmed --model=deep_gcn_vae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5

Pubmed - with features

python train.py --dataset=pubmed --features=True --model=linear_ae --task=link_prediction --epochs=200 --learning_rate=0.01 --dimension=16 --nb_run=5
python train.py --dataset=pubmed --features=True --model=linear_vae --task=link_prediction --epochs=200 --learning_rate=0.01 --dimension=16 --nb_run=5
python train.py --dataset=pubmed --features=True --model=gcn_ae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5
python train.py --dataset=pubmed --features=True --model=gcn_vae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5
python train.py --dataset=pubmed --features=True --model=deep_gcn_ae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5
python train.py --dataset=pubmed --features=True --model=deep_gcn_vae --task=link_prediction --epochs=200 --learning_rate=0.01 --hidden=32 --dimension=16 --nb_run=5

Notes:

  • Set --task=node_clustering with the same hyperparameters to evaluate the models on node clustering (as in Table 4 of the paper) instead of link prediction
  • Set --nb_run=100 to report mean AUC and AP scores, along with standard errors over 100 runs, as in the paper (computed as sketched below)
  • We recommend using a GPU for faster training
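For reference, here is a minimal sketch of how AUC and AP can be computed with scikit-learn from node embeddings and held-out positive/negative test edges (an illustrative helper, not the repository's exact evaluation code):

import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

def link_prediction_scores(Z, pos_edges, neg_edges):
    # Score each candidate edge (i, j) with the sigmoid of the inner
    # product of its node embeddings, then compare to the true labels.
    def scores(edges):
        return [1.0 / (1.0 + np.exp(-Z[i].dot(Z[j]))) for i, j in edges]
    preds = np.array(scores(pos_edges) + scores(neg_edges))
    labels = np.array([1] * len(pos_edges) + [0] * len(neg_edges))
    return roc_auc_score(labels, preds), average_precision_score(labels, preds)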

Cite

1 - Please cite the following paper(s) if you use the linear graph AE/VAE code in your own work.

NeurIPS 2019 workshop version:

@misc{salha2019keep,
  title={Keep It Simple: Graph Autoencoders Without Graph Convolutional Networks},
  author={Salha, Guillaume and Hennequin, Romain and Vazirgiannis, Michalis},
  howpublished={Workshop on Graph Representation Learning, 33rd Conference on Neural Information Processing Systems (NeurIPS)},
  year={2019}
}

and/or the extended conference version:

@inproceedings{salha2020simple,
  title={Simple and Effective Graph Autoencoders with One-Hop Linear Models},
  author={Salha, Guillaume and Hennequin, Romain and Vazirgiannis, Michalis},
  booktitle={European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD)},
  year={2020}
}

2 - Please cite the following paper if you use the k-core framework for scalability in your own work.

@inproceedings{salha2019degeneracy,
  title={A Degeneracy Framework for Scalable Graph Autoencoders},
  author={Salha, Guillaume and Hennequin, Romain and Tran, Viet Anh and Vazirgiannis, Michalis},
  booktitle={28th International Joint Conference on Artificial Intelligence (IJCAI)},
  year={2019}
}

More Repositories

1. spleeter (Python, 25,759 stars): Deezer source separation library, including pretrained models.
2. javascript-samples (JavaScript, 71 stars): Samples for the Deezer JavaScript SDK.
3. KustomExport (Kotlin, 67 stars): EXPERIMENTAL: a JS facade generator for KotlinJS / KMP.
4. w2v_reco_hyperparameters_matter (Python, 53 stars): Repository to reproduce the results of "Word2vec applied to Recommendation: Hyperparameters Matter" by H. Caselles-Dupré, F. Lesaint and J. Royo-Letelier, published at the 12th ACM Conference on Recommender Systems (RecSys 2018), Vancouver, Canada, October 2018.
5. carousel_bandits (Python, 53 stars): Source code and data from the RecSys 2020 article "Carousel Personalization in Music Streaming Apps with Contextual Bandits" by W. Bendada, G. Salha and T. Bontempelli.
6. weslang (Java, 52 stars): A language detection web service.
7. html-linter (Python, 51 stars): HTML5 linter based on the Google Style Guide.
8. semi_perso_user_cold_start (Python, 47 stars): Source code from the KDD 2021 article "A Semi-Personalized System for User Cold Start Recommendation on Music Streaming Apps" by L. Briand, G. Salha-Galvan, W. Bendada, M. Morlon and V.A. Tran.
9. gravity_graph_autoencoders (Python, 43 stars): Source code from the CIKM 2019 article "Gravity-Inspired Graph Autoencoders for Directed Link Prediction" by G. Salha, S. Limnios, R. Hennequin, V.A. Tran and M. Vazirgiannis.
10. android-sample (Java, 40 stars): Sample application using the Deezer Android SDK.
11. deezer_mood_detection_dataset (33 stars): A dataset for valence/arousal detection, with Deezer IDs and MSD IDs as input.
12. cover_song_detection (Python, 27 stars): Tools to run experiments around large-scale cover detection.
13. MusicGenreTranslation (Python, 27 stars): Python code for reproducing the music genre translation experiments presented in the paper "Leveraging knowledge bases and parallel annotations for music genre translation" (ISMIR 2019).
14. fastgae (Python, 26 stars): Source code from the article "FastGAE: Scalable Graph Autoencoders with Stochastic Subgraph Decoding" by G. Salha, R. Hennequin, J.B. Remy, M. Moussallam and M. Vazirgiannis (2020).
15. zeroNoteSamba (Python, 24 stars): Repository for the IEEE/ACM TASLP 2023 paper "Zero-Note Samba: Self-Supervised Beat Tracking".
16. sigir23-mojito (Python, 24 stars): Source code from the SIGIR 2023 article "Attention Mixtures for Time-Aware Sequential Recommendation" by V.A. Tran, G. Salha-Galvan, B. Sguerra and R. Hennequin.
17. similar_artists_ranking (Python, 19 stars): Cold start similar artists ranking with gravity-inspired graph autoencoders (RecSys 2021).
18. sigir2019-2stagesampling (Python, 16 stars): Improving collaborative metric learning for recommendation with a 2-stage negative sampling strategy.
19. playntell (Python, 15 stars): Code to reproduce the experiments presented in the article "Data-Efficient Playlist Captioning With Musical and Linguistic Knowledge" (EMNLP 2022).
20. interpretable_nn_attribution (Jupyter Notebook, 14 stars): Source code from our RecSys 2020 paper "Making neural network interpretable with attribution: application to implicit signals prediction" (D. Afchar, R. Hennequin).
21. musicFPaugment (Python, 14 stars): Code for reproducing the paper "Music Augmentation and Denoising For Peak-Based Audio Fingerprinting".
22. MultilingualLyricsToAudioAlignment (13 stars): DALI dataset splits used to train the models presented in the paper "Multilingual lyrics-to-audio alignment" (ISMIR 2020).
23. deezer.github.io (SCSS, 10 stars): Research team website.
24. MultilingualMusicGenreEmbedding (Python, 10 stars): Python code to reproduce the experiments presented in the paper "Multilingual Music Genre Embeddings for Effective Cross-Lingual Music Item Annotation" (ISMIR 2020).
25. elasticmsd (Python, 9 stars): Transfer the Million Song Dataset (MSD) into an Elasticsearch index.
26. code-snippets (JavaScript, 8 stars): Code snippets for the Deezer SDK / API.
27. template-remover (Python, 8 stars): Template remover.
28. muzeeglot (Python, 8 stars): Web interface to visualize multilingual music genre embeddings and generated cross-lingual music genre annotations for Wikipedia music entities (artists, bands, albums, tracks).
29. podcast-topic-modeling (Python, 8 stars): Code to reproduce the experiments presented in the article "Topic Modeling on Podcast Short-Text Metadata" (ECIR 2022).
30. APC-RTA (Python, 7 stars)
31. iOS-sdk-samples (HTML, 7 stars): Sample applications using the Deezer iOS SDK in Objective-C & Swift.
32. music-ner-eacl2023 (Python, 7 stars): Code and data to reproduce the experiments presented in the article "A Human Subject Study of Named Entity Recognition (NER) in Conversational Music Recommendation Queries" (EACL 2023).
33. deezer-chromaprint (C++, 6 stars): A stand-alone, x86- and x64-compatible Windows version of the chromaprint audio library.
34. recsys21-hlr (Python, 6 stars): Hierarchical latent relation modeling for collaborative metric learning.
35. functional_attribution (Python, 6 stars): Code of our accepted ICML 2021 paper "Towards Rigorous Interpretations: a Formalisation of Feature Attribution" (D. Afchar, R. Hennequin, V. Guigue).
36. CrossCulturalMusicGenrePerception (Python, 6 stars): Python code to reproduce the experiments presented in the article "Modeling the Music Genre Perception across Language-Bound Cultures" (EMNLP 2020).
37. Counsel (Java, 5 stars): A collection of advices, ready to use via AOP.
38. multi-view-ssl-benchmark (Python, 5 stars): Repository for the ICASSP 2024 paper "An Experimental Comparison Of Multi-view Self-supervised Methods For Music Tagging".
39. GroROTI (Go, 5 stars): G(r)oROTI is a self-hosted Return On Time Invested web application written in Go.
40. deepfake-detector (4 stars): Code repository of our research paper "Detecting music deepfakes is easy but actually hard" by D. Afchar, G. Meseguer Brocal and R. Hennequin.
41. net-ner-probing (Python, 4 stars): Code and data to reproduce the experiments from the article "Probing Pre-trained Auto-regressive Language Models for Named Entity Typing and Recognition", presented at LREC 2022.
42. WindowsPhoneToastNotifications (C#, 4 stars): An in-app notification system for Windows Phone Silverlight applications.
43. GraceNote2Deezer (4 stars): An Android sample application using both the GraceNote and Deezer SDKs to match your music with Deezer's catalog.
44. Android-Aspectj-Plugin (Groovy, 4 stars): A Gradle plugin which enables AspectJ for Android builds.
45. spiky_svd (Python, 4 stars): Code repository of our RecSys 2023 paper "Of Spiky SVDs and Music Recommendation" by D. Afchar, R. Hennequin and V. Guigue (2023).
46. concept_hierarchy (Jupyter Notebook, 4 stars): Source code of our ISMIR 2022 paper "Learning Unsupervised Hierarchies of Audio Concepts" by D. Afchar, R. Hennequin and V. Guigue (2022).
47. SingingLanguageIdentification (3 stars): Dataset splits for reproducing the paper "Singing Language Identification Using a Deep Phonotactic Approach" (ICASSP 2021).
48. java-diff-merge-tool (Java, 3 stars): A smart tool providing diff/merge resolution for Java files.
49. ex2vec (Python, 3 stars)
50. CordovaDeezerSample (JavaScript, 3 stars): Sample for the Deezer Cordova plugin.
51. terragrunt-example (HCL, 3 stars)
52. CordovaDeezerPlugin (Java, 3 stars): A Cordova plugin to embed JS apps within Android.
53. audio_based_disambiguation_of_music_genre_tags (3 stars): Supporting data for the paper "Audio Based Disambiguation Of Music Genre Tags" by R. Hennequin, J. Royo-Letelier and M. Moussallam (ISMIR 2018).
54. Search-Intent-Exploration (2 stars)
55. pauzee_taln23 (Python, 2 stars): Pause prediction in text reading.
56. Disambiguating-Music-Artists-at-Scale-with-Audio-Metric-Learning (1 star)
57. web-samples (1 star)
58. aads_french (Python, 1 star): Code and data to reproduce the experiments presented in "Automatic Annotation of Direct Speech in Written French Narratives" (ACL 2023).
59. rust-ectoken (Rust, 1 star): Token generator for Edgecast token-based authentication from Verizon Digital Media Services.
60. recsys24-pisa (1 star): Code for reproducing the experiments of the PISA paper published at RecSys 2024.
61. stone (1 star): Repository for the ISMIR 2024 paper "STONE: Self-supervised Tonality Estimator".
62. new-releases-ecir2024 (1 star): This repository will contain resources related to the ECIR 2024 Industry Talk "Let’s Get It Started: Fostering the Discoverability of New Releases on Deezer" by Léa Briand et al.