  • Language: Python
  • License: MIT License
  • Created almost 6 years ago
  • Updated about 1 year ago

RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space

Introduction

This is the PyTorch implementation of the RotatE model for knowledge graph embedding (KGE). We provide a toolkit that achieves state-of-the-art performance for several popular KGE models. The toolkit is efficient: it can train a large KGE model within a few hours on a single GPU.

A faster multi-GPU implementation of RotatE and other KGE models is available in GraphVite.

Implemented features

Models:

  • RotatE
  • pRotatE
  • TransE
  • ComplEx
  • DistMult

Evaluation Metrics:

  • MRR, MR, HITS@1, HITS@3, HITS@10 (filtered)
  • AUC-PR (for Countries data sets)
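The ranking metrics above can be sketched as follows, assuming each test triple has already been assigned a rank (1 is best) among candidate entities. In the filtered setting, candidates that form other known true triples are excluded before ranking. The function names below are illustrative and do not come from the repository.

```python
def filtered_rank(scores, true_index, known_true):
    """Rank of the true candidate, ignoring other candidates known to be true.

    scores: list of model scores, higher is better.
    true_index: position of the correct entity.
    known_true: set of positions of other correct entities to filter out.
    """
    true_score = scores[true_index]
    better = sum(
        1
        for i, s in enumerate(scores)
        if i != true_index and i not in known_true and s > true_score
    )
    return better + 1


def ranking_metrics(ranks, ks=(1, 3, 10)):
    """Compute MRR, MR, and HITS@k from a list of ranks."""
    n = len(ranks)
    metrics = {
        "MRR": sum(1.0 / r for r in ranks) / n,  # mean reciprocal rank
        "MR": sum(ranks) / n,                    # mean rank
    }
    for k in ks:
        metrics["HITS@%d" % k] = sum(1 for r in ranks if r <= k) / n
    return metrics


rank = filtered_rank([0.9, 0.8, 0.7, 0.1], true_index=2, known_true={0})
metrics = ranking_metrics([1, 2, 4, 20])
```

In the toy call, candidate 0 scores higher than the true candidate but is filtered out because it is itself a known true answer, so the filtered rank is 2 rather than 3.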

Loss Function:

  • Uniform Negative Sampling
  • Self-Adversarial Negative Sampling
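The difference between the two sampling schemes can be sketched for a single positive triple as below. This follows the loss described in the RotatE paper: negatives are weighted by a softmax over their current scores with temperature `alpha`. It is a standard-library illustration under that assumption; the repository's batched PyTorch code may reduce or average the terms differently, and the function names are hypothetical.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def self_adversarial_loss(pos_score, neg_scores, alpha):
    """Loss for one positive triple and its sampled negatives.

    Scores are assumed to be 'margin minus distance', so higher means more
    plausible. With alpha = 0 the weights are uniform, which corresponds to
    uniform negative sampling.
    """
    # Softmax over alpha * score: harder (higher-scoring) negatives weigh more.
    shifted_max = max(alpha * s for s in neg_scores)
    exps = [math.exp(alpha * s - shifted_max) for s in neg_scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    pos_loss = -log_sigmoid(pos_score)
    neg_loss = -sum(w * log_sigmoid(-s) for w, s in zip(weights, neg_scores))
    return pos_loss + neg_loss, weights


loss, weights = self_adversarial_loss(2.0, [1.0, -1.0, -3.0], alpha=1.0)
uniform_loss, uniform_weights = self_adversarial_loss(2.0, [1.0, -1.0, -3.0], alpha=0.0)
```

The weights are treated as constants during training in the paper's formulation, so gradients flow only through the log-sigmoid terms.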

Usage

Knowledge Graph Data:

  • entities.dict: a dictionary mapping entities to unique ids
  • relations.dict: a dictionary mapping relations to unique ids
  • train.txt: the KGE model is trained to fit this data set
  • valid.txt: create a blank file if no validation data is available
  • test.txt: the KGE model is evaluated on this data set
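A minimal sketch of loading these files is shown below. It assumes tab-separated lines of the form `id<TAB>name` in the .dict files and `head<TAB>relation<TAB>tail` in the triple files. That layout is an assumption for illustration, so check the files under data/ before relying on it. In-memory strings stand in for the real files.

```python
import io


def read_dict(handle):
    """Read 'id<TAB>name' lines into a name -> id mapping."""
    mapping = {}
    for line in handle:
        idx, name = line.rstrip("\n").split("\t")
        mapping[name] = int(idx)
    return mapping


def read_triples(handle, entity2id, relation2id):
    """Read 'head<TAB>relation<TAB>tail' lines into (h, r, t) id triples."""
    triples = []
    for line in handle:
        h, r, t = line.rstrip("\n").split("\t")
        triples.append((entity2id[h], relation2id[r], entity2id[t]))
    return triples


# Toy stand-ins for entities.dict, relations.dict, and train.txt.
entities = io.StringIO("0\tparis\n1\tfrance\n")
relations = io.StringIO("0\tcapital_of\n")
train = io.StringIO("paris\tcapital_of\tfrance\n")

entity2id = read_dict(entities)
relation2id = read_dict(relations)
triples = read_triples(train, entity2id, relation2id)
```

With real data, replace the `io.StringIO` objects with `open(path)` handles for the corresponding files.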

Train

For example, this command trains a RotatE model on the FB15k dataset using GPU 0.

CUDA_VISIBLE_DEVICES=0 python -u codes/run.py --do_train \
 --cuda \
 --do_valid \
 --do_test \
 --data_path data/FB15k \
 --model RotatE \
 -n 256 -b 1024 -d 1000 \
 -g 24.0 -a 1.0 -adv \
 -lr 0.0001 --max_steps 150000 \
 -save models/RotatE_FB15k_0 --test_batch_size 16 -de

See the argparse configuration in codes/run.py for the full list of arguments and their details.

Test

CUDA_VISIBLE_DEVICES=$GPU_DEVICE python -u $CODE_PATH/run.py --do_test --cuda -init $SAVE

Reproducing the best results

To reproduce the results in the ICLR 2019 paper RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space, you can run the bash commands in best_config.sh to get the best performance of RotatE, TransE, and ComplEx on five widely used datasets (FB15k, FB15k-237, wn18, wn18rr, Countries).

The run.sh script provides an easy way to search hyper-parameters:

bash run.sh train RotatE FB15k 0 0 1024 256 1000 24.0 1.0 0.0001 200000 16 -de
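A small grid of such commands can be generated as sketched below. The meaning of each positional argument is an assumption inferred by comparing the example with the training command above (mode, model, dataset, GPU id, save id, batch size, negative sample size, hidden dimension, gamma, alpha, learning rate, max steps, test batch size, then extra flags); check run.sh to confirm it. The helper name and the grid values are hypothetical.

```python
import itertools


def run_sh_command(mode, model, dataset, gpu, save_id, batch_size, neg_size,
                   hidden_dim, gamma, alpha, lr, max_steps, test_batch_size,
                   *extra_flags):
    """Build one run.sh command line (argument order is an assumption)."""
    parts = ["bash", "run.sh", mode, model, dataset, gpu, save_id, batch_size,
             neg_size, hidden_dim, gamma, alpha, lr, max_steps, test_batch_size]
    parts.extend(extra_flags)
    return " ".join(str(p) for p in parts)


# Reproduces the example command shown above.
example = run_sh_command("train", "RotatE", "FB15k", 0, 0, 1024, 256, 1000,
                         "24.0", "1.0", "0.0001", 200000, 16, "-de")

# Hypothetical grid over gamma and learning rate; each run gets its own save id.
gammas = ["12.0", "24.0"]
learning_rates = ["0.0001", "0.00005"]
commands = [
    run_sh_command("train", "RotatE", "FB15k", 0, save_id, 1024, 256, 1000,
                   gamma, "1.0", lr, 200000, 16, "-de")
    for save_id, (gamma, lr) in enumerate(itertools.product(gammas, learning_rates))
]
```

Float-valued settings are passed as strings so that they appear in the command exactly as written.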

Speed

With the default configuration, the KGE models usually take about half an hour to run 10000 steps on a single GeForce GTX 1080 Ti GPU. The number of steps (max_steps) needed to converge differs by data set:

Dataset     FB15k    FB15k-237   wn18    wn18rr   Countries S*
MAX_STEPS   150000   100000      80000   80000    40000
TIME        9 h      6 h         4 h     4 h      2 h

Results of the RotatE model

Dataset   FB15k         FB15k-237     wn18          wn18rr
MRR       .797 ± .001   .337 ± .001   .949 ± .000   .477 ± .001
MR        40            177           309           3340
HITS@1    .746          .241          .944          .428
HITS@3    .830          .375          .952          .492
HITS@10   .884          .533          .959          .571

Using the library

The Python library is organized around three objects:

  • TrainDataset (dataloader.py): prepares the data stream for training
  • TestDataSet (dataloader.py): prepares the data stream for evaluation
  • KGEModel (model.py): calculates triple scores and provides the train/test API

The run.py file contains the main function, which parses arguments, reads data, initializes the model, and provides the training loop.

To add your own model, define a scoring method in model.py, for example:

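def TransE(self, head, relation, tail, mode):
    # Both branches compute head + relation - tail; only the grouping differs,
    # depending on which side of the triple carries the batch of candidates.
    if mode == 'head-batch':
        score = head + (relation - tail)
    else:
        score = (head + relation) - tail

    # Score is the margin gamma minus the L1 distance along the embedding
    # dimension (dim=2), so a higher score means a more plausible triple.
    score = self.gamma.item() - torch.norm(score, p=1, dim=2)
    return score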

Citation

If you use this code, please cite the following paper:

@inproceedings{
 sun2018rotate,
 title={RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space},
 author={Zhiqing Sun and Zhi-Hong Deng and Jian-Yun Nie and Jian Tang},
 booktitle={International Conference on Learning Representations},
 year={2019},
 url={https://openreview.net/forum?id=HkgEQnRqYQ},
}
