  • Language: Python
  • License: MIT License
[WWW 2022] KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction

KnowPrompt

Code and datasets for the WWW2022 paper KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction.

  • NOTE: We provide a paper-list at PromptKG.

What's New

Apr,5 2023

Dec,7 2022

Sept,21 2022

March,30 2022

Jan,14 2022

Requirements

It is recommended to use a virtual environment to run KnowPrompt.

conda create -n knowprompt python=3.8

conda activate knowprompt

To install requirements:

pip install -r requirements.txt

Datasets

We provide all the datasets and prompts used in our experiments.

The expected structure of files is:

knowprompt
 |-- dataset
 |    |-- semeval
 |    |    |-- train.txt       
 |    |    |-- dev.txt
 |    |    |-- test.txt
 |    |    |-- temp.txt
 |    |    |-- rel2id.json
 |    |-- dialogue
 |    |    |-- train.json       
 |    |    |-- dev.json
 |    |    |-- test.json
 |    |    |-- rel2id.json
 |    |-- tacred
 |    |    |-- train.txt       
 |    |    |-- dev.txt
 |    |    |-- test.txt
 |    |    |-- temp.txt
 |    |    |-- rel2id.json
 |    |-- tacrev
 |    |    |-- train.txt       
 |    |    |-- dev.txt
 |    |    |-- test.txt
 |    |    |-- temp.txt
 |    |    |-- rel2id.json
 |    |-- retacred
 |    |    |-- train.txt       
 |    |    |-- dev.txt
 |    |    |-- test.txt
 |    |    |-- temp.txt
 |    |    |-- rel2id.json
 |-- scripts
 |    |-- semeval.sh
 |    |-- dialogue.sh
 |    |-- ...
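
Before running anything, it can help to confirm that the downloaded data matches the layout above. The following is a minimal sketch, not part of the repository: the helper name missing_files and the EXPECTED_FILES table are illustrative assumptions, with the file names copied from the tree.

```python
from pathlib import Path

# Files expected under each dataset folder, copied from the tree above.
TEXT_FILES = ["train.txt", "dev.txt", "test.txt", "temp.txt", "rel2id.json"]
EXPECTED_FILES = {
    "semeval": TEXT_FILES,
    "dialogue": ["train.json", "dev.json", "test.json", "rel2id.json"],
    "tacred": TEXT_FILES,
    "tacrev": TEXT_FILES,
    "retacred": TEXT_FILES,
}


def missing_files(dataset_root, expected=EXPECTED_FILES):
    """Return 'dataset/file' entries that are expected but not on disk."""
    root = Path(dataset_root)
    missing = []
    for name, files in expected.items():
        for fname in files:
            if not (root / name / fname).is_file():
                missing.append(f"{name}/{fname}")
    return missing


if __name__ == "__main__":
    # Hypothetical usage: run from the knowprompt folder.
    for entry in missing_files("./dataset"):
        print("missing:", entry)
```

An empty result means every file listed in the tree was found; the scripts folder is not checked.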
 

Run the experiments

Initialize the answer words

Use the command below to get the answer words used in training.

python get_label_word.py --model_name_or_path bert-large-uncased  --dataset_name semeval

The {answer_words}.pt file will be saved in the dataset folder. You need to specify model_name_or_path and dataset_name for get_label_word.py.
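
One plausible reading of "answer words" is the natural-language words contained in each relation label. The sketch below illustrates that idea only and is not taken from get_label_word.py: the function name label_to_words and the abbreviation table are assumptions, and the real script presumably also tokenizes the words with the model given by model_name_or_path before saving them.

```python
import re

# Assumed expansions for TACRED-style prefixes; not confirmed by the source.
ABBREVIATIONS = {"per": "person", "org": "organization"}


def label_to_words(label):
    """Split a relation label into lowercase words (illustrative only)."""
    if label in ("no_relation", "NA"):
        return ["no", "relation"]
    # Drop a trailing direction marker such as "(e1,e2)" in SemEval labels.
    base = label.split("(")[0].lower()
    # Split on the separators that commonly appear in relation names.
    words = [w for w in re.split(r"[:_\-/\s]+", base) if w]
    return [ABBREVIATIONS.get(w, w) for w in words]


if __name__ == "__main__":
    for name in ["per:city_of_birth", "Cause-Effect(e1,e2)", "no_relation"]:
        print(name, "->", label_to_words(name))
```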

Split dataset

Download the data first and put it in the dataset folder. Then run the command below to generate the few-shot dataset.

python generate_k_shot.py --data_dir ./dataset --k 8 --dataset semeval
cd dataset
cd semeval
cp rel2id.json val.txt test.txt ./k-shot/8-1

Modify k and dataset to choose the number of shots and the dataset. By default, seeds 1, 2, 3, 4, and 5 are used to split each k-shot set; you can change them in generate_k_shot.py.
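
The idea behind a seeded k-shot split can be sketched as follows. This is a simplified illustration, not the code in generate_k_shot.py: it assumes each example is a dict with a "relation" key, and the helper names sample_k_shot and split_name are hypothetical. The folder name format is inferred from the 8-1 path in the commands above (k = 8, seed = 1).

```python
import random
from collections import defaultdict


def sample_k_shot(examples, k, seed):
    """Pick up to k examples per relation, reproducibly for a given seed."""
    by_relation = defaultdict(list)
    for example in examples:
        by_relation[example["relation"]].append(example)  # assumed key name
    rng = random.Random(seed)  # local generator, so global state is untouched
    subset = []
    for relation in sorted(by_relation):  # fixed order keeps runs comparable
        group = by_relation[relation]
        subset.extend(rng.sample(group, min(k, len(group))))
    return subset


def split_name(k, seed):
    """Folder name for one split, e.g. '8-1' for k=8 and seed=1."""
    return f"{k}-{seed}"


if __name__ == "__main__":
    toy = [{"relation": "A", "id": i} for i in range(10)]
    toy += [{"relation": "B", "id": i} for i in range(3)]
    for seed in [1, 2, 3, 4, 5]:
        print(split_name(8, seed), len(sample_k_shot(toy, 8, seed)))
```

Relations with fewer than k examples contribute everything they have, so a split can be smaller than k times the number of relations.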

Let's run

Our scripts automatically run the experiments in the 8-shot, 16-shot, 32-shot, and standard supervised settings, covering training, evaluation, and testing. The code uses random seed 1 as an example; you can run multiple experiments with different seeds.

Example for SEMEVAL

Train the KnowPrompt model on SEMEVAL with the following command:

bash scripts/semeval.sh  # for roberta-large

Scripts for TACRED-Revisit, Re-TACRED, and Wiki80, the other datasets in our paper, are also provided; run them in the same way as the example above.

Example for DialogRE

Because the data format of DialogRE differs substantially from the other datasets, it uses a different processor class. Train the KnowPrompt model on DialogRE with the following command:

bash scripts/dialogue.sh  # for roberta-base

More empirical results

We report empirical results on more datasets in the EMNLP 2022 (Findings) paper "Towards Realistic Low-resource Relation Extraction: A Benchmark with Empirical Baseline Study" [code].

Acknowledgement

Part of our code is borrowed from the code of PTR: Prompt Tuning with Rules for Text Classification. Many thanks.

Citation

If you use the code, please cite the following paper:

@inproceedings{DBLP:conf/www/ChenZXDYTHSC22,
  author    = {Xiang Chen and
               Ningyu Zhang and
               Xin Xie and
               Shumin Deng and
               Yunzhi Yao and
               Chuanqi Tan and
               Fei Huang and
               Luo Si and
               Huajun Chen},
  editor    = {Fr{\'{e}}d{\'{e}}rique Laforest and
               Rapha{\"{e}}l Troncy and
               Elena Simperl and
               Deepak Agarwal and
               Aristides Gionis and
               Ivan Herman and
               Lionel M{\'{e}}dini},
  title     = {KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization
               for Relation Extraction},
  booktitle = {{WWW} '22: The {ACM} Web Conference 2022, Virtual Event, Lyon, France,
               April 25 - 29, 2022},
  pages     = {2778--2788},
  publisher = {{ACM}},
  year      = {2022},
  url       = {https://doi.org/10.1145/3485447.3511998},
  doi       = {10.1145/3485447.3511998},
  timestamp = {Tue, 26 Apr 2022 16:02:09 +0200},
  biburl    = {https://dblp.org/rec/conf/www/ChenZXDYTHSC22.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
