  • Stars: 1,717
  • Rank: 26,931 (Top 0.6 %)
  • Language: Jupyter Notebook
  • License: MIT License
  • Created over 1 year ago
  • Updated 18 days ago

Repository Details

[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.

An Easy-to-use Knowledge Editing Framework for Large Language Models.


🔔News

This repository is a subproject of KnowLM.

EasyEdit is now publicly open-sourced, with a demo video and long-term maintenance.


Editing Demo

There is a demonstration of editing. The GIF file is created by Terminalizer.

Knowledge Editing

Task Definition

Deployed models may still make unpredictable errors. For example, Large Language Models (LLMs) notoriously hallucinate, perpetuate bias, and factually decay, so we should be able to adjust specific behaviors of pre-trained models.

Knowledge editing aims to efficiently adjust the behavior of an initial base model $f_\theta$ on a particular edit descriptor $[x_e, y_e]$, for example (the president of the USA: Donald Trump -> Joe Biden):

  • $x_e$: "Who is the president of the US?"
  • $y_e$: "Joe Biden."

without influencing the model's behavior on unrelated samples. The ultimate goal is to create an edited model $f_{\theta_e}$.

Evaluation

The knowledge editing process generally impacts the predictions for a broad set of inputs that are closely associated with the edit example, called the editing scope.

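A successful edit should adjust the model’s behavior within the editing scope while leaving its behavior on unrelated inputs unchanged, as in the formula below, where $I(x_e, y_e)$ denotes the in-scope inputs and $O(x_e, y_e)$ the out-of-scope inputs.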

$$ f_{\theta_{e}}(x) = \begin{cases} y_e & \text{if } x \in I(x_e,y_e) \\ f_{\theta}(x) & \text{if } x \in O(x_e, y_e) \end{cases} $$
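
The piecewise definition above can be illustrated with a toy sketch. It assumes the model is a plain function from a prompt string to an answer string and that the in-scope set $I(x_e, y_e)$ is given as an explicit set of prompts. The names `make_edited_model`, `base_model`, and `scope` are hypothetical and not part of EasyEdit. Real editing methods change weights or attach extra modules; this only shows the target behavior.

```python
from typing import Callable, Set


def make_edited_model(base_model: Callable[[str], str],
                      in_scope: Set[str],
                      y_e: str) -> Callable[[str], str]:
    """Return a function that answers y_e inside the editing scope
    and defers to the base model everywhere else."""
    def edited_model(x: str) -> str:
        return y_e if x in in_scope else base_model(x)
    return edited_model


# Hypothetical base model: a lookup table standing in for f_theta.
base_answers = {
    "Who is the president of the US?": "Donald Trump",
    "What is the capital of France?": "Paris",
}


def base_model(x: str) -> str:
    return base_answers.get(x, "unknown")


# Hypothetical in-scope set: the edit prompt plus one rephrasing.
scope = {"Who is the president of the US?", "Who is the US president?"}
edited = make_edited_model(base_model, scope, "Joe Biden")

print(edited("Who is the US president?"))        # in scope: edited answer
print(edited("What is the capital of France?"))  # out of scope: base answer
```

In practice the in-scope set is not enumerable, which is why the evaluation dimensions listed next approximate it with sampled rephrasings and unrelated prompts.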

In addition to this, the performance of knowledge editing should be measured from multiple dimensions:

  • Reliability: the success rate of editing with a given editing description
  • Generalization: the success rate of editing within the editing scope
  • Locality: whether the model's output changes after editing for unrelated inputs
  • Portability: the success rate of editing for factual reasoning (one hop, synonym, one-to-one relation)
  • Efficiency: time and memory consumption required during the editing process
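
As a rough illustration of the first three dimensions, the sketch below scores toy models with exact string matching. This is an assumption made for brevity: EasyEdit's own metric computation may differ (for example, by scoring at the token level), and the helper names `success_rate` and `locality_rate` are hypothetical.

```python
from typing import Callable, Sequence

Model = Callable[[str], str]


def success_rate(model: Model, prompts: Sequence[str], targets: Sequence[str]) -> float:
    """Fraction of prompts for which the model outputs exactly the target."""
    if len(prompts) != len(targets):
        raise ValueError("prompts and targets must have the same length")
    if not prompts:
        return 0.0
    hits = sum(model(p) == t for p, t in zip(prompts, targets))
    return hits / len(prompts)


def locality_rate(pre_model: Model, post_model: Model, prompts: Sequence[str]) -> float:
    """Fraction of unrelated prompts whose output is unchanged by the edit."""
    if not prompts:
        return 0.0
    same = sum(pre_model(p) == post_model(p) for p in prompts)
    return same / len(prompts)


# Toy pre-edit and post-edit models backed by lookup tables.
pre_table = {
    "Who is the president of the US?": "Donald Trump",
    "Who leads the US?": "Donald Trump",
    "The US head of state is?": "Donald Trump",
    "What is the capital of France?": "Paris",
}
post_table = dict(pre_table)
post_table["Who is the president of the US?"] = "Joe Biden"
post_table["Who leads the US?"] = "Joe Biden"  # one rephrasing picked up the edit


def pre_model(x: str) -> str:
    return pre_table[x]


def post_model(x: str) -> str:
    return post_table[x]


reliability = success_rate(post_model, ["Who is the president of the US?"], ["Joe Biden"])
generalization = success_rate(
    post_model,
    ["Who leads the US?", "The US head of state is?"],
    ["Joe Biden", "Joe Biden"],
)
locality = locality_rate(pre_model, post_model, ["What is the capital of France?"])
print(reliability, generalization, locality)
```

Reliability uses the edit prompt itself, generalization uses rephrasings inside the editing scope, and locality compares pre-edit and post-edit outputs on unrelated prompts.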

🌟Overview

EasyEdit is a Python package for editing Large Language Models (LLMs) such as GPT-J, Llama, GPT-NEO, GPT2, and T5 (supporting models from 1B to 65B). Its objective is to alter the behavior of LLMs efficiently within a specific domain without negatively impacting performance on other inputs. It is designed to be easy to use and easy to extend.

  • EasyEdit contains a unified framework for Editor, Method and Evaluate, respectively representing the editing scenario, editing technique, and evaluation method.

  • Each Knowledge Editing scenario comprises three components:

    • Editor: such as BaseEditor (Factual Knowledge and Generation Editor) for LM, and MultiModalEditor (MultiModal Knowledge).
    • Method: the specific knowledge editing technique used (such as ROME, MEND, ...).
    • Evaluate: Metrics for evaluating knowledge editing performance.
      • Reliability, Generalization, Locality, Portability
  • The currently supported knowledge editing techniques are as follows:

    • FT-L: Fine-Tuning with $L_\infty$ constraint
    • SERAC: Mitchell et al. Memory-based
    • IKE: Ce Zheng et al. In-Context Editing
    • MEND: Mitchell et al. Hypernetwork
    • KN: Damai Dai et al. Locate then Edit
    • ROME: Kevin Meng et al. Locate and Edit
    • MEMIT: Kevin Meng et al. Locate and Edit

      Because of this toolkit's limited compatibility and constraints imposed by the transformer version, some knowledge editing methods are not supported. You can find the relevant editing methods at the following links:

    • T-Patcher | KE | CaliNet

Current Implementation

You can choose different editing methods according to your specific needs.

Method T5 GPT-2 GPT-J GPT-NEO LlaMA LlaMA-2 Baichuan ChatGLM2
FT-L
SERAC
IKE
MEND
KN
ROME
MEMIT

Dataset

dataset Google Drive BaiduNetDisk Description
ZsRE [Google Drive] [BaiduNetDisk] Question Answering dataset using question rephrasings
Counterfact [Google Drive] [BaiduNetDisk] Counterfact dataset using Entity replacement

We provide zsre and counterfact datasets to verify the effectiveness of knowledge editing. You can download them here. [Google Drive], [BaiduNetDisk].

  • For locality, in addition to testing unrelated instances, we also provide tests on distracting (reference: Detecting Edit Failures...), other attribution, and other downstream tasks (such as commonsense reasoning).
  • For portability, we test whether the model can apply edited instances for inference. We provide evaluations for one-hop reasoning, subject alias, and inverse relation (e.g., a one-to-one relationship between spouses should be bidirectionally edited).

Tutorial notebook

Method Description GPT-2 LlaMA
IKE In-Context Learning (ICL) Edit [Colab-gpt2] [Colab-llama]
ROME Locate-Then-Edit Neurons [Colab-gpt2] [Colab-llama]
MEMIT Locate-Then-Edit Neurons [Colab-gpt2] [Colab-llama]

Editing Performance

We present editing results of the four metrics on LlaMA-2-7B using EasyEdit. We adopt ZsRE as the test dataset.

❗️❗️Editing llama-2-7B requires 40G+ VRAM on GPU. (OOM solution)

Method Reliability Generalization Locality Portability
FT-L 56.94 52.02 96.32 0.07
SERAC 99.49 99.13 100.00 0.13
IKE 100.00 99.98 69.19 67.56
MEND 94.24 90.27 97.04 0.14
KN 28.95 28.43 65.43 0.07
ROME 92.45 87.04 99.63 10.46
MEMIT 92.94 85.97 99.49 6.03

Requirements

🔧Pip Installation

Note: Please use Python 3.9+ for EasyEdit. To get started, simply install conda and run:

git clone https://github.com/zjunlp/EasyEdit.git
conda create -n EasyEdit python=3.9.7
...
pip install -r requirements.txt

🐳Docker Installation

We have packaged the environment as a Docker image. You can download Docker from this link.

Pull the Docker image from Docker Hub or Aliyun:

docker pull zjunlp/easyedit
docker pull registry.cn-hangzhou.aliyuncs.com/zjunlp/easyedit:v1

If you want to build the Docker image locally, you can clone the project to your local machine and build the Docker image:

git clone https://github.com/zjunlp/EasyEdit.git
cd EasyEdit
docker build -t your-image-name .

Then run the Docker image as a container:

docker run -p 8080:80 your-image-name

📌Use EasyEdit

  • Edit large language models (LLMs) in around 5 seconds.

  • The following example shows how to perform editing with EasyEdit. More examples and tutorials can be found in examples.

BaseEditor

BaseEditor is the class for Language Modality Knowledge Editing. You can choose the appropriate editing method based on your specific needs.

  • Due to different transformer versions and different GPU models, the editing results may fluctuate slightly.

Introduction by a Simple Example

With the modularity and flexibility of EasyEdit, you can easily use it to edit a model.

Step1: Define a PLM as the object to be edited. Choose the PLM to be edited. EasyEdit supports a subset of the models (T5, GPTJ, GPT-NEO, LlaMA so far) retrievable on HuggingFace. The corresponding configuration file is located at hparams/YOUR_METHOD/YOUR_MODEL.YAML, such as hparams/MEND/gpt2-xl. Set the corresponding model_name to select the object for knowledge editing.

model_name: gpt2-xl
model_class: GPT2LMHeadModel
tokenizer_class: GPT2Tokenizer
tokenizer_name: gpt2-xl

Step2: Choose the appropriate Knowledge Editing Method. The selection of editing methods is a crucial step, as different methods have their own strengths and weaknesses. Users need to consider the trade-off between editing success rate, generalization, and preserving performance on unrelated inputs. For specific performance details of each method, please refer to the paper: Editing Large Language Models: Problems, Methods, and Opportunities.

## In this case, we use the MEND method, so you should import `MENDHyperParams`
from easyeditor import MENDHyperParams
## Load the config from hparams/MEND/gpt2-xl.yaml
hparams = MENDHyperParams.from_hparams('./hparams/MEND/gpt2-xl')

Step3: Provide the edit descriptor and edit target

## edit descriptor: prompt that you want to edit
prompts = [
    'What university did Watts Humphrey attend?',
    'Which family does Ramalinaceae belong to',
    'What role does Denny Herzig play in football?'
]
## You can set `ground_truth` to None (or to the model's original output)
ground_truth = ['Illinois Institute of Technology', 'Lecanorales', 'defender']
## edit target: expected output
target_new = ['University of Michigan', 'Lamiinae', 'winger']
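
The three lists above are aligned by position: the i-th prompt goes with the i-th ground truth and the i-th new target. A minimal sketch of that pairing follows, assuming a hypothetical helper `build_edit_records` that is not part of EasyEdit and is only meant to catch misaligned lists before editing.

```python
from typing import List, Optional


def build_edit_records(prompts: List[str],
                       target_new: List[str],
                       ground_truth: Optional[List[str]] = None) -> List[dict]:
    """Pair each prompt with its new target (and original answer, if given)."""
    if len(prompts) != len(target_new):
        raise ValueError("prompts and target_new must have the same length")
    if ground_truth is None:
        # The README notes that ground_truth may be set to None.
        ground_truth = [None] * len(prompts)
    elif len(ground_truth) != len(prompts):
        raise ValueError("ground_truth must have the same length as prompts")
    return [
        {"prompt": p, "ground_truth": g, "target_new": t}
        for p, g, t in zip(prompts, ground_truth, target_new)
    ]


records = build_edit_records(
    prompts=['What university did Watts Humphrey attend?',
             'What role does Denny Herzig play in football?'],
    target_new=['University of Michigan', 'winger'],
    ground_truth=['Illinois Institute of Technology', 'defender'],
)
for record in records:
    print(record)
```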

Step4: Combine them into a BaseEditor. EasyEdit provides a simple and unified way to initialize an Editor, similar to HuggingFace: from_hparams.

## Construct Language Model Editor
from easyeditor import BaseEditor
editor = BaseEditor.from_hparams(hparams)

Step5: Provide the data for evaluation. Note that the data for portability and locality are both optional (set them to None to evaluate only the basic editing success rate). Both use a dict format: for each measurement dimension, you need to provide the prompts and their corresponding ground truth. Here is an example of the data:

locality_inputs = {
    'neighborhood':{
        'prompt': ['Joseph Fischhof, the', 'Larry Bird is a professional', 'In Forssa, they understand'],
        'ground_truth': ['piano', 'basketball', 'Finnish']
    },
    'distracting': {
        'prompt': ['Ray Charles, the violin Hauschka plays the instrument', 'Grant Hill is a professional soccer Magic Johnson is a professional', 'The law in Ikaalinen declares the language Swedish In Loviisa, the language spoken is'],
        'ground_truth': ['piano', 'basketball', 'Finnish']
    }
}

In the above example, we evaluate the performance of the editing method on the "neighborhood" and "distracting" locality dimensions.
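
Because each dimension needs a `prompt` list and a `ground_truth` list, and the README notes that their length needs to equal the number of edit prompts, a small structural check before calling the editor can catch mistakes early. The sketch below is a hypothetical helper (`validate_eval_inputs` is not an EasyEdit function) and checks only structure, not content.

```python
def validate_eval_inputs(eval_inputs: dict, num_edits: int) -> None:
    """Raise ValueError if any dimension lacks 'prompt' or 'ground_truth',
    or if either list does not have one item per edit."""
    for key, entry in eval_inputs.items():
        for field in ("prompt", "ground_truth"):
            if field not in entry:
                raise ValueError(f"{key!r} is missing the {field!r} field")
            if len(entry[field]) != num_edits:
                raise ValueError(
                    f"{key!r}[{field!r}] has {len(entry[field])} items, "
                    f"expected {num_edits}"
                )


locality_inputs = {
    'neighborhood': {
        'prompt': ['Joseph Fischhof, the', 'Larry Bird is a professional',
                   'In Forssa, they understand'],
        'ground_truth': ['piano', 'basketball', 'Finnish'],
    },
}

# Three edits, matching the three prompts used in Step3.
validate_eval_inputs(locality_inputs, num_edits=3)
print("locality_inputs looks well-formed")
```

The same check applies unchanged to a portability dict, since both use the same format.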

Step6: Edit and Evaluation. Done! We can now edit and evaluate the model. The edit function returns a series of metrics related to the editing process as well as the modified model weights.

metrics, edited_model, _ = editor.edit(
    prompts=prompts,
    ground_truth=ground_truth,
    target_new=target_new,
    locality_inputs=locality_inputs,
    keep_original_weight=True
)
## metrics: e.g. edit success, rephrase success, locality
## edited_model: post-edit model

Evaluation

The returned metrics are in dict format and include model prediction evaluations before and after editing. For each edit, they include the following metrics:

  • rewrite_acc $\rightarrow$ Reliability
  • rephrase_acc $\rightarrow$ Generalization
  • locality $\rightarrow$ Locality
  • portability $\rightarrow$ Portability

{
    "post": {
        "rewrite_acc": ,
        "rephrase_acc": ,
        "locality": {
            "YOUR_LOCALITY_KEY": ,
            //...
        },
        "portability": {
            "YOUR_PORTABILITY_KEY": ,
            //...
        },
    },
    "pre": {
        "rewrite_acc": ,
        "rephrase_acc": ,
        "portability": {
            "YOUR_PORTABILITY_KEY": ,
            //...
        },
    }
}

  • To evaluate Reliability, you only need to provide the corresponding editing prompts and editing target_new.
  • To evaluate Generalization, rephrase_prompts are required.
  • To evaluate Locality and Portability, you need to define the name of the corresponding metric, as well as prompts and ground_truth.
    • Note: the length needs to be equal to the number of edit prompts

Trainer

  • meta-learning based: MEND
  • memory-based routing: SERAC

The above editing methods require pre-training the corresponding meta-networks or classifiers. Therefore, EasyEdit provides a unified framework for pretraining the relevant network structures. Take training MEND as an example:

  • Step 1 and Step 2 are the same as the example above, which involves selecting the appropriate editing model and editing method.

Step3: Provide the edit training set. The currently supported and available datasets are zsre and counterfact (Google Drive). Please place them in the "data" directory and initialize the dataset_class (ZsreDataset for zsre and CounterFactDataset for counterfact) to load the corresponding training set.

## `training_hparams` is the training config selected in Steps 1 and 2
from easyeditor import ZsreDataset
train_ds = ZsreDataset('./data/zsre_mend_train.json', config=training_hparams)
eval_ds = ZsreDataset('./data/zsre_mend_eval.json', config=training_hparams)

Step4: Combine them into a Trainer

from easyeditor import EditTrainer
trainer = EditTrainer(
    config=training_hparams,
    train_set=train_ds,
    val_set=eval_ds
)

Step5: Run and Edit. Done! We can now run training and evaluation.

trainer.run()

  • Run: The CHECKPOINT will be saved to the path RESULTS_DIR (in global.yml).
  • Edit: Set the archive field in the hparams file to CHECKPOINT. EasyEdit will automatically load the corresponding pre-trained weights during the editing process (Go to edit).

TO DO

In the next version, we plan to:

  • release a multimodal Editor for LLMs.
  • support more editing methods for BaiChuan, FALCON, etc.
  • support knowledge editing for other tasks (besides factual editing), such as textual knowledge editing, personality editing, etc.

Meanwhile, we will offer long-term maintenance to fix bugs, solve issues, and meet new requests. If you have any problems, please open an issue.

Citation

Please cite our paper if you use EasyEdit in your work.

@article{DBLP:journals/corr/abs-2308-07269,
  author       = {Peng Wang and
                  Ningyu Zhang and
                  Xin Xie and
                  Yunzhi Yao and
                  Bozhong Tian and
                  Mengru Wang and
                  Zekun Xi and
                  Siyuan Cheng and
                  Kangwei Liu and
                  Guozhou Zheng and
                  Huajun Chen},
  title        = {EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language
                  Models},
  journal      = {CoRR},
  volume       = {abs/2308.07269},
  year         = {2023},
  url          = {https://doi.org/10.48550/arXiv.2308.07269},
  doi          = {10.48550/arXiv.2308.07269},
  eprinttype    = {arXiv},
  eprint       = {2308.07269},
  timestamp    = {Wed, 23 Aug 2023 14:43:32 +0200},
  biburl       = {https://dblp.org/rec/journals/corr/abs-2308-07269.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}

@article{DBLP:journals/corr/abs-2305-13172,
  author       = {Yunzhi Yao and
                  Peng Wang and
                  Bozhong Tian and
                  Siyuan Cheng and
                  Zhoubo Li and
                  Shumin Deng and
                  Huajun Chen and
                  Ningyu Zhang},
  title        = {Editing Large Language Models: Problems, Methods, and Opportunities},
  journal      = {CoRR},
  volume       = {abs/2305.13172},
  year         = {2023},
  url          = {https://doi.org/10.48550/arXiv.2305.13172},
  doi          = {10.48550/arXiv.2305.13172},
  eprinttype    = {arXiv},
  eprint       = {2305.13172},
  timestamp    = {Tue, 30 May 2023 17:04:46 +0200},
  biburl       = {https://dblp.org/rec/journals/corr/abs-2305-13172.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}

🎉Contributors

We thank all the contributors to this project. More contributors are welcome!

Other Related Projects

🙌 We would like to express our heartfelt gratitude for the contribution of ROME to our project, as we have utilized portions of their source code in our project.
