• Stars: 204
  • Rank: 192,063 (Top 4%)
  • Language: Python
  • License: Apache License 2.0
  • Created: about 1 year ago
  • Updated: 11 months ago

Repository Details

SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding

Tianyu Yu*, Chengyue Jiang*, Chao Lou*, Shen Huang*, Xiaobin Wang, Wei Liu, Jiong Cai, Yangning Li, Yinghui Li, Kewei Tu, Hai-Tao Zheng, Ningyu Zhang, Pengjun Xie, Fei Huang, Yong Jiang†
DAMO Academy, Alibaba Group
*Equal Contribution; † Corresponding Author

Spotlights

  • A bilingual model (English and Chinese) specially enhanced for open-domain natural language understanding (NLU).
  • Trained on diverse synthesized data and high-quality NLU datasets.
  • Handles any NLU task that can be decomposed into a combination of the two atomic tasks, classification and extraction (see the prompt sketch below).
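As a concrete illustration, the two atomic prompt formats look like this (a minimal sketch reusing the inference template shown later in this README; the sentences and label sets are illustrative, not from the training data):

# Sentiment analysis posed as the atomic classification task ('分类' = classify)
GEN_TOK = '[GEN]'
cls_prompt = '输入: {}\n分类: {}\n输出: {}'.format(
    'The movie was a delight from start to finish.',
    'positive,negative,neutral', GEN_TOK)

# Named entity recognition posed as the atomic extraction task ('抽取' = extract)
ext_prompt = '输入: {}\n抽取: {}\n输出: {}'.format(
    'Alibaba DAMO Academy is headquartered in Hangzhou.',
    'organization,location', GEN_TOK)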

📰 Update News

SeqGPT is under continuous development. Online demos are already available for everyone, and we will release new model versions with upgraded capabilities in the future. Stay tuned!

Performance

We perform a human evaluation of SeqGPT-7B1 and ChatGPT on the held-out datasets. Ten annotators are asked to decide, for each sample, which model gives the better answer or whether the two models are tied. SeqGPT-7B1 outperforms ChatGPT on 7 of 10 NLU tasks but lags behind on sentiment analysis (SA), slot filling (SF), and natural language inference (NLI).

Usage

Install

conda create -n seqgpt python=3.8.16
conda activate seqgpt
pip install -r requirements.txt
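
To verify the environment (an optional sanity check, not part of the official instructions):

python -c "import torch, transformers; print(torch.__version__, transformers.__version__)"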

Inference

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name_or_path = 'DAMO-NLP/SeqGPT-560M'
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model = AutoModelForCausalLM.from_pretrained(model_name_or_path)
# Decoder-only model: pad and truncate on the left so the end of the
# prompt (including the [GEN] trigger token) is never cut off.
tokenizer.padding_side = 'left'
tokenizer.truncation_side = 'left'

if torch.cuda.is_available():
    model = model.half().cuda()
model.eval()
GEN_TOK = '[GEN]'  # special token that marks where generation starts

while True:
    sent = input('输入/Input: ').strip()
    task = input('分类/classify press 1, 抽取/extract press 2: ').strip()
    # Normalize full-width Chinese commas in the label set to ASCII commas.
    labels = input('标签集/Label-Set (e.g. labelA,labelB,labelC): ').strip().replace('，', ',')
    task = '分类' if task == '1' else '抽取'  # '分类' = classify, '抽取' = extract

    # The model was trained on this exact template ('输入' = input,
    # '输出' = output); changing the instruction can harm performance.
    p = '输入: {}\n{}: {}\n输出: {}'.format(sent, task, labels, GEN_TOK)
    inputs = tokenizer(p, return_tensors='pt', padding=True, truncation=True, max_length=1024)
    inputs = inputs.to(model.device)
    outputs = model.generate(**inputs, num_beams=4, do_sample=False, max_new_tokens=256)
    # Keep only the newly generated tokens, dropping the echoed prompt.
    outputs = outputs[0][len(inputs['input_ids'][0]):]
    response = tokenizer.decode(outputs, skip_special_tokens=True)
    print('BOT: ==========\n{}'.format(response))
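
For scripted, non-interactive use, the same logic can be wrapped in a small helper (a sketch assuming model, tokenizer, and GEN_TOK are set up as above; seqgpt_predict is a name introduced here for illustration):

def seqgpt_predict(sent, task, labels, max_new_tokens=256):
    """Run a single classification ('分类') or extraction ('抽取') query."""
    p = '输入: {}\n{}: {}\n输出: {}'.format(sent, task, labels, GEN_TOK)
    inputs = tokenizer(p, return_tensors='pt', padding=True,
                       truncation=True, max_length=1024).to(model.device)
    outputs = model.generate(**inputs, num_beams=4, do_sample=False,
                             max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    answer = outputs[0][len(inputs['input_ids'][0]):]
    return tokenizer.decode(answer, skip_special_tokens=True)

# Example: sentiment classification with an illustrative label set
print(seqgpt_predict('The movie was a delight from start to finish.',
                     '分类', 'positive,negative,neutral'))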

Citation

If you find this work useful, consider giving this repository a star and citing our paper as follows:

@misc{yu2023seqgpt,
      title={SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding}, 
      author={Tianyu Yu and Chengyue Jiang and Chao Lou and Shen Huang and Xiaobin Wang and Wei Liu and Jiong Cai and Yangning Li and Yinghui Li and Kewei Tu and Hai-Tao Zheng and Ningyu Zhang and Pengjun Xie and Fei Huang and Yong Jiang},
      year={2023},
      eprint={2308.10529},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

More Repositories

1. ACE - [ACL-IJCNLP 2021] Automated Concatenation of Embeddings for Structured Prediction (Python, 299 stars)
2. EcomGPT - An Instruction-tuned Large Language Model for E-commerce (Python, 221 stars)
3. HiAGM - Hierarchy-Aware Global Model for Hierarchical Text Classification (Python, 206 stars)
4. KB-NER - Winner system (DAMO-NLP) of the SemEval 2022 MultiCoNER shared task on 10 of 13 tracks (Python, 177 stars)
5. Multi-CPR - [SIGIR 2022] Multi-CPR: A Multi Domain Chinese Dataset for Passage Retrieval (Python, 164 stars)
6. CLNER - [ACL-IJCNLP 2021] Improving Named Entity Recognition by External Context Retrieving and Cooperative Learning (Python, 91 stars)
7. MultilangStructureKD - [ACL 2020] Structure-Level Knowledge Distillation For Multilingual Sequence Labeling (Python, 71 stars)
8. MuVER - [EMNLP 2021] MuVER: Improving First-Stage Entity Retrieval with Multi-View Entity Representations (Python, 30 stars)
9. ProtoRE - Code for "Prototypical Representation Learning for Relation Extraction" (Python, 30 stars)
10. RankingGPT - Code for the paper "RankingGPT: Empowering Large Language Models in Text Ranking with Progressive Enhancement" (Python, 28 stars)
11. DAAT-CWS - Coupling Distant Annotation and Adversarial Training for Cross-Domain Chinese Word Segmentation (Python, 22 stars)
12. AISHELL-NER - [ICASSP 2022] AISHELL-NER: Named Entity Recognition from Chinese Speech (21 stars)
13. HLATR - Hybrid List Aware Transformer Reranking (18 stars)
14. AIN - Code for our EMNLP 2020 paper "AIN: Fast and Accurate Sequence Labeling with Approximate Inference Network" (Python, 18 stars)
15. MANNER - [ACL 2023] MANNER: A Variational Memory-Augmented Model for Cross Domain Few-Shot Named Entity Recognition (Python, 17 stars)
16. EBM-Net - Code for the EMNLP 2020 paper "Predicting Clinical Trial Results by Implicit Evidence Integration" (Python, 14 stars)
17. CDQA - CDQA: Chinese Dynamic Question Answering Benchmark (Python, 13 stars)
18. StructuralKD - [ACL-IJCNLP 2021] Structural Knowledge Distillation: Tractably Distilling Information for Structured Predictor (Python, 9 stars)
19. MarCo-Dialog - (Python, 3 stars)
20. IBKD - Official repository for the IBKD knowledge distillation method, as described in the paper (Python, 3 stars)
21. Vec-RA-ODQA - Source code of the paper "Improving Retrieval Augmented Open-Domain Question-Answering with Vectorized Contexts" (Python, 2 stars)
22. Key-Point-Analysis - (Python, 1 star)