• Stars
    star
    523
  • Rank 84,684 (Top 2 %)
  • Language
    Python
  • License
    MIT License
  • Created over 6 years ago
  • Updated about 5 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Knowledge triples extraction and knowledge base construction based on dependency syntax for open domain text.

open-entity-relation-extraction

Knowledge triples extraction (entities and relations extraction) and knowledge base construction based on dependency syntax for open domain text.

基于依存句法分析,实现面向开放域文本的知识三元组抽取(实体和关系抽取)及知识库构建。

Welcome to watch, star or fork.

Example

"中国国家主席习近平访问韩国,并在首尔大学发表演讲"

We can extract knowledge triples from the sentence as follows:

  • (中国, 国家主席, 习近平)
  • (习近平, 访问, 韩国)
  • (习近平, 发表演讲, 首尔大学)

Project Structure

knowledge_extraction/
|-- code/  # code directory
|   |-- bean/
|   |-- core/
|   |-- demo/  # procedure entry
|   |-- tool/
|-- data/ # data directory
|   |-- input_text.txt  # input text file
|   |-- knowledge_triple.json  # output knowledge triples file
|-- model/  # ltp models, can be downloaded from http://ltp.ai/download.html, select ltp_data_v3.4.0.zip
|-- resource  # dictionaries dirctory
|-- requirements.txt  # dependent python libraries
|-- README.md  # project description

Requirements

This repo was tested on Python 3.5+. The requirements are:

  • jieba>=0.39
  • pyltp>=0.2.1

Quickstart

cd ./code/demo/
python extract_demo.py

Seven DSNF paradigms

DSNF

References

If you use the code, please kindly cite the following paper:

Jia S, Li M, Xiang Y. Chinese Open Relation Extraction and Knowledge Base Establishment[J]. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), 2018, 17(3): 15.