  • Stars: 2
  • Language: Java
  • Created over 5 years ago
  • Updated about 4 years ago

Repository Details

Uses entropy to calculate the relevance of a query to a document. The program is mainly based on the paper "Content-based relevance estimation on the web using inter-document similarities" (CIKM 2012).

More Repositories

1

learning_to_rank

Uses LightGBM for learning to rank, including data processing, model training, model decision visualization, model interpretability, and prediction.
Python
221
star
2

albert_lstm_crf_ner

albert + lstm + crf named entity recognition, implemented in PyTorch. The main entities recognized are person names, place names, organization names, and times.
Python
120
star
3

movie_knowledge_graph_app

Movie knowledge graph, mainly covering entity recognition, entity query, relation query, graph display, and intelligent question answering.
JavaScript
87
star
4

education_knowledge_graph_app

K-12 education subject knowledge graph: graph display, knowledge point tracking, intelligent question answering, and knowledge point prediction for questions.
JavaScript
48
star
5

intent_detection_and_slot_filling

Joint model for intent detection and slot filling.
Jupyter Notebook
30
star
6

spark_data_mining

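Spark tutorial for big data mining. Includes app traffic operations analysis, ALS recommendation, SMOTE sampling, RFM customer value segmentation, AHP (analytic hierarchy process) customer value scoring, business district mining from mobile phone location data, Markov-based intelligent email prediction, time series forecasting, association rules, movie friend recommendation, and more.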
Java
29
star
7

movie_kg

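Knowledge-graph-based intelligent question answering for movies. neo4j is used to build the movie graph, spark ml classifies the question intent, and questions are converted into cypher queries for matching.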
Java
28
star
8

recommendation_methods

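Personalized recommendation models, mainly including als, als_wr, biaslfm, lfm, nmf, svdpp, content-based, content-based regression, user-cf, item-cf, slopeone, association rules, and hybrids of content-based and cf models.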
Python
24
star
9

java-springboot-paddleocr

This project uses Java to load the exe file compiled from the C++ version of paddle-ocr, and uses springboot for web deployment and access.
Java
24
star
10

intelligent_medical

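Intelligent medical application, including disease search, related recommendations, medical question answering about diseases, and intelligent disease diagnosis.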
Java
23
star
11

gnn4lp

Graph neural networks (GNN) for link prediction.
Python
21
star
12

python_search

Query-document matching search using tfidf, lsa, and doc2vec from sklearn and gensim.
Python
21
star
13

jcorrector

jcorrector: a Chinese text error correction and spelling check tool.
Java
20
star
14

albert_re

albert-fc for RE (Relation Extraction): Chinese relation extraction.
Python
15
star
15

java-springboot-paddleocr-v2

This project uses JNI to load the C++ compiled dll libraries of paddle-ocr, and uses springboot for web deployment and access.
Java
15
star
16

punctuation_prediction

Chinese sentence punctuation prediction.
Python
14
star
17

knowledge-automatic-tagging

Question knowledge point prediction and tagging.
Jupyter Notebook
13
star
18

text_grapher

Uses Java to analyze articles and display them as a graph (mainly extracting keywords, entities, dependency parses, etc.).
Java
11
star
19

gcn_for_prediction_of_protein_interactions

Graph convolutional networks (GCN) for prediction of protein interactions.
Python
11
star
20

text_generation

Title and keywords are used to generate text.
Python
11
star
21

model2onnx

model2onnx: converts roberta and macbert models to onnx format and runs inference.
Python
8
star
22

intent_classification

Intent classification with deep networks.
Jupyter Notebook
8
star
23

chatbot_chinese

Chinese chatbot for neural machine translation in PyTorch, including basic seq2seq, seq2seq with attention, pointer generator, seq2seq with cnn, and so on.
PLSQL
8
star
24

t5-onnx-corrector

t5-model-onnx: Chinese spelling correction.
Python
7
star
25

onnx-java

onnx-java: loads onnx models in Java and runs inference.
Java
7
star
26

macbert-java-onnx

MacBERT for Chinese Spelling Correction.
Java
7
star
27

NewsSummary

An improved news summarization program.
Java
7
star
28

CNN4IE

Chinese Information Extraction Toolkit. Uses various CNN variants for entity extraction.
Python
6
star
29

chinese_sentence_paraphrase

sentence paraphrase
Python
6
star
30

albert_link_prediction

albert-fc for LP (Link Prediction): Chinese entity link prediction.
Python
6
star
31

AutoText

Intelligent text automatic processing tool. The main functions of AutoText include text error correction, image OCR, layout detection, and table structure recognition.
Java
6
star
32

sentence_rewriting

chinese sentence rewriting
Python
5
star
33

knowledge_point_graph

Knowledge graph data processing with spark, neo4j, and Java.
Java
5
star
34

layout_analysis

Chinese layout detection: yolov8 is used to detect the layout of Chinese document images.
Python
4
star
35

albert_ner

albert-crf for NER (Named Entity Recognition): Chinese named entity recognition.
Python
4
star
36

text-de-duplication

Text de-duplication.
4
star
37

pdf_to_docx

OCR and pdf to docx conversion.
Python
4
star
38

albert_srl

albert-crf for SRL (Semantic Role Labeling): Chinese semantic role labeling.
Python
4
star
39

layout_analysis4j

Chinese layout detection in Java: java-yolov8 is used to detect the layout of Chinese document images.
Java
4
star
40

gec_check_template

Grammatical correction: templates for Chinese grammatical error correction.
Java
4
star
41

chatbot

Chatbot built on a PyTorch feed-forward network for classification and prediction.
Jupyter Notebook
3
star
42

j4nlp

Java for NLP (natural language processing).
Java
3
star
43

triple_event_extract

EventExtraction & TriplesExtraction: compound event extraction and dependency-relation triple extraction.
Java
2
star
44

bert_ndcg_lp

bert-ndcg for LP (Link Prediction).
Python
2
star
45

easyKG

Deep learning techniques for knowledge graphs.
Python
2
star
46

llm_corpus_quality

Chinese corpus cleaning and quality evaluation for large model pre-training.
Java
2
star
47

RecomSys

A simple recommendation system
Java
2
star
48

table_ocr_java

Table detection in images and OCR to CSV with Java.
Java
2
star
49

micrograd4j

A micro scalar-valued Autograd engine developed with java, and a neural net library on top of it.
Java
2
star
50

similarity_words

Calculates the relevance between words and displays it as a graph.
Python
2
star
51

vehicle_license_plate_recognition

Vehicle license plate recognition.
Python
1
star
52

pediatrics_llm_qa

Small model of pediatric consultation
Python
1
star
53

semantic_matching

Semantic matching.
Jupyter Notebook
1
star
54

doc_ai

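Converts OCR and other models from paddle to onnx format, and experiments with loading these onnx models for inference using djl, a Java deep learning framework.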
Java
1
star
55

spark-smote

Uses spark to implement smote sampling of training samples.
1
star
56

llm_security

Uses classification and sensitive-word detection to run safety checks on the input and output of generative large models, identifying risky content as early as possible.
Java
1
star