  • Stars: 161
  • Rank: 233,470 (Top 5%)
  • Language: Python
  • Created: almost 3 years ago
  • Updated: over 1 year ago

Repository Details

CoSENT, STS, SentenceBERT

CoSENT_Pytorch

A sentence-embedding approach that is more effective than Sentence-BERT

Experimental Results

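Here are the experimental results. The pretrained model is MengZi (other models, such as google-bert or roberta, can be swapped in as well). The learning rate is 2e-5 and batch_size=64, which is equivalent to batch_size=32 in Su Jianlin's (苏神) code. Training used only the training set, and evaluation was done on the test set. Each model was trained for 5 epochs and evaluated with the Spearman correlation coefficient.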
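As an illustration of the evaluation metric, the following is a minimal, dependency-free sketch of scoring predicted cosine similarities against gold labels with the Spearman coefficient. It is not the repository's evaluation code: the function names and toy vectors are hypothetical, and a real run would more likely use a library routine such as `scipy.stats.spearmanr`. The tables in this README appear to report the coefficient multiplied by 100.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: four sentence pairs (as 2-d vectors) with binary labels.
pairs = [
    ([1.0, 0.0], [1.0, 0.1]),
    ([1.0, 0.0], [0.0, 1.0]),
    ([1.0, 1.0], [1.0, 0.5]),
    ([1.0, 0.0], [-1.0, 0.2]),
]
labels = [1, 0, 1, 0]

predictions = [cosine(u, v) for u, v in pairs]
score = spearman(predictions, labels)
print(f"Spearman: {score:.4f}")
```

Because Spearman depends only on the ordering of the predictions, it suits CoSENT-style models, whose similarity scores are trained to be ranked correctly rather than to match label values.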

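To use a different dataset, only the following two arguments in config.py need to be changed (their help strings read "training dataset" and "test dataset"):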
parser.add_argument('--train_data', default='./data/PAWSX/PAWSX.train.data', type=str, help='训练数据集')
parser.add_argument('--test_data', default='./data/PAWSX/PAWSX.test.data', type=str, help='测试数据集')
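Since the two settings are `argparse` arguments with defaults, they could presumably also be overridden on the command line instead of by editing the file. The sketch below assumes the training script parses these arguments from `sys.argv`, which the README does not confirm. The `build_parser` helper and the ATEC paths are hypothetical; the paths simply mirror the PAWSX layout.

```python
import argparse


def build_parser():
    """Hypothetical stand-in for the parser defined in config.py."""
    parser = argparse.ArgumentParser(description="dataset paths (illustrative)")
    parser.add_argument('--train_data', default='./data/PAWSX/PAWSX.train.data',
                        type=str, help='training dataset')
    parser.add_argument('--test_data', default='./data/PAWSX/PAWSX.test.data',
                        type=str, help='test dataset')
    return parser


# With no flags, the PAWSX defaults apply.
defaults = build_parser().parse_args([])

# Assumed ATEC paths, passed as if they were typed on the command line.
override = build_parser().parse_args([
    '--train_data', './data/ATEC/ATEC.train.data',
    '--test_data', './data/ATEC/ATEC.test.data',
])

print(defaults.train_data)
print(override.train_data)
```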

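Additional note: in these experiments the sentence embedding is the result of pooling over the embedding layer and the last layer. Other strategies are worth trying too, such as CLS or pooling over the last layer only. In some more recent experiments, CLS turned out to work somewhat better.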
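The three pooling options mentioned here can be sketched as follows. This is only an illustration, not the repository's implementation: it reads "embedding and last layer pooling" as a first-last average (an assumption), uses plain Python lists in place of PyTorch tensors, and all function names and toy values are hypothetical.

```python
def masked_mean(token_vectors, mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    kept = [vec for vec, m in zip(token_vectors, mask) if m == 1]
    dim = len(token_vectors[0])
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]


def cls_pooling(last_layer):
    """Use the vector of the first token ([CLS]) from the last layer."""
    return list(last_layer[0])


def last_layer_pooling(last_layer, mask):
    """Mean-pool the last layer over non-padding tokens."""
    return masked_mean(last_layer, mask)


def first_last_pooling(embedding_layer, last_layer, mask):
    """Average the mean-pooled embedding layer and mean-pooled last layer."""
    first = masked_mean(embedding_layer, mask)
    last = masked_mean(last_layer, mask)
    return [(a + b) / 2 for a, b in zip(first, last)]


# Toy hidden states: 3 tokens (the last one is padding), hidden size 2.
embedding_layer = [[1.0, 0.0], [3.0, 2.0], [9.0, 9.0]]
last_layer = [[0.0, 4.0], [2.0, 0.0], [9.0, 9.0]]
mask = [1, 1, 0]

print(cls_pooling(last_layer))
print(last_layer_pooling(last_layer, mask))
print(first_last_pooling(embedding_layer, last_layer, mask))
```

In a PyTorch model the same choices would typically be made on the encoder's hidden-state tensors, with the attention mask playing the role of `mask`.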

My experimental results

| Model | ATEC | BQ | LCQMC | PAWSX | STS-B | Avg |
| --- | --- | --- | --- | --- | --- | --- |
| MengZi+CoSENT | 50.5270 | 72.2789 | 78.6981 | 60.1437 | 80.1544 | 68.3604 |
| Sentence-MengZi | 40.7809 | 70.6998 | 77.2590 | 46.31491 | 49.9348 | 56.9978 |
| Roberta+CoSENT | 50.5969 | 72.5191 | 79.3777 | 60.5475 | 80.4344 | 68.6951 |
| Sentence-Roberta | 48.5157 | 67.8545 | 79.6023 | 60.1675 | 71.0148 | 65.4309 |

Su Jianlin's (苏神) results, trained on the train split and tested on the test split:

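| Model | ATEC | BQ | LCQMC | PAWSX | STS-B | Avg |
| --- | --- | --- | --- | --- | --- | --- |
| BERT+CoSENT | 49.74 | 72.38 | 78.69 | 60.00 | 80.14 | 68.19 |
| Sentence-BERT | 46.36 | 70.36 | 78.72 | 46.86 | 66.41 | 61.74 |
| RoBERTa+CoSENT | 50.81 | 71.45 | 79.31 | 61.56 | 81.13 | 68.85 |
| Sentence-RoBERTa | 48.29 | 69.99 | 79.22 | 44.10 | 72.42 | 62.80 |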

Usage

  1. Run the CoSENT model:
sh start.sh
  2. Run the SentenceBert model:
First, run python sentence_bert/data_helper.py to generate the corresponding data.
Then run CUDA_VISIBLE_DEVICES=0 python sentence_bert/run_sentence_bert_transformers_reg_loss.py

For more sentence representation learning models, see: link

More Repositories

  1. NLP_pytorch_project (Python, 556 stars): Embedding, NMT, Text_Classification, Text_Generation, NER, etc.
  2. Semantic-Textual-Similarity-Pytorch (Python, 153 stars): Experiments with several semantic matching models and a comparison of their results.
  3. DeepCTR-pytorch (Python, 61 stars): CTR models, for example FM, DeepFM, xDeepFM, etc.
  4. Python-Library-Learning (Python, 61 stars): Notes on a variety of interesting Python libraries.
  5. NLP-Project (Python, 36 stars): Small projects done in the process of learning NLP.
  6. NLP_tensorflow_project (xBase, 34 stars): TensorFlow implementations of some NLP projects, e.g. classification, chatbot, NER, attention, QA, etc.
  7. Text-Classification-Pytorch (Python, 34 stars): Summary and comparison of Chinese classification models.
  8. Keras-Learning-Summary (Python, 19 stars): Summary of Keras knowledge points.
  9. Text-Generation-Chinese-Pytorch (Python, 13 stars)
  10. NER-Pytorch (Python, 7 stars)
  11. Community-Detection (Python, 7 stars): Summary of community detection algorithms.
  12. GAN-pytorch (Python, 6 stars): Implementation of GAN.
  13. GraphNeuralNetwork (Python, 6 stars): Implementations of GNN, GAT, GCN, GraphSAGE, PinSAGE, and other algorithms.
  14. MinProject (Python, 5 stars): Some small PyQt projects.
  15. Pytorch-Learning-Summary (Python, 4 stars): PyTorch learning summary.
  16. Algorithm (Python, 4 stars): Summary of algorithm problems from interviews.
  17. Tensorflow-Learning-Summary (Python, 3 stars): Describes some basic knowledge about TensorFlow.
  18. Python_Crawler (Python, 3 stars): Summary of Python crawler practice.
  19. Weather-forecasting-system (Python, 3 stars): A weather forecast system involving PyQt5, a crawler, and SQLite database operations.
  20. CV_pytorch_project (Python, 2 stars): CV-related projects, including image classification, object detection, semantic segmentation, instance segmentation, etc.
  21. Competition-Summary (Python, 1 star): Summaries of various competitions, with shared code.
  22. notebook (1 star): c++, data_analysis, deep_learning, docker, git, python, etc.
  23. DeepLearning-with-CV (Python, 1 star): A project about CV, summarizing techniques for CV processing.