Trans-Encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations

Trans-Encoder

[arxiv] · [amazon.science blog] · [5min-video] · [talk@RIKEN] · [openreview]

Code repo for ICLR 2022 paper Trans-Encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations
by Fangyu Liu, Yunlong Jiao, Jordan Massiah, Emine Yilmaz, Serhii Havrylov.

Trans-Encoder is a state-of-the-art unsupervised sentence similarity model. It conducts self-knowledge-distillation on top of pretrained language models by alternating between their bi- and cross-encoder forms.
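The alternation described above can be pictured as a loop in which each form pseudo-labels sentence pairs for the other. The following is only a schematic sketch, not the repository's training code: `jaccard_overlap` stands in for a pretrained bi-encoder, and `memorise` is a hypothetical stand-in "trainer" that simply stores the labels it is given. Real training would fine-tune a language model on those labels instead.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]
Scorer = Callable[[Pair], float]
Trainer = Callable[[Sequence[Pair], Sequence[float]], Scorer]


def jaccard_overlap(pair: Pair) -> float:
    """Toy stand-in for a bi-encoder: word overlap between two sentences."""
    a, b = (set(s.lower().split()) for s in pair)
    return len(a & b) / len(a | b) if a | b else 0.0


def memorise(pairs: Sequence[Pair], labels: Sequence[float]) -> Scorer:
    """Hypothetical 'trainer': returns a scorer that replays its pseudo-labels."""
    table = dict(zip(pairs, labels))
    return lambda pair: table[pair]


def self_distill(pairs: List[Pair], bi: Scorer,
                 train_cross: Trainer, train_bi: Trainer,
                 cycles: int) -> Tuple[Scorer, Scorer]:
    """Alternate pseudo-labelling between the bi- and cross-encoder forms."""
    cross = None
    for _ in range(cycles):
        # bi -> cross: the bi-encoder labels the pairs, the cross-encoder learns them
        cross = train_cross(pairs, [bi(p) for p in pairs])
        # cross -> bi: the cross-encoder labels the pairs, the bi-encoder learns them
        bi = train_bi(pairs, [cross(p) for p in pairs])
    return bi, cross


PAIRS = [("a cat sits", "a cat sleeps"), ("a cat sits", "stocks fell today")]
bi, cross = self_distill(PAIRS, jaccard_overlap, memorise, memorise, cycles=3)
print(bi(PAIRS[0]), cross(PAIRS[1]))
```

Because the toy trainers only memorise, the scores never change between cycles. The sketch shows the control flow only, with `cycles` playing the role of a repeat count.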

Huggingface pretrained models for STS

Base models

model                                     STS avg.
baseline: unsup-simcse-bert-base          76.21
trans-encoder-bi-simcse-bert-base         80.41
trans-encoder-cross-simcse-bert-base      79.90
baseline: unsup-simcse-roberta-base       76.10
trans-encoder-bi-simcse-roberta-base      80.47
trans-encoder-cross-simcse-roberta-base   81.15

Large models

model                                     STS avg.
baseline: unsup-simcse-bert-large         78.42
trans-encoder-bi-simcse-bert-large        82.65
trans-encoder-cross-simcse-bert-large     82.52
baseline: unsup-simcse-roberta-large      78.92
trans-encoder-bi-simcse-roberta-large     82.93
trans-encoder-cross-simcse-roberta-large  82.93
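To read the scores listed above as improvements over each SimCSE baseline, the differences can be computed directly. This is a small illustrative calculation: the numbers are copied from the list above, while the dictionary keys and the `gains` helper are names chosen here for convenience.

```python
# STS averages copied from the scores above: (baseline, bi-encoder, cross-encoder)
SCORES = {
    "simcse-bert-base": (76.21, 80.41, 79.90),
    "simcse-roberta-base": (76.10, 80.47, 81.15),
    "simcse-bert-large": (78.42, 82.65, 82.52),
    "simcse-roberta-large": (78.92, 82.93, 82.93),
}


def gains(baseline: float, bi: float, cross: float) -> tuple:
    """Return the absolute gain of the bi- and cross-encoder over the baseline."""
    return round(bi - baseline, 2), round(cross - baseline, 2)


for name, row in SCORES.items():
    bi_gain, cross_gain = gains(*row)
    print(f"{name:22s} bi +{bi_gain:.2f}  cross +{cross_gain:.2f}")
```

On these numbers every Trans-Encoder model gains between 3.69 and 5.05 points over its baseline.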

Dependencies

torch==1.8.1
transformers==4.9.0
sentence-transformers==2.0.0

Please view requirements.txt for more details.

Data

All training and evaluation data will be automatically downloaded when running the scripts. See src/data.py for details.

Train

--task options: sts (STS2012-2016 and STS-b), sickr, sts_sickr (STS2012-2016, STS-b, and SICK-R), qqp, qnli, mrpc, snli, custom. See src/data.py for task data details. The default is sts_sickr, which uses all STS data.

Self-distillation

>> bash train_self_distill.sh 0

0 denotes the GPU device index.

Mutual-distillation

>> bash train_mutual_distill.sh 0,1

Two GPUs are needed. By default, the SimCSE BERT and RoBERTa base models are used for ensembling; add --use_large to switch to the large models.

Train with your custom corpus

>> CUDA_VISIBLE_DEVICES=0,1 python src/mutual_distill_parallel.py \
         --batch_size_bi_encoder 128 \
         --batch_size_cross_encoder 64 \
         --num_epochs_bi_encoder 10 \
         --num_epochs_cross_encoder 1 \
         --cycle 3 \
         --bi_encoder1_pooling_mode cls \
         --bi_encoder2_pooling_mode cls \
         --init_with_new_models \
         --task custom \
         --random_seed 2021 \
         --custom_corpus_path CORPUS_PATH

CORPUS_PATH should point to your custom corpus, in which every line is a sentence pair in the form sent1||sent2.
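A corpus in the sent1||sent2 format can be produced and sanity-checked before training with a few lines of Python. This is a minimal sketch: `write_corpus` and `read_corpus` are hypothetical helpers, not part of this repo, and the strict checks (exactly one `||` per line, no empty side) are assumptions. The loader in src/data.py may be more or less lenient.

```python
from typing import Iterable, List, Tuple

SEP = "||"  # separator between the two sentences on each line


def write_corpus(path: str, pairs: Iterable[Tuple[str, str]]) -> None:
    """Write one 'sent1||sent2' line per sentence pair."""
    with open(path, "w", encoding="utf-8") as f:
        for s1, s2 in pairs:
            if SEP in s1 or SEP in s2 or "\n" in s1 + s2:
                raise ValueError(f"sentence contains {SEP!r} or a newline: {(s1, s2)}")
            f.write(f"{s1}{SEP}{s2}\n")


def read_corpus(path: str) -> List[Tuple[str, str]]:
    """Read the corpus back, failing loudly on malformed lines."""
    pairs = []
    with open(path, encoding="utf-8") as f:
        for n, line in enumerate(f, start=1):
            line = line.rstrip("\n")
            if not line:
                continue  # tolerate blank lines
            parts = line.split(SEP)
            if len(parts) != 2 or not all(p.strip() for p in parts):
                raise ValueError(f"line {n}: expected sent1{SEP}sent2, got {line!r}")
            pairs.append((parts[0], parts[1]))
    return pairs
```

For example, `write_corpus("corpus.txt", [("A man is playing a guitar.", "Someone plays an instrument.")])` writes a one-line file whose path can then be passed as CORPUS_PATH.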

Evaluate

Evaluate a single model

Bi-encoder:

>> python src/eval.py \
--model_name_or_path "cambridgeltl/trans-encoder-bi-simcse-roberta-large"  \
--mode bi \
--task sts_sickr

Cross-encoder:

>> python src/eval.py \
--model_name_or_path "cambridgeltl/trans-encoder-cross-simcse-roberta-large"  \
--mode cross \
--task sts_sickr

Evaluate ensemble

Bi-encoder:

>> python src/eval.py \
--model_name_or_path1 "cambridgeltl/trans-encoder-bi-simcse-bert-large"  \
--model_name_or_path2 "cambridgeltl/trans-encoder-bi-simcse-roberta-large"  \
--mode bi \
--ensemble \
--task sts_sickr

Cross-encoder:

>> python src/eval.py \
--model_name_or_path1 "cambridgeltl/trans-encoder-cross-simcse-bert-large"  \
--model_name_or_path2 "cambridgeltl/trans-encoder-cross-simcse-roberta-large"  \
--mode cross \
--ensemble \
--task sts_sickr
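For intuition on what ensemble evaluation measures, here is a toy sketch under two assumptions that this README does not state: that the ensemble averages the two models' similarity scores, and that STS quality is reported as Spearman's rank correlation against gold scores, as is common for STS benchmarks. The scores below are made up for illustration and the helper names are hypothetical. src/eval.py is the authoritative implementation.

```python
from statistics import mean
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def ensemble(s1: Sequence[float], s2: Sequence[float]) -> List[float]:
    """Assumed ensembling rule: average the two models' scores per pair."""
    return [(a + b) / 2 for a, b in zip(s1, s2)]


gold = [0.0, 1.5, 3.0, 5.0]    # made-up human similarity ratings
model1 = [0.1, 0.6, 0.5, 0.9]  # made-up predictions, one per sentence pair
model2 = [0.2, 0.3, 0.8, 0.7]
print(spearman(model1, gold), spearman(model2, gold),
      spearman(ensemble(model1, model2), gold))
```

In this toy case each model misorders a different pair, so averaging restores the gold ordering. Real models need not complement each other this neatly.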

Authors

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.
