  • Stars: 406
  • Rank: 106,421 (top 3%)
  • Language: Python
  • License: MIT License
  • Created: over 5 years ago
  • Updated: 5 months ago


Repository Details

The entmax mapping and its loss, a family of sparse softmax alternatives.

entmax


This package provides a PyTorch implementation of entmax and entmax losses: a sparse family of probability mappings and corresponding loss functions, generalizing softmax / cross-entropy.
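
The losses are the Fenchel-Young losses induced by these mappings, and they can stand in where cross-entropy would be used. A minimal sketch of the drop-in usage, assuming the loss classes exported by the package (e.g. Entmax15Loss; check entmax/__init__.py for the exact names and signatures in your version):

import torch
from entmax import Entmax15Loss  # SparsemaxLoss / EntmaxBisectLoss are analogous

logits = torch.randn(4, 5, requires_grad=True)  # toy batch: 4 samples, 5 classes
target = torch.tensor([0, 2, 4, 1])             # gold class indices

# Used like torch.nn.CrossEntropyLoss, but paired with a sparse output mapping.
criterion = Entmax15Loss()
loss = criterion(logits, target)
loss.backward()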

Features:

  • Exact partial-sort algorithms for 1.5-entmax and 2-entmax (sparsemax).
  • A bisection-based algorithm for generic alpha-entmax (see the sketch after this list).
  • Gradients w.r.t. alpha for adaptive, learned sparsity!
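
For intuition, computing generic alpha-entmax reduces to one-dimensional root finding: pick the threshold tau so that the entries of [(alpha - 1) * x - tau]_+ ^ (1 / (alpha - 1)) sum to one. Below is a minimal, illustrative bisection sketch (the entmax_bisect_sketch name is ours; the library's entmax_bisect is the real, autograd-aware implementation):

import torch

def entmax_bisect_sketch(x, alpha=1.5, n_iter=50):
    # Illustrative only; assumes a 1-D tensor. Solve for tau with
    #   sum_i [x'_i - tau]_+ ** (1 / (alpha - 1)) == 1,  where x' = (alpha - 1) * x,
    # then p_i = [x'_i - tau]_+ ** (1 / (alpha - 1)).
    x = x * (alpha - 1)
    lo, hi = x.max() - 1.0, x.max()  # tau is bracketed by [max(x') - 1, max(x')]
    for _ in range(n_iter):
        tau = (lo + hi) / 2
        mass = (torch.clamp(x - tau, min=0) ** (1 / (alpha - 1))).sum()
        if mass < 1:  # total mass decreases monotonically in tau
            hi = tau
        else:
            lo = tau
    p = torch.clamp(x - tau, min=0) ** (1 / (alpha - 1))
    return p / p.sum()  # normalize away residual bisection error

With alpha=1.5 and x = torch.tensor([-2, 0, 0.5]), this reproduces the entmax15 output shown in the example below.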

Requirements: Python 3, PyTorch >= 1.0 (and pytest for unit tests)

Example

In [1]: import torch

In [2]: from torch.nn.functional import softmax

In [3]: from entmax import sparsemax, entmax15, entmax_bisect

In [4]: x = torch.tensor([-2, 0, 0.5])

In [5]: softmax(x, dim=0)
Out[5]: tensor([0.0486, 0.3592, 0.5922])

In [6]: sparsemax(x, dim=0)
Out[6]: tensor([0.0000, 0.2500, 0.7500])

In [7]: entmax15(x, dim=0)
Out[7]: tensor([0.0000, 0.3260, 0.6740])

Gradients w.r.t. alpha (continued):

In [8]: from torch.autograd import grad

In [9]: x = torch.tensor([[-1, 0, 0.5], [1, 2, 3.5]])

In [10]: alpha = torch.tensor(1.33, requires_grad=True)

In [11]: p = entmax_bisect(x, alpha)

In [12]: p
Out[12]:
tensor([[0.0460, 0.3276, 0.6264],
        [0.0026, 0.1012, 0.8963]], grad_fn=<EntmaxBisectFunctionBackward>)

In [13]: grad(p[0, 0], alpha)
Out[13]: (tensor(-0.2562),)
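
Because the mapping is differentiable in alpha, alpha itself can be a learnable parameter, so the degree of sparsity is tuned by gradient descent (alpha = 1 recovers softmax; alpha = 2 recovers sparsemax). A minimal, illustrative sketch with a made-up squared-error objective:

import torch
from entmax import entmax_bisect

x = torch.randn(8, 10)   # toy logits
target = torch.zeros(8, 10)
target[:, 0] = 1.0       # push all mass onto the first class

alpha = torch.tensor(1.5, requires_grad=True)
opt = torch.optim.SGD([alpha], lr=0.1)

for step in range(100):
    opt.zero_grad()
    p = entmax_bisect(x, alpha=alpha, dim=-1)
    loss = ((p - target) ** 2).mean()
    loss.backward()      # the gradient flows into alpha through the mapping
    opt.step()
    with torch.no_grad():
        alpha.clamp_(min=1.05, max=2.0)  # keep alpha in a sensible range (> 1)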

Installation

pip install entmax

Citations

Sparse Sequence-to-Sequence Models

@inproceedings{entmax,
  author    = {Peters, Ben and Niculae, Vlad and Martins, Andr{\'e} FT},
  title     = {Sparse Sequence-to-Sequence Models},
  booktitle = {Proc. ACL},
  year      = {2019},
  url       = {https://www.aclweb.org/anthology/P19-1146}
}

Adaptively Sparse Transformers

@inproceedings{correia19adaptively,
  author    = {Correia, Gon\c{c}alo M and Niculae, Vlad and Martins, Andr{\'e} FT},
  title     = {Adaptively Sparse Transformers},
  booktitle = {Proc. EMNLP-IJCNLP},
  year      = {2019},
}


More Repositories

  1. infinite-former (Python, 65 stars)
  2. tutorial (47 stars): Web page for our tutorial on latent structure for NLP
  3. lp-sparsemap (C++, 41 stars): LP-SparseMAP: Differentiable sparse structured prediction in coarse factor graphs
  4. UA_COMET (Python, 34 stars): Repository for "Uncertainty-Aware Machine Translation Evaluation", accepted to Findings of EMNLP 2021
  5. OpenNMT-APE (Python, 33 stars)
  6. sparse-marginalization-lvm (Python, 28 stars): Official PyTorch (Lightning) implementation of the NeurIPS 2020 paper "Efficient Marginalization of Discrete and Structured Latent Variables via Sparsity"
  7. scheduled-sampling-transformers (Python, 25 stars): Code for the paper "Scheduled Sampling for Transformers"
  8. tower-eval (Python, 24 stars)
  9. mcan-vqa-continuous-attention (Python, 21 stars)
  10. uncertainties_MT_eval (Python, 21 stars): Code and data for the paper "Disentangling Uncertainty in Machine Translation Evaluation", accepted at EMNLP 2022
  11. sparse_text_generation (Python, 19 stars)
  12. hallucinations-in-nmt (17 stars)
  13. sparse_continuous_distributions (Python, 16 stars): Open-source code for sparse continuous distributions and corresponding Fenchel-Young losses
  14. robust_MT_evaluation (Jupyter Notebook, 16 stars): Repository for "BLEU Meets COMET: Combining Lexical and Neural Metrics Towards Robust Machine Translation Evaluation", accepted at EAMT 2023
  15. lmt_hallucinations (Shell, 14 stars)
  16. qaware-decode (Python, 14 stars): A repository for experiments in quality-aware decoding
  17. OpenNMT-entmax (Python, 14 stars)
  18. qe-evaluation (Python, 12 stars): Evaluation scripts for the 2019 machine translation quality estimation shared task
  19. sparse-communication (Jupyter Notebook, 12 stars)
  20. understanding-spigot (Python, 11 stars): Code for the paper "Understanding the Mechanics of SPIGOT: Surrogate Gradients for Latent Structure Learning"
  21. entmax-jax (Python, 11 stars): The entmax mapping in JAX
  22. spectra-rationalization (Python, 10 stars): Repository for SPECTRA: Sparse Structured Text Rationalization, accepted at the EMNLP 2021 main conference
  23. explainable-qe-shared-task (Jupyter Notebook, 9 stars): IST-Unbabel 2021 submission for the Quality Estimation Shared Task
  24. translation_llm (Jupyter Notebook, 8 stars)
  25. non-exchangeable-crc (Jupyter Notebook, 7 stars)
  26. crest (Python, 7 stars): Code for CREST: A Joint Framework for Rationalization and Counterfactual Text Generation, accepted at ACL 2023
  27. pyturbo (Python, 6 stars): Neural dependency parser with higher-order features
  28. S7 (Python, 6 stars): Smoothing and Shrinking the Sparse Seq2Seq Search Space
  29. ot-hallucination-detection (Python, 5 stars)
  30. efficient_kNN_MT (Python, 5 stars)
  31. unn (Python, 5 stars): Code for the paper "Modeling Structure with Undirected Neural Networks"
  32. chunk-based_knn-mt (Python, 5 stars)
  33. sigmorphon-seq2seq (Python, 4 stars): DeepSPIN's submission to SIGMORPHON 2020
  34. spec (Python, 4 stars): The Explanation Game: Towards Prediction Explainability through Sparse Communication
  35. SIGMORPHON2019 (Python, 3 stars): IT-IST's submission to SIGMORPHON 2019 Task 1
  36. speech-continuous-attention (Python, 3 stars): Speech classification using continuous attention mechanisms
  37. tutorial-latent-struct-src (TeX, 3 stars): Sources for our slides for the latent structure in NLP tutorial
  38. quest-decoding (Python, 3 stars): A package for sampling from Gibbs distributions during inference with LLMs
  39. translation-hypothesis-ensembling (Shell, 3 stars)
  40. vqa-multimodal-continuous-attention (Python, 3 stars)
  41. quati (Python, 2 stars): Simple and modular library for document classification and sequence tagging
  42. TVmax (Python, 2 stars)
  43. SSHN (Python, 2 stars): Sparse and Structured Hopfield Networks
  44. doce (Jupyter Notebook, 2 stars): Repository for DOCE
  45. deep-spin.github.io (HTML, 1 star): Website of the DeepSPIN ERC project