• Stars
    star
    756
  • Rank 60,056 (Top 2 %)
  • Language
    Python
  • License
    Apache License 2.0
  • Created over 3 years ago
  • Updated 11 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Fast, differentiable sorting and ranking in PyTorch

Torchsort

Tests

Fast, differentiable sorting and ranking in PyTorch.

Pure PyTorch implementation of Fast Differentiable Sorting and Ranking (Blondel et al.). Much of the code is copied from the original Numpy implementation at google-research/fast-soft-sort, with the isotonic regression solver rewritten as a PyTorch C++ and CUDA extension.

Install

pip install torchsort

To build the CUDA extension you will need the CUDA toolchain installed. If you want to build in an environment without a CUDA runtime (e.g. docker), you will need to export the environment variable TORCH_CUDA_ARCH_LIST="Pascal;Volta;Turing;Ampere" before installing.

Conda Installation On some systems the package my not compile with `pip` install in conda environments. If this happens you may need to:
  1. Install g++ with conda install -c conda-forge gxx_linux-64=9.40
  2. Run export CXX=/path/to/miniconda3/envs/env_name/bin/x86_64-conda_cos6-linux-gnu-g++
  3. Run export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/miniconda3/lib
  4. pip install --force-reinstall --no-cache-dir --no-deps torchsort

Thanks to @levnikmyskin, @sachit-menon for pointing this out!

Pre-built Wheels

Pre-built wheels are currently available on Linux for some Python/PyTorch/CUDA combinations:

# torchsort version, supports >= 0.1.9
export TORCHSORT=0.1.9
# PyTorch version, supports pt20 and pt113 for versions 2.0 and 1.13 respectively
export TORCH=pt20
# CUDA version, supports cpu, cu113, cu117, cu118 for CPU-only, CUDA 11.3, CUDA 11.7 and CUDA 11.8 respectively
export CUDA=cu118
# Python version, supports cp310 and cp311 for versions 3.10 and 3.11 respectively
export PYTHON=cp310

pip install https://github.com/teddykoker/torchsort/releases/download/v${TORCHSORT}/torchsort-${TORCHSORT}+${TORCH}${CUDA}-${PYTHON}-${PYTHON}-linux_x86_64.whl

Thanks to @siddharthab for the help creating the build action!

Usage

torchsort exposes two functions: soft_rank and soft_sort, each with parameters regularization ("l2" or "kl") and regularization_strength (a scalar value). Each will rank/sort the last dimension of a 2-d tensor, with an accuracy dependent upon the regularization strength:

import torch
import torchsort

x = torch.tensor([[8, 0, 5, 3, 2, 1, 6, 7, 9]])

torchsort.soft_sort(x, regularization_strength=1.0)
# tensor([[0.5556, 1.5556, 2.5556, 3.5556, 4.5556, 5.5556, 6.5556, 7.5556, 8.5556]])
torchsort.soft_sort(x, regularization_strength=0.1)
# tensor([[-0., 1., 2., 3., 5., 6., 7., 8., 9.]])

torchsort.soft_rank(x)
# tensor([[8., 1., 5., 4., 3., 2., 6., 7., 9.]])

Both operations are fully differentiable, on CPU or GPU:

x = torch.tensor([[8., 0., 5., 3., 2., 1., 6., 7., 9.]], requires_grad=True).cuda()
y = torchsort.soft_sort(x)

torch.autograd.grad(y[0, 0], x)
# (tensor([[0.1111, 0.1111, 0.1111, 0.1111, 0.1111, 0.1111, 0.1111, 0.1111, 0.1111]],
#         device='cuda:0'),)

Example

Spearman's Rank Coefficient

Spearman's rank coefficient is a very useful metric for measuring how monotonically related two variables are. We can use Torchsort to create a differentiable Spearman's rank coefficient function so that we can optimize a model directly for this metric:

import torch
import torchsort

def spearmanr(pred, target, **kw):
    pred = torchsort.soft_rank(pred, **kw)
    target = torchsort.soft_rank(target, **kw)
    pred = pred - pred.mean()
    pred = pred / pred.norm()
    target = target - target.mean()
    target = target / target.norm()
    return (pred * target).sum()

pred = torch.tensor([[1., 2., 3., 4., 5.]], requires_grad=True)
target = torch.tensor([[5., 6., 7., 8., 7.]])
spearman = spearmanr(pred, target)
# tensor(0.8321)

torch.autograd.grad(spearman, pred)
# (tensor([[-5.5470e-02,  2.9802e-09,  5.5470e-02,  1.1094e-01, -1.1094e-01]]),)

Benchmark

Benchmark

torchsort and fast_soft_sort each operate with a time complexity of O(n log n), each with some additional overhead when compared to the built-in torch.sort. With a batch size of 1 (see left), the Numba JIT'd forward pass of fast_soft_sort performs about on-par with the torchsort CPU kernel, however its backward pass still relies on some Python code, which greatly penalizes its performance.

Furthermore, the torchsort kernel supports batches, and yields much better performance than fast_soft_sort as the batch size increases.

Benchmark

The torchsort CUDA kernel performs quite well with sequence lengths under ~2000, and scales to extremely large batch sizes. In the future the CUDA kernel can likely be further optimized to achieve performance closer to that of the built in torch.sort.

Reference

@inproceedings{blondel2020fast,
  title={Fast differentiable sorting and ranking},
  author={Blondel, Mathieu and Teboul, Olivier and Berthet, Quentin and Djolonga, Josip},
  booktitle={International Conference on Machine Learning},
  pages={950--959},
  year={2020},
  organization={PMLR}
}

More Repositories

1

pedalnet

Deep Learning for Guitar Effect Emulation
Python
333
star
2

image-gpt

PyTorch Implementation of OpenAI's Image GPT
Python
250
star
3

blog

Source code for my personal blog
Jupyter Notebook
180
star
4

survivorship-free-spy

Python
108
star
5

cryptopunks-gan

Simple SN-GAN to generate CryptoPunks
Python
71
star
6

unsupervised-deep-homography

PyTorch implementation of Unsupervised Deep Homography: https://arxiv.org/abs/1709.03966
Python
61
star
7

tinyloader

Python
54
star
8

u-noise

Official PyTorch code for U-Noise: Learnable Noise Masks for Interpretable Image Segmentation (ICIP 2021)
Python
39
star
9

performer

Simply Numpy implementation of the FAVOR+ attention mechanism, https://teddykoker.com/2020/11/performers/
Python
36
star
10

evidential-learning-pytorch

Evidential Deep Learning in PyTorch
Python
35
star
11

grokking

PyTorch implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets"
Python
30
star
12

learning-to-learn-jax

JAX implementation of Learning to learn by gradient descent by gradient descent
Python
25
star
13

bookflip

Textbook trading website built with Flask and Vue.js
Python
20
star
14

image-forensics

PyTorch implementation of ImageForensics: https://hms-idac.github.io/ImageForensics/
Python
12
star
15

mpnn-for-quantum-chem

Pytorch implementation of MPNN for Quantum Chemistry
Python
12
star
16

e3nn.c

Pure C implementation of e3nn
C
11
star
17

trumpy

A command line program for impersonating people on Twitter, in this case Donald Trump.
Python
6
star
18

pipedal

HTML
4
star
19

go-react

Boilerplate to get started with React and Golang
Go
3
star
20

crappermapper

A website for rating and locating toilets.
JavaScript
3
star
21

lift

A new way to share fitness programs and measure progress. Golang + React + MongoDB
JavaScript
2
star
22

representme

A simple application to see your local representives based on location.
Objective-C
2
star
23

thenetwork

Highschool capstone project.
PHP
2
star
24

webswitch

Control LEDs over web sockets.
HTML
2
star
25

photocell-graph

Graph a live feed of RCtime readings from a photocell over web sockets.
HTML
2
star
26

teddykoker

1
star
27

llama-rs-server

Rust
1
star
28

nodebike

JavaScript
1
star