

Distributed Deep Learning in Open Collaborations

This repository contains the code for the NeurIPS 2021 paper

"Distributed Deep Learning in Open Collaborations"

Michael Diskin*, Alexey Bukhtiyarov*, Max Ryabinin*, Lucile Saulnier, Quentin Lhoest, Anton Sinitsin, Dmitry Popov, Dmitry Pyrkin, Maxim Kashirin, Alexander Borzunov, Albert Villanova del Moral, Denis Mazur, Ilia Kobelev, Yacine Jernite, Thomas Wolf, Gennady Pekhimenko

Link: ArXiv

Note

This repository contains a snapshot of the code used to conduct experiments in the paper.

Please use the up-to-date version of our hivemind library if you want to try out collaborative training and/or set up your own experiment. It contains many substantial improvements, including better documentation and bug fixes.

Installation

Before running the experiments, please set up the environment by following the steps below:

  • Prepare an environment with Python 3.7–3.9. Anaconda is recommended, but not required.
  • Install the hivemind library, either from the master branch or by running pip install hivemind==0.9.9.post1 (see the sketch below).
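
For reference, a typical setup might look like the following. This is a minimal sketch: the environment name (dedloc) and the exact Python version are illustrative, and installing from the master branch assumes the hivemind repository at github.com/learning-at-home/hivemind.

    # Create and activate a fresh environment (the name is arbitrary)
    conda create -n dedloc python=3.8 -y
    conda activate dedloc

    # Option 1: install the exact version used in the paper's experiments
    pip install hivemind==0.9.9.post1

    # Option 2: install the latest version from the master branch
    pip install git+https://github.com/learning-at-home/hivemind.git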

For all distributed experiments, the installation procedure must be repeated on every machine that participates in the experiment. We recommend using machines with at least 2 CPU cores, 16 GB of RAM and, when applicable, a low- or mid-tier NVIDIA GPU.

Experiments

The code is divided into several directories, one per experiment:

  • albert contains the code for controlled experiments with ALBERT-large on WikiText-103;
  • swav is for training SwAV on ImageNet data;
  • sahajbert contains the code used to conduct a public collaborative experiment for the Bengali language ALBERT;
  • p2p is a step-by-step tutorial that explains decentralized NAT traversal and circuit relays.

We recommend running the albert experiments first: the other experiments build on top of its code and may require more careful setup (e.g., for public participation). Furthermore, for this experiment, we provide a script for launching experiments using preemptible GPUs in the cloud.
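
To make the workflow concrete, below is a hedged sketch of how a controlled ALBERT run is typically launched with hivemind-based code: one peer starts first and creates the DHT, and GPU workers join it by address. The script names, flags, and address here are assumptions for illustration (modeled on the hivemind ALBERT example); see the albert folder for the actual entry points and arguments.

    # On the coordinator peer: create the DHT for the run
    # (script name, flags, and port are illustrative)
    python run_first_peer.py --listen_on '[::]:31337' --experiment_prefix albert_wikitext

    # On every GPU machine: join the run by pointing at the coordinator
    python run_trainer.py --initial_peers COORDINATOR_IP:31337 --experiment_prefix albert_wikitext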

Acknowledgements

This project is the result of a collaboration between Yandex, Hugging Face, MIPT, HSE University, University of Toronto, Vector Institute, and Neuropark.

We also thank Stas Bekman, Dmitry Abulkhanov, Roman Zhytar, Alexander Ploshkin, Vsevolod Plokhotnyuk and Roman Kail for their invaluable help with building the training infrastructure. In addition, we thank Abhishek Thakur for helping with downstream evaluation, as well as Tanmoy Sarkar and Omar Sanseviero, who helped us organize the collaborative experiment and gave regular status updates to the participants over the course of the training run.

Contacts

Feel free to ask any questions in our Discord chat or by email.

Citation

@inproceedings{diskin2021distributed,
    title = {Distributed Deep Learning In Open Collaborations},
    author = {Michael Diskin and Alexey Bukhtiyarov and Max Ryabinin and Lucile Saulnier and Quentin Lhoest and Anton Sinitsin and Dmitry Popov and Dmitriy Pyrkin and Maxim Kashirin and Alexander Borzunov and Albert Villanova del Moral and Denis Mazur and Ilia Kobelev and Yacine Jernite and Thomas Wolf and Gennady Pekhimenko},
    booktitle = {Advances in Neural Information Processing Systems},
    editor = {A. Beygelzimer and Y. Dauphin and P. Liang and J. Wortman Vaughan},
    year = {2021},
    url = {https://openreview.net/forum?id=FYHktcK-7v}
}
