

Distributed Deep Learning in Open Collaborations

This repository contains the code for the NeurIPS 2021 paper

"Distributed Deep Learning in Open Collaborations"

Michael Diskin*, Alexey Bukhtiyarov*, Max Ryabinin*, Lucile Saulnier, Quentin Lhoest, Anton Sinitsin, Dmitry Popov, Dmitry Pyrkin, Maxim Kashirin, Alexander Borzunov, Albert Villanova del Moral, Denis Mazur, Ilia Kobelev, Yacine Jernite, Thomas Wolf, Gennady Pekhimenko

Link: arXiv

Note

This repository contains a snapshot of the code used to conduct experiments in the paper.

Please use the up-to-date version of our library if you want to try out collaborative training and/or set up your own experiment. It contains many substantial improvements, including better documentation and numerous bug fixes.

Installation

Before running the experiments, please set up the environment by following the steps below:

  • Prepare an environment with Python 3.7-3.9. Anaconda is recommended, but not required.
  • Install the hivemind library from the master branch or by running pip install hivemind==0.9.9.post1 (see the setup sketch below).
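
For reference, a typical setup might look as follows. This is only a sketch: the environment name dedloc is an arbitrary placeholder, and the master-branch install assumes the hivemind repository at github.com/learning-at-home/hivemind.

    # create an isolated environment with a supported Python version
    conda create -n dedloc python=3.9
    conda activate dedloc

    # either install the pinned release used for the paper's experiments...
    pip install hivemind==0.9.9.post1

    # ...or the latest master branch
    pip install git+https://github.com/learning-at-home/hivemind.git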

For all distributed experiments, the installation procedure must be repeated on every machine that participates in the experiment. We recommend using machines with at least 2 CPU cores, 16 GB RAM and, when applicable, a low/mid-tier NVIDIA GPU.

Experiments

The code is divided into several sections matching the corresponding experiments:

  • albert contains the code for controlled experiments with ALBERT-large on WikiText-103;
  • swav is for training SwAV on ImageNet data;
  • sahajbert contains the code used to conduct a public collaborative experiment for the Bengali language ALBERT;
  • p2p is a step-by-step tutorial that explains decentralized NAT traversal and circuit relays.

We recommend running the albert experiments first: the other experiments build on top of its code and may require a more careful setup (e.g., for public participation). Furthermore, for this experiment, we provide a script for launching experiments on preemptible GPUs in the cloud.
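
For a sense of what these experiments do under the hood, here is a minimal sketch of a hivemind collaborative training loop. It is written against the hivemind 0.9 API (hivemind.DHT, hivemind.CollaborativeOptimizer) from memory rather than taken from this repository; the model, run name, and hyperparameter values are illustrative placeholders, so treat the per-experiment scripts as authoritative.

    import torch
    import hivemind

    # The first peer starts a fresh DHT; every other peer would instead pass
    # initial_peers=["<host>:<port>"] pointing at an already running participant.
    dht = hivemind.DHT(start=True)

    # Stand-ins for the real model and local optimizer used in the experiments.
    model = torch.nn.Linear(512, 512)
    base_opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    # Wrap the local optimizer: peers accumulate gradients and trigger a global
    # averaging step once the collaboration has processed the target batch size.
    opt = hivemind.CollaborativeOptimizer(
        opt=base_opt,
        dht=dht,
        prefix="demo_run",           # run name shared by all peers (placeholder)
        target_batch_size=4096,      # global batch size per collaborative update
        batch_size_per_step=32,      # samples this peer contributes per local step
        verbose=True,
    )

    for step in range(100):
        batch = torch.randn(32, 512)        # dummy data in place of a real loader
        loss = model(batch).pow(2).mean()   # dummy objective
        loss.backward()
        opt.step()
        opt.zero_grad()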

Acknowledgements

This project is the result of a collaboration between Yandex, Hugging Face, MIPT, HSE University, University of Toronto, Vector Institute, and Neuropark.

We also thank Stas Bekman, Dmitry Abulkhanov, Roman Zhytar, Alexander Ploshkin, Vsevolod Plokhotnyuk, and Roman Kail for their invaluable help with building the training infrastructure. In addition, we thank Abhishek Thakur for his help with downstream evaluation, as well as Tanmoy Sarkar and Omar Sanseviero, who helped us organize the collaborative experiment and gave regular status updates to the participants over the course of the training run.

Contacts

Feel free to ask any questions in our Discord chat or by email.

Citation

@inproceedings{diskin2021distributed,
    title = {Distributed Deep Learning In Open Collaborations},
    author = {Michael Diskin and Alexey Bukhtiyarov and Max Ryabinin and Lucile Saulnier and Quentin Lhoest and Anton Sinitsin and Dmitry Popov and Dmitriy Pyrkin and Maxim Kashirin and Alexander Borzunov and Albert Villanova del Moral and Denis Mazur and Ilia Kobelev and Yacine Jernite and Thomas Wolf and Gennady Pekhimenko},
    booktitle = {Advances in Neural Information Processing Systems},
    editor = {A. Beygelzimer and Y. Dauphin and P. Liang and J. Wortman Vaughan},
    year = {2021},
    url = {https://openreview.net/forum?id=FYHktcK-7v}
}

More Repositories

1. rtdl: Research on Tabular Deep Learning: Papers & Packages (Python, 864 stars)
2. ddpm-segmentation: Label-Efficient Semantic Segmentation with Diffusion Models (ICLR 2022) (Python, 655 stars)
3. tab-ddpm: [ICML 2023] The official implementation of the paper "TabDDPM: Modelling Tabular Data with Diffusion Models" (Python, 346 stars)
4. navigan: Navigating the GAN Parameter Space for Semantic Image Editing (Jupyter Notebook, 296 stars)
5. rtdl-num-embeddings: (NeurIPS 2022) On Embeddings for Numerical Features in Tabular Deep Learning (Python, 295 stars)
6. tabular-dl-tabr: The implementation of "TabR: Unlocking the Power of Retrieval-Augmented Tabular Deep Learning" (Python, 231 stars)
7. rtdl-revisiting-models: (NeurIPS 2021) Revisiting Deep Learning Models for Tabular Data (Python, 191 stars)
8. swarm: Official code for "SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient" (Python, 115 stars)
9. RuLeanALBERT: A pretrained masked language model for the Russian language that uses a memory-efficient architecture (Python, 89 stars)
10. heterophilous-graphs: A Critical Look at the Evaluation of GNNs under Heterophily: Are We Really Making Progress? (Python, 86 stars)
11. invertible-cd: Invertible Consistency Distillation for Text-Guided Image Editing in Around 7 Steps (Python, 63 stars)
12. GBDT-uncertainty (Jupyter Notebook, 50 stars)
13. graph-glove: PyTorch code for the EMNLP 2020 paper "Embedding Words in Non-Vector Space with Unsupervised Graph Learning" (Python, 40 stars)
14. DVAR: Official implementation of "Is This Loss Informative? Faster Text-to-Image Customization by Tracking Objective Dynamics" (NeurIPS 2023) (Python, 35 stars)
15. sparqling-queries: Implementation of the EMNLP 2021 paper "SPARQLing Database Queries from Intermediate Question Decompositions" by Irina Saparina and Anton Osokin (Python, 34 stars)
16. moshpit-sgd: Official implementation of "Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices" (Jupyter Notebook, 28 stars)
17. adaptive-diffusion: [CVPR 2024] Adaptive Teacher-Student Collaboration for Text-Conditional Diffusion Models (Python, 26 stars)
18. gan-transfer: Supplementary code for "When, Why, and Which Pretrained GANs Are Useful?" (ICLR 2022) (Jupyter Notebook, 24 stars)
19. specexec (Python, 15 stars)
20. btard: Code for the paper "Secure Distributed Training at Scale" (ICML 2022) (Python, 14 stars)
21. structural-graph-shifts: Evaluating Robustness and Uncertainty of Graph Models Under Structural Distributional Shifts (NeurIPS 2023) (Python, 10 stars)
22. crosslingual_winograd: "It's All in the Heads" (Findings of ACL 2021), official implementation and data (Python, 10 stars)
23. distill-nf: Code for the paper "Distilling the Knowledge from Conditional Normalizing Flows" (Jupyter Notebook, 9 stars)
24. gan_vs_diff_sr: Does Diffusion Beat GAN in Image Super Resolution? (8 stars)
25. classification-measures: Official implementation and data for "Good Classification Measures and How to Find Them" (NeurIPS 2021) (Python, 7 stars)
26. tabred: A benchmark of tabular machine learning in-the-wild with real-world, industry-grade tabular datasets (Python, 7 stars)
27. text-to-img-hypernymy: Official code for "Hypernymy Understanding Evaluation of Text-to-Image Models via WordNet Hierarchy" (Jupyter Notebook, 6 stars)
28. learnable-init: Code for the paper "Discovering Weight Initializers with Meta-Learning" (Jupyter Notebook, 5 stars)
29. mind-your-format: Mind Your Format: Towards Consistent Evaluation of In-Context Learning Improvements (Jupyter Notebook, 5 stars)
30. dnar: The implementation of "Discrete Neural Algorithmic Reasoning" (Python, 5 stars)
31. proxy-dirichlet-distillation: Implementation of "Scaling Ensemble Distribution Distillation to Many Classes with Proxy Targets" (NeurIPS 2021) and "Uncertainty Estimation in Autoregressive Structured Prediction" (ICLR 2021) (Python, 3 stars)
32. msr: An official repository of "Multi-Sentence Resampling: A Simple Approach to Alleviate Dataset Length Bias and Beam-Search Degradation" (Python, 2 stars)
33. vqdm: Official repository for the paper "VQDM: Accurate Compression of Text-to-Image Diffusion Models via Vector Quantization" (1 star)