  • Stars: 152
  • Rank: 244,685 (Top 5%)
  • Language: TypeScript
  • License: Apache License 2.0
  • Created: over 4 years ago
  • Updated: 27 days ago

Repository Details

DISCO is a code-free and installation-free browser platform that allows any non-technical user to collaboratively train machine learning models without sharing any private data.

DISCO - Distributed Collaborative Machine Learning

DISCO leverages federated 🌟 and decentralized ✨ learning to allow several data owners to collaboratively build machine learning models without sharing any original data.

The latest version is always running at the following link, directly in your browser, for web and mobile:

🕺 https://epfml.github.io/disco/ 🕺


🪄 DEVELOPERS: Contribute or customize DISCO HERE


WHY DISCO?

  • To build deep learning models across private datasets without compromising data privacy, ownership, sovereignty, or model performance
  • To create an easy-to-use platform that allows non-specialists to participate in collaborative learning

⚙️ HOW DISCO WORKS

  • DISCO has a public model – private data approach
  • Private and secure model updates – not data – are communicated to either:
    • a central server: federated learning (🌟)
    • directly between users: decentralized learning (✨), i.e. without central coordination
  • Model updates are then securely aggregated into a trained model, as sketched below
  • See more HERE
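
To make the flow concrete, here is a minimal TypeScript sketch of the federated (🌟) variant of this idea: each participant trains locally and shares only a weight update, and a coordinator averages the updates. The types and function names are illustrative placeholders, not the actual DISCO API.

```ts
// Illustrative sketch only: a "public model – private data" round, where
// participants send weight updates (never data) and the server averages them.

type WeightUpdate = number[]; // flattened model weights (placeholder representation)

// Federated-style aggregation: average the updates received by the server.
function aggregate(updates: WeightUpdate[]): WeightUpdate {
  const n = updates.length;
  const dim = updates[0].length;
  const averaged = new Array<number>(dim).fill(0);
  for (const update of updates) {
    for (let i = 0; i < dim; i++) {
      averaged[i] += update[i] / n;
    }
  }
  return averaged;
}

// Each participant computes an update on private data and shares only the update.
const updateFromAlice: WeightUpdate = [0.1, -0.2, 0.05];
const updateFromBob: WeightUpdate = [0.3, 0.0, -0.15];
console.log(aggregate([updateFromAlice, updateFromBob])); // ≈ [0.2, -0.1, -0.05]
```

In the decentralized (✨) variant, the same averaging happens between peers (e.g. each user averaging the updates received from its neighbours) instead of on a central server.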

DISCO TECHNOLOGY

  • DISCO supports arbitrary deep learning tasks and model architectures via TF.js
  • ✨ relies on peer-to-peer communication
  • Learn more about secure aggregation and differential privacy for privacy-respecting training HERE (see the illustration below)
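
As an illustration of the privacy side, one common ingredient of differentially private training is to clip each update and add Gaussian noise before sharing it. The sketch below shows only that generic ingredient, with made-up parameter names; it is not DISCO's actual secure-aggregation or differential-privacy implementation.

```ts
// Generic differential-privacy ingredient (illustration, not DISCO's code):
// clip an update to a maximum L2 norm, then add Gaussian noise before sharing.

function privatizeUpdate(
  update: number[],
  clipNorm: number, // maximum L2 norm allowed for the shared update
  noiseStd: number, // standard deviation of the added Gaussian noise
): number[] {
  const norm = Math.sqrt(update.reduce((sum, v) => sum + v * v, 0));
  const scale = Math.min(1, clipNorm / (norm || 1));
  return update.map((v) => v * scale + gaussianSample(0, noiseStd));
}

// One Gaussian sample via the Box-Muller transform.
function gaussianSample(mean: number, std: number): number {
  const u = 1 - Math.random(); // avoid log(0)
  const v = Math.random();
  return mean + std * Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Example: the clipped, noised update is what would be communicated.
console.log(privatizeUpdate([0.4, -0.3, 0.2], 0.5, 0.01));
```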

🧪 RESEARCH-BASED DESIGN

DISCO aims to enable open-access and easy-to-use distributed training which is

  • 🌪️ efficient (R1, R2)
  • 🔒 privacy-preserving (R3, R4)
  • 🛠️ fault-tolerant and dynamic over time (R5)
  • 🥷 robust to malicious actors and data poisoning (R6, R7)
  • 🍎 🍌 interpretable in imperfectly interoperable data distributions (R8)
  • 🪞 personalizable (R9)
  • 🥕 fairly incentivizes participation

🏁 HOW TO USE DISCO

  • Start by exploring our example DISCOllaboratives in the Tasks tab.
  • The example models are based on popular datasets such as Titanic, MNIST, or CIFAR-10.
  • It is also possible to create a custom task without coding. Just upload the following two files:
    • A TensorFlow.js model file in JSON format (useful links to create and save your model)
    • A weight file in .bin format
      • These are the initial weights provided to new users joining your task (pre-trained or random initialisation)
    • You can choose from several existing dataloaders
    • Then... select your DISCO training scheme (🌟 or ✨)... connect your data and... 📊

Note: Currently only CSV and Image data types are supported. Adding new data types, preprocessing code, or dataloaders is possible in developer mode (see the developer guide). Specific instructions on how to build a custom task can be found HERE; a minimal export sketch follows below.
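
As a hedged illustration of how the two required files can be produced, TensorFlow.js can export a model's architecture as JSON together with a binary weight file via its `downloads://` save handler. The architecture below is a placeholder; any TF.js model works, and the weights can be random (as here) or pre-trained.

```ts
// Sketch: exporting the two files a custom task needs (model JSON + weights .bin).
// Running this in the browser downloads my-model.json and my-model.weights.bin.
import * as tf from '@tensorflow/tfjs';

async function exportInitialModel(): Promise<void> {
  // Placeholder architecture; replace with the model for your task.
  const model = tf.sequential();
  model.add(tf.layers.dense({ inputShape: [28 * 28], units: 64, activation: 'relu' }));
  model.add(tf.layers.dense({ units: 10, activation: 'softmax' }));

  // 'downloads://my-model' triggers a browser download of the topology
  // (my-model.json) and the initial weights (my-model.weights.bin).
  await model.save('downloads://my-model');
}

exportInitialModel();
```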


JOIN US

  • You are welcome to join us on Slack

More Repositories

  1. ML_course - EPFL Machine Learning Course, Fall 2024 (Jupyter Notebook, 1,254 stars)
  2. sent2vec - General purpose unsupervised sentence representations (C++, 1,192 stars)
  3. OptML_course - EPFL Course - Optimization for Machine Learning - CS-439 (Jupyter Notebook, 1,122 stars)
  4. attention-cnn - Source code for "On the Relationship between Self-Attention and Convolutional Layers" (Python, 1,073 stars)
  5. landmark-attention - Landmark Attention: Random-Access Infinite Context Length for Transformers (Python, 258 stars)
  6. federated-learning-public-code (Python, 157 stars)
  7. collaborative-attention - Code for Multi-Head Attention: Collaborate Instead of Concatenate (Python, 148 stars)
  8. powersgd - Practical low-rank gradient compression for distributed optimization: https://arxiv.org/abs/1905.13727 (Python, 137 stars)
  9. dynamic-sparse-flash-attention (Jupyter Notebook, 129 stars)
  10. DenseFormer (Python, 74 stars)
  11. llm-baselines (Python, 68 stars)
  12. ChocoSGD - Decentralized SGD and Consensus with Communication Compression: https://arxiv.org/abs/1907.09356 (Python, 59 stars)
  13. sparsifiedSGD - Sparsified SGD with Memory: https://arxiv.org/abs/1809.07599 (Jupyter Notebook, 54 stars)
  14. optML-pku - summer school materials (42 stars)
  15. LocalSGD-Code (Python, 42 stars)
  16. error-feedback-SGD - SGD with compressed gradients and error-feedback: https://arxiv.org/abs/1901.09847 (Jupyter Notebook, 28 stars)
  17. interpret-lm-knowledge - Extracting knowledge graphs from language models as a diagnostic benchmark of model performance (NeurIPS XAI 2021) (Jupyter Notebook, 22 stars)
  18. byzantine-robust-optimizer - Learning from history for Byzantine Robustness (Jupyter Notebook, 21 stars)
  19. Bi-Sent2Vec - Robust Cross-lingual Embeddings from Parallel Sentences (C++, 20 stars)
  20. opt-summerschool - Short Course on Optimization for Machine Learning - Slides and Practical Labs - DS3 Data Science Summer School, June 24 to 28, 2019, Paris, France (Jupyter Notebook, 20 stars)
  21. cola - CoLa - Decentralized Linear Learning: https://arxiv.org/abs/1808.04883 (Python, 18 stars)
  22. opt-shortcourse - Short Course on Optimization for Machine Learning - Slides and Practical Lab - Pre-doc Summer School on Learning Systems, July 3 to 7, 2017, Zürich, Switzerland (Jupyter Notebook, 18 stars)
  23. powergossip - Code for "Practical Low-Rank Communication Compression in Decentralized Deep Learning" (Python, 15 stars)
  24. byzantine-robust-noniid-optimizer (Python, 15 stars)
  25. X2Static - X2Static embeddings (Python, 12 stars)
  26. kubernetes-setup - MLO group setup for kubernetes cluster (Dockerfile, 12 stars)
  27. topology-in-decentralized-learning - Code related to "Beyond spectral gap: The role of the topology in decentralized learning" (Python, 10 stars)
  28. quasi-global-momentum (Python, 10 stars)
  29. relaysgd - Code for the paper "RelaySum for Decentralized Deep Learning on Heterogeneous Data" (Jupyter Notebook, 10 stars)
  30. piecewise-affine-multiplication (Python, 7 stars)
  31. rotational-optimizers (Python, 6 stars)
  32. byzantine-robust-decentralized-optimizer (Jupyter Notebook, 6 stars)
  33. uncertainity-estimation - Code for the paper "The Peril of Popular Deep Learning Uncertainty Estimation Methods" (Jupyter Notebook, 6 stars)
  34. getting-started (Python, 6 stars)
  35. text_to_image_generation (Python, 5 stars)
  36. easy-summary - difficulty-guided text summarization (Python, 5 stars)
  37. FeAI - Federated Learning with TensorFlow.js (Vue, 4 stars)
  38. ghost-noise (Python, 3 stars)
  39. autoTrain - Open Challenge - Automatic Training for Deep Learning (Python, 3 stars)
  40. pax - JAX-like API for PyTorch (Python, 3 stars)
  41. personalized-collaborative-llms (Python, 2 stars)
  42. phantomedicus - MedSurge: medical survey generator (Jupyter Notebook, 1 star)