• Stars
    12
  • Rank 1,597,372 (Top 32 %)
  • Language
    Jupyter Notebook
  • License
    MIT License
  • Created over 2 years ago
  • Updated over 2 years ago

Repository Details

A collection of generative and training notebooks mirrored to Google Colab.

More Repositories

1

Open-Assistant

OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieves information dynamically to do so.
Python
37,011
star
2

audio-dataset

Audio Dataset for training CLAP and other models
Python
616
star
3

CLIP_benchmark

CLIP-like model evaluation
Jupyter Notebook
601
star
4

dalle2-laion

Pretrained DALL-E 2 from LAION
Python
499
star
5

CLAP

Contrastive Language-Audio Pretraining
Python
479
star
6

natural_voice_assistant

Python
439
star
7

laion-3d

Collect a large 3D dataset and build models
253
star
8

phenaki

A Phenaki reproduction using PyTorch.
Python
218
star
9

aesthetic-predictor

A linear estimator on top of CLIP to predict the aesthetic quality of pictures
Jupyter Notebook
199
star
10

Open-Instruction-Generalist

Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks
Python
195
star
11

ldm-finetune

Home of `erlich` and `ongo`. Finetune latent-diffusion/glid-3-xl text2image on your own data.
Python
169
star
12

scaling-laws-openclip

Reproducible scaling laws for contrastive language-image learning (https://arxiv.org/abs/2212.07143)
Jupyter Notebook
152
star
13

CLIP-based-NSFW-Detector

Python
135
star
14

laion-datasets

Descriptions of and pointers to LAION datasets
HTML
131
star
15

laion-dreams

Aim for the moon. If you miss, you may hit a star.
121
star
16

laion.ai

HTML
110
star
17

AIW

Alice in Wonderland code base for experiments and raw experiment data
Python
108
star
18

LAION-5B-WatermarkDetection

Python
102
star
19

video-clip

Let's make a video clip
92
star
20

Open-GIA

O-GIA is an umbrella for a research, infrastructure, and project ecosystem that should provide open-source, reproducible datasets, models, applications & safety tools for Open Generalist Interactive Agents (O-GIA). O-GIA systems will act in collaboration with humans or autonomously, supporting various kinds of validated decision making and assistance.
91
star
21

General-GPT

Jupyter Notebook
64
star
22

Discord-Scrapers

Implementation of a Discord channel scraper to generate datasets.
Python
60
star
23

Text-to-speech

Python
58
star
24

Big-Interleaved-Dataset

Big-Interleaved-Dataset
Python
57
star
25

riverbed

Tools for content datamining and NLP at scale
Python
41
star
26

OCR-ensemble

Jupyter Notebook
38
star
27

Conditional-Pretraining-of-Large-Language-Models

Python
37
star
28

interesting-text-datasets

33
star
29

blade2blade

Adversarial Training and SFT for Bot Safety Models
Python
32
star
30

temporal-embedding-aggregation

Aggregating embeddings over time
Python
31
star
31

deep-image-diffusion-prior

Inverts CLIP text embeds to image embeds and visualizes with deep-image-prior.
Jupyter Notebook
31
star
32

watermark-detection

A repository containing datasets and tools to train a watermark classifier.
Python
31
star
33

medical

This repository will be a summary and outlook on all our open, medical, AI advancements.
Jupyter Notebook
28
star
34

Anh

Anh - LAION's multilingual assistant datasets and models
Python
27
star
35

laion50BU

Un-*** 50-billion multimodal dataset
24
star
36

conditioned-prior

(WIP) Use LAION-AI's CLIP "conditioned prior" to generate CLIP image embeds from CLIP text embeds.
Python
18
star
37

LAION-SAFETY

An open toolbox for NSFW & toxicity detection
Jupyter Notebook
16
star
38

opendream

Frontend (and soon also middleware and backend) for a new, open-source image generation platform.
TypeScript
14
star
39

laion5B-paper

Building the laion5B paper
13
star
40

laion-dedup

Python
13
star
41

laionide

This repository contains training code and checkpoints for finetuning GLIDE.
Python
12
star
42

super-resolution

This is the LAION repository for creating open super-resolution models with the help of LAION-5B subsets.
11
star
43

dataset-spec

Describe the format of image/text datasets
Python
10
star
44

LAION-PEOPLE

This project provides a dataset with bounding boxes, body poses, 3D face meshes & captions of people from our LAION-2.2B. Additionally, it provides clusters based on the poses and face meshes, and pose-related captions based on these cluster assignments.
10
star
45

image-deduplication-testset

HTML
8
star
46

project-menu

Projects at LAION
8
star
47

laion-ai.github.io

LAION GitHub website
Svelte
6
star
48

dataset-usage

This repository is a summary of all systems and scientific papers that use LAION datasets.
6
star
49

repository-overview

This repository will give a quick overview of all projects and repositories from LAION.
5
star
50

LionizeR

Experiments with Summarization, Long Context and Retrieval
Python
4
star
51

KAISER

Knowledge Acquisition and Interlinking via Semantic Embeddings and Reasoning
4
star
52

lucidrains-projects

A summary of all lucidrains repositories and links to training / research approaches by LAION or other communities.
Jupyter Notebook
3
star
53

decentralized-learning

A basic setup for decentralized-learning that can be used for training future DALLE/CLIP/CLAP models.
3
star
54

diffusion-prior

DALL-E2 diffusion prior
Python
3
star
55

GIF

General / Global Inference Framework
Python
3
star
56

website

This is the development repository of the LAION-AI website.
HTML
3
star
57

safety-pipeline

A collection of safety classifiers and models to process image and texts.
Python
3
star
58

NeoGen

3
star
59

laion5b-subsets

Creating subsets from laion5b via embeddings search
Jupyter Notebook
2
star
60

human_artifacts

A repo containing images for artifact annotation.
2
star
61

public-relations

All media / publicity on LAION and related stuff!
2
star
62

public-domain-images

A collection of public domain images donated for ML training.
2
star
63

math_problems-step-by-step_solutions

Here we provide and collect many functions to generate math problems and step-by-step solutions for LLM training
Python
2
star
64

language-models

2
star
65

dataset-inference

The new repository for the general inference pipeline.
Python
2
star
66

introduction-resources

Recommended intro resources
2
star
67

balanced-laion5b

This repository is meant to help find a good distribution for huge datasets like LAION-5B for more efficient training.
2
star
68

hand-inference

A model to run hand inference on a cluster.
Jupyter Notebook
2
star
69

BUD-E_V1.0

BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the creation and integration of diverse skills for educational and research applications.
1
star
70

laion5b-bias

This repository is a collection of biases found in the LAION-5B dataset.
1
star
71

dataset-tasks

Datasets that should be downloaded & converted to our standard training format.
1
star
72

LAION-AUDIO

This repository contains prompts & best practices to annotate audio clips with a very high degree of detail using audio-language models
1
star
73

AIW_webpage

Alice in Wonderland project and initiative webpage
1
star