DDPM |
DDPM: Denoising Diffusion Probabilistic Models |
UC Berkeley |
Jun 2020 |
NeurIPS 2020 |
To do |
Vision - Image generation |
Improved diffusion |
Improved Denoising Diffusion Probabilistic Models |
OpenAI |
Feb 2021 |
ICML 2021 |
|
Guided diffusion |
Diffusion Models Beat GANs on Image Synthesis |
OpenAI |
Apr 2021 |
NeurIPS 2021 |
|
ADM |
Diffusion Models Beat GANs on Image Synthesis |
OpenAI |
Apr 2021 |
NeurIPS 2021 |
|
FastDPM |
On Fast Sampling of Diffusion Probabilistic Models |
NVIDIA |
May 2021 |
ICLR Workshop 2021 |
|
LSGM |
Score-based Generative Modeling in Latent Space |
NVIDIA |
Jun 2021 |
NeurIPS 2021 |
|
Distilled-DM |
Progressive Distillation for Fast Sampling of Diffusion Models |
Google Brain |
Feb 2022 |
ICLR 2022 |
|
GGDM |
Learning Fast Samplers for Diffusion Models by Differentiating Through Sample Quality |
Google Brain |
Feb 2022 |
ICLR 2022 |
|
Vision - Text to Image |
Stable Diffusion/LDM |
High-Resolution Image Synthesis with Latent Diffusion Models |
Stability.AI |
Dec 2021 |
|
|
GLIDE |
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models |
OpenAI |
Dec 2021 |
|
|
DALL-E 2 |
Hierarchical Text-Conditional Image Generation with CLIP Latents |
OpenAI |
Apr 2022 |
|
|
KNN-Diffusion |
KNN-Diffusion: Image Generation via Large-Scale Retrieval |
Meta AI |
Apr 2022 |
|
|
Imagen |
Imagen: Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding |
Google Brain |
May 2022 |
|
|
LAION-RDM |
Text-Guided Synthesis of Artistic Images with Retrieval-Augmented Diffusion Models |
Ludwig-Maximilian University of Munich |
Jul 2022 |
|
|
DreamBooth |
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation |
Google Research |
Aug 2022 |
|
|
DreamFusion |
DreamFusion: Text-to-3D using 2D Diffusion |
Google Research |
29 Sep 2022 |
|
|
Vision - Image Editing |
SDEdit |
SDEdit: Image Synthesis and Editing with Stochastic Differential Equations |
Stanford U & CMU |
Aug 2021 |
ICLR 2022 |
|
RePaint |
RePaint: Inpainting using Denoising Diffusion Probabilistic Models |
ETH Zurich |
Jan 2022 |
CVPR 2022 |
|
Vision - Video Generation |
Video Diffusion Models |
Video Diffusion Models |
Google Brain |
Apr 2022 |
ICLR 2022 Workshop |
|
MCVD |
MCVD: Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation |
University of Montreal |
May 2022 |
|
|
Make-A-Video |
Make-A-Video: Text-to-Video Generation without Text-Video Data |
Meta AI |
29 Sep 2022 |
|
|
Imagen Video |
Imagen Video: High Definition Video Generation with Diffusion Models |
Google Brain |
5 Oct 2022 |
|
|
Natural language |
Diffusion-LM |
Diffusion-LM Improves Controllable Text Generation |
Stanford University |
May 2022 |
|
|
Audio - Audio Generation |
DiffWave |
DiffWave: A Versatile Diffusion Model for Audio Synthesis |
NVIDIA & Baidu |
Sep 2020 |
ICLR 2021 |
|
WaveGrad |
WaveGrad: Estimating Gradients for Waveform Generation |
Google Brain |
Sep 2020 |
ICLR 2021 |
|
Symbolic Music Generation |
Symbolic Music Generation with Diffusion Models |
Google Brain |
Mar 2021 |
ISMIR 2021 |
|
DiffSinger |
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism |
Zhejiang University |
May 2021 |
AAAI 2022 |
|
VDM |
Variational Diffusion Models |
Google Brain |
Jul 2021 |
NeurIPS 2021 |
|
FastDiff |
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis |
Tencent AI Lab |
Apr 2022 |
IJCAI 2022 |
|
BDDM |
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis |
Tencent AI Lab |
May 2022 |
ICLR 2022 |
|
SawSing |
DDSP-based Singing Vocoders: A New Subtractive-based Synthesizer and A Comprehensive Evaluation |
|
Aug 2022 |
ISMIR 2022 |
|
ProDiff |
ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech |
Zhejiang University |
Jul 2022 |
ACM Multimedia 2022 |
|
Audio - Audio Conversion |
DiffVC |
Diffusion-Based Voice Conversion with Fast Maximum Likelihood Sampling Scheme |
Huawei Noah's Ark Lab |
Sep 2021 |
ICLR 2022 |
|
Audio - Audio Enhancement |
NU-Wave |
NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling |
MINDSLAB |
Apr 2021 |
Interspeech 2021 |
|
CDiffSE |
Conditional Diffusion Probabilistic Model for Speech Enhancement |
CMU |
Feb 2022 |
ICASSP 2022 |
|
Audio - Text to Speech |
Grad-TTS |
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech |
Huawei Noah's Ark Lab |
May 2021 |
|
|
EdiTTS |
EdiTTS: Score-based Editing for Controllable Text-to-Speech |
Yale University |
Oct 2021 |
|
|
DiffGAN-TTS |
DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs |
Tencent AI Lab |
Jan 2022 |
|
|
Diffsound |
Diffsound: Discrete Diffusion Model for Text-to-sound Generation |
Tencent AI Lab |
Jul 2022 |
|
|