Try out deep learning models online on Colab with a single click.
TTS
- An English female voice (LJSpeech) demo using NVIDIA/tacotron2 and NVIDIA/waveglow
- A multi-speaker TTS demo trained on LibriTTS using NVIDIA/flowtron
- An English female voice (LJSpeech) demo using Rayhane-mamah/Tacotron-2 and r9y9/wavenet_vocoder
- A Mongolian male voice demo using Rayhane-mamah/Tacotron-2 with the Griffin-Lim algorithm
- An English female voice (LJSpeech) demo using tugstugi/pytorch-dc-tts with the Griffin-Lim algorithm
- An English female voice (LJSpeech) demo using fatchord/WaveRNN (Tacotron + WaveRNN)
- An English female voice (LJSpeech) demo using mozilla/TTS (Tacotron + WaveRNN)
- NVIDIA/mellotron notebook
- Voice clone demo using CorentinJ/Real-Time-Voice-Cloning
- Official ESPnet English/Chinese/Japanese TTS notebook
- Official ForwardTacotron LJSpeech TTS notebook
Speech Recognition
- mozilla/DeepSpeech with LM on YouTube videos
- Wav2Letter+ from NVIDIA/OpenSeq2Seq without LM on YouTube videos
- Jasper from NVIDIA/OpenSeq2Seq without LM on YouTube videos
- QuartzNet from NVIDIA/NeMo without LM on YouTube videos
- QuartzNet from NVIDIA/NeMo without LM with microphone
- CitriNet from NVIDIA/NeMo without LM with microphone
- Official ESPnet Spanish->English speech translation notebook
- English/German/Spanish Silero speech recognition with snakers4/silero-models
Object Detection
- TensorFlow object detection: Faster R-CNN + InceptionResNet and SSD + MobileNet
- Cascade RCNN demo using open-mmlab/mmdetection
- YOLO demo using ayooshkathuria/pytorch-yolo-v3
- Object detection on YouTube videos using amdegroot/ssd.pytorch (SSD300)
- Mask RCNN demo using matterport/Mask_RCNN
- Mask RCNN demo using Detectron
- Official Mask RCNN demo from Detectron2
- Mask RCNN demo from torchvision
- CenterNet (Objects as Points) demo using xingyizhou/CenterNet
- CenterNet (Objects as Points) 3D car detection demo using xingyizhou/CenterNet
  - Works only on a KITTI image because of the camera parameters
- Official DE⫶TR demo notebook from facebookresearch/detr
- Official Google EfficientDet notebook
Segmentation
- For Mask RCNN, see Object Detection
- Semantic segmentation trained on ADE20K using CSAILVision/semantic-segmentation-pytorch
- DeepLabV3 from torchvision
- Fast tracking and segmentation with SiamMask on YouTube videos
- Real-time semantic segmentation with LightNet++ on YouTube videos
- Real-time instance segmentation with YOLACT on YouTube videos
- Instance segmentation with CenterMask
Multi Object Tracking
- Pedestrian tracking using ZQPei/deep_sort_pytorch (DeepSORT + YOLOv3)
Pose Detection
- OpenPose on YouTube videos
- AlphaPose on YouTube videos
- DensePose demo notebook
- HRNet using lxy5513/hrnet on YouTube videos
- Keypoint R-CNN from torchvision
Scene Text Detection
- PixelLink demo notebook
- Scene text detection using argman/EAST
- Scene text detection using CRAFT-pytorch
GAN
- BigGAN: Large Scale GAN Training for High Fidelity Natural Image Synthesis
- DeOldify: A Deep Learning based project for colorizing and restoring old images
- Talking face video generation from an image and an audio clip using Rudrabha/LipGAN
- PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models
Misc
- Slow motion using avinashpaliwal/Super-SloMo on YouTube videos
- Fine-tune GPT-2 using ak9250/gpt-2-colab
- Music source separation using sigsep/open-unmix-pytorch
- Image super-resolution using idealo/image-super-resolution
- First Order Motion Model for Image Animation using AliaksandrSiarohin/first-order-model
- Official notebook of 3D Photography using Context-aware Layered Depth Inpainting from vt-vl-lab/3d-photo-inpainting
- Image-GPT notebook
- Background Matting: The World is Your Green Screen using senguptaumd/Background-Matting