rishikksh20/MLP-Mixer-pytorch

Stars: 207
Rank: 189,769 (Top 4%)
Language: Python
License: MIT License
Created: over 3 years ago
Updated: over 3 years ago

Unofficial implementation of MLP-Mixer: An all-MLP Architecture for Vision

MLP-Mixer: An all-MLP Architecture for Vision

This repo contains a PyTorch implementation of MLP-Mixer: An all-MLP Architecture for Vision.
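For orientation, here is a minimal sketch of a single Mixer layer as the paper describes it: a token-mixing MLP applied across patches, followed by a channel-mixing MLP applied across features, each with layer normalization and a skip connection. The class name `MixerBlockSketch` and its argument names are hypothetical and chosen for illustration; this is not the repo's code, which may differ in details.

```python
import torch
from torch import nn


class MixerBlockSketch(nn.Module):
    """Illustrative Mixer layer (hypothetical name, not the repo's class)."""

    def __init__(self, num_patches, dim, token_dim, channel_dim):
        super().__init__()
        self.token_norm = nn.LayerNorm(dim)
        # Token-mixing MLP: acts along the patch axis, shared across channels.
        self.token_mlp = nn.Sequential(
            nn.Linear(num_patches, token_dim),
            nn.GELU(),
            nn.Linear(token_dim, num_patches),
        )
        self.channel_norm = nn.LayerNorm(dim)
        # Channel-mixing MLP: acts along the channel axis, shared across patches.
        self.channel_mlp = nn.Sequential(
            nn.Linear(dim, channel_dim),
            nn.GELU(),
            nn.Linear(channel_dim, dim),
        )

    def forward(self, x):
        # x: [B, num_patches, dim]
        # Transpose so the linear layers mix information between patches.
        y = self.token_norm(x).transpose(1, 2)       # [B, dim, num_patches]
        x = x + self.token_mlp(y).transpose(1, 2)    # [B, num_patches, dim]
        # Mix information between channels within each patch.
        x = x + self.channel_mlp(self.channel_norm(x))
        return x


block = MixerBlockSketch(num_patches=196, dim=512, token_dim=256, channel_dim=2048)
tokens = torch.randn(2, 196, 512)
mixed = block(tokens)
print(mixed.shape)  # the block preserves the input shape
```

Because both MLPs map back to their input size, such blocks can be stacked `depth` times without changing the tensor shape.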

Usage:

import importlib

import torch

# A hyphen is not valid in an import statement, so load the module by name.
# This assumes the module file is named mlp-mixer.py and is on the import path.
MLPMixer = importlib.import_module("mlp-mixer").MLPMixer

img = torch.ones([1, 3, 224, 224])

model = MLPMixer(in_channels=3, image_size=224, patch_size=16, num_classes=1000,
                 dim=512, depth=8, token_dim=256, channel_dim=2048)

# Count the trainable parameters, in millions
parameters = sum(p.numel() for p in model.parameters() if p.requires_grad) / 1_000_000
print('Trainable Parameters: %.3fM' % parameters)

out = model(img)

print("Shape of out :", out.shape)  # [B, num_classes]
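The shapes implied by the configuration in the usage example can be worked out by hand. The sketch below assumes non-overlapping square patches and global average pooling over patches before the classifier head, as in the paper; the variable names are illustrative only.

```python
# Values taken from the usage example
batch, in_channels, image_size, patch_size = 1, 3, 224, 16
dim, num_classes = 512, 1000

# Non-overlapping patches require the image size to be a multiple of the patch size.
assert image_size % patch_size == 0

patches_per_side = image_size // patch_size      # 224 / 16 = 14
num_patches = patches_per_side ** 2              # 14 * 14 = 196
patch_values = in_channels * patch_size ** 2     # 3 * 16 * 16 = 768 values per patch

shapes = {
    "input": (batch, in_channels, image_size, image_size),
    "patch tokens": (batch, num_patches, dim),   # each patch projected to `dim`
    "pooled": (batch, dim),                      # average over the patch axis
    "logits": (batch, num_classes),              # classifier output
}

for name, shape in shapes.items():
    print(f"{name:>12}: {shape}")
```

The model is a classifier, so its output has one row of `num_classes` logits per image rather than an image-shaped tensor.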

Citation:

@misc{tolstikhin2021mlpmixer,
      title={MLP-Mixer: An all-MLP Architecture for Vision}, 
      author={Ilya Tolstikhin and Neil Houlsby and Alexander Kolesnikov and Lucas Beyer and Xiaohua Zhai and Thomas Unterthiner and Jessica Yung and Daniel Keysers and Jakob Uszkoreit and Mario Lucic and Alexey Dosovitskiy},
      year={2021},
      eprint={2105.01601},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement:

Some components are borrowed from the ViT code in @lucidrains's repo: https://github.com/lucidrains/vit-pytorch

ViViT-pytorch

Implementation of ViViT: A Video Vision Transformer

ResUnet

PyTorch implementation of ResUnet and ResUnet++

VocGAN

VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network

FNet-pytorch

Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms

convolution-vision-transformers

PyTorch Implementation of CvT: Introducing Convolutions to Vision Transformers

iSTFTNet-pytorch

iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform

hifigan-denoiser

HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks

CrossViT-pytorch

Implementation of CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification

AdaSpeech

AdaSpeech: Adaptive Text to Speech for Custom Voice

Jupyter Notebook

HiFiplusplus-pytorch

HiFi++: a Unified Framework for Neural Vocoding, Bandwidth Extension and Speech Enhancement

SoundStorm-pytorch

Google's SoundStorm: Efficient Parallel Audio Generation

Avocodo-pytorch

Avocodo: Generative Adversarial Network for Artifact-free Vocoder

Fre-GAN-pytorch

Fre-GAN: Adversarial Frequency-consistent Audio Synthesis

CeiT-pytorch

Implementation of Convolutional enhanced image Transformer

vae_tacotron2

VAE Tacotron 2, an alternative to GST Tacotron

TalkNet2-pytorch

TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction.

TFGAN

TFGAN: Time and Frequency Domain Based Generative Adversarial Network for High-fidelity Speech Synthesis

LightSpeech

LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search

HiFi-GAN

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

NaturalSpeech2

UnivNet-pytorch

UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation

AdaSpeech2

AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data

Jupyter Notebook

AudioMAE-pytorch

Unofficial PyTorch implementation of Masked Autoencoders that Listen

melgan

MelGAN implementation with Multi-Band and Full Band supports...

Jupyter Notebook

Liveness-Detection

Liveness Detection for human face

gmvae_tacotron

Gaussian Mixture VAE Tacotron

iSTFT-Avocodo-pytorch

Ultrafast GAN based Vocoder for Text to Speech

Phone-Level-Mixture-Density-Network-for-TTS

Rich Prosody Diversity Modelling with Phone-level Mixture Density Network

Jupyter Notebook

LSTM-Time-Series-Analysis

Using LSTM network for time series forecasting

Jupyter Notebook

NU-Wave-pytorch

NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling

ResMLP-pytorch

ResMLP: Feedforward networks for image classification with data-efficient training

PPSpeech

PPSpeech: Phrase based Parallel End-to-End TTS System

Zero-Shot-TTS

Unofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration

NU-Wave2-pytorch

NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates [WIP]

SiT-pytorch

SiT: Self-supervised vision Transformer

rectified-linear-attention

Sparse Attention with Linear Units

CoaT-pytorch

CoaT: Co-Scale Conv-Attentional Image Transformers

Movie-Recommender-System

LocalViT-pytorch

LocalViT: Bringing Locality to Vision Transformers

Bidirectional-LEM-pytorch

Pytorch Implementation of Bidirectional Long Expressive Memory

compact-convolution-transformer

Compact Convolution Transformers

WaveFlow

WaveFlow : A Compact Flow-based Model for Raw Audio

McKinsey-Hiring-Hack-Challenge

My solution for Online McKinsey Hiring Hack Challenge hosted by Analytics Vidhya.

Jupyter Notebook

Word2Vec

Word2Vec tutorial using tensorflow

Jupyter Notebook

IMDB-Movie-Review-sentiment-Analysis

Jupyter Notebook

Meme-recognizer

Recognize whether a given image is a meme or not

Jupyter Notebook

Introduction-to-Tensorflow

Tensorflow tutorial from scratch

Jupyter Notebook

fastspeech2_samples

MyApplication

Android application in which audio and image play simultaneously

Loan-Prediction-Challenge

Jupyter Notebook

CNN-Visualization

Jupyter Notebook

Twins-SVT-pytorch

Twins: Revisiting the Design of Spatial Attention in Vision Transformers

Image-classifier-for-all

Universal Image classifier

Jupyter Notebook

PropertySetUp

Document-Classifier

Classify documents using Machine learning

Jupyter Notebook

Data-Analysis

Jupyter Notebook

Avito-Duplicate-ads

Jupyter Notebook

Natural-Language-Processing

Jupyter Notebook

Keras

Predictive analysis using Keras, a powerful neural network library for Python that runs on top of Theano

Jupyter Notebook

Data-Mining-Algos

Well-known data mining algorithms written in Python using the scikit-learn library

Identify-Question-Type

Given a question, the aim is to identify the category it belongs to. The four categories to handle for this assignment are : Who, What, When, Affirmation(yes/no). Label any sentence that does not fall in any of the above four as "Unknown" type.

Jupyter Notebook

Inception-Transformer-pytorch

iFormer: Inception Transformer

Email-Classification-Statement-Contract

Classify emails into statements and contracts

Movie-Recommendation-System

Hybrid Movie recommendation system

Jupyter Notebook

SystemInfo

Jupyter Notebook

rishikksh20.github.io

LSTM_syntheic_gradient

Jupyter Notebook