• Stars: 389
• Rank: 106,726 (Top 3 %)
• Language: MATLAB
• License: Other
• Created over 2 years ago
• Updated about 1 month ago

MATLAB Deep Learning Model Hub

Discover pretrained models for deep learning in MATLAB.

Models

Computer Vision

Natural Language Processing

Audio

Lidar

Image Classification

Pretrained image classification networks have already learned to extract powerful and informative features from natural images. Use them as a starting point to learn a new task using transfer learning.

Inputs are RGB images; the output is the predicted label and score.

These networks have been trained on more than a million images and can classify images into 1000 object categories.

Models available in MATLAB:

Note 1: Since R2024a, please use the imagePretrainedNetwork function instead and specify the pretrained model.

Network | Size (MB) | Classes | Accuracy % | Location
googlenet¹ | 27 | 1000 | 66.25 | Doc, GitHub
squeezenet¹ | 5.2 | 1000 | 55.16 | Doc
alexnet¹ | 227 | 1000 | 54.10 | Doc
resnet18¹ | 44 | 1000 | 69.49 | Doc, GitHub
resnet50¹ | 96 | 1000 | 74.46 | Doc, GitHub
resnet101¹ | 167 | 1000 | 75.96 | Doc, GitHub
mobilenetv2¹ | 13 | 1000 | 70.44 | Doc, GitHub
vgg16¹ | 515 | 1000 | 70.29 | Doc
vgg19¹ | 535 | 1000 | 70.42 | Doc
inceptionv3¹ | 89 | 1000 | 77.07 | Doc
inceptionresnetv2¹ | 209 | 1000 | 79.62 | Doc
xception¹ | 85 | 1000 | 78.20 | Doc
darknet19¹ | 78 | 1000 | 74.00 | Doc
darknet53¹ | 155 | 1000 | 76.46 | Doc
densenet201¹ | 77 | 1000 | 75.85 | Doc
shufflenet¹ | 5.4 | 1000 | 63.73 | Doc
nasnetmobile¹ | 20 | 1000 | 73.41 | Doc
nasnetlarge¹ | 332 | 1000 | 81.83 | Doc
efficientnetb0¹ | 20 | 1000 | 74.72 | Doc
ConvMixer | 7.7 | 10 | - | GitHub
Vision Transformer (Large-16) | 1100 | 1000 | 85.59 | Doc
Vision Transformer (Base-16) | 331.4 | 1000 | 85.49 | Doc
Vision Transformer (Small-16) | 84.7 | 1000 | 83.73 | Doc
Vision Transformer (Tiny-16) | 22.2 | 1000 | 78.22 | Doc

Tips for selecting a model

Pretrained networks have different characteristics that matter when choosing a network to apply to your problem. The most important characteristics are network accuracy, speed, and size. Choosing a network is generally a tradeoff between these characteristics. The following figure highlights these tradeoffs:

Figure. Comparing image classification model accuracy, speed and size.
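As a rough illustration of the accuracy/size tradeoff, the sketch below filters a handful of rows from the table above down to the models that no other listed model beats on both size and accuracy. It is written in Python purely for illustration; the function name `pareto_front` is hypothetical, and speed (the third characteristic) is left out because the table does not list it.

```python
# (name, size in MB, accuracy %) for a subset of rows from the table above.
MODELS = [
    ("squeezenet", 5.2, 55.16),
    ("shufflenet", 5.4, 63.73),
    ("mobilenetv2", 13, 70.44),
    ("googlenet", 27, 66.25),
    ("resnet18", 44, 69.49),
    ("resnet50", 96, 74.46),
    ("alexnet", 227, 54.10),
    ("nasnetlarge", 332, 81.83),
]


def pareto_front(models):
    """Return the models that are not dominated on (size, accuracy).

    A model is dominated when another model is no larger AND no less
    accurate, and strictly better on at least one of the two.
    """
    front = []
    for name, size, acc in models:
        dominated = any(
            (s <= size and a >= acc) and (s < size or a > acc)
            for _, s, a in models
        )
        if not dominated:
            front.append((name, size, acc))
    # Sort by size so the result reads from smallest to largest model.
    return sorted(front, key=lambda m: m[1])
```

With the rows above, `pareto_front(MODELS)` keeps squeezenet, shufflenet, mobilenetv2, resnet50, and nasnetlarge. googlenet, resnet18, and alexnet drop out because mobilenetv2 is both smaller and more accurate than each of them.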

Object Detection

Object detection is a computer vision technique used for locating instances of objects in images or videos. When humans look at images or video, we can recognize and locate objects of interest within a matter of moments. The goal of object detection is to replicate this intelligence using a computer.

Inputs are RGB images; the output is the predicted label, bounding box, and score.

These networks have been trained to detect 80 object classes from the COCO dataset. These models are suitable for training a custom object detector using transfer learning.

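Network | Network variant | Size (MB) | Mean Average Precision (mAP) | Object Classes | Location
EfficientDet-D0 | efficientnet | 15.9 | 33.7 | 80 | GitHub
YOLO v8 | yolo8n | 10.7 | 37.3 | 80 | GitHub
YOLO v8 | yolo8s | 37.2 | 44.9 | 80 | GitHub
YOLO v8 | yolo8m | 85.4 | 50.2 | 80 | GitHub
YOLO v8 | yolo8l | 143.3 | 52.9 | 80 | GitHub
YOLO v8 | yolo8x | 222.7 | 53.9 | 80 | GitHub
YOLOX | YoloX-s | 32 | 39.8 | 80 | Doc, GitHub
YOLOX | YoloX-m | 90.2 | 45.9 | 80 | Doc, GitHub
YOLOX | YoloX-l | 192.9 | 48.6 | 80 | Doc, GitHub
YOLO v4 | yolov4-coco | 229 | 44.2 | 80 | Doc, GitHub
YOLO v4 | yolov4-tiny-coco | 21.5 | 19.7 | 80 | Doc, GitHub
YOLO v3 | darknet53-coco | 220.4 | 34.4 | 80 | Doc
YOLO v3 | tiny-yolov3-coco | 31.5 | 9.3 | 80 | Doc
YOLO v2 | darknet19-COCO | 181 | 28.7 | 80 | Doc, GitHub
YOLO v2 | tiny-yolo_v2-coco | 40 | 10.5 | 80 | Doc, GitHub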

Tips for selecting a model

Pretrained object detectors have different characteristics that matter when choosing a network to apply to your problem. The most important characteristics are mean average precision (mAP), speed, and size. Choosing a network is generally a tradeoff between these characteristics.
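Mean average precision is typically computed by matching predicted boxes to ground-truth boxes using intersection over union (IoU). The sketch below, in Python for illustration only, computes IoU for two axis-aligned boxes. The `[x, y, width, height]` box layout and the function name `iou_xywh` are assumptions made for this example, not something the model pages above specify.

```python
def iou_xywh(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is assumed to be (x, y, width, height), with (x, y) the
    top-left corner. Returns a value in [0, 1].
    """
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b

    # Width and height of the overlapping rectangle (zero if disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h

    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0
```

A detection usually counts as correct when its IoU with a ground-truth box of the same class exceeds a chosen threshold. The threshold convention behind the mAP figures in the table is not stated here.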

Application Specific Object Detectors

These networks have been trained to detect specific objects for a given application.

Network | Application | Size (MB) | Location
Spatial-CNN | Lane detection | 74 | GitHub
RESA | Road boundary detection | 95 | GitHub
Single Shot Detector (SSD) | Vehicle detection | 44 | Doc
Faster R-CNN | Vehicle detection | 118 | Doc

Semantic Segmentation

Segmentation is essential for image analysis tasks. Semantic segmentation describes the process of associating each pixel of an image with a class label (such as flower, person, road, sky, ocean, or car).

Inputs are RGB images, outputs are pixel classifications (semantic maps).
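To make "pixel classifications" concrete, the sketch below turns per-pixel class scores into a semantic map by picking the highest-scoring class at each pixel. It is a Python illustration under stated assumptions: the nested-list layout `scores[row][col][class]` and the name `scores_to_label_map` are hypothetical, and the models listed here may return their results in a different form.

```python
def scores_to_label_map(scores, class_names):
    """Convert per-pixel class scores into a map of class labels.

    scores[r][c] is assumed to be a list with one score per class for the
    pixel at row r, column c. The label chosen for each pixel is the class
    with the highest score (ties resolve to the lowest class index).
    """
    label_map = []
    for row in scores:
        label_row = []
        for pixel_scores in row:
            best = max(range(len(pixel_scores)), key=pixel_scores.__getitem__)
            label_row.append(class_names[best])
        label_map.append(label_row)
    return label_map
```

For a 2-by-2 image with classes `["sky", "road", "car"]`, the result is a 2-by-2 grid of those labels, one per pixel.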

This network has been trained to detect 20 object classes from the PASCAL VOC dataset:

Network | Size (MB) | Mean Accuracy | Object Classes | Location
DeepLabv3+ | 209 | 0.87 | 20 | GitHub

Application Specific Semantic Segmentation Models

Network | Application | Size (MB) | Location
U-net | Raw Camera Processing | 31 | Doc
3-D U-net | Brain Tumor Segmentation | 56.2 | Doc
AdaptSeg (GAN) | Model tuning using 3-D simulation data | 54.4 | Doc

Instance Segmentation

Instance segmentation is an enhanced type of object detection that generates a segmentation map for each detected instance of an object. Instance segmentation treats individual objects as distinct entities, regardless of the class of the objects. In contrast, semantic segmentation considers all objects of the same class as belonging to a single entity.

Inputs are RGB images, outputs are pixel classifications (semantic maps), bounding boxes and classification labels.

Network | Object Classes | Location
Mask R-CNN | 80 | Doc, GitHub

Image Translation

Image translation is the task of transferring styles and characteristics from one image domain to another. This technique can be extended to other image-to-image learning operations, such as image enhancement, image colorization, defect generation, and medical image analysis.

Inputs are images, outputs are translated RGB images. This example workflow shows how a semantic segmentation map input translates to a synthetic image via a pretrained model (Pix2PixHD):

Network | Application | Size (MB) | Location
Pix2PixHD (CGAN) | Synthetic Image Translation | 648 | Doc
UNIT (GAN) | Day-to-Dusk and Dusk-to-Day Image Translation | 72.5 | Doc
UNIT (GAN) | Medical Image Denoising | 72.4 | Doc
CycleGAN | Medical Image Denoising | 75.3 | Doc
VDSR | Super Resolution (estimate a high-resolution image from a low-resolution image) | 2.4 | Doc

Pose Estimation

Pose estimation is a computer vision technique for localizing the position and orientation of an object using a fixed set of keypoints.

Inputs are RGB images; outputs are heatmaps and part affinity fields (PAFs), which are post-processed to estimate pose.

Network | Backbone Network | Size (MB) | Location
OpenPose | vgg19 | 14 | Doc
HR Net | human-full-body-w32 | 106.9 | Doc
HR Net | human-full-body-w48 | 237.7 | Doc

3D Reconstruction

3D reconstruction is the process of capturing the shape and appearance of real objects.

Network | Size (MB) | Location
NeRF | 3.78 | GitHub

Video Classification

Video classification is a computer vision technique for classifying the action or content in a sequence of video frames.

Inputs are either video alone or video with optical flow data; outputs are gesture classifications and scores.

Network | Inputs | Size (MB) | Classifications (Human Actions) | Description | Location
SlowFast | Video | 124 | 400 | Faster convergence than Inflated-3D | Doc
R(2+1)D | Video | 112 | 400 | Faster convergence than Inflated-3D | Doc
Inflated-3D | Video & Optical Flow data | 91 | 400 | Accuracy of the classifier improves when combining optical flow and RGB data. | Doc

Text Detection and Recognition

Text detection is a computer vision technique used for locating instances of text within images.

Inputs are RGB images, outputs are bounding boxes that identify regions of text.

Network | Application | Size (MB) | Location
CRAFT | Trained to detect English, Korean, Italian, French, Arabic, German and Bangla (Indian). | 3.8 | Doc, GitHub

Application Specific Text Detectors

Network | Application | Size (MB) | Location
Seven Segment Digit Recognition | Seven segment digit recognition using deep learning and OCR. This is helpful in industrial automation applications where digital displays are often surrounded by complex backgrounds. | 3.8 | Doc, GitHub

Transformers (Text)

Pretrained transformer models have already learned to extract powerful and informative features from text. Use them as a starting point to learn a new task using transfer learning.

Inputs are sequences of text, outputs are text feature embeddings.

Network | Applications | Size (MB) | Location
BERT | Feature Extraction (Sentence and Word embedding), Text Classification, Token Classification, Masked Language Modeling, Question Answering | 390 | GitHub, Doc
all-MiniLM-L6-v2 | Document Embedding, Clustering, Information Retrieval | 80 | Doc
all-MiniLM-L12-v2 | Document Embedding, Clustering, Information Retrieval | 120 | Doc
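For the information-retrieval use case listed above, a common way to use text embeddings is to rank documents by cosine similarity to a query embedding. The sketch below, in Python for illustration, assumes the embeddings are already available as plain lists of numbers. The function names and the toy vectors in the usage note are hypothetical and do not come from any of the models in the table.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_documents(query_vec, doc_vecs):
    """Return (name, similarity) pairs sorted from most to least similar.

    doc_vecs maps a document name to its embedding vector.
    """
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

For example, with a toy query `[1, 0]` and documents `{"a": [1, 0], "b": [0, 1], "c": [1, 1]}`, the ranking is `a`, then `c`, then `b`. Real sentence embeddings have hundreds of dimensions, but the ranking step is the same.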

Application Specific Transformers

Network | Application | Size (MB) | Location
FinBERT | The FinBERT model is a BERT model for financial sentiment analysis. | 388 | GitHub
GPT-2 | The GPT-2 model is a decoder model used for text summarization. | 1.2 GB | GitHub

Audio Embeddings

Audio embedding pretrained models have already learned to extract powerful and informative features from audio signals. Use them as a starting point to learn a new task using transfer learning.

Inputs are audio signals, outputs are audio feature embeddings.

Note 2: Since R2024a, please use the audioPretrainedNetwork function instead and specify the pretrained model.

Network | Application | Size (MB) | Location
VGGish² | Feature Embeddings | 257 | Doc
OpenL3² | Feature Embeddings | 200 | Doc

Application Specific Audio Models

Network | Application | Size (MB) | Output Classes | Location
vadnet² | Voice Activity Detection (regression) | 0.427 | - | Doc
YAMNet² | Sound Classification | 13.5 | 521 | Doc
CREPE² | Pitch Estimation (regression) | 132 | - | Doc

Speech to Text

Speech-to-text models provide a fast, efficient method to convert spoken language into written text, enhancing accessibility for individuals with disabilities, enabling downstream tasks like text summarization and sentiment analysis, and streamlining documentation processes. As a key element of human-machine interfaces, including personal assistants, it allows for natural and intuitive interactions, enabling machines to understand and execute spoken commands, improving usability and broadening inclusivity across various applications.

Inputs are audio signals; the output is text.

Network | Application | Size (MB) | Word Error Rate (WER) | Location
wav2vec | Speech to Text | 236 | 3.2 | GitHub
deepspeech | Speech to Text | 167 | 5.97 | GitHub
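Word error rate is the word-level edit distance between a reference transcript and the model's transcript, divided by the number of reference words. The sketch below shows that calculation in Python for illustration. The function name `word_error_rate` and the whitespace tokenization are assumptions, and the test set and text normalization behind the figures in the table (presumably percentages) are not stated here.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words.

    Counts substitutions, deletions, and insertions, each with cost 1.
    Words are split on whitespace and compared exactly (case-sensitive).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the distance between the reference words processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For the reference "the cat sat on the mat" and the hypothesis "the cat sit on mat", there is one substitution and one deletion against six reference words, giving a WER of 2/6, or about 33 %.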

Lidar

Point cloud data is acquired by a variety of sensors, such as lidar, radar, and depth cameras. Training robust classifiers with point cloud data is challenging because of the sparsity of data per object, object occlusions, and sensor noise. Deep learning techniques have been shown to address many of these challenges by learning robust feature representations directly from point cloud data.

Inputs are lidar point clouds converted to five channels; outputs are segmentation, classification, or object detection results overlaid on point clouds.

Network | Application | Size (MB) | Object Classes | Location
PointNet | Classification | 5 | 14 | Doc
PointNet++ | Segmentation | 3 | 8 | Doc
PointSeg | Segmentation | 14 | 3 | Doc
SqueezeSegV2 | Segmentation | 5 | 12 | Doc
SalsaNext | Segmentation | 20.9 | 13 | GitHub
PointPillars | Object Detection | 8 | 3 | Doc
Complex YOLO v4 | Object Detection | 233 (complex-yolov4), 21 (tiny-complex-yolov4) | 3 | GitHub

Model requests

If you'd like to request MATLAB support for additional pretrained models, please create an issue from this repo.

Alternatively send the request through to:

Jianghao Wang
Deep Learning Product Manager
[email protected]

Copyright 2023, The MathWorks, Inc.

More Repositories

1. transformer-models (MATLAB, 173 stars): Deep Learning Transformer models in MATLAB
2. reinforcement_learning_financial_trading (MATLAB, 142 stars): MATLAB example on how to use Reinforcement Learning for developing a financial trading model
3. Fault-Detection-Using-Deep-Learning-Classification (C++, 70 stars): This demo shows how to prepare, model, and deploy a deep learning LSTM based classification algorithm to identify the condition or output of a mechanical air compressor.
4. llms-with-matlab (MATLAB, 60 stars): Connect MATLAB to the OpenAI Chat Completions API (which powers ChatGPT)
5. Image-Classification-in-MATLAB-Using-TensorFlow (MATLAB, 45 stars): This example shows how to call a TensorFlow model from MATLAB using co-execution with Python.
6. pretrained-yolo-v4 (MATLAB, 43 stars): Object detection and transfer learning using pretrained YOLO v4 models in MATLAB.
7. rl-agent-based-traffic-control (MATLAB, 42 stars): Develop agent-based traffic management system by model-free reinforcement learning
8. pose-estimation-3d-with-stereo-camera (MATLAB, 41 stars): This demo uses a deep neural network and two generic cameras to perform 3D pose estimation.
9. Abnormal-EEG-Signal-Classification-Using-CNNs (MATLAB, 41 stars): This example shows how to build and train a convolutional neural network (CNN) from scratch to perform a classification task with an EEG dataset.
10. Industrial-Machinery-Anomaly-Detection (MATLAB, 39 stars): Extract features and detect anomalies in industrial machinery vibration data using a biLSTM autoencoder
11. Object-Detection-Using-YOLO-v2-Deep-Learning (Cuda, 35 stars): MATLAB example of deep learning based object detection using Yolo v2 with ResNet50 Base Network
12. Brain-MRI-Age-Classification-using-Deep-Learning (MATLAB, 32 stars): MATLAB example using deep learning to classify chronological age from brain MRI images
13. playing-Pong-with-deep-reinforcement-learning (MATLAB, 31 stars): Train a reinforcement learning agent to play a variation of Pong®
14. Explore-Deep-Network-Explainability-Using-an-App (MATLAB, 30 stars): This repository provides an app for exploring the predictions of an image classification network using several deep learning visualization techniques. Using the app, you can: explore network predictions with occlusion sensitivity, Grad-CAM, and gradient attribution methods, investigate misclassifications using confusion and t-SNE plots, visualize layer activations, and many more techniques to help you understand and explain your deep network’s predictions.
15. COVID19-Face-Mask-Detection-using-deep-learning (MATLAB, 28 stars): The entire workflow of developing deep learning model for detecting face mask.
16. Human-Pose-Estimation-with-Deep-Learning (HTML, 27 stars): MATLAB example of deep learning based human pose estimation.
17. mask-rcnn (MATLAB, 26 stars): Mask-RCNN training and prediction in MATLAB for Instance Segmentation
18. mtcnn-face-detection (HTML, 25 stars): Face detection and alignment using deep learning
19. pix2pix (MATLAB, 25 stars): Image to Image Translation Using Generative Adversarial Networks
20. resnet-50 (MATLAB, 24 stars): Repo for ResNet-50
21. constrained-deep-learning (MATLAB, 23 stars): Constrained deep learning is an advanced approach to training deep neural networks by incorporating domain-specific constraints into the learning process.
22. Pretrained-YOLOX-Network-For-Object-Detection (23 stars): YOLOX inference in MATLAB for Object Detection with yolox_s, yolox_m & yolox_l networks
23. deep-sudoku-solver (MATLAB, 22 stars): Find and solve sudoku puzzles in images using deep learning and computer vision
24. Lidar-object-detection-using-complex-yolov4 (MATLAB, 21 stars): Object detection and transfer learning on point clouds using pretrained Complex-YOLOv4 models in MATLAB
25. pretrained-spatial-CNN (MATLAB, 21 stars): Spatial-CNN for lane detection in MATLAB.
26. googlenet (MATLAB, 20 stars): Repo for GoogLeNet
27. pretrained-deeplabv3plus (MATLAB, 17 stars): DeepLabv3+ inference and training in MATLAB for Semantic Segmentation
28. Image-domain-conversion-using-CycleGAN (MATLAB, 16 stars): MATLAB example of deep learning for image domain conversion
29. Object-Detection-Using-Pretrained-YOLO-v2 (16 stars): YOLO v2 prediction and training in MATLAB for Object Detection with darknet19 & tinyYOLOv2 base networks
30. resnet-18 (MATLAB, 15 stars): Repo for ResNet-18
31. pretrained-salsanext (MATLAB, 14 stars): Semantic segmentation and transfer learning using pretrained SalsaNext model in MATLAB
32. Inverse-Problems-using-Physics-Informed-Neural-Networks-PINNs (MATLAB, 14 stars)
33. pretrained-efficientdet-d0 (MATLAB, 14 stars): Object detection and transfer learning using pretrained EfficientDet-D0 model in MATLAB.
34. Social-Distancing-Monitoring-System (MATLAB, 14 stars): AI-enabled social distancing detection tool that can detect if people are keeping a safe distance from each other by analyzing real-time video streams from the camera. Idea credit: Landing AI (https://landing.ai/)
35. fourier-neural-operator (MATLAB, 13 stars)
36. Hyperparameter-Tuning-in-MATLAB-using-Experiment-Manager-and-TensorFlow (MATLAB, 13 stars): This example shows how to use MATLAB to train a TensorFlow model and tune its hyperparameters using co-execution with Python.
37. transformer-networks-for-time-series-prediction (MATLAB, 13 stars): Deep Learning in Quantitative Finance: Transformer Networks for Time Series Prediction
38. Text-Detection-using-Deep-Learning (12 stars): Text Detection using Pretrained CRAFT model in MATLAB
39. resnet-101 (MATLAB, 11 stars): Repo for ResNet-101 model
40. mobilenet-v2 (MATLAB, 11 stars): Repo for MobileNet-v2
41. Automate-Labeling-in-Image-Labeler-using-a-Pretrained-TensorFlow-Object-Detector (MATLAB, 10 stars): This example shows how to automate object labeling in the Image Labeler app using a TensorFlow object detector model trained in Python.
42. Image-Classification-in-MATLAB-Using-Converted-TensorFlow-Model (MATLAB, 10 stars): This repository shows how to import a pretrained TensorFlow model in the SavedModel format, and use the imported network to classify an image.
43. Quantized-Deep-Neural-Network-on-Jetson-AGX-Xavier (MATLAB, 10 stars): How to create, train and quantize network, then integrate it into pre/post image processing and generate CUDA C++ code for targeting Jetson AGX Xavier
44. Hamiltonian-Neural-Network (MATLAB, 9 stars)
45. pillQC (MATLAB, 9 stars): A pill quality control dataset and associated anomaly detection example
46. stride-measurement-of-runner (MATLAB, 9 stars): This sample script measures the stride of a runner in video using a pretrained deep learning model and simple signal processing.
47. neuron-coverage-for-deep-learning (MATLAB, 8 stars): Compute the neuron coverage of a deep learning network in MATLAB.
48. quantization-aware-training (HTML, 8 stars): This example shows how to perform quantization aware training for transfer learned MobileNet-v2 network.
49. compare-PyTorch-models-from-MATLAB (MATLAB, 8 stars): Compare PyTorch models from MATLAB using co-execution
50. Classification-of-SARS-COVID-19-and-Other-Lung-Infections-from-Chest-X-Ray-Scan-Images-with-DenseNet (MATLAB, 8 stars): This example shows how to train a deep neural network to classify SARS COVID-19 and other lung infections using chest X-ray (CXR) images.
51. coexecution_speech_command (Python, 7 stars): PyTorch and TensorFlow Co-Execution for Speech Command Recognition
52. Pretrained-YOLOv8-Network-For-Object-Detection (MATLAB, 7 stars): YOLO v8 inference in MATLAB for Object Detection with yolov8n, yolov8s, yolov8m, yolov8l, and yolov8x networks
53. deepspeech (MATLAB, 7 stars): This repo provides the pretrained DeepSpeech model in MATLAB. The model is compatible with transfer learning and C/C++ code generation.
54. nerf (MATLAB, 6 stars): NeRF - Neural Radiance Fields in MATLAB
55. Physical-Concepts-Scinet (MATLAB, 6 stars): This repository provides implementation of SciNet network described in arXiv:1807.10300v3
56. wav2vec-2.0 (6 stars): This repo provides the pretrained baseline 960 hours wav2vec 2.0 model in MATLAB.
57. convmixer-patches-are-all-you-need (MATLAB, 6 stars): ConvMixer - Patches Are All You Need?
58. artistic-style-transfer (MATLAB, 5 stars): Artistic fast style transfer with a webcam
59. Seven-Segment-Digit-Recognition (5 stars): Seven Segment Digit Recognition in MATLAB
60. speech-based-information-retrieval (MATLAB, 5 stars): Retrieve answers from a knowledge base via speech recognition and information retrieval.
61. Deep_Learning_Poker_Player_using_MATLAB_and_Raspberry_Pi (MATLAB, 5 stars): This example shows how to use automatic code generation to deploy a deep learning model from MATLAB to Raspberry Pi
62. CSINet-Channel-Compression-in-MATLAB-Using-Keras (MATLAB, 5 stars): This example shows how to co-execute MATLAB and Python to simulate the effect of channel estimate compression on precoding in a MIMO OFDM channel.
63. Use-a-Python-Speech-Command-Recognition-System-to-MATLAB (MATLAB, 5 stars): Use a Python speech command recognition system in MATLAB
64. Biomass-Estimation-Starter-Code (4 stars)
65. satellite-image-semantic-segmentation (MATLAB, 4 stars): Semantic Segmentation of Large Satellite Images with blockedImage Datastores
66. compare-gradient-attribution-maps (MATLAB, 3 stars): Sanity checks for comparing gradient attribution maps
67. Pretrained-RESA-Network-For-Road-Boundary-Detection (MATLAB, 3 stars): Pretrained RESA model for road boundary detection in MATLAB
68. Animated-Adaptive-Linear-Neuron (MATLAB, 2 stars): Animating how Adaline classification works by minimizing cost. Showing comparison of three kinds of gradient descent.
69. Convert-librosa-Audio-Feature-Extraction-To-MATLAB (MATLAB, 2 stars): Convert librosa Audio Feature Extraction To MATLAB