License: MIT License

deep-learning-content-moderation

A curated list of resources for deep learning based content moderation: sensitive content detection, scene genre classification, nudity detection, violence detection, and substance detection from text, audio, video, and image inputs.

citation

If you find this source useful, please consider citing it in your work as:

@article{akyon2022contentmoderation,
  title={Deep Architectures for Content Moderation and Movie Content Rating},
  author={Akyon, Fatih Cagatay and Temizel, Alptekin},
  journal={arXiv},
  doi={10.48550/arXiv.2212.04533},
  year={2022}
}

datasets

movie and content moderation datasets

| name | paper | year | url | input modality | task | labels |
|---|---|---|---|---|---|---|
| LSPD | pdf | 2022 | page | image, video | image/video classification, instance segmentation | porn, normal, sexy, hentai, drawings, female/male genital, female breast, anus |
| MM-Trailer | pdf | 2021 | page | video | video classification | age rating |
| Movienet | scholar | 2021 | page | image, video, text | object detection, video classification | scene level actions and places, character bboxes |
| Movie script severity dataset | pdf | 2021 | github | text | text classification | frightening, mild, moderate, severe |
| LVU | pdf | 2021 | page | video | video classification | relationship, place, like ratio, view count, genre, writer, year per movie scene |
| Violence detection dataset | scholar | 2020 | github | video | video classification | violent, not-violent |
| Movie script dataset | pdf | 2019 | github | text | text classification | violent or not |
| Nudenet | github | 2019 | archive.org | image | image classification | nude or not |
| Adult content dataset | pdf | 2017 | contact | image | image classification | nude or not |
| Substance use dataset | pdf | 2017 | first author | image | image classification | drug related or not |
| NDPI2k dataset | pdf | 2016 | contact | video | video classification | porn or not |
| Violent Scenes Dataset | springer | 2014 | page | video | video classification | blood, fire, gun, gore, fight |
| VSD2014 | pdf | 2014 | download | video | video classification | blood, fire, gun, gore, fight |
| AIIA-PID4 | pdf | 2013 | - | image | image classification | bikini, porn, skin, non-skin |
| NDPI800 dataset | scholar | 2013 | page | video | video classification | porn or not |
| HMDB-51 | scholar | 2011 | page | video | video classification | smoke, drink |

techniques

sensitive content detection

movie content rating

| name | paper | year | model | features | datasets | tasks | context |
|---|---|---|---|---|---|---|---|
| Movies2Scenes: Learning Scene Representations Using Movie Similarities | scholar | 2022 | ViT-like video encoder + MLP | ViT-like video encoder embeddings | Private, Movienet, LVU | movie scene representation learning, video classification (sex, violence, drug-use) | movie scene content rating |
| Detection and Classification of Sensitive Audio-Visual Content for Automated Film Censorship and Rating | pdf | 2022 | CNN + GRU + MLP | CNN embeddings from video frames | Violence detection dataset | violent/non-violent classification from videos | movie scene content rating |
| Automatic parental guide ratings for short movies | page | 2021 | separate model for each task: concat + LSTM, object detector, one-class CNN embeddings | video frame pixel values, image embeddings, text | Nudenet, private dataset | profanity, violence, nudity, drug classification | movie content rating |
| From None to Severe: Predicting Severity in Movie Scripts | scholar | 2021 | multi-task pairwise ranking-classification network | GloVe, Bert and TextCNN text embeddings | Movie script severity dataset | rating classification (frightening, mild, moderate, severe) | movie content rating |
| A Case Study of Deep Learning-Based Multi-Modal Methods for Labeling the Presence of Questionable Content in Movie Trailers | scholar | 2021 | multi-modal + multi output concat+MLP | CNN+LSTM video features, Bert and DeepMoji text embeddings, MFCC audio features | MM-Trailer | rating classification (red, yellow, green) | movie trailer content rating |
| Automatic Parental Guide Scene Classification Menggunakan Metode Deep Convolutional Neural Network Dan Lstm | scholar | 2020 | 3 CNN models for 3 modalities, multi-label dataset | CNN video and audio embeddings, LSTM text (subtitle) embeddings | private dataset | gore, nudity, drug, profanity classification from video and subtitle | movie scene content rating |
| Multimodal data fusion for sensitive scene localization | scholar | 2019 | meta-learning with Naive Bayes, SVM | MFCC and prosodic features from audio, HOG and TRoF features from images | Pornography-2k dataset, VSD2014 | violent and pornographic scene localization from video | movie scene content rating |
| A Deep Learning approach for the Motion Picture Content Rating | scholar | 2019 | MLP + rule-based decision | InceptionV3 image embeddings | Violent Scenes Dataset, private dataset | violence (shooting, blood, fire, weapon) classification from video | movie scene content rating |
| Hybrid System for MPAA Ratings of Movie Clips Using Support Vector Machine | springer | 2019 | SVM | DCT features from image | private dataset | movie content rating classification from images | movie content rating |
| Inappropriate scene detection in a video stream | page | 2017 | SVM classifier + Lenet image classifier + rules-based decision | HoG and CNN features for image | private dataset | image classification: no/mild/high violence, safe/unsafe/pornography | movie frame content rating |
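
Several of the systems listed above combine per-frame classifier outputs with a rule-based decision to reach a scene-level label. The following is a minimal sketch of that general idea, not a reproduction of any listed paper: the function name `rate_scene`, the score thresholds, and the `min_fraction` rule are all assumptions chosen for illustration.

```python
def rate_scene(frame_scores, mild=0.3, severe=0.7, min_fraction=0.2):
    """Map per-frame violence scores in [0, 1] to 'none', 'mild' or 'severe'.

    A level is assigned when at least `min_fraction` of the frames reach
    its threshold, so a single noisy frame does not decide the rating.
    All threshold values here are hypothetical.
    """
    if not frame_scores:
        return "none"
    n = len(frame_scores)
    severe_fraction = sum(s >= severe for s in frame_scores) / n
    mild_fraction = sum(s >= mild for s in frame_scores) / n
    if severe_fraction >= min_fraction:
        return "severe"
    if mild_fraction >= min_fraction:
        return "mild"
    return "none"


# Toy per-frame scores (made up for illustration)
calm_scene = [0.1, 0.2, 0.1, 0.05, 0.1]
tense_scene = [0.1, 0.4, 0.5, 0.1, 0.2]
fight_scene = [0.9, 0.8, 0.1, 0.1, 0.1]
```

Requiring a minimum fraction of frames is one simple way to trade recall for robustness; in practice the thresholds would be tuned on labeled scenes.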

content moderation

| name | paper | year | model | features | datasets | tasks | context |
|---|---|---|---|---|---|---|---|
| Reliable Decision from Multiple Subtasks through Threshold Optimization: Content Moderation in the Wild | scholar | 2022 | novel threshold optimization tech. (TruSThresh) | prediction scores | UnSmile (Korean hatespeech dataset) | optimum threshold prediction | social media content moderation |
| On-Device Content Moderation | scholar | 2021 | mobilenet v3 + SSD object detector | mobilenet v3 image embeddings | private dataset | object detection + nudity classification from images | on-device content moderation |
| Gore Classification and Censoring in Images | scholar | 2021 | ensemble of CNN + MLP | mobilenet v2, densenet, vgg16 image embeddings | private dataset | gore classification from images | general content moderation |
| Automated Censoring of Cigarettes in Videos Using Deep Learning Techniques | scholar | 2020 | CNN + MLP | inception v3 image embeddings | private dataset | cigarette classification from video | general content moderation |
| A Multimodal CNN-based Tool to Censure Inappropriate Video Scenes | scholar | 2019 | CNN + SVM | InceptionV3 image embeddings, AudioVGG audio embeddings | private dataset | inappropriate (nudity+gore) classification from video | general video content moderation |
| A baseline for NSFW video detection in e-learning environments | scholar | 2019 | concat + SVM, MLP | InceptionV3 image embeddings, AudioVGG audio embeddings | YouTube8M, NDPI, Cholec80 | nudity classification from video | e-learning content moderation |
| Bringing the kid back into youtube kids: Detecting inappropriate content on video streaming platforms | scholar | 2019 | CNN + LSTM (late fusion) | CNN based encoder for image, video and audio spectrograms | private dataset | video classification: original, fake explicit, fake violent | social media content moderation |
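
Moderation classifiers like those listed above typically end in a thresholded decision on a prediction score. As a rough illustration, the sketch below picks a threshold by sweeping candidate values on a labeled validation set and keeping the one with the best F1. This is a generic baseline and not the TruSThresh method from the first row; the function names and the toy data are assumptions.

```python
def f1_at_threshold(scores, labels, threshold):
    """F1 of the rule `score >= threshold` against binary labels (1 = unsafe)."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(scores, labels):
    """Try every distinct score as a threshold; return (threshold, f1) with the highest F1."""
    candidates = sorted(set(scores))
    return max(
        ((t, f1_at_threshold(scores, labels, t)) for t in candidates),
        key=lambda pair: pair[1],
    )


# Toy validation data (made up for illustration)
val_scores = [0.05, 0.20, 0.35, 0.60, 0.80, 0.95]
val_labels = [0, 0, 1, 0, 1, 1]
threshold, f1 = best_threshold(val_scores, val_labels)
```

Optimizing F1 is only one possible choice; a moderation pipeline that must not miss unsafe content might instead fix a minimum recall and pick the most precise threshold that meets it.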

movie/scene genre classification

| name | paper | year | model | features | datasets | tasks |
|---|---|---|---|---|---|---|
| Effectively leveraging Multi-modal Features for Movie Genre Classification | scholar | 2022 | embeddings + fusion + MLP | CLIP image embeddings, PANNs audio embeddings, CLIP text embeddings | MovieNet | movie genre classification |
| OS-MSL: One Stage Multimodal Sequential Link Framework for Scene Segmentation and Classification | scholar | 2022 | embeddings + novel transformer | ResNet-18 image embeddings, ResNet-VLAD audio embeddings | TI-News | news scene segmentation/classification (studio, outdoor, interview) |
| Detection of Animated Scenes Among Movie Trailers | scholar | 2022 | CNN + GRU | EfficientNet image embeddings | Private dataset | genre classification from movie trailer scenes |
| A multi-label movie genre classification scheme based on the movie's subtitles | springer | 2022 | KNN | text frequency vectors | Private dataset | genre classification from movie subtitle text |
| A multimodal approach for multi-label movie genre classification | scholar | 2020 | CNN + LSTM | MFCCs/SSD/LBP from audio, LBP/3DCNN from video frames, Inception-v3 from poster, TFIDF from text | Private dataset | genre classification from movie trailers |
| Genre classification of movie trailers using 3d convolutional neural networks | ieee | 2020 | 3D CNN | images | Private dataset | genre classification from movie trailer scenes |
| A unified framework of deep networks for genre classification using movie trailer | scholar | 2020 | CNN + LSTM | Inception V4 image embeddings | EmoGDB | genre classification from movie trailer scenes |
| Towards story-based classification of movie scenes | scholar | 2020 | logistic regression | manually extracted categorical features | Flintstones Scene Dataset | scene classification (Obstacle, Midpoint, Climax of Act 1) |

multimodal architectures

synchronous multimodal architectures

| name | paper | year | model | features | datasets | tasks | modalities |
|---|---|---|---|---|---|---|---|
| M&M Mix: A Multimodal Multiview Transformer Ensemble | scholar | 2022 | transformer with 2 cls heads | ViT image embeddings from audio spect., frame image, optical flow | Epic-Kitchens | video/action classification | image + audio + optical flow |
| MultiMAE: Multi-modal Multi-task Masked Autoencoders | scholar | 2022 | transformer with 3 decoder + cls heads | ViT-like image enc. patch embeddings (optional modalities) | ImageNet: Pseudo labeled multi-task training dataset (depth, segm) | image cls., semantic segm., depth est. | image + depth map |
| Data2vec: A general framework for self-supervised learning in speech, vision and language | scholar | 2022 | single encoder | transformer based audio, text, image encoder embeddings | ImageNet, Librispeech | masked pretraining | image + audio + text |
| VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text | scholar | 2022 | 1 encoder per modality | transformer based audio, text, image encoder embeddings | AudioSet, HowTo100M | pretraining + video/audio classification | image + audio + text |
| Expanding Language-Image Pretrained Models for General Video Recognition | scholar | 2022 | 1 encoder per modality | transformer based video, text encoder embeddings | HMDB-51, UCF-101 | contrastive pretraining | video + text |
| Audio-Visual Instance Discrimination with Cross-Modal Agreement | scholar | 2021 | 1 encoder per modality | CNN based audio, video encoder embeddings | HMDB-51, UCF-101 | video/audio classification | video + audio |
| Robust Audio-Visual Instance Discrimination | scholar | 2021 | 1 encoder per modality | CNN based audio, video encoder embeddings | HMDB-51, UCF-101 | video/audio classification | video + audio |
| Learning transferable visual models from natural language supervision | scholar | 2021 | 1 encoder per modality | transformer based image, text encoder embeddings | JFT-300M | contrastive pretraining | image + text |
| Self-supervised multimodal versatile networks | scholar | 2020 | multiple encoders | CNN based image/audio embeddings, word2vec text embeddings | UCF101, Kinetics, AudioSet | contrastive pretraining + classification | image + audio + text |
| Uniter: Universal image-text representation learning | scholar | 2020 | multimodal encoder | combined embeddings | COCO, Visual Genome, Conceptual Captions | qa/image-text retrieval | image + text |
| 12-in-1: Multi-task vision and language representation learning | scholar | 2020 | multimodal encoder | combined embeddings | COCO, Flickr30k | qa/image-text retrieval | image + text |
| Two-stream convolutional networks for action recognition in videos | scholar | 2014 | 1 encoder per modality | CNN based audio, text encoder embeddings | HMDB-51, UCF-101 | video/audio classification | video + optical flow |

asynchronous multimodal architectures

| name | paper | year | model | features | datasets | tasks | modalities |
|---|---|---|---|---|---|---|---|
| OmniMAE: Single Model Masked Pretraining on Images and Videos | scholar | 2022 | transformer with 1 cls. head | ViT-like image/video enc. patch embeddings | ImageNet, SSv2 | video/action classification | image + video |
| OMNIVORE: A Single Model for Many Visual Modalities | scholar | 2022 | transformer with 3 cls. heads | ViT-like image/video enc. patch embeddings | ImageNet, Kinetics, SSv2, SUN RGB-D | image cls., action recog., depth est. | image + video + depth map |
| Polyvit: Co-training vision transformers on images, videos and audio | scholar | 2021 | transformer with 9 cls. heads | ViT-like image/video/audio enc. embeddings | ImageNet, CIFAR, Kinetics, Moments in Time, AudioSet, VGGSound | image cls., video cls., audio cls. | image + video + audio |

action recognition

with transformers

| name | paper | year | model | features | datasets | tasks |
|---|---|---|---|---|---|---|
| Frozen CLIP Models are Efficient Video Learners | scholar | 2022 | transformer with 1 cls head | CLIP image embeddings | ImageNet, Kinetics, SSv2 | action recognition |
| Videomae: Masked autoencoders are data-efficient learners for self-supervised video pre-training | scholar | 2022 | transformer with 1 cls head | ViT-like video enc. patch embeddings | Kinetics, SSv2 | action recognition |
| Bevt: Bert pretraining of video transformers | scholar | 2022 | encoder-decoder transformer | VideoSwin image/video enc. embeddings | Kinetics, SSv2 | action recognition |
| Video swin transformer | scholar | 2022 | Swin trans. with cls. head | Swin video enc. embeddings | Kinetics, SSv2 | action recognition |
| Is space-time attention all you need for video understanding? | scholar | 2021 | transformer with cls. head | ViT-like video enc. patch embeddings | Kinetics, SSv2 | action recognition |

with 3D CNNs

| name | paper | year | model | features | datasets | tasks |
|---|---|---|---|---|---|---|
| X3d: Expanding architectures for efficient video recognition | scholar | 2020 | CNN with cls. head | 3D CNN based video enc. embeddings | Kinetics, SSv2 | action recognition |
| Slowfast networks for video recognition | scholar | 2019 | CNN with cls. head | 3D CNN based video enc. embeddings | Kinetics, SSv2 | action recognition |
| A closer look at spatiotemporal convolutions for action recognition (R2+1D) | scholar | 2018 | CNN with cls. head | 3D CNN based video enc. embeddings | Kinetics, HMDB-51, UCF-101 | action recognition |
| Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset (I3D) | scholar | 2017 | CNN with cls. head | 3D CNN based video enc. embeddings | Kinetics, HMDB-51, UCF-101 | action recognition |

contrastive representation learning

| name | paper | date |
|---|---|---|
| Vatt: Transformers for multimodal self-supervised learning from raw video, audio and text | scholar | 2021 |
| Supervised contrastive learning | scholar | 2020 |
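
The papers above train encoders with contrastive objectives, which pull matching pairs together and push non-matching pairs apart in embedding space. A minimal pure-Python sketch of a one-directional InfoNCE-style loss over a batch of paired embeddings is given below. It is an illustration of the general idea only and does not reproduce the exact loss of either listed paper; the function names, the temperature value, and the toy embeddings are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Mean cross-entropy of picking positives[i] for anchors[i] among all positives."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # log-sum-exp with max subtraction for numerical stability
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(value - max_logit) for value in logits)
        )
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy 2-D embeddings (made up): row i of each list is meant to be a matching pair
video_emb = [[1.0, 0.0], [0.0, 1.0]]
audio_aligned = [[0.9, 0.1], [0.1, 0.9]]
audio_swapped = [[0.1, 0.9], [0.9, 0.1]]
loss_aligned = info_nce(video_emb, audio_aligned)
loss_swapped = info_nce(video_emb, audio_swapped)
```

The loss is small when each anchor is most similar to its own pair and large when the pairs are mismatched, which is the signal that drives the encoders during pretraining.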

review papers

| name | paper | date |
|---|---|---|
| Machine Learning Models for Content Classification in Film Censorship and Rating | pdf | 2022 |
| A survey of artificial intelligence strategies for automatic detection of sexually explicit videos | scholar | 2022 |
| A survey on video content rating: taxonomy, challenges and open issues | pdf | 2021 |
| Multimodal Learning with Transformers: A Survey | scholar | 2022 |
| A Survey Paper on Movie Trailer Genre Detection | scholar | 2020 |

tools

| name | url | description |
|---|---|---|
| better-profanity | github | fast swear word detection from strings |
| PySceneDetect | github | Python and OpenCV-based scene cut/transition detection program & library |
| LAION safety toolkit | github | NSFW detector trained on LAION dataset |
| pysrt | github | Python parser for SubRip (srt) files |
| ffsubsync | github | Automagically synchronize subtitles with video. |
| MoviePy | github | Video editing with Python |
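
Subtitle parsing and profanity detection, two of the tool categories above, are often chained to flag dialogue by timestamp. The sketch below shows that pipeline using only the Python standard library as a stand-in; it does not use the APIs of pysrt or better-profanity. The helper names, the placeholder word list, and the sample subtitles are assumptions for illustration.

```python
import re

# Placeholder word list for illustration only; a real pipeline would use a maintained list.
BLOCKLIST = {"darn", "heck"}

SRT_TEXT = """1
00:00:01,000 --> 00:00:02,500
Well, darn it.

2
00:00:03,000 --> 00:00:04,000
Nothing to see here.

3
00:00:05,000 --> 00:00:06,200
What the HECK was that?
"""


def parse_srt(text):
    """Yield (start, end, text) for each cue in a SubRip-formatted string."""
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 3 or "-->" not in lines[1]:
            continue  # skip malformed cues
        start, end = (part.strip() for part in lines[1].split("-->"))
        yield start, end, " ".join(lines[2:])


def flag_cues(text, blocklist=BLOCKLIST):
    """Return (start, end, hits) for cues containing a blocklisted whole word."""
    flagged = []
    for start, end, cue_text in parse_srt(text):
        words = set(re.findall(r"[a-z']+", cue_text.lower()))
        hits = sorted(words & blocklist)
        if hits:
            flagged.append((start, end, hits))
    return flagged


flagged = flag_cues(SRT_TEXT)
```

Matching whole lowercase words keeps the example simple; it will miss obfuscated spellings, which dedicated libraries are designed to handle.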
