foamliu/Image-Captioning-PyTorch

Stars: 153
Rank: 243,368 (Top 5%)
Language: Python
License: Apache License 2.0
Created over 6 years ago, updated almost 5 years ago

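Chinese Image Captioning + Visual Attention

Chinese Image Captioning

A PyTorch implementation of Chinese image captioning with visual attention.

Show, Attend and Tell is remarkable work, and its authors have released their original implementation.

The model learns "where to look": as it generates the caption word by word, its gaze moves across the image to focus on the region most relevant to the next word.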
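To make the "where to look" idea concrete, here is a minimal, dependency-free sketch of one soft-attention step. It is an illustration only, not this repository's code: the function names are hypothetical, and a plain dot product stands in for the learned scoring network that Show, Attend and Tell uses.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_attention(region_features, decoder_state):
    """Return (attention weights, context vector) for one decoding step.

    region_features: one feature vector per image region
    decoder_state:   current decoder state, same length as a feature vector
    """
    # Assumption: a dot product scores each region; the paper learns this score.
    scores = [sum(f * h for f, h in zip(region, decoder_state))
              for region in region_features]
    weights = softmax(scores)
    # Context vector: attention-weighted average of the region features.
    dim = len(decoder_state)
    context = [sum(w * region[d] for w, region in zip(weights, region_features))
               for d in range(dim)]
    return weights, context

# Three toy regions with 2-dimensional features.
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights, context = soft_attention(regions, decoder_state=[2.0, 0.0])
```

The weights play the role of the attention map shown in the demo: the region whose features best match the decoder state receives the largest weight, and the context vector fed to the decoder is dominated by that region.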

Dependencies

Python 3.5
PyTorch 0.4

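Dataset

Uses the Chinese image captioning dataset from AI Challenger 2017, which contains 300,000 images and 1.5 million Chinese captions. Training set: 210,000 images; validation set: 30,000 images; test set A: 30,000 images; test set B: 30,000 images.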
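As a quick sanity check on the figures above, the split sizes can be summed and compared with the caption count. The dictionary keys are illustrative names, and the totals are the rounded figures quoted in this README, so the result is approximate.

```python
# Split sizes as stated in the dataset description.
splits = {"train": 210000, "validation": 30000, "test_a": 30000, "test_b": 30000}

total_images = sum(splits.values())
total_captions = 1500000  # rounded figure from the description

# Average number of captions per image implied by those totals.
captions_per_image = total_captions / total_images
print(total_images, captions_per_image)  # prints: 300000 5.0
```

The four splits add up to the stated 300,000 images, which works out to roughly five captions per image.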

Download the Chinese image captioning dataset and place it in the data directory.

Network Architecture

Usage

Data Preprocessing

Extract the 210,000 training images and 30,000 validation images:

$ python pre_process.py
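For readers who want to see what the extraction step involves, the sketch below unpacks image archives into a target directory using only the standard library. It assumes the downloaded images arrive as zip archives; the helper names are hypothetical, and this is not the contents of pre_process.py.

```python
import os
import zipfile

def extract_archive(archive_path, dest_dir):
    """Unpack a zip archive into dest_dir and return the number of entries."""
    if not os.path.isdir(dest_dir):
        os.makedirs(dest_dir)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest_dir)
        return len(archive.namelist())

def extract_all(archive_paths, dest_dir="data"):
    """Extract every archive that exists and return the paths that are missing."""
    missing = []
    for path in archive_paths:
        if os.path.isfile(path):
            extract_archive(path, dest_dir)
        else:
            missing.append(path)
    return missing
```

Calling extract_all with the paths of the training and validation archives returns an empty list when both were found and unpacked, so a non-empty result is a prompt to check the download before training.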

Training

$ python train.py

To visualize the training process, run:

$ tensorboard --logdir path_to_current_dir/logs

Demo

Download the pretrained model, place it in the models directory, then run:

$ python demo.py

Original image	Attention

Sponsorship

If this project has helped you, a small donation is welcome.
