• Stars: 8
• Rank: 2,099,232 (Top 42 %)
• Language: Python
• Created almost 5 years ago
• Updated over 4 years ago

Repository Details

PyTorch implementation of a baseline image captioning model

More Repositories

1. MLE-LLaMA: Multi-language Enhanced LLaMA (Python, 301 stars)
2. Visual-LLaMA: Open LLaMA Eyes to See the World (Python, 172 stars)
3. IEA: Image Editing Anything (Python, 107 stars)
4. DiS: Scalable Diffusion Models with State Space Backbone (Python, 101 stars)
5. Video-Stable-Diffusion: Generate consistent videos with stable diffusion models (Python, 45 stars)
6. Gradient-Free-Textual-Inversion: Gradient-Free Textual Inversion for Personalized Text-to-Image Generation (Python, 33 stars)
7. Stable-Edit: Text-based real image editing with stable diffusion models (Python, 25 stars)
8. Perceiver-Music-Generation: music generation with perceiver-ar model (Python, 24 stars)
9. DeeCap: Dynamic Early Exit for Image Captioning (Python, 16 stars)
10. Vespa: Video Diffusion State Space Models (Python, 15 stars)
11. Visual-ChatGLM: Open ChatGLM Eyes to See the World (Python, 13 stars)
12. PNAIC: Partially Non-Autoregressive Image Captioning (Python, 10 stars)
13. AIO: All In One: General Multimodal Large Language Model (Python, 9 stars)
14. Future-Caption: Efficient modeling of future context for image captioning (Python, 8 stars)
15. Meta-Ensemble: Meta-Ensemble Parameter Learning (Python, 8 stars)
16. UAIC: Uncertainty-away image caption generation (Python, 7 stars)
17. Dialogue-System: Multi-modal dialogue system (Python, 5 stars)
18. Latent-Dynamics: Exploring latent dynamics for visual storytelling (Python, 4 stars)
19. MaskGMT: Masked generative music transformer (Python, 4 stars)
20. Matrix-Analysis-and-Application: References and coding homework in matrix analysis and application course in UCAS (Python, 3 stars)
21. Cleaned-Webvid: Use strategy to achieve clean webvid-10m dataset (Python, 3 stars)
22. Diverse-Image-Caption: Promoting Coherence and Diversity in Image Captioning (Python, 3 stars)
23. Visual-MOSS: Makes MOSS model understand visual information (Python, 3 stars)
24. ACSG: Actor-Critic Sequence Generation for Relative Difference Captioning (2 stars)
25. LQMA: Language Quantized Masked AutoEncoders (Python, 2 stars)
26. DSC: descriptive synthetic captions in dalle3 (2 stars)
27. feizc (2 stars)
28. MAIC: Memory augmented image captioning (Python, 2 stars)
29. SAIC: Semi-Autoregressive Image Captioning (2 stars)
30. arXiv-MM: Multimodal dataset for arXiv (Python, 2 stars)
31. DiffuCap: Controllable Image Captioning with Diffusion Model (2 stars)
32. Union: Unifying Language-Image Pre-training via Single-Tower Transformer (Python, 2 stars)
33. AAT: Attention-Aligned Transformer for Image Captioning (Python, 2 stars)
34. CLIP-MAE: When clip meet mae and beyond (Python, 2 stars)
35. Chinese-Image-Caption: An image captioner with Chinese language (Python, 2 stars)
36. ViD: Text-to-Image Diffusion Models as Refined Visual Learners (Python, 1 star)
37. Meta-ViT: Meta-ensemble parameter learning for Vision Transformer (Python, 1 star)
38. ClipCap: Incorporating CLIP features into Transformer-based image captioning (Python, 1 star)
39. CLKA: Cross Lingual Knowledge Alignment for Stable Diffusion Models (Python, 1 star)
40. Diffusion-Model: A tutorial of diffusion model for text-guide image generation (Python, 1 star)
41. LLaMA-XL: LLaMA model Beyond Length Limitation (1 star)
42. GameTag: official implementation for GameTag algorithm (Python, 1 star)
43. MoE-MLLM: Mixture-of-Experts for Multimodal Large Language Models (Python, 1 star)