mammoth
An Extendible (General) Continual Learning Framework based on PyTorch - official codebase of Dark Experience for General Continual Learning
meshed-memory-transformer
Meshed-Memory Transformer for Image Captioning. CVPR 2020
dress-code
Dress Code: High-Resolution Multi-Category Virtual Try-On. ECCV 2022
multimodal-garment-designer
This is the official repository for the paper "Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing". ICCV 2023
show-control-and-tell
Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions. CVPR 2019
novelty-detection
Latent space autoregression for novelty detection.
LLaVA-MORE
LLaVA-MORE: Enhancing Visual Instruction Tuning with LLaMA 3.1
art2real
Art2Real: Unfolding the Reality of Artworks via Semantically-Aware Image-to-Image Translation. CVPR 2019
VKD
PyTorch code for ECCV 2020 paper: "Robust Re-Identification by Multiple Views Knowledge Distillation"
VATr
open-fashion-clip
This is the official repository for the paper "OpenFashionCLIP: Vision-and-Language Contrastive Learning with Open-Source Fashion Data". ICIAP 2023
pacscore
Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation. CVPR 2023
STAGE_action_detection
Code of the STAGE module for video action detection
human-pose-annotation-tool
Human Pose Annotation Tool
safe-clip
Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models. ECCV 2024
awesome-human-visual-attention
This repository contains a curated list of research papers and resources focusing on saliency and scanpath prediction, human attention, human visual search.
TransformerBasedGestureRecognition
speaksee
PyTorch library for Visual-Semantic tasks
camel
CaMEL: Mean Teacher Learning for Image Captioning. ICPR 2022
Ti-MGD
This is the official repository for the paper "Multimodal-Conditioned Latent Diffusion Models for Fashion Image Editing".
RefiNet
mvad-names-dataset
M-VAD Names Dataset. Multimedia Tools and Applications (2019)
DynamicConv-agent
PyTorch code for BMVC 2019 paper: Embodied Vision-and-Language Navigation with Dynamic Convolutional Filters
perceive-transform-and-act
PyTorch code for the paper: "Perceive, Transform, and Act: Multi-Modal Attention Networks for Vision-and-Language Navigation"
freeda
FreeDA: Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation (CVPR 2024)
CoDE
[ECCV'24] Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities
mcmr
PyTorch code for 3DV 2021 paper: "Multi-Category Mesh Reconstruction From Image Collections"
PMA-Net
With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning. ICCV 2023
MaPeT
Learning to Mask and Permute Visual Tokens for Vision Transformer Pre-Training
focus-on-impact
LiDER
Official implementation of "On the Effectiveness of Lipschitz-Driven Rehearsal in Continual Learning"
HWD
LoCoNav
CSL-TAL
PyTorch code for ECCVW 2022 paper "Consistency-based Self-supervised Learning for Temporal Anomaly Localization"
Alfie
Democratising RGBA Image Generation With No $$$ (AI4VA@ECCV24)
DiCO
Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization (BMVC 2024)
COCOFake
FourBi
Binarizing Documents by Leveraging both Space and Frequency. (ICDAR 2024)
bridge-score
BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues. ECCV 2024
RMSNet_Soccer
PyTorch code for RMS-Net
ADCC
mugat
Official implementation of our ECCVW paper "μgat: Improving Single-Page Document Parsing by Providing Multi-Page Context"
aimagelab-srv
AImageLab-SRV wiki, support, code snippets and best practices.
CSSL
Code implementation for "Continual Semi-Supervised Learning through Contrastive Interpolation Consistency"
rpe_spdh
PyTorch code for IEEE RA-L paper: "Semi-Perspective Decoupled Heatmaps for 3D Robot Pose Estimation from Depth Maps"
MAD
Official PyTorch implementation for "Semantically Coherent Montages by Merging and Splitting Diffusion Paths", presenting the Merge-Attend-Diffuse operator (ECCV24)
vffc
LAM
The Ludovico Antonio Muratori (LAM) dataset is the largest line-level HTR dataset to date and contains 25,823 lines from Italian ancient manuscripts edited by a single author over 60 years. The dataset comes in two configurations: a basic splitting and a date-based splitting which takes into account the age of the author. The first setting is intended to study HTR on ancient documents in Italian, while the second focuses on the ability of HTR systems to recognize text written by the same writer in time periods for which training data are not available.
aidlda_tutorial
A tutorial on PyTorch - AI-DLDA 2018
Emuru
unveiling-the-truth
DefConvs_HTR
Boosting modern and historical handwritten text recognition with deformable convolutions (ICPR20, IJDAR22)
cvcs2023
Teddy
FourBi_old
CaSpeR
Code implementation for "Latent Spectral Regularization for Continual Learning"