GFPGAN
GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
PhotoMaker
PhotoMaker [CVPR 2024]
T2I-Adapter
T2I-Adapter
InstantMesh
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models
BrushNet
[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
MotionCtrl
Official Code for MotionCtrl [SIGGRAPH 2024]
MasaCtrl
[ICCV 2023] Consistent Image Synthesis and Editing
SEED-Story
SEED-Story: Multimodal Long Story Generation with Large Language Model
LLaMA-Pro
[ACL 2024] Progressive LLaMA with Block Expansion.
Mix-of-Show
NeurIPS 2023, Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models
Open-MAGVIT2
Open-MAGVIT2: Democratizing Autoregressive Visual Generation
AnimeSR
Codes for "AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos"
VQFR
ECCV 2022, Oral, VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder
CustomNet
SmartEdit
Official code of SmartEdit [CVPR-2024 Highlight]
UMT
UMT is a unified and flexible framework which can handle different input modality combinations, and output video moment retrieval and/or highlight detection results.
MM-RealSR
Codes for "Metric Learning based Interactive Modulation for Real-World Super-Resolution"
ViT-Lens
[CVPR 2024] ViT-Lens: Towards Omni-modal Representations
MCQ
Official code for "Bridging Video-text Retrieval with Multiple Choice Questions", CVPR 2022 (Oral).
DeSRA
Official codes for DeSRA (ICML 2023)
FAIG
NeurIPS 2021, Spotlight, Finding Discriminative Filters for Specific Degradations in Blind Super-Resolution
ArcNerf
Nerf and extensions in all
ST-LLM
[ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language Models Are Effective Temporal Learners"
SurfelNeRF
SurfelNeRF: Neural Surfel Radiance Fields for Online Photorealistic Reconstruction of Indoor Scenes
RepSR
Codes for "RepSR: Training Efficient VGG-style Super-Resolution Networks with Structural Re-Parameterization and Batch Normalization"
mllm-npu
mllm-npu: training multimodal large language models on Ascend NPUs
HOSNeRF
HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video
FastRealVSR
Codes for "Mitigating Artifacts in Real-World Video Super-Resolution Models"
ConMIM
Official codes for ConMIM (ICLR 2023)
GVT
Official code for "What Makes for Good Visual Tokenizers for Large Language Models?".
TVTS
Turning to Video for Transcript Sorting
ViSFT
pi-Tuning
Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023.
FLM
Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)
Efficient-VSR-Training
Codes for "Accelerating the Training of Video Super-Resolution"
DTN
Official code for "Dynamic Token Normalization Improves Vision Transformer", ICLR 2022.
OpenCompatible
OpenCompatible provides a standard compatible training benchmark, covering practical training scenarios.
BTS
BTS: A Bi-lingual Benchmark for Text Segmentation in the Wild
SGAT4PASS
This is the official implementation of the paper SGAT4PASS: Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation (IJCAI 2023)
SFDA
TaCA
Official code for the paper, "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter".
Plot2Code
common_trainer
Common template for PyTorch projects. Easy to extend and modify for new projects.
TransFusion
The code repo for the ACM MM paper: TransFusion: Multi-Modal Fusion for Video Tag Inference via Translation-based Knowledge Embedding.
BasicVQ-GEN
ArcVis
Visualization of 3D and 2D components interactively.
VTLayout