xmed-lab/CLIP_Surgery

Stars
346
Rank 122,430 (Top 3 %)
Language
Jupyter Notebook
Created over 1 year ago
Updated over 1 year ago

CLIP Surgery for Better Explainability with Enhancement in Open-Vocabulary Tasks (arxiv)

Introduction

This work focuses on the explainability of CLIP via its raw predictions. We identify two problems with CLIP's explainability: opposite visualization and noisy activations. We then propose CLIP Surgery, which requires no fine-tuning or additional supervision. It greatly improves the explainability of CLIP and enhances downstream open-vocabulary tasks such as multi-label recognition, semantic segmentation, interactive segmentation (specifically the Segment Anything Model, SAM), and multimodal visualization. Currently, we offer a simple demo for interpretability analysis that also shows how to convert text to point prompts for SAM. The remaining code, including evaluation and other tasks, will be released later.

Opposite visualization is caused by wrong relations in self-attention:

Noisy activations are caused by redundant features across labels:

Our visualization results:

Text2Points to guide SAM:

Multimodal visualization:

Segmentation results:

Multilabel results:
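The idea of "redundant features across labels" mentioned above can be illustrated with a toy sketch. It assumes a matrix of per-token similarity scores with one column per text label, and treats the per-token mean across labels as the shared (redundant) component to subtract. The function name `remove_shared_component` and the plain mean subtraction are assumptions made for illustration only; this is not the repository's implementation.

```python
import numpy as np


def remove_shared_component(scores):
    """Subtract, for each token, the mean score across all labels.

    scores: array of shape (num_tokens, num_labels).
    Hypothetical helper: whatever every label activates on a token is
    treated as redundant and removed, leaving label-specific evidence.
    """
    scores = np.asarray(scores, dtype=np.float32)
    shared = scores.mean(axis=1, keepdims=True)  # component common to all labels
    return scores - shared


if __name__ == "__main__":
    toy = [[1.0, 2.0, 3.0],   # token that prefers the third label
           [4.0, 4.0, 4.0]]   # token that responds equally to every label
    print(remove_shared_component(toy))
```

In this toy input, the second token scores equally on every label, so it carries no label-specific information and is reduced to zeros.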

Demo

First, install SAM and download the model checkpoint:

pip install git+https://github.com/facebookresearch/segment-anything.git
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth

Then explain CLIP with the Jupyter demo "demo.ipynb", or run the Python script:

python demo.py

(Note: the demo's results differ slightly from those of the experimental code; specifically, the demo does not use apex amp fp16, which makes it easier to use.)
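The text-to-point step mentioned in the introduction can be sketched roughly as follows, assuming a 2D similarity map between the image and a text prompt is already available. The helper `similarity_to_points` and its top-k/bottom-k selection rule are assumptions made for illustration and may differ from what demo.py does. The output follows the prompt convention documented for SAM's predictor: (x, y) pixel coordinates, with label 1 for foreground and 0 for background.

```python
import numpy as np


def similarity_to_points(sim_map, num_points=5):
    """Turn a 2D similarity map into point prompts (hypothetical helper).

    The num_points highest-scoring pixels become positive points (label 1)
    and the num_points lowest-scoring pixels become negative points (label 0).
    Returns (points, labels); points has shape (2 * num_points, 2) in (x, y).
    """
    sim_map = np.asarray(sim_map, dtype=np.float32)
    h, w = sim_map.shape
    order = np.argsort(sim_map.ravel())       # flat indices, ascending score
    neg_idx = order[:num_points]              # lowest scores
    pos_idx = order[::-1][:num_points]        # highest scores
    flat_idx = np.concatenate([pos_idx, neg_idx])
    ys, xs = np.unravel_index(flat_idx, (h, w))
    points = np.stack([xs, ys], axis=1)       # (x, y) order
    labels = np.concatenate([
        np.ones(num_points, dtype=np.int64),
        np.zeros(num_points, dtype=np.int64),
    ])
    return points, labels


if __name__ == "__main__":
    toy = np.full((4, 4), 0.5, dtype=np.float32)
    toy[1, 2] = 0.9   # strongest response at row 1, column 2
    toy[3, 0] = 0.1   # weakest response at row 3, column 0
    pts, lbl = similarity_to_points(toy, num_points=1)
    print(pts.tolist(), lbl.tolist())
```

The resulting arrays could then be passed as `point_coords` and `point_labels` to `SamPredictor.predict` from the segment-anything package, after scaling the coordinates to the input image resolution if the similarity map is smaller than the image.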

Cite

@misc{li2023clip,
      title={CLIP Surgery for Better Explainability with Enhancement in Open-Vocabulary Tasks}, 
      author={Yi Li and Hualiang Wang and Yiqun Duan and Xiaomeng Li},
      year={2023},
      eprint={2304.05653},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

CLIPN

ICCV 2023: CLIPN for Zero-Shot OOD Detection: Teaching CLIP to Say No

GenericSSL

NeurIPS 2023: Towards Generic Semi-Supervised Framework for Volumetric Medical Image Segmentation

AllSpark

CVPR 2024: AllSpark: Reborn Labeled Features from Unlabeled in Transformer for Semi-Supervised Semantic Segmentation

DIF-Net

MICCAI 2023: Learning Deep Intensity Field for Extremely Sparse-View CBCT Reconstruction

TriALS

MICCAI 2024: nnUNet incorporating additional baselines such as SAMed, Mamba variants, and MedNeXT to establish a benchmark for segmentation challenges.

DHC

MICCAI 2023: DHC: Dual-debiased Heterogeneous Co-training Framework for Class-imbalanced Semi-supervised Medical Image Segmentation

NuInstruct

RSCFed

CVPR 2022: RSCFed: Random Sampling Consensus Federated Semi-supervised Learning

CLD-Semi

MICCAI 2022: Calibrating Label Distribution for Class-Imbalanced Barely-Supervised Knee Segmentation

EPL_SemiDG

AAAI 2022: Enhancing Pseudo Label Quality for Semi-Supervised Domain-Generalized Medical Image Segmentation

URN

AAAI 2022: Uncertainty Estimation via Response Scaling for Pseudo-Mask Noise Mitigation in Weakly-Supervised Semantic Segmentation

OEEM

MICCAI 2022: Online Easy Example Mining for Weakly-supervised Gland Segmentation from Histology Images

GraphEcho

ICCV 2023, "GraphEcho: Graph-Driven Unsupervised Domain Adaptation for Echocardiogram Video Segmentation"

DIF-Gaussian

MICCAI 2024: Learning 3D Gaussians for Extremely Sparse-View Cone-Beam CT Reconstruction

FSDiffReg

MICCAI 2023: FSDiffReg: Feature-wise and Score-wise Diffusion-guided Unsupervised Deformable Image Registration for Cardiac Images

C2RV-CBCT

CVPR 2024, "C^2RV: Cross-Regional and Cross-View Learning for Sparse-View CBCT Reconstruction"

UCVME

AAAI 2023: Semi-Supervised Deep Regression with Uncertainty Consistency and Variational Model Ensembling via Bayesian Neural Networks

FDDM

MICCAI 2023: Fundus-Enhanced Disease-Aware Distillation Model for Retinal Disease Classification from OCT Images

AdaCon

IEEE TMI 2021: AdaCon: Adaptive Contrast for Image Regression in Computer-Aided Disease Assessment

CSS-SemiVideo

IEEE TMI 2022: Cyclical Self-Supervision for Semi-Supervised Ejection Fraction Prediction from Echocardiogram Videos

TimeStamp-Surgical

TMI 2023: Less is More: Surgical Phase Recognition from Timestamp Supervision

ECBM

ICLR 2024: Energy-Based Concept Bottleneck Models: Unifying Prediction, Concept Intervention, and Probabilistic Interpretations

CPR

MICCAI 2023: Context-Aware Pseudo-Label Refinement for Source-Free Domain Adaptive Fundus Image Segmentation

HCGNet

J-BHI 2024: Exploiting Hierarchical Interactions for Protein Surface Learning

Fed-MAS

MICCAI 2023 DeCaF Best Paper Award: Federated Model Aggregation via Self-Supervised Priors for Highly Imbalanced Medical Image Classification

SAHC

IEEE TMI 2022: Exploring Segment-level Semantics for Online Phase Recognition from Surgical Videos

SC-Cor

ECCV 2022: Learning Shadow Correspondence for Video Shadow Detection

DistillingSelf

MICCAI 2022: Free Lunch for Surgical Video Understanding by Distilling Self-Supervisions

ICLIP

Exploring Visual Interpretability for Contrastive Language-Image Pretraining

RIDL

MICCAI 2023: Radiomics-Informed Deep Learning for Classification of Atrial Fibrillation Sub-Types from Left-Atrium CT Volumes

FreeSeg

FreeSeg: Free Mask from Interpretable Contrastive Language-Image Pretraining for Semantic Segmentation

TripleE-DG

MSSG

MICCAI 2023: Morphology-inspired Unsupervised Gland Segmentation via Selective Semantic Grouping

SCAN

DiffCMR

FoPro-KD

TMI 2023: FoPro-KD: Fourier Prompted Effective Knowledge Distillation for Long-Tailed Medical Image Recognition

ToMo-UDA

[ICML' 24] Unsupervised Domain Adaptation for Anatomical Structure Detection in Ultrasound Images.

FD-SOS

MICCAI 2024 Oral: Vision-Language Open-Set Detectors for Bone Fenestration and Dehiscence Detection from Intraoral Images

M3-UDA

CVPR M^3-UDA: A New Benchmark for Unsupervised Domain Adaptive Fetal Cardiac Structure Detection

CLSS

SEDSkill

VDPL

Variance-Aware Domain-Augmented Pseudo Labeling for Semi-Supervised Domain Generalization on Medical Image Segmentation

GL-Fusion

MICCAI 2023: GL-Fusion: Global-Local Fusion Network for Multi-view Echocardiogram Video Segmentation

NumCLIP

[ECCV 2024] Teach CLIP to Develop a Number Sense for Ordinal Regression

DrugRec

CardiacNet

NGOAT

GPTrack

DDAug

ICONIP 2023: Dynamic Data Augmentation via Monte-Carlo Tree Search for Prostate MRI Segmentation

CoFA