Language-Image Quality Evaluator (LIQE)
The official repo of Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective (CVPR2023)
Abstract
We aim at advancing blind image quality assessment (BIQA), which predicts the human perception of image quality without any reference information. We develop a general and automated multitask learning scheme for BIQA to exploit auxiliary knowledge from other tasks, in a way that the model parameter sharing and the loss weighting are determined automatically. Specifically, we first describe all candidate label combinations (from multiple tasks) using a textual template, and compute the joint probability from the cosine similarities of the visual-textual embeddings. Predictions of each task can be inferred from the joint distribution, and optimized by carefully designed loss functions. Through comprehensive experiments on learning three tasks - BIQA, scene classification, and distortion type identification, we verify that the proposed BIQA method 1) benefits from the scene classification and distortion type identification tasks and outperforms the state-of-the-art on multiple IQA datasets, 2) is more robust in the group maximum differentiation competition, and 3) realigns the quality annotations from different IQA datasets more effectively.
Requirement
torch 1.8+
torchvision
Python 3
pip install ftfy regex tqdm
pip install git+https://github.com/openai/CLIP.git
Training on 10 splits
python train_unique_clip_weight.py
Evaluation on test-sets
python BIQA_benchmark.py
Demo: detailed pipeline
python demo.py
Demo2: import LIQE as a standalone module and perform inference
python demo2.py
Pre-trained weights
Google Drive:
https://drive.google.com/file/d/1GoKwUKNR-rvX11QbKRN8MuBZw2hXKHGh/view?usp=sharing
็พๅบฆ็ฝ็๏ผ
้พๆฅ: https://pan.baidu.com/s/1KHjj7T8y2H_eKE6w7HnWJA ๆๅ็ : 2b8v
New! IQA-PyTorch implementation
IQA-PyTorch supports LIQE now! Can be easily used as follows:
import pyiqa
model = pyiqa.create_metric('liqe', as_loss=False) #Re-trained on the official set of KonIQ-10k
score = model(img_path)
or
import pyiqa
model = pyiqa.create_metric('liqe_mix', as_loss=False) #Trained on multiple datasets as in the paper.
score = model(img_path)
Zero-shot (cross-database) performance (SRCC) on the AIGC datasets.
BIQA Model | AGIQA-3K | AGIQA-1K | SJTU-H3D | AIGCIQA2023 | Paper |
---|---|---|---|---|---|
DBCNN | 0.6454 | 0.5133 | 0.4560 | 0.7301 | TCSVT2020 |
HyperIQA | 0.6291 | 0.5253 | 0.2696 | 0.7211 | CVPR2020 |
TReS | 0.6460 | 0.5101 | 0.2700 | 0.7410 | WACV2022 |
UNIQUE | 0.6659 | 0.4596 | 0.7523 | 0.7605 | TIP2021 |
MUSIQ | 0.6294 | 0.5254 | 0.5313 | 0.7358 | ICCV2021 |
PaQ-2-PiQ | 0.5023 | 0.5378 | 0.2683 | 0.6425 | CVPR2020 |
CLIPIQA | 0.6580 | 0.3411 | -0.0793 | 0.6088 | AAAI2023 |
CLIPIQA+ | 0.6831 | 0.4461 | 0.5567 | 0.7158 | AAAI2023 |
MANIQA | 0.6950 | 0.6180 | 0.4523 | 0.7282 | CVPRW2022 |
LIQE (Ours) | 0.7212 | 0.5785 | 0.6716 | 0.7435 | CVPR2023 |
Citation
@inproceedings{zhang2023liqe,
title={Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective},
author={Zhang, Weixia and Zhai, Guangtao and Wei, Ying and Yang, Xiaokang and Ma, Kede},
booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
pages={14071--14081},
year={2023}
}