MatchFormer
MatchFormer: Interleaving Attention in Transformers for Feature Matching
Qing Wang*, Jiaming Zhang*, Kailun Yang†, Kunyu Peng, Rainer Stiefelhagen
* denotes equal contribution and † denotes corresponding author
News
- [09/2022] MatchFormer [PDF] has been accepted to ACCV 2022.
Introduction
In this work, we propose a novel hierarchical extract-and-match transformer, termed MatchFormer. Inside each stage of the hierarchical encoder, we interleave self-attention for feature extraction and cross-attention for feature matching, enabling a human-intuitive extract-and-match scheme.
More details can be found in our arXiv paper.
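To make the extract-and-match idea concrete, here is a minimal pure-Python sketch of one interleaved stage. It is an illustration only, not the repository's implementation: the function names (`attend`, `interleaved_stage`), the single-head attention without learned projections, the residual addition, and the `("self", "cross")` pattern are all assumptions chosen for brevity.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys_values):
    """Single-head scaled dot-product attention without learned projections.

    Each query row is replaced by a weighted average of the rows in
    keys_values (used as both keys and values).
    """
    dim = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in keys_values]
        weights = softmax(scores)
        out.append([sum(w * kv[i] for w, kv in zip(weights, keys_values))
                    for i in range(dim)])
    return out


def add(x, y):
    """Element-wise sum of two equally shaped lists of rows."""
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(x, y)]


def interleaved_stage(feat_a, feat_b, pattern=("self", "cross")):
    """Apply attention blocks to the features of two images in turn.

    "self":  each image attends to its own features (feature extraction).
    "cross": each image attends to the other image (feature matching).
    The residual addition is an assumption of this sketch.
    """
    for kind in pattern:
        if kind == "self":
            new_a, new_b = attend(feat_a, feat_a), attend(feat_b, feat_b)
        else:
            new_a, new_b = attend(feat_a, feat_b), attend(feat_b, feat_a)
        feat_a, feat_b = add(feat_a, new_a), add(feat_b, new_b)
    return feat_a, feat_b


# Toy features: image A has 2 tokens, image B has 3 tokens, both of width 2.
feat_a = [[1.0, 0.0], [0.0, 1.0]]
feat_b = [[0.5, 0.5], [1.0, -1.0], [0.0, 2.0]]
out_a, out_b = interleaved_stage(feat_a, feat_b)
```

With a self-only pattern, the output for image A does not depend on image B; adding a cross block is what lets matching information flow between the two images inside the encoder.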
Installation
The requirements are listed in the requirement.txt file. To create your own environment, for example:
conda create -n matchformer python=3.7
conda activate matchformer
cd /path/to/matchformer
pip install -r requirement.txt
Datasets
Prepare the test datasets in the same way as LoFTR, then place the datasets and index files in the data directory.
The data directory should be structured as follows:
data
├── scannet
│   ├── index
│   │   ├── intrinsics.npz
│   │   ├── scannet_test.txt
│   │   └── test.npz
│   └── test
│       ├── scene0707_00
│       ├── ...
│       └── scene0806_00
└── megadepth
    ├── index
    │   ├── 0015_0.1_0.3.npz
    │   ├── ...
    │   ├── 0022_0.5_0.7.npz
    │   └── megadepth_test_1500.txt
    └── test
        ├── Undistorted_SfM
        └── phoenix
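Before running the evaluation, it can help to confirm that the layout above is in place. The following is a small sketch using only the standard library; the helper name `missing_entries` and the choice of which paths to check are assumptions of this example, and it only tests the entries that the tree names explicitly (not the elided scenes or index files).

```python
from pathlib import Path

# Entries named explicitly in the directory tree, relative to the data directory.
EXPECTED = [
    "scannet/index/intrinsics.npz",
    "scannet/index/scannet_test.txt",
    "scannet/index/test.npz",
    "scannet/test",
    "megadepth/index/megadepth_test_1500.txt",
    "megadepth/test/Undistorted_SfM",
    "megadepth/test/phoenix",
]


def missing_entries(data_root):
    """Return the expected relative paths that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]
```

For example, `missing_entries("data")` returns an empty list when every listed file and directory is present, and otherwise lists what still needs to be downloaded or moved.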
Evaluation
The evaluation configuration can be adjusted in /config/defaultmf.py.
The weights can be downloaded from Google Drive.
Put the weights in model/weights.
Indoor:
# adjust large SEA model config:
MATCHFORMER.BACKBONE_TYPE = 'largesea'
MATCHFORMER.SCENS = 'indoor'
MATCHFORMER.RESOLUTION = (8,2)
MATCHFORMER.COARSE.D_MODEL = 256
MATCHFORMER.COARSE.D_FFN = 256
python test.py /config/data/scannet_test_1500.py --ckpt_path /model/weights/indoor-large-SEA.ckpt --gpus=1 --accelerator="ddp"
# adjust lite LA model config:
MATCHFORMER.BACKBONE_TYPE = 'litela'
MATCHFORMER.SCENS = 'indoor'
MATCHFORMER.RESOLUTION = (8,4)
MATCHFORMER.COARSE.D_MODEL = 192
MATCHFORMER.COARSE.D_FFN = 192
python test.py /config/data/scannet_test_1500.py --ckpt_path /model/weights/indoor-lite-LA.ckpt --gpus=1 --accelerator="ddp"
Outdoor:
# adjust large LA model config:
MATCHFORMER.BACKBONE_TYPE = 'largela'
MATCHFORMER.SCENS = 'outdoor'
MATCHFORMER.RESOLUTION = (8,2)
MATCHFORMER.COARSE.D_MODEL = 256
MATCHFORMER.COARSE.D_FFN = 256
python test.py /config/data/megadepth_test_1500.py --ckpt_path /model/weights/outdoor-large-LA.ckpt --gpus=1 --accelerator="ddp"
# adjust lite SEA model config:
MATCHFORMER.BACKBONE_TYPE = 'litesea'
MATCHFORMER.SCENS = 'outdoor'
MATCHFORMER.RESOLUTION = (8,4)
MATCHFORMER.COARSE.D_MODEL = 192
MATCHFORMER.COARSE.D_FFN = 192
python test.py /config/data/megadepth_test_1500.py --ckpt_path /model/weights/outdoor-lite-SEA.ckpt --gpus=1 --accelerator="ddp"
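The four configurations above differ only in five values, so they can be summarized in one lookup table. The sketch below is a convenience for reading them side by side; the `PRESETS` dictionary and the `config_lines` helper are hypothetical names, not part of the repository, and the values are copied from the settings listed above.

```python
# (scene, model size) -> values to set in the evaluation configuration.
PRESETS = {
    ("indoor", "large"): {
        "BACKBONE_TYPE": "largesea", "SCENS": "indoor",
        "RESOLUTION": (8, 2), "COARSE.D_MODEL": 256, "COARSE.D_FFN": 256,
    },
    ("indoor", "lite"): {
        "BACKBONE_TYPE": "litela", "SCENS": "indoor",
        "RESOLUTION": (8, 4), "COARSE.D_MODEL": 192, "COARSE.D_FFN": 192,
    },
    ("outdoor", "large"): {
        "BACKBONE_TYPE": "largela", "SCENS": "outdoor",
        "RESOLUTION": (8, 2), "COARSE.D_MODEL": 256, "COARSE.D_FFN": 256,
    },
    ("outdoor", "lite"): {
        "BACKBONE_TYPE": "litesea", "SCENS": "outdoor",
        "RESOLUTION": (8, 4), "COARSE.D_MODEL": 192, "COARSE.D_FFN": 192,
    },
}


def config_lines(scene, size):
    """Format one preset as the assignment lines used in the config file."""
    preset = PRESETS[(scene, size)]
    return [f"MATCHFORMER.{key} = {value!r}" for key, value in preset.items()]


# Print the lines for the outdoor lite model.
for line in config_lines("outdoor", "lite"):
    print(line)
```

The table makes the pattern visible: large models use resolution (8, 2) with 256-dimensional coarse features, while lite models use (8, 4) with 192 dimensions.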
Training
To train MatchFormer, start from the LoFTR code and replace LoFTR/src/loftr/backbone/ with model/backbone/match_**.py.
Citation
If you are interested in this work, please cite the following work:
@inproceedings{wang2022matchformer,
title={MatchFormer: Interleaving Attention in Transformers for Feature Matching},
author={Wang, Qing and Zhang, Jiaming and Yang, Kailun and Peng, Kunyu and Stiefelhagen, Rainer},
booktitle={Asian Conference on Computer Vision},
year={2022}
}
Acknowledgments
Our work is based on LoFTR, and we use its code. We thank the authors for open-sourcing the LoFTR repository.