
R2D2: Reliable and Repeatable Detector and Descriptor

This repository contains the implementation of the following paper:

@inproceedings{r2d2,
  author    = {Jerome Revaud and Philippe Weinzaepfel and C{\'{e}}sar Roberto de Souza and
               Martin Humenberger},
  title     = {{R2D2:} Repeatable and Reliable Detector and Descriptor},
  booktitle = {NeurIPS},
  year      = {2019},
}

Fast-R2D2

This repository also contains the code needed to train and extract Fast-R2D2 keypoints. Fast-R2D2 is a revised version of R2D2 that is significantly faster and uses less memory, yet achieves the same order of precision as the original network.

License

Our code is released under the Creative Commons BY-NC-SA 3.0 license (see LICENSE for more details) and is available for non-commercial use only.

Getting started

You just need Python 3.6+ equipped with standard scientific packages and PyTorch 1.1+. Typically, conda is one of the easiest ways to get started:

conda install python tqdm pillow numpy matplotlib scipy
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch

Pretrained models

For your convenience, we provide five pre-trained models in the models/ folder:

  • r2d2_WAF_N16.pt: this is the model used in most experiments of the paper (on HPatches MMA@3=0.686). It was trained with Web images (W), Aachen day-time images (A) and Aachen optical flow pairs (F)
  • r2d2_WASF_N16.pt: this is the model used in the visual localization experiments (on HPatches MMA@3=0.721). It was trained with Web images (W), Aachen day-time images (A), Aachen day-night synthetic pairs (S), and Aachen optical flow pairs (F).
  • r2d2_WASF_N8_big.pt: Same as the previous model, but trained with N=8 instead of N=16 in the repeatability loss. In other words, it outputs a higher density of keypoints. This can be interesting for certain applications like visual localization, but it implies a drop in MMA since keypoints get slightly less reliable.
  • faster2d2_WASF_N16.pt: The Fast-R2D2 equivalent of r2d2_WASF_N16.pt.
  • faster2d2_WASF_N8_big.pt: The Fast-R2D2 equivalent of r2d2_WASF_N8_big.pt.

For more details about the training data, see the dedicated section below. Here is a table that summarizes the performance of each model:

| model name | model size (#weights) | number of keypoints | MMA@3 on HPatches |
|---|---|---|---|
| r2d2_WAF_N16.pt | 0.5M | 5K | 0.686 |
| r2d2_WASF_N16.pt | 0.5M | 5K | 0.721 |
| r2d2_WASF_N8_big.pt | 1.0M | 10K | 0.692 |
| faster2d2_WASF_N8_big.pt | 1.0M | 5K | 0.650 |

Feature extraction

To extract keypoints for a given image, simply execute:

python extract.py --model models/r2d2_WASF_N16.pt --images imgs/brooklyn.png --top-k 5000

This also works for multiple images (separated by spaces) or a .txt image list. For each image, this will save the top-k keypoints in a file with the same path as the image and a .r2d2 extension. For example, they will be saved in imgs/brooklyn.png.r2d2 for the sample command above.
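As a rough illustration of the naming rule above, the sketch below writes an image list and derives where each feature file would be saved. It assumes the .txt list holds one image path per line, which is not spelled out here; the helper names `write_image_list` and `feature_path` and the second image path are hypothetical.

```python
from pathlib import Path
import tempfile

def write_image_list(image_paths, list_path):
    """Write one image path per line (assumed list format)."""
    Path(list_path).write_text("\n".join(image_paths) + "\n")

def feature_path(image_path, ext=".r2d2"):
    """Feature file: same path as the image, with the extension appended."""
    return image_path + ext

# The second path is a made-up example.
images = ["imgs/brooklyn.png", "imgs/other.jpg"]
list_file = Path(tempfile.gettempdir()) / "image_list.txt"
write_image_list(images, list_file)

outputs = [feature_path(p) for p in images]
print(outputs)
```

The resulting list file could then be passed to --images in place of individual paths.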

The keypoint file is in the NumPy npz format and contains 3 fields:

  • keypoints (N x 3): keypoint position (x, y and scale). Here, scale denotes the patch diameter in pixels.
  • descriptors (N x 128): L2-normalized descriptors.
  • scores (N): keypoint scores (the higher the better).
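A minimal sketch of reading such a file back with NumPy, assuming only the three fields listed above. The function names `load_r2d2` and `strongest` are hypothetical and not part of the repository.

```python
import numpy as np

def load_r2d2(path):
    """Load a .r2d2 file (npz format) and return the three documented fields."""
    with np.load(path) as data:
        keypoints = data["keypoints"]      # (N, 3): x, y, scale
        descriptors = data["descriptors"]  # (N, 128), L2-normalized
        scores = data["scores"]            # (N,)
    # Basic consistency checks against the documented shapes.
    n = keypoints.shape[0]
    assert keypoints.shape == (n, 3)
    assert descriptors.shape == (n, 128)
    assert scores.shape == (n,)
    return keypoints, descriptors, scores

def strongest(keypoints, scores, k):
    """Return the k keypoints with the highest scores, best first."""
    order = np.argsort(-scores)[:k]
    return keypoints[order]
```

Usage might look like `kpts, desc, scores = load_r2d2("imgs/brooklyn.png.r2d2")` followed by `strongest(kpts, scores, 100)`.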

Note: You can modify the extraction parameters (scale factor, scale range...). Run python extract.py --help for more information. By default, they correspond to what is used in the paper, i.e., a scale factor equal to 2^0.25 (--scale-f 1.189207) and an image size in the range [256, 1024] (--min-size 256 --max-size 1024).
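To get a feel for how the scale factor and the size range interact, here is a small sketch that enumerates pyramid levels. It assumes the pyramid starts at the original resolution, divides the scale by the scale factor at each level, and keeps only the levels whose longer side falls within [min_size, max_size]; the actual logic in extract.py may differ, and `pyramid_sizes` is a hypothetical helper.

```python
def pyramid_sizes(width, height, scale_f=2 ** 0.25, min_size=256, max_size=1024):
    """List (scale, longer_side) for the pyramid levels that would be kept."""
    longer = max(width, height)
    levels = []
    scale = 1.0
    # A small epsilon guards against floating-point drift at the boundaries.
    while scale * longer + 1e-6 >= min_size:
        if scale * longer <= max_size + 1e-6:
            levels.append((round(scale, 4), round(scale * longer)))
        scale /= scale_f
    return levels

print(round(2 ** 0.25, 6))  # the --scale-f value quoted above
for scale, side in pyramid_sizes(1600, 1200):
    print(scale, side)
```

Under these assumptions, an image larger than the maximum size contributes only its downscaled levels, which is why widening the range increases the amount of computation.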

Note 2: You can significantly improve the MMA@3 score (by ~4 pts) if you can afford more computation. To do so, you just need to increase the upper limit on the scale range by replacing --min-size 256 --max-size 1024 with --min-size 0 --max-size 9999 --min-scale 0.3 --max-scale 1.0.
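The two parameter sets from the notes above can be wrapped in a small helper when scripting extractions. This is only a sketch: `extract_command` is a hypothetical function that assembles the command line from the flags quoted in this section, without running anything.

```python
import shlex

def extract_command(model, images, top_k=5000, full_resolution=False):
    """Build the extract.py command line as a list of arguments."""
    cmd = ["python", "extract.py", "--model", model,
           "--images", *images, "--top-k", str(top_k)]
    if full_resolution:
        # Full-resolution settings: lift the size limits, bound the scale instead.
        cmd += ["--min-size", "0", "--max-size", "9999",
                "--min-scale", "0.3", "--max-scale", "1.0"]
    else:
        # Default size range quoted in the first note.
        cmd += ["--min-size", "256", "--max-size", "1024"]
    return cmd

cmd = extract_command("models/r2d2_WASF_N16.pt", ["imgs/brooklyn.png"],
                      full_resolution=True)
# Quote each argument so the printed line can be pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The argument list can be handed to `subprocess.run(cmd, check=True)` from the repository root if you prefer to launch the extraction from Python.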

Feature extraction with kapture datasets

Kapture is a pivot file format, based on text and binary files, used to describe SFM (Structure From Motion) and more generally sensor-acquired data.

It is available at https://github.com/naver/kapture. It contains conversion tools for popular formats and several popular datasets are directly available in kapture.

It can be installed with:

pip install kapture

Datasets can be downloaded with:

kapture_download_dataset.py update
kapture_download_dataset.py list
# e.g.: install mapping and query of Extended-CMU-Seasons_slice22
kapture_download_dataset.py install "Extended-CMU-Seasons_slice22_*"

If you want to convert your own dataset into kapture, please find some examples here.

Once installed, you can extract keypoints for your kapture dataset with:

python extract_kapture.py --model models/r2d2_WASF_N16.pt --kapture-root pathto/yourkapturedataset --top-k 5000

Run python extract_kapture.py --help for more information on the extraction parameters.

Evaluation on HPatches

The evaluation is based on the code from D2-Net.

git clone https://github.com/mihaidusmanu/d2-net.git
cd d2-net/hpatches_sequences/
bash download.sh
bash download_cache.sh
cd ../..
ln -s d2-net/hpatches_sequences # finally create a soft-link

Once this is done, extract all the features:

python extract.py --model models/r2d2_WAF_N16.pt --images d2-net/image_list_hpatches_sequences.txt

Finally, evaluate using the IPython notebook d2-net/hpatches_sequences/HPatches-Sequences-Matching-Benchmark.ipynb. You should normally get the following MMA plot:

[Figure: MMA plot on HPatches]

New: the results/ folder now contains some pre-computed plots that you can visualize using the aforementioned IPython notebook from D2-Net (you need to place them in the d2-net/hpatches_sequences/cache/ folder).

  • r2d2_*_N16.size-256-1024.npy: keypoints were extracted using a limited image resolution (i.e. with python extract.py --min-size 256 --max-size 1024 ...)
  • r2d2_*_N16.scale-0.3-1.npy: keypoints were extracted using a full image resolution (i.e. with python extract.py --min-size 0 --max-size 9999 --min-scale 0.3 --max-scale 1.0).

Here is a summary of the results:

| result file | training set | resolution | MMA@3 on HPatches | note |
|---|---|---|---|---|
| r2d2_W_N16.scale-0.3-1.npy | W only | full | 0.699 | no annotation whatsoever |
| r2d2_WAF_N16.size-256-1024.npy | W+A+F | 1024 px | 0.686 | as in NeurIPS paper |
| r2d2_WAF_N16.scale-0.3-1.npy | W+A+F | full | 0.718 | +3.2% just from resolution |
| r2d2_WASF_N16.size-256-1024.npy | W+A+S+F | 1024 px | 0.721 | with style transfer |
| r2d2_WASF_N16.scale-0.3-1.npy | W+A+S+F | full | 0.758 | +3.7% just from resolution |
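The gains quoted in the note column are absolute differences in MMA@3, expressed in points. A quick check, with the values copied from the summary above:

```python
# MMA@3 values copied from the summary: (limited resolution, full resolution).
results = {
    "WAF": (0.686, 0.718),
    "WASF": (0.721, 0.758),
}

gains = {}
for name, (limited, full) in results.items():
    # Absolute gain in points, rounded to remove floating-point noise.
    gains[name] = round((full - limited) * 100, 1)
    print(name, gains[name])
```

This gives 3.2 and 3.7 points, in line with the "~4 pts" improvement mentioned for full-resolution extraction.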

Evaluation on visuallocalization.net

In our paper, we report visual localization results on the Aachen Day-Night dataset (nighttime images) available at visuallocalization.net. We used the local feature evaluation pipeline provided here: https://github.com/tsattler/visuallocalizationbenchmark/tree/master/local_feature_evaluation

Since then, the ground-truth poses and the error thresholds of the Aachen nighttime images (which are used for the local feature evaluation) have been improved and changed on the website, so the original results reported in the paper can no longer be reproduced.

Training the model

We provide all the code and data to retrain the model as described in the paper.

Downloading training data

The first step is to download the training data. First, create a folder that will host all data in a place where you have sufficient disk space (15 GB required).

DATA_ROOT=/path/to/data
mkdir -p $DATA_ROOT
ln -fs $DATA_ROOT data 
mkdir $DATA_ROOT/aachen

Then, manually download the Aachen dataset here and save it as $DATA_ROOT/aachen/database_and_query_images.zip. Finally, execute the download script to complete the installation. It will download the remaining training data and will extract all files properly.

./download_training_data.sh

The following datasets are now installed:

| full name | tag | Disk | # imgs | # pairs | python instance |
|---|---|---|---|---|---|
| Random Web images | W | 2.7GB | 3125 | 3125 | auto_pairs(web_images) |
| Aachen DB images | A | 2.5GB | 4479 | 4479 | auto_pairs(aachen_db_images) |
| Aachen style transfer pairs | S | 0.3GB | 8115 | 3636 | aachen_style_transfer_pairs |
| Aachen optical flow pairs | F | 2.9GB | 4479 | 4770 | aachen_flow_pairs |

Note that you can visualize the content of each dataset using the following command:

python -m tools.dataloader "PairLoader(aachen_flow_pairs)"

Training details

To train the model, simply run this command:

python train.py --save-path /path/to/model.pt 

On a recent GPU, it takes 30 min per epoch, so ~12h for 25 epochs. You should get a model that scores 0.71 +/- 0.01 in MMA@3 on HPatches (this standard-deviation is similar to what is reported in Table 1 of the paper).

If you want to retrain the Fast-R2D2 architectures, run:

python train.py --save-path /path/to/fast-model.pt --net 'Fast_Quad_L2Net_ConfCFS()'

Note that you can fully configure the training (i.e. select the data sources, change the batch size, learning rate, number of epochs etc.). One easy way to improve the model is to train for more epochs, e.g. --epochs 50. For more details about all parameters, run python train.py --help.
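As a back-of-the-envelope aid for choosing --epochs, the sketch below estimates the training time. It assumes the 30 min per epoch quoted above holds on your GPU and that time scales linearly with the number of epochs; `training_hours` is a hypothetical helper.

```python
def training_hours(epochs, minutes_per_epoch=30):
    """Estimated wall-clock training time in hours."""
    return epochs * minutes_per_epoch / 60

print(training_hours(25))  # 25 epochs, as in the estimate above
print(training_hours(50))  # with --epochs 50
```

Under these assumptions, doubling the number of epochs to 50 roughly doubles the training time, to about a day.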
