• Stars: 190
• Rank 203,739 (Top 5 %)
• Language: Python
• Created over 5 years ago
• Updated about 1 year ago

Repository Details

The source code of AMFMN and the dataset RSITMD

The official PyTorch code for the paper "Exploring a Fine-grained Multiscale Method for Cross-modal Remote Sensing Image Retrieval", TGRS 2021.

For better retrieval results, please refer to GaLR, TGRS 2022.

For semantic localization testing, please refer to SeLo, TGRS 2022.

Author: Zhiqiang Yuan

-------------------------------------------------------------------------------------

Welcome! 👍 Fork and Star 👍, and we'll let you know when we update.

#### News:
#### 2021.05.22: ---->RSITMD is expected to be released before July<----
#### 2021.06.21: ---->RSITMD is now open to access<----
#### 2021.07.29: ---->The code of AMFMN is expected to be released before September<----
#### 2021.08.03: ---->The code of AMFMN has been open to access<----
#### 2021.10.28: ---->Four samples were updated to correct blank sentences<----

-------------------------------------------------------------------------------------

INTRODUCTION

This is AMFMN, a cross-modal retrieval method for remote sensing images. This repository provides a benchmark for image-text cross-modal retrieval, which can be further modified to obtain higher retrieval accuracy. Next, we will publish the more fine-grained image-text RSITMD dataset, and you are welcome to use it.

AMFMN

Network Architecture

Asymmetric multimodal feature matching network for RS image-text retrieval. AMFMN uses the MVSA module to obtain salient image features and uses salient features to guide the representation of text modalities. The network supports multiple retrieval methods and can adaptively fuse different modal text information.
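To make the retrieval step concrete, here is a minimal sketch of ranking captions for one image once both have been embedded in a shared space. It assumes cosine similarity as the matching score and uses hand-written toy vectors; the function names `cosine` and `rank_captions` are hypothetical and are not taken from the AMFMN code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_captions(image_vec, caption_vecs):
    """Return caption indices ordered from most to least similar to the image."""
    scores = [cosine(image_vec, c) for c in caption_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy embeddings (assumed values, for illustration only).
image = [0.9, 0.1, 0.0]
captions = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.5, 0.5, 0.0]]
ranking = rank_captions(image, captions)
print(ranking)  # [1, 2, 0]
```

Text-to-image retrieval follows the same pattern with the roles swapped: one caption vector is scored against a list of image vectors.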

Multiscale Visual Self-Attention

Multiscale Visual Self-Attention. We first use a multiscale feature fusion network to obtain the multilevel feature representation, then use a redundant feature filtering network to filter out useless feature expressions, and finally get the salient mask of the RS image.
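The general idea of fusing several scales and then deriving a mask can be illustrated with a toy example. This is only a sketch under strong assumptions, not the MVSA module: it replaces the learned fusion and filtering networks with fixed average pooling, nearest-neighbour upsampling, and a sigmoid centred on the mean. The helper names `avg_pool`, `upsample`, and `salient_mask` are hypothetical.

```python
import math


def avg_pool(fmap, k):
    """Average-pool a 2D list with a k x k window and stride k."""
    h, w = len(fmap), len(fmap[0])
    return [
        [
            sum(fmap[i + di][j + dj] for di in range(k) for dj in range(k)) / (k * k)
            for j in range(0, w, k)
        ]
        for i in range(0, h, k)
    ]


def upsample(fmap, k):
    """Nearest-neighbour upsampling by an integer factor k."""
    return [[v for v in row for _ in range(k)] for row in fmap for _ in range(k)]


def salient_mask(fmap, scales=(1, 2)):
    """Average several pooled views of fmap and squash the result into (0, 1).

    Assumes the height and width of fmap are divisible by every scale.
    """
    h, w = len(fmap), len(fmap[0])
    views = [upsample(avg_pool(fmap, k), k) for k in scales]
    fused = [[sum(v[i][j] for v in views) / len(views) for j in range(w)] for i in range(h)]
    mean = sum(map(sum, fused)) / (h * w)
    return [[1.0 / (1.0 + math.exp(-(x - mean))) for x in row] for row in fused]


# A 4 x 4 toy feature map with a strong response in the top-left corner.
fmap = [[4, 4, 0, 0], [4, 4, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
mask = salient_mask(fmap)
# Down-weight low-saliency positions by multiplying with the mask.
filtered = [[f * m for f, m in zip(frow, mrow)] for frow, mrow in zip(fmap, mask)]
```

In this toy case the mask is above 0.5 in the top-left block and below 0.5 elsewhere, so multiplying by it suppresses the weak responses.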

Three different visual-guided attention mechanisms

Three different visual-guided attention mechanisms.

RSITMD

Dataset Features

The similarity visualization results of six datasets, where the similarity score is weighted by the BLEU and METEOR indicators in the natural language processing field. The ideal picture is a straight diagonal line from the upper left to the lower right, which means each sentence is only related to the corresponding image.
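The diagonal pattern described above can be reproduced on a small scale. The sketch below is illustrative only: it uses a simple token-overlap F1 score as a crude stand-in for the BLEU/METEOR weighting (which is not reproduced here), with one made-up caption per image. The names `unigram_f1` and `similarity_matrix` are hypothetical.

```python
from collections import Counter


def unigram_f1(a, b):
    """Token-overlap F1 between two sentences (a stand-in, not BLEU or METEOR)."""
    ta, tb = a.lower().split(), b.lower().split()
    if not ta or not tb:
        return 0.0
    overlap = sum((Counter(ta) & Counter(tb)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(tb), overlap / len(ta)
    return 2 * precision * recall / (precision + recall)


def similarity_matrix(sentences):
    """Pairwise similarity; entry [i][j] compares sentence i with sentence j."""
    return [[unigram_f1(s, t) for t in sentences] for s in sentences]


# One made-up caption per image, for illustration only.
captions = [
    "a white plane parked at the airport",
    "dense green forest beside a river",
    "many boats docked in the harbor",
]
matrix = similarity_matrix(captions)
for row in matrix:
    print(" ".join(f"{x:.2f}" for x in row))
```

A dataset with distinctive captions gives high values on the diagonal and low values elsewhere; repetitive captions raise the off-diagonal entries and blur the line.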

Quantitative comparison of the four datasets. (a) Comparison of sample number. (b) Comparison of average sentence length. (c) Comparison of diversity score. (d) Comparison of average similarity. (e) Comparison of the total number of words. (f) Comparison of the number of categories.

Citation

If you find this code helpful, or use this code or dataset, please cite it as:

Z. Yuan et al., "Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval," in IEEE Transactions on Geoscience and Remote Sensing, doi: 10.1109/TGRS.2021.3078451.

More Repositories

1. labview2018-tutorial (LabVIEW, 111 stars): A tutorial on using LabVIEW, based on LabVIEW 2018.
2. Stable-Diffusion-for-Remote-Sensing-Image-Generation (Jupyter Notebook, 96 stars): A project for text-to-image remote sensing image generation.
3. retrievalSystem (Python, 71 stars): The back end of a cross-modal retrieval system, which will contain services such as semantic location, etc.
4. GaLR (Python, 56 stars): Source code of the paper "Remote Sensing Cross-Modal Image-Text Retrieval Based on Global and Local Information".
5. Controllable-Fake-Sample-Generation-for-RS (Python, 27 stars): Code for "Efficient and Controllable Remote Sensing Fake Sample Generation Based on Diffusion Model".
6. SemanticLocalizationMetrics (Python, 22 stars): The first research on semantic localization.
7. MCRN (Python, 13 stars): A multi-source cross-modal retrieval network.
8. RemoteSensingCaptions (Python, 11 stars): A repository for remote sensing captioning with attention, including Sydney and UCM.
9. remote-sensing-multi-modal-repository (7 stars): A multi-modal repository for remote sensing.
10. Res-Trans (Python, 7 stars): First place in the 2020 iFLYTEK Multimodal Emotion Analysis and Recognition Challenge.
11. xiaoyuan1996_bk (Python, 6 stars)
12. XiaoYuanWords (Python, 6 stars): A repository for memorizing English words; you can make changes based on it.
13. xiaoyuan1996 (5 stars)
14. Useful-tools (Python, 3 stars): Small scripts, such as a crawler.
15. TextMinging (Python, 2 stars): Second place in the iFLYTEK AI developer competition, text mining track.
16. matplotlib (Python, 2 stars): A tutorial on matplotlib.pyplot in Python, intended to be easy to learn from.
17. shuati (Scala, 1 star): LeetCode problem practice.
18. ChatRobot (Python, 1 star): Let the robot chat with you.
19. python-tk (Python, 1 star): A repository of Mr. Gao's interface made with Python tk.
20. AirPipeline (Jupyter Notebook, 1 star)
21. xiaoyuan1996.github.io (JavaScript, 1 star)