Siamese Mask R-CNN model for one-shot instance segmentation

Siamese Mask R-CNN

This is the official implementation of Siamese Mask R-CNN from One-Shot Instance Segmentation. It is based on the Mask R-CNN implementation by Matterport.

The repository includes:

  • Source code of Siamese Mask R-CNN
  • Training code for MS COCO
  • Evaluation on MS COCO metrics (AP)
  • Training and evaluation of one-shot splits of MS COCO
  • Training code to reproduce the results from the paper
  • Pre-trained weights for ImageNet
  • Pre-trained weights for all models from the paper
  • Code to evaluate all models from the paper
  • Code to generate result figures

One-Shot Instance Segmentation

One-shot instance segmentation can be summed up as: Given a query image and a reference image showing an object of a novel category, we seek to detect and segment all instances of the corresponding category (in the image above ‘person’ on the left, ‘car’ on the right). Note that no ground truth annotations of reference categories are used during training. This type of visual search task creates new challenges for computer vision algorithms, as methods from metric and few-shot learning have to be incorporated into the notoriously hard tasks of object identification and segmentation. Siamese Mask R-CNN extends Mask R-CNN - a state-of-the-art object detection and segmentation system - with a Siamese backbone and a matching procedure to perform this type of visual search.
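The constraint that no annotations of reference (novel) categories are seen during training can be illustrated with a small filter over COCO-style annotation dicts. This is only a sketch: the helper name `drop_novel_annotations`, the toy annotations, and the choice of novel category ids are assumptions made for illustration and are not taken from the repository.

```python
from typing import Dict, Iterable, List, Set


def drop_novel_annotations(
    annotations: Iterable[Dict], novel_category_ids: Set[int]
) -> List[Dict]:
    """Keep only annotations whose category is NOT held out as novel.

    Hypothetical helper: it relies only on the "category_id" key that
    COCO annotation entries carry.
    """
    return [a for a in annotations if a["category_id"] not in novel_category_ids]


# Toy COCO-style annotations; all ids here are made up for illustration.
annotations = [
    {"id": 1, "image_id": 10, "category_id": 1},
    {"id": 2, "image_id": 10, "category_id": 3},
    {"id": 3, "image_id": 11, "category_id": 2},
]

# Assume category 3 is held out as a novel (reference-only) category.
novel_category_ids = {3}

train_annotations = drop_novel_annotations(annotations, novel_category_ids)
print([a["id"] for a in train_annotations])
```

At test time the roles flip: a reference crop of a held-out category is supplied and the model is asked to find its instances in the query image.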

Installation

  1. Clone this repository
  2. Prepare COCO dataset as described below
  3. Run the install_requirements.ipynb notebook to install all relevant dependencies.

Requirements

Linux, Python 3.4+, TensorFlow, Keras 2.1.6, cython, scikit_image 0.13.1, h5py, imgaug, and opencv_python

Prepare COCO dataset

The model requires MS COCO and the CocoAPI to be added to /data.

cd data
git clone https://github.com/cocodataset/cocoapi.git

It is recommended to symlink the dataset root of MS COCO.

ln -s "$PATH_TO_COCO/coco" coco

If unsure, follow the instructions of the Matterport Mask R-CNN implementation.
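Before opening the notebooks, it can help to confirm that both entries exist under the data directory. The sketch below assumes the layout implied by the commands above (`data/cocoapi` and `data/coco`); the helper name `missing_paths` is hypothetical and not part of the repository.

```python
from pathlib import Path
from typing import List, Sequence


def missing_paths(data_dir: str, expected: Sequence[str] = ("cocoapi", "coco")) -> List[str]:
    """Return the expected entries that are absent under data_dir.

    Path.exists() follows symlinks, so a dangling `coco` symlink is
    reported as missing as well.
    """
    root = Path(data_dir)
    return [name for name in expected if not (root / name).exists()]


# Run from the repository root; an empty list means both entries were found.
print(missing_paths("data"))
```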

Get pretrained weights

Get the pretrained weights from the releases menu and save them to /checkpoints.

Training

To train Siamese Mask R-CNN on MS COCO, follow the instructions in the training.ipynb notebook. Two model configs are available: a small one that runs on a single GPU with 12 GB of memory, and a large one that needs 4 GPUs with 12 GB of memory each. The large config is the one used in our experiments.

To reproduce our results and train the models reported in the paper, run the notebooks provided in experiments. Those models need 4 GPUs with 12 GB of memory each.

Our models are trained on the COCO 2017 training set, from which we hold out the last 3000 images for validation.
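The hold-out described above amounts to slicing off the tail of the image list. The following is a minimal sketch, assuming "last" refers to the order in which the image ids are listed by whatever loader is used; the function name `split_train_val` is hypothetical and the notebooks may implement the split differently.

```python
from typing import List, Sequence, Tuple


def split_train_val(image_ids: Sequence[int], num_val: int = 3000) -> Tuple[List[int], List[int]]:
    """Split image ids into (train, val), where val is the last num_val ids."""
    if num_val <= 0 or num_val >= len(image_ids):
        raise ValueError("num_val must be positive and smaller than the dataset size")
    ids = list(image_ids)
    return ids[:-num_val], ids[-num_val:]


# Toy example with 10,000 placeholder ids instead of the real training set.
train_ids, val_ids = split_train_val(range(10_000))
print(len(train_ids), len(val_ids))
```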

Evaluation

To evaluate and visualize a model's results, run the evaluation.ipynb notebook. Make sure to use the same config that was used for training the model.

To evaluate the models reported in the paper, run the evaluation notebook provided in experiments. Each model is evaluated five times to compensate for the stochastic effects introduced by randomly choosing the reference instances. The final result is the mean of those five runs.
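The repeated-evaluation protocol can be sketched as a loop over seeds followed by a mean. Everything below is illustrative: `evaluate_repeated` is a hypothetical helper, and `toy_evaluate` is a placeholder that returns a seed-dependent number rather than a real AP value from the evaluation notebook.

```python
import random
from statistics import mean
from typing import Callable, List, Tuple


def evaluate_repeated(
    evaluate_once: Callable[[int], float], num_runs: int = 5, base_seed: int = 0
) -> Tuple[float, List[float]]:
    """Call evaluate_once with num_runs different seeds and average the scores."""
    scores = [evaluate_once(base_seed + run) for run in range(num_runs)]
    return mean(scores), scores


def toy_evaluate(seed: int) -> float:
    """Placeholder for one evaluation pass with randomly drawn references."""
    rng = random.Random(seed)
    return rng.random()  # NOT a real metric, just a seed-dependent number


mean_score, per_run_scores = evaluate_repeated(toy_evaluate)
print(len(per_run_scores), mean_score)
```

Passing the seed into each run keeps the randomly drawn references reproducible, so the reported mean can be regenerated later.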

We use the COCO 2017 val set for testing and the last 3000 images from the training set for validation.

Model description

Siamese Mask R-CNN is designed as a minimal variation of Mask R-CNN that can perform the visual search task described above. For more details, please read the paper.

Citation

If you use this repository or want to reference our work, please cite our paper:

@article{michaelis_one-shot_2018,
    title = {One-Shot Instance Segmentation},
    author = {Michaelis, Claudio and Ustyuzhaninov, Ivan and Bethge, Matthias and Ecker, Alexander S.},
    year = {2018},
    journal = {arXiv},
    url = {http://arxiv.org/abs/1811.11507}
}
