🖼️ Unofficial PyTorch implementation of "Learning Fine-grained Image Similarity with Deep Ranking" (arXiv:1404.4661)

Image Similarity using Deep Ranking

Thanks to Haseeb (@haseeb33) for improving the accuracy calculation as well as image query feature!

MathJax/LaTeX is heavily used in this README file. Please install the MathJax Plugin for GitHub so that the formulas render correctly on GitHub.

Introduction

The goal of this project is to get hands-on experience with the computer vision task of image similarity. Like most tasks in this field, it has been aided by the ability of deep networks to extract image features.

The task of image similarity is to retrieve the set of N images closest to a query image. One application is a visual search engine: we provide a query image and want to find the images in the database that are closest to it.

Dataset

For this project, we will use the Tiny ImageNet dataset. The Tiny ImageNet Challenge is the default course project for Stanford CS231N. It runs similarly to the ImageNet challenge (ILSVRC).

Tiny ImageNet has 200 classes. Each class has 500 training images, 50 validation images, and 50 test images. The training and validation sets are released with images and annotations, which include both class labels and bounding boxes; however, you are asked only to predict the class label of each image, without localizing the objects. The test set is released without labels. You can download the whole Tiny ImageNet dataset here.

Project Description

Overview

You will design a simplified version of the deep ranking model discussed in the paper. Your network architecture will look exactly the same, but the details of the triplet sampling layer will be a lot simpler. The architecture consists of $3$ identical networks $(Q,P,N)$. Each of these networks takes a single image, denoted by $p_i$, $p_i^+$, $p_i^-$ respectively.

  • $p_i$: Input to the $Q$ (Query) network. This image is randomly sampled across any class.
  • $p_i^+$: Input to the $P$ (Positive) network. This image is randomly sampled from the SAME class as the query image.
  • $p_i^-$: Input to the $N$ (Negative) network. This image is randomly sampled from any class EXCEPT the class of $p_i$.

The output of each network, denoted by $f(p_i)$, $f(p_i^+)$, $f(p_i^-)$ is the feature embedding of an image. This gets fed to the ranking layer.

Ranking Layer

The ranking layer just computes the triplet loss. It teaches the network to produce similar feature embeddings for images from the same class (and different embeddings for images from different classes).

$$ l(p_i, p_i^+, p_i^-) = \max \big\{ 0, \; g + D \big(f(p_i), f(p_i^+) \big) - D \big( f(p_i), f(p_i^-) \big) \big\} $$

$D$ is the Euclidean Distance between $f(p_i)$ and $f(p_i^{+/-})$.

$$ D(p, q) = \sqrt{(q_1 - p_1)^2 + (q_2 - p_2)^2 + \dots + (q_n - p_n)^2} $$

$g$ is the gap parameter that regularizes the gap between the distances of the two image pairs: $(p_i, p_i^+)$ and $(p_i, p_i^-)$. We use the default value of $1.0$, but you can tune it if you'd like (make sure it's positive).
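As a worked example of the two formulas above, here is a minimal sketch in plain Python. The function names and the toy 2-D embeddings are made up for illustration; the repository itself relies on PyTorch's built-in triplet loss rather than code like this.

```python
import math


def euclidean_distance(p, q):
    """D(p, q): Euclidean distance between two equal-length embeddings."""
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))


def triplet_loss(f_query, f_pos, f_neg, g=1.0):
    """l = max(0, g + D(f(p_i), f(p_i^+)) - D(f(p_i), f(p_i^-)))."""
    d_pos = euclidean_distance(f_query, f_pos)
    d_neg = euclidean_distance(f_query, f_neg)
    return max(0.0, g + d_pos - d_neg)


# Toy embeddings (hypothetical numbers, not taken from the project).
query = [0.0, 0.0]
near = [0.0, 1.0]  # distance 1 from the query
far = [3.0, 4.0]   # distance 5 from the query

print(triplet_loss(query, near, far))  # 0.0: the positive is closer by more than g
print(triplet_loss(query, far, near))  # 5.0: ordering violated, 1 + 5 - 1
```

The loss is zero once the negative is at least $g$ farther from the query than the positive, so well-separated triplets stop contributing gradient.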

Testing stage

The testing (inference) stage only has one network and accepts only one image. To retrieve the top $n$ similar results of a query image during inference, the following procedure is followed:

  1. Compute the feature embedding of the query image.
  2. Compare the feature embedding of the query image to all the feature embeddings in the training data (i.e. your database) using Euclidean distance.
  3. Rank the results: sort them by the Euclidean distance between feature embeddings and keep the $n$ closest.
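The three steps above can be sketched as a brute-force search in plain Python. This is only an illustration under simple assumptions: the embeddings are already computed and stored as lists of floats, and `top_n_similar` is a hypothetical helper name, not a function from the repository.

```python
import math


def top_n_similar(query_embedding, database_embeddings, n):
    """Return the indices of the n database embeddings closest to the query."""
    distances = [
        (math.dist(query_embedding, embedding), index)
        for index, embedding in enumerate(database_embeddings)
    ]
    distances.sort()  # ascending distance; ties are broken by index
    return [index for _, index in distances[:n]]


# Toy database of 2-D embeddings (hypothetical values).
database = [[5.0, 5.0], [0.1, 0.0], [1.0, 1.0], [0.0, 0.2]]
print(top_n_similar([0.0, 0.0], database, n=3))  # [1, 3, 2]
```

A linear scan like this costs one distance computation per database image for each query, which is why a dedicated nearest-neighbor index is usually preferred for larger databases.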

Triplet Sampling Layer

One of the main contributions of the paper is the triplet sampling layer. Sampling the query image (randomly) and the positive image (randomly from the same class as the query image) is quite straightforward.

Negative samples are composed of two different types of samples: in-class and out-of-class. For this project, we will implement out-of-class samples only. Again, out-of-class samples are images sampled randomly from any class except the class of the query image.
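The out-of-class sampling rule described above can be sketched as follows. This is a minimal illustration, not the project's sampler: the function name, the seed handling, the toy file names, and the choice to draw the positive as a different image than the query are all assumptions.

```python
import random


def sample_triplets(images_by_class, num_triplets, seed=0):
    """Sample (query, positive, negative) triplets with out-of-class negatives.

    images_by_class maps a class label to a list of image identifiers.
    Assumes at least two classes and at least two images per class.
    """
    rng = random.Random(seed)
    classes = sorted(images_by_class)
    triplets = []
    for _ in range(num_triplets):
        query_class = rng.choice(classes)
        # Query and positive: two distinct images from the same class
        # (excluding the query itself is an assumption of this sketch).
        query, positive = rng.sample(images_by_class[query_class], 2)
        # Negative: any image from any class except the query's class.
        negative_class = rng.choice([c for c in classes if c != query_class])
        negative = rng.choice(images_by_class[negative_class])
        triplets.append((query, positive, negative))
    return triplets


# Hypothetical toy data; real inputs would be image paths.
images_by_class = {
    "cat": ["cat_0.JPEG", "cat_1.JPEG", "cat_2.JPEG"],
    "dog": ["dog_0.JPEG", "dog_1.JPEG"],
    "car": ["car_0.JPEG", "car_1.JPEG"],
}
for triplet in sample_triplets(images_by_class, num_triplets=3):
    print(triplet)
```

Fixing the seed makes the sampled triplets reproducible, which helps when the triplets are generated once ahead of training.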

Tips

  1. Use the ResNet architecture instead of the multiscale network.
  2. Use the data loader. It helps a lot by loading the images in parallel (there is a num_workers option).
  3. Sample triplets beforehand, so that during training all you're doing is reading images.
  4. Make sure to load your model with pre-trained weights. This will greatly reduce the time needed to train your ranking network.

Implementation Details

Hyper-parameters settings

Hyper-parameter                             Description
lr=0.001                                    learning rate
momentum=0.9                                momentum factor
nesterov=True                               Nesterov momentum
weight_decay=1e-5                           weight decay (L2 penalty)
epochs=50                                   number of epochs to train
batch_size_train=30                         training set input batch size
batch_size_test=30                          test set input batch size
num_of_pos_images / num_of_neg_images = 3   number of p / n images for each query image
g=1.0                                       gap parameter

Optimizer and loss function

criterion = nn.TripletMarginLoss(margin=args.g, p=args.p)
optimizer = torch.optim.SGD(net.parameters(),
                            lr=args.lr,
                            momentum=args.momentum,
                            weight_decay=args.weight_decay,
                            nesterov=args.nesterov)
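The snippet above reads its settings from an `args` object. One plausible way to build it is an `argparse` parser whose defaults mirror the hyper-parameter table. The flag names below are assumptions chosen to match the attribute names used above, and the default `p=2.0` (Euclidean norm, which is also PyTorch's default for `nn.TripletMarginLoss`) is an assumption, since the table does not list it.

```python
import argparse


def build_parser():
    """Command-line options with defaults taken from the hyper-parameter table."""
    parser = argparse.ArgumentParser(description="Deep ranking training (sketch)")
    parser.add_argument("--lr", type=float, default=0.001, help="learning rate")
    parser.add_argument("--momentum", type=float, default=0.9, help="momentum factor")
    # Nesterov momentum is on by default; this hypothetical flag switches it off.
    parser.add_argument("--no-nesterov", dest="nesterov", action="store_false",
                        help="disable Nesterov momentum")
    parser.add_argument("--weight_decay", type=float, default=1e-5,
                        help="weight decay (L2 penalty)")
    parser.add_argument("--epochs", type=int, default=50,
                        help="number of epochs to train")
    parser.add_argument("--batch_size_train", type=int, default=30,
                        help="training set input batch size")
    parser.add_argument("--batch_size_test", type=int, default=30,
                        help="test set input batch size")
    parser.add_argument("--g", type=float, default=1.0, help="gap parameter")
    # Assumed default: p=2 selects the Euclidean norm for the pairwise distance.
    parser.add_argument("--p", type=float, default=2.0,
                        help="norm degree for the pairwise distance")
    return parser


args = build_parser().parse_args([])  # empty list: use the defaults
print(args.lr, args.momentum, args.nesterov, args.g, args.p)
```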

Model architecture

TripletNet(
  (embeddingnet): EmbeddingNet(
    (features): Sequential(
      (0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
      (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): ReLU(inplace)
      (3): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
      (4): Sequential(
        (0): BasicBlock(
          (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (1): BasicBlock(
          (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (2): BasicBlock(
          (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (5): Sequential(
        (0): BasicBlock(
          (conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (downsample): Sequential(
            (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
            (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          )
        )
        (1): BasicBlock(
          (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (2): BasicBlock(
          (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (3): BasicBlock(
          (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (6): Sequential(
        (0): BasicBlock(
          (conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (downsample): Sequential(
            (0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
            (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          )
        )
        (1): BasicBlock(
          (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (2): BasicBlock(
          (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (3): BasicBlock(
          (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (4): BasicBlock(
          (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (5): BasicBlock(
          (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (6): BasicBlock(
          (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (7): BasicBlock(
          (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (8): BasicBlock(
          (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (9): BasicBlock(
          (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (10): BasicBlock(
          (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (11): BasicBlock(
          (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (12): BasicBlock(
          (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (13): BasicBlock(
          (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (14): BasicBlock(
          (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (15): BasicBlock(
          (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (16): BasicBlock(
          (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (17): BasicBlock(
          (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (18): BasicBlock(
          (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (19): BasicBlock(
          (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (20): BasicBlock(
          (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (21): BasicBlock(
          (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (22): BasicBlock(
          (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (7): Sequential(
        (0): BasicBlock(
          (conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (downsample): Sequential(
            (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
            (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          )
        )
        (1): BasicBlock(
          (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (2): BasicBlock(
          (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace)
          (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (8): AvgPool2d(kernel_size=7, stride=1, padding=0)
    )
    (fc1): Linear(in_features=512, out_features=4096, bias=True)
  )
)
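The final `AvgPool2d(kernel_size=7, ...)` followed by `fc1` with `in_features=512` suggests that the feature map entering the pooling layer is 7x7. The short calculation below traces the spatial size through the strided layers printed above, using the standard output-size formula for convolution and pooling layers. It is a sketch for checking the arithmetic; the 224x224 input size is an assumption (the README does not state the resize), while 64x64 is the native Tiny ImageNet resolution.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a conv/pool layer (floor mode, dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_map_sizes(input_size):
    """Trace the spatial size up to the input of the final AvgPool2d."""
    sizes = [input_size]
    sizes.append(conv_out(sizes[-1], kernel=7, stride=2, padding=3))  # (0) Conv2d
    sizes.append(conv_out(sizes[-1], kernel=3, stride=2, padding=1))  # (3) MaxPool2d
    # Stage (4) keeps the size (all strides are 1); stages (5), (6) and (7)
    # each start with one stride-2 3x3 convolution that halves it.
    for _ in range(3):
        sizes.append(conv_out(sizes[-1], kernel=3, stride=2, padding=1))
    return sizes


print(feature_map_sizes(224))  # [224, 112, 56, 28, 14, 7]
print(feature_map_sizes(64))   # [64, 32, 16, 8, 4, 2]
```

With a 224x224 input the 7x7 average pool reduces the 512 channels to a single 512-dimensional vector, matching `fc1`. A raw 64x64 image would reach the pooling layer as a 2x2 map, too small for a 7x7 kernel, so the images presumably need to be resized before they are fed to this network.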

Sampling strategies

We use the simplified version of the sampling method, as follows:

  • $p_i$: Input to the $Q$ (Query) network. This image is randomly sampled across any class.
  • $p_i^+$: Input to the $P$ (Positive) network. This image is randomly sampled from the SAME class as the query image.
  • $p_i^-$: Input to the $N$ (Negative) network. This image is randomly sampled from any class EXCEPT the class of $p_i$.

I have created a file named sampler.py, which randomly samples positive and negative images for each query image.

$ python3 sampler.py
Input Directory: ../tiny-imagenet-200/train
Output Directory: ../
Number of Positive image per Query image:  1
Number of Negative image per Query image:  1
==> Sampling Done ... Now Writing ...

triplets.txt looks like this:

../tiny-imagenet-200/train/n01443537/images/n01443537_0.JPEG,../tiny-imagenet-200/train/n01443537/images/n01443537_219.JPEG,../tiny-imagenet-200/train/n04376876/images/n04376876_418.JPEG
../tiny-imagenet-200/train/n01443537/images/n01443537_0.JPEG,../tiny-imagenet-200/train/n01443537/images/n01443537_219.JPEG,../tiny-imagenet-200/train/n02948072/images/n02948072_159.JPEG
../tiny-imagenet-200/train/n01443537/images/n01443537_0.JPEG,../tiny-imagenet-200/train/n01443537/images/n01443537_219.JPEG,../tiny-imagenet-200/train/n04099969/images/n04099969_450.JPEG
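Each line of triplets.txt holds three comma-separated paths in the order query, positive, negative. A minimal sketch of reading that format with the standard library is shown below. `read_triplets` and `class_of` are hypothetical helpers rather than functions from the repository, and `class_of` assumes the directory layout visible in the sample paths (class folder, then `images`, then the file).

```python
import csv
from pathlib import PurePosixPath


def read_triplets(lines):
    """Parse 'query,positive,negative' rows into 3-tuples of paths."""
    triplets = []
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        if len(row) != 3:
            raise ValueError(f"expected 3 paths per line, got {len(row)}")
        triplets.append(tuple(path.strip() for path in row))
    return triplets


def class_of(path):
    """Class id from a path shaped like .../<class>/images/<file>.JPEG."""
    return PurePosixPath(path).parent.parent.name


sample = [
    "../tiny-imagenet-200/train/n01443537/images/n01443537_0.JPEG,"
    "../tiny-imagenet-200/train/n01443537/images/n01443537_219.JPEG,"
    "../tiny-imagenet-200/train/n04376876/images/n04376876_418.JPEG",
]
query, positive, negative = read_triplets(sample)[0]
print(class_of(query), class_of(positive), class_of(negative))
```

To read the real file, pass an open file object instead of the list, for example `read_triplets(open("triplets.txt", newline=""))`.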

Visualization

Training loss and accuracy

Image Search

Current Implementation

  1. Compute the feature embedding of the query image.
  2. Fit a NearestNeighbor model on the feature embeddings of the training data (this can be modified for other datasets as well).
  3. Find the top N nearest neighbors of the query image embedding (these are the images most similar to the query image in the embedding space).

For testing and searching, the functions in src/acc_knn.py can be used, explored, and modified for custom use cases.

$ python3 acc_knn.py --predict_similar_images "../tiny-imagenet-200/test/images/test_9970.JPEG" --predict_top_N 5

$ python3 acc_knn.py --predict_similar_images "../tiny-imagenet-200/test/images/test_73.JPEG" --predict_top_N 10

References

[1] Jiang Wang, Yang Song, Thomas Leung, Chuck Rosenberg, Jingbin Wang, James Philbin, Bo Chen, Ying Wu. "Learning Fine-grained Image Similarity with Deep Ranking". arXiv:1404.4661
[2] Akarsh Zingade. "Image Similarity using Deep Ranking"
[3] PyTorch Discussion. "Feedback on PyTorch for Kaggle competitions"
