Repository Details

The authors' official TensorFlow implementation of "Near-Duplicate Video Retrieval with Deep Metric Learning" [ICCVW 2017].

Near-Duplicate Video Retrieval
with Deep Metric Learning

This repository contains the TensorFlow implementation of the paper Near-Duplicate Video Retrieval with Deep Metric Learning. It provides code for training and evaluating a Deep Metric Learning (DML) network on the problem of Near-Duplicate Video Retrieval (NDVR). During training, the DML network is fed video triplets produced by a triplet generator and is optimized with the triplet loss function. The architecture of the network is displayed in the figure below. For evaluation, the mean Average Precision (mAP) and the Precision-Recall curve (PR-curve) are calculated. Two publicly available datasets are supported: VCDB and CC_WEB_VIDEO.
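As a rough illustration of the triplet loss mentioned above, here is a minimal pure-Python sketch. The squared Euclidean distance, the margin value of 1.0, and the function names are assumptions made for this example; the repository's training code may define the loss differently.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss.

    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin` (an assumed, illustrative value).
    """
    return max(
        0.0,
        squared_distance(anchor, positive)
        - squared_distance(anchor, negative)
        + margin,
    )


# Toy 2-D embeddings, not real video features.
anchor = [0.0, 1.0]
positive = [0.0, 0.9]
negative = [1.0, 0.0]

loss_easy = triplet_loss(anchor, positive, negative)  # margin already satisfied
loss_hard = triplet_loss(anchor, negative, positive)  # roles swapped, margin violated
```

In the first call the negative is already far enough from the anchor, so the triplet contributes no loss. The swapped call violates the margin and yields a positive loss.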

Prerequisites

  • Python
  • TensorFlow 1.x

Getting started

Installation

  • Clone this repo:
git clone https://github.com/MKLab-ITI/ndvr-dml
cd ndvr-dml
  • Install all the dependencies with
pip install -r requirements.txt

or

conda install --file requirements.txt

Triplet generation

Run the triplet generation process for each dataset, VCDB and CC_WEB_VIDEO. This process will generate two files for each dataset:

  1. the global feature vectors for each video in the dataset:
    <output_dir>/<dataset>_features.npy
  2. the generated triplets:
    <output_dir>/<dataset>_triplets.npy

To execute the triplet generation process, do as follows:

  • The code does not extract features from videos. Instead, you must provide the .npy files of features that have already been extracted. The Intermediate-CNN-Features tool (see Related Projects) can be used to extract them.

  • Create a file that lists the video id and the path of the feature file for each video in the dataset being processed. Each line of the file must contain the video id (basename of the video file) and the full path to the corresponding .npy file of its features, separated by a tab character (\t). Example:

      23254771545e5d278548ba02d25d32add952b2a4	features/23254771545e5d278548ba02d25d32add952b2a4.npy
      468410600142c136d707b4cbc3ff0703c112575d	features/468410600142c136d707b4cbc3ff0703c112575d.npy
      67f1feff7f624cf0b9ac2ebaf49f547a922b4971	features/67f1feff7f624cf0b9ac2ebaf49f547a922b4971.npy
                                               ...	
    
  • Run the triplet generator and provide the generated file from the previous step, the name of the processed dataset, and the output directory.

python triplet_generator.py --dataset vcdb --feature_files vcdb_feature_files.txt --output_dir output_data/
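The tab-separated feature-file list described above can be produced with a short script. The sketch below assumes that every feature file is named <video id>.npy and sits in a single directory, as in the example listing. The helper names write_feature_file_list and read_feature_file_list are hypothetical and are not part of this repository.

```python
import os
import tempfile


def write_feature_file_list(feature_dir, list_path):
    """Write one '<video id>\\t<full path to .npy>' line per feature file."""
    names = sorted(n for n in os.listdir(feature_dir) if n.endswith(".npy"))
    with open(list_path, "w") as out:
        for name in names:
            video_id = os.path.splitext(name)[0]  # file name without ".npy"
            full_path = os.path.abspath(os.path.join(feature_dir, name))
            out.write(f"{video_id}\t{full_path}\n")
    return len(names)


def read_feature_file_list(list_path):
    """Parse the list back into a {video_id: feature_path} dictionary."""
    entries = {}
    with open(list_path) as src:
        for line in src:
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            video_id, feature_path = line.split("\t", 1)
            entries[video_id] = feature_path
    return entries


# Demo on a temporary directory holding empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for vid in ("video_a", "video_b"):
        open(os.path.join(tmp, vid + ".npy"), "wb").close()
    list_path = os.path.join(tmp, "feature_files.txt")
    count = write_feature_file_list(tmp, list_path)
    entries = read_feature_file_list(list_path)
```

Reading the list back before running the triplet generator is a cheap way to check that every line has exactly one video id and one path.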

DML training

  • Train the DML network by providing the global features and triplets of VCDB, and a directory in which to save the trained model.
python train_dml.py --train_set output_data/vcdb_features.npy --triplets output_data/vcdb_triplets.npy --model_path model/ 
  • Triplets from CC_WEB_VIDEO can also be injected during training if the global features and triplets of the evaluation set are provided.
python train_dml.py --evaluation_set output_data/cc_web_video_features.npy --evaluation_triplets output_data/cc_web_video_triplets.npy --train_set output_data/vcdb_features.npy --triplets output_data/vcdb_triplets.npy --model_path model/

Evaluation

  • Evaluate the performance of the system by providing the trained model path and the global features of CC_WEB_VIDEO.
python evaluation.py --fusion Early --evaluation_set output_data/cc_vgg_features.npy --model_path model/

OR

python evaluation.py --fusion Late --evaluation_features cc_web_video_feature_files.txt --evaluation_set output_data/cc_vgg_features.npy --model_path model/
  • The mAP and the PR-curve are returned.
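For readers unfamiliar with the reported metric, the following sketch computes Average Precision per query and mAP across queries using the standard non-interpolated definition. It is an illustration only: the function names, the toy rankings, and the ground-truth sets are assumptions, and evaluation.py may compute the metric with a different variant.

```python
def average_precision(ranked_ids, relevant_ids):
    """AP for one query: mean of the precision values at each relevant hit."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, video_id in enumerate(ranked_ids, start=1):
        if video_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / len(relevant)


def mean_average_precision(rankings, ground_truth):
    """mAP: the mean of the per-query AP values."""
    scores = [average_precision(rankings[q], ground_truth[q]) for q in rankings]
    return sum(scores) / len(scores)


# Toy retrieval results: each query maps to its ranked list of video ids.
rankings = {"q1": ["a", "b", "c", "d"], "q2": ["x", "y", "z"]}
# Hypothetical ground truth: the near-duplicates of each query.
ground_truth = {"q1": {"a", "c"}, "q2": {"y"}}

ap_q1 = average_precision(rankings["q1"], ground_truth["q1"])
score = mean_average_precision(rankings, ground_truth)
```

For query q1 the relevant videos appear at ranks 1 and 3, giving precisions of 1 and 2/3 and an AP of 5/6. Query q2 has its single relevant video at rank 2, giving an AP of 1/2.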

Citation

If you use this code for your research, please cite our paper.

@inproceedings{kordopatis2017dml,
  title={Near-Duplicate Video Retrieval with Deep Metric Learning},
  author={Kordopatis-Zilos, Giorgos and Papadopoulos, Symeon and Patras, Ioannis and Kompatsiaris, Yiannis},
  booktitle={2017 IEEE International Conference on Computer Vision Workshop (ICCVW)},
  year={2017},
}

Related Projects

Intermediate-CNN-Features - this repo was used to extract our features

ViSiL - video similarity learning for fine-grained similarity calculation

FIVR-200K - download our FIVR-200K dataset

License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

Contact for further details about the project

Giorgos Kordopatis-Zilos ([email protected])
Symeon Papadopoulos ([email protected])
