
Referring Relationships

Referring Relationships model

This repository contains code used to produce the results in the following paper:

Referring Relationships

Ranjay Krishna†, Ines Chami†, Michael Bernstein, Li Fei-Fei
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018

If you are using this repository, please use the following citation:

@inproceedings{krishna2018referring,
  title={Referring Relationships},
  author={Krishna, Ranjay and Chami, Ines and Bernstein, Michael and Fei-Fei, Li},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
  year={2018}
}

Clone the repository and install the dependencies.

You can clone the repository and install the requirements by running the following:

git clone https://github.com/stanfordvl/ReferringRelationships.git
cd ReferringRelationships
virtualenv -p python3 env
source env/bin/activate
pip install -r requirements.txt

To download the dataset used in the project, run:

./scripts/download_data.sh

Note that we only distribute the annotations for these datasets. To download the images, please use the following links:

Model training

To train the models, you will need to create an HDF5 dataset and then run the following scripts to train and evaluate the model:

# For the VRD dataset.
./scripts/create_vrd_dataset.sh $LOCATION_OF_VRD_TRAIN_IMAGES $LOCATION_OF_VRD_TEST_IMAGES
./scripts/train_vrd.sh
./scripts/evaluate_vrd.sh
# For the CLEVR dataset.
./scripts/create_clevr_dataset.sh $LOCATION_OF_CLEVR_TRAIN_IMAGES $LOCATION_OF_CLEVR_VAL_IMAGES
./scripts/train_clevr.sh $LOCATION_OF_MODEL
./scripts/evaluate_clevr.sh $LOCATION_OF_MODEL
# For the Visual Genome dataset.
./scripts/create_visualgenome_dataset.sh $LOCATION_OF_VISUAL_GENOME_IMAGES
./scripts/train_visualgenome.sh $LOCATION_OF_MODEL
./scripts/evaluate_visualgenome.sh $LOCATION_OF_MODEL

These scripts will train the model and save the weights in the --save-dir directory. They will also save the configuration parameters in a params.json file and log events in train.log.
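Once a run finishes, the saved params.json makes it easy to recover the exact configuration that produced a set of weights. A minimal sketch of round-tripping such a file (the keys shown here are illustrative, not the repository's exact schema):

```python
import json
import os
import tempfile

# Illustrative configuration; the real keys in params.json depend on the
# flags passed to train.py.
params = {"opt": "adam", "lr": 0.001, "batch-size": 64, "epochs": 50}

# Write a params.json the way a run directory stores it, then re-read it.
save_dir = tempfile.mkdtemp()
path = os.path.join(save_dir, "params.json")
with open(path, "w") as f:
    json.dump(params, f, indent=2)

with open(path) as f:
    loaded = json.load(f)
print(loaded["lr"])  # -> 0.001
```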

However, if you decide that you want more control over the training or evaluation scripts, check out the instructions below.

Customized dataset creation

The script data.py saves masks for subjects and objects in train/val/test directories created under --save-dir. The script also saves numpy arrays for the relationships.

The script has the following command line arguments to modify the dataset pre-processing:

  -h, --help            show this help message and exit
  --test                When true, the data is not split into training and
                        validation sets.
  --val-percent         Fraction of images in the validation split.
  --save-dir            Location where the ground truth masks and dataset
                        should be saved.
  --img-dir             Location where images are stored.
  --annotations         JSON file with relationships for each image.
  --image-metadata      Image metadata JSON file.
  --image-dim           The size the images should be saved as.
  --output-dim          The size the predictions should be saved as.
  --seed                The random seed used to reproduce results.
  --num-images          The number of images to include in the dataset.
  --save-images         Use this flag to specify that the images should also
                        be saved.
  --max-rels-per-image  Maximum number of relationships per image.
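To make --image-dim and --output-dim concrete: the ground truth masks are binary arrays at the prediction resolution. The sketch below rasterizes a bounding box into such a mask; the helper name, box coordinates, and dimensions are invented for illustration:

```python
import numpy as np

def box_to_mask(box, image_dim, output_dim):
    """Rasterize an (x1, y1, x2, y2) box, given in image pixels,
    into a binary mask at the prediction resolution."""
    mask = np.zeros((output_dim, output_dim), dtype=np.float32)
    scale = output_dim / image_dim
    x1, y1, x2, y2 = [int(round(c * scale)) for c in box]
    mask[y1:y2, x1:x2] = 1.0
    return mask

# A hypothetical subject box in a 224x224 image, saved at 14x14 resolution.
mask = box_to_mask((32, 32, 128, 160), image_dim=224, output_dim=14)
print(mask.shape, mask.sum())  # (14, 14) 48.0
```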

Customized training

The model can be trained by calling python train.py with the following command line arguments to modify your training:

optional arguments:
  -h, --help            show this help message and exit
  --opt                 The optimizer used during training. Currently supports
                        rms, adam, adagrad and adadelta.
  --lr                  The learning rate for training.
  --lr_decay            The learning rate decay.
  --batch-size          The batch size used in training.
  --epochs              The number of epochs to train.
  --seed                The random seed used to reproduce results.
  --overwrite           Train even if that folder already contains an existing
                        model.
  --save-dir            The location to save the model and the results.
  --models-dir          The location of the model weights.
  --use-models-dir      Indicates that new models can be saved in the models
                        directory set by --models-dir.
  --save-best-only      Saves only the best model checkpoint.

  --use-subject         Boolean indicating whether to use the subjects.
  --use-predicate       Boolean indicating whether to use the predicates.
  --use-object          Boolean indicating whether to use the objects.

  --embedding-dim       Number of dimensions in our class embeddings.
  --hidden-dim          Number of dimensions in the hidden unit.
  --feat-map-dim        The size of the feature map extracted from the image.
  --input-dim           Size of the input image.
  --num-predicates      The number of predicates in the dataset.
  --num-objects         The number of objects in the dataset.
  --dropout             The dropout probability used in training.

  --train-data-dir      Location of the training data.
  --val-data-dir        Location of the validation data.
  --image-data-dir      Location of the images.
  --heatmap-threshold   The threshold above which we consider a heatmap to
                        contain an object.
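For reference, flags like these map onto a standard argparse interface. A trimmed, illustrative mirror of a few of the options (the defaults here are placeholders, not the repository's):

```python
import argparse

# A trimmed, illustrative mirror of train.py's interface; the real
# defaults and full option set live in the repository's code.
parser = argparse.ArgumentParser(description="Train a Referring Relationships model.")
parser.add_argument("--opt", default="rms",
                    choices=["rms", "adam", "adagrad", "adadelta"],
                    help="The optimizer used during training.")
parser.add_argument("--lr", type=float, default=0.0001,
                    help="The learning rate for training.")
parser.add_argument("--batch-size", type=int, default=64,
                    help="The batch size used in training.")
parser.add_argument("--save-dir", default=None,
                    help="The location to save the model and the results.")

args = parser.parse_args(["--opt", "adam", "--lr", "0.001"])
print(args.opt, args.lr)  # adam 0.001
```

Note that argparse converts --batch-size into the attribute args.batch_size.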

Customized evaluation

The evaluations can be run using python evaluate.py with the following options:

  -h, --help            show this help message and exit
  --batch-size          The batch size used during evaluation.
  --seed                The random seed used to reproduce results.
  --workers             Number of workers used to load the data.
  --heatmap-threshold   The threshold above which we consider a heatmap to
                        contain an object.
  --model-checkpoint    The model to evaluate.
  --data-dir            Location of the data to evaluate with.
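The --heatmap-threshold flag controls when a predicted heatmap is judged to contain an object. A minimal sketch of that decision, as a pure illustration rather than the repository's evaluation code:

```python
import numpy as np

def contains_object(heatmap, threshold):
    """Binarize the heatmap and report whether any cell clears the threshold."""
    return bool((heatmap >= threshold).any())

# A tiny 2x2 heatmap of per-cell object probabilities.
heatmap = np.array([[0.1, 0.2],
                    [0.7, 0.4]])
print(contains_object(heatmap, 0.5))  # True  (0.7 clears the threshold)
print(contains_object(heatmap, 0.8))  # False (no cell reaches 0.8)
```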

Customized discovery evaluation

The discovery-based experiments can be run by setting the following flags during training and using python evaluate_discovery.py when evaluating.

  --discovery           Used when we run the discovery experiment, where
                        objects are dropped during training.
  --always-drop-file    Location of a list of objects that should always be
                        dropped.
  --subject-droprate    Rate at which subjects are dropped.
  --object-droprate     Rate at which objects are dropped.
  --model-checkpoint    The model to evaluate.
  --data-dir            Location of the data to evaluate with.
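To make the drop-rate flags concrete, the sketch below drops object categories the way the discovery setup describes: everything in an always-drop list is removed, and the rest are dropped independently at a fixed rate. All names and values here are invented for the example:

```python
import random

def drop_objects(objects, always_drop, droprate, rng):
    """Return the objects kept after discovery-style dropping:
    anything in always_drop is removed; the rest are dropped
    independently with probability droprate."""
    kept = []
    for obj in objects:
        if obj in always_drop:
            continue  # mimics the --always-drop-file list
        if rng.random() < droprate:
            continue  # random drop, as with --object-droprate
        kept.append(obj)
    return kept

rng = random.Random(0)  # fixed seed for reproducibility
objects = ["person", "dog", "chair", "car"]
kept = drop_objects(objects, always_drop={"chair"}, droprate=0.3, rng=rng)
print(kept)
```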

Contributing

We welcome everyone to contribute to this repository. Send us a pull request.

License

The code is under the MIT license. Check LICENSE for details.
