svip-lab/GazeFollowing

Language
Python
License
MIT License
Created about 6 years ago
Updated over 3 years ago

Code for ACCV2018 paper 'Believe It or Not, We Know What You Are Looking at!'

Gaze following

PyTorch implementation of our ACCV2018 paper:

'Believe It or Not, We Know What You Are Looking at!' [paper] [poster]

Dongze Lian*, Zehao Yu*, Shenghua Gao

(* Equal Contribution)

Prepare training data

The GazeFollow dataset was proposed in [1]; please download it from http://gazefollow.csail.mit.edu/download.html. Note that the downloaded test data may contain wrong labels, so we requested test2 from the authors. We do not know whether the authors have since updated their test set. If not, it is better to e-mail the authors of [1]. For your convenience, we also paste here the test set link that the authors of [1] provided when we made our request. (Note that the license is in [1].)

Download our dataset

OurData is on OneDrive. Please download and unzip it.

OurData contains the data described in our paper.

OurData/tools/extract_frame.py

Extracts frames from clipVideo at 2 fps. Different versions of ffmpeg may produce different results, so we provide our extracted images.
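As a rough illustration of sampling a clip at 2 fps, the sketch below builds an ffmpeg command using the standard fps video filter. It is not the code in extract_frame.py; the function name build_ffmpeg_command, the example paths, and the 8-digit output file pattern are assumptions made for this example.

```python
import shlex


def build_ffmpeg_command(video_path, out_dir, fps=2):
    """Build an ffmpeg argument list that samples `video_path` at `fps` frames per second.

    Hypothetical helper: the output naming pattern (%08d.jpg) is an assumption,
    not necessarily what extract_frame.py uses.
    """
    return [
        "ffmpeg",
        "-i", video_path,        # input clip
        "-vf", f"fps={fps}",     # resample the video stream to `fps` frames per second
        f"{out_dir}/%08d.jpg",   # numbered output frames
    ]


# Example paths are placeholders; out_dir must exist before ffmpeg is run.
cmd = build_ffmpeg_command("clipVideo/example.mp4", "frames/example")
print(shlex.join(cmd))
# To execute it: import subprocess; subprocess.run(cmd, check=True)
```

Because frame selection can vary between ffmpeg versions, frames produced this way may not match the provided images exactly, so the provided images are the safer choice for reproducing results.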

OurData/tools/create_video_image_list.py

Extracts the annotations to JSON.

Testing on GazeFollow data

Please download the pretrained model manually and save it to model/

cd code
python test_gazefollow.py

Evaluation metrics

cd code
python cal_min_dis.py
python cal_auc.py
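For readers unfamiliar with the minimum-distance metric commonly reported on the GazeFollow benchmark, here is a minimal sketch. It assumes each test image has several annotated gaze points in normalized coordinates and takes the smallest Euclidean distance from the prediction to any of them. It is not the code in cal_min_dis.py; the function name and the sample values are hypothetical.

```python
import math


def min_distance(pred, annotations):
    """Smallest Euclidean distance from the predicted gaze point to any annotation.

    pred: (x, y) in normalized image coordinates.
    annotations: non-empty list of (x, y) annotated gaze points for the same image.
    """
    return min(math.dist(pred, gt) for gt in annotations)


# Illustrative values only, not taken from the dataset.
prediction = (0.5, 0.5)
ground_truth = [(0.5, 0.6), (0.9, 0.9)]
print(min_distance(prediction, ground_truth))
```

Averaging this value over all test images gives a single score for the test set, where lower is better.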

Test on our data

cd code
python test_ourdata.py

Training from scratch

cd code
python train.py

Inference

Simply run python inference.py image_path eye_x eye_y to infer the gaze. Note that eye_x and eye_y are the normalized coordinates (from 0 to 1) of the eye position. The script saves the inference result to tmp.png.

cd code
python inference.py ../images/00000003.jpg 0.52 0.14
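If you only have the eye position in pixels, a small helper like the one below can produce the normalized eye_x and eye_y values. This is a sketch under assumptions: the helper name normalize_eye is hypothetical, and we assume the origin is the top-left corner, with x divided by the image width and y by the image height.

```python
def normalize_eye(px, py, width, height):
    """Convert a pixel eye position to normalized coordinates in [0, 1].

    Assumes the origin is the top-left corner of the image (an assumption
    for this sketch, not confirmed by inference.py).
    """
    if width <= 0 or height <= 0:
        raise ValueError("image width and height must be positive")
    x, y = px / width, py / height
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("eye position lies outside the image")
    return x, y


# Example: an eye at pixel (320, 120) in a 640x480 image.
eye_x, eye_y = normalize_eye(320, 120, 640, 480)
print(f"python inference.py image.jpg {eye_x:.2f} {eye_y:.2f}")
# prints: python inference.py image.jpg 0.50 0.25
```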

Reference:

[1] Recasens*, A., Khosla*, A., Vondrick, C., Torralba, A.: Where are they looking? In: Advances in Neural Information Processing Systems (NIPS) (2015).

Citation

If this project is helpful for you, you can cite our paper:

@InProceedings{Lian_2018_ACCV,
author = {Lian, Dongze and Yu, Zehao and Gao, Shenghua},
title = {Believe It or Not, We Know What You Are Looking at!},
booktitle = {ACCV},
year = {2018}
}
