Repository relating to "Data-driven Feature Tracking for Event Cameras" (CVPR, 2023, Award Candidate).

Data-driven Feature Tracking for Event Cameras


This is the code for the CVPR23 paper Data-driven Feature Tracking for Event Cameras (PDF) by Nico Messikommer*, Carter Fang*, Mathias Gehrig, and Davide Scaramuzza. For an overview of our method, check out our video.

If you use any of this code, please cite the following publication:

@Article{Messikommer23cvpr,
  author  = {Nico Messikommer* and Carter Fang* and Mathias Gehrig and Davide Scaramuzza},
  title   = {Data-driven Feature Tracking for Event Cameras},
  journal = {IEEE Conference on Computer Vision and Pattern Recognition},
  year    = {2023},
}

Abstract

Because of their high temporal resolution, increased resilience to motion blur, and very sparse output, event cameras have been shown to be ideal for low-latency and low-bandwidth feature tracking, even in challenging scenarios. Existing feature tracking methods for event cameras are either handcrafted or derived from first principles but require extensive parameter tuning, are sensitive to noise, and do not generalize to different scenarios due to unmodeled effects. To tackle these deficiencies, we introduce the first data-driven feature tracker for event cameras, which leverages low-latency events to track features detected in a grayscale frame. We achieve robust performance via a novel frame attention module, which shares information across feature tracks. By directly transferring zero-shot from synthetic to real data, our data-driven tracker outperforms existing approaches in relative feature age by up to 120% while also achieving the lowest latency. This performance gap is further increased to 130% by adapting our tracker to real data with a novel self-supervision strategy.

(Qualitative tracking results on the ziggy and shapes_6dof sequences.)


Content

This document describes the installation and usage of this repository.

  1. Installation
  2. Test Sequences and Pretrained Weights
  3. Preparing Synthetic Data
  4. Training on Synthetic Data
  5. Preparing Pose Data
  6. Training on Pose Data
  7. Preparing Evaluation Data
  8. Running Ours
  9. Evaluation
  10. Visualization

Installation

This guide assumes the use of Python 3.9.7.

  1. If desired, a conda environment can be created using the following command:
    conda create -n <env_name>
  2. Install the dependencies via the requirements.txt file:
    pip install -r requirements.txt

    Dependencies for training:

    • PyTorch
    • PyTorch Lightning
    • Hydra

    Dependencies for pre-processing:

    • numpy
    • OpenCV
    • H5Py and HDF5Plugin

    Dependencies for visualization:

    • matplotlib
    • seaborn
    • imageio


Test Sequences and Pretrained Weights

To facilitate the evaluation of the tracking performance, we provide the raw events, multiple event representations, etc., for the test sequences used from the Event Camera Dataset (EC) and the EDS dataset. The ground-truth tracks for both the EC and EDS datasets, generated from the camera poses and KLT tracks, can be downloaded here.

Furthermore, we also provide the network weights trained on the Multiflow dataset, the weights fine-tuned on the EC dataset, and the weights fine-tuned on the EDS dataset using our proposed pose-supervision strategy.
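
The provided weights can be inspected like any regular PyTorch checkpoint. The snippet below is a minimal sketch (it assumes the files are standard PyTorch / PyTorch Lightning checkpoints and uses a placeholder path):

import torch

# placeholder path; point this at one of the downloaded weight files
ckpt = torch.load("pretrained_weights.ckpt", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)  # Lightning checkpoints nest the weights under 'state_dict'
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))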


Preparing Synthetic Data

Download MultiFlow Dataset

Download links:

If you use this dataset in an academic context, please cite:

@misc{Gehrig2022arxiv,
 author = {Gehrig, Mathias and Muglikar, Manasi and Scaramuzza, Davide},
 title = {Dense Continuous-Time Optical Flow from Events and Frames},
 url = {https://arxiv.org/abs/2203.13674},
 publisher = {arXiv},
 year = {2022}
}

The models were pre-trained using an older version of this dataset, which was available at the time of submission. The download links above point to the up-to-date version of the dataset.

Pre-Processing Instructions

Preparation of the synthetic data involves generating input representations for the Multiflow sequences and extracting the ground-truth tracks.

To generate ground-truth tracks, run:
python data_preparation/synthetic/generate_tracks.py <path_to_multiflow_dataset> <path_to_multiflow_extras_dir>

The Multiflow Extras directory contains the data needed to train our network, such as the ground-truth tracks and the input event representations.

To generate input event representations, run:
python data_preparation/synthetic/generate_event_representations <path_to_multiflow_dataset> <path_to_multiflow_extras_dir> <representation_type>

The resulting directory structure is:

multiflow_reloaded_extra/
├─ sequence_xyz/
│  ├─ events/
│  │  ├─ 0.0100/
│  │  │  ├─ representation_abc/
│  │  │  │  ├─ 0400000.h5
│  │  │  │  ├─ 0410000.h5
│  │  ├─ 0.0200/
│  │  │  ├─ representation_abc/
│  │
│  ├─ tracks/
│  │  ├─ shitomasi.gt.txt
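
The generated .h5 files are regular HDF5 files and can be inspected with h5py, for example to check the shapes of the stored event representations. This is a sketch only; the dataset names inside the files are specific to this repository and are simply listed here rather than assumed:

import h5py
import hdf5plugin  # required so that compressed datasets can be decoded

# placeholder path following the directory structure above
path = "multiflow_reloaded_extra/sequence_xyz/events/0.0100/representation_abc/0400000.h5"
with h5py.File(path, "r") as f:
    def show(name, obj):
        if isinstance(obj, h5py.Dataset):
            print(name, obj.shape, obj.dtype)
    f.visititems(show)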

Training on Synthetic Data

Training on synthetic data involves configuring the dataset, model, and training. The high-level config is at configs/train_defaults.yaml.

To configure the dataset:

  • Set data field to mf
  • Configure the synthetic dataset in configs/data/mf.yaml
  • Set the track_name field (default is shitomasi_custom)
  • Set the event representation (default is SBT Max, referred to as time_surfaces_v2_5 here)

Important parameters in mf.yaml are:

  • augment - Whether to augment the tracks or not. The actual limits for augmentations are defined as global variables in utils/datasets.py
  • mixed_dt - Whether to use both timesteps of 0.01 and 0.02 during training.
  • n_tracks/val - Number of tracks to use for validation and training. All tracks are loaded, shuffled, then trimmed.

To configure the model, set the model field to one of the available options in configs/model. Our default model is correlation3_unscaled.

To configure the training process:

  • Set the learning rate in configs/optim/adam.yaml (Default is 1e-4)
  • In configs/training/supervised_train.yaml, set the sequence length schedule via init_unrolls, max_unrolls, unroll_factor, and the schedule. At each of the specified training steps, the number of unrolls will be multiplied by the unroll factor.
  • Configure the synthetic dataset in configs/data/mf.yaml

The last parameter to set is experiment, which is used for organizational purposes.

With everything configured, we can begin training by running:

CUDA_VISIBLE_DEVICES=<gpu_id> python train.py

Hydra will then instantiate the dataloader and model, and PyTorch Lightning will handle the training and validation loops. All outputs (checkpoints, GIFs, etc.) will be written to the log directory.
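
For readers unfamiliar with this setup, the snippet below illustrates, in simplified form, how a Hydra config typically drives a PyTorch Lightning run. It is a sketch only, not the repository's actual train.py; the config keys and instantiation targets are illustrative assumptions:

import hydra
import pytorch_lightning as pl
from omegaconf import DictConfig

@hydra.main(config_path="configs", config_name="train_defaults")
def main(cfg: DictConfig) -> None:
    model = hydra.utils.instantiate(cfg.model)  # tracker network (illustrative config key)
    data = hydra.utils.instantiate(cfg.data)    # dataloaders / datamodule (illustrative config key)
    trainer = pl.Trainer()                      # in the real train.py, trainer options come from the config
    trainer.fit(model, datamodule=data)

if __name__ == "__main__":
    main()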

The correlation_unscaled model inherits from models/template.py, which contains the core logic for training and validation. At each training step, event patches are fetched for each feature track (via TrackData instances) and concatenated prior to inference. Following inference, the TrackData instances accumulate the predicted feature displacements. The template file also contains the validation logic for visualization and metric computation.
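
The following toy example illustrates the batching and accumulation pattern described above. It is hypothetical code, not the repository's TrackData implementation, and only mirrors the described flow (extract a patch per track, run one batched inference, add the predicted displacements back onto the tracks):

import torch

class ToyTrackData:
    def __init__(self, xy):
        self.xy = torch.as_tensor(xy, dtype=torch.float32)  # current feature location (x, y)

    def extract_patch(self, event_repr, size=31):
        # crop a (channels, size, size) event patch centred on the current location
        cx, cy = self.xy.round().long().tolist()
        x0, y0 = cx - size // 2, cy - size // 2
        return event_repr[:, y0:y0 + size, x0:x0 + size]

    def accumulate(self, displacement):
        self.xy = self.xy + displacement  # apply the predicted displacement

tracks = [ToyTrackData([120.0, 64.0]), ToyTrackData([200.0, 150.0])]
event_repr = torch.zeros(5, 256, 320)  # dummy event representation (channels, height, width)
patches = torch.stack([t.extract_patch(event_repr) for t in tracks])  # concatenated network input
displacements = torch.zeros(len(tracks), 2)  # stand-in for the network's per-track prediction
for track, d in zip(tracks, displacements):
    track.accumulate(d)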

To inspect models during training, we can launch an instance of tensorboard for the log directory: tensorboard --logdir <log_dir>.


Preparing Pose Data

To prepare pose data for fine-tuning, we need to rectify the data, run colmap, and generate event representations.

To rectify the data, run python data_preparation/real/rectify_ec.py or python data_preparation/real/eds_rectify_events_and_frames.py.

To refine the pose data with COLMAP, see data_preparation/colmap.py. We first run colmap.py generate, which converts the pose data into a format readable by COLMAP to serve as an initial guess and writes it to the colmap directory of the sequence. We then follow the instructions here from the COLMAP FAQ regarding refining poses.

Essentially:

  1. Navigate to the colmap directory of a sequence
  2. colmap feature_extractor --database_path database.db
  3. colmap exhaustive_matcher --database_path database.db --image_path ../images_corrected
  4. colmap point_triangulator --database_path database.db --image_path ../images_corrected/ --input_path . --output_path .
  5. Launch the COLMAP GUI, import the model files, and re-run Bundle Adjustment, ensuring that only the extrinsics are refined.
  6. Run colmap.py extract to convert the pose data from COLMAP format back to our standard format.

To generate event representations, run python data_preparation/real/prepare_eds_pose_supervision.py or prepare_ec_pose_supervision.py. These scripts generate r event representations between frames. The time-window of the last event representation in the interval is trimmed. Currently, these scripts only support SBT-Max as a representation.


Training on Pose Data

To train on pose data, we again need to configure the dataset, model, and training. The model configuration is the same as before. The data field now needs to be set to pose_ec, and configs/data/pose_ec.yaml must be configured.

Important parameters to set in pose.yaml include:

  • root_dir - Directory with prepared pose data sequences.
  • n_frames_skip - How many frames to skip when chunking a sequence into several sub-sequences for pose training.
  • n_event_representations_per_frame - r value used when generating the event representations.

In terms of dataset configuration, we must also set pose_mode = True in utils/dataset.py. This overrides the loading of event representations from the time-step directories (e.g., 0.001) and loads them instead from the pose data directories (e.g., pose_3).

In terms of the training process, pose supervision uses a single sequence length, so init_unrolls and max_unrolls should be set to the same value. Also, the schedule should contain a single value indicating when to stop training. The default learning rate for pose supervision is 1e-6.

Since we are fine-tuning on pose, we must also set the checkpoint_path in configs/training/pose_finetuning_train_ec.yaml to the path of our pretrained model.

We are then ready to run train.py and fine-tune the network. Again, during training, we can launch tensorboard. For pose supervision, the re-projected features are visualized.


Running Ours

Preparing Input Data

The SequenceDataset class is responsible for loading data for inference. It expects the sequence data in a format similar to the one used for synthetic training:

sequence_xyz/
├─ events/
│  ├─ 0.0100/
│  │  ├─ representation_abc/
│  │  │  ├─ 0000000.h5
│  │  │  ├─ 0010000.h5
├─ images_corrected/
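
As a quick check that a sequence has been prepared correctly, the layout above can be verified with a few lines of Python (a sketch; the sequence, time-step, and representation names are placeholders taken from the tree above):

from pathlib import Path

seq_dir = Path("sequence_xyz")
repr_dir = seq_dir / "events" / "0.0100" / "representation_abc"
h5_files = sorted(repr_dir.glob("*.h5"))
print(f"{len(h5_files)} event representations found")
print("images_corrected present:", (seq_dir / "images_corrected").is_dir())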

To prepare a single sequence for inference, we rectify the sequence, crop a sequence segment, and generate event representations.

Rectification

For the EDS dataset, we download the txt-based version of a sequence and run data_preparation/real/eds_rectify_events_and_frames.py.
For the Event-Camera dataset, we download the txt-based version of a sequence and run data_preparation/real/rectify_ec.py.

Sequence Cropping and Event Generation

For the EDS dataset, we run data_preparation/real/prepare_eds_subseq with the index range for the cropped sequence as inputs. This will generate a new sub-sequence directory, copy the relevant frames for the selected indices, and generate event representations.

Inference

The inference script is evaluate_real.py and the configuration file is eval_real_defaults.yaml. We must set the event representation and checkpoint path before running the script.

The list of sequences is defined in the EVAL_DATASETS variable in evaluate_real.py. The script iterates over these sequences, instantiates a SequenceDataset instance for each one, and performs inference on the event representations generated in the previous section.

For benchmarking, the provided feature points need to be downloaded and used to ensure that all methods use the same features. The gt_path in eval_real_defaults.yaml needs to be set to the directory containing the text files.


Evaluation

Once we have predicted tracks for a sequence from all methods, we can benchmark their performance using scripts/benchmark.py. This script loads the predicted tracks of each method and compares them against the re-projected, frame-based ground-truth tracks, which can be downloaded here.

Inside scripts/benchmark.py, the evaluation sequences, the results directory, the output directory, and the names of the evaluated methods <method_name_x> need to be specified. The results directory should have the following structure:

sequence_xyz/
├─ gt/
│  ├─ <seq_0>.gt.txt
│  ├─ <seq_1>.gt.txt
│  ├─ ...
├─ <method_name_1>/
│  ├─ <seq_0>.txt
│  ├─ <seq_1>.txt
│  ├─ ...
├─ <method_name_2>/
│  ├─ <seq_0>.txt
│  ├─ <seq_1>.txt
│  ├─ ...

The results are printed to the console and written to a CSV in the output directory.
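
Before running the benchmark, it can be useful to verify that every method directory contains a prediction file for each ground-truth sequence. The helper below is a sketch only (it is not part of scripts/benchmark.py, and the results path and method names are placeholders):

from pathlib import Path

results_dir = Path("path/to/results")              # placeholder results directory
methods = ["<method_name_1>", "<method_name_2>"]   # placeholder method names

gt_files = sorted((results_dir / "gt").glob("*.gt.txt"))
sequences = [f.name[:-len(".gt.txt")] for f in gt_files]
for method in methods:
    missing = [s for s in sequences if not (results_dir / method / f"{s}.txt").exists()]
    print(method, "complete" if not missing else f"missing: {missing}")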
