TartanVO: A Generalizable Learning-based VO

TartanVO is a learning-based visual odometry network trained on the TartanAir dataset. It generalizes to multiple datasets and real-world scenarios, and outperforms geometry-based methods in challenging scenes. If you are using Python 3, check out the python3 branch.

Please check out our Paper.

Example

Introduction video: YouTube / Bilibili

Our model is trained purely on simulation data, but it generalizes well to real-world data. For example, these are the testing results on KITTI-10 and EuRoC V102:

[Figure: testing results on KITTI-10 and EuRoC V102]

Setting up the environment with Docker

We provide a prebuilt docker image and a dockerfile, which allow you to replicate our setup. The docker image contains everything needed for testing this repo, including CUDA, PyTorch, CuPy, OpenCV, and ROS Melodic. Here are the steps to set up the container.

  1. Install docker and nvidia-docker. You can find online tutorials like this.
  2. Run the docker image and mount the repository into the container; the following commands will automatically download the docker image.
$ git clone https://github.com/castacks/tartanvo.git
$ cd tartanvo
$ nvidia-docker run -it --rm --network host --ipc=host -v $PWD:/tartanvo amigoshan/tartanvo:latest
$ cd tartanvo
  3. Now everything is set up. Continue the following steps inside the container.

The above docker image was built on an Ubuntu machine with NVIDIA driver 440.100. Alternatively, you can build the docker image yourself from the provided dockerfile:

$ cd docker 
$ docker build -t tartanvo -f tartanvo_ros.dockerfile .

Running without docker

This repo has the following dependencies:

  • Python 2 / 3
  • numpy
  • matplotlib
  • scipy
  • pytorch >= 1.3
  • opencv-python
  • cupy

You can install the above dependencies manually, or use the following command:

$ pip install numpy matplotlib scipy torch==1.4.0 opencv-python==4.2.0.32 cupy==6.7.0

Our code has been tested on Ubuntu 18.04 and 16.04. An NVIDIA driver and CUDA 9.2 or 10.2 are required to run the code.

Testing with a pretrained model

Download the pretrained model

$ mkdir models
$ wget https://cmu.box.com/shared/static/t1a5u4x6dxohl89104dyrsiz42mvq2sz.pkl -O models/tartanvo_1914.pkl

Download the testing data

  • Download KITTI-10 testing trajectory
$ mkdir data
$ wget https://cmu.box.com/shared/static/nw3bi7x5vng2xy296ndxt19uozpk64jq.zip -O data/KITTI_10.zip
$ unzip -q data/KITTI_10.zip -d data/KITTI_10/

You can also download other trajectories from the KITTI website. Note that each trajectory may have different intrinsics, so make sure you change the values in Datasets/utils.py accordingly. Also, we use the same pose-file format as the TartanAir dataset; you can use Datasets/transformation/kitti2tartan() to convert KITTI pose files, as sketched below.
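
For illustration, here is a minimal sketch of the conversion idea, assuming a recent SciPy. It only turns each KITTI 3x4 pose row into the x y z qx qy qz qw layout and does not handle the frame-convention change, so prefer the provided kitti2tartan() in practice; the file paths are hypothetical:

import numpy as np
from scipy.spatial.transform import Rotation

def kitti_pose_line_to_xyzquat(line):
    # A KITTI pose line holds 12 values: the row-major 3x4 matrix [R | t].
    vals = np.array(line.split(), dtype=np.float64).reshape(3, 4)
    R, t = vals[:, :3], vals[:, 3]
    quat = Rotation.from_matrix(R).as_quat()   # (qx, qy, qz, qw)
    return np.concatenate([t, quat])           # x y z qx qy qz qw

with open('poses/10.txt') as f:                # hypothetical KITTI pose file
    poses = np.array([kitti_pose_line_to_xyzquat(l) for l in f])
np.savetxt('pose_left.txt', poses)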

  • Download EuRoC-V102 testing trajectory
$ mkdir data
$ wget https://cmu.box.com/shared/static/1ctocidptdv1xev6pjxdj0b782mrstps.zip -O data/EuRoC_V102.zip
$ unzip -q data/EuRoC_V102.zip -d data/EuRoC_V102/

You can download other trajectories from the EuRoC dataset. What we did to prepare the sample data EuRoC_V102 was the following:

  • Undistort the images according to the provided calibration results (see the sketch below).
  • Match the images with the ground-truth poses according to the timestamps.
  • Change the intrinsics parameters in the code.

Note that the poses output by TartanVO are in the NED frame.
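
For reference, a minimal sketch of the undistortion step, using the pinhole + radial-tangential calibration listed in EuRoC's cam0 sensor.yaml (verify the numbers against your own download; the file name is a hypothetical EuRoC timestamp):

import cv2
import numpy as np

# cam0 intrinsics and radtan distortion as listed in EuRoC's sensor.yaml.
K = np.array([[458.654,   0.0,   367.215],
              [  0.0,   457.296, 248.375],
              [  0.0,     0.0,     1.0]])
dist = np.array([-0.28340811, 0.07395907, 0.00019359, 1.76187114e-05])

img = cv2.imread('cam0/data/1403715273262142976.png')
undistorted = cv2.undistort(img, K, dist)
cv2.imwrite('image_left/1403715273262142976.png', undistorted)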

Alternative download links:

In case the above links are slow, please try the following mirror links on Azure, replacing the Box links with the Azure ones. Instead of wget, the azcopy tool is usually much faster.
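
For example, with azcopy (the URL below is a placeholder; substitute the actual Azure link):

$ azcopy copy '<azure-url-of-KITTI_10.zip>' data/KITTI_10.zip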

Try Baidu Cloud or Google Drive if neither Box nor Azure works.

Run the testing script

The vo_trajectory_from_folder.py script shows an example of running TartanVO on a sequence of images from a folder. Because TartanVO outputs up-to-scale translations, the script also reads a pose file to adjust the translation scale (a sketch of the idea follows below). If no pose file is provided, a default scale of 1.0 is used. The results are stored in the results folder.
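
The scale correction amounts to rescaling each predicted translation so its norm matches the ground-truth translation norm for that frame pair. A minimal sketch of the idea (not necessarily the exact logic in vo_trajectory_from_folder.py):

import numpy as np

def rescale_translation(t_pred, t_gt, eps=1e-8):
    # Match the norm of the up-to-scale prediction to the ground-truth norm.
    scale = np.linalg.norm(t_gt) / max(np.linalg.norm(t_pred), eps)
    return t_pred * scale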

  • Testing on KITTI
$ python vo_trajectory_from_folder.py  --model-name tartanvo_1914.pkl \
                                       --kitti \
                                       --batch-size 1 --worker-num 1 \
                                       --test-dir data/KITTI_10/image_left \
                                       --pose-file data/KITTI_10/pose_left.txt 
  • Testing on EuRoC

$ python vo_trajectory_from_folder.py  --model-name tartanvo_1914.pkl \
                                       --euroc \
                                       --batch-size 1 --worker-num 1 \
                                       --test-dir data/EuRoC_V102/image_left \
                                       --pose-file data/EuRoC_V102/pose_left.txt

Running the above commands with the --save-flow flag allows you to save the intermediate optical flow outputs into the results folder.

Adjust the batch size and the number of workers with, e.g., --batch-size 10 --worker-num 5.

Run the ROS node

We provide a Python ROS node in tartanvo_node.py for easy integration of TartanVO into robotic systems.

How does TartanVONode work?

  1. Subscribed topics

    • rgb_image (sensor_msgs/Image): RGB image.
    • cam_info (sensor_msgs/CameraInfo): camera parameters, used to calculate the intrinsics layer.
    • vo_scale (std_msgs/Float32): scale of the translation (should be published at the same frequency as the image). If this is not provided, a default value of 1.0 will be used.
  2. Published topics

    • tartanvo_pose (geometry_msgs/PoseStamped): position and orientation of the camera.
    • tartanvo_odom (nav_msgs/Odometry): position and orientation of the camera (same as tartanvo_pose).
  3. Parameters: We use the following parameters to calculate the initial intrinsics layer (see the sketch after this list). If the cam_info topic is received, these values will be overwritten.

    • image_width : image width
    • image_height : image height
    • focal_x : camera focal length (x)
    • focal_y : camera focal length (y)
    • center_x : camera optical center (x)
    • center_y : camera optical center (y)
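
For illustration, a minimal sketch of how an intrinsics layer can be formed from these parameters, following the two-channel normalized-coordinate formulation in the TartanVO paper (a function with this purpose lives in Datasets/utils.py; this standalone version is an assumption about its behavior, not a copy of it):

import numpy as np

def make_intrinsics_layer(w, h, fx, fy, ox, oy):
    # Two channels of normalized pixel coordinates: ((u-ox)/fx, (v-oy)/fy).
    uu, vv = np.meshgrid(np.arange(w, dtype=np.float32),
                         np.arange(h, dtype=np.float32))
    return np.stack([(uu - ox) / fx, (vv - oy) / fy], axis=-1)  # (h, w, 2)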

Run the ROS node

  1. Start a ROS core:
$ roscore
  2. Run the TartanVO node:
$ python tartanvo_node.py
  3. Publish the images and scales, e.g., by running the following example (a minimal publisher sketch also follows below):
$ python publish_image_from_folder.py
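
For illustration, a minimal publisher that feeds TartanVONode, assuming the topic names documented above and a cv_bridge install (as in the docker image); the image path is hypothetical:

#!/usr/bin/env python
import cv2
import rospy
from cv_bridge import CvBridge
from sensor_msgs.msg import Image
from std_msgs.msg import Float32

rospy.init_node('image_feeder')
img_pub = rospy.Publisher('rgb_image', Image, queue_size=1)
scale_pub = rospy.Publisher('vo_scale', Float32, queue_size=1)
bridge = CvBridge()
rate = rospy.Rate(10)  # publish at 10 Hz

img = cv2.imread('data/KITTI_10/image_left/000000.png')  # hypothetical frame
while not rospy.is_shutdown():
    img_pub.publish(bridge.cv2_to_imgmsg(img, encoding='bgr8'))
    scale_pub.publish(Float32(1.0))  # scale at the same frequency as the image
    rate.sleep()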

If you open RViz and use this config file, you can see the visualization as follows:

[Figure: RViz visualization of the TartanVO output]

Paper

More technical details are available in the TartanVO paper. Please cite it as:

@inproceedings{tartanvo2020corl,
  title =   {TartanVO: A Generalizable Learning-based VO},
  author =  {Wang, Wenshan and Hu, Yaoyu and Scherer, Sebastian},
  booktitle = {Conference on Robot Learning (CoRL)},
  year =    {2020}
}
@inproceedings{tartanair2020iros,
  title =   {TartanAir: A Dataset to Push the Limits of Visual SLAM},
  author =  {Wang, Wenshan and Zhu, Delong and Wang, Xiangwei and Hu, Yaoyu and Qiu, Yuheng and Wang, Chen and Hu, Yafei and Kapoor, Ashish and Scherer, Sebastian},
  booktitle = {2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year =    {2020}
}

License

This software is BSD licensed.

Copyright (c) 2020, Carnegie Mellon University. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
