Multi Camera Pose Estimation
Code for estimating the absolute pose of a multi-camera system from a set of 2D-3D matches.
Setup
This project has the following dependencies:
RansacLib and PoseLib are included as a submodule. After cloning the repository, run
git submodule update --init --recursive
To compile the project (under Linux), simple type
mkdir build
cd build/
cmake -DCMAKE_BUILD_TYPE=Release ../
make
Running the executables
There are two executables" fixed_rig_camera_pose
and multi_camera_pose
.
fixed_rig_camera_pose
assumes that the absolute scale of the transformation between the images is known.
multi_camera_pose
does not require the scale to be known, e.g., when the poses are estimated by SLAM, but rather
estimates the scale. Both executables define multi-camera rigs define from sequences of images and expect a list of 2D-3D
matches to be given for each image in the multi-camera system. Both executables share the same command line parameters:
images_with_intrinsics
is the file name of a text file that contains image names, camera intrinsics, and camera extrinsics. Each line consists of the following information:image_name
: the name of the image.intrinsics
: the intrinsics of the image, consisting of a camera type and the parameters. We use Colmaps camera definition. Please see Colmap's camera definitions. An example for this part isSIMPLE_RADIAL 1024 1024 640.0 512 512 0.2
.- The camera extrinsics in the form
qw qx qy qz cx cy cz
, whereqw qx qy qz
is a unit quaternion defining a rotation from world to camera coordinates for this image andcx cy cz
is the position of the image in world coordinates. I.e., a pointXw
in world coordinates is transformed into the local camera coordinate system of the image asXc = R * (Xw - c)
, whereR
is the rotation matrix defined by the quaternion andc
is the position of the image in the world coordinate system. These poses can be defined in an arbitrary coordinate frame. The executables will automatically extract relative poses between the images.
outfile
is the file name of a text file into which the estimated poses will be written. For each image inimages_with_intrinsics
, a pose will be written (if a corresponding pose can be estimated) in a line of the output file. The format of that line isimage_name qw qx qy qz tx ty tz
. Hereimage_name
is the name of the image (as specified inimages_with_intrinsics
,qw qx qy qz
again is a unit quaternion defining the rotation from the coordinate system of the 3D points (see below) to the local camera coordinate system andtx ty tz
is the corresponding translation. IfR
is the rotation matrix corresponding to the quaternion, then a pointXw
in world coordinates is transformed into the local camera coordinate system asXc = R * Xw + t
, wheret
is the translation vector given bytx ty tz
.inlier_threshold
: the inlier threshold to be used in RANSAC, given in pixels.num_lo_steps
: the number of local optimization steps performed in RANSAC whenever a new best minimal pose is found.invert_Y_Z
: the 2D-3D matches for each image are read from text files, where each line has the formatx y X Y Z
. Here,x y
defines a 2D keypoint. Set this variable to1
ifx y
is given in a coordinate system where the y-axis is pointing upwards.points_centered
: set to1
if the 2D keypoint coordinatesx y
are already centered around the principal point. If set to0
, the executables will center the keypoints before using them.undistortion_needed
: set to1
if the 2D keypoint coordinates need to be undistorted and to0
if the keypoints that are read from the text files are already undistorted.sequence_length
: both executables assume that the images specified inimages_with_intrinsics
are given in sequential order. If set tok
, e.g.,3
,fixed_rig_camera_pose
will use the firstk
images to define a multi-camera system and attempt to localize them jointly. It will then use the nextk
images to define the next multi-camera rig, etc.multi_camera_pose
will use the firstk
images to define the first multi-camera system. It will then use images2, ..., k+1
to define the next multi-camera system, then3, ..., k + 2
, etc.[match-file postfix]
(optional): the postfix of the match files, set to.individual_datasets.matches.txt
per default. For a given image namea.jpg
inimages_with_intrinsics
, both executables will attempt to load 2D-3D matches from a text file calleda.jpg.individual_datasets.matches.txt
(for the default value). The text file stores each 2D-3D match in a single line, with the formatx y X Y Z
. Here,x y
is the 2D coordinate of the matching 2D point andX Y Z
is the corresponding 3D point.
Simply calling one of the programs without parameters will give you a list of command line arguments.
Citing
When using this software for publications, please cite
@inproceedings{wald2020,
title={{Beyond Controlled Environments: 3D Camera Re-Localization in Changing Indoor Scenes}},
author={Wald, Johanna and Sattler, Torsten and Golodetz, Stuart and Cavallari, Tommaso and Tombari, Federico},
booktitle = {Proceedings IEEE European Conference on Computer Vision (ECCV)},
year = {2020}
}
When using the pose estimators based on the GP3P or GP4Ps solvers, please cite
@InProceedings{Kukelova2016CVPR,
author = {Kukelova, Zuzana and Heller, Jan and Fitzgibbon, Andrew},
title = {Efficient Intersection of Three Quadrics and Applications in Computer Vision},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2016}
}