Evaluation
System Evaluation
Our evaluation toolkit is open source and available at eval-vislam.
Prerequisites
Accuracy (APE, RPE, ARE, RRE, Completeness)
Usage:
./accuracy <groundtruth> <input> <fix scale>
Arguments:
<groundtruth> Path to sequence folder, e.g. ~/VISLAM-Dataset/A0.
<input> SLAM camera trajectory file in TUM format (timestamp[s] px py pz qx qy qz qw).
<fix scale> Set to 1 for VISLAM, set to 0 for VSLAM.
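As a rough illustration of the expected input, the following Python sketch writes a trajectory in the TUM layout described above, one pose per line as `timestamp[s] px py pz qx qy qz qw`. The function name `write_tum_trajectory`, the in-memory pose layout, and the number of decimal places are assumptions made for this example; they are not part of the toolkit.

```python
from pathlib import Path


def write_tum_trajectory(path, poses):
    """Write poses to `path`, one line per pose, in TUM field order.

    `poses` is an iterable of (timestamp_s, (px, py, pz), (qx, qy, qz, qw)).
    The pose layout and the decimal precision are choices made for this
    sketch, not requirements stated by the toolkit.
    """
    lines = []
    for timestamp, (px, py, pz), (qx, qy, qz, qw) in poses:
        fields = [f"{timestamp:.9f}"]
        fields += [f"{v:.6f}" for v in (px, py, pz)]
        fields += [f"{v:.6f}" for v in (qx, qy, qz, qw)]
        lines.append(" ".join(fields))
    Path(path).write_text("\n".join(lines) + "\n")
```

Calling `write_tum_trajectory("trajectory.txt", poses)` on the poses estimated by a SLAM system should then yield a file that can be passed as `<input>`.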
Initialization Scale Error and Time
Usage:
./initialization <groundtruth> <input> <has inertial>
Arguments:
<groundtruth> Path to sequence folder, e.g. ~/VISLAM-Dataset/A0.
<input> SLAM camera trajectory file in TUM format (timestamp[s] px py pz qx qy qz qw).
<has inertial> Set to 1 for VISLAM, set to 0 for VSLAM.
Robustness
Usage:
./robustness <groundtruth> <input> <fix scale>
Arguments:
<groundtruth> Path to sequence folder, e.g. ~/VISLAM-Dataset/A0.
<input> SLAM camera trajectory file in TUM format (timestamp[s] px py pz qx qy qz qw).
<fix scale> Set to 1 for VISLAM, set to 0 for VSLAM.
Relocalization Time
Usage:
./relocalization <groundtruth> <input> <has inertial>
Arguments:
<groundtruth> Path to sequence folder, e.g. ~/VISLAM-Dataset/A0.
<input> SLAM camera trajectory file in TUM format (timestamp[s] px py pz qx qy qz qw).
<has inertial> Set to 1 for VISLAM, set to 0 for VSLAM.
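All four tools take the same three arguments, so they can be driven from one small script. The sketch below is one possible way to do that in Python. It assumes the compiled binaries sit in the current working directory, and the helper names `build_command` and `run_tool` are hypothetical. The tools' output format is not described here, so the sketch returns their standard output unparsed.

```python
import os
import subprocess

# Tool names as they appear in the usage lines above.
TOOLS = ("accuracy", "initialization", "robustness", "relocalization")


def build_command(tool, groundtruth, trajectory, visual_inertial):
    """Return the argument list for one evaluation tool.

    The last argument is 1 for VISLAM and 0 for VSLAM, matching the
    <fix scale> / <has inertial> descriptions above.
    """
    if tool not in TOOLS:
        raise ValueError(f"unknown tool: {tool}")
    flag = "1" if visual_inertial else "0"
    # Expand "~" here because no shell is involved to do it for us.
    return [f"./{tool}", os.path.expanduser(groundtruth), trajectory, flag]


def run_tool(tool, groundtruth, trajectory, visual_inertial):
    """Run one tool and return whatever it printed to standard output."""
    result = subprocess.run(
        build_command(tool, groundtruth, trajectory, visual_inertial),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

For example, `run_tool("accuracy", "~/VISLAM-Dataset/A0", "trajectory.txt", True)` would evaluate a visual-inertial trajectory against sequence A0.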
Representative Monocular SLAM Results
Because these SLAM systems are nondeterministic, repeated runs on the same sequence can produce different results. We therefore ran each benchmark 10 times and report the average.
Some algorithms produce inconsistent results on some sequences. We identified these failure cases (runs with unusually large APEs) by inspection, removed them, and averaged the remaining runs.
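The averaging procedure can be sketched as follows. This is a minimal Python illustration, not the script used to produce the tables: the failed runs are passed in explicitly, because the source says they were identified by inspection rather than by an automatic threshold. The function name and the sample numbers are hypothetical.

```python
from statistics import mean


def average_without_failures(values, failed_runs):
    """Average one metric over repeated runs, skipping runs marked as failed.

    `values` holds one number per run (for example, the APE of each run).
    `failed_runs` is the set of run indices judged to be failures by
    manual inspection; nothing is detected automatically here.
    """
    kept = [v for i, v in enumerate(values) if i not in failed_runs]
    if not kept:
        raise ValueError("every run was marked as a failure")
    return mean(kept)


# Hypothetical APEs from four runs; run 3 was judged a failure on inspection.
example_apes = [10.0, 12.0, 11.0, 500.0]
example_average = average_without_failures(example_apes, failed_runs={3})
```

With the hypothetical values above, only the first three runs contribute to the average.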
Number of failures per sequence (each SLAM system was run 10 times on each sequence)
Seq | PTAM | ORB-SLAM2 | LSD-SLAM | DSO | MSCKF | OKVIS | VINS-Mono | VINS-Mono-NoLoop | SenseSLAM v1.0 |
---|---|---|---|---|---|---|---|---|---|
A0 | 1 | 0 | 0 | 0 | 0 | 4 | 0 | 0 | 0 |
A1 | 3 | 5 | 1 | 6 | 3 | 0 | 0 | 0 | 0 |
A2 | 1 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 0 |
A3 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 |
A4 | 0 | 1 | 1 | 0 | 0 | 2 | 0 | 0 | 0 |
A5 | 1 | 0 | 0 | 2 | 0 | 6 | 0 | 0 | 0 |
A6 | 3 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 |
A7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
VSLAM Tracking accuracy
Sequence | PTAM APE | PTAM RPE | ORB-SLAM2 APE | ORB-SLAM2 RPE | LSD-SLAM APE | LSD-SLAM RPE | DSO APE | DSO RPE |
---|---|---|---|---|---|---|---|---|
A0 | 75.442 | 6.696 | 96.777 | 5.965 | 105.963 | 11.761 | 231.860 | 10.456 |
A1 | 113.406 | 16.344 | 95.379 | 10.285 | 221.643 | 23.833 | 431.929 | 12.555 |
A2 | 67.099 | 6.833 | 69.486 | 5.706 | 310.963 | 8.156 | 216.893 | 5.337 |
A3 | 10.913 | 4.627 | 15.310 | 7.386 | 199.445 | 10.872 | 188.989 | 4.294 |
A4 | 21.007 | 4.773 | 10.061 | 2.995 | 155.692 | 10.756 | 115.477 | 4.595 |
A5 | 40.403 | 8.926 | 29.653 | 11.717 | 249.644 | 12.302 | 323.482 | 7.978 |
A6 | 19.483 | 3.051 | 12.145 | 6.741 | 49.805 | 3.018 | 14.864 | 2.561 |
A7 | 13.503 | 2.462 | 5.832 | 1.557 | 38.673 | 2.662 | 27.142 | 2.213 |
Sequence | PTAM ARE | PTAM RRE | ORB-SLAM2 ARE | ORB-SLAM2 RRE | LSD-SLAM ARE | LSD-SLAM RRE | DSO ARE | DSO RRE |
---|---|---|---|---|---|---|---|---|
A0 | 12.051 | 0.257 | 5.119 | 0.342 | 20.589 | 0.371 | 9.983 | 0.401 |
A1 | 53.954 | 0.291 | 8.534 | 0.242 | 51.122 | 0.288 | 39.007 | 0.524 |
A2 | 8.789 | 0.301 | 5.550 | 0.255 | 30.282 | 0.296 | 10.584 | 0.253 |
A3 | 6.225 | 0.293 | 1.431 | 0.264 | 31.370 | 0.475 | 20.580 | 0.241 |
A4 | 6.295 | 0.255 | 1.015 | 0.157 | 9.592 | 0.498 | 5.217 | 0.180 |
A5 | 14.030 | 0.452 | 1.963 | 0.546 | 36.789 | 0.810 | 40.939 | 0.324 |
A6 | 2.348 | 0.217 | 0.892 | 0.169 | 5.012 | 0.207 | 1.435 | 0.189 |
A7 | 1.218 | 0.153 | 0.569 | 0.115 | 3.052 | 0.147 | 2.239 | 0.135 |
Sequence | PTAM | ORB-SLAM2 | LSD-SLAM | DSO |
---|---|---|---|---|
A0 | 79.386 | 65.175 | 49.513 | 14.476 |
A1 | 60.893 | 68.303 | 11.511 | 0.869 |
A2 | 85.348 | 79.263 | 21.804 | 22.878 |
A3 | 71.635 | 98.497 | 27.112 | 43.493 |
A4 | 95.418 | 100.000 | 64.283 | 80.371 |
A5 | 87.399 | 97.785 | 25.033 | 2.059 |
A6 | 97.399 | 99.786 | 94.883 | 100.000 |
A7 | 100.000 | 100.000 | 98.663 | 100.000 |
VISLAM Tracking accuracy
Sequence | MSCKF APE | MSCKF RPE | OKVIS APE | OKVIS RPE | VINS-Mono APE | VINS-Mono RPE | VINS-Mono-NoLoop APE | VINS-Mono-NoLoop RPE | SenseSLAM v1.0 APE | SenseSLAM v1.0 RPE |
---|---|---|---|---|---|---|---|---|---|---|
A0 | 156.018 | 7.436 | 71.677 | 7.064 | 63.395 | 3.510 | 75.388 | 2.497 | 58.995 | 2.525 |
A1 | 294.091 | 14.580 | 87.73 | 4.283 | 80.687 | 3.472 | 161.444 | 1.676 | 55.097 | 2.876 |
A2 | 102.657 | 10.151 | 68.381 | 5.412 | 74.842 | 8.605 | 56.562 | 1.334 | 36.370 | 1.560 |
A3 | 44.493 | 3.780 | 22.949 | 8.739 | 19.964 | 1.234 | 23.643 | 0.837 | 17.792 | 0.779 |
A4 | 114.845 | 8.338 | 146.89 | 12.46 | 18.691 | 1.091 | 21.532 | 0.953 | 15.558 | 0.930 |
A5 | 82.885 | 8.388 | 77.924 | 7.588 | 42.451 | 2.964 | 49.790 | 1.473 | 34.810 | 1.954 |
A6 | 66.001 | 6.761 | 63.895 | 6.86 | 26.240 | 1.167 | 27.088 | 0.683 | 20.467 | 0.569 |
A7 | 105.492 | 4.576 | 47.465 | 6.352 | 18.226 | 1.465 | 19.973 | 0.746 | 10.777 | 0.831 |
Sequence | MSCKF ARE | MSCKF RRE | OKVIS ARE | OKVIS RRE | VINS-Mono ARE | VINS-Mono RRE | VINS-Mono-NoLoop ARE | VINS-Mono-NoLoop RRE | SenseSLAM v1.0 ARE | SenseSLAM v1.0 RRE |
---|---|---|---|---|---|---|---|---|---|---|
A0 | 6.584 | 0.203 | 3.637 | 0.741 | 3.441 | 0.205 | 3.378 | 0.206 | 3.660 | 0.197 |
A1 | 8.703 | 0.135 | 5.14 | 1.098 | 1.518 | 0.088 | 1.470 | 0.088 | 2.676 | 0.092 |
A2 | 3.324 | 0.195 | 2.493 | 0.869 | 1.775 | 0.201 | 1.766 | 0.201 | 1.674 | 0.181 |
A3 | 6.952 | 0.186 | 2.459 | 0.825 | 2.121 | 0.176 | 2.443 | 0.176 | 1.642 | 0.182 |
A4 | 4.031 | 0.104 | 3.765 | 0.603 | 1.185 | 0.063 | 1.419 | 0.063 | 1.129 | 0.071 |
A5 | 4.928 | 0.167 | 8.843 | 0.360 | 3.000 | 0.040 | 4.521 | 0.040 | 2.041 | 0.089 |
A6 | 2.625 | 0.170 | 2.275 | 0.629 | 1.478 | 0.131 | 1.511 | 0.131 | 1.656 | 0.134 |
A7 | 6.810 | 0.120 | 3.536 | 0.602 | 1.248 | 0.073 | 0.842 | 0.073 | 0.502 | 0.082 |
Sequence | MSCKF | OKVIS | VINS-Mono | VINS-Mono-NoLoop | SenseSLAM v1.0 |
---|---|---|---|---|---|
A0 | 40.186 | 94.255 | 92.546 | 82.945 | 97.317 |
A1 | 1.646 | 98.235 | 86.508 | 19.674 | 95.072 |
A2 | 61.423 | 94.959 | 88.301 | 92.389 | 99.707 |
A3 | 97.814 | 95.972 | 100.000 | 100.000 | 100.000 |
A4 | 76.629 | 97.429 | 100.000 | 100.000 | 100.000 |
A5 | 76.738 | 98.162 | 98.795 | 98.733 | 99.143 |
A6 | 94.128 | 97.805 | 100.000 | 100.000 | 100.000 |
A7 | 68.341 | 96.690 | 100.000 | 100.000 | 100.000 |
Initialization quality
Sequence | PTAM | ORB-SLAM2 | LSD-SLAM | DSO | MSCKF | OKVIS | VINS-Mono | VINS-Mono-NoLoop | SenseSLAM v1.0 |
---|---|---|---|---|---|---|---|---|---|
A0 | 13.914 | 2.040 | 1.615 | 3.783 | 1.154 | 1.067 | 0.895 | 0.900 | 1.840 |
A1 | 18.334 | 6.930 | 25.578 | 15.598 | 5.182 | 2.892 | 3.220 | 3.135 | 3.674 |
A2 | 3.087 | 1.945 | 4.980 | 1.321 | 3.820 | 1.155 | 0.584 | 0.641 | 2.154 |
A3 | 1.667 | 0.974 | 0.810 | 0.683 | 3.730 | 0.690 | 1.254 | 1.273 | 0.764 |
A4 | 12.059 | 2.777 | 6.404 | 1.793 | 2.872 | 10.997 | 1.751 | 1.783 | 2.967 |
A5 | 18.743 | 4.062 | 12.934 | 17.815 | 4.366 | 2.119 | 1.866 | 1.895 | 1.183 |
A6 | 2.415 | 2.794 | 5.655 | 2.699 | 6.712 | 6.696 | 2.246 | 2.132 | 1.484 |
A7 | 1.037 | 0.772 | 1.624 | 0.671 | 9.532 | 1.413 | 1.164 | 1.126 | 0.835 |
Average | 8.907 | 2.787 | 7.450 | 5.545 | 4.671 | 3.379 | 1.622 | 1.611 | 1.863 |
Max | 18.743 | 6.930 | 25.578 | 17.815 | 9.532 | 10.997 | 3.220 | 3.135 | 3.674 |
Tracking robustness
Sequence | PTAM | ORB-SLAM2 | LSD-SLAM | DSO | MSCKF | OKVIS | VINS-Mono | VINS-Mono-NoLoop | SenseSLAM v1.0 |
---|---|---|---|---|---|---|---|---|---|
B0 (Rapid Rotation) | 4.73 | 0.844 | 1.911 | 6.991 | --- | 1.071 | 2.789 | 2.835 | 0.306 |
B1 (Rapid Translation) | 4.971 | 0.231 | 1.09 | 2.636 | --- | 0.597 | 1.211 | 1.415 | 0.199 |
B2 (Rapid Shaking) | 5.475 | 0.294 | 1.387 | --- | --- | 3.917 | 13.403 | 21.08 | 2.013 |
B3 (Moving People) | 7.455 | 0.6 | 0.897 | 6.399 | --- | 0.673 | 0.785 | 0.531 | 0.465 |
B4 (Covering Camera) | 16.033 | 2.702 | 0.727 | --- | --- | 1.976 | 0.714 | 0.826 | 0.326 |
Relocalization time
Sequence | PTAM | ORB-SLAM2 | LSD-SLAM | VINS-Mono | SenseSLAM v1.0 |
---|---|---|---|---|---|
B5 (1s black-out) | 1.032 | 0.077 | 1.082 | 5.274 | 0.592 |
B6 (2s black-out) | 0.366 | 0.465 | 5.413 | 3.755 | 1.567 |
B7 (3s black-out) | 0.651 | 0.118 | 1.834 | 1.282 | 0.332 |
Average | 0.683 | 0.220 | 2.776 | 3.437 | 0.830 |
System Reference
Klein G, Murray D. Parallel tracking and mapping for small AR workspaces. In: 6th IEEE and ACM International Symposium on Mixed and Augmented Reality. Nara, Japan, 2007: 225–234. DOI:10.1109/ISMAR.2007.4538852.
Mur-Artal R, Tardos J D. ORB-SLAM2: an open-source SLAM system for monocular, stereo, and RGB-D cameras. IEEE Transactions on Robotics, 2017, 33(5): 1255–1262. DOI:10.1109/tro.2017.2705103.
Engel J, Schöps T, Cremers D. LSD-SLAM: large-scale direct monocular SLAM. In: Computer Vision–ECCV 2014. Cham: Springer International Publishing, 2014: 834–849. DOI:10.1007/978-3-319-10605-2_54.
Engel J, Koltun V, Cremers D. Direct sparse odometry. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(3): 611–625. DOI:10.1109/tpami.2017.2658577.
Mourikis A I, Roumeliotis S I. A multi-state constraint Kalman filter for vision-aided inertial navigation. In: IEEE International Conference on Robotics and Automation. Roma, Italy, 2007: 3565–3572. DOI:10.1109/ROBOT.2007.364024.
Leutenegger S, Lynen S, Bosse M, Siegwart R, Furgale P. Keyframe-based visual–inertial odometry using nonlinear optimization. The International Journal of Robotics Research, 2015, 34(3): 314–334. DOI:10.1177/0278364914554813.
Qin T, Li P L, Shen S J. VINS-Mono: a robust and versatile monocular visual-inertial state estimator. IEEE Transactions on Robotics, 2018, 34(4): 1004–1020. DOI:10.1109/tro.2018.2853729.