SLAM-for-AR Competition @ ISMAR2019
Click here to visit our competition Homepage.
The competition results and the system descriptions have been published.
Competition Results - V-SLAM
Competition Results - VI-SLAM
Download
- TrainingData : OneDrive or Our Server.
- Registration Form : doc
Dataset Format
Each sequence provides several ‘sensors’, each with a sensor.yaml file that specifies the sensor type and the intrinsic and extrinsic parameters. The sensor measurements (or, for the camera, the measurement indices) are stored in data.csv. In our case, both camera and IMU data are provided. The vicon and groundtruth data are also treated as ‘sensors’. A minimal loading sketch is given after the example below.
Here is an example:
A01
|--camera
| |--sensor.yaml
| |--data.csv
| `--data
| |--771812250517066.png
| |--771812283849357.png
| `--...
|--imu
| |--sensor.yaml
| `--data.csv
|--vicon
| |--sensor.yaml
| `--data.csv
`--groundtruth
|--sensor.yaml
`--data.csv
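Below is a minimal loading sketch (in Python) for one sequence in this layout. It assumes EuRoC-style data.csv contents, i.e. optional comment lines starting with '#' followed by "timestamp,filename" rows for the camera; the helper name load_sensor is illustrative only, and each sensor.yaml is the authoritative description of its sensor.

```python
# Minimal sketch: load one 'sensor' folder (camera, imu, vicon, groundtruth).
# Assumption: data.csv is comma-separated with optional '#' comment lines,
# and camera rows reference images stored under camera/data/.
import csv
import os

import yaml  # PyYAML


def load_sensor(sequence_dir, sensor_name):
    """Return (config, rows) for one sensor folder such as 'camera' or 'imu'."""
    sensor_dir = os.path.join(sequence_dir, sensor_name)
    with open(os.path.join(sensor_dir, "sensor.yaml")) as f:
        config = yaml.safe_load(f)  # sensor type, intrinsics, extrinsics, ...
    with open(os.path.join(sensor_dir, "data.csv")) as f:
        rows = [row for row in csv.reader(f)
                if row and not row[0].lstrip().startswith("#")]
    return config, rows


if __name__ == "__main__":
    cam_cfg, cam_rows = load_sensor("A01", "camera")
    timestamp, image_name = cam_rows[0][0], cam_rows[0][1].strip()
    print(timestamp, os.path.join("A01", "camera", "data", image_name))
```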
Evaluation Instruction
The Format of Submission Result
The estimated 6 DoF camera poses (from the camera coordinate system to the world coordinate system) are required to evaluate the performance. Since the estimation involves a certain amount of randomness, each sequence is required to be run 5 times, resulting in 5 pose files and 5 running time files. We will select the median of the five results for evaluation. The format for each pose file is as follows:
timestamp[i] p_x p_y p_z q_x q_y q_z q_w
where (p_x, p_y, p_z) is the camera position and the unit quaternion (q_x, q_y, q_z, q_w) is the camera orientation. You should output the real-time pose after each frame is processed (not the poses after final global optimization), and the poses should be output at the same frame rate as the input camera images (otherwise, the completeness evaluation would be affected).
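As a reference for the expected quaternion ordering, here is a small sketch that formats one pose line from a camera-to-world rotation matrix and translation. Using scipy is just one convenient option (an assumption, not a requirement); its as_quat() returns (x, y, z, w), which matches the required order.

```python
# Sketch: format one line "timestamp p_x p_y p_z q_x q_y q_z q_w"
# from a camera-to-world rotation matrix R_wc and translation t_wc.
# scipy's Rotation.as_quat() uses scalar-last (x, y, z, w) ordering,
# matching the required format; any equivalent quaternion library works.
import numpy as np
from scipy.spatial.transform import Rotation


def pose_line(timestamp, R_wc, t_wc):
    qx, qy, qz, qw = Rotation.from_matrix(R_wc).as_quat()
    px, py, pz = t_wc
    return (f"{timestamp} {px:.6f} {py:.6f} {pz:.6f} "
            f"{qx:.6f} {qy:.6f} {qz:.6f} {qw:.6f}")


print(pose_line(771812250517066, np.eye(3), np.zeros(3)))
```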
The format for each running time file is as follows:
timestamp[i] t_pose
where t_pose denotes the system time (or the cumulative running time in seconds, with at least three decimal places, even for black frames in D6) when the pose is estimated.
Please submit a zip file containing all the pose and running time files. The structure of the zip file should be as follows:
YourSLAMName/sequence_name/Round-pose.txt
YourSLAMName/sequence_name/Round-time.txt
e.g.
MY-SLAM/C0_test/0-pose.txt
MY-SLAM/C0_test/0-time.txt
You can click here to download the example.
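The sketch below shows one way to lay out the pose and running time files for a single run before zipping. The write_run helper and the in-memory result lists are illustrative assumptions, not part of any provided tooling.

```python
# Sketch: write one run's files in the required layout, e.g.
# MY-SLAM/C0_test/0-pose.txt and MY-SLAM/C0_test/0-time.txt.
import os


def write_run(root, system_name, sequence_name, round_idx, pose_lines, times):
    """pose_lines: preformatted "timestamp p q" strings (see pose_line above);
    times: list of (timestamp, t_pose) with t_pose in seconds."""
    out_dir = os.path.join(root, system_name, sequence_name)
    os.makedirs(out_dir, exist_ok=True)
    with open(os.path.join(out_dir, f"{round_idx}-pose.txt"), "w") as f:
        f.write("\n".join(pose_lines) + "\n")
    with open(os.path.join(out_dir, f"{round_idx}-time.txt"), "w") as f:
        for timestamp, t_pose in times:
            f.write(f"{timestamp} {t_pose:.3f}\n")  # at least three decimals


# Five rounds per sequence, indexed 0..4:
# for round_idx in range(5):
#     write_run("submission", "MY-SLAM", "C0_test", round_idx, poses, times)
```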
Evaluation
We evaluate the overall performance of a SLAM system considering tracking accuracy, initialization quality, tracking robustness, relocalization time, and computational efficiency. The criteria are as follows:
- $\color{black}{\varepsilon_{APE} / \varepsilon_{ARE}}$ - absolute positional / rotational error
- $\color{black}{\varepsilon_{RPE} / \varepsilon_{RRE}}$ - relative positional / rotational error
- $\color{black}{\varepsilon_{bad}}$ - the ratio of bad poses (100% - completeness)
- $\color{black}{\varepsilon_{init}}$ - initialization quality
- $\color{black}{\varepsilon_{RO}}$ - tracking robustness
- $\color{black}{t_{RL}}$ - relocalization time
The detailed description of the above criteria can be found in the following paper:
Jinyu Li, Bangbang Yang, Danpeng Chen, Nan Wang, Guofeng Zhang, Hujun Bao. Survey and evaluation of monocular visual-inertial SLAM algorithms for augmented reality. Journal of Virtual Reality & Intelligent Hardware, 2019, 1(4): 386-410. DOI:10.3724/SP.J.2096-5796.2018.0011. URL: http://www.vr-ih.com/vrih/html/EN/10.3724/SP.J.2096-5796.2018.0011/article.html
We convert each criterion's error $\color{black}{\varepsilon_{i}}$ into a normalized score by $\color{black}{s_i=\frac{{\sigma_i}^2}{{\sigma_i}^2+{\varepsilon_i}^2}\times100\%}$, where $\color{black}{{\sigma_i}^2}$ is the variance controlling the shape of the normalization function. The complete score is a weighted sum of all the individual scores:
$\color{black}{S=w_{APE}s_{APE}+w_{ARE}s_{ARE}+w_{RPE}s_{RPE}+w_{RRE}s_{RRE}+w_{bad}s_{bad}+w_{init}s_{init}+w_{RO}s_{RO}+w_{RL}s_{RL}}$
The weight $\color{black}{w}$ and variance $\color{black}{\sigma}$ (V-SLAM / VI-SLAM) for each criterion are listed below:
$\color{black}{w_{APE}}$ | $\color{black}{w_{ARE}}$ | $\color{black}{w_{RPE}}$ | $\color{black}{w_{RRE}}$ | $\color{black}{w_{bad}}$ | $\color{black}{w_{init}}$ | $\color{black}{w_{RO}}$ | $\color{black}{w_{RL}}$ |
---|---|---|---|---|---|---|---|
1.0 | 1.0 | 0.5 | 0.5 | 1.0 | 1.0 | 1.0 | 1.0 |
$\color{black}{\sigma_{APE}}$ | $\color{black}{\sigma_{ARE}}$ | $\color{black}{\sigma_{RPE}}$ | $\color{black}{\sigma_{RRE}}$ | $\color{black}{\sigma_{bad}}$ | $\color{black}{\sigma_{init}}$ | $\color{black}{\sigma_{RO}}$ | $\color{black}{\sigma_{RL}}$ |
---|---|---|---|---|---|---|---|
72.46 / 55.83 | 7.41 / 2.48 | 6.72 / 2.92 | 0.26 / 0.17 | 20.68 / 2.38 | 2.79 / 1.85 | 2.27 / 0.95 | 0.65 / 1.42 |
The variances $\color{black}{\sigma}$ listed above are obtained by computing the median of our previous evaluation results, which cover 4 V-SLAM systems (PTAM, ORB-SLAM2, LSD-SLAM, DSO) and 4 VI-SLAM systems (MSCKF, OKVIS, VINS-Mono, SenseSLAM) evaluated on our previously released dataset.
You can evaluate your SLAM system with our training dataset using the evaluation tool: https://github.com/zju3dv/eval-vislam.
In the final round of the competition, we will test all systems on benchmark PCs with the same hardware configuration. The running time will be taken into account when computing the final score according to the following equation:
$\color{black}{S^*=\frac{\text{min}(30,framerate)}{30}S}$
where $\color{black}{framerate}$ denotes the average framerate of the system.
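To make the arithmetic concrete, here is a small sketch that applies the normalization, the weighted sum, and the frame-rate penalty exactly as written above. The weights and the V-SLAM variances are taken from the tables; the example errors are invented purely to show the computation.

```python
# Sketch of the scoring formulas: s_i = sigma_i^2 / (sigma_i^2 + eps_i^2),
# S = sum_i w_i * s_i, and S* = min(30, framerate) / 30 * S.
WEIGHTS = {"APE": 1.0, "ARE": 1.0, "RPE": 0.5, "RRE": 0.5,
           "bad": 1.0, "init": 1.0, "RO": 1.0, "RL": 1.0}
SIGMA_VSLAM = {"APE": 72.46, "ARE": 7.41, "RPE": 6.72, "RRE": 0.26,
               "bad": 20.68, "init": 2.79, "RO": 2.27, "RL": 0.65}


def normalized_score(error, sigma):
    return sigma ** 2 / (sigma ** 2 + error ** 2)


def final_score(errors, sigmas=SIGMA_VSLAM, weights=WEIGHTS, framerate=30.0):
    S = sum(w * normalized_score(errors[k], sigmas[k]) for k, w in weights.items())
    return min(30.0, framerate) / 30.0 * S


# Invented errors, only to demonstrate the arithmetic:
example_errors = {k: 0.5 * SIGMA_VSLAM[k] for k in WEIGHTS}
print(final_score(example_errors, framerate=25.0))
```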
It should be noted that not all sequences are evaluated for all the criteria. The corresponding criteria for the sequences are listed below:
Sequences | Corresponding Criteria |
---|---|
C0-C11, D8-D10 | APE, RPE, ARE, RRE, Badness, Initialization Quality |
D0-D4 | Tracking Robustness |
D5-D7 | Relocalization Time |
Motion and Scene Type of Sequences
Device | Sequence | Motion | Scene | Description |
---|---|---|---|---|
Xiaomi MI8 | C0 | inspect+patrol | floor | Walking and looking around the glossy floor. |
 | C1 | inspect+patrol | clean | Walking around some texture-less areas. |
 | C2 | inspect+patrol | mess | Walking around some random objects. |
 | C3 | aiming+inspect | mess+floor | Random objects first, and then glossy floor. |
 | C4 | aiming+inspect | desktop+clean | From a small scene to a texture-less area. |
 | C5 | wave+inspect | desktop+mess | From a small scene to a texture-rich area. |
 | C6 | hold+inspect | desktop | Looking at a small desktop scene. |
 | C7 | inspect+aiming | desktop | Looking at a small desktop scene. |
 | C8 | inspect+forward | clean | Walking forward in a texture-less area at a low position. |
 | C9 | inspect+forward | mess | Walking forward in a texture-rich area at a low position. |
 | C10 | inspect+downward | clean | Walking backward in a texture-less area at a low position. |
 | C11 | inspect+downward | mess | Walking backward in a texture-rich area at a low position. |
 | D0 | rapid-rotation | desktop | Rotating the phone rapidly at times. |
 | D1 | rapid-translation | desktop | Moving the phone rapidly at times. |
 | D2 | rapid-shaking | desktop | Shaking the phone violently at times. |
 | D3 | inspect | moving people | A person walks in and out. |
 | D4 | inspect | covering camera | An object occasionally occludes the camera. |
 | D5 | inspect | desktop | Similar to A6 but with black frames. |
 | D6 | inspect | desktop | Similar to A6 but with black frames. |
 | D7 | inspect | desktop | Similar to A6 but with black frames. |
 | D8 | inspect | foreground+background | Walking around the near plane and far plane. |
 | D9 | inspect | plant | Walking around the plant. |
 | D10 | loop | office | Walking around the office with loop closure. |
Dataset Preview
Video Source From YouTube
Video Source From bilibili
Competition Results - V-SLAM
$\text{Final Score} = \text{Benchmark Score} \times \text{Speed Penalty} \times 100$
Click here to download the detailed result.
Configuration of the benchmark PC:
CPU : i7-9700K 3.60GHz
Memory : 32G
GPU : Nvidia RTX 2070-8G
Hard Disk : Samsung SSD 850EVO 500G
Rank | Participants | Affiliation | System Name | APE | RPE | ARE | RRE | Badness | Initialization Quality | Robustness | Relocalization Time | Benchmark Score | Average FrameRate | Speed Penalty | Final Score | System Description |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Zike Yan, Pijian Sun, Xin Wang, Shunkai Li, Sheng Zhang, Hongbin Zha | Key Laboratory of Machine Perception(MOE), School of EECS, Peking University | LF-SLAM | 0.7155 | 0.4037 | 0.8115 | 0.2078 | 0.4554 | 0.7166 | 0.9972 | 0.9965 | 0.8331 | 30.0687 | 1.0000 | 83.31 | Robust line tracking for monocular visual system (Doc would be uploaded after the acceptance of the paper) |
2 | Darius Rueckert | University of Erlangen-Nuremberg | Snake-SLAM | 0.7543 | 0.5700 | 0.8229 | 0.2709 | 0.5183 | 0.7818 | 0.6685* | 0.9974 | 0.8273* | 106.1527 | 1.0000 | 82.73* | Doc |
3 | Neo Yuan Rong Dexter, Toh Yu Heng | Pensees | AR-ORB-SLAM2 | 0.5511 | 0.3850 | 0.5280 | 0.2028 | 0.6020 | 0.7080 | 0.9947 | 0.9890 | 0.7778 | 31.0803 | 1.0000 | 77.78 | |
4 | Xinyu Wei, Zengming Tang, Huiyan Wu, Jun Huang | Shanghai Advanced Research Institute, Chinese Academy of Sciences | PL-SLAM | 0.6036 | 0.4754 | 0.7354 | 0.2376 | 0.4551 | 0.8062 | 0.9985 | 0.9916 | 0.8245 | 27.5107 | 0.9170 | 75.61 | Doc Slides |
5 | Ao Li, Yue Ni | University of Science and Technology of China | Dy-SLAM | 0.7588 | 0.4390 | 0.7953 | 0.2002 | 0.5776 | 0.6018 | 0.9981 | 0.8578 | 0.8182 | 3.8851 | 0.1295 | 10.60 | Doc |
* Note: the trajectory files output by Snake-SLAM (the executable submitted during the competition) denote invalid poses with an identity pose, i.e. (0 0 0 -0 -0 -0 1). Unfortunately, this does not match the invalid-pose format that we define, so our evaluation tool still regarded these poses as valid. As a result, the computed robustness score was affected. If we remove these invalid poses, the new robustness score of Snake-SLAM becomes 0.9348, and the final score becomes 87.17. Since this format issue was discovered after the competition and did not appear in the results of other teams, the competition ranking can no longer be changed.
Competition Results - VI-SLAM
$\text{Final Score} = \text{Benchmark Score} \times \text{Speed Penalty} \times 100$
Click here to download the detailed result.
Configuration of the benchmark PC:
CPU : i7-9700K 3.60GHz
Memory : 32G
GPU : Nvidia RTX 2070-8G
Hard Disk : Samsung SSD 850EVO 500G
Rank | Participants | Affiliation | System Name | APE | RPE | ARE | RRE | Badness | Initialization Quality | Robustness | Relocalization Time | Benchmark Score | Average FrameRate | Speed Penalty | Final Score | System Description |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Shaozu Cao, Jie Pan, Jieqi Shi, Shaojie Shen | Hong Kong University of Science and Technology | VINS-Mono | 0.6341 | 0.4225 | 0.4945 | 0.2429 | 0.8175 | 0.5678 | 0.8037 | 0.8572 | 0.7513 | 30.1062 | 1.0000 | 75.13 | Doc Slides |
2 | Xinyu Wei, Zengming Tang, Huiyan Wu, Jun Huang | Shanghai Advanced Research Institute, Chinese Academy of Sciences | PLVI-SLAM | 0.2767 | 0.0994 | 0.3383 | 0.1675 | 0.0097 | 0.2813 | 0.8183 | 0.6654 | 0.4205 | 18.3380 | 0.6113 | 25.71 | Doc |
3 | Jianhua Zhang, Shengyong Chen, Mengping Gui, Jialing Liu, Luzhen Ma, Kaiqi Chen | Zhejiang University of Technology | MMF-SLAM | 0.1203 | 0.0834 | 0.1760 | 0.1407 | 0.0010 | 0.3214 | 0.0102 | 0.0000 | 0.1235 | 29.9620 | 0.9987 | 12.33 | Doc Slides |
Competition Chairs
Guofeng Zhang
Zhejiang University, China
Jing Chen
Beijing Institute of Technology, China
Guoquan Huang
University of Delaware, USA
Acknowledgement
We thank Bangbang Yang for his great help in preparing the competition dataset, building the website, and evaluating the participating SLAM systems.