Single-Object Tracking Evaluation
While the term “single-object tracking” can sometimes be ambiguous, in this task we focus on generic single-object tracking, also known as model-free tracking. In particular, given an input video sequence and the initial bounding box of the target object in the first frame, we require a tracking algorithm to locate the target bounding box in each subsequent video frame. The tracking targets in these sequences include pedestrians, cars, buses, and animals.
Performance on this task is evaluated with the success and precision scores, as in [1]. The success score is the primary metric for ranking methods. The metrics are described in the following table.
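The protocol described above can be sketched as a short loop. This is only an illustration: the `Tracker` interface (`init`/`update`), the `(x, y, width, height)` box format, and the `StaticTracker` baseline are assumptions made for the example, not part of the official evaluation toolkit.

```python
from typing import Any, List, Protocol, Sequence, Tuple

# Assumed box format for this sketch: (x, y, width, height) in pixels.
Box = Tuple[float, float, float, float]


class Tracker(Protocol):
    """Hypothetical model-free tracker interface."""

    def init(self, frame: Any, box: Box) -> None: ...

    def update(self, frame: Any) -> Box: ...


def run_sequence(tracker: Tracker, frames: Sequence[Any], init_box: Box) -> List[Box]:
    """Initialize on the first frame, then predict one box per later frame."""
    if len(frames) == 0:
        raise ValueError("a sequence needs at least one frame")
    tracker.init(frames[0], init_box)
    results = [init_box]  # the first-frame box is given, not predicted
    for frame in frames[1:]:
        results.append(tracker.update(frame))
    return results


class StaticTracker:
    """Trivial baseline that always reports the initial box."""

    def init(self, frame: Any, box: Box) -> None:
        self.box = box

    def update(self, frame: Any) -> Box:
        return self.box
```

The tracker only ever sees the first-frame box, which is what makes the setting model-free: no class label or pretrained target model is supplied.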
MEASURE | PERFECT | DESCRIPTION |
Success Score | 100% | The area under the curve (AUC) of the percentage of successfully tracked frames plotted against the bounding-box overlap threshold |
Precision Score | 100% | The percentage of frames in which the center of the tracked bounding box is within 20 pixels of the ground-truth center |
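As a rough illustration of the two metrics in the table, the sketch below computes them with NumPy. Several details are assumptions rather than the official toolkit's behavior: boxes are `(x, y, width, height)`, overlap is measured as intersection over union, the AUC is approximated by averaging the success rate over 21 evenly spaced thresholds in [0, 1], and a frame counts as successful when its overlap is greater than or equal to the threshold (so that a perfect tracker reaches 100%).

```python
import numpy as np


def iou(pred, gt):
    """Per-frame intersection over union for (N, 4) arrays of (x, y, w, h)."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    x1 = np.maximum(pred[:, 0], gt[:, 0])
    y1 = np.maximum(pred[:, 1], gt[:, 1])
    x2 = np.minimum(pred[:, 0] + pred[:, 2], gt[:, 0] + gt[:, 2])
    y2 = np.minimum(pred[:, 1] + pred[:, 3], gt[:, 1] + gt[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    union = pred[:, 2] * pred[:, 3] + gt[:, 2] * gt[:, 3] - inter
    # Guard against division by zero for degenerate (zero-area) boxes.
    return inter / np.maximum(union, 1e-12)


def success_score(pred, gt, num_thresholds=21):
    """Approximate AUC of the success plot, in percent (assumed sampling)."""
    overlaps = iou(pred, gt)
    thresholds = np.linspace(0.0, 1.0, num_thresholds)
    # Fraction of frames whose overlap reaches each threshold.
    rates = [(overlaps >= t).mean() for t in thresholds]
    return 100.0 * float(np.mean(rates))


def precision_score(pred, gt, max_dist=20.0):
    """Percentage of frames whose center error is at most max_dist pixels."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    pred_c = pred[:, :2] + pred[:, 2:] / 2.0
    gt_c = gt[:, :2] + gt[:, 2:] / 2.0
    dist = np.linalg.norm(pred_c - gt_c, axis=1)
    return 100.0 * float((dist <= max_dist).mean())
```

Because the threshold at 0 is met by every frame under this convention, even a tracker with no overlap gets a small nonzero success score; other implementations use a strict inequality or a different threshold grid, so absolute numbers may differ slightly from those of the official code.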
In addition, similar to [2], we have manually tagged the sequences with 12 attributes, which represent the challenging aspects of single-object tracking in drone views. We will report the performance on each attribute for a comprehensive evaluation.
ATTRIBUTE | DESCRIPTION |
ARC | Aspect Ratio Change: the ratio of the ground-truth aspect ratio in the first frame to that in at least one subsequent frame is outside the range [0.5, 2]. |
BC | Background Clutter: the background near the target has a similar appearance to the target. |
CM | Camera Motion: abrupt motion of the camera. |
FM | Fast Motion: motion of the ground truth bounding box is larger than 20 pixels between two consecutive frames. |
FOC | Full Occlusion: the target is fully occluded. |
IV | Illumination Variation: the illumination of the target changes significantly. |
LR | Low Resolution: at least one ground-truth bounding box contains fewer than 400 pixels. |
OV | Out-of-View: some portion of the target leaves the view. |
POC | Partial Occlusion: the target is partially occluded. |
SOB | Similar Object: there are objects of a similar shape or the same type near the target. |
SV | Scale Variation: the ratio of the initial bounding box to at least one subsequent bounding box is outside the range [0.5, 2]. |
VC | Viewpoint Change: viewpoint affects target appearance significantly. |
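The sequences are tagged manually, but four of the attributes above (ARC, FM, LR, SV) have quantitative definitions that can be checked against the ground truth. The sketch below is one possible reading, with loudly stated assumptions: boxes are `(x, y, width, height)` with positive width and height, FM motion is measured as the displacement of the box center, the LR pixel count is the box area, and the SV ratio is taken between box areas. None of these choices is confirmed by the table.

```python
import numpy as np


def quantitative_attributes(gt):
    """Flag ARC, FM, LR and SV for one sequence of (x, y, w, h) boxes."""
    gt = np.asarray(gt, dtype=float)
    x, y, w, h = gt.T
    aspect = w / h
    area = w * h
    centers = np.stack([x + w / 2.0, y + h / 2.0], axis=1)
    # Center displacement between consecutive frames (assumed FM measure).
    step = np.linalg.norm(np.diff(centers, axis=0), axis=1)

    def outside(ratio):
        # True if any ratio falls outside the range [0.5, 2].
        return bool(np.any((ratio < 0.5) | (ratio > 2.0)))

    return {
        "ARC": outside(aspect[0] / aspect[1:]),
        "FM": bool(np.any(step > 20.0)),
        "LR": bool(np.any(area < 400.0)),
        "SV": outside(area[0] / area[1:]),  # area ratio is an assumption
    }
```

The remaining attributes (BC, CM, FOC, IV, OV, POC, SOB, VC) depend on image content rather than box geometry, so they cannot be derived from the ground-truth boxes in this way.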
The evaluation code for single-object tracking is available on the VisDrone GitHub.
References:
[1] Y. Wu, J. Lim, and M. Yang, “Object tracking benchmark,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 9, pp. 1834–1848, 2015.
[2] M. Mueller, N. Smith, and B. Ghanem, “A benchmark and simulator for UAV tracking,” European Conference on Computer Vision, vol. 1, pp. 445–461, 2016.