Vision Meets Drones: A Challenge

The VisDrone dataset consists of 400 video clips, comprising 265,228 frames, and 10,209 static images.

Task: Object Tracking

Given an input video sequence and the initial bounding box of the target object in the first frame, the challenge requires a participating algorithm to locate the target's bounding box in each subsequent video frame.
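The tracking protocol described above can be sketched as a small driver loop. This is only an illustrative sketch, not the challenge's official toolkit: the names `track_sequence`, `static_update`, and the `(x, y, width, height)` box layout are assumptions made for the example, and the "tracker" is a trivial baseline that simply repeats the previous box.

```python
from typing import Callable, List, Sequence, Tuple

# Assumed box layout for this sketch: (x, y, width, height).
Box = Tuple[float, float, float, float]


def track_sequence(
    frames: Sequence[object],
    init_box: Box,
    update: Callable[[object, Box], Box],
) -> List[Box]:
    """Return one box per frame; the first is the given initial box."""
    if not frames:
        return []
    boxes = [init_box]
    prev = init_box
    # Frames after the first are localized by the tracker's update step.
    for frame in frames[1:]:
        prev = update(frame, prev)
        boxes.append(prev)
    return boxes


def static_update(frame: object, prev_box: Box) -> Box:
    """Hypothetical baseline: assume the target never moves."""
    return prev_box


# Placeholder frames; a real tracker would receive image arrays here.
frames = ["frame0", "frame1", "frame2"]
result = track_sequence(frames, (10.0, 20.0, 30.0, 40.0), static_update)
```

A real submission would replace `static_update` with a model that uses the frame content; the driver only fixes the input/output contract (one initial box in, one box per frame out).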

Task: Object Detection

In the object detection task, we focus on ten object categories of interest: pedestrian, person, car, van, bus, truck, motor, bicycle, awning-tricycle, and tricycle. Some rarely occurring special vehicles.

Task: Crowd Counting

The challenge will provide 112 challenging sequences: 82 video sequences for training (2,420 frames in total) and 30 sequences for testing (900 frames in total).
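As a quick consistency check on the crowd counting split quoted above, the figures can be tallied in a few lines. The dictionary layout and variable names are illustrative assumptions; only the counts come from the text.

```python
# Split sizes as stated for the crowd counting task.
splits = {
    "train": {"sequences": 82, "frames": 2420},
    "test": {"sequences": 30, "frames": 900},
}

# Totals across both splits.
total_sequences = sum(s["sequences"] for s in splits.values())  # 112
total_frames = sum(s["frames"] for s in splits.values())        # 3320

# Average number of annotated frames per sequence in each split.
frames_per_sequence = {
    name: s["frames"] / s["sequences"] for name, s in splits.items()
}
```

The sequence total matches the 112 sequences stated, and both splits average roughly 30 frames per sequence.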

Drones, or UAVs in general, equipped with cameras have been rapidly deployed in a wide range of applications, including agriculture, aerial photography, fast delivery, and surveillance. Consequently, automatic understanding of the visual data collected from these platforms is in increasingly high demand, bringing computer vision and drones ever closer together. We are excited to present VisDrone, a large-scale benchmark with carefully annotated ground truth for various important computer vision tasks, to make vision meet drones.

The VisDrone2020 dataset was collected by the AISKYEYE team at the Lab of Machine Learning and Data Mining, Tianjin University, China. The benchmark dataset consists of 400 video clips, comprising 265,228 frames, and 10,209 static images, captured by various drone-mounted cameras. It covers a wide range of aspects, including location (14 different cities in China, separated by thousands of kilometers), environment (urban and rural), objects (pedestrians, vehicles, bicycles, etc.), and density (sparse and crowded scenes). The dataset was collected using various drone platforms (i.e., different drone models), in different scenarios, and under various weather and lighting conditions. The frames are manually annotated with more than 2.6 million bounding boxes or points for targets that are frequently of interest, such as pedestrians, cars, bicycles, and tricycles. Some important attributes, including scene visibility, object class, and occlusion, are also provided for better data utilization.


  • July 9, 2020: The paper submission system is now available. The paper submission deadline has been postponed to July 15th.
  • June 26, 2020: Due to the impact of COVID-19, the submission deadline has been postponed to July 15th. Each team will have 5 additional submission opportunities.
  • May 31, 2020: The VisDrone2020 submission system is open, and the submission deadline is June 30.
  • May 15, 2020: The VisDrone2020 benchmark dataset is available for download.
  • Oct. 22, 2019: The booklet of VisDrone2019 can be downloaded.
  • Aug. 8, 2019: The paper submission deadline is extended to August 8th.
  • June 10, 2019: The result submission system is open and the deadline is July 10.
  • May 25, 2019: The workshop paper submission system is open.
  • April 25, 2019: The VisDrone2019 benchmark dataset is available for download.
  • April 05, 2019: The VisDrone2019 workshop will be organized in conjunction with ICCV 2019.
  • Oct. 10, 2018: The winner talk for the VisDrone2018 workshop can be downloaded from BaiduYun.
  • Sept. 18, 2018: The photos for the VisDrone2018 workshop can be downloaded from BaiduYun.
  • Sept. 01, 2018: The VisDrone2018 workshop will be held on Sept. 8th and the challenge results will be announced.
  • April 25, 2018: The VisDrone2018 benchmark dataset is available for download.
  • April 25, 2018: Our arXiv paper describing the VisDrone2018 benchmark dataset is available for download.


Pengfei Zhu, Longyin Wen, Dawei Du, Xiao Bian, Qinghua Hu, Haibin Ling. Vision Meets Drones: Past, Present and Future. arXiv preprint arXiv:2001.06303 (2020).