The Fudan-ShanghaiTech dataset (FDST) is a video crowd counting dataset. It contains 15K frames with about 394K annotated heads, captured from 13 different scenes.
17 PAPERS • NO BENCHMARKS YET
WWW Crowd provides 10,000 videos with over 8 million frames from 8,257 diverse scenes, offering a comprehensive dataset for crowd understanding.
5 PAPERS • NO BENCHMARKS YET
TUB CrowdFlow is a synthetic dataset that contains 10 sequences showing 5 scenes. Each scene is rendered twice: once with a static point of view and once with a dynamic camera to simulate drone/UAV-based surveillance. The scenes are rendered using Unreal Engine at HD resolution (1280x720) at 25 fps, which is typical for current commercial CCTV surveillance systems. The total number of frames is 3200.
4 PAPERS • NO BENCHMARKS YET
DroneCrowd is a benchmark for object detection, tracking, and counting algorithms in drone-captured videos. It is a large-scale drone-captured dataset consisting of 112 video clips with 33,600 HD frames in various scenarios. Notably, it has annotations for 20,800 people trajectories with 4.8 million heads, along with several video-level attributes.
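
To compare the annotation density and clip length of these datasets, a few averages can be derived from the figures quoted above. The snippet below is a minimal sketch, not an official statistic from any of the dataset authors: the `datasets` dictionary and the `derived_stats` helper are hypothetical names introduced for illustration, the FDST counts are the rounded values given in its description, and the TUB CrowdFlow duration assumes every sequence plays at the stated 25 fps.

```python
# Figures copied from the dataset descriptions above.
# FDST counts are approximate ("15K", "about 394K"), so its ratio is a rough estimate.
datasets = {
    "FDST": {"frames": 15_000, "heads": 394_000},
    "TUB CrowdFlow": {"frames": 3_200, "clips": 10, "fps": 25},
    "DroneCrowd": {"frames": 33_600, "heads": 4_800_000, "clips": 112},
}


def derived_stats(info):
    """Compute simple averages from whichever fields a dataset entry provides."""
    stats = {}
    if "heads" in info:
        # Average number of annotated heads per frame.
        stats["heads_per_frame"] = info["heads"] / info["frames"]
    if "clips" in info:
        # Average number of frames per clip or sequence.
        stats["frames_per_clip"] = info["frames"] / info["clips"]
    if "fps" in info:
        # Total footage duration in seconds at the stated frame rate.
        stats["total_seconds"] = info["frames"] / info["fps"]
    return stats


summary = {name: derived_stats(info) for name, info in datasets.items()}

for name, stats in summary.items():
    print(name, {key: round(value, 1) for key, value in stats.items()})
```

These are averages only; individual clips and frames may differ considerably from them, and the descriptions do not state how frames are distributed across clips.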