MPI (Max Planck Institute) Sintel is a dataset for optical flow evaluation that contains 1064 synthesized stereo images with ground-truth disparity data. It is derived from the open-source 3D animated short film Sintel and covers 23 different scenes. The stereo images are RGB, while the disparity maps are grayscale. Both have a resolution of 1024×436 pixels and 8 bits per channel.
187 PAPERS • 6 BENCHMARKS
IISc VEED consists of 200 diverse indoor and outdoor scenes. The videos are rendered with Blender, and the blend files for the scenes are obtained mainly from Blend Swap and TurboSquid. Four different camera trajectories are added to each scene, giving a total of 800 videos. The videos are rendered at full HD resolution (1920×1080) at 30 fps and contain 12 frames each.
1 PAPER • 1 BENCHMARK
IISc VEED-Dynamic consists of 200 diverse indoor and outdoor scenes. The videos are rendered with Blender, and the blend files for the scenes are obtained mainly from Blend Swap and TurboSquid. Four different camera trajectories are added to each scene, giving a total of 800 videos. Motion is introduced either by animating objects already present in the scene or by adding new animated objects. The videos are rendered at full HD resolution (1920×1080) at 30 fps and contain 12 frames each.
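Both IISc VEED variants quote the same scene, trajectory, and frame counts, so the implied totals can be checked with a few lines of arithmetic. The sketch below is illustrative only: the constant names and the `veed_summary` helper are hypothetical and not part of any dataset tooling, and the numbers are taken directly from the descriptions above.

```python
# Figures quoted in the IISc VEED and VEED-Dynamic descriptions.
SCENES = 200
TRAJECTORIES_PER_SCENE = 4
FRAMES_PER_VIDEO = 12
FPS = 30
WIDTH, HEIGHT = 1920, 1080


def veed_summary():
    """Derive per-dataset totals from the quoted figures (hypothetical helper)."""
    videos = SCENES * TRAJECTORIES_PER_SCENE
    frames = videos * FRAMES_PER_VIDEO
    return {
        "videos": videos,
        "frames": frames,
        "clip_seconds": FRAMES_PER_VIDEO / FPS,
        "pixels_per_frame": WIDTH * HEIGHT,
    }


if __name__ == "__main__":
    for name, value in veed_summary().items():
        print(f"{name}: {value}")
```

Under these figures, each variant holds 800 videos and 9,600 frames in total, and each clip lasts 0.4 seconds at 30 fps.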
SceneNet-RGBD is a synthetic dataset containing large-scale photorealistic renderings of indoor scene trajectories with pixel-level annotations. Random sampling permits virtually unlimited scene configurations, and the dataset creators provide a set of 5M rendered RGB-D images from over 15K trajectories in synthetic layouts with random but physically simulated object poses. Each layout also has random lighting, camera trajectories, and textures. The scale of this dataset is well suited to pre-training data-driven computer vision techniques from scratch with RGB-D inputs, which has previously been limited by relatively small labelled datasets such as NYUv2 and SUN RGB-D. It also provides a basis for investigating 3D scene labelling tasks by providing perfect camera poses and depth data as a proxy for a SLAM system.