We propose a novel framework for creating large-scale photorealistic datasets of indoor scenes, with ground-truth geometry, material, lighting and semantics. Our goal is to make the dataset creation process widely accessible, allowing researchers to transform scans into datasets with high-quality ground truth. We demonstrate our framework by creating a photorealistic synthetic version of the publicly available ScanNet dataset with consistent layout, semantic labels, high-quality spatially-varying BRDFs and complex lighting. We render photorealistic images as well as complex spatially-varying lighting, including direct, indirect and visibility components. Such a dataset enables important applications in inverse rendering, scene understanding and robotics. We show that deep networks trained on the proposed dataset achieve competitive performance for shape, material and lighting estimation on real images, enabling photorealistic augmented reality applications such as object insertion and material editing. We also show that our semantic labels may be used for semantic segmentation and multi-task learning. Finally, we demonstrate that our framework may also be integrated with physics engines to create virtual robotics environments with unique ground truth, such as friction coefficients and correspondence to real scenes. The dataset and all the tools to create such datasets will be publicly released, enabling others in the community to easily build large-scale datasets of their own.
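
To make the ground-truth modalities concrete, below is a minimal, hypothetical sketch of how a per-frame sample from such a dataset might be loaded for training. The file naming scheme, modality list, and directory layout are illustrative assumptions, not the released format.

```python
# Hypothetical loader sketch: each rendered frame is assumed to ship with
# aligned per-pixel ground-truth maps (albedo, normals, roughness, depth,
# semantic labels). Paths and file names below are placeholders.
import os
import numpy as np
from PIL import Image
from torch.utils.data import Dataset


class PhotorealisticScanDataset(Dataset):
    """Loads a rendered frame together with its per-pixel ground-truth maps."""

    MODALITIES = ["im", "albedo", "normal", "rough", "depth", "semseg"]

    def __init__(self, root, frame_ids):
        self.root = root
        # e.g. ["scene0001_00/000000", "scene0001_00/000010", ...] (assumed IDs)
        self.frame_ids = frame_ids

    def __len__(self):
        return len(self.frame_ids)

    def __getitem__(self, idx):
        frame = self.frame_ids[idx]
        sample = {}
        for mod in self.MODALITIES:
            path = os.path.join(self.root, f"{frame}_{mod}.png")
            img = np.asarray(Image.open(path))
            # Keep integer class IDs for semantic labels; normalize the rest.
            sample[mod] = img if mod == "semseg" else img.astype(np.float32) / 255.0
        return sample
```

A loader like this would feed a multi-task network predicting shape, material and lighting from the rendered image, with the remaining modalities serving as supervision.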
