Intrinsic Image Decomposition is the process of separating an image into its formation components such as reflectance (albedo) and shading (illumination). Reflectance is the color of the object, invariant to camera viewpoint and illumination conditions, whereas shading, dependent on camera viewpoint and object geometry, consists of different illumination effects, such as shadows, shading and inter-reflections. Using intrinsic images, instead of the original images, can be beneficial for many computer vision algorithms. For instance, for shape-from-shading algorithms, the shading images contain important visual cues to recover geometry, while for segmentation and detection algorithms, reflectance images can be beneficial as they are independent of confounding illumination effects. Furthermore, intrinsic images are used in a wide range of computational photography applications, such as material recoloring, relighting, retexturing and stylization.
In this paper we show how to perform scene-level inverse rendering to recover shape, reflectance and lighting from a single, uncontrolled image using a fully convolutional neural network.
However, it is difficult to collect ground truth training data at scale for intrinsic images.
Intrinsic image decomposition, which is an essential task in computer vision, aims to infer the reflectance and shading of the scene.
To that end, we propose a supervised end-to-end CNN architecture to jointly learn intrinsic image decomposition and semantic segmentation.
However, document shadow or shading removal results still suffer because: (a) prior methods rely on uniformity of local color statistics, which limit their application on real-scenarios with complex document shapes and textures and; (b) synthetic or hybrid datasets with non-realistic, simulated lighting conditions are used to train the models.
We present a method for jointly predicting a depth map and intrinsic images from single-image input.