UC-Net: Uncertainty Inspired RGB-D Saliency Detection via Conditional Variational Autoencoders

In this paper, we propose the first framework (UCNet) to employ uncertainty for RGB-D saliency detection, inspired by the data labeling process. Existing RGB-D saliency detection methods treat saliency detection as a point estimation problem and produce a single saliency map through a deterministic learning pipeline. Motivated by how saliency data are labeled, we propose a probabilistic RGB-D saliency detection network based on conditional variational autoencoders that models human annotation uncertainty and generates multiple saliency maps for each input image by sampling in the latent space. The proposed saliency consensus process then aggregates these multiple predictions into a single accurate saliency map. Quantitative and qualitative evaluations on six challenging benchmark datasets against 18 competing algorithms demonstrate the effectiveness of our approach in learning the distribution of saliency maps, establishing a new state of the art in RGB-D saliency detection.
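The multi-sample inference and consensus idea from the abstract can be sketched in a few lines. This is a hedged illustration, not UC-Net's implementation: `decode` is a hypothetical stand-in for the trained CVAE decoder (the real network conditions on RGB-D features), and the consensus here uses per-map adaptive (mean) thresholding followed by a per-pixel majority vote, which is one plausible reading of the saliency consensus process.

```python
import numpy as np

rng = np.random.default_rng(0)

def decode(z, shape=(4, 4)):
    # Hypothetical decoder: maps a latent sample z to a saliency map in [0, 1].
    # A real system would use a trained conditional decoder on RGB-D input.
    base = np.linspace(0.0, 1.0, shape[0] * shape[1]).reshape(shape)
    return np.clip(base + 0.05 * z.mean(), 0.0, 1.0)

def saliency_consensus(maps):
    # Binarize each sampled map with its own adaptive (mean) threshold,
    # then take a per-pixel majority vote across the samples.
    binarized = np.stack([(m > m.mean()).astype(int) for m in maps])
    return (binarized.mean(axis=0) >= 0.5).astype(int)

# Sample several latent codes, decode each to a saliency map, then fuse.
latent_dim = 8
samples = [decode(rng.standard_normal(latent_dim)) for _ in range(5)]
consensus = saliency_consensus(samples)
```

Sampling different latent codes yields slightly different saliency maps (modeling annotator disagreement); the consensus step collapses them into one binary prediction.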

CVPR 2020
| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| RGB-D Salient Object Detection | DES | UC-Net | S-Measure | 93.4 | #7 |
| RGB-D Salient Object Detection | DES | UC-Net | Average MAE | 0.019 | #6 |
| RGB-D Salient Object Detection | LFSD | UC-Net | S-Measure | 86.4 | #4 |
| RGB-D Salient Object Detection | LFSD | UC-Net | Average MAE | 0.066 | #3 |
| RGB-D Salient Object Detection | NJU2K | UC-Net | S-Measure | 89.7 | #16 |
| RGB-D Salient Object Detection | NJU2K | UC-Net | Average MAE | 0.043 | #11 |
| RGB-D Salient Object Detection | NLPR | UC-Net | S-Measure | 92.0 | #8 |
| RGB-D Salient Object Detection | NLPR | UC-Net | Average MAE | 0.025 | #9 |
| Thermal Image Segmentation | RGB-T-Glass-Segmentation | UCNet | MAE | 0.071 | #16 |
| RGB-D Salient Object Detection | SIP | UC-Net | S-Measure | 87.5 | #11 |
| RGB-D Salient Object Detection | SIP | UC-Net | Average MAE | 0.051 | #9 |
| RGB-D Salient Object Detection | STERE | UC-Net | S-Measure | 90.3 | #10 |
| RGB-D Salient Object Detection | STERE | UC-Net | Average MAE | 0.039 | #6 |
