The Fishyscapes Benchmark: Measuring Blind Spots in Semantic Segmentation

5 Apr 2019  ·  Hermann Blum, Paul-Edouard Sarlin, Juan Nieto, Roland Siegwart, Cesar Cadena

Deep learning has enabled impressive progress in the accuracy of semantic segmentation. Yet, the ability to estimate uncertainty and detect failure is key for safety-critical applications like autonomous driving. Existing uncertainty estimates have mostly been evaluated on simple tasks, and it is unclear whether these methods generalize to more complex scenarios. We present Fishyscapes, the first public benchmark for uncertainty estimation in a real-world task of semantic segmentation for urban driving. It evaluates pixel-wise uncertainty estimates towards the detection of anomalous objects in front of the vehicle. We adapt state-of-the-art methods to recent semantic segmentation models and compare approaches based on softmax confidence, Bayesian learning, and embedding density. Our results show that anomaly detection is far from solved even for ordinary situations, while our benchmark allows measuring advancements beyond the state-of-the-art.
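The simplest baseline among the compared approaches scores each pixel by the entropy of the segmentation network's softmax output, treating high entropy as a sign of a possible anomaly. Below is a minimal sketch of that per-pixel scoring, assuming only an (H, W, C) array of class logits; the array names, shapes, and the random toy input are illustrative, not the benchmark's actual interface.

```python
import numpy as np

def softmax_entropy_map(logits):
    """Per-pixel softmax entropy as an anomaly score.

    logits: array of shape (H, W, C) with raw class scores.
    Returns an (H, W) map; higher entropy = more uncertain pixel.
    """
    # Numerically stable softmax over the class dimension.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=-1, keepdims=True)
    # Shannon entropy per pixel (natural log).
    return -np.sum(probs * np.log(probs + 1e-12), axis=-1)

# Toy usage with random logits standing in for a segmentation network's output.
rng = np.random.default_rng(0)
entropy = softmax_entropy_map(rng.normal(size=(256, 512, 19)))
print(entropy.shape)  # (256, 512)
```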


Results from the Paper


Ranked #13 on Anomaly Detection on Fishyscapes L&F (using extra training data)

Task               Dataset          Model                      Metric  Value  Global Rank
Anomaly Detection  Fishyscapes L&F  Dirichlet DeepLab          AP      34.28  #13
Anomaly Detection  Fishyscapes L&F  Dirichlet DeepLab          FPR95   47.43  #17
Anomaly Detection  Fishyscapes L&F  Bayesian DeepLab           AP      9.8    #16
Anomaly Detection  Fishyscapes L&F  Bayesian DeepLab           FPR95   38.5   #15
Anomaly Detection  Fishyscapes L&F  Learned Embedding Density  AP      4.7    #17
Anomaly Detection  Fishyscapes L&F  Learned Embedding Density  FPR95   24.4   #14
Anomaly Detection  Fishyscapes L&F  Softmax Entropy            AP      2.9    #18
Anomaly Detection  Fishyscapes L&F  Softmax Entropy            FPR95   44.8   #16
Anomaly Detection  Fishyscapes L&F  Void Classifier            AP      10.29  #15
Anomaly Detection  Fishyscapes L&F  Void Classifier            FPR95   22.11  #13
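The table reports average precision (AP) and the false-positive rate at 95% true-positive rate (FPR95) for pixel-level anomaly detection. The sketch below shows one plausible way to compute such metrics with scikit-learn, assuming flattened per-pixel anomaly scores and binary anomaly labels; the exact Fishyscapes evaluation protocol (e.g. handling of void/ignore pixels) may differ.

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_curve

def ap_and_fpr95(scores, labels):
    """Pixel-level anomaly-detection metrics.

    scores: 1-D array of per-pixel anomaly scores (higher = more anomalous).
    labels: 1-D binary array, 1 for anomalous pixels, 0 for in-distribution ones.
    """
    ap = average_precision_score(labels, scores)
    # FPR95: false-positive rate at the operating point where 95% of the
    # anomalous pixels are detected.
    fpr, tpr, _ = roc_curve(labels, scores)
    fpr95 = float(np.interp(0.95, tpr, fpr))
    return ap, fpr95

# Toy usage with synthetic scores standing in for a method's output.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=10_000)
scores = labels + rng.normal(scale=2.0, size=10_000)  # weakly informative scores
print(ap_and_fpr95(scores, labels))
```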
