Conditional Random Fields as Recurrent Neural Networks

Pixel-level labelling tasks, such as semantic segmentation, play a central role in image understanding. Recent approaches have attempted to harness the capabilities of deep learning techniques for image recognition to tackle pixel-level labelling tasks. One central issue in this methodology is the limited capacity of deep learning techniques to delineate visual objects. To solve this problem, we introduce a new form of convolutional neural network that combines the strengths of Convolutional Neural Networks (CNNs) and Conditional Random Field (CRF)-based probabilistic graphical modelling. To this end, we formulate mean-field approximate inference for CRFs with Gaussian pairwise potentials as a Recurrent Neural Network. This network, called CRF-RNN, is then plugged in as part of a CNN to obtain a deep network that has the desirable properties of both CNNs and CRFs. Importantly, our system fully integrates CRF modelling with CNNs, making it possible to train the whole deep network end-to-end with the usual back-propagation algorithm, avoiding offline post-processing methods for object delineation. We apply the proposed method to the problem of semantic image segmentation, obtaining top results on the challenging Pascal VOC 2012 segmentation benchmark.
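
To make the idea of mean-field inference as a recurrent computation more concrete, here is a minimal NumPy sketch of the iteration unrolled as a loop, where each pass plays the role of one recurrent step. It is an illustration under several assumptions, not the authors' implementation: the Gaussian kernel is built by brute force over all pixel pairs, the label compatibility is a fixed Potts matrix rather than a learned one, and the names `gaussian_kernel`, `mean_field`, `weights` and `compat` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the label axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gaussian_kernel(features, bandwidth):
    # Dense pairwise Gaussian affinities between per-pixel feature vectors.
    # features: (N, D) array, e.g. pixel positions and/or colours.
    diff = features[:, None, :] - features[None, :, :]
    k = np.exp(-0.5 * (diff ** 2).sum(axis=-1) / bandwidth ** 2)
    np.fill_diagonal(k, 0.0)  # a pixel sends no message to itself
    return k

def mean_field(unary_logits, kernels, weights, compat, n_iters=5):
    # unary_logits: (N, L) per-pixel class scores (higher = more likely),
    # standing in for the output of a CNN.
    q = softmax(unary_logits)                      # initialisation
    for _ in range(n_iters):                       # one pass = one recurrent step
        # Message passing with each kernel, then a weighted sum of the results.
        msg = sum(w * (k @ q) for w, k in zip(weights, kernels))
        pairwise = msg @ compat                    # label compatibility transform
        q = softmax(unary_logits - pairwise)       # add unaries and normalise
    return q

# Toy 1-D "image": five pixels in a row, two labels.
# Pixel 2 weakly votes for label 1; its neighbours strongly vote for label 0.
positions = np.arange(5, dtype=float)[:, None]
unary = np.array([[2.0, 0.0], [2.0, 0.0], [0.0, 0.5], [2.0, 0.0], [2.0, 0.0]])
kernel = gaussian_kernel(positions, bandwidth=1.0)
compat = 1.0 - np.eye(2)                           # Potts: penalise differing labels

q = mean_field(unary, [kernel], [1.0], compat)
labels = q.argmax(axis=1)                          # pixel 2 is overruled by its neighbours
```

Every operation in the loop is differentiable, which is what allows such an iteration to sit on top of a CNN and be trained with back-propagation. A practical version would replace the dense N x N kernel with an efficient filtering scheme, since the brute-force matrix does not scale to real images.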

Published in: ICCV 2015
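| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Real-Time Semantic Segmentation | Cityscapes test | CRF-RNN | mIoU | 62.5% | # 37 |
| Real-Time Semantic Segmentation | Cityscapes test | CRF-RNN | Time (ms) | 700 | # 24 |
| Real-Time Semantic Segmentation | Cityscapes test | CRF-RNN | Frame rate (fps) | 1.4 | # 25 |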

Results from Other Papers


| Task | Dataset | Model | Metric Name | Metric Value | Rank |
|---|---|---|---|---|---|
| Semantic Segmentation | PASCAL Context | CRF-RNN | mIoU | 39.3 | # 59 |
| Semantic Segmentation | PASCAL VOC 2012 test | CRF-RNN | Mean IoU | 74.7% | # 36 |
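
The mIoU / Mean IoU values reported above are mean intersection-over-union scores: for each class, the number of pixels correctly assigned to it divided by the number of pixels that either the prediction or the ground truth assigns to it, averaged over classes. The sketch below shows one common way to compute this from a confusion matrix. It is a simplified illustration, not any benchmark's official evaluation code: it ignores "void" or "ignore" labels and skips classes absent from both prediction and ground truth, and the function name `mean_iou` is hypothetical.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    # pred, target: integer label maps of the same shape, values in [0, num_classes).
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # confusion[t, p] = number of pixels with ground truth t predicted as p.
    confusion = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)
    intersection = np.diag(confusion)
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - intersection
    valid = union > 0          # skip classes absent from both maps
    return float((intersection[valid] / union[valid]).mean())

# Four pixels, two classes: class 0 has IoU 1/2, class 1 has IoU 2/3.
score = mean_iou(pred=[0, 1, 1, 1], target=[0, 0, 1, 1], num_classes=2)
```

Benchmarks typically accumulate the confusion matrix over the whole test set before taking the ratio, rather than averaging per-image scores; the same function covers that case if all pixels are concatenated first.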

Methods