DeepLabv3 is a semantic segmentation architecture that improves upon DeepLabv2 with several modifications. To handle the problem of segmenting objects at multiple scales, it uses modules that employ atrous convolution in cascade or in parallel, capturing multi-scale context by adopting multiple atrous rates. Furthermore, the Atrous Spatial Pyramid Pooling (ASPP) module from DeepLabv2 is augmented with image-level features that encode global context and further boost performance.
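
To make the role of the atrous rate concrete, here is a minimal pure-Python sketch of a 1-D atrous convolution. The function names `atrous_conv1d` and `effective_kernel_size` are illustrative assumptions and do not come from the paper or any library. The sketch only shows that spacing the kernel taps `rate` samples apart widens the field of view without adding weights.

```python
def atrous_conv1d(signal, kernel, rate=1):
    """Valid 1-D atrous (dilated) correlation: taps sit `rate` samples apart."""
    span = (len(kernel) - 1) * rate  # distance between the first and last tap
    return [
        sum(w * signal[i + j * rate] for j, w in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


def effective_kernel_size(kernel_size, rate):
    """Field of view of a kernel with `rate - 1` holes between adjacent taps."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


signal = list(range(10))   # toy input: 0, 1, ..., 9
kernel = [1, 0, -1]        # three weights, regardless of the rate

print(atrous_conv1d(signal, kernel, rate=1))  # eight outputs, each -2
print(atrous_conv1d(signal, kernel, rate=3))  # four outputs, each -6

# A 3-tap kernel at rates 1, 6, 12, 18 covers 3, 13, 25, 37 samples.
print([effective_kernel_size(3, r) for r in (1, 6, 12, 18)])
```

The same three weights see a wider context as the rate grows, which is why running several rates in cascade or in parallel captures context at several scales.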

The changes to the ASPP module are as follows: the authors apply global average pooling on the last feature map of the model, feed the resulting image-level features to a 1 × 1 convolution with 256 filters (and batch normalization), and then bilinearly upsample the feature to the desired spatial dimension. In the end, the improved ASPP consists of (a) one 1 × 1 convolution and three 3 × 3 convolutions with rates = (6, 12, 18) when output stride = 16 (all with 256 filters and batch normalization), and (b) the image-level features.
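
The branch layout described above can be written down as a small bookkeeping sketch in pure Python. It is not an implementation of the network: `Branch`, `aspp_branches`, `same_padding`, `conv_output_size` and `global_average_pool` are hypothetical helper names, and the 33 × 33 feature map is an arbitrary example size. Doubling the rates for output stride 8 and concatenating the branch outputs follow the DeepLabv3 paper rather than the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Branch:
    name: str
    kernel_size: int
    rate: int
    filters: int = 256  # every branch uses 256 filters plus batch normalization


def aspp_branches(output_stride=16):
    """List the five ASPP branches for a given output stride."""
    base_rates = (6, 12, 18)  # rates quoted for output stride 16
    if output_stride == 16:
        rates = base_rates
    elif output_stride == 8:
        rates = tuple(2 * r for r in base_rates)  # doubled, as in the paper
    else:
        raise ValueError("this sketch only covers output stride 8 or 16")
    branches = [Branch("conv1x1", 1, 1)]
    branches += [Branch(f"conv3x3_rate{r}", 3, r) for r in rates]
    branches.append(Branch("image_pooling", 1, 1))  # pool, 1x1 conv, upsample
    return branches


def same_padding(kernel_size, rate):
    """Padding that keeps the spatial size for a stride-1 atrous convolution."""
    return rate * (kernel_size - 1) // 2


def conv_output_size(size, kernel_size, rate, padding):
    """Output length of a stride-1 atrous convolution along one dimension."""
    return size + 2 * padding - rate * (kernel_size - 1)


def global_average_pool(feature_map):
    """Mean over all spatial positions of a single-channel 2-D feature map."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


branches = aspp_branches(output_stride=16)
for b in branches:
    pad = same_padding(b.kernel_size, b.rate)
    # Every branch keeps the (assumed) 33 x 33 feature map size.
    print(b.name, "padding:", pad, "size:", conv_output_size(33, b.kernel_size, b.rate, pad))

print(sum(b.filters for b in branches))         # 1280 channels after concatenation
print(global_average_pool([[1, 2], [3, 4]]))    # 2.5
```

Because the pooled image-level feature is a single value per channel, bilinearly upsampling it simply repeats that value at every spatial position, so each location of the fused feature map carries the same global context alongside the four local branches.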

Another interesting difference is that DenseCRF post-processing from DeepLabv2 is no longer needed.

Source: Rethinking Atrous Convolution for Semantic Image Segmentation

Latest Papers

Aerial Imagery Pixel-level Segmentation
Michael R. Heffels, Joaquin Vanschoren
Ice Monitoring in Swiss Lakes from Optical Satellites and Webcams using Machine Learning
Manu Tom, Rajanie Prabha, Tianyu Wu, Emmanuel Baltsavias, Laura Leal-Taixe, Konrad Schindler
Semantic Segmentation for Partially Occluded Apple Trees Based on Deep Learning
Zijue Chen, David Ting, Rhys Newbury, Chao Chen
Spontaneous preterm birth prediction using convolutional neural networks
Tomasz Włodarczyk, Szymon Płotka, Przemysław Rokita, Nicole Sochacki-Wójcicka, Jakub Wójcicki, Michał Lipa, Tomasz Trzciński
Multi-Task Pruning for Semantic Segmentation Networks
Xinghao Chen, Yunhe Wang, Yiman Zhang, Peng Du, Chunjing Xu, Chang Xu
A Sim2Real Deep Learning Approach for the Transformation of Images from Multiple Vehicle-Mounted Cameras to a Semantically Segmented Image in Bird's Eye View
Lennart Reiher, Bastian Lampe, Lutz Eckstein
ResNeSt: Split-Attention Networks
Hang Zhang, Chongruo Wu, Zhongyue Zhang, Yi Zhu, Zhi Zhang, Haibin Lin, Yue Sun, Tong He, Jonas Mueller, R. Manmatha, Mu Li, Alexander Smola
Handling Missing MRI Input Data in Deep Learning Segmentation of Brain Metastases: A Multi-Center Study
Endre Grøvik, Darvin Yi, Michael Iv, Elizabeth Tong, Line Brennhaug Nilsen, Anna Latysheva, Cathrine Saxhaug, Kari Dolven Jacobsen, Åslaug Helland, Kyrre Eeg Emblem, Daniel Rubin, Greg Zaharchuk
PointRend: Image Segmentation as Rendering
Alexander Kirillov, Yuxin Wu, Kaiming He, Ross Girshick
AnoNet: Weakly Supervised Anomaly Detection in Textured Surfaces
Manpreet Singh Minhas, John Zelek
What's There in the Dark
Sauradip Nag, Saptakatha Adak, Sukhendu Das
RENAS: Reinforced Evolutionary Neural Architecture Search
Yukang Chen, Gaofeng Meng, Qian Zhang, Shiming Xiang, Chang Huang, Lisen Mu, Xinggang Wang
Reinforced Evolutionary Neural Architecture Search
Yukang Chen, Gaofeng Meng, Qian Zhang, Shiming Xiang, Chang Huang, Lisen Mu, Xinggang Wang
Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation
Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, Hartwig Adam
MobileNetV2: Inverted Residuals and Linear Bottlenecks
Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen
Rethinking Atrous Convolution for Semantic Image Segmentation
Liang-Chieh Chen, George Papandreou, Florian Schroff, Hartwig Adam