Scene Parsing
75 papers with code • 2 benchmarks • 4 datasets
Scene parsing segments an image into regions, each associated with a semantic category such as sky, road, person, or bed. (Description source: MIT.)
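To make the definition concrete, here is a minimal sketch of the final labeling step, assuming a model has already produced one score per category for every pixel. The category list, the function names `parse_scene` and `region_sizes`, and the toy scores are all hypothetical and chosen only for illustration; real systems work on full-resolution score tensors.

```python
# Hypothetical category names, taken from the examples in the description.
CATEGORIES = ["sky", "road", "person", "bed"]


def parse_scene(scores):
    """Turn per-pixel class scores into a label map.

    scores[y][x] is a list with one score per category; the label for a
    pixel is the index of its highest score (argmax).
    """
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]


def region_sizes(label_map):
    """Count how many pixels were assigned to each category name."""
    counts = {}
    for row in label_map:
        for label in row:
            name = CATEGORIES[label]
            counts[name] = counts.get(name, 0) + 1
    return counts


# A toy 2x3 "image": the top row scores highest for sky, the bottom row for
# road, except one pixel that scores highest for person.
toy_scores = [
    [[0.9, 0.1, 0.0, 0.0], [0.8, 0.1, 0.1, 0.0], [0.7, 0.2, 0.1, 0.0]],
    [[0.1, 0.8, 0.1, 0.0], [0.1, 0.2, 0.7, 0.0], [0.0, 0.9, 0.1, 0.0]],
]
label_map = parse_scene(toy_scores)
sizes = region_sizes(label_map)
```

Each entry of `label_map` is a category index, so the image is partitioned into regions that each carry a semantic label, which is the output the task description refers to.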
Latest papers with no code
Cross-CBAM: A Lightweight network for Scene Segmentation
We propose a Cross Convolutional Block Attention Module (CCBAM), in which a cross-multiply operation lets high-level semantic information guide low-level detail information.
Treasure What You Have: Exploiting Similarity in Deep Neural Networks for Efficient Video Processing
Deep learning has enabled various Internet of Things (IoT) applications.
Local and Global Contextual Features Fusion for Pedestrian Intention Prediction
The pedestrian features include body pose and local context features that represent the pedestrian's behaviour.
Weakly Supervised Class-Agnostic Motion Prediction for Autonomous Driving
To this end, we propose a two-stage weakly supervised approach, in which the segmentation model trained with incomplete binary masks in Stage 1 facilitates the self-supervised learning of the motion prediction network in Stage 2 by estimating possible moving foregrounds in advance.
Re:PolyWorld - A Graph Neural Network for Polygonal Scene Parsing
Re:PolyWorld not only outperforms the original model on building extraction in aerial images, thanks to the proposed joint analysis of vertices and edges, but also beats the state-of-the-art in multiple other domains.
Visual Traffic Knowledge Graph Generation from Scene Images
Although previous works on traffic scene understanding have achieved great success, most of them stop at a low-level perception stage, such as road segmentation and lane detection, and few address high-level understanding.
Multi-Sem Fusion: Multimodal Semantic Fusion for 3D Object Detection
Most multi-modal 3D object detection frameworks integrate semantic knowledge from 2D images into 3D LiDAR point clouds to enhance detection accuracy.
GEBNet: Graph-Enhancement Branch Network for RGB-T Scene Parsing
RGB-T (red–green–blue and thermal) scene parsing has recently drawn considerable research attention.
Boundary Corrected Multi-scale Fusion Network for Real-time Semantic Segmentation
Image semantic segmentation aims at pixel-level classification of images, which must be both accurate and fast in practical applications.
Aerial Scene Parsing: From Tile-level Scene Classification to Pixel-wise Semantic Labeling
Finally, we perform ASP by unifying the tile-level scene classification and object-based image analysis to achieve pixel-wise semantic labeling.