Scene Parsing
75 papers with code • 2 benchmarks • 4 datasets
Scene parsing is to segment and parse an image into different image regions associated with semantic categories, such as sky, road, person, and bed. MIT Description
Libraries
Use these libraries to find Scene Parsing models and implementationsSubtasks
Latest papers
HAPNet: Toward Superior RGB-Thermal Scene Parsing via Hybrid, Asymmetric, and Progressive Heterogeneous Feature Fusion
In this study, we take one step toward this new research area by exploring a feasible strategy to fully exploit VFM features for RGB-thermal scene parsing.
Robust Shape Fitting for 3D Scene Abstraction
A RANSAC estimator guided by a neural network fits these primitives to a depth map.
Applying Unsupervised Semantic Segmentation to High-Resolution UAV Imagery for Enhanced Road Scene Parsing
There are two challenges presented in parsing road scenes from UAV images: the complexity of processing high-resolution images and the dependency on extensive manual annotations required by traditional supervised deep learning methods to train robust and accurate models.
A Data-efficient Framework for Robotics Large-scale LiDAR Scene Parsing
More importantly, we innovatively propose to learn to merge the over-divided clusters based on the local low-level geometric property similarities and the learned high-level feature similarities supervised by weak labels.
Generalized Label-Efficient 3D Scene Parsing via Hierarchical Feature Aligned Pre-Training and Region-Aware Fine-tuning
Deep neural network models have achieved remarkable progress in 3D scene understanding while trained in the closed-set setting and with full labels.
Improving Panoptic Segmentation for Nighttime or Low-Illumination Urban Driving Scenes
In this work, we propose two new methods, first to improve the performance, and second to improve the robustness of panoptic segmentation in nighttime or poor illumination urban driving scenes using a domain translation approach.
RT-K-Net: Revisiting K-Net for Real-Time Panoptic Segmentation
Our resulting RT-K-Net sets a new state-of-the-art performance for real-time panoptic segmentation methods on the Cityscapes dataset and shows promising results on the challenging Mapillary Vistas dataset.
DPF: Learning Dense Prediction Fields with Weak Supervision
We showcase the effectiveness of DPFs using two substantially different tasks: high-level semantic parsing and low-level intrinsic image decomposition.
Traffic Scene Parsing through the TSP6K Dataset
To date, most existing datasets focus on autonomous driving scenes.
Uni-3D: A Universal Model for Panoptic 3D Scene Reconstruction
Performing holistic 3D scene understanding from a single-view observation, involving generating instance shapes and 3D scene segmentation, is a long-standing challenge.