Scene Understanding

510 papers with code • 3 benchmarks • 43 datasets

Scene Understanding is the task of interpreting what a scene contains. For instance, the iPhone has a feature that helps visually impaired people take a photo by describing what the camera sees. This is an example of Scene Understanding.

Latest papers with no code

Depth Estimation using Weighted-loss and Transfer Learning

no code yet • 11 Apr 2024

The optimized loss function is a weighted combination of losses chosen to enhance robustness and generalization: Mean Absolute Error (MAE), Edge Loss, and Structural Similarity Index (SSIM).
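
As a rough illustration of how such a weighted combination could be assembled, here is a minimal pure-Python sketch. It is not the paper's implementation: the function names, the equal default weights, the forward-difference form of the edge term, and the use of a single global SSIM (rather than the usual windowed version) are all assumptions made for brevity.

```python
from statistics import fmean


def mae(pred, target):
    """Mean absolute error over all pixels of two equally sized 2D lists."""
    return fmean(abs(p - t) for pr, tr in zip(pred, target) for p, t in zip(pr, tr))


def gradients(img):
    """Forward differences along x and y, concatenated into one flat list."""
    dx = [row[j + 1] - row[j] for row in img for j in range(len(row) - 1)]
    dy = [img[i + 1][j] - img[i][j]
          for i in range(len(img) - 1) for j in range(len(img[0]))]
    return dx + dy


def edge_loss(pred, target):
    """Mean absolute difference between image gradients (assumed edge term).

    Requires images with at least two pixels along one axis.
    """
    return fmean(abs(a - b) for a, b in zip(gradients(pred), gradients(target)))


def global_ssim(pred, target, data_range=1.0):
    """SSIM computed once over the whole image (simplification: no sliding window)."""
    x = [v for row in pred for v in row]
    y = [v for row in target for v in row]
    mx, my = fmean(x), fmean(y)
    vx = fmean((a - mx) ** 2 for a in x)
    vy = fmean((b - my) ** 2 for b in y)
    cov = fmean((a - mx) * (b - my) for a, b in zip(x, y))
    # Standard SSIM stabilising constants, K1 = 0.01 and K2 = 0.03.
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))


def combined_loss(pred, target, w_mae=1.0, w_edge=1.0, w_ssim=1.0):
    """Weighted sum of MAE, edge loss, and (1 - SSIM). Weights are hypothetical."""
    ssim_term = 1.0 - global_ssim(pred, target)
    return (w_mae * mae(pred, target)
            + w_edge * edge_loss(pred, target)
            + w_ssim * ssim_term)


if __name__ == "__main__":
    # Toy 3x3 depth maps with values in [0, 1].
    target = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
    pred = [[0.0, 0.1, 0.9], [0.1, 0.0, 1.0], [0.0, 0.2, 0.8]]
    print(combined_loss(pred, target))
```

In practice the three terms would be computed on tensors inside a training framework, and the weights would be tuned; the sketch only shows how the terms combine.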

Mitigating Object Dependencies: Improving Point Cloud Self-Supervised Learning through Object Exchange

no code yet • 11 Apr 2024

Subsequently, we introduce a context-aware feature learning strategy, which encodes object patterns without relying on their specific context by aggregating object features across various scenes.

Gaga: Group Any Gaussians via 3D-aware Memory Bank

no code yet • 11 Apr 2024

We introduce Gaga, a framework that reconstructs and segments open-world 3D scenes by leveraging inconsistent 2D masks predicted by zero-shot segmentation models.

O2V-Mapping: Online Open-Vocabulary Mapping with Neural Implicit Representation

no code yet • 10 Apr 2024

Online construction of open-ended language scenes is crucial for robotic applications, where open-vocabulary interactive scene understanding is required.

Incorporating Explanations into Human-Machine Interfaces for Trust and Situation Awareness in Autonomous Vehicles

no code yet • 10 Apr 2024

In this sense, explainability of real-time decisions is a crucial and natural requirement for building trust in autonomous vehicles.

QueSTMaps: Queryable Semantic Topological Maps for 3D Scene Understanding

no code yet • 9 Apr 2024

Understanding the structural organisation of 3D indoor scenes in terms of rooms is often accomplished via floorplan extraction.

DaF-BEVSeg: Distortion-aware Fisheye Camera based Bird's Eye View Segmentation with Occlusion Reasoning

no code yet • 9 Apr 2024

We implement a baseline by applying cylindrical rectification on the fisheye images and using a standard LSS-based BEV segmentation model.

Panoptic Perception: A Novel Task and Fine-grained Dataset for Universal Remote Sensing Image Interpretation

no code yet • 6 Apr 2024

Experimental results on FineGrip demonstrate the feasibility of the panoptic perception task and the beneficial effect of multi-task joint optimization on individual tasks.

You Only Scan Once: A Dynamic Scene Reconstruction Pipeline for 6-DoF Robotic Grasping of Novel Objects

no code yet • 4 Apr 2024

In the realm of robotic grasping, achieving accurate and reliable interactions with the environment is a pivotal challenge.

NeRF-MAE: Masked AutoEncoders for Self Supervised 3D representation Learning for Neural Radiance Fields

no code yet • 1 Apr 2024

Given the capabilities of neural fields in densely representing a 3D scene from 2D images, we ask the question: Can we scale their self-supervised pretraining, specifically using masked autoencoders, to generate effective 3D representations from posed RGB images?