Scene Recognition
64 papers with code • 8 benchmarks • 15 datasets
Benchmarks
These leaderboards are used to track progress in Scene Recognition.
Latest papers
MovieCLIP: Visual Scene Recognition in Movies
Longform media such as movies have complex narrative structures, with events spanning a rich variety of ambient visual scenes.
Capsule Networks as Generative Models
Capsule networks are a neural network architecture specialized for visual scene recognition.
All Grains, One Scheme (AGOS): Learning Multi-grain Instance Representation for Aerial Scene Classification
Finally, our SSF allows our framework to learn the same scene scheme from multi-grain instance representations and fuses them, so that the entire framework is optimized as a whole.
Where in the World is this Image? Transformer-based Geo-localization in the Wild
Predicting the geographic location (geo-localization) from a single ground-level RGB image taken anywhere in the world is a very challenging problem.
An Empirical Study of Remote Sensing Pretraining
To this end, we train different networks from scratch on MillionAID, the largest remote sensing (RS) scene recognition dataset to date, to obtain a series of RS-pretrained backbones, including both convolutional neural networks (CNNs) and vision transformers such as Swin and ViTAE, which have shown promising performance on computer vision tasks.
Omnivore: A Single Model for Many Visual Modalities
Prior work has studied different visual modalities in isolation and developed separate architectures for recognition of images, videos, and 3D data.
InstaIndoor and Multi-modal Deep Learning for Indoor Scene Recognition
Furthermore, we highlight the potential of our approach by benchmarking on a YouTube-8M subset of indoor scenes as well, where it achieves 74% accuracy and 0.74 F1-Score.
An embarrassingly simple comparison of machine learning algorithms for indoor scene classification
With the emergence of autonomous indoor robots, the computer vision task of indoor scene recognition has gained the spotlight.
Object-to-Scene: Learning to Transfer Object Knowledge to Indoor Scene Recognition
The final results in this work show that OTS successfully extracts object features and learns object relations from the segmentation network.
BORM: Bayesian Object Relation Model for Indoor Scene Recognition
First, we utilize an improved object model (IOM) as a baseline that enriches the object knowledge by introducing a scene parsing algorithm pretrained on the ADE20K dataset with rich object categories related to the indoor scene.