Scene Classification
121 papers with code • 2 benchmarks • 21 datasets
Scene Classification is the task of assigning a photograph to a scene category. Unlike object classification, which focuses on prominent objects in the foreground, Scene Classification relies on the layout of objects within the scene, together with the ambient context, to make its prediction.
Source: Scene classification with Convolutional Neural Networks
Most implemented papers
A Remote Sensing Image Dataset for Cloud Removal
Removing clouds is an indispensable pre-processing step in remote sensing image analysis.
SEN12MS -- A Curated Dataset of Georeferenced Multi-Spectral Sentinel-1/2 Imagery for Deep Learning and Data Fusion
The availability of curated large-scale training data is a crucial factor for the development of well-generalizing deep learning methods for the extraction of geoinformation from multi-sensor remote sensing imagery.
Receptive-field-regularized CNN variants for acoustic scene classification
One side effect of restricting the RF of CNNs is that more frequency information is lost.
Emergent Properties of Foveated Perceptual Systems
The primary model has a foveated-textural input stage, which we compare to a model with foveated-blurred input and a model with spatially-uniform blurred input (both matched for perceptual compression), and a final reference model with minimal input-based compression.
Understanding the Role of Individual Units in a Deep Neural Network
Second, we use a similar analytic method to analyze a generative adversarial network (GAN) model trained to generate scenes.
A system of vision sensor based deep neural networks for complex driving scene analysis in support of crash risk assessment and prevention
The paper further evaluates the performance of the Multi-Net and the efficiency of the developed system.
Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments
In order to evaluate our multi-task approach, we extend the annotations of the common RGB-D indoor datasets NYUv2 and SUNRGB-D for instance segmentation and orientation estimation.
Vision-Language Models in Remote Sensing: Current Progress and Future Trends
Existing AI-related research in remote sensing primarily focuses on visual understanding tasks while neglecting the semantic understanding of the objects and their relationships.
Efficient Multi-Task Scene Analysis with RGB-D Transformers
However, we show that the dual CNN-based encoder of EMSANet can be replaced with a single Transformer-based encoder.
DeCUR: decoupling common & unique representations for multimodal self-supervision
We propose Decoupling Common and Unique Representations (DeCUR), a simple yet effective method for multimodal self-supervised learning.