Scene Recognition
64 papers with code • 8 benchmarks • 15 datasets
Benchmarks
These leaderboards are used to track progress in Scene Recognition.
Latest papers
NuScenes-MQA: Integrated Evaluation of Captions and QA for Autonomous Driving Datasets using Markup Annotations
Visual Question Answering (VQA) is one of the most important tasks in autonomous driving, which requires accurate recognition and complex situation evaluations.
On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving
This has been a significant bottleneck, particularly in the development of common sense reasoning and nuanced scene understanding necessary for safe and reliable autonomous driving.
Counting Manatee Aggregations using Deep Neural Networks and Anisotropic Gaussian Kernel
In this paper, we propose a deep-learning-based crowd counting approach that automatically counts the number of manatees within a region, using low-quality images as input.
A Prior Instruction Representation Framework for Remote Sensing Image-text Retrieval
Our highlight is the proposal of a paradigm that draws on prior knowledge to instruct adaptive learning of vision and text representations.
DisasterNets: Embedding Machine Learning in Disaster Mapping
It consists of two stages, space granulation and attribute granulation.
NarrativeXL: A Large-scale Dataset For Long-Term Memory Models
We show that our questions 1) adequately represent the source material, 2) can be used to diagnose a model's memory capacity, and 3) are not trivial for modern language models even when the memory demand does not exceed those models' context lengths.
SRRM: Semantic Region Relation Model for Indoor Scene Recognition
Despite the remarkable success of convolutional neural networks in various computer vision tasks, recognizing indoor scenes still presents a significant challenge due to their complex composition.
Designing Deep Networks for Scene Recognition
Most deep learning backbones are evaluated on ImageNet.
CoMAE: Single Model Hybrid Pre-training on Small-Scale RGB-D Datasets
Our CoMAE presents a curriculum learning strategy to unify the two popular self-supervised representation learning algorithms: contrastive learning and masked image modeling.
NEVIS'22: A Stream of 100 Tasks Sampled from 30 Years of Computer Vision Research
A shared goal of several machine learning communities, such as continual learning, meta-learning, and transfer learning, is to design algorithms and models that efficiently and robustly adapt to unseen tasks.