Object Recognition
486 papers with code • 7 benchmarks • 42 datasets
Object recognition is a computer vision technique for detecting and classifying objects in images or videos. Because it combines object detection with image classification, the state-of-the-art tables are recorded separately under each of those component tasks.
(Image credit: TensorFlow Object Detection API)
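To make the "detection plus classification" framing concrete, here is a minimal sketch of a two-stage pipeline. It is illustrative only: `recognize`, `toy_detect`, and `toy_classify` are hypothetical names invented for this example, the image is a toy grid of brightness values, and real systems would plug in learned models (many of which predict boxes and labels jointly rather than in two separate stages).

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max) in pixels
Image = List[List[float]]         # toy grayscale image: rows of brightness values


@dataclass
class RecognizedObject:
    box: Box
    label: str


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by `box` out of the image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def recognize(
    image: Image,
    detect: Callable[[Image], List[Box]],
    classify: Callable[[Image], str],
) -> List[RecognizedObject]:
    """Stage 1: propose boxes. Stage 2: label the crop inside each box."""
    return [RecognizedObject(box, classify(crop(image, box))) for box in detect(image)]


def toy_detect(image: Image) -> List[Box]:
    """Hypothetical stand-in for a detector: proposes the left and right halves."""
    height, width = len(image), len(image[0])
    mid = width // 2
    return [(0, 0, mid, height), (mid, 0, width, height)]


def toy_classify(patch: Image) -> str:
    """Hypothetical stand-in for a classifier: thresholds the mean brightness."""
    pixels = [p for row in patch for p in row]
    mean = sum(pixels) / len(pixels) if pixels else 0.0
    return "bright" if mean > 0.5 else "dark"


if __name__ == "__main__":
    image = [[1.0, 1.0, 0.0, 0.0] for _ in range(4)]
    for obj in recognize(image, toy_detect, toy_classify):
        print(obj.box, obj.label)
```

Passing the detector and classifier in as functions keeps the two component tasks separate, which mirrors how their benchmarks are tracked separately.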
Libraries
Use these libraries to find Object Recognition models and implementations
Datasets
Latest papers
CLoVe: Encoding Compositional Language in Contrastive Vision-Language Models
Recent years have witnessed a significant increase in the performance of Vision and Language tasks.
SHIELD: An Evaluation Benchmark for Face Spoofing and Forgery Detection with Multimodal Large Language Models
For the face forgery detection task, we evaluate GAN-based and diffusion-based data with both visual and acoustic modalities.
Lightweight Pixel Difference Networks for Efficient Visual Representation Learning
With PDC and Bi-PDC, we further present two lightweight deep networks named Pixel Difference Networks (PiDiNet) and Binary PiDiNet (Bi-PiDiNet) respectively to learn highly efficient yet more accurate representations for visual tasks including edge detection and object recognition.
Self-supervised learning of video representations from a child's perspective
These results suggest that important temporal aspects of a child's internal model of the world may be learnable from their visual experience using highly generic learning algorithms and without strong inductive biases.
Local Feature Matching Using Deep Learning: A Survey
The objective of this endeavor is to furnish a comprehensive overview of local feature matching methods.
pix2gestalt: Amodal Segmentation by Synthesizing Wholes
We introduce pix2gestalt, a framework for zero-shot amodal segmentation, which learns to estimate the shape and appearance of whole objects that are only partially visible behind occlusions.
ContextMix: A context-aware data augmentation method for industrial visual inspection systems
With the minimal additional computation cost of image resizing, ContextMix enhances performance compared to existing augmentation techniques.
Seeing the roads through the trees: A benchmark for modeling spatial dependencies with aerial imagery
In this work we propose a road segmentation benchmark dataset, Chesapeake Roads Spatial Context (RSC), for evaluating the spatial long-range context understanding of geospatial machine learning models and show how commonly used semantic segmentation models can fail at this task.
CLIP-guided Federated Learning on Heterogeneous and Long-Tailed Data
For server-side learning, in order to mitigate the heterogeneity and class-distribution imbalance, we generate federated features to retrain the server model.
Exploring Novel Object Recognition and Spontaneous Location Recognition Machine Learning Analysis Techniques in Alzheimer's Mice
Understanding object recognition patterns in mice is crucial for advancing behavioral neuroscience and has significant implications for human health, particularly in the realm of Alzheimer's research.