Object Recognition
484 papers with code • 7 benchmarks • 39 datasets
Object recognition is a computer vision technique for detecting and classifying objects in images or videos. Since it combines object detection with image classification, the state-of-the-art tables are maintained under each of those component tasks.
(Image credit: TensorFlow Object Detection API)
Libraries
Use these libraries to find Object Recognition models and implementations.
Latest papers with no code
Achieving Rotation Invariance in Convolution Operations: Shifting from Data-Driven to Mechanism-Assured
Based on various types of non-learnable operators, including gradient, sort, local binary pattern, and maximum, this paper designs a set of new convolution operations that are naturally invariant to arbitrary rotations.
How to deal with glare for improved perception of Autonomous Vehicles
In this paper, we investigate various glare reduction techniques, including the proposed saturated pixel-aware glare reduction technique for improved performance of the computer vision (CV) tasks employed by the perception layer of AVs.
A Diffusion-based Data Generator for Training Object Recognition Models in Ultra-Range Distance
A challenging example is Ultra-Range Gesture Recognition (URGR) in human-robot interaction, where the user exhibits directive gestures at a distance of up to 25 m from the robot.
Learning State-Invariant Representations of Objects from Image Collections with State, Pose, and Viewpoint Changes
We believe that this dataset will facilitate research in fine-grained object recognition and retrieval of objects that are capable of state changes.
GLCM-Based Feature Combination for Extraction Model Optimization in Object Detection Using Machine Learning
Therefore, based on the trade-off between accuracy and complexity, the K-NN model with a combination of Correlation, Energy, and Homogeneity features emerges as a more suitable choice for real-time applications that demand high accuracy and low complexity.
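The paper's pipeline pairs GLCM texture features with a K-NN classifier. As a minimal, illustrative sketch (using NumPy only; function and variable names are ours, not the paper's), the Correlation, Energy, and Homogeneity features it selects can be computed from a gray-level co-occurrence matrix like this:

```python
import numpy as np

def glcm_features(img, levels=8):
    """Horizontal-offset GLCM plus three Haralick-style features.

    img: 2-D integer array with values in [0, levels).
    Returns [correlation, energy, homogeneity] -- the feature triple
    the excerpt above reports as the best accuracy/complexity trade-off.
    """
    glcm = np.zeros((levels, levels), dtype=float)
    # Count co-occurrences of gray levels one pixel to the right.
    for a, b in zip(img[:, :-1].ravel(), img[:, 1:].ravel()):
        glcm[a, b] += 1
    glcm /= glcm.sum()  # normalise to a joint probability distribution

    i, j = np.indices(glcm.shape)
    energy = np.sqrt((glcm ** 2).sum())
    homogeneity = (glcm / (1.0 + (i - j) ** 2)).sum()
    mu_i, mu_j = (i * glcm).sum(), (j * glcm).sum()
    sd_i = np.sqrt(((i - mu_i) ** 2 * glcm).sum())
    sd_j = np.sqrt(((j - mu_j) ** 2 * glcm).sum())
    correlation = ((i - mu_i) * (j - mu_j) * glcm).sum() / (sd_i * sd_j)
    return np.array([correlation, energy, homogeneity])

# Toy 4x4 image quantised to 8 gray levels; in the paper's setting the
# resulting 3-vector per image would be fed to a K-NN classifier.
img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [2, 2, 3, 3],
                [2, 2, 3, 3]])
feats = glcm_features(img)
```

In practice the same features are available via `skimage.feature.graycomatrix` and `graycoprops`; the point of the hand-rolled version is just to show that each feature is a cheap weighted sum over the co-occurrence matrix, which is why the combination suits low-complexity real-time use.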
Object-conditioned Bag of Instances for Few-Shot Personalized Instance Recognition
Nowadays, users demand increased personalization of vision systems to localize and identify personal instances of objects (e.g., my dog rather than dog) from a few-shot dataset only.
SUGAR: Pre-training 3D Visual Representations for Robotics
SUGAR employs a versatile transformer-based model to jointly address five pre-training tasks: cross-modal knowledge distillation for semantic learning, masked point modeling to understand geometric structure, grasping pose synthesis for object affordance, and 3D instance segmentation and referring expression grounding to analyze cluttered scenes.
Efficient Multi-Band Temporal Video Filter for Reducing Human-Robot Interaction
Although mobile robots have on-board sensors to perform navigation, their efficiency in completing paths can be enhanced by planning to avoid human interaction.
PseudoTouch: Efficiently Imaging the Surface Feel of Objects for Robotic Manipulation
We frame this problem as the task of learning a low-dimensional visual-tactile embedding, wherein we encode a depth patch from which we decode the tactile signal.
EventDance: Unsupervised Source-free Cross-modal Adaptation for Event-based Object Recognition
To this end, we propose a novel framework, dubbed EventDance, for this unsupervised source-free cross-modal adaptation problem.