Object Recognition

486 papers with code • 7 benchmarks • 42 datasets

Object recognition is a computer vision technique for detecting and classifying objects in images or videos. Because it combines object detection with image classification, state-of-the-art results are tracked separately for each component task.

(Image credit: TensorFlow Object Detection API)
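The description above treats recognition as detection followed by classification. A minimal sketch of that composition is shown here, assuming a toy grayscale image stored as nested lists. The names `recognize`, `toy_detect`, and `toy_classify` are hypothetical stand-ins and do not come from any library listed on this page; a real system would plug in a trained detector and classifier in their place.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Image = List[List[int]]          # assumption: grayscale image as rows of 0-255 values
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


@dataclass
class Recognition:
    box: Box
    label: str


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by `box` out of `image`."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def recognize(
    image: Image,
    detect: Callable[[Image], List[Box]],
    classify: Callable[[Image], str],
) -> List[Recognition]:
    """Detection proposes boxes; classification labels each cropped box."""
    return [Recognition(box, classify(crop(image, box))) for box in detect(image)]


def toy_detect(image: Image) -> List[Box]:
    """Hypothetical detector: one box around all non-zero pixels, if any."""
    coords = [(r, c) for r, row in enumerate(image) for c, v in enumerate(row) if v]
    if not coords:
        return []
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return [(min(rows), min(cols), max(rows) + 1, max(cols) + 1)]


def toy_classify(patch: Image) -> str:
    """Hypothetical classifier: label a patch by its mean intensity."""
    pixels = [v for row in patch for v in row]
    return "bright" if sum(pixels) / len(pixels) > 127 else "dark"


image = [
    [0, 0, 0, 0],
    [0, 200, 220, 0],
    [0, 210, 230, 0],
    [0, 0, 0, 0],
]
results = recognize(image, toy_detect, toy_classify)
for r in results:
    print(r.box, r.label)
```

Keeping the two stages as separate callables mirrors the way results are tracked per component task: either stage can be swapped or evaluated without touching the other.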

Benchmarks

These leaderboards are used to track progress in Object Recognition.

Dataset	Best Model
MECCANO	Faster-RCNN
CIFAR10-DVS	SSNN
DVS128 Gesture	SSNN
N-Caltech 101	SSNN
ObjectNet (ImageNet classes, trained on ImageNet)	ObjectNet-Baseline
ObjectNet (ImageNet classes)	ObjectNet-Baseline
ObjectNet (All classes)	ObjectNet-Baseline

Libraries

Use these libraries to find Object Recognition models and implementations.

Library	Papers	Stars
peymanbateni/simple-cnaps	3	110
plai-group/simple-cnaps	3	-
open-mmlab/mmdetection	2	27,857
pytorch/vision	2	15,473

See all 11 libraries.

Latest papers with no code

Learn and Search: An Elegant Technique for Object Lookup using Contrastive Learning

no code yet • 12 Mar 2024

The rapid proliferation of digital content and the ever-growing need for precise object recognition and segmentation have driven the advancement of cutting-edge techniques in the field of object classification and segmentation.

Mapping High-level Semantic Regions in Indoor Environments without Object Recognition

no code yet • 11 Mar 2024

Robots require a semantic understanding of their surroundings to operate in an efficient and explainable way in human environments.

Textureless Object Recognition: An Edge-based Approach

no code yet • 10 Mar 2024

It has been challenging to obtain good accuracy in real time for textureless objects because their lack of discriminative features and their reflectance properties make techniques for textured object recognition insufficient.

A spatiotemporal style transfer algorithm for dynamic visual stimulus generation

no code yet • 7 Mar 2024

It is based on a two-stream deep neural network model that factorizes spatial and temporal features to generate dynamic visual stimuli whose model layer activations are matched to those of input videos.

LoDisc: Learning Global-Local Discriminative Features for Self-Supervised Fine-Grained Visual Recognition

no code yet • 6 Mar 2024

In this paper, we propose incorporating subtle local fine-grained feature learning into global self-supervised contrastive learning through a purely self-supervised global-local fine-grained contrastive learning framework.

MiKASA: Multi-Key-Anchor & Scene-Aware Transformer for 3D Visual Grounding

no code yet • 5 Mar 2024

3D visual grounding involves matching natural language descriptions with their corresponding objects in 3D spaces.

Dual Pose-invariant Embeddings: Learning Category and Object-specific Discriminative Representations for Recognition and Retrieval

no code yet • 1 Mar 2024

This paper presents an attention-based dual-encoder architecture with specially designed loss functions that optimize the inter- and intra-class distances simultaneously in two different embedding spaces, one for the category embeddings and the other for the object-level embeddings.

Unveiling Typographic Deceptions: Insights of the Typographic Vulnerability in Large Vision-Language Model

no code yet • 29 Feb 2024

Large Vision-Language Models (LVLMs) rely on vision encoders and Large Language Models (LLMs) to exhibit remarkable capabilities on various multi-modal tasks in the joint space of vision and language.

DOZE: A Dataset for Open-Vocabulary Zero-Shot Object Navigation in Dynamic Environments

no code yet • 29 Feb 2024

Zero-Shot Object Navigation (ZSON) requires agents to autonomously locate and approach unseen objects in unfamiliar environments and has emerged as a particularly challenging task within the domain of Embodied AI.

ISCUTE: Instance Segmentation of Cables Using Text Embedding

no code yet • 19 Feb 2024

In the field of robotics and automation, conventional object recognition and instance segmentation methods face a formidable challenge when it comes to perceiving Deformable Linear Objects (DLOs) like wires, cables, and flexible tubes.
