Object Recognition

486 papers with code • 7 benchmarks • 42 datasets

Object recognition is a computer vision technique for detecting and classifying objects in images or videos. Because it combines object detection and image classification, state-of-the-art results are tracked separately under each of those two component tasks.

(Image credit: TensorFlow Object Detection API)

Probing Multimodal Large Language Models for Global and Local Semantic Representations

kobayashikanna01/probing_MLLM_rep 27 Feb 2024

The advancement of Multimodal Large Language Models (MLLMs) has greatly accelerated the development of applications in understanding integrated texts and images.

0

CLoVe: Encoding Compositional Language in Contrastive Vision-Language Models

netflix/clove 22 Feb 2024

Recent years have witnessed a significant increase in the performance of Vision and Language tasks.

11

SHIELD: An Evaluation Benchmark for Face Spoofing and Forgery Detection with Multimodal Large Language Models

FaceOnLive/Face-Liveness-Detection-SDK-Linux 6 Feb 2024

For the face forgery detection task, we evaluate GAN-based and diffusion-based data with both visual and acoustic modalities.

200

Lightweight Pixel Difference Networks for Efficient Visual Representation Learning

hellozhuo/pidinet 1 Feb 2024

With PDC and Bi-PDC, we further present two lightweight deep networks, named Pixel Difference Networks (PiDiNet) and Binary PiDiNet (Bi-PiDiNet) respectively, to learn highly efficient yet more accurate representations for visual tasks including edge detection and object recognition.

414

Self-supervised learning of video representations from a child's perspective

eminorhan/video-models 1 Feb 2024

These results suggest that important temporal aspects of a child's internal model of the world may be learnable from their visual experience using highly generic learning algorithms and without strong inductive biases.

4

Local Feature Matching Using Deep Learning: A Survey

vignywang/awesome-local-feature-matching 31 Jan 2024

The objective of this endeavor is to furnish a comprehensive overview of local feature matching methods.

37

pix2gestalt: Amodal Segmentation by Synthesizing Wholes

cvlab-columbia/pix2gestalt 25 Jan 2024

We introduce pix2gestalt, a framework for zero-shot amodal segmentation, which learns to estimate the shape and appearance of whole objects that are only partially visible behind occlusions.

90

ContextMix: A context-aware data augmentation method for industrial visual inspection systems

hy2mk/contextmix 18 Jan 2024

With the minimal additional computation cost of image resizing, ContextMix enhances performance compared to existing augmentation techniques.

1

Seeing the roads through the trees: A benchmark for modeling spatial dependencies with aerial imagery

isaaccorley/chesapeakersc 12 Jan 2024

In this work we propose a road segmentation benchmark dataset, Chesapeake Roads Spatial Context (RSC), for evaluating the spatial long-range context understanding of geospatial machine learning models and show how commonly used semantic segmentation models can fail at this task.

35

CLIP-guided Federated Learning on Heterogeneous and Long-Tailed Data

shijiangming1/clip2fl 14 Dec 2023

For server-side learning, in order to mitigate the heterogeneity and class-distribution imbalance, we generate federated features to retrain the server model.

12