Location-Sensitive Visual Recognition with Cross-IOU Loss

11 Apr 2021  ·  Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian

Object detection, instance segmentation, and pose estimation are popular visual recognition tasks that require localizing the object via internal or boundary landmarks. This paper summarizes these tasks as location-sensitive visual recognition and proposes a unified solution named location-sensitive network (LSNet). Built on a deep neural network backbone, LSNet predicts an anchor point and a set of landmarks that together define the shape of the target object. The key to optimizing LSNet lies in its ability to fit objects of various scales, for which we design a novel loss function, the cross-IOU loss, that computes the cross-IOU of each anchor-point-landmark pair to approximate the global IOU between the prediction and the ground truth. The flexibly located and accurately predicted landmarks also enable LSNet to incorporate richer contextual information for visual recognition. Evaluated on the MS-COCO dataset, LSNet sets new state-of-the-art accuracy for anchor-free object detection (53.5% box AP) and instance segmentation (40.2% mask AP), and shows promising performance in detecting multi-scale human poses. Code is available at https://github.com/Duankaiwen/LSNet
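As a rough illustration of the cross-IOU idea described in the abstract: if each landmark is represented by its 2-D offset from the anchor point, one way to approximate a per-landmark IOU is to decompose the offset into its non-negative projections onto the four half-axes and compare the predicted and ground-truth "crosses" element-wise. The sketch below follows that reading; the function name, tensor layout, and the min/max ratio formulation are illustrative assumptions, not the repository's actual implementation.

```python
import torch

def cross_iou_loss(pred, target, eps=1e-6):
    """Sketch of a cross-IOU-style loss for anchor-point-to-landmark offsets.

    pred, target: tensors of shape (N, K, 2) holding (dx, dy) offsets from the
    anchor point to each of K landmarks. Each offset is decomposed into its
    non-negative projections onto the four half-axes (+x, -x, +y, -y); the
    IOU of the two resulting crosses is approximated by sum(min) / sum(max).
    """
    def decompose(v):
        x, y = v[..., 0], v[..., 1]
        return torch.stack(
            [x.clamp(min=0), (-x).clamp(min=0),
             y.clamp(min=0), (-y).clamp(min=0)], dim=-1)  # (N, K, 4)

    p, t = decompose(pred), decompose(target)
    inter = torch.minimum(p, t).sum(dim=-1)               # per-landmark overlap
    union = torch.maximum(p, t).sum(dim=-1).clamp(min=eps)
    iou = inter / union                                    # (N, K), in [0, 1]
    return (1.0 - iou).mean()
```

Under this reading, the per-landmark term stays bounded in [0, 1] and is invariant to the scale of the offsets, which matches the paper's stated motivation of fitting objects of various scales with a single loss.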


Datasets

MS-COCO (test-dev)

Results from the Paper


Task: Object Detection
Dataset: COCO test-dev
Model: LSNet (Res2Net-101 + DCN, multi-scale)

Metric     Value   Global Rank
box mAP    53.5    #56
AP50       71.1    #27
AP75       59.2    #19
APS        35.2    #18
APM        56.4    #17
APL        65.8    #21

Methods


No methods listed for this paper.