TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Head Pose Estimation	AFLW2000	FSA-Net (Caps-Fusion)	MAE	5.07	# 16
Head Pose Estimation	AFLW2000	FSA-Net (Caps-Fusion)	Geodesic Error (GE)	8.16	# 4
Head Pose Estimation	BIWI	FSA-Net (Caps-Fusion)	MAE (trained with other data)	4.00	# 9
Head Pose Estimation	BIWI	FSA-Net (Caps-Fusion)	Geodesic Error (GE)	7.64	# 4
Head Pose Estimation	BIWI	FSA-Net (Caps-Fusion)	MAE-aligned (trained with other data)	2.92	# 1
Head Pose Estimation	BIWI	FSA-Net (Caps-Fusion)	Geodesic Error - aligned (GE)	5.36	# 1
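
The two metric families in the table can be illustrated with a minimal sketch. It assumes that MAE is the mean absolute error over the yaw, pitch and roll angles in degrees, and that Geodesic Error is the angle of the relative rotation between the predicted and ground-truth rotation matrices. The exact benchmark protocols, including the "aligned" variants, are not specified in this listing, and the function names below are hypothetical.

```python
import math

def mean_absolute_error(pred, truth):
    """Mean absolute error over all angles, in the unit of the inputs (degrees here)."""
    diffs = [abs(p - t) for ps, ts in zip(pred, truth) for p, t in zip(ps, ts)]
    return sum(diffs) / len(diffs)

def geodesic_error_deg(r_pred, r_true):
    """Angle in degrees of the relative rotation between two 3x3 rotation matrices."""
    # trace(R_pred^T R_true) equals the sum of element-wise products.
    trace = sum(r_pred[i][j] * r_true[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))

def rot_z(deg):
    """Helper for the example: rotation by `deg` degrees about the z-axis."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# One sample with (yaw, pitch, roll) in degrees.
mae = mean_absolute_error([(10.0, 0.0, -5.0)], [(12.0, 1.0, -5.0)])
ge = geodesic_error_deg(rot_z(0.0), rot_z(30.0))
```

Unlike a per-angle error, the geodesic error does not depend on the Euler-angle convention, which may explain why the two metrics rank the same model differently.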

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/fsa-net-learning-fine-grained-structure/head-pose-estimation-on-biwi)](https://paperswithcode.com/sota/head-pose-estimation-on-biwi?p=fsa-net-learning-fine-grained-structure)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/fsa-net-learning-fine-grained-structure/head-pose-estimation-on-aflw2000)](https://paperswithcode.com/sota/head-pose-estimation-on-aflw2000?p=fsa-net-learning-fine-grained-structure)`

FSA-Net: Learning Fine-Grained Structure Aggregation for Head Pose Estimation From a Single Image

CVPR 2019 · Tsun-Yi Yang, Yi-Ting Chen, Yen-Yu Lin, Yung-Yu Chuang

This paper proposes a method for head pose estimation from a single image. Previous methods often predict head poses through landmark or depth estimation and require more computation than necessary. Our method is based on regression and feature aggregation. To keep the model compact, we employ the soft stagewise regression scheme. Existing feature aggregation methods treat inputs as a bag of features and thus ignore their spatial relationship in a feature map. We propose to learn a fine-grained structure mapping for spatially grouping features before aggregation. The fine-grained structure provides part-based information and pooled values. By utilizing learnable and non-learnable importance over the spatial location, different model variants can be generated to form a complementary ensemble. Experiments show that our method outperforms state-of-the-art methods, including both landmark-free ones and those based on landmark or depth estimation. With only a single RGB frame as input, our method even outperforms methods utilizing multi-modality information (RGB-D, RGB-Time) on estimating the yaw angle. Furthermore, the memory overhead of our model is 100 times smaller than that of previous methods.
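
The idea of weighting spatial locations by importance before aggregation can be sketched as follows. This is only an illustration of the general idea, not the authors' implementation: it does not model the fine-grained structure mapping, and the scoring functions `uniform_score` and `variance_score` are hypothetical examples of non-learnable importance. A learnable variant would replace them with a trained layer.

```python
def uniform_score(features):
    """Non-learnable importance: every spatial location counts equally."""
    return [1.0 for _ in features]

def variance_score(features):
    """Non-learnable importance: variance of the channel values at each location."""
    scores = []
    for vec in features:
        mean = sum(vec) / len(vec)
        scores.append(sum((v - mean) ** 2 for v in vec) / len(vec))
    return scores

def weighted_pool(features, scores):
    """Aggregate per-location feature vectors using normalized importance weights."""
    total = sum(scores)
    if total == 0:
        # Fall back to uniform weights when all scores are zero.
        weights = [1.0 / len(scores)] * len(scores)
    else:
        weights = [s / total for s in scores]
    channels = len(features[0])
    return [sum(w * vec[c] for w, vec in zip(weights, features))
            for c in range(channels)]

# A flattened feature map: two spatial locations with two channels each.
features = [[1.0, 1.0], [0.0, 4.0]]
pooled_uniform = weighted_pool(features, uniform_score(features))
pooled_variance = weighted_pool(features, variance_score(features))
```

Different scoring functions yield different pooled vectors from the same feature map, which is the sense in which model variants built on them could complement each other in an ensemble.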

Code

shamangary/FSA-Net

598

Tasks

Depth Estimation

Head Pose Estimation

Pose Estimation

regression

Datasets

300W

AFLW2000-3D

BIWI

Results from the Paper

Ranked #9 on Head Pose Estimation on BIWI
