UMMAFormer: A Universal Multimodal-adaptive Transformer Framework for Temporal Forgery Localization

28 Aug 2023  ·  Rui Zhang, Hongxia Wang, Mingshan Du, Hanqing Liu, Yang Zhou, Qiang Zeng ·

The emergence of artificial intelligence-generated content (AIGC) has raised concerns about the authenticity of multimedia content across many fields. However, existing research on forgery detection has focused mainly on binary classification of complete videos, which limits its applicability in industrial settings. To address this gap, we propose UMMAFormer, a novel universal transformer framework for temporal forgery localization (TFL) that predicts forged segments with multimodal adaptation. Our approach introduces a Temporal Feature Abnormal Attention (TFAA) module, based on temporal feature reconstruction, to enhance the detection of temporal differences. We also design a Parallel Cross-Attention Feature Pyramid Network (PCA-FPN) that augments the standard Feature Pyramid Network (FPN) for subtle feature enhancement. To evaluate the proposed method, we contribute a new Temporal Video Inpainting Localization (TVIL) dataset specifically tailored to video inpainting scenes. Our experiments show that our approach achieves state-of-the-art performance on the LAV-DF, TVIL, and Psynd benchmarks, significantly outperforming previous methods. The code and data are available at https://github.com/ymhzyj/UMMAFormer/.
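The core idea behind the TFAA module — scoring frames by how poorly their temporal features can be reconstructed, so that anomalous (potentially forged) segments receive more attention — can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; the naive mean-based "reconstruction" and the function name `tfaa_attention` are stand-ins for illustration only.

```python
import numpy as np

def tfaa_attention(features, recon):
    # Sketch of the abnormal-attention idea: frames whose features are
    # poorly reconstructed get higher attention weight (softmax over
    # per-frame reconstruction error). Hypothetical, not the paper's code.
    err = np.linalg.norm(features - recon, axis=-1)  # per-frame error, shape (T,)
    w = np.exp(err - err.max())                      # numerically stable softmax
    return w / w.sum()

# Toy sequence: 8 frames with identical 4-dim features, except frame 5,
# which is perturbed to mimic a forged/inpainted segment.
rng = np.random.default_rng(0)
feats = np.tile(rng.normal(size=(1, 4)), (8, 1))
feats[5] += 3.0
# Stand-in reconstruction: each frame predicted as the sequence mean.
recon = feats.mean(axis=0, keepdims=True).repeat(8, axis=0)
attn = tfaa_attention(feats, recon)
print(attn.argmax())  # the perturbed frame dominates the attention: 5
```

In the actual model this signal modulates transformer features rather than being used directly, but the sketch shows why reconstruction error is a usable cue: a forged frame is, by construction, the one the temporal model explains worst.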


Datasets


Introduced in the Paper:

TVIL

Used in the Paper:

LAV-DF
Task: Temporal Forgery Localization · Dataset: LAV-DF · Model: UMMAFormer

Metric    Value   Global Rank
AR@100    92.42   #1
AR@50     92.48   #1
AR@20     92.47   #1
AR@10     92.10   #1
AP@0.5    98.83   #1
AP@0.75   95.54   #1
AP@0.95   37.61   #1
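The AR@K and AP@tIoU metrics above are standard for temporal localization: a predicted segment counts as a hit if its temporal IoU with a ground-truth segment meets a threshold, and AR@K measures recall over the top-K predictions. The sketch below shows the temporal-IoU test and a simplified greedy AR@K-style recall under an assumed 0.5 threshold; the helper names and toy segments are illustrative, not from the benchmark code.

```python
def t_iou(a, b):
    # Temporal IoU between two (start, end) segments in seconds.
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

def recall_at_k(preds, gts, k, thr=0.5):
    # AR@K-style recall (simplified): fraction of ground-truth segments
    # matched one-to-one by any of the top-k predictions at tIoU >= thr.
    # Assumes preds are already sorted by descending confidence.
    top, used, matched = preds[:k], set(), 0
    for gt in gts:
        for i, p in enumerate(top):
            if i not in used and t_iou(p, gt) >= thr:
                used.add(i)
                matched += 1
                break
    return matched / len(gts)

preds = [(0.9, 2.1), (5.0, 6.0), (3.0, 3.5)]  # toy forgery proposals
gts = [(1.0, 2.0), (3.0, 4.0)]                # toy ground-truth segments
print(recall_at_k(preds, gts, k=3))  # 1.0 — both segments recovered
print(recall_at_k(preds, gts, k=2))  # 0.5 — the (3.0, 3.5) proposal is cut off
```

AP@0.5 / AP@0.75 / AP@0.95 follow the same tIoU matching but average precision over confidence ranks; the sharp drop from 95.54 at tIoU 0.75 to 37.61 at 0.95 reflects how strict near-exact boundary agreement is.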

Methods