Video Alignment

21 papers with code • 2 benchmarks • 4 datasets

Benchmarks

These leaderboards track progress in Video Alignment.

Dataset	Best Model
UPenn Action	TCC + TCN
MSU Video Alignment and Retrieval Benchmark Suite	VQMT3D

Latest papers

Listen Then See: Video Alignment with Speaker Attention

sts-vlcc/sts-vlcc • 21 Apr 2024

Our approach exhibits an improved ability to leverage the video modality by using the audio modality as a bridge with the language modality.

Subjective-Aligned Dataset and Metric for Text-to-Video Quality Assessment

qmme/t2vqa • 18 Mar 2024

Based on T2VQA-DB, we propose a novel transformer-based model for subjective-aligned Text-to-Video Quality Assessment (T2VQA).

AIGCBench: Comprehensive Evaluation of Image-to-Video Content Generated by AI

benchcouncil/aigcbench • 3 Jan 2024

To establish a unified evaluation framework for video generation tasks, our benchmark includes 11 metrics spanning four dimensions to assess algorithm performance.

EvalCrafter: Benchmarking and Evaluating Large Video Generation Models

EvalCrafter/EvalCrafter • 17 Oct 2023

For video generation, various open-source models and publicly available services have been developed to generate high-quality videos.

Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation

showlab/show-1 • 27 Sep 2023

In this paper, we are the first to propose a hybrid model, dubbed as Show-1, which marries pixel-based and latent-based VDMs for text-to-video generation.

A Solution to CVPR'2023 AQTC Challenge: Video Alignment for Multi-Step Inference

zcfinal/loveu-cvpr23-aqtc • 26 Jun 2023

In this paper, we present a solution for enhancing video alignment to improve multi-step inference.

Seeing the Pose in the Pixels: Learning Pose-Aware Representations in Vision Transformers

dominickrei/poseawarevt • 15 Jun 2023

Both PAAT and PAAB surpass their respective backbone Transformers by up to 9.8% in real-world action recognition and 21.8% in multi-view robotic video alignment.

Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation

daooshee/hd-vg-130m • 18 May 2023

Moreover, to fully unlock model capabilities for high-quality video generation and promote the development of the field, we curate a large-scale and open-source video dataset called HD-VG-130M.

Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models

baaivision/vid2vid-zero • 30 Mar 2023

Our vid2vid-zero leverages off-the-shelf image diffusion models, and doesn't require training on any video.

Aligning Step-by-Step Instructional Diagrams to Video Demonstrations

DavidZhang73/AssemblyVideoManualAlignment • CVPR 2023

In this paper, we consider a novel setting where such an alignment is between (i) instruction steps that are depicted as assembly diagrams (commonly seen in Ikea assembly manuals) and (ii) video segments from in-the-wild videos; these videos comprising an enactment of the assembly actions in the real world.

24 Mar 2023
