Video Alignment
22 papers with code • 2 benchmarks • 4 datasets
Most implemented papers
Learning a Grammar Inducer from Massive Uncurated Instructional Videos
While previous work focuses on building systems for inducing grammars on text that is well-aligned with video content, we investigate the scenario in which text and video are only in loose correspondence.
Weakly Supervised Video Representation Learning with Unaligned Text for Sequential Videos
Sequential video understanding, as an emerging video understanding task, has attracted considerable research attention because of its goal-oriented nature.
Aligning Step-by-Step Instructional Diagrams to Video Demonstrations
In this paper, we consider a novel setting where such an alignment is between (i) instruction steps that are depicted as assembly diagrams (commonly seen in IKEA assembly manuals) and (ii) segments from in-the-wild videos, where these videos comprise an enactment of the assembly actions in the real world.
Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models
Our vid2vid-zero leverages off-the-shelf image diffusion models and does not require training on any video.
Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation
Moreover, to fully unlock model capabilities for high-quality video generation and promote the development of the field, we curate a large-scale and open-source video dataset called HD-VG-130M.
Seeing the Pose in the Pixels: Learning Pose-Aware Representations in Vision Transformers
Both PAAT and PAAB surpass their respective backbone Transformers by up to 9.8% in real-world action recognition and 21.8% in multi-view robotic video alignment.
A Solution to CVPR'2023 AQTC Challenge: Video Alignment for Multi-Step Inference
In this paper, we present a solution for enhancing video alignment to improve multi-step inference.
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation
In this paper, we are the first to propose a hybrid model, dubbed Show-1, which marries pixel-based and latent-based VDMs for text-to-video generation.
EvalCrafter: Benchmarking and Evaluating Large Video Generation Models
For video generation, various open-source models and publicly available services have been developed to generate high-quality videos.
AIGCBench: Comprehensive Evaluation of Image-to-Video Content Generated by AI
To establish a unified evaluation framework for video generation tasks, our benchmark includes 11 metrics spanning four dimensions to assess algorithm performance.