Search Results for author: Sitong Su

Found 4 papers, 0 papers with code

Training-Free Semantic Video Composition via Pre-trained Diffusion Model

no code implementations · 17 Jan 2024 · Jiaqi Guo, Sitong Su, Junchen Zhu, Lianli Gao, Jingkuan Song

We propose a training-free pipeline employing a pre-trained diffusion model imbued with semantic prior knowledge, which can process composite videos with broader semantic disparities.

F3-Pruning: A Training-Free and Generalized Pruning Strategy towards Faster and Finer Text-to-Video Synthesis

no code implementations · 6 Dec 2023 · Sitong Su, Jianzhi Liu, Lianli Gao, Jingkuan Song

Recently, Text-to-Video (T2V) synthesis has undergone a breakthrough by training transformers or diffusion models on large-scale datasets.

Make-A-Storyboard: A General Framework for Storyboard with Disentangled and Merged Control

no code implementations · 6 Dec 2023 · Sitong Su, Litao Guo, Lianli Gao, Heng Tao Shen, Jingkuan Song

Story Visualization aims to generate images aligned with story prompts, reflecting the coherence of storybooks through visual consistency among characters and scenes. However, current approaches concentrate exclusively on characters and neglect the visual consistency among contextually correlated scenes, resulting in independent character images without inter-image coherence. To tackle this issue, we propose a new presentation form for Story Visualization called Storyboard, inspired by film-making, as illustrated in Fig. 1. Specifically, a Storyboard unfolds a story into visual representations scene by scene.

Story Visualization

MotionZero: Exploiting Motion Priors for Zero-shot Text-to-Video Generation

no code implementations · 28 Nov 2023 · Sitong Su, Litao Guo, Lianli Gao, Heng Tao Shen, Jingkuan Song

To tackle these two issues, we propose a prompt-adaptive and disentangled motion control strategy, coined MotionZero, which derives motion priors for different objects from prompts via Large Language Models and accordingly applies motion control for each object to its corresponding region in a disentangled manner.

Disentanglement · Text-to-Video Generation · +2
