no code implementations • 17 Jan 2024 • Jiaqi Guo, Sitong Su, Junchen Zhu, Lianli Gao, Jingkuan Song
Therefore, we propose a training-free pipeline that employs a pre-trained diffusion model imbued with semantic prior knowledge, enabling it to process composite videos with broader semantic disparities.
no code implementations • 6 Dec 2023 • Sitong Su, Jianzhi Liu, Lianli Gao, Jingkuan Song
Recently Text-to-Video (T2V) synthesis has undergone a breakthrough by training transformers or diffusion models on large-scale datasets.
no code implementations • 6 Dec 2023 • Sitong Su, Litao Guo, Lianli Gao, Heng Tao Shen, Jingkuan Song
Story Visualization aims to generate images aligned with story prompts, reflecting the coherence of storybooks through visual consistency among characters and scenes. However, current approaches concentrate exclusively on characters and neglect the visual consistency among contextually correlated scenes, resulting in independent character images without inter-image coherence. To tackle this issue, we propose a new presentation form for Story Visualization called Storyboard, inspired by film-making, as illustrated in Fig. 1. Specifically, a Storyboard unfolds a story into visual representations scene by scene.
no code implementations • 28 Nov 2023 • Sitong Su, Litao Guo, Lianli Gao, Heng Tao Shen, Jingkuan Song
To tackle the two issues, we propose a prompt-adaptive and disentangled motion control strategy, coined MotionZero, which derives motion priors for different objects from the prompt via Large Language Models and accordingly applies motion control to each object's corresponding region in a disentangled manner.
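The abstract's two-step idea, deriving per-object motion priors from the prompt and then steering each object's region independently, can be illustrated with a minimal sketch. Everything here is hypothetical: the keyword table stands in for the Large-Language-Model query, and the mask-based pixel shift stands in for the paper's actual region-wise motion control.

```python
import numpy as np

def extract_motion_priors(prompt):
    """Toy stand-in for the LLM step described in the abstract:
    map each object mentioned in the prompt to a coarse motion
    direction (dx, dy). The keyword rules are placeholders, not
    the paper's method."""
    rules = {"bird": (0, -1), "car": (1, 0), "leaf": (0, 1)}
    return {obj: vec for obj, vec in rules.items() if obj in prompt}

def apply_disentangled_motion(frame, masks, priors, step=1):
    """Shift each object's masked region by its own motion prior,
    leaving all other regions untouched -- the 'disentangled'
    control: one object's motion never leaks into another's region."""
    out = frame.copy()
    for obj, (dx, dy) in priors.items():
        mask = masks[obj]
        region = np.where(mask, frame, 0.0)
        shift = (step * dy, step * dx)  # (rows, cols)
        shifted = np.roll(region, shift=shift, axis=(0, 1))
        moved_mask = np.roll(mask, shift=shift, axis=(0, 1))
        out = np.where(moved_mask, shifted, out)
    return out

# Usage: a 6x6 frame with a single 'bird' pixel that drifts upward.
frame = np.zeros((6, 6))
frame[2, 2] = 1.0
masks = {"bird": frame.astype(bool)}
priors = extract_motion_priors("a bird flying over the sea")
next_frame = apply_disentangled_motion(frame, masks, priors)
```

Per-object masks plus independent shifts keep each object's control local, which is the core of the disentanglement claim; the real pipeline would realize this inside the diffusion process rather than by pixel shifting.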