no code implementations • 20 Mar 2024 • Haoran Lang, Yuxuan Ge, Zheng Tian
For text-to-video generation tasks where temporal conditions are not explicitly given, we propose a two-stage generation strategy that decouples the generation of temporal features from that of semantic-content features.