no code implementations • 29 Jun 2023 • Junchen Lu, Berrak Sisman, Mingyang Zhang, Haizhou Li
The goal of Automatic Voice Over (AVO) is to generate speech in sync with a silent video given its text script.
no code implementations • 7 Oct 2021 • Junchen Lu, Berrak Sisman, Rui Liu, Mingyang Zhang, Haizhou Li
The proposed VisualTTS adopts two novel mechanisms, 1) textual-visual attention and 2) a visual fusion strategy during acoustic decoding, both of which contribute to accurate alignment between the input text content and the lip motion in the input lip sequence.
no code implementations • 10 Aug 2020 • Junchen Lu, Kun Zhou, Berrak Sisman, Haizhou Li
We train an encoder to disentangle singer identity and singing prosody (F0 contour) from phonetic content.