1 code implementation • 8 Apr 2023 • Jeongkyun Park, Kwanghee Choi, Hyunjun Heo, Hyung-Min Park
However, the pooling problem remains: the length of speech representations is inherently variable.
1 code implementation • 16 Jan 2023 • Jeongkyun Park, Jung-Wook Hwang, Kwanghee Choi, Seung-Hyun Lee, Jun Hwan Ahn, Rae-Hong Park, Hyung-Min Park
Inspired by humans comprehending speech in a multi-modal manner, various audio-visual datasets have been constructed.