no code implementations • 25 Jun 2022 • Roshan Sharma, Tyler Vuong, Mark Lindsey, Hira Dhamyal, Rita Singh, Bhiksha Raj
This work presents a multitask approach to the simultaneous estimation of age, country of origin, and emotion given vocal burst audio for the 2022 ICML Expressive Vocalizations Challenge ExVo-MultiTask track.
1 code implementation • 19 Oct 2020 • Tyler Vuong, Yangyang Xia, Richard Stern
Voice Type Discrimination (VTD) refers to discrimination between regions in a recording where speech was produced by speakers that are physically within proximity of the recording device ("Live Speech") from speech and other types of audio that were played back such as traffic noise and television broadcasts ("Distractor Audio").
Audio and Speech Processing
no code implementations • 2 Sep 2018 • Ankit Shah, Tyler Vuong
Deep Reinforcement learning with appropriate constraints would look only for the relevant person in the image as opposed to an unconstrained approach where each individual objects in the image are ranked.