Search Results for author: Heinrich Dinkel

Found 7 papers, 7 papers with code

Towards duration robust weakly supervised sound event detection

1 code implementation • 19 Jan 2021 • Heinrich Dinkel, Mengyue Wu, Kai Yu

Our model outperforms other approaches on the DCASE2018 and URBAN-SED datasets without requiring prior duration knowledge.

Data Augmentation, Sound Event Detection, Sound, Audio and Speech Processing

Multiple Sound Sources Localization from Coarse to Fine

1 code implementation • ECCV 2020 • Rui Qian, Di Hu, Heinrich Dinkel, Mengyue Wu, Ning Xu, Weiyao Lin

Visually localizing multiple sound sources in unconstrained videos is a formidable problem, especially when pairwise sound-object annotations are lacking.

Voice activity detection in the wild via weakly supervised sound event detection

1 code implementation • 27 Mar 2020 • Heinrich Dinkel, Yefei Chen, Mengyue Wu, Kai Yu

We proposed two GPVAD models, one full (GPV-F), trained on 527 Audioset sound events, and one binary (GPV-B), only distinguishing speech and noise.

Sound, Audio and Speech Processing

Audio Caption in a Car Setting with a Sentence-Level Loss

1 code implementation • 31 May 2019 • Xuenan Xu, Heinrich Dinkel, Mengyue Wu, Kai Yu

Captioning has attracted much attention in image and video understanding, while only a small amount of work has examined audio captioning.

Audio captioning, Decoder, +6 more

Duration robust sound event detection

1 code implementation • 8 Apr 2019 • Heinrich Dinkel, Kai Yu

Task 4 of the DCASE2018 challenge demonstrated that substantially more research is needed for a real-world application of sound event detection.

Sound, Audio and Speech Processing

Text-based depression detection on sparse data

1 code implementation • 8 Apr 2019 • Heinrich Dinkel, Mengyue Wu, Kai Yu

Previous text-based depression detection is commonly based on large user-generated data.

Depression Detection, Sentence, +1 more

Audio Caption: Listen and Tell

1 code implementation • 25 Feb 2019 • Mengyue Wu, Heinrich Dinkel, Kai Yu

A baseline encoder-decoder model is provided for both English and Mandarin.

Decoder, General Classification
