no code implementations • 4 Jun 2022 • Andrew Koh, Soham Tiwari, Chng Eng Siong
In this paper, we propose an algorithm, Epochal Difficult Captions, to supplement the training of any model for the Automated Audio Captioning task.
1 code implementation • 1 Nov 2021 • Soham Tiwari, Kshitiz Lakhotia, Manjunath Mulimani
Inspired by the You Only Look Once (YOLO) algorithm in computer vision, the YOHO algorithm can match the performance of state-of-the-art algorithms on datasets such as the Music Speech Detection Dataset, TUT Sound Event, and Urban-SED, but with lower inference times.