no code implementations • 19 Jan 2024 • Inmo Yeon, Jung-Woo Choi
Room geometry inference (RGI) aims to estimate room shapes from measured room impulse responses (RIRs) and has received considerable attention for its importance in environment-aware audio rendering and virtual acoustic representation of a real venue.
no code implementations • 20 Dec 2023 • Yusun Shul, Jung-Woo Choi
Prior studies employ spectral and channel information as the embedding for temporal attention.
no code implementations • 10 Oct 2023 • Soonhyeon Choi, Jung-Woo Choi
Unsupervised anomalous sound detection (ASD) aims to identify anomalous sounds by learning the features of normal operational sounds and sensing their deviations.
no code implementations • 4 Sep 2023 • Inmo Yeon, Jung-Woo Choi
However, the conventional RGI technique makes several assumptions, such as convex room shapes, a number of walls known a priori, and the visibility of first-order reflections.
no code implementations • 30 Aug 2023 • Dongheon Lee, Jung-Woo Choi
In the proposed split dense blocks, which extract spatial features, a pair of subgroups is sequentially concatenated and processed by convolution layers to reduce computational complexity and memory usage.
no code implementations • 5 Jun 2023 • Yusun Shul, Byeong-Yun Ko, Jung-Woo Choi
Localizing sounds and detecting events in different room environments is a difficult task, mainly due to the wide range of reflections and reverberations.
no code implementations • 23 Apr 2023 • Wonjun Yi, Jung-Woo Choi, Jae-Woo Lee
The dataset was constructed by collecting the operating sounds of drones from microphones mounted on three different drones in an anechoic chamber.
1 code implementation • 15 Dec 2022 • Dongheon Lee, Jung-Woo Choi
In this study, we propose a dense frequency-time attentive network (DeFT-AN) for multichannel speech enhancement.
Ranked #1 on Speech Enhancement on spatialized DNS challenge
no code implementations • 8 Nov 2021 • Dongheon Lee, Seongrae Kim, Jung-Woo Choi
In this study, we propose an end-to-end time-domain speech enhancement network that can facilitate the use of inter-channel relationships at individual layers of a DNN.
Ranked #1 on Speech Enhancement on CHiME-3