Search Results for author: Philip J. B. Jackson

Found 9 papers, 4 papers with code

Fusion of Audio and Visual Embeddings for Sound Event Localization and Detection

1 code implementation • 14 Dec 2023 • Davide Berghi, Peipei Wu, Jinzheng Zhao, Wenwu Wang, Philip J. B. Jackson

Sound event localization and detection (SELD) combines two subtasks: sound event detection (SED) and direction of arrival (DOA) estimation.

Tasks: Data Augmentation, Event Detection, +2
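
Not code from the paper above, but a minimal, generic sketch of the SELD decomposition it mentions: a shared encoder feeding an SED head (per-frame class activity) and a DOA head (per-class direction vectors). The GRU encoder, layer sizes and class count are illustrative assumptions.

```python
# Generic SELD sketch: one shared encoder, two heads (SED + DOA).
# All shapes and layer sizes are assumptions, not the paper's architecture.
import torch
import torch.nn as nn

class SELDSketch(nn.Module):
    def __init__(self, n_mels=64, n_classes=13):
        super().__init__()
        self.encoder = nn.GRU(input_size=n_mels, hidden_size=128,
                              num_layers=2, batch_first=True)
        self.sed_head = nn.Linear(128, n_classes)      # event activity logits
        self.doa_head = nn.Linear(128, 3 * n_classes)  # (x, y, z) per class

    def forward(self, feats):                  # feats: (batch, time, n_mels)
        h, _ = self.encoder(feats)
        sed = torch.sigmoid(self.sed_head(h))  # per-frame class probabilities
        doa = torch.tanh(self.doa_head(h))     # per-frame direction estimates
        return sed, doa

model = SELDSketch()
sed, doa = model(torch.randn(2, 100, 64))      # 2 clips, 100 frames each
```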

PAT: Position-Aware Transformer for Dense Multi-Label Action Detection

no code implementations • 9 Aug 2023 • Faegheh Sardari, Armin Mustafa, Philip J. B. Jackson, Adrian Hilton

To address this issue, we (i) embed relative positional encoding in the self-attention mechanism and (ii) exploit multi-scale temporal relationships by designing a novel non-hierarchical network, in contrast to recent transformer-based approaches that use a hierarchical structure.

Tasks: Action Detection, Event Detection, +1
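
A hedged sketch of point (i) above: one common way to embed relative positional information in self-attention is a learned bias, indexed by the relative offset between time steps, added to the attention scores. This is a generic realisation, not necessarily PAT's exact formulation; the single-head layout and `max_len` are assumptions.

```python
# Self-attention with a learned relative-position bias on the scores.
# A generic illustration, not the PAT implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelPosSelfAttention(nn.Module):
    def __init__(self, dim, max_len=512):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)
        # one learnable bias per relative offset in [-(max_len-1), max_len-1]
        self.rel_bias = nn.Parameter(torch.zeros(2 * max_len - 1))
        self.scale = dim ** -0.5

    def forward(self, x):                                 # x: (batch, T, dim)
        B, T, D = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = (q @ k.transpose(-2, -1)) * self.scale   # (B, T, T)
        # relative offsets j - i, shifted to index into rel_bias
        idx = torch.arange(T)
        rel = idx[None, :] - idx[:, None] + (self.rel_bias.numel() // 2)
        scores = scores + self.rel_bias[rel]              # add positional bias
        attn = F.softmax(scores, dim=-1)
        return self.out(attn @ v)

x = torch.randn(2, 100, 64)                    # 100 time steps, 64-dim tokens
y = RelPosSelfAttention(64)(x)                 # same shape as input
```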

Audio Inputs for Active Speaker Detection and Localization via Microphone Array

no code implementations • 27 Jul 2023 • Davide Berghi, Philip J. B. Jackson

This study considers the problem of detecting and locating an active talker's horizontal position from multichannel audio captured by a microphone array.

Tragic Talkers: A Shakespearean Sound- and Light-Field Dataset for Audio-Visual Machine Learning Research

no code implementations • 4 Dec 2022 • Davide Berghi, Marco Volino, Philip J. B. Jackson

This is partly due to the lack of available datasets enabling audio-visual research in this direction.

Visually Supervised Speaker Detection and Localization via Microphone Array

no code implementations • 7 Mar 2022 • Davide Berghi, Adrian Hilton, Philip J. B. Jackson

We propose to generate weak labels using a pre-trained active speaker detector on pre-extracted face tracks.
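
A rough sketch of the weak-labelling step described above: run a pre-trained active speaker detector over pre-extracted face tracks and keep its confident decisions as pseudo-labels for the audio-only model. `pretrained_asd`, `face_tracks` and their attributes are hypothetical placeholders, not an API from the paper.

```python
# Hypothetical weak-labelling loop; placeholder objects, not the paper's code.
def generate_weak_labels(pretrained_asd, face_tracks, threshold=0.5):
    """Return (frame_index, horizontal_position, is_speaking) pseudo-labels."""
    weak_labels = []
    for track in face_tracks:                 # each track: frames of one face
        for frame in track.frames:
            score = pretrained_asd(frame.crop, frame.audio_window)
            if score >= threshold:
                # the face's horizontal image coordinate becomes the
                # weak location label for the audio-only network
                weak_labels.append((frame.index, frame.bbox_center_x, True))
    return weak_labels
```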

Single-Channel Signal Separation and Deconvolution with Generative Adversarial Networks

1 code implementation • 14 Jun 2019 • Qiuqiang Kong, Yong Xu, Wenwu Wang, Philip J. B. Jackson, Mark D. Plumbley

Single-channel signal separation and deconvolution aims to separate and deconvolve individual sources from a single-channel mixture; it is a challenging problem because no prior knowledge of the mixing filters is available.

Tasks: Generative Adversarial Network, Image Inpainting
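
For concreteness, the task above is commonly written as a single-channel convolutive mixture with unknown filters; the notation below is ours, not taken from the paper.

```latex
% Standard convolutive mixture model (our notation, stated for illustration):
% the observed mixture x is a sum of K sources s_k, each convolved with an
% unknown filter h_k; separation and deconvolution must recover the s_k from x.
\[
  x[n] \;=\; \sum_{k=1}^{K} (h_k * s_k)[n]
        \;=\; \sum_{k=1}^{K} \sum_{m} h_k[m]\, s_k[n-m]
\]
```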

Unsupervised Feature Learning Based on Deep Models for Environmental Audio Tagging

2 code implementations • 13 Jul 2016 • Yong Xu, Qiang Huang, Wenwu Wang, Peter Foster, Siddharth Sigtia, Philip J. B. Jackson, Mark D. Plumbley

For unsupervised feature learning, we propose to use a symmetric or asymmetric deep de-noising auto-encoder (sDAE or aDAE) to generate new data-driven features from Mel-Filter Bank (MFB) features.

Tasks: Audio Tagging, General Classification, +1
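
A minimal denoising auto-encoder sketch along the lines described above: corrupt MFB feature frames with noise, train to reconstruct the clean frames, and reuse the bottleneck code as a new data-driven feature. The layer sizes, Gaussian noise model and symmetric topology are assumptions, not the paper's exact sDAE/aDAE configuration.

```python
# Generic denoising auto-encoder on MFB-like frames; sizes are assumptions.
import torch
import torch.nn as nn

class DenoisingAE(nn.Module):
    def __init__(self, n_mfb=40, bottleneck=20):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_mfb, 100), nn.ReLU(),
                                     nn.Linear(100, bottleneck))
        self.decoder = nn.Sequential(nn.Linear(bottleneck, 100), nn.ReLU(),
                                     nn.Linear(100, n_mfb))

    def forward(self, clean):
        noisy = clean + 0.1 * torch.randn_like(clean)  # corrupt the input
        code = self.encoder(noisy)                     # data-driven feature
        return self.decoder(code), code

model = DenoisingAE()
mfb = torch.randn(32, 40)                         # 32 frames of MFB features
recon, feats = model(mfb)
loss = nn.functional.mse_loss(recon, mfb)         # reconstruct the clean frames
```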

Fully DNN-based Multi-label regression for audio tagging

no code implementations • 24 Jun 2016 • Yong Xu, Qiang Huang, Wenwu Wang, Philip J. B. Jackson, Mark D. Plumbley

Compared with conventional Gaussian Mixture Model (GMM) and Support Vector Machine (SVM) methods, the proposed fully DNN-based method can better utilize long-term temporal information by taking the whole chunk as input.

Tasks: Audio Tagging, Event Detection, +4
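
A hedged sketch of the chunk-level idea above: a plain feed-forward DNN that takes a whole chunk of frames (flattened) and outputs one sigmoid score per tag, trained as multi-label regression with binary cross-entropy. Depth, widths, chunk length and tag count are illustrative assumptions.

```python
# Whole-chunk multi-label tagging DNN; dimensions are assumptions.
import torch
import torch.nn as nn

n_frames, n_mfb, n_tags = 100, 40, 7          # whole-chunk input, 7 tags assumed

model = nn.Sequential(
    nn.Flatten(),                             # (batch, n_frames * n_mfb)
    nn.Linear(n_frames * n_mfb, 500), nn.ReLU(),
    nn.Linear(500, 500), nn.ReLU(),
    nn.Linear(500, n_tags), nn.Sigmoid(),     # one score per tag in [0, 1]
)

chunk = torch.randn(8, n_frames, n_mfb)       # a batch of 8 whole chunks
scores = model(chunk)                         # (8, n_tags) tag probabilities
loss = nn.functional.binary_cross_entropy(scores, torch.rand(8, n_tags))
```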
