Search Results for author: Philip J. B. Jackson

Found 9 papers, 4 papers with code

Leveraging Visual Supervision for Array-based Active Speaker Detection and Localization

1 code implementation • 21 Dec 2023 • Davide Berghi, Philip J. B. Jackson

The multichannel audio "student" network is trained to generate the same results.

Fusion of Audio and Visual Embeddings for Sound Event Localization and Detection

1 code implementation • 14 Dec 2023 • Davide Berghi, Peipei Wu, Jinzheng Zhao, Wenwu Wang, Philip J. B. Jackson

Sound event localization and detection (SELD) combines two subtasks: sound event detection (SED) and direction of arrival (DOA) estimation.

Data Augmentation • Event Detection • +2

PAT: Position-Aware Transformer for Dense Multi-Label Action Detection

no code implementations • 9 Aug 2023 • Faegheh Sardari, Armin Mustafa, Philip J. B. Jackson, Adrian Hilton

To address this issue, we (i) embed relative positional encoding in the self-attention mechanism and (ii) exploit multi-scale temporal relationships by designing a novel non-hierarchical network, in contrast to recent transformer-based approaches that use a hierarchical structure.

Ranked #1 on Action Detection on MultiTHUMOS

Action Detection • Event Detection • +1

Audio Inputs for Active Speaker Detection and Localization via Microphone Array

no code implementations • 27 Jul 2023 • Davide Berghi, Philip J. B. Jackson

This study considers the problem of detecting and locating an active talker's horizontal position from multichannel audio captured by a microphone array.

Tragic Talkers: A Shakespearean Sound- and Light-Field Dataset for Audio-Visual Machine Learning Research

no code implementations • 4 Dec 2022 • Davide Berghi, Marco Volino, Philip J. B. Jackson

This is partly due to the lack of available datasets enabling audio-visual research in this direction.

Visually Supervised Speaker Detection and Localization via Microphone Array

no code implementations • 7 Mar 2022 • Davide Berghi, Adrian Hilton, Philip J. B. Jackson

We propose to generate weak labels using a pre-trained active speaker detector on pre-extracted face tracks.

Single-Channel Signal Separation and Deconvolution with Generative Adversarial Networks

1 code implementation • 14 Jun 2019 • Qiuqiang Kong, Yong Xu, Wenwu Wang, Philip J. B. Jackson, Mark D. Plumbley

Single-channel signal separation and deconvolution aims to separate and deconvolve individual sources from a single-channel mixture and is a challenging problem in which no prior knowledge of the mixing filters is available.

Generative Adversarial Network • Image Inpainting

Unsupervised Feature Learning Based on Deep Models for Environmental Audio Tagging

2 code implementations • 13 Jul 2016 • Yong Xu, Qiang Huang, Wenwu Wang, Peter Foster, Siddharth Sigtia, Philip J. B. Jackson, Mark D. Plumbley

For unsupervised feature learning, we propose to use a symmetric or asymmetric deep de-noising auto-encoder (sDAE or aDAE) to generate new data-driven features from the Mel-filter bank (MFB) features.

Audio Tagging • General Classification • +1

Fully DNN-based Multi-label regression for audio tagging

no code implementations • 24 Jun 2016 • Yong Xu, Qiang Huang, Wenwu Wang, Philip J. B. Jackson, Mark D. Plumbley

Compared with the conventional Gaussian Mixture Model (GMM) and support vector machine (SVM) methods, the proposed fully DNN-based method can make good use of long-term temporal information by taking the whole chunk as input.

Audio Tagging • Event Detection • +4
