Search Results for author: Jiqing Han

Found 12 papers, 4 papers with code

A Glance is Enough: Extract Target Sentence By Looking at A keyword

no code implementations • 9 Oct 2023 • Ying Shi, Dong Wang, Lantian Li, Jiqing Han

This paper investigates the possibility of extracting a target sentence from multi-talker speech using only a keyword as input.

Sentence

Spot keywords from very noisy and mixed speech

no code implementations • 28 May 2023 • Ying Shi, Dong Wang, Lantian Li, Jiqing Han, Shi Yin

We propose a novel Mix Training (MT) strategy that encourages the model to discover low-energy keywords from noisy and mixed speech.
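The core of such a strategy is constructing training examples in which the keyword sits at low energy inside a mixture. The sketch below shows one plausible way to mix a keyword utterance into interfering speech and noise at a chosen SNR; the synthetic signals, SNR value, and `mix_at_snr` helper are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

def mix_at_snr(keyword: np.ndarray, background: np.ndarray, snr_db: float) -> np.ndarray:
    """Scale `keyword` so it sits snr_db dB relative to `background`, then add.
    A negative SNR yields a low-energy keyword buried in the mixture."""
    kw_power = np.mean(keyword ** 2) + 1e-12
    bg_power = np.mean(background ** 2) + 1e-12
    gain = np.sqrt(bg_power / kw_power * 10 ** (snr_db / 10))  # keyword gain for target SNR
    return gain * keyword + background

rng = np.random.default_rng(0)
keyword = rng.standard_normal(16000)        # placeholder 1 s keyword at 16 kHz
interference = rng.standard_normal(16000)   # placeholder interfering talker
noise = 0.1 * rng.standard_normal(16000)    # placeholder ambient noise
example = mix_at_snr(keyword, interference + noise, snr_db=-5.0)  # keyword 5 dB below background
```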

Data Augmentation, Keyword Spotting

Time-weighted Frequency Domain Audio Representation with GMM Estimator for Anomalous Sound Detection

1 code implementation • 5 May 2023 • Jian Guan, Youde Liu, Qiaoxi Zhu, Tieran Zheng, Jiqing Han, Wenwu Wang

This paper presents Time-Weighted Frequency Domain Representation (TWFR) with the GMM method (TWFR-GMM) for anomalous sound detection.
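The detection side of a GMM-based method of this kind can be sketched as: fit a Gaussian mixture on frequency-domain features of normal machine sounds only, then score test clips by negative log-likelihood. The `time_weighted_repr` below uses a simple exponential decay as a stand-in for the paper's actual TWFR weighting, and all shapes and parameters are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def time_weighted_repr(spectrogram: np.ndarray, decay: float = 0.95) -> np.ndarray:
    """Collapse a (freq, time) magnitude spectrogram into one vector with a
    time weighting (here a simple exponential decay as a stand-in)."""
    n_frames = spectrogram.shape[1]
    weights = decay ** np.arange(n_frames)[::-1]
    weights /= weights.sum()
    return spectrogram @ weights  # (freq,) weighted average over time

rng = np.random.default_rng(0)
normal_clips = [np.abs(rng.standard_normal((64, 100))) for _ in range(200)]  # fake "normal" spectrograms
features = np.stack([time_weighted_repr(s) for s in normal_clips])

# Fit on normal data only; anomalies are clips the mixture finds unlikely.
gmm = GaussianMixture(n_components=4, covariance_type="diag", random_state=0).fit(features)

test_clip = np.abs(rng.standard_normal((64, 100)))
anomaly_score = -gmm.score_samples(time_weighted_repr(test_clip)[None, :])[0]
```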

Using Auxiliary Tasks In Multimodal Fusion Of Wav2vec 2.0 And BERT For Multimodal Emotion Recognition

no code implementations • 27 Feb 2023 • Dekai Sun, Yancheng He, Jiqing Han

To address the difficulty of multimodal fusion, we use a K-layer multi-head attention mechanism as the downstream fusion module.
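A minimal sketch of what a K-layer multi-head attention fusion module might look like is given below, with text-side (BERT) features querying audio-side (wav2vec 2.0) features; the dimensions, layer count, and exact wiring (cross-attention, residuals, pooling) are assumptions rather than the paper's specification.

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """K stacked multi-head attention layers: text features query audio features."""

    def __init__(self, dim: int = 256, num_heads: int = 4, num_layers: int = 3):
        super().__init__()
        self.attns = nn.ModuleList(
            [nn.MultiheadAttention(dim, num_heads, batch_first=True) for _ in range(num_layers)]
        )
        self.norms = nn.ModuleList([nn.LayerNorm(dim) for _ in range(num_layers)])

    def forward(self, text: torch.Tensor, audio: torch.Tensor) -> torch.Tensor:
        # text:  (batch, T_text, dim), e.g. BERT outputs projected to `dim`
        # audio: (batch, T_audio, dim), e.g. wav2vec 2.0 outputs projected to `dim`
        x = text
        for attn, norm in zip(self.attns, self.norms):
            fused, _ = attn(query=x, key=audio, value=audio)
            x = norm(x + fused)  # residual connection + layer norm
        return x.mean(dim=1)     # pooled representation for emotion classification

fusion = CrossModalFusion()
emotion_repr = fusion(torch.randn(2, 20, 256), torch.randn(2, 50, 256))  # (2, 256)
```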

Multimodal Emotion Recognition

Exploring Transformer's potential on automatic piano transcription

no code implementations • 8 Apr 2022 • Longshen Ou, Ziyi Guo, Emmanouil Benetos, Jiqing Han, Ye Wang

Most recent research on automatic music transcription (AMT) uses convolutional neural networks and recurrent neural networks to model the mapping from music signals to symbolic notation.

Music Transcription

Can We Trust Deep Speech Prior?

no code implementations • 4 Nov 2020 • Ying Shi, Haolin Chen, Zhiyuan Tang, Lantian Li, Dong Wang, Jiqing Han

Recently, speech enhancement (SE) based on deep speech prior has attracted much attention, such as the variational auto-encoder with non-negative matrix factorization (VAE-NMF) architecture.

Speech Enhancement

Speech Separation Based on Multi-Stage Elaborated Dual-Path Deep BiLSTM with Auxiliary Identity Loss

1 code implementation • 6 Aug 2020 • Ziqiang Shi, Rujie Liu, Jiqing Han

We have open-sourced our re-implementation of the DPRNN-TasNet (https://github.com/ShiZiqiang/dual-path-RNNs-DPRNNs-based-speech-separation); our TasTas is built on this implementation, so the results in this paper should be straightforward to reproduce.

Speaker Separation, Speech Separation

La Furca: Iterative Context-Aware End-to-End Monaural Speech Separation Based on Dual-Path Deep Parallel Inter-Intra Bi-LSTM with Attention

1 code implementation • 23 Jan 2020 • Ziqiang Shi, Rujie Liu, Jiqing Han

We have open-sourced our re-implementation of the DPRNN-TasNet at https://github.com/ShiZiqiang/dual-path-RNNs-DPRNNs-based-speech-separation; our 'La Furca' is built on this implementation, so the results in this paper should be straightforward to reproduce.

Sound, Audio and Speech Processing

A Multi-Task Learning Framework for Overcoming the Catastrophic Forgetting in Automatic Speech Recognition

no code implementations • 17 Apr 2019 • Jiabin Xue, Jiqing Han, Tieran Zheng, Xiang Gao, Jiaxing Guo

On the one hand, we constrain the new parameters from deviating too far from the original parameters and penalize the new system when it forgets the original knowledge.
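One common way to realize this "do not deviate too far" constraint is an L2 penalty that pulls the adapted parameters toward a frozen copy of the original ones. The sketch below shows this generic penalty; it may differ from the exact constraint used in the paper, and the model and loss are stand-ins.

```python
import torch
import torch.nn as nn

def deviation_penalty(model: nn.Module, original: dict, strength: float) -> torch.Tensor:
    """L2 distance between current parameters and a frozen original copy."""
    penalty = torch.zeros(())
    for name, param in model.named_parameters():
        penalty = penalty + ((param - original[name]) ** 2).sum()
    return strength * penalty

model = nn.Linear(10, 10)  # stand-in for the ASR model being adapted
original = {name: p.detach().clone() for name, p in model.named_parameters()}

x, y = torch.randn(8, 10), torch.randn(8, 10)
task_loss = nn.functional.mse_loss(model(x), y)  # stand-in loss on the new task
loss = task_loss + deviation_penalty(model, original, strength=0.01)
loss.backward()  # gradients now trade off new-task fit against forgetting
```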

Automatic Speech Recognition (ASR) +2

Acoustic Scene Classification by Implicitly Identifying Distinct Sound Events

1 code implementation • 10 Apr 2019 • Hongwei Song, Jiqing Han, Shiwen Deng, Zhihao Du

In this paper, we propose a new strategy for acoustic scene classification (ASC), namely recognizing acoustic scenes through identifying distinct sound events.

Acoustic Scene Classification, Classification +2

FurcaNeXt: End-to-end monaural speech separation with dynamic gated dilated temporal convolutional networks

no code implementations • 12 Feb 2019 • Ziqiang Shi, Huibin Lin, Liu Liu, Rujie Liu, Jiqing Han, Anyan Shi

Deep dilated temporal convolutional networks (TCNs) have proven to be very effective in sequence modeling.
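For readers unfamiliar with dilated TCNs, the sketch below shows the basic building block: stacked 1-D convolutions whose dilation grows exponentially with depth, plus residual connections, so the receptive field expands rapidly. Channel counts and depth here are placeholders, not FurcaNeXt's configuration.

```python
import torch
import torch.nn as nn

class DilatedTCN(nn.Module):
    """Stack of residual 1-D conv blocks with exponentially growing dilation."""

    def __init__(self, channels: int = 64, kernel_size: int = 3, num_layers: int = 6):
        super().__init__()
        self.blocks = nn.ModuleList()
        for i in range(num_layers):
            dilation = 2 ** i  # receptive field doubles with each layer
            self.blocks.append(nn.Sequential(
                nn.Conv1d(channels, channels, kernel_size,
                          padding=dilation * (kernel_size - 1) // 2, dilation=dilation),
                nn.PReLU(),
            ))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time), e.g. encoded mixture frames
        for block in self.blocks:
            x = x + block(x)  # residual connection keeps gradients healthy
        return x

tcn = DilatedTCN()
out = tcn(torch.randn(2, 64, 1000))  # same shape, much wider temporal context
```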

Sound, Audio and Speech Processing
