no code implementations • 10 Apr 2024 • Muer Tie, Julong Wei, Zhengjun Wang, Ke Wu, Shansuai Yuan, Kaizhao Zhang, Jie Jia, Jieru Zhao, Zhongxue Gan, Wenchao Ding
Online construction of open-ended language scenes is crucial for robotic applications, where open-vocabulary interactive scene understanding is required.
no code implementations • 30 Jun 2022 • Xinmeng Xu, Yang Wang, Jie Jia, Binbin Chen, Dejun Li
The proposed model alleviates these drawbacks by a) fusing audio and visual features layer by layer in the encoding phase and feeding the fused audio-visual features to each corresponding decoder layer, and, more importantly, b) introducing a 2-stage multi-head cross attention (MHCA) mechanism for audio-visual speech enhancement that balances the fused audio-visual features and eliminates irrelevant features.
no code implementations • 30 Jun 2022 • Xinmeng Xu, Yang Wang, Jie Jia, Binbin Chen, Jianjun Hao
For monaural speech enhancement, contextual information is important for accurate speech estimation.
no code implementations • 24 Sep 2021 • Mingyang Zhang, Jie Jia, Jian Chen
A novel magnetic localization approach based on a multi-scale temporal convolutional network (TCN) and a long short-term memory (LSTM) network is proposed.
no code implementations • 4 Feb 2021 • Xinmeng Xu, Yang Wang, Dongxiang Xu, Yiyuan Peng, Cong Zhang, Jie Jia, Binbin Chen
This paper proposes a novel framework that involves visual information for speech enhancement by incorporating a Generative Adversarial Network (GAN).
no code implementations • 26 Apr 2018 • Honggang Zhou, Yunchun Li, Hailong Yang, Wei Li, Jie Jia
However, learning and inference in a Bayesian network (BN) model are NP-hard, so the number of stochastic variables in a BN is highly constrained.
no code implementations • 31 Dec 2017 • Jie Jia, Honggang Zhou, Yunchun Li
We present a new method to approximate the posterior probabilities of a Bayesian network using a deep neural network.