no code implementations • 13 Dec 2023 • Chengxi Lei, Satwinder Singh, Feng Hou, Xiaoyun Jia, Ruili Wang
Most of the current speech data augmentation methods operate on either the raw waveform or the amplitude spectrum of speech.
no code implementations • 4 Dec 2023 • Yan Tian, Zhaocheng Xu, Yujun Ma, Weiping Ding, Ruili Wang, Zhihong Gao, Guohua Cheng, Linyang He, Xuran Zhao
Finally, we discuss the current scope of work and provide directions for the future development of multimodal cancer detection.
1 code implementation • 13 Sep 2023 • Zhenguang Liu, Xinyang Yu, Ruili Wang, Shuai Ye, Zhe Ma, Jianfeng Dong, Sifeng He, Feng Qian, Xiaobo Zhang, Roger Zimmermann, Lei Yang
We theoretically analyzed the mutual information between the label and the disentangled features, arriving at a loss that maximizes the extraction of task-relevant information from the original feature.
1 code implementation • 23 Aug 2023 • Yujun Ma, Benjia Zhou, Ruili Wang, Pichao Wang
RGB-D action and gesture recognition remains an interesting topic in human-centered scene understanding, primarily due to the multiple granularities and large variation in human motion.
no code implementations • 10 Aug 2023 • Satwinder Singh, Feng Hou, Ruili Wang
In this paper, we propose a self-training approach for automatic speech recognition (ASR) for low-resource settings.
Automatic Speech Recognition (ASR)
no code implementations • 5 Apr 2023 • Yuan Gao, Ruili Wang, Feng Hou
Machine translation relies heavily on the abilities of language understanding and generation.
1 code implementation • CVPR 2023 • Yi Wang, Ruili Wang, Xin Fan, Tianzhu Wang, Xiangjian He
A multi-level hybrid loss is first designed to guide the network to learn pixel-level, region-level, and object-level features.
no code implementations • 11 May 2022 • Satwinder Singh, Ruili Wang, Feng Hou
We propose a new meta-learning-based framework for low-resource speech recognition that improves on the previous model-agnostic meta-learning (MAML) approach.
no code implementations • 3 Mar 2022 • Kedi Lyu, Haipeng Chen, Zhenguang Liu, Beiqi Zhang, Ruili Wang
3D human motion prediction, which predicts future poses from a given sequence, is a significant and challenging problem in computer vision and machine intelligence, and it can help machines understand human behaviors.
1 code implementation • ACL 2020 • Feng Hou, Ruili Wang, Jun He, Yi Zhou
We propose a simple yet effective method, FGS2EE, to inject fine-grained semantic information into entity embeddings to reduce the distinctiveness and facilitate the learning of contextual commonality.
1 code implementation • 25 May 2021 • Yuhao Chen, Guoqing Zhang, Yujiang Lu, Zhenxing Wang, Yuhui Zheng, Ruili Wang
Text-based person search is a sub-task in the field of image retrieval, which aims to retrieve target person images according to a given textual description.
Ranked #11 on Text based Person Retrieval on CUHK-PEDES
no code implementations • 11 Feb 2021 • Satwinder Singh, Ruili Wang, Yuanhang Qiu
We propose a novel pitch estimation technique called DeepF0, which leverages the available annotated data to learn directly from the raw audio in a data-driven manner.
1 code implementation • 26 Dec 2020 • Pourya Shamsolmoali, Masoumeh Zareapoor, Eric Granger, Huiyu Zhou, Ruili Wang, M. Emre Celebi, Jie Yang
However, this field lacks a comprehensive review, and in particular a collection of GAN loss variants, evaluation metrics, remedies for diverse image generation, and techniques for stable training.
1 code implementation • 10 Aug 2020 • Pourya Shamsolmoali, Masoumeh Zareapoor, Huiyu Zhou, Ruili Wang, Jie Yang
We also propose a feature pyramid network that improves the performance of the proposed model by extracting effective features from all the layers of the network for describing objects at different scales.
no code implementations • 17 Mar 2020 • Pourya Shamsolmoali, Masoumeh Zareapoor, Ruili Wang, Huiyu Zhou, Jie Yang
This paper presents a novel deep neural network structure for pixel-wise sea-land segmentation, a Residual Dense U-Net (RDU-Net), in complex and high-density remote sensing images.
no code implementations • 28 Aug 2018 • Shi Yin, Yi Zhou, Chenguang Li, Shangfei Wang, Jianmin Ji, Xiaoping Chen, Ruili Wang
We propose KDSL, a new word sense disambiguation (WSD) framework that utilizes knowledge to automatically generate sense-labeled data for supervised learning.