1 code implementation • 14 Feb 2024 • Bohan Li, Yiming Liu, Xueyan Niu, Bo Bai, Lei Deng, Deniz Gündüz
The results showcase the potential of exploiting the temporal relations in video data using generative models.
no code implementations • 4 Feb 2024 • Xin Jin, Bohan Li, Baao Xie, Wenyao Zhang, Jinming Liu, Ziqiang Li, Tao Yang, Wenjun Zeng
Representation disentanglement may help AI fundamentally understand the real world and thus benefit both discrimination and generation tasks.
no code implementations • 15 Aug 2023 • Yi Liu, Hongrui Xuan, Bohan Li, Meng Wang, Tong Chen, Hongzhi Yin
However, the long-tail distribution of entities leads to sparsity in supervision signals, which weakens the quality of item representation when utilizing KG enhancement.
no code implementations • 22 Jun 2023 • Bohan Li, Yasheng Sun, Jingxin Dong, Zheng Zhu, Jinming Liu, Xin Jin, Wenjun Zeng
Numerous studies have investigated the pivotal role of reliable 3D volume representation in scene perception tasks, such as multi-view stereo (MVS) and semantic scene completion (SSC).
no code implementations • 20 Jun 2023 • Lianying Yin, Yijun Wang, Tianyu He, Jinming Liu, Wei Zhao, Bohan Li, Xin Jin, Jianxin Lin
In this paper, we present a novel framework (EMoG) to tackle the above challenges with denoising diffusion models: 1) To alleviate the one-to-many problem, we incorporate emotion clues to guide the generation process, making the generation much easier; 2) To model joint correlation, we propose to decompose the difficult gesture generation into two sub-problems: joint correlation modeling and temporal dynamics modeling.
no code implementations • 5 Jun 2023 • Wenwen Yu, Chengquan Zhang, Haoyu Cao, Wei Hua, Bohan Li, Huang Chen, MingYu Liu, Mingrui Chen, Jianfeng Kuang, Mengjun Cheng, Yuning Du, Shikun Feng, Xiaoguang Hu, Pengyuan Lyu, Kun Yao, Yuechen Yu, Yuliang Liu, Wanxiang Che, Errui Ding, Cheng-Lin Liu, Jiebo Luo, Shuicheng Yan, Min Zhang, Dimosthenis Karatzas, Xing Sun, Jingdong Wang, Xiang Bai
It is hoped that this competition will attract many researchers in the field of CV and NLP, and bring some new thoughts to the field of Document AI.
1 code implementation • ICCV 2023 • Baao Xie, Bohan Li, Zequn Zhang, Junting Dong, Xin Jin, Jingyu Yang, Wenjun Zeng
They are complementary: the outer navigation identifies global-view semantic directions, and the inner refinement is dedicated to fine-grained attributes.
no code implementations • 19 Apr 2023 • Bohan Li, Longxu Dou, Yutai Hou, Yunlong Feng, Honglin Mu, Qingfu Zhu, Qinghua Sun, Wanxiang Che
Prompt-based learning has shown considerable promise in reformulating various downstream tasks as cloze problems by combining original input with a predetermined template.
no code implementations • 18 Apr 2023 • Yunlong Feng, Bohan Li, Libo Qin, Xiao Xu, Wanxiang Che
Cross-domain text classification aims to adapt models to a target domain that lacks labeled data.
1 code implementation • 24 Mar 2023 • Bohan Li, Yasheng Sun, Zhujin Liang, Dalong Du, Zhuanghui Zhang, XiaoFeng Wang, Yunnan Wang, Xin Jin, Wenjun Zeng
However, due to the inherent representation gap between stereo geometry and BEV features, it is non-trivial to bridge them for the dense prediction task of SSC.
no code implementations • 4 Feb 2023 • Bohan Li, Xiao Xu, Xinghao Wang, Yutai Hou, Yunlong Feng, Feng Wang, Xuanliang Zhang, Qingfu Zhu, Wanxiang Che
In contrast, generative methods bring more diversity to the augmented images but may not preserve semantic consistency, thus incorrectly changing the essential semantics of the original image.
no code implementations • 13 Jan 2023 • Hongrui Xuan, Yi Liu, Bohan Li, Hongzhi Yin
In particular, we design the multi-behavior learning module to extract users' personalized behavior information for user-embedding enhancement, and utilize a knowledge graph in the knowledge enhancement module to derive more robust knowledge-aware representations for items.
1 code implementation • 30 Nov 2022 • Yihan Wu, Junliang Guo, Xu Tan, Chen Zhang, Bohan Li, Ruihua Song, Lei He, Sheng Zhao, Arul Menezes, Jiang Bian
In this paper, we propose a machine translation system tailored for the task of video dubbing, which directly considers the speech duration of each token in translation, to match the length of source and target speech.
no code implementations • 25 Oct 2022 • Lyndon R. Duong, Bohan Li, Cheng Chen, Jingning Han
Contemporary lossy image and video coding standards rely on transform coding, the process through which pixels are mapped to an alternative representation to facilitate efficient data compression.
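As a rough illustration of transform coding in general, the sketch below maps a row of pixels to an alternative representation, quantizes the coefficients, and reconstructs an approximation. The orthonormal 1-D DCT, the step size, and the sample values are assumptions chosen for illustration; they are not taken from the listed paper.

```python
import math

def _scale(k, n):
    # Orthonormal DCT scaling factor for coefficient index k.
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

def dct(samples):
    """Orthonormal 1-D DCT-II: pixels -> transform coefficients."""
    n = len(samples)
    return [
        _scale(k, n) * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                           for i, x in enumerate(samples))
        for k in range(n)
    ]

def idct(coeffs):
    """Inverse of dct(): transform coefficients -> pixels."""
    n = len(coeffs)
    return [
        sum(_scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
            for k, c in enumerate(coeffs))
        for i in range(n)
    ]

def encode(samples, step):
    # Lossy step: uniform quantization of each coefficient to an integer level.
    return [round(c / step) for c in dct(samples)]

def decode(levels, step):
    # Dequantize, then invert the transform.
    return idct([q * step for q in levels])

# Hypothetical 8-pixel row and quantization step.
pixels = [52, 55, 61, 66, 70, 61, 64, 73]
levels = encode(pixels, step=10.0)
reconstructed = decode(levels, step=10.0)
```

A larger `step` drives more levels to zero, which is cheaper to entropy-code but increases the reconstruction error.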
1 code implementation • COLING 2022 • Yutai Hou, Hongyuan Dong, Xinghao Wang, Bohan Li, Wanxiang Che
The prompting method is regarded as one of the crucial advances in few-shot natural language processing.
2 code implementations • 23 Jul 2022 • Bohan Li, Ye Yuan, Dingkang Liang, Xiao Liu, Zhilong Ji, Jinfeng Bai, Wenyu Liu, Xiang Bai
Recently, most handwritten mathematical expression recognition (HMER) methods adopt encoder-decoder networks, which directly predict the markup sequences from formula images with the attention mechanism.
no code implementations • 1 May 2022 • Bohan Li, Lie-Liang Yang, Robert G Maunder, Songlin Sun, Pei Xiao
In-band full duplex cell-free (CF) systems suffer from severe self-interference and cross-link interference, especially when CF systems are operated in a distributed way.
1 code implementation • Findings (ACL) 2022 • Yutai Hou, Cheng Chen, Xianzhen Luo, Bohan Li, Wanxiang Che
Such inverse prompting only requires a one-turn prediction for each slot type and greatly speeds up the prediction.
no code implementations • 1 Apr 2022 • Yihan Wu, Xu Tan, Bohan Li, Lei He, Sheng Zhao, Ruihua Song, Tao Qin, Tie-Yan Liu
We model the speaker characteristics systematically to improve the generalization on new speakers.
no code implementations • 24 Nov 2021 • Changxu Cheng, Bohan Li, Qi Zheng, Yongpan Wang, Wenyu Liu
As a result, the learning of semantic features is prone to bias toward the limited vocabulary of the training set, which is called vocabulary reliance.
1 code implementation • 25 Oct 2021 • Yanqing Liu, Zhihang Xu, Gang Wang, Kuan Chen, Bohan Li, Xu Tan, Jinzhu Li, Lei He, Sheng Zhao
The goal of this challenge is to synthesize natural and high-quality speech from text, and we approach this goal from two perspectives: the first is to directly model and generate waveforms at a 48 kHz sampling rate, which brings higher perceptual quality than previous systems with a 16 kHz or 24 kHz sampling rate; the second is to model the variation information in speech through a systematic design, which improves prosody and naturalness.
1 code implementation • 5 Oct 2021 • Bohan Li, Yutai Hou, Wanxiang Che
One of the main focuses of the DA methods is to improve the diversity of training data, thereby helping the model to better generalize to unseen testing data.
no code implementations • 20 Jul 2021 • Wenxian Shi, Yuxuan Song, Hao Zhou, Bohan Li, Lei LI
However, it has been observed that a converged heavy teacher model is strongly constrained for learning a compact student network and could make the optimization subject to poor local optima.
no code implementations • 6 Jul 2021 • Yuzi Yan, Xu Tan, Bohan Li, Guangyan Zhang, Tao Qin, Sheng Zhao, Yuan Shen, Wei-Qiang Zhang, Tie-Yan Liu
While recent text-to-speech (TTS) models perform very well in synthesizing reading-style (e.g., audiobook) speech, it is still challenging to synthesize spontaneous-style speech (e.g., podcast or conversation), mainly for two reasons: 1) the lack of training data for spontaneous speech; 2) the difficulty in modeling the filled pauses (um and uh) and diverse rhythms in spontaneous speech.
1 code implementation • 20 Apr 2021 • Yuzi Yan, Xu Tan, Bohan Li, Tao Qin, Sheng Zhao, Yuan Shen, Tie-Yan Liu
In adaptation, we use untranscribed speech data for speech reconstruction and only fine-tune the TTS decoder.
2 code implementations • ICLR 2021 • Mingjian Chen, Xu Tan, Bohan Li, Yanqing Liu, Tao Qin, Sheng Zhao, Tie-Yan Liu
2) To better trade off the adaptation parameters and voice quality, we introduce conditional layer normalization in the mel-spectrogram decoder of AdaSpeech, and fine-tune this part in addition to speaker embedding for adaptation.
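A minimal sketch of what a conditional layer normalization can look like, assuming the scale and bias are each produced from the speaker embedding by a small linear projection. The function names, shapes, and values here are hypothetical and are not taken from the AdaSpeech code.

```python
import math

def linear(weights, bias, x):
    """Plain affine map: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def conditional_layer_norm(hidden, speaker_embedding, scale_proj, bias_proj,
                           eps=1e-5):
    """Layer norm whose scale and bias depend on the speaker embedding.

    scale_proj and bias_proj are (weights, bias) pairs; in this sketch they
    stand in for the small set of parameters fine-tuned per speaker.
    """
    mean = sum(hidden) / len(hidden)
    var = sum((h - mean) ** 2 for h in hidden) / len(hidden)
    normalized = [(h - mean) / math.sqrt(var + eps) for h in hidden]
    gamma = linear(scale_proj[0], scale_proj[1], speaker_embedding)
    beta = linear(bias_proj[0], bias_proj[1], speaker_embedding)
    return [g * n + b for g, n, b in zip(gamma, normalized, beta)]

# Hypothetical 4-dim hidden vector and 2-dim speaker embedding.
hidden = [1.0, 2.0, 3.0, 4.0]
speaker = [0.5, -0.5]
zero_weights = [[0.0, 0.0] for _ in range(4)]
# With zero weights, the scale is the constant 1 and the bias the constant 2.
out = conditional_layer_norm(hidden, speaker,
                             scale_proj=(zero_weights, [1.0] * 4),
                             bias_proj=(zero_weights, [2.0] * 4))
```

Because only the two projections and the speaker embedding change per speaker, the number of adapted parameters stays small relative to the full decoder.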
no code implementations • 1 Jan 2021 • Wenxian Shi, Yuxuan Song, Hao Zhou, Bohan Li, Lei LI
However, it has been observed that a converged heavy teacher model is strongly constrained for learning a compact student network and could make the optimization subject to poor local optima.
3 code implementations • EMNLP 2020 • Bohan Li, Hao Zhou, Junxian He, Mingxuan Wang, Yiming Yang, Lei LI
Pre-trained contextual representations like BERT have achieved great success in natural language processing.
Ranked #16 on Semantic Textual Similarity on STS16
1 code implementation • IJCNLP 2019 • Bohan Li, Junxian He, Graham Neubig, Taylor Berg-Kirkpatrick, Yiming Yang
In this paper, we investigate a simple fix for posterior collapse which yields surprisingly effective results.
no code implementations • 22 Jan 2019 • Xiang Kong, Bohan Li, Graham Neubig, Eduard Hovy, Yiming Yang
In this work, we propose a method for neural dialogue response generation that allows not only generating semantically reasonable responses according to the dialogue history, but also explicitly controlling the sentiment of the response via sentiment labels.
no code implementations • 8 Jan 2019 • Chunhua Liu, Yan Zhao, Qingyi Si, Haiou Zhang, Bohan Li, Dong Yu
From the experimental results, we can conclude that the difference fusion is comparable with union fusion, and the similarity fusion needs to be activated by the union fusion.
1 code implementation • 15 Jun 2018 • Guokun Lai, Bohan Li, Guoqing Zheng, Yiming Yang
In this paper, we combine ideas from both stochastic latent variables and dilated convolutions, and propose a new architecture to model sequential data, termed Stochastic WaveNet, where stochastic latent variables are injected into the WaveNet structure.