no code implementations • WMT (EMNLP) 2021 • Longyue Wang, Mu Li, Fangxu Liu, Shuming Shi, Zhaopeng Tu, Xing Wang, Shuangzhi Wu, Jiali Zeng, Wen Zhang
Based on our success in the last WMT, we continued to employ advanced techniques such as large-batch training, data selection, and data filtering.
no code implementations • WMT (EMNLP) 2020 • Shuangzhi Wu, Xing Wang, Longyue Wang, Fangxu Liu, Jun Xie, Zhaopeng Tu, Shuming Shi, Mu Li
This paper describes Tencent Neural Machine Translation systems for the WMT 2020 news translation tasks.
1 code implementation • EMNLP 2021 • Jiali Zeng, Shuangzhi Wu, Yongjing Yin, Yufan Jiang, Mu Li
Across an extensive set of experiments on 10 machine translation tasks, we find that RAN models are competitive and outperform their Transformer counterparts in certain scenarios, with fewer parameters and faster inference.
1 code implementation • 31 Oct 2023 • Xinwei Wu, Junzhuo Li, Minghui Xu, Weilong Dong, Shuangzhi Wu, Chao Bian, Deyi Xiong
Pretrained language models can memorize and regurgitate training data, as revealed in previous studies, which brings the risk of data leakage.
no code implementations • 28 Jun 2023 • Zhangyin Feng, Yong Dai, Fan Zhang, Duyu Tang, Xiaocheng Feng, Shuangzhi Wu, Bing Qin, Yunbo Cao, Shuming Shi
Traditional multitask learning methods can generally exploit common knowledge only task-wise or language-wise, losing either cross-language or cross-task knowledge.
1 code implementation • 26 Apr 2023 • Bing Wang, Xinnian Liang, Jian Yang, Hui Huang, Shuangzhi Wu, Peihao Wu, Lu Lu, Zejun Ma, Zhoujun Li
Large Language Models (LLMs) are constrained by their inability to process lengthy inputs, resulting in the loss of critical historical information.
1 code implementation • 23 Mar 2023 • Xinnian Liang, Shuangzhi Wu, Hui Huang, Jiaqi Bai, Chao Bian, Zhoujun Li
Retrieval augmented methods have shown promising results in various classification tasks.
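As a rough illustration of this pattern (not this paper's specific method), a retrieval-augmented classifier can interpolate a base model's prediction with the label distribution of retrieved neighbors; the interpolation weight, distance metric, and datastore layout below are assumptions.

```python
import numpy as np

def knn_augmented_predict(query_vec, model_probs, index_vecs, index_labels,
                          k=8, lam=0.5, n_classes=2):
    """Illustrative retrieval-augmented classifier: interpolate the base
    model's class probabilities with the label distribution of the k nearest
    neighbors retrieved from a datastore of labeled examples.

    index_vecs: (n, d) float array; index_labels: (n,) int array.
    """
    dists = np.linalg.norm(index_vecs - query_vec, axis=1)
    nn_idx = np.argsort(dists)[:k]                     # k nearest neighbors
    knn_probs = np.bincount(index_labels[nn_idx], minlength=n_classes) / k
    return lam * knn_probs + (1 - lam) * model_probs   # simple interpolation
```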
1 code implementation • 20 Mar 2023 • Xinnian Liang, Zefan Zhou, Hui Huang, Shuangzhi Wu, Tong Xiao, Muyun Yang, Zhoujun Li, Chao Bian
We conduct extensive experiments on various Chinese NLP tasks to evaluate existing PLMs as well as the proposed MigBERT.
1 code implementation • 29 Jan 2023 • Xinnian Liang, Shuangzhi Wu, Chenhao Cui, Jiaqi Bai, Chao Bian, Zhoujun Li
The global one aims to identify vital sub-topics in the dialogue and the local one aims to select the most important context in each sub-topic.
no code implementations • 16 Dec 2022 • Weilong Dong, Xinwei Wu, Junzhuo Li, Shuangzhi Wu, Chao Bian, Deyi Xiong
It broadcasts the global model on the server to each client and produces pseudo data for the clients, so that knowledge from the global model can be exploited to enhance few-shot learning of each client model.
no code implementations • 16 Dec 2022 • Junzhuo Li, Xinwei Wu, Weilong Dong, Shuangzhi Wu, Chao Bian, Deyi Xiong
Knowledge distillation (KD) has been widely used for model compression and knowledge transfer.
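For background, a minimal sketch of the standard KD objective (Hinton et al., 2015) that such work builds on; the temperature and mixing weight are illustrative defaults, and this is not the specific distillation variant analyzed in the paper.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Standard knowledge distillation: a weighted sum of cross-entropy on
    hard labels and the KL divergence between the temperature-softened
    teacher and student output distributions."""
    soft_targets = F.softmax(teacher_logits / T, dim=-1)
    log_student = F.log_softmax(student_logits / T, dim=-1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    distill = F.kl_div(log_student, soft_targets, reduction="batchmean") * T * T
    hard = F.cross_entropy(student_logits, labels)
    return alpha * distill + (1 - alpha) * hard
```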
no code implementations • 7 Nov 2022 • Jiali Zeng, Yongjing Yin, Yufan Jiang, Shuangzhi Wu, Yunbo Cao
Specifically, with the help of prompts, we construct virtual semantic prototypes for each instance and derive negative prototypes by using the negative form of the prompts.
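A hedged sketch of how such a prototype-based contrastive objective could look; the normalization, temperature, and in-batch negative scheme are assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def prototype_contrastive_loss(anchors, prototypes, neg_prototypes, tau=0.05):
    """InfoNCE-style loss: each sentence embedding (anchor) is pulled toward
    its prompt-derived semantic prototype and pushed away from the negative
    prototypes built from the negated prompt, as well as from the prototypes
    of other in-batch instances. All inputs: (batch, dim) tensors."""
    anchors = F.normalize(anchors, dim=-1)
    prototypes = F.normalize(prototypes, dim=-1)
    neg_prototypes = F.normalize(neg_prototypes, dim=-1)

    pos_sim = anchors @ prototypes.t() / tau        # diagonal = own prototype
    neg_sim = anchors @ neg_prototypes.t() / tau    # negative prototypes
    logits = torch.cat([pos_sim, neg_sim], dim=1)   # (batch, 2 * batch)
    targets = torch.arange(anchors.size(0), device=anchors.device)
    return F.cross_entropy(logits, targets)
```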
no code implementations • 24 Aug 2022 • Chenhao Cui, Xinnian Liang, Shuangzhi Wu, Zhoujun Li
The core of ViL-Sum is a joint multi-modal encoder with two well-designed tasks, image reordering and image selection.
1 code implementation • COLING 2022 • Xinnian Liang, Jing Li, Shuangzhi Wu, Jiali Zeng, Yufan Jiang, Mu Li, Zhoujun Li
To tackle this problem, we propose an efficient Coarse-to-Fine Facet-Aware Ranking (C2F-FAR) framework for unsupervised long-document summarization, which ranks at the level of semantic blocks.
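A minimal sketch of the coarse-to-fine idea under simplifying assumptions: centroid similarity stands in for the paper's facet-aware scoring, and `embed` is an assumed sentence-embedding function.

```python
import numpy as np

def coarse_to_fine_summary(blocks, embed, k_blocks=3, k_sents=5):
    """Illustrative two-stage ranking. `blocks` is a list of semantic blocks,
    each a list of sentences; `embed` maps a string to a vector.
    Coarse stage: score each block against the document centroid.
    Fine stage: rank sentences inside the selected blocks the same way."""
    doc_vec = np.mean([embed(s) for b in blocks for s in b], axis=0)
    cos = lambda u, v: u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8)

    block_scores = [cos(np.mean([embed(s) for s in b], axis=0), doc_vec)
                    for b in blocks]
    top_blocks = sorted(range(len(blocks)),
                        key=lambda i: block_scores[i], reverse=True)[:k_blocks]

    candidates = [s for i in top_blocks for s in blocks[i]]
    candidates.sort(key=lambda s: cos(embed(s), doc_vec), reverse=True)
    return candidates[:k_sents]
```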
1 code implementation • 11 Jul 2022 • Jian Yang, Yuwei Yin, Shuming Ma, Dongdong Zhang, Shuangzhi Wu, Hongcheng Guo, Zhoujun Li, Furu Wei
Most translation tasks among languages are zero-resource translation problems, in which parallel corpora are unavailable.
1 code implementation • NAACL 2022 • Xinnian Liang, Shuangzhi Wu, Mu Li, Zhoujun Li
In this paper, we propose a novel method to extract multi-granularity features based solely on the original input sentences.
1 code implementation • ACL 2022 • Yu Lu, Jiali Zeng, Jiajun Zhang, Shuangzhi Wu, Mu Li
Confidence estimation aims to quantify the confidence of the model prediction, providing an expectation of success.
1 code implementation • Findings (ACL) 2022 • Jiali Zeng, Yufan Jiang, Shuangzhi Wu, Yongjing Yin, Mu Li
Pretrained language models (PLMs) trained on large-scale unlabeled corpora are typically fine-tuned on task-specific downstream datasets, an approach that has produced state-of-the-art results on various NLP tasks.
no code implementations • 7 Mar 2022 • Fan Zhang, Duyu Tang, Yong Dai, Cong Zhou, Shuangzhi Wu, Shuming Shi
The key feature of our approach is that it is sparsely activated, with activation guided by predefined skills.
no code implementations • 24 Feb 2022 • Zhangyin Feng, Duyu Tang, Cong Zhou, Junwei Liao, Shuangzhi Wu, Xiaocheng Feng, Bing Qin, Yunbo Cao, Shuming Shi
(2) how to predict a word via a cloze test without knowing the number of wordpieces in advance?
1 code implementation • EMNLP 2021 • Xinnian Liang, Shuangzhi Wu, Mu Li, Zhoujun Li
In terms of the local view, we first build a graph structure based on the document where phrases are regarded as vertices and the edges are similarities between vertices.
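A minimal sketch of constructing such a phrase graph, assuming an embedding function `embed` and a similarity threshold; both are illustrative choices, not the paper's exact settings.

```python
import itertools
import numpy as np

def build_phrase_graph(phrases, embed, threshold=0.5):
    """Build the local-view graph described above: candidate phrases are
    vertices, and a weighted edge connects two phrases whose embedding
    cosine similarity exceeds the threshold."""
    vecs = {p: embed(p) for p in phrases}
    cos = lambda u, v: u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8)
    edges = {}
    for p, q in itertools.combinations(phrases, 2):
        sim = cos(vecs[p], vecs[q])
        if sim > threshold:
            edges[(p, q)] = sim  # similarity as edge weight between vertices
    return edges
```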
no code implementations • ACL 2021 • Yu Lu, Jiali Zeng, Jiajun Zhang, Shuangzhi Wu, Mu Li
Attention mechanisms have achieved substantial improvements in neural machine translation by dynamically selecting relevant inputs for different predictions.
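For reference, a minimal sketch of the standard scaled dot-product attention (Vaswani et al., 2017) that underlies this dynamic input selection:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    """Standard attention: each query dynamically weights all inputs by
    softmax-normalized dot-product relevance, then returns the weighted
    sum of values. Shapes: q (..., Lq, d); k, v (..., Lk, d)."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5   # (..., Lq, Lk)
    weights = F.softmax(scores, dim=-1)           # relevance of each input
    return weights @ v
```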
no code implementations • COLING 2020 • Deyu Zhou, Shuangzhi Wu, Qing Wang, Jun Xie, Zhaopeng Tu, Mu Li
Emotion lexicons have been shown effective for emotion classification (Baziotis et al., 2018).
no code implementations • COLING 2020 • Zhenyu Zhao, Shuangzhi Wu, Muyun Yang, Kehai Chen, Tiejun Zhao
Neural models, which are typically trained on hard labels, have achieved great success on the task of machine reading comprehension (MRC).
no code implementations • 6 Nov 2020 • Yufan Jiang, Shuangzhi Wu, Jing Gong, Yahui Cheng, Peng Meng, Weiliang Lin, Zhibo Chen, Mu Li
In addition, by transferring knowledge from other kinds of MRC tasks, our model achieves new state-of-the-art results in both single and ensemble settings.
Ranked #1 on Reading Comprehension on RACE
no code implementations • 5 Sep 2019 • Chengyi Wang, Shuangzhi Wu, Shujie Liu
Recently, the Transformer has achieved state-of-the-art performance on many machine translation tasks.
no code implementations • 5 Sep 2019 • Chengyi Wang, Shuangzhi Wu, Shujie Liu
Owing to its highly parallelizable architecture, the Transformer is faster to train than RNN-based models and is widely used in machine translation tasks.
no code implementations • 1 Nov 2018 • Pengcheng Yang, Fuli Luo, Shuangzhi Wu, Jingjing Xu, Dongdong Zhang, Xu Sun
In order to avoid such sophisticated alternating optimization, we propose to learn unsupervised word mapping by directly maximizing the mean discrepancy between the distribution of the transferred embeddings and that of the target embeddings.
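For concreteness, a sketch of estimating the (squared) mean discrepancy between two embedding samples with a Gaussian kernel; the kernel choice and bandwidth are assumptions, and the estimator below is the simple biased one.

```python
import torch

def gaussian_mmd2(x, y, sigma=1.0):
    """Squared maximum mean discrepancy between samples x (n, d) and y (m, d)
    under a Gaussian kernel k(a, b) = exp(-||a - b||^2 / (2 * sigma^2)):
        MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)].
    The means below include the diagonal terms (biased estimate)."""
    def kernel(a, b):
        dist2 = torch.cdist(a, b) ** 2
        return torch.exp(-dist2 / (2 * sigma ** 2))
    return kernel(x, x).mean() + kernel(y, y).mean() - 2 * kernel(x, y).mean()
```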
no code implementations • 13 Aug 2018 • Zhirui Zhang, Shuangzhi Wu, Shujie Liu, Mu Li, Ming Zhou, Tong Xu
Although Neural Machine Translation (NMT) has achieved remarkable progress in the past several years, most NMT systems still suffer from a fundamental shortcoming shared with other sequence generation tasks: errors made early in the generation process are fed back as inputs to the model and can be quickly amplified, harming subsequent sequence generation.
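A common remedy used to illustrate this exposure-bias loop is scheduled sampling (Bengio et al., 2015), sketched below; note this is a standard baseline technique, not the method proposed in this paper, and `init_state`/`step` are assumed model interfaces.

```python
import random
import torch

def decode_with_scheduled_sampling(model, src, tgt, teacher_forcing_p=0.75):
    """Scheduled sampling: during training, feed the model's own previous
    prediction instead of the gold token with some probability, so the model
    learns to recover from its early mistakes. (A standard remedy for the
    exposure bias described above, not this paper's agreement-based method.)"""
    state = model.init_state(src)          # assumed encoder interface
    prev = tgt[:, 0]                       # BOS token
    logits = []
    for t in range(1, tgt.size(1)):
        step_logits, state = model.step(prev, state)  # assumed decoder step
        logits.append(step_logits)
        use_gold = random.random() < teacher_forcing_p
        prev = tgt[:, t] if use_gold else step_logits.argmax(dim=-1)
    return torch.stack(logits, dim=1)
```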
2 code implementations • 15 Mar 2018 • Hany Hassan, Anthony Aue, Chang Chen, Vishal Chowdhary, Jonathan Clark, Christian Federmann, Xuedong Huang, Marcin Junczys-Dowmunt, William Lewis, Mu Li, Shujie Liu, Tie-Yan Liu, Renqian Luo, Arul Menezes, Tao Qin, Frank Seide, Xu Tan, Fei Tian, Lijun Wu, Shuangzhi Wu, Yingce Xia, Dongdong Zhang, Zhirui Zhang, Ming Zhou
Machine translation has made rapid advances in recent years.
Ranked #3 on Machine Translation on WMT 2017 English-Chinese
no code implementations • ACL 2017 • Shuangzhi Wu, Dongdong Zhang, Nan Yang, Mu Li, Ming Zhou
Nowadays, a typical Neural Machine Translation (NMT) model generates translations from left to right as a linear sequence, without explicitly modeling the latent syntactic structures of the target sentences.