1 code implementation • EMNLP 2021 • Jiali Zeng, Shuangzhi Wu, Yongjing Yin, Yufan Jiang, Mu Li
Across an extensive set of experiments on 10 machine translation tasks, we find that RAN models are competitive and outperform their Transformer counterparts in certain scenarios, with fewer parameters and lower inference time.
no code implementations • 28 Mar 2024 • Yufan Jiang, Qiaozhi He, Xiaomin Zhuang, Zhihua Wu
We present Code Comparison Tuning (CCT), a simple and effective tuning method for code large language models (Code LLMs) to better handle subtle code errors.
no code implementations • 7 Aug 2023 • Yufan Jiang, Qiaozhi He, Xiaomin Zhuang, Zhihua Wu, Kunpeng Wang, Wenlai Zhao, Guangwen Yang
Existing large language models have to run K times to generate a sequence of K tokens.
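For reference, a minimal sketch of this standard one-token-per-step autoregressive loop makes the K-forward-pass cost explicit; the checkpoint name and greedy decoding below are illustrative placeholders, not the paper's method:

```python
# Minimal sketch of standard autoregressive decoding: generating K tokens
# requires K forward passes through the model. Checkpoint is a placeholder.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

input_ids = tokenizer("The quick brown fox", return_tensors="pt").input_ids
K = 16  # number of tokens to generate
with torch.no_grad():
    for _ in range(K):
        logits = model(input_ids).logits             # one full forward pass
        next_id = logits[:, -1, :].argmax(dim=-1)    # greedy choice of next token
        input_ids = torch.cat([input_ids, next_id.unsqueeze(-1)], dim=-1)

print(tokenizer.decode(input_ids[0]))
```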
no code implementations • 13 Jun 2023 • Jiali Zeng, Yufan Jiang, Yongjing Yin, Yi Jing, Fandong Meng, Binghuai Lin, Yunbo Cao, Jie Zhou
Multilingual pre-trained language models have demonstrated impressive (zero-shot) cross-lingual transfer abilities; however, their performance is hindered when the target language is typologically distant from the source languages or when pre-training data is limited in size.
no code implementations • 14 Dec 2022 • Xin Zheng, Tianyu Liu, Haoran Meng, Xu Wang, Yufan Jiang, Mengliang Rao, Binghuai Lin, Zhifang Sui, Yunbo Cao
Harvesting question-answer (QA) pairs from customer service chatlogs in the wild is an efficient way to enrich the knowledge base for customer service chatbots in cold-start or continuous-integration scenarios.
no code implementations • 15 Nov 2022 • Jiali Zeng, Yufan Jiang, Yongjing Yin, Xu Wang, Binghuai Lin, Yunbo Cao
We present DualNER, a simple and effective framework to make full use of both annotated source language corpus and unlabeled target language text for zero-shot cross-lingual named entity recognition (NER).
no code implementations • 7 Nov 2022 • Jiali Zeng, Yongjing Yin, Yufan Jiang, Shuangzhi Wu, Yunbo Cao
Specifically, with the help of prompts, we construct virtual semantic prototypes for each instance and derive negative prototypes by using the negative form of the prompts.
1 code implementation • COLING 2022 • Xinnian Liang, Jing Li, Shuangzhi Wu, Jiali Zeng, Yufan Jiang, Mu Li, Zhoujun Li
To tackle this problem, in this paper we propose an efficient Coarse-to-Fine Facet-Aware Ranking (C2F-FAR) framework for unsupervised long-document summarization, which is based on semantic blocks.
1 code implementation • Findings (ACL) 2022 • Jiali Zeng, Yufan Jiang, Shuangzhi Wu, Yongjing Yin, Mu Li
Pretrained language models (PLMs) trained on large-scale unlabeled corpora are typically fine-tuned on task-specific downstream datasets, an approach that has produced state-of-the-art results on various NLP tasks.
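As a rough illustration of this pretrain-then-fine-tune recipe (the checkpoint, dataset, and hyperparameters below are assumptions for the sketch, not those used in the paper):

```python
# Sketch of fine-tuning a pretrained LM on a downstream classification task.
# Checkpoint, dataset, and hyperparameters are illustrative placeholders.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

dataset = load_dataset("glue", "sst2")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True,
                     padding="max_length", max_length=128)

encoded = dataset.map(tokenize, batched=True)
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

args = TrainingArguments(output_dir="out", num_train_epochs=3,
                         per_device_train_batch_size=32, learning_rate=2e-5)
Trainer(model=model, args=args,
        train_dataset=encoded["train"],
        eval_dataset=encoded["validation"]).train()
```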
no code implementations • COLING 2020 • Chen Xu, Bojie Hu, Yufan Jiang, Kai Feng, Zeyang Wang, Shen Huang, Qi Ju, Tong Xiao, Jingbo Zhu
This eases training by highlighting easy samples that the current model has enough competence to learn.
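A generic competence-based filter along these lines can be sketched as follows; the square-root schedule and normalized difficulty scores are assumptions borrowed from the general curriculum-learning literature, not the paper's exact procedure:

```python
# Generic competence-based curriculum sketch: at step t, only samples whose
# (normalized, 0..1) difficulty is within the model's current competence
# are eligible for training.
import math
import random

def competence(step, total_steps, c0=0.1):
    # Square-root schedule: competence grows from c0 to 1.0 over training.
    return min(1.0, math.sqrt(c0 ** 2 + (1 - c0 ** 2) * step / total_steps))

def sample_batch(examples, difficulties, step, total_steps, batch_size=32):
    c = competence(step, total_steps)
    eligible = [ex for ex, d in zip(examples, difficulties) if d <= c]
    return random.sample(eligible, min(batch_size, len(eligible)))
```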
no code implementations • 6 Nov 2020 • Yufan Jiang, Shuangzhi Wu, Jing Gong, Yahui Cheng, Peng Meng, Weiliang Lin, Zhibo Chen, Mu Li
In addition, by transferring knowledge from other kinds of MRC tasks, our model achieves new state-of-the-art results in both single and ensemble settings.
Ranked #1 on Reading Comprehension on RACE
1 code implementation • EMNLP 2020 • Bei Li, Ziyang Wang, Hui Liu, Yufan Jiang, Quan Du, Tong Xiao, Huizhen Wang, Jingbo Zhu
We find that stacking layers helps improve the representation ability of NMT models and that adjacent layers perform similarly.
1 code implementation • ACL 2020 • Bei Li, Hui Liu, Ziyang Wang, Yufan Jiang, Tong Xiao, Jingbo Zhu, Tongran Liu, Changliang Li
In encoder-decoder neural models, multiple encoders are generally used to represent contextual information in addition to the individual sentence.
no code implementations • ACL 2020 • Yinqiao Li, Chi Hu, Yuhao Zhang, Nuo Xu, Yufan Jiang, Tong Xiao, Jingbo Zhu, Tongran Liu, Changliang Li
Neural architecture search (NAS) has advanced significantly in recent years, but most NAS systems restrict the search to learning the architecture of a recurrent or convolutional cell.
1 code implementation • IJCNLP 2019 • Yufan Jiang, Chi Hu, Tong Xiao, Chunliang Zhang, Jingbo Zhu
In this paper, we study differentiable neural architecture search (NAS) methods for natural language processing.
Ranked #1 on Language Modelling on Penn Treebank (PTB)