no code implementations • WMT (EMNLP) 2021 • Longyue Wang, Mu Li, Fangxu Liu, Shuming Shi, Zhaopeng Tu, Xing Wang, Shuangzhi Wu, Jiali Zeng, Wen Zhang
Building on our success in last year's WMT, we again employed advanced techniques such as large-batch training, data selection, and data filtering.
no code implementations • WMT (EMNLP) 2020 • Shuangzhi Wu, Xing Wang, Longyue Wang, Fangxu Liu, Jun Xie, Zhaopeng Tu, Shuming Shi, Mu Li
This paper describes Tencent Neural Machine Translation systems for the WMT 2020 news translation tasks.
no code implementations • WMT (EMNLP) 2020 • Longyue Wang, Zhaopeng Tu, Xing Wang, Li Ding, Liang Ding, Shuming Shi
This paper describes the Tencent AI Lab's submission to the WMT 2020 shared task on chat translation for English-German.
1 code implementation • WMT (EMNLP) 2020 • Xing Wang, Zhaopeng Tu, Longyue Wang, Shuming Shi
This paper describes the Tencent AI Lab submission to the WMT 2020 shared task on biomedical translation in four language directions: German→English, English→German, Chinese→English, and English→Chinese.
1 code implementation • ACL 2022 • Liang Ding, Longyue Wang, Shuming Shi, DaCheng Tao, Zhaopeng Tu
In this work, we provide an appealing alternative for NAT: monolingual KD, which trains the NAT student on external monolingual data with an AT teacher trained on the original bilingual data.
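A minimal sketch of this monolingual KD recipe, with a Hugging Face AT model as a stand-in teacher (the model name and decoding settings are illustrative, not the paper's exact setup):

```python
# Hedged sketch: an AT teacher translates external monolingual source
# sentences; the NAT student then trains on the resulting synthetic pairs.
from transformers import MarianMTModel, MarianTokenizer

teacher_name = "Helsinki-NLP/opus-mt-en-de"  # illustrative stand-in teacher
tokenizer = MarianTokenizer.from_pretrained(teacher_name)
teacher = MarianMTModel.from_pretrained(teacher_name)

def distill(monolingual_sources):
    """Build (source, teacher translation) pairs for NAT training."""
    batch = tokenizer(monolingual_sources, return_tensors="pt", padding=True)
    outputs = teacher.generate(**batch, num_beams=5)
    targets = tokenizer.batch_decode(outputs, skip_special_tokens=True)
    return list(zip(monolingual_sources, targets))

synthetic_pairs = distill(["The cat sat on the mat."])
# The NAT student is trained on synthetic_pairs rather than the raw bitext.
```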
1 code implementation • 20 Mar 2024 • Peng Zhou, Jianmin Wang, Chunyan Li, Zixu Wang, Yiping Liu, Siqi Sun, Jianxin Lin, Longyue Wang, Xiangxiang Zeng
While various models and computational tools have been proposed for structure and property analysis of molecules, generating molecules that conform to all desired structures and properties remains a challenge.
no code implementations • 12 Feb 2024 • Jianhui Pang, Fanghua Ye, Derek F. Wong, Longyue Wang
Large language models (LLMs) predominantly employ decoder-only transformer architectures, necessitating the retention of key/value (KV) information for historical tokens to provide contextual information and avoid redundant computation.
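For illustration, a toy sketch of the KV cache behind this design (single attention head, no batching; purely illustrative):

```python
# At each decoding step, only the new token's keys/values are computed and
# appended to the cache, avoiding recomputation over the full history.
import torch

def attend_with_cache(q_new, k_new, v_new, cache):
    """q_new/k_new/v_new: (1, d) tensors for the current token."""
    if cache is None:
        keys, values = k_new, v_new
    else:
        keys = torch.cat([cache[0], k_new], dim=0)
        values = torch.cat([cache[1], v_new], dim=0)
    scores = q_new @ keys.T / keys.shape[-1] ** 0.5
    out = torch.softmax(scores, dim=-1) @ values
    return out, (keys, values)  # the updated cache carries the history

cache, d = None, 8
for _ in range(4):  # toy autoregressive loop
    q, k, v = torch.randn(1, d), torch.randn(1, d), torch.randn(1, d)
    out, cache = attend_with_cache(q, k, v, cache)
```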
1 code implementation • 23 Jan 2024 • Fanghua Ye, Mingming Yang, Jianhui Pang, Longyue Wang, Derek F. Wong, Emine Yilmaz, Shuming Shi, Zhaopeng Tu
The proliferation of open-source Large Language Models (LLMs) from various institutions has highlighted the urgent need for comprehensive evaluation methods.
1 code implementation • 16 Jan 2024 • Jianhui Pang, Fanghua Ye, Longyue Wang, Dian Yu, Derek F. Wong, Shuming Shi, Zhaopeng Tu
This study revisits these challenges, offering insights into their ongoing relevance in the context of advanced Large Language Models (LLMs): domain mismatch, amount of parallel data, rare word prediction, translation of long sentences, attention model as word alignment, and sub-optimal beam search.
1 code implementation • 28 Dec 2023 • Geyan Ye, Xibao Cai, Houtim Lai, Xing Wang, Junhong Huang, Longyue Wang, Wei Liu, Xiangxiang Zeng
Recently, the impressive performance of large language models (LLMs) on a wide range of tasks has prompted a growing number of attempts to apply LLMs to drug discovery.
1 code implementation • 12 Dec 2023 • Dun Zeng, Yong Dai, Pengyu Cheng, Longyue Wang, Tianhao Hu, Wanshun Chen, Nan Du, Zenglin Xu
Our analysis reveals a correlation between the calibration performance of reward models (RMs) and the alignment performance of LLMs.
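A small sketch of one way to quantify RM calibration on preference pairs, in the spirit of expected calibration error (the binning protocol here is an assumption, not the paper's exact setup):

```python
# Bucket predicted win probabilities and compare them with observed
# agreement rates; a well-calibrated RM has small bucket-wise gaps.
import numpy as np

def expected_calibration_error(pred_probs, labels, n_bins=10):
    """pred_probs: P(chosen beats rejected); labels: 1 if it actually did."""
    pred_probs, labels = np.asarray(pred_probs), np.asarray(labels)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (pred_probs >= lo) & (pred_probs < hi)
        if mask.any():
            gap = abs(pred_probs[mask].mean() - labels[mask].mean())
            ece += mask.mean() * gap
    return ece

ece = expected_calibration_error([0.9, 0.6, 0.8, 0.3], [1, 1, 0, 0])
```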
no code implementations • 4 Dec 2023 • Bingshuai Liu, Chenyang Lyu, Zijun Min, Zhanyu Wang, Jinsong Su, Longyue Wang
The advancement of Large Language Models (LLMs) has brought substantial attention to the Chain of Thought (CoT) approach, primarily due to its ability to enhance the capability of LLMs on complex reasoning tasks.
1 code implementation • 29 Nov 2023 • Zhen Zhao, Zicheng Wang, Longyue Wang, Yixuan Yuan, Luping Zhou
To mitigate the confirmation bias from the diverse supervision, the core of AD-MT lies in two proposed modules: the Random Periodic Alternate (RPA) Updating Module and the Conflict-Combating Module (CCM).
no code implementations • 25 Nov 2023 • Zhanyu Wang, Longyue Wang, Zhen Zhao, Minghao Wu, Chenyang Lyu, Huayang Li, Deng Cai, Luping Zhou, Shuming Shi, Zhaopeng Tu
While the recent advances in Multimodal Large Language Models (MLLMs) constitute a significant leap forward in the field, these models are predominantly confined to the realm of input-side multimodal comprehension, lacking the capacity for multimodal content generation.
no code implementations • 13 Nov 2023 • Yunxin Li, Longyue Wang, Baotian Hu, Xinyu Chen, Wanqi Zhong, Chenyang Lyu, Wei Wang, Min Zhang
The emergence of multimodal large models (MLMs) has significantly advanced the field of visual understanding, offering remarkable capabilities in the realm of visual question answering (VQA).
no code implementations • 6 Nov 2023 • Longyue Wang, Zhaopeng Tu, Yan Gu, Siyou Liu, Dian Yu, Qingsong Ma, Chenyang Lyu, Liting Zhou, Chao-Hong Liu, Yufeng Ma, WeiYu Chen, Yvette Graham, Bonnie Webber, Philipp Koehn, Andy Way, Yulin Yuan, Shuming Shi
To foster progress in this domain, we organize a new shared task at WMT 2023: the first edition of Discourse-Level Literary Translation.
no code implementations • 31 Oct 2023 • Yingshu Li, Yunyi Liu, Zhanyu Wang, Xinyu Liang, Lei Wang, Lingqiao Liu, Leyang Cui, Zhaopeng Tu, Longyue Wang, Luping Zhou
This work conducts an evaluation of GPT-4V's multimodal capability for medical image analysis, with a focus on three representative tasks of radiology report generation, medical visual question answering, and medical visual grounding.
no code implementations • 17 Sep 2023 • Yi Chen, Haiyun Jiang, Wei Bi, Rui Wang, Longyue Wang, Shuming Shi, Ruifeng Xu
This work presents a new task of Text Expansion (TE), which aims to insert fine-grained modifiers into proper locations of the plain text to concretize or vivify human writings.
1 code implementation • 14 Sep 2023 • Huayang Li, Siheng Li, Deng Cai, Longyue Wang, Lemao Liu, Taro Watanabe, Yujiu Yang, Shuming Shi
We release our dataset, model, and demo to foster future research in the area of multimodal instruction following.
1 code implementation • 3 Sep 2023 • Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang, Enbo Zhao, Yu Zhang, Yulong Chen, Longyue Wang, Anh Tuan Luu, Wei Bi, Freda Shi, Shuming Shi
While large language models (LLMs) have demonstrated remarkable capabilities across a range of downstream tasks, a significant concern revolves around their propensity to exhibit hallucinations: LLMs occasionally generate content that diverges from the user input, contradicts previously generated context, or misaligns with established world knowledge.
no code implementations • 16 Jul 2023 • Longyue Wang, Zefeng Du, Donghuai Liu, Deng Cai, Dian Yu, Haiyun Jiang, Yan Wang, Leyang Cui, Shuming Shi, Zhaopeng Tu
Modeling discourse, the linguistic phenomena that go beyond individual sentences, is a fundamental yet challenging aspect of natural language processing (NLP).
no code implementations • 6 Jul 2023 • Bingshuai Liu, Longyue Wang, Chenyang Lyu, Yong Zhang, Jinsong Su, Shuming Shi, Zhaopeng Tu
Accordingly, we propose a novel multi-modal metric that considers object-text alignment to filter the fine-tuning data in the target culture, which is used to fine-tune a T2I model to improve cross-cultural generation.
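As a rough illustration, alignment-based filtering might look like the sketch below, with CLIP similarity standing in for the paper's object-text alignment metric (the model name and threshold are assumptions):

```python
# Keep only image-caption pairs whose alignment score clears a threshold;
# the surviving pairs are used to fine-tune the T2I model.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def alignment_score(image: Image.Image, caption: str) -> float:
    inputs = processor(text=[caption], images=image, return_tensors="pt")
    with torch.no_grad():
        sim = model(**inputs).logits_per_image  # image-text similarity
    return sim.item()

def filter_pairs(pairs, threshold=25.0):
    return [(img, cap) for img, cap in pairs
            if alignment_score(img, cap) >= threshold]
```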
1 code implementation • 15 Jun 2023 • Chenyang Lyu, Minghao Wu, Longyue Wang, Xinting Huang, Bingshuai Liu, Zefeng Du, Shuming Shi, Zhaopeng Tu
Although instruction-tuned large language models (LLMs) have exhibited remarkable capabilities across various NLP tasks, their effectiveness on other data modalities beyond text has not been fully studied.
no code implementations • 31 May 2023 • Zhihong Huang, Longyue Wang, Siyou Liu, Derek F. Wong
To bridge this gap, we introduce a probing task to interpret the ability of PLMs to capture discourse relation knowledge.
1 code implementation • 29 May 2023 • Yuan Gong, Youxin Pang, Xiaodong Cun, Menghan Xia, Yingqing He, Haoxin Chen, Longyue Wang, Yong Zhang, Xintao Wang, Ying Shan, Yujiu Yang
Accurate story visualization requires several necessary elements, such as identity consistency across frames, alignment between the plain text and the visual content, and a reasonable layout of objects in images.
1 code implementation • 25 May 2023 • Zhihao Wang, Longyue Wang, Jinsong Su, Junfeng Yao, Zhaopeng Tu
Experimental results on the large-scale WMT20 En-De benchmark show that an asymmetric architecture (e.g., a bigger encoder and a smaller decoder) can achieve performance comparable to the scaled model, while maintaining the decoding-speed advantage of standard NAT models.
1 code implementation • 22 May 2023 • Yafu Li, Qintong Li, Leyang Cui, Wei Bi, Longyue Wang, Linyi Yang, Shuming Shi, Yue Zhang
In practical scenarios, the detector faces texts from various domains or LLMs without knowing their sources.
no code implementations • 17 May 2023 • Longyue Wang, Siyou Liu, Mingzhou Xu, Linfeng Song, Shuming Shi, Zhaopeng Tu
Zero pronouns (ZPs) are frequently omitted in pro-drop languages (e.g., Chinese, Hungarian, and Hindi), but should be recalled in non-pro-drop languages (e.g., English).
no code implementations • 2 May 2023 • Chenyang Lyu, Zefeng Du, Jitao Xu, Yitao Duan, Minghao Wu, Teresa Lynn, Alham Fikri Aji, Derek F. Wong, Siyou Liu, Longyue Wang
We conclude by emphasizing the critical role of LLMs in guiding the future evolution of MT and offer a roadmap for future exploration in the sector.
1 code implementation • 20 Apr 2023 • Chiaming Hsu, Changtong Zan, Liang Ding, Longyue Wang, Xiaoting Wang, Weifeng Liu, Fu Lin, Wenbin Hu
Experiments on WMT17-EnZh XRE also show the effectiveness of our Prompt-XRE against other competitive baselines.
1 code implementation • 5 Apr 2023 • Longyue Wang, Chenyang Lyu, Tianbo Ji, Zhirui Zhang, Dian Yu, Shuming Shi, Zhaopeng Tu
Large language models (LLMs) such as ChatGPT can produce coherent, cohesive, relevant, and fluent answers for various natural language processing (NLP) tasks.
1 code implementation • 16 Feb 2023 • Ante Wang, Linfeng Song, Qi Liu, Haitao Mi, Longyue Wang, Zhaopeng Tu, Jinsong Su, Dong Yu
We propose a dialogue model that can access the vast and dynamic information from any search engine for response generation.
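Schematically, such search-augmented response generation can be sketched as below; `search` and `generate` are hypothetical placeholders, not the paper's actual components:

```python
# Generate a query from the dialogue, retrieve evidence from a stand-in
# search backend, and condition the response on the retrieved snippets.
def search(query: str) -> list[str]:
    # Hypothetical stand-in for any search engine API returning snippets.
    return [f"snippet about {query}"]

def generate(prompt: str) -> str:
    # Hypothetical stand-in for any conditional text generator.
    return f"(response grounded in: {prompt[:40]}...)"

def respond(dialogue_history: list[str]) -> str:
    query = dialogue_history[-1]          # naive query: the last user turn
    evidence = search(query)
    prompt = "\n".join(dialogue_history + evidence)
    return generate(prompt)

print(respond(["Hi!", "Who won the match last night?"]))
```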
1 code implementation • 13 Nov 2022 • Nuo Chen, Yan Wang, Haiyun Jiang, Deng Cai, Yuhan Li, Ziyang Chen, Longyue Wang, Jia Li
In this paper, we introduce the Harry Potter Dialogue (HPD) dataset, designed to advance the study of dialogue agents and character alignment.
no code implementations • COLING 2022 • Cunxiao Du, Zhaopeng Tu, Longyue Wang, Jing Jiang
Recently, a new training objective, the order-agnostic cross entropy (OAXE) loss, has proven effective at ameliorating the effect of multimodality in non-autoregressive translation (NAT) by removing the penalty on word-order errors in the standard cross-entropy loss.
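A simplified sketch of an OAXE-style loss, scoring predictions under the best one-to-one assignment between output positions and target tokens via Hungarian matching, so pure word-order errors go unpenalized (a simplification of the published objective):

```python
import torch
from scipy.optimize import linear_sum_assignment

def oaxe_loss(log_probs, target):
    """log_probs: (T, V) per-position log-probs; target: (T,) token ids."""
    cost = -log_probs[:, target]  # cost[i, j] = -log P(target[j] | pos i)
    rows, cols = linear_sum_assignment(cost.detach().numpy())
    rows, cols = torch.as_tensor(rows), torch.as_tensor(cols)
    return cost[rows, cols].mean()  # CE under the best assignment

T, V = 5, 100
log_probs = torch.log_softmax(torch.randn(T, V), dim=-1)
target = torch.randint(0, V, (T,))
loss = oaxe_loss(log_probs, target)
```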
1 code implementation • Findings (EMNLP) 2021 • Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Shuming Shi, Zhaopeng Tu
Pre-training (PT) and back-translation (BT) are two simple and powerful methods to utilize monolingual data for improving the model performance of neural machine translation (NMT).
1 code implementation • Findings (ACL) 2021 • Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Shuming Shi, Zhaopeng Tu
In response to this problem, we propose a simple and effective method named copying penalty to control the copying behaviors in decoding.
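One way to picture such a penalty is the toy sketch below, which down-weights candidate tokens that would copy the source during decoding (a simplification of the method described in the paper):

```python
import torch

def apply_copying_penalty(logits, source_token_ids, penalty=1.0):
    """logits: (V,) next-token logits; source_token_ids: ids in the source."""
    penalized = logits.clone()
    penalized[list(set(source_token_ids))] -= penalty  # discourage copying
    return penalized

logits = torch.randn(100)
adjusted = apply_copying_penalty(logits, source_token_ids=[5, 17, 42])
next_token = adjusted.argmax().item()
```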
no code implementations • Findings (ACL) 2021 • Liang Ding, Longyue Wang, Xuebo Liu, Derek F. Wong, DaCheng Tao, Zhaopeng Tu
Non-autoregressive translation (NAT) significantly accelerates the inference process by predicting the entire target sequence in parallel.
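A toy illustration of where this speedup comes from, assuming a trivial position-embedding "decoder" (all names and shapes are illustrative):

```python
import torch

# One forward pass scores every target position simultaneously,
# instead of generating token by token as an autoregressive model would.
V, d, T = 100, 16, 7                     # vocab, hidden size, target length
encoder_out = torch.randn(d)             # stand-in sentence representation
pos_emb = torch.nn.Embedding(T, d)
proj = torch.nn.Linear(d, V)

hidden = encoder_out + pos_emb(torch.arange(T))  # (T, d), broadcast add
tokens = proj(hidden).argmax(dim=-1)             # all T tokens in parallel
```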
1 code implementation • ACL 2021 • Liang Ding, Longyue Wang, Xuebo Liu, Derek F. Wong, DaCheng Tao, Zhaopeng Tu
Results demonstrate that the proposed approach can significantly and universally improve translation quality by reducing translation errors on low-frequency words.
no code implementations • 27 May 2021 • Guoping Huang, Lemao Liu, Xing Wang, Longyue Wang, Huayang Li, Zhaopeng Tu, Chengyan Huang, Shuming Shi
Automatic machine translation is highly efficient at producing translations, yet their quality is not guaranteed.
no code implementations • ICLR 2021 • Liang Ding, Longyue Wang, Xuebo Liu, Derek F. Wong, DaCheng Tao, Zhaopeng Tu
To this end, we introduce an extra Kullback-Leibler divergence term derived by comparing the lexical choice of NAT model and that embedded in the raw data.
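Schematically, such a combined objective might look like the following sketch, with a stand-in prior distribution in place of the paper's bilingual-data-derived one:

```python
import torch
import torch.nn.functional as F

def loss_with_lexical_kl(logits, target, raw_data_prior, alpha=0.5):
    """logits: (T, V); target: (T,); raw_data_prior: (T, V) distribution."""
    ce = F.cross_entropy(logits, target)
    log_q = F.log_softmax(logits, dim=-1)
    # KL term pulling the model's lexical distribution toward the prior.
    kl = F.kl_div(log_q, raw_data_prior, reduction="batchmean")
    return ce + alpha * kl

T, V = 6, 50
logits = torch.randn(T, V)
target = torch.randint(0, V, (T,))
prior = torch.softmax(torch.randn(T, V), dim=-1)  # stand-in raw-data prior
loss = loss_with_lexical_kl(logits, target, prior)
```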
1 code implementation • ICLR 2021 • Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Zhaopeng Tu
Encoder layer fusion (EncoderFusion) is a technique to fuse all the encoder layers (instead of the uppermost layer) for sequence-to-sequence (Seq2Seq) models, which has proven effective on various NLP tasks.
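A minimal sketch of layer fusion with learned softmax weights, one simple instantiation of the general idea (the paper studies richer variants):

```python
import torch
import torch.nn as nn

class LayerFusion(nn.Module):
    def __init__(self, num_layers):
        super().__init__()
        self.weights = nn.Parameter(torch.zeros(num_layers))

    def forward(self, layer_outputs):
        """layer_outputs: list of (T, d) tensors, one per encoder layer."""
        stacked = torch.stack(layer_outputs)      # (L, T, d)
        w = torch.softmax(self.weights, dim=0)    # normalized fusion weights
        return (w[:, None, None] * stacked).sum(dim=0)

fusion = LayerFusion(num_layers=6)
layers = [torch.randn(10, 512) for _ in range(6)]
fused = fusion(layers)  # fed to the decoder in place of the top layer only
```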
1 code implementation • COLING 2020 • Liang Ding, Longyue Wang, Di Wu, DaCheng Tao, Zhaopeng Tu
Non-autoregressive translation (NAT) significantly accelerates the inference process by predicting the entire target sequence.
no code implementations • EMNLP 2020 • Yong Wang, Longyue Wang, Victor O. K. Li, Zhaopeng Tu
Modern neural machine translation (NMT) models employ a large number of parameters, which leads to serious over-parameterization and typically causes the underutilization of computational resources.
no code implementations • Findings (EMNLP) 2020 • Yilin Yang, Longyue Wang, Shuming Shi, Prasad Tadepalli, Stefan Lee, Zhaopeng Tu
There have been significant efforts to interpret the encoder of Transformer-based encoder-decoder architectures for neural machine translation (NMT); meanwhile, the decoder remains largely unexamined despite its critical role.