no code implementations • WMT (EMNLP) 2021 • Meng Zhang, Minghao Wu, Pengfei Li, Liangyou Li, Qun Liu
This paper describes the NoahNMT system submitted to the WMT 2021 shared task of Very Low Resource Supervised Machine Translation.
no code implementations • 21 Feb 2024 • Chenyang Lyu, Minghao Wu, Alham Fikri Aji
Large Language Models (LLMs) have demonstrated remarkable capabilities across various applications, fundamentally reshaping the landscape of natural language processing (NLP) research.
no code implementations • 27 Jan 2024 • Minghao Wu, YuFei Wang, George Foster, Lizhen Qu, Gholamreza Haffari
Document-level neural machine translation (DocNMT) aims to generate translations that are both coherent and cohesive, in contrast to its sentence-level counterpart.
no code implementations • 12 Jan 2024 • Minghao Wu, Thuy-Trang Vu, Lizhen Qu, George Foster, Gholamreza Haffari
Large language models (LLMs) have made significant strides in various natural language processing (NLP) tasks.
1 code implementation • 17 Dec 2023 • Renxi Wang, Haonan Li, Minghao Wu, Yuxia Wang, Xudong Han, Chiyu Zhang, Timothy Baldwin
Instruction tuning significantly enhances the performance of large language models (LLMs) across various tasks.
no code implementations • 25 Nov 2023 • Zhanyu Wang, Longyue Wang, Zhen Zhao, Minghao Wu, Chenyang Lyu, Huayang Li, Deng Cai, Luping Zhou, Shuming Shi, Zhaopeng Tu
While the recent advances in Multimodal Large Language Models (MLLMs) constitute a significant leap forward in the field, these models are predominantly confined to the realm of input-side multimodal comprehension, lacking the capacity for multimodal content generation.
no code implementations • 6 Jul 2023 • Minghao Wu, Alham Fikri Aji
This study investigates the behavior of crowd-sourced and expert annotators, as well as LLMs, when comparing outputs from different models.
1 code implementation • 15 Jun 2023 • Chenyang Lyu, Minghao Wu, Longyue Wang, Xinting Huang, Bingshuai Liu, Zefeng Du, Shuming Shi, Zhaopeng Tu
Although instruction-tuned large language models (LLMs) have exhibited remarkable capabilities across various NLP tasks, their effectiveness on other data modalities beyond text has not been fully studied.
1 code implementation • 24 May 2023 • Haonan Li, Fajri Koto, Minghao Wu, Alham Fikri Aji, Timothy Baldwin
Research on multilingual instruction tuning has been limited due to the scarcity of high-quality instruction-response datasets across different languages.
no code implementations • 2 May 2023 • Chenyang Lyu, Zefeng Du, Jitao Xu, Yitao Duan, Minghao Wu, Teresa Lynn, Alham Fikri Aji, Derek F. Wong, Siyou Liu, Longyue Wang
We conclude by emphasizing the critical role of LLMs in guiding the future evolution of MT and offer a roadmap for future exploration in the sector.
1 code implementation • 27 Apr 2023 • Minghao Wu, Abdul Waheed, Chiyu Zhang, Muhammad Abdul-Mageed, Alham Fikri Aji
The results demonstrate that our proposed LaMini-LM models are comparable to competitive baselines, while being much smaller in size.
Ranked #15 on Word Sense Disambiguation on Words in Context
no code implementations • 16 Feb 2023 • Minghao Wu, George Foster, Lizhen Qu, Gholamreza Haffari
Existing work in document-level neural machine translation commonly concatenates several consecutive sentences as a pseudo-document, and then learns inter-sentential dependencies.
1 code implementation • ACL 2022 • Pengfei Li, Liangyou Li, Meng Zhang, Minghao Wu, Qun Liu
To the best of our knowledge, this is the first work to pre-train a unified model for fine-tuning on both NMT tasks.
no code implementations • EMNLP 2021 • Minghao Wu, Yitong Li, Meng Zhang, Liangyou Li, Gholamreza Haffari, Qun Liu
In this work, we propose MultiUAT, an approach for multi-corpus machine translation that dynamically adjusts training data usage based on the model's uncertainty on a small set of trusted clean data.
1 code implementation • EMNLP 2018 • Minghao Wu, Fei Liu, Trevor Cohn
Conventional wisdom holds that hand-crafted features are redundant for deep learning models, as such models already learn adequate representations of text automatically from corpora.
Ranked #42 on Named Entity Recognition (NER) on CoNLL 2003 (English)