Search Results for author: Shuming Shi

Found 144 papers, 60 papers with code

Tencent AI Lab Machine Translation Systems for the WMT20 Biomedical Translation Task

1 code implementation WMT (EMNLP) 2020 Xing Wang, Zhaopeng Tu, Longyue Wang, Shuming Shi

This paper describes the Tencent AI Lab submission to the WMT2020 shared task on biomedical translation in four language directions: German->English, English->German, Chinese->English and English->Chinese.

Machine Translation Translation

Redistributing Low-Frequency Words: Making the Most of Monolingual Data in Non-Autoregressive Translation

1 code implementation ACL 2022 Liang Ding, Longyue Wang, Shuming Shi, DaCheng Tao, Zhaopeng Tu

In this work, we provide an appealing alternative for NAT – monolingual KD, which trains the NAT student on external monolingual data with an AT teacher trained on the original bilingual data.

Knowledge Distillation Translation +1
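
A minimal Python sketch of the monolingual KD idea described above: the AT teacher, trained on the original bilingual data, labels external monolingual text, and the resulting synthetic pairs become the NAT student's training data. The at_teacher.translate interface is an assumption for illustration, not the authors' code.

    def monolingual_kd_data(at_teacher, monolingual_sources):
        # The AT teacher (trained on the original bilingual data) translates
        # external monolingual text; the NAT student trains on these pairs.
        return [(src, at_teacher.translate(src)) for src in monolingual_sources]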

On the Relationship between Neural Machine Translation and Word Alignment

no code implementations Xintong Li, Lemao Liu, Guanlin Li, Max Meng, Shuming Shi

We find that although NMT models struggle to capture word alignment for CFT words, these words do not sacrifice translation quality significantly, which explains why NMT is more successful at translation yet worse at word alignment compared to statistical machine translation.

Machine Translation NMT +2

Learning from Sibling Mentions with Scalable Graph Inference in Fine-Grained Entity Typing

no code implementations ACL 2022 Yi Chen, Jiayang Cheng, Haiyun Jiang, Lemao Liu, Haisong Zhang, Shuming Shi, Ruifeng Xu

In this paper, we first empirically find that existing models struggle to handle hard mentions due to their insufficient contexts, which consequently limits their overall typing performance.

Entity Typing

An Empirical Study on Multiple Information Sources for Zero-Shot Fine-Grained Entity Typing

no code implementations EMNLP 2021 Yi Chen, Haiyun Jiang, Lemao Liu, Shuming Shi, Chuang Fan, Min Yang, Ruifeng Xu

Auxiliary information from multiple sources has been demonstrated to be effective in zero-shot fine-grained entity typing (ZFET).

Entity Typing

Fine-grained Entity Typing without Knowledge Base

1 code implementation EMNLP 2021 Jing Qian, Yibin Liu, Lemao Liu, Yangming Li, Haiyun Jiang, Haisong Zhang, Shuming Shi

Existing work on Fine-grained Entity Typing (FET) typically trains automatic models on the datasets obtained by using Knowledge Bases (KB) as distant supervision.

Entity Typing named-entity-recognition +2

Tencent Translation System for the WMT21 News Translation Task

no code implementations WMT (EMNLP) 2021 Longyue Wang, Mu Li, Fangxu Liu, Shuming Shi, Zhaopeng Tu, Xing Wang, Shuangzhi Wu, Jiali Zeng, Wen Zhang

Based on our success in the last WMT, we continued to employ advanced techniques such as large-batch training, data selection and data filtering.

Data Augmentation Translation

Tencent AI Lab Machine Translation Systems for the WMT21 Biomedical Translation Task

no code implementations WMT (EMNLP) 2021 Xing Wang, Zhaopeng Tu, Shuming Shi

This paper describes the Tencent AI Lab submission to the WMT2021 shared task on biomedical translation in eight language directions: English<->German, English<->French, English<->Spanish and English<->Russian.

Machine Translation Translation

Retrieval is Accurate Generation

no code implementations27 Feb 2024 Bowen Cao, Deng Cai, Leyang Cui, Xuxin Cheng, Wei Bi, Yuexian Zou, Shuming Shi

To address this, we propose to initialize the training oracles using linguistic heuristics and, more importantly, bootstrap the oracles through iterative self-reinforcement.

Language Modelling Retrieval +1

Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model

1 code implementation23 Jan 2024 Zhiwei He, Xing Wang, Wenxiang Jiao, Zhuosheng Zhang, Rui Wang, Shuming Shi, Zhaopeng Tu

In this work, we investigate the potential of employing the QE model as the reward model to predict human preferences for feedback training.

Machine Translation Translation
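
A hedged sketch of one way to use a QE model as a reward model, via plain REINFORCE; policy.sample and qe_model.score are assumed interfaces, and the paper's actual feedback-training recipe may differ.

    def qe_feedback_step(policy, qe_model, src, optimizer):
        # Sample a translation and its log-probability from the MT policy
        # (logprob is assumed to be a differentiable tensor).
        hyp, logprob = policy.sample(src)        # assumed interface
        # A reference-free QE score stands in for human preference.
        reward = qe_model.score(src, hyp)        # assumed interface
        loss = -reward * logprob                 # REINFORCE objective
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        return reward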

Benchmarking LLMs via Uncertainty Quantification

1 code implementation23 Jan 2024 Fanghua Ye, Mingming Yang, Jianhui Pang, Longyue Wang, Derek F. Wong, Emine Yilmaz, Shuming Shi, Zhaopeng Tu

The proliferation of open-source Large Language Models (LLMs) from various institutions has highlighted the urgent need for comprehensive evaluation methods.

Benchmarking Uncertainty Quantification

Knowledge Fusion of Large Language Models

1 code implementation19 Jan 2024 Fanqi Wan, Xinting Huang, Deng Cai, Xiaojun Quan, Wei Bi, Shuming Shi

In this paper, we introduce the notion of knowledge fusion for LLMs, aimed at combining the capabilities of existing LLMs and transferring them into a single LLM.

Code Generation

Knowledge Verification to Nip Hallucination in the Bud

1 code implementation19 Jan 2024 Fanqi Wan, Xinting Huang, Leyang Cui, Xiaojun Quan, Wei Bi, Shuming Shi

While large language models (LLMs) have demonstrated exceptional performance across various tasks following human alignment, they may still generate responses that sound plausible but contradict factual knowledge, a phenomenon known as hallucination.

Hallucination World Knowledge

Salute the Classic: Revisiting Challenges of Machine Translation in the Age of Large Language Models

1 code implementation16 Jan 2024 Jianhui Pang, Fanghua Ye, Longyue Wang, Dian Yu, Derek F. Wong, Shuming Shi, Zhaopeng Tu

This study revisits these challenges, offering insights into their ongoing relevance in the context of advanced Large Language Models (LLMs): domain mismatch, amount of parallel data, rare word prediction, translation of long sentences, attention model as word alignment, and sub-optimal beam search.

Machine Translation NMT +2

Inferflow: an Efficient and Highly Configurable Inference Engine for Large Language Models

1 code implementation16 Jan 2024 Shuming Shi, Enbo Zhao, Deng Cai, Leyang Cui, Xinting Huang, Huayang Li

We present Inferflow, an efficient and highly configurable inference engine for large language models (LLMs).

Quantization

Alleviating Hallucinations of Large Language Models through Induced Hallucinations

2 code implementations25 Dec 2023 Yue Zhang, Leyang Cui, Wei Bi, Shuming Shi

Experimental results on both discrimination-based and generation-based hallucination evaluation benchmarks, such as TruthfulQA and FActScore, demonstrate that our proposed ICD methods can effectively enhance the factuality of LLMs across various model sizes and families.

Hallucination Hallucination Evaluation
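
The core contrastive step can be sketched in a few lines of Python: tokens favored by a deliberately hallucination-prone model are penalized when decoding from the base model. The exact weighting used in the paper may differ.

    def icd_next_token_logits(base_logits, induced_logits, alpha=1.0):
        # Amplify the base model and subtract the induced-hallucination
        # model, down-weighting tokens the hallucinating model prefers.
        return (1 + alpha) * base_logits - alpha * induced_logits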

Emage: Non-Autoregressive Text-to-Image Generation

no code implementations22 Dec 2023 Zhangyin Feng, Runyi Hu, Liangxin Liu, Fan Zhang, Duyu Tang, Yong Dai, Xiaocheng Feng, Jiwei Li, Bing Qin, Shuming Shi

Compared with autoregressive baselines that need to run one thousand times, our model runs only 16 times to generate images of competitive quality with an order of magnitude lower inference latency.

Denoising Text-to-Image Generation

Reasons to Reject? Aligning Language Models with Judgments

1 code implementation22 Dec 2023 Weiwen Xu, Deng Cai, Zhisong Zhang, Wai Lam, Shuming Shi

As humans, we consistently engage in interactions with our peers and receive feedback in the form of natural language.

When Graph Data Meets Multimodal: A New Paradigm for Graph Understanding and Reasoning

no code implementations16 Dec 2023 Qihang Ai, Jianwu Zhou, Haiyun Jiang, Lemao Liu, Shuming Shi

Graph data is ubiquitous in the physical world, and it has always been a challenge to efficiently model graph structures using a unified paradigm for the understanding and reasoning on various graphs.

Optical Character Recognition (OCR)

GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation

no code implementations25 Nov 2023 Zhanyu Wang, Longyue Wang, Zhen Zhao, Minghao Wu, Chenyang Lyu, Huayang Li, Deng Cai, Luping Zhou, Shuming Shi, Zhaopeng Tu

While the recent advances in Multimodal Large Language Models (MLLMs) constitute a significant leap forward in the field, these models are predominantly confined to the realm of input-side multimodal comprehension, lacking the capacity for multimodal content generation.

Instruction Following Language Modelling +7

StrategyLLM: Large Language Models as Strategy Generators, Executors, Optimizers, and Evaluators for Problem Solving

no code implementations15 Nov 2023 Chang Gao, Haiyun Jiang, Deng Cai, Shuming Shi, Wai Lam

Most existing chain-of-thought (CoT) prompting methods suffer from the issues of generalizability and consistency, as they often rely on instance-specific solutions that may not be applicable to other cases and lack task-level consistency in their reasoning steps.

Math

Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models

1 code implementation31 Oct 2023 Tian Liang, Zhiwei He, Jen-tse Huang, Wenxuan Wang, Wenxiang Jiao, Rui Wang, Yujiu Yang, Zhaopeng Tu, Shuming Shi, Xing Wang

Ideally, an advanced agent should possess the ability to accurately describe a given word using an aggressive description while concurrently maximizing confusion in the conservative description, enhancing its participation in the game.

IMTLab: An Open-Source Platform for Building, Evaluating, and Diagnosing Interactive Machine Translation Systems

1 code implementation17 Oct 2023 Xu Huang, Zhirui Zhang, Ruize Gao, Yichao Du, Lemao Liu, Guoping Huang, Shuming Shi, Jiajun Chen, ShuJian Huang

We present IMTLab, an open-source end-to-end interactive machine translation (IMT) system platform that enables researchers to quickly build IMT systems with state-of-the-art models, perform an end-to-end evaluation, and diagnose the weakness of systems.

Machine Translation Translation

Explore-Instruct: Enhancing Domain-Specific Instruction Coverage through Active Exploration

1 code implementation13 Oct 2023 Fanqi Wan, Xinting Huang, Tao Yang, Xiaojun Quan, Wei Bi, Shuming Shi

Instruction-tuning can be substantially optimized through enhanced diversity, resulting in models capable of handling a broader spectrum of tasks.

RobustGEC: Robust Grammatical Error Correction Against Subtle Context Perturbation

1 code implementation11 Oct 2023 Yue Zhang, Leyang Cui, Enbo Zhao, Wei Bi, Shuming Shi

In this paper, we introduce RobustGEC, a benchmark designed to evaluate the context robustness of GEC systems.

Grammatical Error Correction Sentence

A Benchmark for Text Expansion: Datasets, Metrics, and Baselines

no code implementations17 Sep 2023 Yi Chen, Haiyun Jiang, Wei Bi, Rui Wang, Longyue Wang, Shuming Shi, Ruifeng Xu

This work presents a new task of Text Expansion (TE), which aims to insert fine-grained modifiers into proper locations of the plain text to concretize or vivify human writings.

Informativeness Text Infilling

Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models

1 code implementation3 Sep 2023 Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang, Enbo Zhao, Yu Zhang, Yulong Chen, Longyue Wang, Anh Tuan Luu, Wei Bi, Freda Shi, Shuming Shi

While large language models (LLMs) have demonstrated remarkable capabilities across a range of downstream tasks, a significant concern revolves around their propensity to exhibit hallucinations: LLMs occasionally generate content that diverges from the user input, contradicts previously generated context, or misaligns with established world knowledge.

Hallucination World Knowledge

GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher

1 code implementation12 Aug 2023 Youliang Yuan, Wenxiang Jiao, Wenxuan Wang, Jen-tse Huang, Pinjia He, Shuming Shi, Zhaopeng Tu

We propose a novel framework CipherChat to systematically examine the generalizability of safety alignment to non-natural languages -- ciphers.

Ethics
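
The idea can be illustrated with a toy cipher in Python: encode the query, instruct the model to answer in the same cipher, then decode the reply. A Caesar shift is just one of the ciphers studied, and the prompt wording here is illustrative, not the paper's template.

    def caesar(text, shift=3):
        # Encode (or, with a negative shift, decode) letters; everything
        # else passes through unchanged.
        out = []
        for ch in text:
            if ch.isalpha():
                base = ord('a') if ch.islower() else ord('A')
                out.append(chr((ord(ch) - base + shift) % 26 + base))
            else:
                out.append(ch)
        return "".join(out)

    prompt = ("You are an expert on the Caesar cipher. Reply only in the cipher.\n"
              + caesar("How are you?"))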

Disco-Bench: A Discourse-Aware Evaluation Benchmark for Language Modelling

no code implementations16 Jul 2023 Longyue Wang, Zefeng Du, Donghuai Liu, Deng Cai, Dian Yu, Haiyun Jiang, Yan Wang, Leyang Cui, Shuming Shi, Zhaopeng Tu

Modeling discourse -- the linguistic phenomena that go beyond individual sentences -- is a fundamental yet challenging aspect of natural language processing (NLP).

Language Modelling Sentence

On the Cultural Gap in Text-to-Image Generation

no code implementations6 Jul 2023 Bingshuai Liu, Longyue Wang, Chenyang Lyu, Yong Zhang, Jinsong Su, Shuming Shi, Zhaopeng Tu

Accordingly, we propose a novel multi-modal metric that considers object-text alignment to filter the fine-tuning data in the target culture, which is used to fine-tune a T2I model to improve cross-cultural generation.

Text-to-Image Generation

SkillNet-X: A Multilingual Multitask Model with Sparsely Activated Skills

no code implementations28 Jun 2023 Zhangyin Feng, Yong Dai, Fan Zhang, Duyu Tang, Xiaocheng Feng, Shuangzhi Wu, Bing Qin, Yunbo Cao, Shuming Shi

Traditional multitask learning methods can only exploit common knowledge task-wise or language-wise, losing either cross-language or cross-task knowledge.

Natural Language Understanding

Macaw-LLM: Multi-Modal Language Modeling with Image, Audio, Video, and Text Integration

1 code implementation15 Jun 2023 Chenyang Lyu, Minghao Wu, Longyue Wang, Xinting Huang, Bingshuai Liu, Zefeng Du, Shuming Shi, Zhaopeng Tu

Although instruction-tuned large language models (LLMs) have exhibited remarkable capabilities across various NLP tasks, their effectiveness on other data modalities beyond text has not been fully studied.

Language Modelling

Rethinking Translation Memory Augmented Neural Machine Translation

no code implementations12 Jun 2023 Hongkun Hao, Guoping Huang, Lemao Liu, Zhirui Zhang, Shuming Shi, Rui Wang

The finding demonstrates that TM-augmented NMT is good at fitting data (i.e., lower bias) but is more sensitive to fluctuations in the training data (i.e., higher variance), which explains a recently reported contradictory phenomenon on the same translation task: TM-augmented NMT substantially advances vanilla NMT under the high-resource scenario whereas it fails under the low-resource scenario.

Machine Translation NMT +2

Sen2Pro: A Probabilistic Perspective to Sentence Embedding from Pre-trained Language Model

no code implementations4 Jun 2023 Lingfeng Shen, Haiyun Jiang, Lemao Liu, Shuming Shi

Sentence embedding is one of the most fundamental problems in Natural Language Processing and plays an important role in various downstream tasks.

Language Modelling Sentence +2

Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate

1 code implementation30 May 2023 Tian Liang, Zhiwei He, Wenxiang Jiao, Xing Wang, Yan Wang, Rui Wang, Yujiu Yang, Zhaopeng Tu, Shuming Shi

To address the DoT problem, we propose a Multi-Agent Debate (MAD) framework, in which multiple agents express their arguments in the state of "tit for tat" and a judge manages the debate process to obtain a final solution.

Arithmetic Reasoning Machine Translation
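
A minimal sketch of such a debate loop in Python, assuming affirmative, negative, and judge are callables wrapping LLM prompts (the paper's prompt design is richer):

    def multi_agent_debate(question, affirmative, negative, judge, max_rounds=3):
        history = []
        for _ in range(max_rounds):
            # Agents argue "tit for tat", each seeing the debate so far.
            history.append(("affirmative", affirmative(question, history)))
            history.append(("negative", negative(question, history)))
            solved, answer = judge(question, history)
            if solved:  # the judge ends the debate once a solution emerges
                return answer
        return judge(question, history)[1]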

Improved Visual Story Generation with Adaptive Context Modeling

no code implementations26 May 2023 Zhangyin Feng, Yuchen Ren, Xinmiao Yu, Xiaocheng Feng, Duyu Tang, Shuming Shi, Bing Qin

Diffusion models developed on top of powerful text-to-image generation models like Stable Diffusion achieve remarkable success in visual story generation.

Story Generation Story Visualization +1

Enhancing Grammatical Error Correction Systems with Explanations

1 code implementation25 May 2023 Yuejiao Fei, Leyang Cui, Sen yang, Wai Lam, Zhenzhong Lan, Shuming Shi

Grammatical error correction systems improve written communication by detecting and correcting language mistakes.

Grammatical Error Correction

Deepfake Text Detection in the Wild

1 code implementation22 May 2023 Yafu Li, Qintong Li, Leyang Cui, Wei Bi, Longyue Wang, Linyi Yang, Shuming Shi, Yue Zhang

In practical scenarios, the detector faces texts from various domains or LLMs without knowing their sources.

Face Swapping Story Generation +1

A Frustratingly Simple Decoding Method for Neural Text Generation

1 code implementation22 May 2023 Haoran Yang, Deng Cai, Huayang Li, Wei Bi, Wai Lam, Shuming Shi

We introduce a frustratingly simple, super efficient and surprisingly effective decoding method, which we call Frustratingly Simple Decoding (FSD), for neural text generation.

Language Modelling Text Generation
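
FSD contrasts the LM with an anti-LM built on the fly from the generated prefix. The sketch below uses a bigram count model as the anti-LM and a linear interpolation; the exact anti-LM and weighting are assumptions, not the released implementation.

    from collections import Counter

    def fsd_scores(lm_logprobs, prefix, alpha=0.45):
        # Anti-LM: bigram counts over the text generated so far.
        bigrams = Counter(zip(prefix, prefix[1:]))
        prev = prefix[-1] if prefix else None
        total = sum(c for (a, _), c in bigrams.items() if a == prev) or 1
        scores = {}
        for token, logprob in lm_logprobs.items():
            p_anti = bigrams[(prev, token)] / total if prev is not None else 0.0
            # Penalize tokens the anti-LM predicts, discouraging repetition.
            scores[token] = (1 - alpha) * logprob - alpha * p_anti
        return scores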

A Survey on Zero Pronoun Translation

no code implementations17 May 2023 Longyue Wang, Siyou Liu, Mingzhou Xu, Linfeng Song, Shuming Shi, Zhaopeng Tu

Zero pronouns (ZPs) are frequently omitted in pro-drop languages (e.g., Chinese, Hungarian, and Hindi), but should be recalled in non-pro-drop languages (e.g., English).

Language Modelling Large Language Model +2

A Simple and Plug-and-play Method for Unsupervised Sentence Representation Enhancement

no code implementations13 May 2023 Lingfeng Shen, Haiyun Jiang, Lemao Liu, Shuming Shi

Generating proper sentence embeddings in an unsupervised way is beneficial to semantic matching and retrieval problems in real-world scenarios.

Retrieval Sentence +2

Exploring Human-Like Translation Strategy with Large Language Models

2 code implementations6 May 2023 Zhiwei He, Tian Liang, Wenxiang Jiao, Zhuosheng Zhang, Yujiu Yang, Rui Wang, Zhaopeng Tu, Shuming Shi, Xing Wang

Compared to typical machine translation that focuses solely on source-to-target mapping, LLM-based translation can potentially mimic the human translation process which might take preparatory steps to ensure high-quality translation.

Hallucination Machine Translation +2

ParroT: Translating during Chat using Large Language Models tuned with Human Translation and Feedback

1 code implementation5 Apr 2023 Wenxiang Jiao, Jen-tse Huang, Wenxuan Wang, Zhiwei He, Tian Liang, Xing Wang, Shuming Shi, Zhaopeng Tu

Therefore, we propose ParroT, a framework to enhance and regulate the translation abilities during chat based on open-source LLMs (e.g., LLaMA), human-written translation and feedback data.

Instruction Following Machine Translation +1

Document-Level Machine Translation with Large Language Models

1 code implementation5 Apr 2023 Longyue Wang, Chenyang Lyu, Tianbo Ji, Zhirui Zhang, Dian Yu, Shuming Shi, Zhaopeng Tu

Large language models (LLMs) such as ChatGPT can produce coherent, cohesive, relevant, and fluent answers for various natural language processing (NLP) tasks.

Document Level Machine Translation Machine Translation +1

Is ChatGPT A Good Translator? Yes With GPT-4 As The Engine

1 code implementation20 Jan 2023 Wenxiang Jiao, Wenxuan Wang, Jen-tse Huang, Xing Wang, Shuming Shi, Zhaopeng Tu

By evaluating on a number of benchmark test sets, we find that ChatGPT performs competitively with commercial translation products (e.g., Google Translate) on high-resource European languages but lags behind significantly on low-resource or distant languages.

Machine Translation Sentence +1

Zero-Shot Rumor Detection with Propagation Structure via Prompt Learning

1 code implementation2 Dec 2022 Hongzhan Lin, Pengyao Yi, Jing Ma, Haiyun Jiang, Ziyang Luo, Shuming Shi, Ruifang Liu

The spread of rumors along with breaking events seriously hinders the truth in the era of social media.

Domain Adaptation

Towards Efficient Dialogue Pre-training with Transferable and Interpretable Latent Structure

no code implementations22 Oct 2022 Xueliang Zhao, Lemao Liu, Tingchen Fu, Shuming Shi, Dongyan Zhao, Rui Yan

With the availability of massive general-domain dialogue data, pre-trained dialogue generation appears to be super appealing to transfer knowledge from the general domain to downstream applications.

Dialogue Generation

Tencent's Multilingual Machine Translation System for WMT22 Large-Scale African Languages

1 code implementation18 Oct 2022 Wenxiang Jiao, Zhaopeng Tu, Jiarui Li, Wenxuan Wang, Jen-tse Huang, Shuming Shi

This paper describes Tencent's multilingual machine translation systems for the WMT22 shared task on Large-Scale Machine Translation Evaluation for African Languages.

Data Augmentation Machine Translation +1

Effidit: Your AI Writing Assistant

no code implementations3 Aug 2022 Shuming Shi, Enbo Zhao, Duyu Tang, Yan Wang, Piji Li, Wei Bi, Haiyun Jiang, Guoping Huang, Leyang Cui, Xinting Huang, Cong Zhou, Yong Dai, Dongyang Ma

In Effidit, we significantly expand the capacities of a writing assistant by providing functions in five categories: text completion, error checking, text polishing, keywords to sentences (K2S), and cloud input methods (cloud IME).

Keywords to Sentences Retrieval +3

One Model, Multiple Modalities: A Sparsely Activated Approach for Text, Sound, Image, Video and Code

no code implementations12 May 2022 Yong Dai, Duyu Tang, Liangxin Liu, Minghuan Tan, Cong Zhou, Jingquan Wang, Zhangyin Feng, Fan Zhang, Xueyu Hu, Shuming Shi

Moreover, our model supports self-supervised pretraining in the same sparsely activated way, resulting in better initialized parameters for different modalities.

Image Retrieval Retrieval

Pretraining Chinese BERT for Detecting Word Insertion and Deletion Errors

no code implementations26 Apr 2022 Cong Zhou, Yong Dai, Duyu Tang, Enbo Zhao, Zhangyin Feng, Li Kuang, Shuming Shi

We achieve this by introducing a special token [null], the prediction of which stands for the non-existence of a word.

Language Modelling Masked Language Modeling +1
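
A sketch of how such a token can be used at inference time, assuming a predict_topk(tokens, i) helper that returns the model's top candidates for a masked position (a hypothetical interface, not the paper's code):

    def find_inserted_words(tokens, predict_topk, null_token="[null]"):
        # If the model predicts [null] at a masked position, the word
        # originally there is likely a spurious insertion.
        errors = []
        for i in range(len(tokens)):
            masked = tokens[:i] + ["[MASK]"] + tokens[i + 1:]
            if null_token in predict_topk(masked, i):
                errors.append(i)
        return errors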

SkillNet-NLG: General-Purpose Natural Language Generation with a Sparsely Activated Approach

no code implementations26 Apr 2022 Junwei Liao, Duyu Tang, Fan Zhang, Shuming Shi

We present SkillNet-NLG, a sparsely activated approach that handles many natural language generation tasks with one model.

Multi-Task Learning Text Generation

A Model-Agnostic Data Manipulation Method for Persona-based Dialogue Generation

1 code implementation ACL 2022 Yu Cao, Wei Bi, Meng Fang, Shuming Shi, DaCheng Tao

To alleviate the above data issues, we propose a data manipulation method, which is model-agnostic to be packed with any persona-based dialogue generation model to improve its performance.

Dialogue Generation

Understanding and Improving Sequence-to-Sequence Pretraining for Neural Machine Translation

no code implementations ACL 2022 Wenxuan Wang, Wenxiang Jiao, Yongchang Hao, Xing Wang, Shuming Shi, Zhaopeng Tu, Michael Lyu

In this paper, we present a substantial step in better understanding the SOTA sequence-to-sequence (Seq2Seq) pretraining for neural machine translation (NMT).

Machine Translation NMT +1

Bridging the Data Gap between Training and Inference for Unsupervised Neural Machine Translation

1 code implementation ACL 2022 Zhiwei He, Xing Wang, Rui Wang, Shuming Shi, Zhaopeng Tu

By carefully designing experiments, we identify two representative characteristics of the data gap on the source side: (1) a style gap (i.e., translated vs. natural text style) that leads to poor generalization capability; (2) a content gap that induces the model to produce hallucinated content biased towards the target language.

Hallucination Machine Translation +1

Efficient Sub-structured Knowledge Distillation

1 code implementation9 Mar 2022 Wenye Lin, Yangming Li, Lemao Liu, Shuming Shi, Hai-Tao Zheng

Specifically, we transfer the knowledge from a teacher model to its student model by locally matching their predictions on all sub-structures, instead of the whole output space.

Knowledge Distillation Structured Prediction
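
When the sub-structures factorize per token position, the local matching reduces to a position-wise KL term, as in this PyTorch sketch (a simplification of the paper's structured-prediction setting):

    import torch.nn.functional as F

    def substructure_kd_loss(student_logits, teacher_logits):
        # Match the student to the teacher on each local sub-structure
        # (here, every token position) instead of whole output sequences.
        teacher_probs = F.softmax(teacher_logits, dim=-1)
        student_logp = F.log_softmax(student_logits, dim=-1)
        return F.kl_div(student_logp, teacher_probs, reduction="batchmean")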

On the Evaluation Metrics for Paraphrase Generation

1 code implementation17 Feb 2022 Lingfeng Shen, Lemao Liu, Haiyun Jiang, Shuming Shi

In this paper we revisit automatic metrics for paraphrase evaluation and obtain two findings that disobey conventional wisdom: (1) Reference-free metrics achieve better performance than their reference-based counterparts.

Machine Translation Paraphrase Generation

Rethink the Evaluation for Attack Strength of Backdoor Attacks in Natural Language Processing

no code implementations9 Jan 2022 Lingfeng Shen, Haiyun Jiang, Lemao Liu, Shuming Shi

It has been shown that natural language processing (NLP) models are vulnerable to a kind of security threat called the Backdoor Attack, which utilizes a 'backdoor trigger' paradigm to mislead the models.

Backdoor Attack Text Classification

On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation

1 code implementation Findings (EMNLP) 2021 Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Shuming Shi, Zhaopeng Tu

Pre-training (PT) and back-translation (BT) are two simple and powerful methods to utilize monolingual data for improving the model performance of neural machine translation (NMT).

Machine Translation NMT +2

On the Language Coverage Bias for Neural Machine Translation

no code implementations Findings (ACL) 2021 Shuo Wang, Zhaopeng Tu, Zhixing Tan, Shuming Shi, Maosong Sun, Yang Liu

Language coverage bias, which indicates the content-dependent differences between sentence pairs originating from the source and target languages, is important for neural machine translation (NMT) because the target-original training data is not well exploited in current practice.

Data Augmentation Machine Translation +3

Tail-to-Tail Non-Autoregressive Sequence Prediction for Chinese Grammatical Error Correction

1 code implementation ACL 2021 Piji Li, Shuming Shi

We investigate the problem of Chinese Grammatical Error Correction (CGEC) and present a new framework named Tail-to-Tail (TtT) non-autoregressive sequence prediction to address the deep issues hidden in CGEC.

Grammatical Error Correction Sentence

Self-Training Sampling with Monolingual Data Uncertainty for Neural Machine Translation

1 code implementation ACL 2021 Wenxiang Jiao, Xing Wang, Zhaopeng Tu, Shuming Shi, Michael R. Lyu, Irwin King

In this work, we propose to improve the sampling procedure by selecting the most informative monolingual sentences to complement the parallel data.

Machine Translation NMT +1

GWLAN: General Word-Level AutocompletioN for Computer-Aided Translation

no code implementations ACL 2021 Huayang Li, Lemao Liu, Guoping Huang, Shuming Shi

In this paper, we propose the task of general word-level autocompletion (GWLAN) from a real-world CAT scenario, and construct the first public benchmark to facilitate research on this topic.

Sentence Translation
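
In its simplest form, word-level autocompletion restricts the vocabulary to words consistent with the human-typed characters and ranks them by a context model, as in this sketch (context_scores is a hypothetical word-to-score mapping, not the paper's model):

    def autocomplete(typed_chars, context_scores):
        # Keep only words consistent with what the translator has typed,
        # then return the highest-scoring candidate under the context model.
        candidates = {w: s for w, s in context_scores.items()
                      if w.startswith(typed_chars)}
        return max(candidates, key=candidates.get) if candidates else None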

REAM♯: An Enhancement Approach to Reference-based Evaluation Metrics for Open-domain Dialog Generation

no code implementations30 May 2021 Jun Gao, Wei Bi, Ruifeng Xu, Shuming Shi

We first clarify an assumption on reference-based metrics that, if more high-quality references are added into the reference set, the reliability of the metric will increase.

Open-Domain Dialog

Dialogue Response Selection with Hierarchical Curriculum Learning

1 code implementation ACL 2021 Yixuan Su, Deng Cai, Qingyu Zhou, Zibo Lin, Simon Baker, Yunbo Cao, Shuming Shi, Nigel Collier, Yan Wang

As for IC, it progressively strengthens the model's ability in identifying the mismatching information between the dialogue context and a response candidate.

Conversational Response Selection

Predicting Events in MOBA Games: Prediction, Attribution, and Evaluation

no code implementations17 Dec 2020 Zelong Yang, Yan Wang, Piji Li, Shaobin Lin, Shuming Shi, Shao-Lun Huang, Wei Bi

The multiplayer online battle arena (MOBA) games have become increasingly popular in recent years.

Empirical Analysis of Unlabeled Entity Problem in Named Entity Recognition

1 code implementation ICLR 2021 Yangming Li, Lemao Liu, Shuming Shi

Experiments on synthetic datasets and real-world datasets show that our model is robust to unlabeled entity problem and surpasses prior baselines.

named-entity-recognition Named Entity Recognition +2

On the Sub-Layer Functionalities of Transformer Decoder

no code implementations Findings of the Association for Computational Linguistics 2020 Yilin Yang, Longyue Wang, Shuming Shi, Prasad Tadepalli, Stefan Lee, Zhaopeng Tu

There have been significant efforts to interpret the encoder of Transformer-based encoder-decoder architectures for neural machine translation (NMT); meanwhile, the decoder remains largely unexamined despite its critical role.

Machine Translation NMT +1

On the Branching Bias of Syntax Extracted from Pre-trained Language Models

no code implementations Findings of the Association for Computational Linguistics 2020 Huayang Li, Lemao Liu, Guoping Huang, Shuming Shi

Many efforts have been devoted to extracting constituency trees from pre-trained language models, often proceeding in two stages: feature definition and parsing.

Interpretable Real-Time Win Prediction for Honor of Kings, a Popular Mobile MOBA Esport

no code implementations14 Aug 2020 Zelong Yang, Zhufeng Pan, Yan Wang, Deng Cai, Xiaojiang Liu, Shuming Shi, Shao-Lun Huang

With the rapid prevalence and explosive development of MOBA esports (Multiplayer Online Battle Arena electronic sports), much research effort has been devoted to automatically predicting game results (win predictions).

Attribute

Evaluating Explanation Methods for Neural Machine Translation

no code implementations ACL 2020 Jierui Li, Lemao Liu, Huayang Li, Guanlin Li, Guoping Huang, Shuming Shi

Recently many efforts have been devoted to interpreting the black-box NMT models, but little progress has been made on metrics to evaluate explanation methods.

Machine Translation NMT +2

On the Inference Calibration of Neural Machine Translation

1 code implementation ACL 2020 Shuo Wang, Zhaopeng Tu, Shuming Shi, Yang Liu

Confidence calibration, which aims to make model predictions equal to the true correctness measures, is important for neural machine translation (NMT) because it is able to offer useful indicators of translation errors in the generated output.

Machine Translation NMT +1

Assessing the Bilingual Knowledge Learned by Neural Machine Translation Models

no code implementations28 Apr 2020 Shilin He, Xing Wang, Shuming Shi, Michael R. Lyu, Zhaopeng Tu

In this paper, we bridge the gap by assessing the bilingual knowledge learned by NMT models with a phrase table -- an interpretable table of bilingual lexicons.

Machine Translation NMT +1

The World is Not Binary: Learning to Rank with Grayscale Data for Dialogue Response Selection

no code implementations EMNLP 2020 Zibo Lin, Deng Cai, Yan Wang, Xiaojiang Liu, Hai-Tao Zheng, Shuming Shi

Although response selection is naturally a learning-to-rank problem, most prior works take a point-wise view and train binary classifiers for this task: each response candidate is labeled either relevant (one) or irrelevant (zero).

Conversational Response Selection Learning-To-Rank +2

Understanding Learning Dynamics for Neural Machine Translation

no code implementations5 Apr 2020 Conghui Zhu, Guanlin Li, Lemao Liu, Tiejun Zhao, Shuming Shi

Despite the great success of NMT, there still remains a severe challenge: it is hard to interpret the internal dynamics during its training process.

Machine Translation NMT +1

CASE: Context-Aware Semantic Expansion

no code implementations31 Dec 2019 Jialong Han, Aixin Sun, Haisong Zhang, Chenliang Li, Shuming Shi

In this study, we demonstrate that annotations for this task can be harvested at scale from existing corpora, in a fully automatic manner.

Word Sense Disambiguation

Go From the General to the Particular: Multi-Domain Translation with Domain Transformation Networks

2 code implementations22 Nov 2019 Yong Wang, Long-Yue Wang, Shuming Shi, Victor O. K. Li, Zhaopeng Tu

The key challenge of multi-domain translation lies in simultaneously encoding both the general knowledge shared across domains and the particular knowledge distinctive to each domain in a unified model.

General Knowledge Knowledge Distillation +3

A Discrete CVAE for Response Generation on Short-Text Conversation

no code implementations IJCNLP 2019 Jun Gao, Wei Bi, Xiaojiang Liu, Junhui Li, Guodong Zhou, Shuming Shi

In this paper, we introduce a discrete latent variable with an explicit semantic meaning to improve the CVAE on short-text conversation.

Response Generation Short-Text Conversation +1

Neuron Interaction Based Representation Composition for Neural Machine Translation

no code implementations22 Nov 2019 Jian Li, Xing Wang, Baosong Yang, Shuming Shi, Michael R. Lyu, Zhaopeng Tu

Starting from this intuition, we propose a novel approach to compose representations learned by different components in neural machine translation (e.g., multi-layer networks or multi-head attention), based on modeling strong interactions among neurons in the representation vectors.

Machine Translation Translation

Retrieval-guided Dialogue Response Generation via a Matching-to-Generation Framework

no code implementations IJCNLP 2019 Deng Cai, Yan Wang, Wei Bi, Zhaopeng Tu, Xiaojiang Liu, Shuming Shi

End-to-end sequence generation is a popular technique for developing open-domain dialogue systems, though such systems suffer from the safe response problem.

Response Generation Retrieval

Semi-supervised Text Style Transfer: Cross Projection in Latent Space

no code implementations IJCNLP 2019 Mingyue Shang, Piji Li, Zhenxin Fu, Lidong Bing, Dongyan Zhao, Shuming Shi, Rui Yan

Text style transfer task requires the model to transfer a sentence of one style to another style while retaining its original content meaning, which is a challenging problem that has long suffered from the shortage of parallel data.

Sentence Style Transfer +1

Multi-Granularity Self-Attention for Neural Machine Translation

no code implementations IJCNLP 2019 Jie Hao, Xing Wang, Shuming Shi, Jinfeng Zhang, Zhaopeng Tu

Current state-of-the-art neural machine translation (NMT) uses a deep multi-head self-attention network with no explicit phrase information.

Machine Translation NMT +1

Towards Better Modeling Hierarchical Structure for Self-Attention with Ordered Neurons

no code implementations IJCNLP 2019 Jie Hao, Xing Wang, Shuming Shi, Jinfeng Zhang, Zhaopeng Tu

Recent studies have shown that a hybrid of self-attention networks (SANs) and recurrent neural networks (RNNs) outperforms both individual architectures, while not much is known about why the hybrid models work.

Inductive Bias Machine Translation +1

Towards Understanding Neural Machine Translation with Word Importance

no code implementations IJCNLP 2019 Shilin He, Zhaopeng Tu, Xing Wang, Long-Yue Wang, Michael R. Lyu, Shuming Shi

Although neural machine translation (NMT) has advanced the state-of-the-art on various language pairs, the interpretability of NMT remains unsatisfactory.

Machine Translation NMT +1

Self-Attention with Structural Position Representations

no code implementations IJCNLP 2019 Xing Wang, Zhaopeng Tu, Long-Yue Wang, Shuming Shi

Although self-attention networks (SANs) have advanced the state-of-the-art on various NLP tasks, one criticism of SANs concerns their ability to encode the positions of input words (Shaw et al., 2018).

Position Sentence +1

Fine-Grained Sentence Functions for Short-Text Conversation

no code implementations ACL 2019 Wei Bi, Jun Gao, Xiaojiang Liu, Shuming Shi

Classification models are trained on this dataset to (i) recognize the sentence function of new data in a large corpus of short-text conversations; (ii) estimate a proper sentence function of the response given a test query.

Information Retrieval Retrieval +2

On the Word Alignment from Neural Machine Translation

no code implementations ACL 2019 Xintong Li, Guanlin Li, Lemao Liu, Max Meng, Shuming Shi

Prior research suggests that neural machine translation (NMT) captures word alignment through its attention mechanism; however, this paper finds that attention may almost fail to capture word alignment for some NMT models.

Machine Translation NMT +2

Topic-Aware Neural Keyphrase Generation for Social Media Language

2 code implementations ACL 2019 Yue Wang, Jing Li, Hou Pong Chan, Irwin King, Michael R. Lyu, Shuming Shi

Further discussions show that our model learns meaningful topics, which interprets its superiority in social media keyphrase generation.

Keyphrase Generation

Exploiting Sentential Context for Neural Machine Translation

no code implementations ACL 2019 Xing Wang, Zhaopeng Tu, Long-Yue Wang, Shuming Shi

In this work, we present novel approaches to exploit sentential context for neural machine translation (NMT).

Machine Translation NMT +1

Microblog Hashtag Generation via Encoding Conversation Contexts

1 code implementation NAACL 2019 Yue Wang, Jing Li, Irwin King, Michael R. Lyu, Shuming Shi

Automatic hashtag annotation plays an important role in content understanding for microblog posts.

Topic Models

Dynamic Layer Aggregation for Neural Machine Translation with Routing-by-Agreement

no code implementations15 Feb 2019 Zi-Yi Dou, Zhaopeng Tu, Xing Wang, Long-Yue Wang, Shuming Shi, Tong Zhang

With the promising progress of deep neural networks, layer aggregation has been used to fuse information across layers in various fields, such as computer vision and machine translation.

Machine Translation Translation

Neural Machine Translation with Adequacy-Oriented Learning

no code implementations21 Nov 2018 Xiang Kong, Zhaopeng Tu, Shuming Shi, Eduard Hovy, Tong Zhang

Although Neural Machine Translation (NMT) models have advanced state-of-the-art performance in machine translation, they face problems such as inadequate translation.

Attribute Machine Translation +3

Generating Multiple Diverse Responses for Short-Text Conversation

no code implementations14 Nov 2018 Jun Gao, Wei Bi, Xiaojiang Liu, Junhui Li, Shuming Shi

In this paper, we propose a novel response generation model, which considers a set of responses jointly and generates multiple diverse responses simultaneously.

Informativeness Response Generation +1

Exploiting Deep Representations for Neural Machine Translation

no code implementations EMNLP 2018 Zi-Yi Dou, Zhaopeng Tu, Xing Wang, Shuming Shi, Tong Zhang

Advanced neural machine translation (NMT) models generally implement encoder and decoder as multiple layers, which allows systems to model complex functions and capture complicated linguistic structures.

Machine Translation NMT +1

QuaSE: Sequence Editing under Quantifiable Guidance

1 code implementation EMNLP 2018 Yi Liao, Lidong Bing, Piji Li, Shuming Shi, Wai Lam, Tong Zhang

For example, an input sequence could be a word sequence, such as a review sentence or an advertisement text.

Disentanglement Sentence +1

Towards Less Generic Responses in Neural Conversation Models: A Statistical Re-weighting Method

1 code implementation EMNLP 2018 Yahui Liu, Wei Bi, Jun Gao, Xiaojiang Liu, Jian Yao, Shuming Shi

We observe that in the conversation tasks, each query could have multiple responses, which forms a 1-to-n or m-to-n relationship in the view of the total corpus.

Dialogue Generation Machine Translation +1

Skeleton-to-Response: Dialogue Generation Guided by Retrieval Memory

1 code implementation NAACL 2019 Deng Cai, Yan Wang, Victoria Bi, Zhaopeng Tu, Xiaojiang Liu, Wai Lam, Shuming Shi

Such models rely on insufficient information for generating a specific response since a certain query could be answered in multiple ways.

Dialogue Generation Information Retrieval +3

Language Style Transfer from Sentences with Arbitrary Unknown Styles

no code implementations13 Aug 2018 Yanpeng Zhao, Wei Bi, Deng Cai, Xiaojiang Liu, Kewei Tu, Shuming Shi

Then, by recombining the content with the target style, we decode a sentence aligned in the target domain.

Sentence Sentence ReWriting +1

Directional Skip-Gram: Explicitly Distinguishing Left and Right Context for Word Embeddings

no code implementations NAACL 2018 Yan Song, Shuming Shi, Jing Li, Haisong Zhang

In this paper, we present directional skip-gram (DSG), a simple but effective enhancement of the skip-gram model by explicitly distinguishing left and right context in word prediction.

Learning Word Embeddings Part-Of-Speech Tagging +1
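
One plausible reading of the model in Python: alongside the usual word-context dot product, a separate direction vector yields a sigmoid score that is high for one side of the target word and low for the other. Vector names here are illustrative assumptions, not the paper's notation.

    import numpy as np

    def dsg_score(v_word, c_context, d_context, context_is_left):
        # Standard skip-gram term plus a direction term that separates
        # left contexts from right contexts of the same word.
        base = float(v_word @ c_context)
        g = 1.0 / (1.0 + np.exp(-float(v_word @ d_context)))  # sigmoid
        return base + (g if context_is_left else 1.0 - g)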

Target Foresight Based Attention for Neural Machine Translation

no code implementations NAACL 2018 Xintong Li, Lemao Liu, Zhaopeng Tu, Shuming Shi, Max Meng

In neural machine translation, an attention model is used to identify the aligned source words for a target word (the target foresight word) in order to select translation context, but it does not make use of any information about this target foresight word at all.

Language Modelling Machine Translation +1

A Manually Annotated Chinese Corpus for Non-task-oriented Dialogue Systems

no code implementations15 May 2018 Jing Li, Yan Song, Haisong Zhang, Shuming Shi

This paper presents a large-scale corpus for non-task-oriented dialogue response selection, which contains over 27K distinct prompts and more than 82K responses collected from social media.

Informativeness Task-Oriented Dialogue Systems

hyperdoc2vec: Distributed Representations of Hypertext Documents

1 code implementation ACL 2018 Jialong Han, Yan Song, Wayne Xin Zhao, Shuming Shi, Haisong Zhang

Hypertext documents, such as web pages and academic papers, are of great importance in delivering information in our daily life.

Citation Recommendation Document Embedding +1

Generative Stock Question Answering

no code implementations21 Apr 2018 Zhaopeng Tu, Yong Jiang, Xiaojiang Liu, Lei Shu, Shuming Shi

We study the problem of stock-related question answering (StockQA): automatically generating answers to stock-related questions, just like professional stock analysts providing action recommendations to stocks upon users' requests.

Question Answering Retrieval

Translating Pro-Drop Languages with Reconstruction Models

1 code implementation10 Jan 2018 Long-Yue Wang, Zhaopeng Tu, Shuming Shi, Tong Zhang, Yvette Graham, Qun Liu

Next, the annotated source sentence is reconstructed from hidden representations in the NMT model.

Machine Translation NMT +2

Learning to Remember Translation History with a Continuous Cache

1 code implementation TACL 2018 Zhaopeng Tu, Yang Liu, Shuming Shi, Tong Zhang

Existing neural machine translation (NMT) models generally translate sentences in isolation, missing the opportunity to take advantage of document-level information.

Machine Translation NMT +1
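
The cache can be sketched as dot-product attention over stored key-value pairs from previously translated sentences (a PyTorch sketch; the paper's gating and integration into the decoder are omitted):

    import torch

    def cache_lookup(query, keys, values, temperature=1.0):
        # keys: (N, d), values: (N, d), query: (d,). Attend over cached
        # translation-history states and return a blended memory vector.
        weights = torch.softmax(keys @ query / temperature, dim=0)
        return weights @ values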

Learning Fine-Grained Expressions to Solve Math Word Problems

no code implementations EMNLP 2017 Danqing Huang, Shuming Shi, Chin-Yew Lin, Jian Yin

This method learns the mappings between math concept phrases in math word problems and their math expressions from training data.

Math Math Word Problem Solving
