Search Results for author: Xiaojun Wan

Found 147 papers, 42 papers with code

How Do Seq2Seq Models Perform on End-to-End Data-to-Text Generation?

1 code implementation • ACL 2022 • Xunjian Yin, Xiaojun Wan

With the rapid development of deep learning, Seq2Seq paradigm has become prevalent for end-to-end data-to-text generation, and the BLEU scores have been increasing in recent years.

Data-to-Text Generation

Homophonic Pun Generation with Lexically Constrained Rewriting

no code implementations • EMNLP 2020 • Zhiwei Yu, Hongyu Zang, Xiaojun Wan

Punning is a creative way to make conversation enjoyable and literary writing elegant.

Sentence

DialSummEval: Revisiting Summarization Evaluation for Dialogues

1 code implementation • NAACL 2022 • Mingqi Gao, Xiaojun Wan

Dialogue summarization is receiving increasing attention from researchers due to its extraordinary difficulty and unique application value.

WIND: Weighting Instances Differentially for Model-Agnostic Domain Adaptation

1 code implementation • Findings (ACL) 2021 • Xiang Chen, Yue Cao, Xiaojun Wan

Domain Adaptation

DelibGAN: Coarse-to-Fine Text Generation via Adversarial Network

no code implementations • ICLR 2019 • Ke Wang, Xiaojun Wan

In this paper, we propose a novel adversarial learning framework, namely DelibGAN, for generating high-quality sentences without supervision.

Descriptive Text Generation

Routing Enforced Generative Model for Recipe Generation

no code implementations • EMNLP 2020 • Zhiwei Yu, Hongyu Zang, Xiaojun Wan

One of the most challenging parts of recipe generation is to deal with the complex restrictions among the input ingredients.

Recipe Generation

Structure-Aware Pre-Training for Table-to-Text Generation

no code implementations • Findings (ACL) 2021 • Xinyu Xing, Xiaojun Wan

Table-to-Text Generation

Comparing Knowledge-Intensive and Data-Intensive Models for English Resource Semantic Parsing

no code implementations • CL (ACL) 2021 • Junjie Cao, Zi Lin, Weiwei Sun, Xiaojun Wan

In this work, we present a phenomenon-oriented comparative analysis of the two dominant approaches in English Resource Semantic (ERS) parsing: classic, knowledge-intensive and neural, data-intensive models.

Semantic Parsing

Revisiting Pivot-Based Paraphrase Generation: Language Is Not the Only Optional Pivot

no code implementations • EMNLP 2021 • Yitao Cai, Yue Cao, Xiaojun Wan

Concretely, we transform a sentence into a variety of different semantic or syntactic representations (including AMR, UD, and latent semantic representation), and then decode the sentence back from the semantic representations.

Paraphrase Generation Sentence

Automated Similarity Metric Generation for Recommendation

no code implementations • 18 Apr 2024 • Liang Qu, Yun Lin, Wei Yuan, Xiaojun Wan, Yuhui Shi, Hongzhi Yin

Given the critical role of similarity metrics in recommender systems, existing methods mainly employ handcrafted similarity metrics to capture the complex characteristics of user-item interactions.

Recommendation Systems

WikiTableEdit: A Benchmark for Table Editing by Natural Language Instruction

no code implementations • 5 Mar 2024 • Zheng Li, Xiang Chen, Xiaojun Wan

Subsequently, we evaluate several representative large language models on the WikiTableEdit dataset to demonstrate the challenge of this task.

Quantity Matters: Towards Assessing and Mitigating Number Hallucination in Large Vision-Language Models

no code implementations • 3 Mar 2024 • Huixuan Zhang, Junzhe Zhang, Xiaojun Wan

Large-scale vision-language models have demonstrated impressive skill in handling tasks that involve both areas.

Hallucination

DPP-Based Adversarial Prompt Searching for Language Models

no code implementations • 1 Mar 2024 • Xu Zhang, Xiaojun Wan

Language models risk generating mindless and offensive content, which hinders their safe deployment.

Language Modelling

EAMA: Entity-Aware Multimodal Alignment Based Approach for News Image Captioning

no code implementations • 29 Feb 2024 • Junzhe Zhang, Huixuan Zhang, Xunjian Yin, Xiaojun Wan

News image captioning requires a model to generate an informative caption rich in entities, given the news image and the associated news article.

Image Captioning Sentence

Are LLM-based Evaluators Confusing NLG Quality Criteria?

no code implementations • 19 Feb 2024 • Xinyu Hu, Mingqi Gao, Sen Hu, Yang Zhang, Yicheng Chen, Teng Xu, Xiaojun Wan

Some prior work has shown that LLMs perform well in NLG evaluation for different tasks.

nlg evaluation

Benchmarking Knowledge Boundary for Large Language Model: A Different Perspective on Model Evaluation

no code implementations • 18 Feb 2024 • Xunjian Yin, Xu Zhang, Jie Ruan, Xiaojun Wan

In recent years, substantial advancements have been made in the development of large language models, achieving remarkable performance across diverse tasks.

Benchmarking Language Modelling +2

Selecting Large Language Model to Fine-tune via Rectified Scaling Law

no code implementations • 4 Feb 2024 • Haowei Lin, Baizhou Huang, Haotian Ye, Qinyu Chen, ZiHao Wang, Sujian Li, Jianzhu Ma, Xiaojun Wan, James Zou, Yitao Liang

The ever-growing ecosystem of LLMs has posed a challenge in selecting the most appropriate pre-trained model to fine-tune amidst a sea of options.

Language Modelling Large Language Model

LLM-based NLG Evaluation: Current Status and Challenges

no code implementations • 2 Feb 2024 • Mingqi Gao, Xinyu Hu, Jie Ruan, Xiao Pu, Xiaojun Wan

Evaluating natural language generation (NLG) is a vital but challenging problem in artificial intelligence.

nlg evaluation Text Generation

History Matters: Temporal Knowledge Editing in Large Language Model

1 code implementation • 9 Dec 2023 • Xunjian Yin, Jin Jiang, Liming Yang, Xiaojun Wan

The imperative task of revising or updating the knowledge stored within large language models arises from two distinct sources: intrinsic errors inherent in the model which should be corrected and outdated knowledge due to external shifts in the real world which should be updated.

knowledge editing Language Modelling +1

OpinSummEval: Revisiting Automated Evaluation for Opinion Summarization

1 code implementation • 27 Oct 2023 • Yuchen Shen, Xiaojun Wan

Opinion summarization sets itself apart from other types of summarization tasks due to its distinctive focus on aspects and sentiments.

Opinion Summarization

Evaluating, Understanding, and Improving Constrained Text Generation for Large Language Models

no code implementations • 25 Oct 2023 • Xiang Chen, Xiaojun Wan

Advancements in natural language generation (NLG) and large language models (LLMs) have led to proficient text generation in various tasks.

Text Generation

ALCUNA: Large Language Models Meet New Knowledge

1 code implementation • 23 Oct 2023 • Xunjian Yin, Baizhou Huang, Xiaojun Wan

With the rapid development of NLP, large-scale language models (LLMs) excel in various tasks across multiple domains now.

A New Benchmark and Reverse Validation Method for Passage-level Hallucination Detection

1 code implementation • 10 Oct 2023 • Shiping Yang, Renliang Sun, Xiaojun Wan

Contrasting previous studies of zero-resource hallucination detection, our method and benchmark concentrate on passage-level detection instead of sentence-level.

Hallucination Sentence

WikiIns: A High-Quality Dataset for Controlled Text Editing by Natural Language Instruction

1 code implementation • 8 Oct 2023 • Xiang Chen, Zheng Li, Xiaojun Wan

In this paper, we study the problem of controlled text editing by natural language instruction.

Informativeness

Enhancing Large Language Models in Coding Through Multi-Perspective Self-Consistency

no code implementations • 29 Sep 2023 • Baizhou Huang, Shuai Lu, Weizhu Chen, Xiaojun Wan, Nan Duan

We propose the Multi-Perspective Self-Consistency (MPSC) framework incorporating both inter- and intra-consistency across outputs from multiple perspectives.

Code Generation

Summarization is (Almost) Dead

no code implementations • 18 Sep 2023 • Xiao Pu, Mingqi Gao, Xiaojun Wan

How well can large language models (LLMs) generate summaries?

Text Summarization

A Comprehensive Evaluation and Analysis Study for Chinese Spelling Check

no code implementations • 25 Jul 2023 • Xunjian Yin, Xiaojun Wan

With the development of pre-trained models and the incorporation of phonetic and graphic information, neural models have achieved high scores in Chinese Spelling Check (CSC).

Image Matters: A New Dataset and Empirical Study for Multimodal Hyperbole Detection

1 code implementation • 1 Jul 2023 • Huixuan Zhang, Xiaojun Wan

We create a multimodal detection dataset from Weibo (a Chinese social media) and carry out some studies on it.

SituatedGen: Incorporating Geographical and Temporal Contexts into Generative Commonsense Reasoning

2 code implementations • NeurIPS 2023 • Yunxiang Zhang, Xiaojun Wan

Generative commonsense reasoning is the task that requires machines, given a group of keywords, to compose a single coherent sentence with commonsense plausibility.

Sentence Text Generation

Reference Matters: Benchmarking Factual Error Correction for Dialogue Summarization with Fine-grained Evaluation Framework

1 code implementation • 8 Jun 2023 • Mingqi Gao, Xiaojun Wan, Jia Su, Zhefeng Wang, Baoxing Huai

To address this problem, we are the first to manually annotate a FEC dataset for dialogue summarization containing 4000 items and propose FERRANTI, a fine-grained evaluation framework based on reference correction that automatically evaluates the performance of FEC models on different error categories.

Benchmarking

A New Dataset and Empirical Study for Sentence Simplification in Chinese

1 code implementation • 7 Jun 2023 • Shiping Yang, Renliang Sun, Xiaojun Wan

Sentence Simplification is a valuable technique that can benefit language learners and children a lot.

Few-Shot Learning Sentence

Is Summary Useful or Not? An Extrinsic Human Evaluation of Text Summaries on Downstream Tasks

no code implementations • 24 May 2023 • Xiao Pu, Mingqi Gao, Xiaojun Wan

The results show that summaries generated by fine-tuned models lead to higher consistency in usefulness across all three tasks, as rankings of fine-tuned summarization systems are close across downstream tasks according to the proposed extrinsic metrics.

Informativeness Question Answering +4

Teaching the Pre-trained Model to Generate Simple Texts for Text Simplification

1 code implementation • 21 May 2023 • Renliang Sun, Wei Xu, Xiaojun Wan

In this paper, we propose a new continued pre-training strategy to teach the pre-trained model to generate simple texts.

Lexical Simplification Sentence +1

Missing Information, Unresponsive Authors, Experimental Flaws: The Impossibility of Assessing the Reproducibility of Previous Human Evaluations in NLP

no code implementations • 2 May 2023 • Anya Belz, Craig Thomson, Ehud Reiter, Gavin Abercrombie, Jose M. Alonso-Moral, Mohammad Arvan, Anouck Braggaar, Mark Cieliebak, Elizabeth Clark, Kees Van Deemter, Tanvi Dinkar, Ondřej Dušek, Steffen Eger, Qixiang Fang, Mingqi Gao, Albert Gatt, Dimitra Gkatzia, Javier González-Corbelle, Dirk Hovy, Manuela Hürlimann, Takumi Ito, John D. Kelleher, Filip Klubicka, Emiel Krahmer, Huiyuan Lai, Chris van der Lee, Yiru Li, Saad Mahamood, Margot Mieskes, Emiel van Miltenburg, Pablo Mosteiro, Malvina Nissim, Natalie Parde, Ondřej Plátek, Verena Rieser, Jie Ruan, Joel Tetreault, Antonio Toral, Xiaojun Wan, Leo Wanner, Lewis Watson, Diyi Yang

We report our efforts in identifying a set of previous human evaluations in NLP that would be suitable for a coordinated study examining what makes human evaluations in NLP more/less reproducible.

Human-like Summarization Evaluation with ChatGPT

1 code implementation • 5 Apr 2023 • Mingqi Gao, Jie Ruan, Renliang Sun, Xunjian Yin, Shiping Yang, Xiaojun Wan

Evaluating text summarization is a challenging problem, and existing evaluation metrics are far from satisfactory.

Text Summarization

Models See Hallucinations: Evaluating the Factuality in Video Captioning

no code implementations • 6 Mar 2023 • Hui Liu, Xiaojun Wan

In this work, we conduct a detailed human evaluation of the factuality in video captioning and collect two annotated factuality datasets.

Text Generation Video Captioning

Exploiting Summarization Data to Help Text Simplification

1 code implementation • 14 Feb 2023 • Renliang Sun, Zhixian Yang, Xiaojun Wan

One of the major problems with text simplification is the lack of high-quality data.

Sentence Text Simplification +1

How to Describe Images in a More Funny Way? Towards a Modular Approach to Cross-Modal Sarcasm Generation

no code implementations • 20 Nov 2022 • Jie Ruan, Yue Wu, Xiaojun Wan, Yuesheng Zhu

Sarcasm generation has been investigated in previous studies by considering it as a text-to-text generation problem, i.e., generating a sarcastic sentence for an input sentence.

Descriptive Sentence +1

Error-Robust Retrieval for Chinese Spelling Check

1 code implementation • 15 Nov 2022 • Xunjian Yin, Xinyu Hu, Jin Jiang, Xiaojun Wan

Chinese Spelling Check (CSC) aims to detect and correct error tokens in Chinese contexts, which has a wide range of applications.

Retrieval

Social Biases in Automatic Evaluation Metrics for NLG

no code implementations • 17 Oct 2022 • Mingqi Gao, Xiaojun Wan

Many studies have revealed that word embeddings, language models, and models for specific downstream tasks in NLP are prone to social biases, especially gender bias.

Sentence Sentence Embeddings +3

An Empirical Study of Automatic Post-Editing

no code implementations • 16 Sep 2022 • Xu Zhang, Xiaojun Wan

In view of the importance of data augmentation in APE, we separately study the impact of the construction method of artificial corpora and artificial data domain on the performance of APE models.

Automatic Post-Editing Data Augmentation

CrossDial: An Entertaining Dialogue Dataset of Chinese Crosstalk

no code implementations • 3 Sep 2022 • Baizhou Huang, Shikang Du, Xiaojun Wan

Crosstalk is a traditional Chinese theatrical performance art.

Dialogue Generation

CC-Riddle: A Question Answering Dataset of Chinese Character Riddles

2 code implementations • 28 Jun 2022 • Fan Xu, Yunxiang Zhang, Xiaojun Wan

Solving Chinese character riddles is a challenging task that demands understanding of character glyph, general knowledge, and a grasp of figurative language.

General Knowledge Language Modelling +2

Nearest Neighbor Knowledge Distillation for Neural Machine Translation

1 code implementation • NAACL 2022 • Zhixian Yang, Renliang Sun, Xiaojun Wan

k-nearest-neighbor machine translation (kNN-MT), proposed by Khandelwal et al. (2021), has achieved many state-of-the-art results in machine translation tasks.

Knowledge Distillation Machine Translation +2

SimpleBERT: A Pre-trained Model That Learns to Generate Simple Words

no code implementations • 16 Apr 2022 • Renliang Sun, Xiaojun Wan

We use a small-scale simple text dataset for continued pre-training and employ two methods to identify simple words from the texts.

Language Modelling Lexical Simplification +3

Dependency-based Mixture Language Models

1 code implementation • ACL 2022 • Zhixian Yang, Xiaojun Wan

Various models have been proposed to incorporate knowledge of syntactic structures into neural language models.

Language Modelling Text Generation

A Simple Information-Based Approach to Unsupervised Domain-Adaptive Aspect-Based Sentiment Analysis

1 code implementation • 29 Jan 2022 • Xiang Chen, Xiaojun Wan

Aspect-based sentiment analysis (ABSA) is a fine-grained sentiment analysis task which aims to extract the aspects from sentences and identify their corresponding sentiments.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +4

Visual Information Guided Zero-Shot Paraphrase Generation

1 code implementation • COLING 2022 • Zhe Lin, Xiaojun Wan

Zero-shot paraphrase generation has drawn much attention as the large-scale high-quality paraphrase corpus is limited.

Image Captioning Paraphrase Generation +1

CUGE: A Chinese Language Understanding and Generation Evaluation Benchmark

no code implementations • 27 Dec 2021 • Yuan Yao, Qingxiu Dong, Jian Guan, Boxi Cao, Zhengyan Zhang, Chaojun Xiao, Xiaozhi Wang, Fanchao Qi, Junwei Bao, Jinran Nie, Zheni Zeng, Yuxian Gu, Kun Zhou, Xuancheng Huang, Wenhao Li, Shuhuai Ren, Jinliang Lu, Chengqiang Xu, Huadong Wang, Guoyang Zeng, Zile Zhou, Jiajun Zhang, Juanzi Li, Minlie Huang, Rui Yan, Xiaodong He, Xiaojun Wan, Xin Zhao, Xu Sun, Yang Liu, Zhiyuan Liu, Xianpei Han, Erhong Yang, Zhifang Sui, Maosong Sun

We argue that for general-purpose language intelligence evaluation, the benchmark itself needs to be comprehensive and systematic.

Neural Content Extraction for Poster Generation of Scientific Papers

no code implementations • 16 Dec 2021 • Sheng Xu, Xiaojun Wan

Then we propose a three-step framework to tackle this task and focus on the content extraction step in this study.

Document Summarization

A Syntax-Guided Grammatical Error Correction Model with Dependency Tree Correction

no code implementations • 5 Nov 2021 • Zhaohong Wan, Xiaojun Wan

However, these methods lack the use of syntactic knowledge which plays an important role in the correction of grammatical errors.

Data Augmentation Grammatical Error Correction +3

Document-Level Text Simplification: Dataset, Criteria and Baseline

1 code implementation • EMNLP 2021 • Renliang Sun, Hanqi Jin, Xiaojun Wan

Finally, we select several representative models as baseline models for this task and perform automatic evaluation and human evaluation.

Sentence Text Simplification

BiRdQA: A Bilingual Dataset for Question Answering on Tricky Riddles

no code implementations • 23 Sep 2021 • Yunxiang Zhang, Xiaojun Wan

A riddle is a question or statement with double or veiled meanings, followed by an unexpected answer.

Multiple-choice Question Answering

CodeQA: A Question Answering Dataset for Source Code Comprehension

1 code implementation • Findings (EMNLP) 2021 • Chenxiao Liu, Xiaojun Wan

We propose CodeQA, a free-form question answering dataset for the purpose of source code comprehension: given a code snippet and a question, a textual answer is required to be generated.

Machine Reading Comprehension Question Answering

MOVER: Mask, Over-generate and Rank for Hyperbole Generation

1 code implementation • NAACL 2022 • Yunxiang Zhang, Xiaojun Wan

In this paper, we tackle the challenging task of hyperbole generation to transfer a literal sentence into its hyperbolic paraphrase.

Sentence

Towards Document-Level Paraphrase Generation with Sentence Rewriting and Reordering

1 code implementation • Findings (EMNLP) 2021 • Zhe Lin, Yitao Cai, Xiaojun Wan

Paraphrase generation is an important task in natural language processing.

Paraphrase Generation Sentence +1

Pushing Paraphrase Away from Original Sentence: A Multi-Round Paraphrase Generation Approach

1 code implementation • Findings (ACL) 2021 • Zhe Lin, Xiaojun Wan

Both automatic and human evaluation show BTmPG can improve the diversity of paraphrase while preserving the semantics of the original sentence.

Paraphrase Generation Sentence +1

Video Paragraph Captioning as a Text Summarization Task

no code implementations • ACL 2021 • Hui Liu, Xiaojun Wan

Most previous methods simplify this task by using ground-truth event segments.

Sentence Text Summarization

Making Better Use of Bilingual Information for Cross-Lingual AMR Parsing

1 code implementation • Findings (ACL) 2021 • Yitao Cai, Zhe Lin, Xiaojun Wan

We argue that the misprediction of concepts is due to the high relevance between English tokens and AMR concepts.

AMR Parsing

Continual Learning for Neural Machine Translation

no code implementations • NAACL 2021 • Yue Cao, Hao-Ran Wei, Boxing Chen, Xiaojun Wan

In practical applications, NMT models are usually trained on a general domain corpus and then fine-tuned by continuing training on the in-domain corpus.

Continual Learning Knowledge Distillation +3

Bridging the Domain Gap: Improve Informal Language Translation via Counterfactual Domain Adaptation

no code implementations • AAAI 2021 • Ke Wang, Guandan Chen, Zhongqiang Huang, Xiaojun Wan, Fei Huang

Despite the near-human performances already achieved on formal texts such as news articles, neural machine translation still has difficulty in dealing with "user-generated" texts that have diverse linguistic phenomena but lack large-scale high-quality parallel corpora.

counterfactual Domain Adaptation +2

Diversifying Neural Text Generation with Part-of-Speech Guided Softmax and Sampling

1 code implementation • COLING 2022 • Zhixian Yang, Pengxuan Xu, Xiaojun Wan

Neural text generation models are likely to suffer from the low-diversity problem.

POS Text Generation

Learning a Product Relevance Model from Click-Through Data in E-Commerce

no code implementations • 14 Feb 2021 • Shaowei Yao, Jiwei Tan, Xi Chen, Keping Yang, Rong Xiao, Hongbo Deng, Xiaojun Wan

We propose a novel way to consider samples of different relevance confidence, and come up with a new training objective to learn a robust relevance model with desirable score distribution.

Click-Through Rate Prediction Computational Efficiency

ParaSCI: A Large Scientific Paraphrase Dataset for Longer Paraphrase Generation

1 code implementation • EACL 2021 • Qingxiu Dong, Xiaojun Wan, Yue Cao

We propose ParaSCI, the first large-scale paraphrase dataset in the scientific field, including 33,981 paraphrase pairs from ACL (ParaSCI-ACL) and 316,063 pairs from arXiv (ParaSCI-arXiv).

Paraphrase Generation

On the Helpfulness of Document Context to Sentence Simplification

1 code implementation • COLING 2020 • Renliang Sun, Zhe Lin, Xiaojun Wan

Our model uses neural networks to learn the different effects of the preceding sentences and the following sentences on the current sentence and applies them to the improved transformer model.

Sentence Text Simplification

Improving Grammatical Error Correction with Data Augmentation by Editing Latent Representation

no code implementations • COLING 2020 • Zhaohong Wan, Xiaojun Wan, Wenguang Wang

The incorporation of data augmentation method in grammatical error correction task has attracted much attention.

Data Augmentation Grammatical Error Correction

IGSQL: Database Schema Interaction Graph Based Neural Model for Context-Dependent Text-to-SQL Generation

2 code implementations • EMNLP 2020 • Yitao Cai, Xiaojun Wan

Our model outperforms previous state-of-the-art model by a large margin and achieves new state-of-the-art results on the two datasets.

Text-To-SQL

DivGAN: Towards Diverse Paraphrase Generation via Diversified Generative Adversarial Network

no code implementations • Findings of the Association for Computational Linguistics 2020 • Yue Cao, Xiaojun Wan

In this paper, we propose a deep generative model to generate diverse paraphrases.

Generative Adversarial Network Paraphrase Generation

Adversarial Text Generation via Sequence Contrast Discrimination

no code implementations • Findings of the Association for Computational Linguistics 2020 • Ke Wang, Xiaojun Wan

In this paper, we propose a sequence contrast loss driven text generation framework, which learns the difference between real texts and generated texts and uses that difference.

Adversarial Text Text Generation

Abstractive Multi-Document Summarization via Joint Learning with Single-Document Summarization

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Hanqi Jin, Xiaojun Wan

Single-document and multi-document summarizations are very closely related in both task definition and solution method.

Document Summarization Multi-Document Summarization

TransModality: An End2End Fusion Method with Transformer for Multimodal Sentiment Analysis

no code implementations • 7 Sep 2020 • Zilong Wang, Zhaohong Wan, Xiaojun Wan

Enlightened by recent success of Transformer in the area of machine translation, we propose a new fusion method, TransModality, to address the task of multimodal sentiment analysis.

Ranked #1 on Multimodal Sentiment Analysis on CMU-MOSI (F1-score (Weighted) metric)

Machine Translation Multimodal Sentiment Analysis +1

Constructing a Family Tree of Ten Indo-European Languages with Delexicalized Cross-linguistic Transfer Patterns

no code implementations • 17 Jul 2020 • Yuanyuan Zhao, Weiwei Sun, Xiaojun Wan

It is reasonable to hypothesize that the divergence patterns formulated by historical linguists and typologists reflect constraints on human languages, and are thus consistent with Second Language Acquisition (SLA) in a certain way.

Language Acquisition

Heterogeneous Graph Transformer for Graph-to-Sequence Learning

no code implementations • ACL 2020 • Shaowei Yao, Tianming Wang, Xiaojun Wan

The graph-to-sequence (Graph2Seq) learning aims to transduce graph-structured representations to word sequences for text generation.

AMR-to-Text Generation Graph-to-Sequence +3

Learning to Ask More: Semi-Autoregressive Sequential Question Generation under Dual-Graph Interaction

no code implementations • ACL 2020 • Zi Chai, Xiaojun Wan

Traditional Question Generation (TQG) aims to generate a question given an input passage and an answer.

Question Generation Question-Generation

Automatic Generation of Citation Texts in Scholarly Papers: A Pilot Study

no code implementations • ACL 2020 • Xinyu Xing, Xiaosheng Fan, Xiaojun Wan

In this paper, we study the challenging problem of automatic generation of citation texts in scholarly papers.

Text Generation

Multimodal Transformer for Multimodal Machine Translation

1 code implementation • ACL 2020 • Shaowei Yao, Xiaojun Wan

Multimodal Machine Translation (MMT) aims to introduce information from other modality, generally static images, to improve the translation quality.

Ranked #6 on Multimodal Machine Translation on Multi30K

Multimodal Machine Translation Translation

Jointly Learning to Align and Summarize for Neural Cross-Lingual Summarization

no code implementations • ACL 2020 • Yue Cao, Hui Liu, Xiaojun Wan

However, it is a big challenge for the model to directly learn cross-lingual summarization as it requires learning to understand different languages and learning how to summarize at the same time.

Cross-Lingual Transfer

Semantic Parsing for English as a Second Language

no code implementations • ACL 2020 • Yuanyuan Zhao, Weiwei Sun, Junjie Cao, Xiaojun Wan

This paper is concerned with semantic parsing for English as a second language (ESL).

Grammatical Error Correction Language Acquisition +1

Multi-Granularity Interaction Network for Extractive and Abstractive Multi-Document Summarization

no code implementations • ACL 2020 • Hanqi Jin, Tianming Wang, Xiaojun Wan

In this paper, we propose a multi-granularity interaction network for extractive and abstractive multi-document summarization, which jointly learns semantic representations for words, sentences, and documents.

Document Summarization Extractive Summarization +2

AMR-To-Text Generation with Graph Transformer

no code implementations • TACL 2020 • Tianming Wang, Xiaojun Wan, Hanqi Jin

Abstract meaning representation (AMR)-to-text generation is the challenging task of generating natural language texts from AMR graphs, where nodes represent concepts and edges denote relations.

AMR-to-Text Generation Graph-to-Sequence +1

Towards a Unified End-to-End Approach for Fully Unsupervised Cross-Lingual Sentiment Analysis

no code implementations • CONLL 2019 • Yanlin Feng, Xiaojun Wan

Cross-lingual sentiment analysis (CLSA) aims to improve the performance on these languages by leveraging annotated data from other languages.

Cross-Lingual Word Embeddings Sentiment Analysis +1

Automated Chess Commentator Powered by Neural Chess Engine

2 code implementations • ACL 2019 • Hongyu Zang, Zhiwei Yu, Xiaojun Wan

In this paper, we explore a new approach for automated chess commentary generation, which aims to generate chess commentary texts in different categories (e.g., description, comparison, planning, etc.).

Text Generation

A Neural Approach to Irony Generation

1 code implementation • 13 Sep 2019 • Mengdi Zhu, Zhiwei Yu, Xiaojun Wan

Ironies can not only express stronger emotions but also show a sense of humor.

Style Transfer

INS: An Interactive Chinese News Synthesis System

no code implementations • NAACL 2019 • Hui Liu, Wentao Qin, Xiaojun Wan

So it is of vital importance to automatically synthesize a batch of news articles related to the event or topic into a new synthesis article (or overview article) for reader's convenience.

A Comparative Analysis of Knowledge-Intensive and Data-Intensive Semantic Parsers

no code implementations • 4 Jul 2019 • Junjie Cao, Zi Lin, Weiwei Sun, Xiaojun Wan

We present a phenomenon-oriented comparative analysis of the two dominant approaches in task-independent semantic parsing: classic, knowledge-intensive and neural, data-intensive models.

Semantic Parsing

Multi-Modal Sarcasm Detection in Twitter with Hierarchical Fusion Model

no code implementations • ACL 2019 • Yitao Cai, Huiyu Cai, Xiaojun Wan

We create a multi-modal sarcasm detection dataset based on Twitter.

Attribute Sarcasm Detection

Asking the Crowd: Question Analysis, Evaluation and Generation for Open Discussion on Online Forums

1 code implementation • ACL 2019 • Zi Chai, Xinyu Xing, Xiaojun Wan, Bo Huang

For openQG task, we construct OQGenD, the first dataset as far as we know, and propose a model based on conditional generative adversarial networks and our question evaluation model.

Text Generation

T-CVAE: Transformer-Based Conditioned Variational Autoencoder for Story Completion

1 code implementation • International Joint Conference on Artificial Intelligence 2019 • Tianming Wang, Xiaojun Wan

Our model uses shared attention layers for encoder and decoder, which make the most of the contextual clues, and a latent variable for learning the distribution of coherent story plots.

Story Completion

A Semi-Supervised Approach for Low-Resourced Text Generation

1 code implementation • 3 Jun 2019 • Hongyu Zang, Xiaojun Wan

The low-resource (of labeled data) problem is quite common in different text generation tasks, but unlabeled data are usually abundant.

Denoising Language Modelling +2

Massive Styles Transfer with Limited Labeled Data

1 code implementation • 3 Jun 2019 • Hongyu Zang, Xiaojun Wan

In this paper, we propose a multi-agent style transfer system (MAST) for addressing multiple style transfer tasks with limited labeled data, by leveraging abundant unlabeled data and the mutual benefit among the multiple styles.

Denoising Style Transfer +1

Paper
Code

How to Avoid Sentences Spelling Boring? Towards a Neural Approach to Unsupervised Metaphor Generation

no code implementations • NAACL 2019 • Zhiwei Yu, Xiaojun Wan

In order to create novel metaphors, we propose a neural approach to metaphor generation and explore the shared inferential structure of a metaphorical usage and a literal usage of a verb.

Language Modelling Text Generation

Paper
Add Code

Learning Bilingual Sentiment-Specific Word Embeddings without Cross-lingual Supervision

no code implementations • NAACL 2019 • Yanlin Feng, Xiaojun Wan

Our method only requires a sentiment corpus in the source language and pretrained monolingual word embeddings of both languages.

Sentiment Analysis Translation +3

Paper
Add Code

Controllable Unsupervised Text Attribute Transfer via Editing Entangled Latent Representation

2 code implementations • NeurIPS 2019 • Ke Wang, Hang Hua, Xiaojun Wan

Unsupervised text attribute transfer automatically transforms a text to alter a specific attribute (e.g., sentiment) without using any parallel data, while simultaneously preserving its attribute-independent content.

Attribute Text Attribute Transfer

Paper
Code

AMRec: An Intelligent System for Academic Method Recommendation

no code implementations • 10 Apr 2019 • Shanshan Huang, Xiaojun Wan, Xuewei Tang

Finding new academic methods for research problems is a key task in a researcher's career.

Paper
Add Code

Parsing Chinese Sentences with Grammatical Relations

no code implementations • CL 2019 • Weiwei Sun, Yufei Chen, Xiaojun Wan, Meichun Liu

In this work, we propose to represent grammatical information using general directed dependency graphs.

Paper
Add Code

Book Review: Automatic Text Simplification by Horacio Saggion

no code implementations • CL 2018 • Xiaojun Wan

Lexical Simplification Text Generation +1

Paper
Add Code

Adapting Neural Single-Document Summarization Model for Abstractive Multi-Document Summarization: A Pilot Study

no code implementations • WS 2018 • Jianmin Zhang, Jiwei Tan, Xiaojun Wan

In this paper, we investigate neural abstractive methods for MDS by adapting a state-of-the-art neural abstractive summarization model for SDS.

Abstractive Text Summarization Document Summarization +2

Paper
Add Code

Neural Maximum Subgraph Parsing for Cross-Domain Semantic Dependency Analysis

1 code implementation • CONLL 2018 • Yufei Chen, Sheng Huang, Fang Wang, Junjie Cao, Weiwei Sun, Xiaojun Wan

We present experiments for cross-domain semantic dependency analysis with a neural Maximum Subgraph parser.

Dependency Parsing Semantic Parsing

Paper
Code

Semantic Role Labeling for Learner Chinese: the Importance of Syntactic Parsing and L2-L1 Parallel Data

1 code implementation • EMNLP 2018 • Zi Lin, Yuguang Duan, Yuan-Yuan Zhao, Weiwei Sun, Xiaojun Wan

This paper studies semantic parsing for interlanguage (L2), taking semantic role labeling (SRL) as a case task and learner Chinese as a case language.

Semantic Parsing Semantic Role Labeling +1

Paper
Code

Point Precisely: Towards Ensuring the Precision of Data in Generated Texts Using Delayed Copy Mechanism

no code implementations • COLING 2018 • Liunian Li, Xiaojun Wan

Our approach first adopts an encoder-decoder model to generate a template text with data slots to be filled and then leverages a proposed delayed copy mechanism to fill in the slots with proper data records.

Data-to-Text Generation Descriptive +1

Paper
Add Code

Accurate SHRG-Based Semantic Parsing

no code implementations • ACL 2018 • Yufei Chen, Weiwei Sun, Xiaojun Wan

We demonstrate that an SHRG-based parser can produce semantic graphs much more accurately than previously shown, by relating synchronous production rules to the syntacto-semantic composition process.

Semantic Composition Semantic Parsing

Paper
Add Code

Language Generation via DAG Transduction

no code implementations • ACL 2018 • Yajie Ye, Weiwei Sun, Xiaojun Wan

This remarkable result demonstrates the feasibility of applying a DAG transducer to resolve NLG, as well as the effectiveness of our design.

Semantic Parsing Text Generation

Paper
Add Code

Sense-Aware Neural Models for Pun Location in Texts

no code implementations • ACL 2018 • Yitao Cai, Yin Li, Xiaojun Wan

In this paper, we focus on the task of pun location, which aims to identify the pun word in a given short text.

Word Sense Disambiguation

Paper
Add Code

A Neural Approach to Pun Generation

no code implementations • ACL 2018 • Zhiwei Yu, Jiwei Tan, Xiaojun Wan

Since sequence-to-sequence models provide an effective technique for text generation, it is promising to investigate these models on the pun generation task.

Image Captioning Language Modelling +3

Paper
Add Code

Pre- and In-Parsing Models for Neural Empty Category Detection

no code implementations • ACL 2018 • Yufei Chen, Yuan-Yuan Zhao, Weiwei Sun, Xiaojun Wan

Motivated by the positive impact of empty categories on syntactic parsing, we study neural models for pre- and in-parsing detection of empty categories, which has not previously been investigated.

Dependency Parsing Structured Prediction

Paper
Add Code

Towards a Neural Network Approach to Abstractive Multi-Document Summarization

no code implementations • 24 Apr 2018 • Jianmin Zhang, Jiwei Tan, Xiaojun Wan

In this paper, we investigate neural abstractive methods for MDS by adapting a state-of-the-art neural abstractive summarization model for SDS.

Abstractive Text Summarization Document Summarization +1

Paper
Add Code

Towards Automatic Generation of Entertaining Dialogues in Chinese Crosstalks

no code implementations • 1 Nov 2017 • Shikang Du, Xiaojun Wan, Yajie Ye

Crosstalk, also known by its Chinese name xiangsheng, is a traditional Chinese comedic performing art featuring jokes and funny dialogues, and one of China's most popular cultural elements.

Dialogue Generation Translation

Paper
Add Code

Leveraging Diverse Lexical Chains to Construct Essays for Chinese College Entrance Examination

no code implementations • IJCNLP 2017 • Liunian Li, Xiaojun Wan, Jin-Ge Yao, Siming Yan

In this work, we study the challenging task of automatically constructing essays for the Chinese college entrance examination, where the topic is specified in advance.

Sentence

Paper
Add Code

Quasi-Second-Order Parsing for 1-Endpoint-Crossing, Pagenumber-2 Graphs

no code implementations • EMNLP 2017 • Junjie Cao, Sheng Huang, Weiwei Sun, Xiaojun Wan

We propose a new Maximum Subgraph algorithm for first-order parsing to 1-endpoint-crossing, pagenumber-2 graphs.

Dependency Parsing

Paper
Add Code

Towards Automatic Construction of News Overview Articles by News Synthesis

no code implementations • EMNLP 2017 • Jianmin Zhang, Xiaojun Wan

In this paper we investigate a new task of automatically constructing an overview article from a given set of news articles about a news event.

Document Summarization Multi-Document Summarization

Paper
Add Code

Towards Automatic Generation of Product Reviews from Aspect-Sentiment Scores

no code implementations • WS 2017 • Hongyu Zang, Xiaojun Wan

Data-to-text generation is essential in machine writing applications.

Data-to-Text Generation Retrieval

Paper
Add Code

Towards a Universal Sentiment Classifier in Multiple languages

no code implementations • EMNLP 2017 • Kui Xu, Xiaojun Wan

We present the evaluation results of our universal sentiment classifier in five languages, and the results are very promising even when the parallel data between English and the target languages are not used.

General Classification Machine Translation +2

Paper
Add Code

Content Selection for Real-time Sports News Construction from Commentary Texts

no code implementations • WS 2017 • Jin-Ge Yao, Jianmin Zhang, Xiaojun Wan, Jianguo Xiao

We study the task of automatically constructing sports news reports from live commentary and focus on content selection.

Document Summarization Feature Engineering +2

Paper
Add Code

The Covert Helps Parse the Overt

no code implementations • CONLL 2017 • Xun Zhang, Weiwei Sun, Xiaojun Wan

This paper is concerned with whether deep syntactic information can help surface parsing, with a particular focus on empty categories.

Dependency Parsing

Paper
Add Code

Parsing for Grammatical Relations via Graph Merging

no code implementations • CONLL 2017 • Weiwei Sun, Yantao Du, Xiaojun Wan

This paper is concerned with building deep grammatical relation (GR) analysis using a data-driven approach.

Paper
Add Code

Abstractive Document Summarization with a Graph-Based Attentional Neural Model

no code implementations • ACL 2017 • Jiwei Tan, Xiaojun Wan, Jianguo Xiao

Abstractive summarization is the ultimate goal of document summarization research, but it has previously been less investigated due to the immaturity of text generation techniques.

Ranked #12 on Text Summarization on CNN / Daily Mail (Anonymized)

Abstractive Text Summarization Document Summarization +5

Paper
Add Code

Semantic Dependency Parsing via Book Embedding

no code implementations • ACL 2017 • Weiwei Sun, Junjie Cao, Xiaojun Wan

We model a dependency graph as a book, a particular kind of topological space, for semantic dependency parsing.

Combinatorial Optimization Dependency Parsing +2

Paper
Add Code

Parsing to 1-Endpoint-Crossing, Pagenumber-2 Graphs

no code implementations • ACL 2017 • Junjie Cao, Sheng Huang, Weiwei Sun, Xiaojun Wan

We study the Maximum Subgraph problem in deep dependency parsing.

Dependency Parsing Semantic Dependency Parsing

Paper
Add Code

Learning to Identify Ambiguous and Misleading News Headlines

no code implementations • 17 May 2017 • Wei Wei, Xiaojun Wan

For the identification of misleading headlines, we extract features based on the congruence between headlines and bodies.

Paper
Add Code

PKUSUMSUM : A Java Platform for Multilingual Document Summarization

no code implementations • COLING 2016 • Jianmin Zhang, Tianming Wang, Xiaojun Wan

PKUSUMSUM is a Java platform for multilingual document summarization; it supports multiple languages, integrates 10 automatic summarization methods, and tackles three typical summarization tasks.

Chinese Word Segmentation Document Summarization +1

Paper
Add Code

Attention-based LSTM Network for Cross-Lingual Sentiment Classification

no code implementations • EMNLP 2016 • Xinjie Zhou, Xiaojun Wan, Jianguo Xiao

Cross-Lingual Sentiment Classification General Classification +5

Paper
Add Code

Towards Accurate and Efficient Chinese Part-of-Speech Tagging

no code implementations • CL 2016 • Weiwei Sun, Xiaojun Wan

Chinese Part-of-Speech Tagging Part-Of-Speech Tagging +1

Paper
Add Code

Transition-Based Parsing for Deep Dependency Structures

no code implementations • CL 2016 • Xun Zhang, Yantao Du, Weiwei Sun, Xiaojun Wan

Paper
Add Code

Towards Constructing Sports News from Live Text Commentary

no code implementations • ACL 2016 • Jianmin Zhang, Jin-Ge Yao, Xiaojun Wan

Document Summarization Learning-To-Rank

Paper
Add Code

Cross-Lingual Sentiment Classification with Bilingual Document Representation Learning

no code implementations • ACL 2016 • Xinjie Zhou, Xiaojun Wan, Jianguo Xiao

Classification Cross-Lingual Sentiment Classification +6

Paper
Add Code

User Embedding for Scholarly Microblog Recommendation

no code implementations • ACL 2016 • Yang Yu, Xiaojun Wan, Xinjie Zhou

Collaborative Filtering Collaborative Ranking

Paper
Add Code

Automatic Labeling of Topic Models Using Text Summaries

no code implementations • ACL 2016 • Xiaojun Wan, Tianming Wang

Information Retrieval Topic Models

Paper
Add Code

Phrase-based Compressive Cross-Language Summarization

no code implementations • EMNLP 2015 • Jin-Ge Yao, Xiaojun Wan, Jianguo Xiao

Document Summarization Machine Translation +2

Paper
Add Code

Mining and Analyzing the Future Works in Scientific Articles

no code implementations • 8 Jul 2015 • Yue Hu, Xiaojun Wan

Third, we apply the extraction method and the classification model to a paper dataset in the computer science field and conduct a further analysis of the future works.

Classification General Classification

Paper
Add Code

Multi-Document Summarization via Discriminative Summary Reranking

no code implementations • 8 Jul 2015 • Xiaojun Wan, Ziqiang Cao, Furu Wei, Sujian Li, Ming Zhou

However, according to our quantitative analysis, none of the existing summarization models can always produce high-quality summaries for different document sets, and even a summarization model with good overall performance may produce low-quality summaries for some document sets.

Document Summarization Multi-Document Summarization +1

Paper
Add Code

Learning to Mine Chinese Coordinate Terms Using the Web

no code implementations • 8 Jul 2015 • Xiaojiang Huang, Xiaojun Wan, Jianguo Xiao

Coordinate relation refers to the relation between instances of a concept and the relation between the direct hyponyms of a concept.

Relation

Paper
Add Code

BrailleSUM: A News Summarization System for the Blind and Visually Impaired People

no code implementations • IJCNLP 2015 • Xiaojun Wan, Yue Hu

Document Summarization News Summarization

Paper
Add Code

A Data-Driven, Factorization Parser for CCG Dependency Structures

no code implementations • IJCNLP 2015 • Yantao Du, Weiwei Sun, Xiaojun Wan

Dependency Parsing Question Answering

Paper
Add Code

Peking: Building Semantic Dependency Graphs with a Hybrid Parser

no code implementations • SEMEVAL 2015 • Yantao Du, Fan Zhang, Xun Zhang, Weiwei Sun, Xiaojun Wan

Dependency Parsing

Paper
Add Code

Representation Learning for Aspect Category Detection in Online Reviews

no code implementations • AAAI 2015 • Xinjie Zhou, Xiaojun Wan, Jianguo Xiao

Afterwards, we propose to generate deeper and hybrid features through neural networks stacked on the word vectors.

Aspect Category Detection Decision Making +5

Paper
Add Code

Joint Decoding of Tree Transduction Models for Sentence Compression

no code implementations • EMNLP 2014 • Jin-Ge Yao, Xiaojun Wan, Jianguo Xiao

Language Modelling Sentence +1

Paper
Add Code

Automatic Generation of Related Work Sections in Scientific Papers: An Optimization Approach

no code implementations • EMNLP 2014 • Yue Hu, Xiaojun Wan

Document Summarization Multi-Document Summarization +1

Paper
Add Code

Peking: Profiling Syntactic Tree Parsing Techniques for Semantic Graph Parsing

no code implementations • SEMEVAL 2014 • Yantao Du, Fan Zhang, Weiwei Sun, Xiaojun Wan

Dependency Parsing

Paper
Add Code

Grammatical Relations in Chinese: GB-Ground Extraction and Data-Driven Parsing

no code implementations • ACL 2014 • Weiwei Sun, Yantao Du, Xin Kou, Shuoyang Ding, Xiaojun Wan

Dependency Parsing

Paper
Add Code

Capturing Long-distance Dependencies in Sequence Models: A Case Study of Chinese Part-of-speech Tagging

no code implementations • IJCNLP 2013 • Weiwei Sun, Xiaochang Peng, Xiaojun Wan

Chinese Part-of-Speech Tagging Chunking +2

Paper
Add Code

Collective Opinion Target Extraction in Chinese Microblogs

no code implementations • EMNLP 2013 • Xinjie Zhou, Xiaojun Wan, Jianguo Xiao

Dependency Parsing Sentiment Analysis +1

Paper
Add Code

Learning to Order Natural Language Texts

no code implementations • ACL 2013 • Jiwei Tan, Xiaojun Wan, Jianguo Xiao

Concept-To-Text Generation Document Summarization +2

Paper
Add Code

Co-Regression for Cross-Language Review Rating Prediction

no code implementations • ACL 2013 • Xiaojun Wan

Machine Translation regression +1

Paper
Add Code

Data-driven, PCFG-based and Pseudo-PCFG-based Models for Chinese Dependency Parsing

no code implementations • TACL 2013 • Weiwei Sun, Xiaojun Wan

We present a comparative study of transition-, graph- and PCFG-based models aimed at illuminating more precisely the likely contribution of CFGs in improving Chinese dependency parsing accuracy, especially by combining heterogeneous models.

Chinese Dependency Parsing Dependency Parsing +2

Paper
Add Code

Update Summarization Based on Co-Ranking with Constraints

no code implementations • COLING 2012 • Xiaojun Wan

Document Summarization Multi-Document Summarization

Paper
Add Code

Reducing Approximation and Estimation Errors for Chinese Lexical Processing with Heterogeneous Annotations

no code implementations • ACL 2012 • Weiwei Sun, Xiaojun Wan

Chinese Word Segmentation Part-Of-Speech Tagging +1

Paper
Add Code
