Search Results for author: Yixiao Song

Found 6 papers, 5 papers with code

GEE! Grammar Error Explanation with Large Language Models

1 code implementation · 16 Nov 2023 · Yixiao Song, Kalpesh Krishna, Rajesh Bhatt, Kevin Gimpel, Mohit Iyyer

To address this gap, we propose the task of grammar error explanation, where a system needs to provide one-sentence explanations for each grammatical error in a pair of erroneous and corrected sentences.

Grammatical Error Correction, Sentence

A Critical Evaluation of Evaluations for Long-form Question Answering

1 code implementation · 29 May 2023 · Fangyuan Xu, Yixiao Song, Mohit Iyyer, Eunsol Choi

We present a careful analysis of experts' evaluation, which focuses on new aspects such as the comprehensiveness of the answer.

Long Form Question Answering, Text Generation

KNN-LM Does Not Improve Open-ended Text Generation

no code implementations · 24 May 2023 · Shufan Wang, Yixiao Song, Andrew Drozdov, Aparna Garimella, Varun Manjunatha, Mohit Iyyer

Digging deeper, we find that interpolating with a retrieval distribution actually increases perplexity compared to a baseline Transformer LM for the majority of tokens in the WikiText-103 test set, even though the overall perplexity is lower due to a smaller number of tokens for which perplexity dramatically decreases after interpolation.

Retrieval, Text Generation
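
The per-token effect described in this abstract can be illustrated with a toy calculation. The sketch below assumes the usual kNN-LM interpolation form, p = lam * p_knn + (1 - lam) * p_lm; the probabilities, the weight `lam`, and the function names are invented for illustration and are not taken from the paper.

```python
import math

def interpolate(p_lm, p_knn, lam=0.25):
    """Mix the base LM probability with the retrieval probability (lam is an assumed weight)."""
    return lam * p_knn + (1.0 - lam) * p_lm

def perplexity(probs):
    """Perplexity over a sequence, given the probability assigned to each gold token."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))

# Made-up gold-token probabilities for five tokens (illustrative only).
p_lm = [0.20, 0.30, 0.25, 0.40, 0.001]   # base Transformer LM
p_knn = [0.10, 0.15, 0.10, 0.20, 0.90]   # retrieval distribution

p_mix = [interpolate(a, b) for a, b in zip(p_lm, p_knn)]

# Count tokens whose probability drops (i.e. per-token perplexity rises) after interpolation.
num_worse = sum(m < a for m, a in zip(p_mix, p_lm))

print(f"tokens made worse: {num_worse} of {len(p_lm)}")
print(f"LM perplexity:  {perplexity(p_lm):.2f}")
print(f"mix perplexity: {perplexity(p_mix):.2f}")
```

In this toy setting most tokens get slightly worse, yet the overall perplexity falls because one token that the base LM nearly missed is rescued by retrieval.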

Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense

1 code implementation · NeurIPS 2023 · Kalpesh Krishna, Yixiao Song, Marzena Karpinska, John Wieting, Mohit Iyyer

To increase the robustness of AI-generated text detection to paraphrase attacks, we introduce a simple defense that relies on retrieving semantically-similar generations and must be maintained by a language model API provider.

Language Modelling, Outlier Detection, +3 more

DEMETR: Diagnosing Evaluation Metrics for Translation

1 code implementation · 25 Oct 2022 · Marzena Karpinska, Nishant Raj, Katherine Thai, Yixiao Song, Ankita Gupta, Mohit Iyyer

While machine translation evaluation metrics based on string overlap (e.g., BLEU) have their limitations, their computations are transparent: the BLEU score assigned to a particular candidate translation can be traced back to the presence or absence of certain words.

Machine Translation, Translation
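
To make the transparency point concrete, here is a minimal sketch of clipped unigram precision, one component of BLEU. It is not a full BLEU implementation (no higher-order n-grams, brevity penalty, or geometric mean), and the function names and example sentences are hypothetical.

```python
from collections import Counter

def clipped_unigram_matches(candidate, reference):
    """Per-word match counts, clipped to how often each word occurs in the reference."""
    cand_counts = Counter(candidate.split())
    ref_counts = Counter(reference.split())
    return {word: min(count, ref_counts[word]) for word, count in cand_counts.items()}

def unigram_precision(candidate, reference):
    """Fraction of candidate words that are matched in the reference."""
    total = len(candidate.split())
    if total == 0:
        return 0.0
    return sum(clipped_unigram_matches(candidate, reference).values()) / total

# Invented example sentences.
candidate = "the cat sat on the mat"
reference = "the cat is on the mat"

matches = clipped_unigram_matches(candidate, reference)
precision = unigram_precision(candidate, reference)

# Each word's contribution is visible, so the score can be traced to specific words.
for word, count in matches.items():
    print(f"{word!r}: {count} match(es)")
print(f"unigram precision: {precision:.3f}")
```

Here the only unmatched word is "sat", so the drop from a perfect score is attributable to that single word.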

SLING: Sino Linguistic Evaluation of Large Language Models

1 code implementation · 21 Oct 2022 · Yixiao Song, Kalpesh Krishna, Rajesh Bhatt, Mohit Iyyer

To understand what kinds of linguistic knowledge are encoded by pretrained Chinese language models (LMs), we introduce the benchmark of Sino LINGuistics (SLING), which consists of 38K minimal sentence pairs in Mandarin Chinese grouped into 9 high-level linguistic phenomena.

Sentence
