Linguistic Acceptability

47 papers with code • 5 benchmarks • 5 datasets

Linguistic Acceptability is the task of determining whether a sentence is grammatical or ungrammatical.

Image source: Warstadt et al.
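
As a concrete illustration of the task, here is a minimal sketch of binary acceptability classification using an encoder fine-tuned on CoLA from the Hugging Face Hub. The checkpoint id is an assumption; any CoLA-fine-tuned model (or one you fine-tune yourself) works the same way.

```python
# Minimal acceptability-classification sketch (assumed checkpoint id).
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="textattack/bert-base-uncased-CoLA",  # assumed CoLA-fine-tuned model
)

sentences = [
    "The cat sat on the mat.",   # acceptable
    "The cat sat mat the on.",   # unacceptable
]

for sentence, pred in zip(sentences, classifier(sentences)):
    # Label names depend on the checkpoint (often LABEL_0 = unacceptable,
    # LABEL_1 = acceptable); check the model card before relying on them.
    print(f"{sentence!r} -> {pred['label']} ({pred['score']:.3f})")
```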

Most implemented papers

Entailment as Few-Shot Learner

PaddlePaddle/PaddleNLP 29 Apr 2021

Large pre-trained language models (LMs) have demonstrated remarkable ability as few-shot learners.
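
The following is not the paper's exact recipe, but a sketch of its core idea: recast a classification task as entailment between the input and a natural-language label description, so an NLI model can be applied with few or no task-specific examples. The model id and label wording are illustrative assumptions.

```python
# Acceptability recast as entailment against label descriptions (sketch).
from transformers import pipeline

nli = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

result = nli(
    "The cat sat mat the on.",
    candidate_labels=["acceptable", "unacceptable"],
    hypothesis_template="This sentence is grammatically {}.",
)
print(result["labels"][0], round(result["scores"][0], 3))
```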

LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale

huggingface/transformers-bloom-inference 15 Aug 2022

We develop a procedure for Int8 matrix multiplication for feed-forward and attention projection layers in transformers, which cuts the memory needed for inference in half while retaining full precision performance.
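
Below is a simplified NumPy sketch of the vector-wise absmax Int8 quantization behind this procedure; the paper additionally keeps rare outlier feature dimensions in fp16 (mixed-precision decomposition), which is omitted here.

```python
# Vector-wise absmax Int8 matmul (simplified sketch, outlier handling omitted).
import numpy as np

def absmax_quantize(x, axis):
    """Quantize to int8 with one scale per row/column along `axis`."""
    scale = 127.0 / np.max(np.abs(x), axis=axis, keepdims=True)
    return np.round(x * scale).astype(np.int8), scale

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 64)).astype(np.float32)   # activations
W = rng.normal(size=(64, 8)).astype(np.float32)   # weights

A_q, a_scale = absmax_quantize(A, axis=1)   # per-row scales
W_q, w_scale = absmax_quantize(W, axis=0)   # per-column scales

# Integer matmul with int32 accumulation, then dequantize.
C_int32 = A_q.astype(np.int32) @ W_q.astype(np.int32)
C = C_int32 / (a_scale * w_scale)           # broadcasts to (4, 8)

print(np.max(np.abs(C - A @ W)))            # small quantization error
```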

Neural Network Acceptability Judgments

nyu-mll/CoLA-baselines TACL 2019

This paper investigates the ability of artificial neural networks to judge the grammatical acceptability of a sentence, with the goal of testing their linguistic competence.
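
A sketch of working with the CoLA corpus released with this paper, as distributed through the GLUE benchmark on the Hugging Face Hub; the standard metric is Matthews correlation coefficient (MCC).

```python
# Load CoLA via GLUE and compute MCC on the validation split (sketch).
from datasets import load_dataset
from sklearn.metrics import matthews_corrcoef

cola = load_dataset("glue", "cola")
print(cola["train"][0])   # {'sentence': ..., 'label': 0 or 1, 'idx': ...}

# A trivial majority-class baseline, purely to show the metric call.
labels = cola["validation"]["label"]
preds = [1] * len(labels)
print("MCC:", matthews_corrcoef(labels, preds))
```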

ERNIE: Enhanced Language Representation with Informative Entities

thunlp/ERNIE ACL 2019

Neural language representation models such as BERT, pre-trained on large-scale corpora, can capture rich semantic patterns from plain text and be fine-tuned to consistently improve performance on various NLP tasks.

Charformer: Fast Character Transformers via Gradient-based Subword Tokenization

google-research/google-research ICLR 2022

In this paper, we propose a new model inductive bias that learns a subword tokenization end-to-end as part of the model.
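
The sketch below illustrates the gradient-based subword tokenization (GBST) idea in simplified NumPy: pool character embeddings into candidate blocks of several sizes, score each candidate, and mix them with a per-position softmax. Shapes and the linear scorer are illustrative assumptions; the real module is trained end-to-end inside the model.

```python
# Simplified GBST-style soft subword pooling (illustrative sketch).
import numpy as np

rng = np.random.default_rng(0)
L, d = 12, 16                       # sequence length, embedding dim
X = rng.normal(size=(L, d))         # character embeddings
w_score = rng.normal(size=(d,))     # assumed linear scoring vector

block_sizes = (1, 2, 3, 4)
candidates, scores = [], []
for b in block_sizes:
    # Mean-pool non-overlapping blocks of size b, then repeat each block
    # representation b times so every position has a size-b candidate.
    pooled = X[: (L // b) * b].reshape(L // b, b, d).mean(axis=1)
    upsampled = np.repeat(pooled, b, axis=0)[:L]
    candidates.append(upsampled)                  # (L, d)
    scores.append(upsampled @ w_score)            # (L,)

candidates = np.stack(candidates, axis=1)         # (L, num_blocks, d)
scores = np.stack(scores, axis=1)                 # (L, num_blocks)
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)

latent_subwords = (weights[..., None] * candidates).sum(axis=1)   # (L, d)
print(latent_subwords.shape)
```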

Can BERT eat RuCoLA? Topological Data Analysis to Explain

upunaprosk/la-tda 4 Apr 2023

Our results contribute to understanding the behavior of monolingual LMs in the acceptability classification task, provide insights into the functional roles of attention heads, and highlight the advantages of TDA-based approaches for analyzing LMs.
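
A rough sketch of this style of TDA-based analysis: threshold an attention map into an undirected graph and read off simple topological features (Betti numbers b0 and b1). The attention matrix is random here purely for illustration; in practice it would come from a transformer head (e.g. via output_attentions=True).

```python
# Attention map -> thresholded graph -> simple topological features (sketch).
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
attn = rng.dirichlet(np.ones(10), size=10)      # stand-in 10x10 attention map

threshold = 0.15
adj = np.maximum(attn, attn.T) >= threshold      # symmetrize, then threshold
np.fill_diagonal(adj, False)

G = nx.from_numpy_array(adj.astype(int))
b0 = nx.number_connected_components(G)                 # connected components
b1 = G.number_of_edges() - G.number_of_nodes() + b0    # independent cycles

print(f"b0 = {b0}, b1 = {b1}")
```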

JCoLA: Japanese Corpus of Linguistic Acceptability

osekilab/jcola 22 Sep 2023

In this paper, we introduce JCoLA (Japanese Corpus of Linguistic Acceptability), which consists of 10,020 sentences annotated with binary acceptability judgments.

Natural Language Generation for Effective Knowledge Distillation

castorini/d-bert WS 2019

Knowledge distillation can effectively transfer knowledge from BERT, a deep language representation model, to traditional, shallow word embedding-based neural networks, helping them approach or exceed the quality of other heavyweight language representation models.
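
For reference, a generic sketch of the distillation objective used when transferring from a BERT teacher to a small student: a temperature-scaled KL term on soft labels plus the usual cross-entropy on hard labels. The hyperparameters are illustrative, not the paper's.

```python
# Generic knowledge-distillation loss (sketch, illustrative hyperparameters).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2              # rescale gradients as in Hinton et al.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Example with random logits for a batch of 4 sentences, 2 classes.
student = torch.randn(4, 2, requires_grad=True)
teacher = torch.randn(4, 2)
labels = torch.tensor([0, 1, 1, 0])
print(distillation_loss(student, teacher, labels))
```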

Learning to Encode Position for Transformer with Continuous Dynamical Model

xuanqing94/FLOATER ICML 2020

The main reason is that position information among input units is not inherently encoded, i.e., the models are permutation equivariant; this explains why all existing models are accompanied by a sinusoidal encoding/embedding layer at the input.
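
For reference, the fixed sinusoidal position encoding mentioned above (Vaswani et al., 2017), which this paper generalizes with a continuous dynamical model:

```python
# Standard sinusoidal position encoding (reference sketch).
import numpy as np

def sinusoidal_encoding(seq_len, d_model):
    positions = np.arange(seq_len)[:, None]                      # (L, 1)
    dims = np.arange(d_model // 2)[None, :]                      # (1, d/2)
    angles = positions / np.power(10000.0, 2 * dims / d_model)   # (L, d/2)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

print(sinusoidal_encoding(seq_len=128, d_model=64).shape)   # (128, 64)
```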

Synthesizer: Rethinking Self-Attention in Transformer Models

10-zin/Synthesizer 2 May 2020

The dot product self-attention is known to be central and indispensable to state-of-the-art Transformer models.
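
Below is a simplified sketch of the "dense" Synthesizer variant the paper studies as an alternative: the L x L attention matrix is produced by a per-token feed-forward network instead of query-key dot products. Randomly initialized weights stand in for learned parameters.

```python
# Dense Synthesizer-style attention without query-key dot products (sketch).
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
L, d = 8, 32                          # sequence length, model dim
X = rng.normal(size=(L, d))

W1 = rng.normal(size=(d, d)) * 0.1    # per-token MLP ...
W2 = rng.normal(size=(d, L)) * 0.1    # ... mapping each token to L scores
W_v = rng.normal(size=(d, d)) * 0.1   # value projection

scores = np.maximum(X @ W1, 0.0) @ W2     # (L, L), no dot-product attention
attn = softmax(scores, axis=-1)
output = attn @ (X @ W_v)                 # (L, d)
print(output.shape)
```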