Search Results for author: Davis Liang

Found 14 papers, 7 papers with code

RoAST: Robustifying Language Models via Adversarial Perturbation with Selective Training

1 code implementation • 7 Dec 2023 • Jaehyung Kim, Yuning Mao, Rui Hou, Hanchao Yu, Davis Liang, Pascale Fung, Qifan Wang, Fuli Feng, Lifu Huang, Madian Khabsa

Under a unified evaluation of fine-tuned LMs by incorporating four representative perspectives of model robustness, we demonstrate the effectiveness of RoAST compared to state-of-the-art fine-tuning methods on six different types of LMs, which indicates its usefulness in practice.

Adversarial Robustness

A Study on Knowledge Distillation from Weak Teacher for Scaling Up Pre-trained Language Models

1 code implementation • 26 May 2023 • Hayeon Lee, Rui Hou, Jongpil Kim, Davis Liang, Sung Ju Hwang, Alexander Min

Distillation from Weak Teacher (DWT) is a method of transferring knowledge from a smaller, weaker teacher model to a larger student model to improve the student's performance.

Knowledge Distillation
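The DWT setup above still relies on a standard distillation objective; as a rough illustration (the paper's exact loss and selective-transfer details are not given in this snippet), knowledge distillation typically minimizes the KL divergence between temperature-softened teacher and student distributions. The function names and temperature value here are illustrative, not from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-softened softmax: higher T flattens the distribution.
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    In DWT the teacher is the *smaller* model, so this term is usually
    combined with a standard supervised loss on ground-truth labels.
    """
    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

loss = distillation_loss([2.0, 0.5, -1.0], [1.5, 1.0, -0.5])
```

When student and teacher logits agree exactly, the loss is zero; otherwise it is strictly positive, pulling the student's distribution toward the teacher's.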

Attention-guided Generative Models for Extractive Question Answering

no code implementations • 12 Oct 2021 • Peng Xu, Davis Liang, Zhiheng Huang, Bing Xiang

We propose a simple strategy to obtain an extractive answer span from the generative model by leveraging the decoder cross-attention patterns.

Extractive Question-Answering • Hallucination +2
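The abstract describes reading an answer span off the decoder's cross-attention. One plausible (hypothetical, not the paper's stated procedure) way to do this: average the cross-attention each input token receives across decoding steps, then grow a contiguous span around the most-attended token.

```python
def extract_span(cross_attention, threshold=0.5):
    """Pick an extractive span from decoder cross-attention weights.

    cross_attention: rows = decoder steps, cols = input tokens.
    Averages attention per input token, then extends a contiguous span
    around the argmax while scores stay above threshold * peak.
    The aggregation and threshold here are illustrative assumptions.
    """
    n_steps = len(cross_attention)
    n_tokens = len(cross_attention[0])
    scores = [sum(row[j] for row in cross_attention) / n_steps
              for j in range(n_tokens)]
    peak = max(range(n_tokens), key=scores.__getitem__)
    cutoff = threshold * scores[peak]
    start = peak
    while start > 0 and scores[start - 1] >= cutoff:
        start -= 1
    end = peak
    while end < n_tokens - 1 and scores[end + 1] >= cutoff:
        end += 1
    return start, end

# Two decoding steps attending over four input tokens.
span = extract_span([[0.1, 0.6, 0.2, 0.1],
                     [0.0, 0.5, 0.4, 0.1]])  # -> (1, 2)
```

Because the span is copied verbatim from the input, this kind of readout avoids the hallucination risk of free-form generation, which is presumably why the entry is tagged with Hallucination above.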

Multiplicative Position-aware Transformer Models for Language Understanding

no code implementations • 27 Sep 2021 • Zhiheng Huang, Davis Liang, Peng Xu, Bing Xiang

Transformer models, which leverage architectural improvements like self-attention, perform remarkably well on Natural Language Processing (NLP) tasks.

Position

Decoding and Diversity in Machine Translation

no code implementations • 26 Nov 2020 • Nicholas Roberts, Davis Liang, Graham Neubig, Zachary C. Lipton

This makes human-level BLEU a misleading benchmark in that modern MT systems cannot approach human-level BLEU while simultaneously maintaining human-level translation diversity.

Machine Translation • NMT +1

Embedding-based Zero-shot Retrieval through Query Generation

1 code implementation • 22 Sep 2020 • Davis Liang, Peng Xu, Siamak Shakeri, Cicero Nogueira dos Santos, Ramesh Nallapati, Zhiheng Huang, Bing Xiang

In some cases, our model trained on synthetic data can even outperform the same model trained on real data.

Passage Retrieval • Retrieval

TRANS-BLSTM: Transformer with Bidirectional LSTM for Language Understanding

no code implementations • 16 Mar 2020 • Zhiheng Huang, Peng Xu, Davis Liang, Ajay Mishra, Bing Xiang

Prior to the transformer era, bidirectional Long Short-Term Memory (BLSTM) had been the dominant modeling architecture for neural machine translation and question answering.

Machine Translation • Natural Language Inference +5

Masked Language Model Scoring

6 code implementations • ACL 2020 • Julian Salazar, Davis Liang, Toan Q. Nguyen, Katrin Kirchhoff

Instead, we evaluate MLMs out of the box via their pseudo-log-likelihood scores (PLLs), which are computed by masking tokens one by one.

Attribute • Domain Adaptation +4
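The PLL described in this abstract has a simple mechanical form: mask each token in turn, score the original token under the MLM at that masked position, and sum the log-probabilities. A minimal sketch, with a toy stand-in for the model (a real implementation would query a masked language model such as BERT once per position):

```python
import math

def toy_mask_fill_prob(tokens, masked_index):
    # Illustrative stand-in for an MLM: returns a fixed probability for
    # the original token at the masked position. Not a real model.
    vocab = {"the": 0.4, "cat": 0.2, "sat": 0.2, "on": 0.1, "mat": 0.1}
    return vocab.get(tokens[masked_index], 0.01)

def pseudo_log_likelihood(tokens, mask_fill_prob):
    """Sum of log P(token_i | all other tokens), masking one position
    at a time -- the PLL scoring scheme the abstract describes."""
    return sum(math.log(mask_fill_prob(tokens, i))
               for i in range(len(tokens)))

score = pseudo_log_likelihood(["the", "cat", "sat"], toy_mask_fill_prob)
```

Note the cost: scoring a sentence of n tokens takes n forward passes of the MLM, one per masked position, which is why PLL scoring is usually applied to reranking rather than training.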

Learning Noise-Invariant Representations for Robust Speech Recognition

no code implementations • 17 Jul 2018 • Davis Liang, Zhiheng Huang, Zachary C. Lipton

Despite rapid advances in speech recognition, current models remain brittle to superficial perturbations to their inputs.

Data Augmentation • Representation Learning +2

Deep Automated Multi-task Learning

no code implementations • IJCNLP 2017 • Davis Liang, Yan Shu

Our results show that training on a primary task in parallel with a secondary automated task improves both the convergence speed and accuracy for the primary task.

Multi-Task Learning • Sentiment Analysis
