Search Results for author: Ben Athiwaratkun

Found 14 papers, 7 papers with code

Token Alignment via Character Matching for Subword Completion

no code implementations • 13 Mar 2024 • Ben Athiwaratkun, Shiqi Wang, Mingyue Shang, Yuchen Tian, Zijian Wang, Sujan Kumar Gonugondla, Sanjay Krishna Gouda, Rob Kwiatowski, Ramesh Nallapati, Bing Xiang

Generative models, widely utilized in various applications, can often struggle with prompts corresponding to partial tokens.

Code Completion

Bifurcated Attention for Single-Context Large-Batch Sampling

no code implementations • 13 Mar 2024 • Ben Athiwaratkun, Sujan Kumar Gonugondla, Sanjay Krishna Gouda, Haifeng Qian, Hantian Ding, Qing Sun, Jun Wang, Jiacheng Guo, Liangfu Chen, Parminder Bhatia, Ramesh Nallapati, Sudipta Sengupta, Bing Xiang

In our study, we present bifurcated attention, a method developed for language model inference in single-context batch sampling settings.

Answer Generation Language Modelling

Greener yet Powerful: Taming Large Code Generation Models with Quantization

no code implementations • 9 Mar 2023 • Xiaokai Wei, Sujan Gonugondla, Wasi Ahmad, Shiqi Wang, Baishakhi Ray, Haifeng Qian, Xiaopeng Li, Varun Kumar, Zijian Wang, Yuchen Tian, Qing Sun, Ben Athiwaratkun, Mingyue Shang, Murali Krishna Ramanathan, Parminder Bhatia, Bing Xiang

Such large models incur significant resource usage (in terms of memory, latency, and dollars) as well as carbon footprint.

Code Generation Code Summarization +2

Multi-lingual Evaluation of Code Generation Models

2 code implementations • 26 Oct 2022 • Ben Athiwaratkun, Sanjay Krishna Gouda, Zijian Wang, Xiaopeng Li, Yuchen Tian, Ming Tan, Wasi Uddin Ahmad, Shiqi Wang, Qing Sun, Mingyue Shang, Sujan Kumar Gonugondla, Hantian Ding, Varun Kumar, Nathan Fulton, Arash Farahani, Siddhartha Jain, Robert Giaquinto, Haifeng Qian, Murali Krishna Ramanathan, Ramesh Nallapati, Baishakhi Ray, Parminder Bhatia, Sudipta Sengupta, Dan Roth, Bing Xiang

Using these benchmarks, we are able to assess the performance of code generation models in a multi-lingual fashion, and we discover the generalization ability of language models on out-of-domain languages, the advantages of multi-lingual models over mono-lingual ones, the ability of few-shot prompting to teach the model new languages, and zero-shot translation abilities even in mono-lingual settings.

Code Completion Code Translation +1

Joint Text and Label Generation for Spoken Language Understanding

no code implementations • 11 May 2021 • Yang Li, Ben Athiwaratkun, Cicero Nogueira dos Santos, Bing Xiang

In this work, we propose to leverage the prior information embedded in pretrained language models (LM) to improve generalization for intent classification and slot labeling tasks with limited training data.

Intent Classification +2

Generative Context Pair Selection for Multi-hop Question Answering

no code implementations • EMNLP 2021 • Dheeru Dua, Cicero Nogueira dos Santos, Patrick Ng, Ben Athiwaratkun, Bing Xiang, Matt Gardner, Sameer Singh

Compositional reasoning tasks, like multi-hop question answering, require making latent decisions to get the final answer, given a question.

Multi-hop Question Answering Question Answering

Structured Prediction as Translation between Augmented Natural Languages

1 code implementation • ICLR 2021 • Giovanni Paolini, Ben Athiwaratkun, Jason Krone, Jie Ma, Alessandro Achille, Rishita Anubhai, Cicero Nogueira dos Santos, Bing Xiang, Stefano Soatto

We propose a new framework, Translation between Augmented Natural Languages (TANL), to solve many structured prediction language tasks including joint entity and relation extraction, nested named entity recognition, relation classification, semantic role labeling, event extraction, coreference resolution, and dialogue state tracking.

Ranked #3 on Relation Classification on TACRED

Coreference Resolution Dialogue State Tracking +11

Augmented Natural Language for Generative Sequence Labeling

no code implementations • EMNLP 2020 • Ben Athiwaratkun, Cicero Nogueira dos Santos, Jason Krone, Bing Xiang

We set a new state-of-the-art for few-shot slot labeling, improving substantially upon the previous 5-shot ($75.0\% \rightarrow 90.9\%$) and 1-shot ($70.4\% \rightarrow 81.0\%$) state-of-the-art results.

Intent Classification +4

There Are Many Consistent Explanations of Unlabeled Data: Why You Should Average

2 code implementations • ICLR 2019 • Ben Athiwaratkun, Marc Finzi, Pavel Izmailov, Andrew Gordon Wilson

Presently the most successful approaches to semi-supervised learning are based on consistency regularization, whereby a model is trained to be robust to small perturbations of its inputs and parameters.

Ranked #19 on Semi-Supervised Image Classification on CIFAR-10, 4000 Labels

Domain Adaptation Semi-Supervised Image Classification

Probabilistic FastText for Multi-Sense Word Embeddings

1 code implementation • ACL 2018 • Ben Athiwaratkun, Andrew Gordon Wilson, Anima Anandkumar

We introduce Probabilistic FastText, a new model for word embeddings that can capture multiple word senses, sub-word structure, and uncertainty information.

Word Embeddings Word Similarity

Hierarchical Density Order Embeddings

2 code implementations • ICLR 2018 • Ben Athiwaratkun, Andrew Gordon Wilson

By representing words with probability densities rather than point vectors, probabilistic word embeddings can capture rich and interpretable semantic information and uncertainty.

Ranked #2 on Lexical Entailment on HyperLex

Lexical Entailment Word Embeddings

Multimodal Word Distributions

2 code implementations • ACL 2017 • Ben Athiwaratkun, Andrew Gordon Wilson

Word embeddings provide point representations of words containing useful semantic information.

Word Embeddings Word Similarity

Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification

2 code implementations • TACL 2018 • Xilun Chen, Yu Sun, Ben Athiwaratkun, Claire Cardie, Kilian Weinberger

To tackle the sentiment classification problem in low-resource languages without adequate annotated data, we propose an Adversarial Deep Averaging Network (ADAN) to transfer the knowledge learned from labeled data on a resource-rich source language to low-resource languages where only unlabeled data exists.

Classification Cross-Lingual Document Classification +5

Feature Representation in Convolutional Neural Networks

no code implementations • 8 Jul 2015 • Ben Athiwaratkun, Keegan Kang

Our results show that CNN feature maps can be used with Random Forests and SVM to yield classification results that outperform the original CNN.

Classification General Classification +1
