Search Results for author: Haokun Liu

Found 22 papers, 11 papers with code

Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models

no code implementations • 8 Apr 2024 • Bowen Pan, Yikang Shen, Haokun Liu, Mayank Mishra, Gaoyuan Zhang, Aude Oliva, Colin Raffel, Rameswar Panda

Mixture-of-Experts (MoE) language models can reduce computational costs by 2-4× compared to dense models without sacrificing performance, making them more efficient in computation-bounded scenarios.

Learning to Route Among Specialized Experts for Zero-Shot Generalization

1 code implementation • 8 Feb 2024 • Mohammed Muqeeth, Haokun Liu, Yufan Liu, Colin Raffel

Unlike past methods that learn to route among specialized models, PHATGOOSE explores the possibility that zero-shot generalization will be improved if different experts can be adaptively chosen for each token and at each layer in the model.

Zero-shot Generalization
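
As a rough illustration of the idea described in the excerpt above (adaptively choosing among specialized experts for each token at each layer), the sketch below shows a generic per-token top-k router in PyTorch. The gate design, the top_k value, and the expert interface are assumptions made for the sketch, not the PHATGOOSE mechanism itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerTokenRouter(nn.Module):
    """Generic per-token, per-layer routing over specialized expert modules.
    Illustrative only; not the PHATGOOSE gating mechanism."""

    def __init__(self, hidden_dim, experts, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(experts)              # e.g. specialized adapter modules
        self.gate = nn.Linear(hidden_dim, len(self.experts), bias=False)
        self.top_k = top_k

    def forward(self, hidden_states):                      # (batch, seq, hidden)
        logits = self.gate(hidden_states)                  # one routing decision per token
        weights, idx = logits.topk(self.top_k, dim=-1)     # keep the top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(hidden_states)
        # For clarity every expert is run densely and masked; a real implementation
        # would dispatch only the tokens routed to each expert.
        for e, expert in enumerate(self.experts):
            expert_out = expert(hidden_states)
            for k in range(self.top_k):
                mask = (idx[..., k] == e).unsqueeze(-1).float()
                out = out + mask * weights[..., k:k + 1] * expert_out
        return out

# Example: eight stand-in experts at one layer of a 1024-dim model.
# router = PerTokenRouter(1024, [nn.Linear(1024, 1024) for _ in range(8)])
```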

LLM-Based Human-Robot Collaboration Framework for Manipulation Tasks

no code implementations • 29 Aug 2023 • Haokun Liu, Yaonan Zhu, Kenji Kato, Izumi Kondo, Tadayoshi Aoyama, Yasuhisa Hasegawa

This paper presents a novel approach to enhancing autonomous robotic manipulation by using a Large Language Model (LLM) for logical inference, converting high-level language commands into sequences of executable motion functions.

Language Modelling • Large Language Model
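
Purely as an illustration of turning LLM output into executable motion functions, the sketch below dispatches a hypothetical plan string over a small library of motion primitives. The primitive names and the plan format are invented for the example and are not the paper's interface.

```python
# Hypothetical mapping from an LLM-produced plan to executable motion primitives.
# The primitive names and the newline-separated plan format are invented here.
MOTION_LIBRARY = {
    "move_to": lambda target: print(f"moving to {target}"),
    "grasp": lambda target: print(f"grasping {target}"),
    "release": lambda target: print(f"releasing {target}"),
}

def execute_plan(llm_output: str) -> None:
    """Execute a newline-separated plan one primitive at a time."""
    for step in llm_output.strip().splitlines():
        name, _, arg = step.partition(" ")
        if name not in MOTION_LIBRARY:
            raise ValueError(f"unknown motion primitive: {name}")
        MOTION_LIBRARY[name](arg)

execute_plan("move_to cup\ngrasp cup\nmove_to table\nrelease cup")
```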

Soft Merging of Experts with Adaptive Routing

no code implementations • 6 Jun 2023 • Mohammed Muqeeth, Haokun Liu, Colin Raffel

To address this issue, we introduce Soft Merging of Experts with Adaptive Routing (SMEAR), which avoids discrete routing by using a single "merged" expert constructed via a weighted average of all of the experts' parameters.
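
The merging step described above can be sketched directly: compute a routing distribution, build one merged expert as the weighted average of all experts' parameters, and apply it. In the PyTorch sketch below, the choice of linear experts and of mean-pooling the input to the router are simplifying assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftMergedExperts(nn.Module):
    """SMEAR-style soft merging sketch: route with a probability distribution,
    then run a single expert built from the weighted average of expert parameters."""

    def __init__(self, hidden_dim, num_experts):
        super().__init__()
        self.experts = nn.ModuleList([nn.Linear(hidden_dim, hidden_dim)
                                      for _ in range(num_experts)])
        self.router = nn.Linear(hidden_dim, num_experts)

    def forward(self, x):                                       # x: (batch, seq, hidden)
        # Routing distribution per example; mean-pooling the input is an assumption here.
        probs = F.softmax(self.router(x.mean(dim=1)), dim=-1)   # (batch, num_experts)
        weight = torch.stack([e.weight for e in self.experts])  # (E, out, in)
        bias = torch.stack([e.bias for e in self.experts])      # (E, out)
        merged_w = torch.einsum("be,eoi->boi", probs, weight)   # per-example merged weights
        merged_b = torch.einsum("be,eo->bo", probs, bias)
        return torch.einsum("bsi,boi->bso", x, merged_w) + merged_b.unsqueeze(1)
```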

Beyond Ensemble Averages: Leveraging Climate Model Ensembles for Subseasonal Forecasting

1 code implementation • 29 Nov 2022 • Elena Orlova, Haokun Liu, Raphael Rossellini, Benjamin Cash, Rebecca Willett

This study explores an application of machine learning (ML) models as post-processing tools for subseasonal forecasting.

Feature Importance • regression

Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning

2 code implementations • 11 May 2022 • Haokun Liu, Derek Tam, Mohammed Muqeeth, Jay Mohta, Tenghao Huang, Mohit Bansal, Colin Raffel

In-context learning (ICL) incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made.

Few-Shot Text Classification • In-Context Learning
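
This is the paper that proposes the (IA)^3 parameter-efficient fine-tuning method as a cheaper alternative to ICL. The sketch below shows (IA)^3-style rescaling in isolation: a small learned vector that elementwise scales a frozen layer's output. It is a simplified illustration; the paper applies such vectors to attention keys, values, and feed-forward activations inside a frozen Transformer.

```python
import torch
import torch.nn as nn

class IA3Linear(nn.Module):
    """Wrap a frozen linear layer with a learned (IA)^3-style rescaling vector.
    Simplified sketch, not the paper's full placement of the vectors."""

    def __init__(self, base_linear: nn.Linear):
        super().__init__()
        self.base = base_linear
        for p in self.base.parameters():
            p.requires_grad = False                      # base weights stay frozen
        self.scale = nn.Parameter(torch.ones(base_linear.out_features))

    def forward(self, x):
        return self.base(x) * self.scale                 # elementwise activation rescaling

# Only the rescaling vectors are trained, so the per-task storage footprint is tiny
# compared with keeping full fine-tuned copies of the model.
```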

Fine-Tuned Transformers Show Clusters of Similar Representations Across Layers

no code implementations • EMNLP (BlackboxNLP) 2021 • Jason Phang, Haokun Liu, Samuel R. Bowman

Despite the success of fine-tuning pretrained language encoders like BERT for downstream natural language understanding (NLU) tasks, it is still poorly understood how neural networks change after fine-tuning.

Natural Language Understanding

Comparing Test Sets with Item Response Theory

no code implementations • ACL 2021 • Clara Vania, Phu Mon Htut, William Huang, Dhara Mungra, Richard Yuanzhe Pang, Jason Phang, Haokun Liu, Kyunghyun Cho, Samuel R. Bowman

Recent years have seen numerous NLP datasets introduced to evaluate the performance of fine-tuned models on natural language understanding tasks.

Natural Language Understanding

Learning Which Features Matter: RoBERTa Acquires a Preference for Linguistic Generalizations (Eventually)

1 code implementation • EMNLP 2020 • Alex Warstadt, Yian Zhang, Haau-Sing Li, Haokun Liu, Samuel R. Bowman

One reason pretraining on self-supervised linguistic tasks is effective is that it teaches models features that are helpful for language understanding.

Binary Classification

Counterfactually-Augmented SNLI Training Data Does Not Yield Better Generalization Than Unaugmented Data

1 code implementation • EMNLP (insights) 2020 • William Huang, Haokun Liu, Samuel R. Bowman

A growing body of work shows that models exploit annotation artifacts to achieve state-of-the-art performance on standard crowdsourced benchmarks (datasets collected from crowdworkers to create an evaluation task) while still failing on out-of-domain examples for the same task.

counterfactual • Natural Language Inference +2

Precise Task Formalization Matters in Winograd Schema Evaluations

1 code implementation • EMNLP 2020 • Haokun Liu, William Huang, Dhara A. Mungra, Samuel R. Bowman

Performance on the Winograd Schema Challenge (WSC), a respected English commonsense reasoning benchmark, recently rocketed from chance accuracy to 89% on the SuperGLUE leaderboard, with relatively little corroborating evidence of a correspondingly large improvement in reasoning ability.

Language Modelling • Multiple-choice

English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too

no code implementations • AACL 2020 • Jason Phang, Iacer Calixto, Phu Mon Htut, Yada Pruksachatkun, Haokun Liu, Clara Vania, Katharina Kann, Samuel R. Bowman

Intermediate-task training (fine-tuning a pretrained model on an intermediate task before fine-tuning again on the target task) often improves model performance substantially on language understanding tasks in monolingual English settings.

Question Answering • Retrieval +3
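
The recipe is sequential fine-tuning, sketched below in generic PyTorch. The `.loss`-returning model interface, the data loaders, and the hyperparameters are placeholder assumptions rather than the paper's setup.

```python
import torch

def fine_tune(model, loader, epochs=3, lr=2e-5):
    """One fine-tuning stage; assumes a model whose forward pass returns a .loss."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for batch in loader:
            optimizer.zero_grad()
            loss = model(**batch).loss
            loss.backward()
            optimizer.step()
    return model

# Intermediate-task training: fine-tune on the English intermediate task first,
# then fine-tune the same weights again on the target task (here, a zero-shot
# cross-lingual target). `model` and the loaders are placeholders.
# model = fine_tune(model, intermediate_task_loader)
# model = fine_tune(model, target_task_loader)
```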

BLiMP: The Benchmark of Linguistic Minimal Pairs for English

4 code implementations • TACL 2020 • Alex Warstadt, Alicia Parrish, Haokun Liu, Anhad Mohananey, Wei Peng, Sheng-Fu Wang, Samuel R. Bowman

We introduce The Benchmark of Linguistic Minimal Pairs (shortened to BLiMP), a challenge set for evaluating what language models (LMs) know about major grammatical phenomena in English.
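
A minimal-pairs benchmark is typically scored by checking whether the LM assigns higher probability to the acceptable sentence of each pair. A short sketch with a Hugging Face causal LM is below; the choice of GPT-2 and the example pair are illustrative, not BLiMP's official evaluation code.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")             # illustrative choice of LM
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def sentence_logprob(sentence: str) -> float:
    """Total log-probability of a sentence under the causal LM."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    logprobs = torch.log_softmax(logits[:, :-1], dim=-1)      # predict token t+1 from its prefix
    targets = ids[:, 1:]
    return logprobs.gather(-1, targets.unsqueeze(-1)).sum().item()

# The LM "passes" an item if it prefers the acceptable member of the minimal pair.
good, bad = "The cats annoy Tim.", "The cats annoys Tim."
print(sentence_logprob(good) > sentence_logprob(bad))
```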

MEMD: A Diversity-Promoting Learning Framework for Short-Text Conversation

no code implementations • COLING 2018 • Meng Zou, Xihan Li, Haokun Liu, Zhi-Hong Deng

Neural encoder-decoder models have been widely applied to conversational response generation, which has been an active area of research in recent years.

Conversational Response Generation • Response Generation +1
