no code implementations • 20 Feb 2024 • Zhengbao Jiang, Zhiqing Sun, Weijia Shi, Pedro Rodriguez, Chunting Zhou, Graham Neubig, Xi Victoria Lin, Wen-tau Yih, Srinivasan Iyer
The standard recipe for updating an LLM's factual knowledge involves continued pre-training on new documents followed by instruction-tuning on question-answer (QA) pairs.
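A minimal sketch of this two-stage recipe, assuming a Hugging Face causal LM; the model name, data, and hyperparameters below are placeholders rather than the paper's setup.

```python
# Sketch of the two-stage recipe: continued pre-training, then QA instruction-tuning.
# Model, corpus, and QA pairs are placeholders, not the paper's configuration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")           # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2")
opt = torch.optim.AdamW(model.parameters(), lr=1e-5)

def lm_step(text: str):
    """One causal-LM update on a piece of text (labels = inputs)."""
    batch = tok(text, return_tensors="pt", truncation=True)
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    opt.step()
    opt.zero_grad()

# Stage 1: continued pre-training on new documents.
for doc in ["<new document text>"]:                   # placeholder corpus
    lm_step(doc)

# Stage 2: instruction-tuning on QA pairs formatted as prompts.
for q, a in [("<question>", "<answer>")]:             # placeholder QA pairs
    lm_step(f"Question: {q}\nAnswer: {a}")
```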
1 code implementation • 12 Feb 2024 • Michael Duan, Anshuman Suri, Niloofar Mireshghallah, Sewon Min, Weijia Shi, Luke Zettlemoyer, Yulia Tsvetkov, Yejin Choi, David Evans, Hannaneh Hajishirzi
Membership inference attacks (MIAs) attempt to predict whether a particular datapoint is a member of a target model's training data.
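A minimal loss-thresholding attack as a concrete example of an MIA; the threshold and the assumption of access to per-example loss are illustrative, not the paper's setting.

```python
# Loss-thresholding MIA sketch: predict "member" when the target model's loss
# on a text is below a threshold tau. tau is a hypothetical, uncalibrated value.
import torch

def loss_threshold_mia(model, tokenizer, text: str, tau: float) -> bool:
    batch = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        loss = model(**batch, labels=batch["input_ids"]).loss.item()
    return loss < tau
```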
no code implementations • 1 Feb 2024 • Shangbin Feng, Weijia Shi, Yike Wang, Wenxuan Ding, Vidhisha Balachandran, Yulia Tsvetkov
Despite efforts to expand the knowledge of large language models (LLMs), knowledge gaps -- missing or outdated information in LLMs -- might always persist given the evolving nature of knowledge.
no code implementations • 25 Oct 2023 • Weijia Shi, Anirudh Ajith, Mengzhou Xia, Yangsibo Huang, Daogao Liu, Terra Blevins, Danqi Chen, Luke Zettlemoyer
Min-K% Prob can be applied without any knowledge about the pretraining corpus or any additional training, departing from previous detection methods that require training a reference model on data that is similar to the pretraining data.
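A sketch of the Min-K% Prob score as described: average the log-probabilities of the k% least-likely tokens under the model, and flag texts whose score exceeds a threshold as likely pretraining members. Simplified from the paper's evaluation setup.

```python
# Min-K% Prob sketch: mean log-probability of the k% lowest-probability tokens.
import torch

def min_k_prob(model, tokenizer, text: str, k: float = 0.2) -> float:
    ids = tokenizer(text, return_tensors="pt")["input_ids"]
    with torch.no_grad():
        logits = model(ids).logits
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
    token_lp = logprobs.gather(1, ids[0, 1:, None]).squeeze(1)  # per-token log p
    n = max(1, int(k * token_lp.numel()))
    return token_lp.topk(n, largest=False).values.mean().item()
```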
no code implementations • 16 Oct 2023 • Weijia Shi, Sewon Min, Maria Lomeli, Chunting Zhou, Margaret Li, Gergely Szilvasy, Rich James, Xi Victoria Lin, Noah A. Smith, Luke Zettlemoyer, Scott Yih, Mike Lewis
Large language models (LMs) are currently trained to predict tokens given document prefixes, enabling them to directly perform long-form generation and prompting-style tasks which can be reduced to document completion.
1 code implementation • 10 Oct 2023 • Yiheng Xu, Hongjin Su, Chen Xing, Boyu Mi, Qian Liu, Weijia Shi, Binyuan Hui, Fan Zhou, Yitao Liu, Tianbao Xie, Zhoujun Cheng, Siheng Zhao, Lingpeng Kong, Bailin Wang, Caiming Xiong, Tao Yu
We introduce Lemur and Lemur-Chat, openly accessible language models optimized for both natural language and coding capabilities to serve as the backbone of versatile language agents.
1 code implementation • 6 Oct 2023 • Fangyuan Xu, Weijia Shi, Eunsol Choi
Retrieving documents and prepending them in-context at inference time improves the performance of language models (LMs) on a wide range of tasks.
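A minimal retrieve-and-prepend sketch; `retriever.search` and `llm.generate` are hypothetical stand-ins for whatever retrieval index and LM interface are actually used.

```python
# Retrieve-and-prepend inference: fetch top-k passages for a query and place
# them in the prompt before the question. Interfaces are placeholders.
def answer_with_retrieval(query: str, retriever, llm, k: int = 3) -> str:
    docs = retriever.search(query, k=k)                 # top-k passages
    context = "\n\n".join(docs)
    prompt = f"{context}\n\nQuestion: {query}\nAnswer:"
    return llm.generate(prompt)
```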
1 code implementation • 2 Oct 2023 • Yike Wang, Shangbin Feng, Heng Wang, Weijia Shi, Vidhisha Balachandran, Tianxing He, Yulia Tsvetkov
To this end, we introduce KNOWLEDGE CONFLICT, an evaluation framework for simulating contextual knowledge conflicts and quantitatively evaluating to what extent LLMs achieve these goals.
no code implementations • 2 Oct 2023 • Xi Victoria Lin, Xilun Chen, Mingda Chen, Weijia Shi, Maria Lomeli, Rich James, Pedro Rodriguez, Jacob Kahn, Gergely Szilvasy, Mike Lewis, Luke Zettlemoyer, Scott Yih
Retrieval-augmented language models (RALMs) improve performance by accessing long-tail and up-to-date knowledge from external data stores, but are challenging to build.
no code implementations • NeurIPS 2023 • Zeqiu Wu, Yushi Hu, Weijia Shi, Nouha Dziri, Alane Suhr, Prithviraj Ammanabrolu, Noah A. Smith, Mari Ostendorf, Hannaneh Hajishirzi
We introduce Fine-Grained RLHF, a framework that enables training and learning from reward functions that are fine-grained in two respects: (1) density, providing a reward after every segment (e.g., a sentence) is generated; and (2) incorporating multiple reward models associated with different feedback types (e.g., factual incorrectness, irrelevance, and information incompleteness).
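A sketch of how per-segment rewards from multiple fine-grained reward models might be combined into one reward per sentence; the reward models and weights are hypothetical placeholders, not the paper's trained models.

```python
# Combine several feedback-type reward models (e.g., factuality, relevance,
# completeness) into one scalar reward per generated sentence.
from typing import Callable, List

def fine_grained_reward(
    sentences: List[str],
    reward_models: List[Callable[[str], float]],  # hypothetical reward functions
    weights: List[float],
) -> List[float]:
    return [
        sum(w * rm(sent) for rm, w in zip(reward_models, weights))
        for sent in sentences
    ]
```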
no code implementations • 24 May 2023 • Weijia Shi, Xiaochuang Han, Mike Lewis, Yulia Tsvetkov, Luke Zettlemoyer, Scott Wen-tau Yih
Language models (LMs) often struggle to pay enough attention to the input context, and generate texts that are unfaithful or contain hallucinations.
no code implementations • 24 May 2023 • Chenglei Si, Weijia Shi, Chen Zhao, Luke Zettlemoyer, Jordan Boyd-Graber
Beyond generalizability, the interpretable design of MoRE improves selective question answering results compared to baselines that do not incorporate inter-expert agreement.
2 code implementations • 17 May 2023 • Shangbin Feng, Weijia Shi, Yuyang Bai, Vidhisha Balachandran, Tianxing He, Yulia Tsvetkov
Ultimately, Knowledge Card framework enables dynamic synthesis and updates of knowledge from diverse domains.
1 code implementation • 24 Mar 2023 • Suchin Gururangan, Margaret Li, Mike Lewis, Weijia Shi, Tim Althoff, Noah A. Smith, Luke Zettlemoyer
Large language models are typically trained densely: all parameters are updated with respect to all inputs.
no code implementations • 21 Feb 2023 • Yangsibo Huang, Daogao Liu, Zexuan Zhong, Weijia Shi, Yin Tat Lee
Fine-tuning a language model on a new domain is standard practice for domain adaptation.
1 code implementation • 30 Jan 2023 • Weijia Shi, Sewon Min, Michihiro Yasunaga, Minjoon Seo, Rich James, Mike Lewis, Luke Zettlemoyer, Wen-tau Yih
We introduce REPLUG, a retrieval-augmented language modeling framework that treats the language model (LM) as a black box and augments it with a tuneable retrieval model.
Ranked #9 on Question Answering on Natural Questions
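A sketch of black-box retrieval augmentation in the spirit of REPLUG: each retrieved document is prepended to the input separately, and the resulting next-token distributions are ensembled with weights derived from the retrieval scores. Retriever tuning and other details are omitted.

```python
# Ensemble next-token distributions over separately prepended documents,
# weighting by softmax-normalized retrieval scores. Simplified sketch.
import torch

def ensembled_next_token_probs(model, tokenizer, query: str, docs, scores):
    weights = torch.softmax(torch.tensor(scores, dtype=torch.float), dim=0)
    mixture = None
    for doc, w in zip(docs, weights):
        ids = tokenizer(doc + "\n" + query, return_tensors="pt")["input_ids"]
        with torch.no_grad():
            probs = torch.softmax(model(ids).logits[0, -1], dim=-1)
        mixture = w * probs if mixture is None else mixture + w * probs
    return mixture  # a distribution over the vocabulary
```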
no code implementations • ICCV 2023 • Yushi Hu, Hang Hua, Zhengyuan Yang, Weijia Shi, Noah A. Smith, Jiebo Luo
PromptCap outperforms generic captions by a large margin and achieves state-of-the-art accuracy on knowledge-based VQA tasks (60.4% on OK-VQA and 59.6% on A-OKVQA).
no code implementations • 20 Dec 2022 • Weijia Shi, Xiaochuang Han, Hila Gonen, Ari Holtzman, Yulia Tsvetkov, Luke Zettlemoyer
Large language models can perform new tasks in a zero-shot fashion, given natural language prompts that specify the desired behavior.
3 code implementations • 19 Dec 2022 • Hongjin Su, Weijia Shi, Jungo Kasai, Yizhong Wang, Yushi Hu, Mari Ostendorf, Wen-tau Yih, Noah A. Smith, Luke Zettlemoyer, Tao Yu
Our analysis suggests that INSTRUCTOR is robust to changes in instructions, and that instruction finetuning mitigates the challenge of training a single model on diverse datasets.
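A small usage sketch, assuming the InstructorEmbedding package and the hkunlp/instructor-large checkpoint: the same sentence is embedded under two different task instructions and the embeddings compared, which is one way to probe the robustness mentioned above.

```python
# Embed one sentence under two instructions and compare the results.
# Assumes the InstructorEmbedding package; instructions are illustrative.
import numpy as np
from InstructorEmbedding import INSTRUCTOR

model = INSTRUCTOR("hkunlp/instructor-large")
text = "Fine-grained rewards improve long-form generation."
embs = model.encode([
    ["Represent the scientific sentence for retrieval:", text],
    ["Represent the sentence for clustering:", text],
])
cos = np.dot(embs[0], embs[1]) / (np.linalg.norm(embs[0]) * np.linalg.norm(embs[1]))
print(f"cosine similarity across instructions: {cos:.3f}")
```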
1 code implementation • 2 Dec 2022 • Sewon Min, Weijia Shi, Mike Lewis, Xilun Chen, Wen-tau Yih, Hannaneh Hajishirzi, Luke Zettlemoyer
Existing language models (LMs) predict tokens with a softmax over a finite vocabulary, which can make it difficult to predict rare tokens or phrases.
no code implementations • 22 Nov 2022 • Michihiro Yasunaga, Armen Aghajanyan, Weijia Shi, Rich James, Jure Leskovec, Percy Liang, Mike Lewis, Luke Zettlemoyer, Wen-tau Yih
To integrate knowledge in a more scalable and modular way, we propose a retrieval-augmented multimodal model, which enables a base multimodal model (generator) to refer to relevant text and images fetched by a retriever from external memory (e.g., documents on the web).
Ranked #7 on Image Captioning on MS COCO
1 code implementation • 15 Nov 2022 • Yushi Hu, Hang Hua, Zhengyuan Yang, Weijia Shi, Noah A. Smith, Jiebo Luo
PromptCap outperforms generic captions by a large margin and achieves state-of-the-art accuracy on knowledge-based VQA tasks (60.4% on OK-VQA and 59.6% on A-OKVQA).
Ranked #1 on Visual Question Answering on TextVQA test-standard
1 code implementation • 25 Oct 2022 • Victor Zhong, Weijia Shi, Wen-tau Yih, Luke Zettlemoyer
Moreover, existing models are not robust to variations in question constraints, but can be made more robust by tuning on clusters of related questions.
1 code implementation • 5 Sep 2022 • Hongjin Su, Jungo Kasai, Chen Henry Wu, Weijia Shi, Tianlu Wang, Jiayi Xin, Rui Zhang, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, Tao Yu
Departing from recent in-context learning methods, we formulate an annotation-efficient, two-step framework: selective annotation that chooses a pool of examples to annotate from unlabeled data in advance, followed by prompt retrieval that retrieves task examples from the annotated pool at test time.
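A sketch of the two-step framework: pick a small, diverse pool of unlabeled examples to annotate up front, then retrieve the most similar annotated examples as in-context prompts at test time. The greedy farthest-point selection used here is an illustrative stand-in, not the paper's exact selection algorithm.

```python
# Step 1: choose a diverse annotation pool; Step 2: retrieve nearest annotated
# examples per test input. Selection heuristic is illustrative only.
import numpy as np

def select_for_annotation(embs: np.ndarray, budget: int) -> list:
    chosen = [0]
    while len(chosen) < budget:
        dists = np.min(np.linalg.norm(embs[:, None] - embs[chosen][None], axis=-1), axis=1)
        chosen.append(int(dists.argmax()))             # farthest from current pool
    return chosen

def retrieve_prompts(test_emb: np.ndarray, pool_embs: np.ndarray, k: int) -> list:
    sims = pool_embs @ test_emb / (np.linalg.norm(pool_embs, axis=1) * np.linalg.norm(test_emb))
    return list(np.argsort(-sims)[:k])                 # indices of top-k annotated examples
```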
1 code implementation • 27 May 2022 • Weijia Shi, Julian Michael, Suchin Gururangan, Luke Zettlemoyer
Retrieval-augmented language models (LMs) use non-parametric memory to substantially outperform their non-retrieval counterparts on perplexity-based evaluations, but it is an open question whether they achieve similar gains in few- and zero-shot end-task accuracy.
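One common instantiation of such non-parametric memory is kNN-LM-style interpolation, sketched below; this is a simplified illustration rather than the exact method evaluated in the paper.

```python
# Interpolate the LM's next-token distribution with a distribution built from
# retrieved (context, next-token) neighbors. Simplified sketch.
import torch

def knn_interpolate(p_lm: torch.Tensor, neighbor_tokens, neighbor_dists,
                    vocab_size: int, lam: float = 0.25) -> torch.Tensor:
    """p(y|x) = (1 - lam) * p_LM + lam * p_kNN over retrieved neighbors."""
    weights = torch.softmax(-torch.tensor(neighbor_dists, dtype=torch.float), dim=0)
    p_knn = torch.zeros(vocab_size)
    for tok, w in zip(neighbor_tokens, weights):
        p_knn[tok] += w
    return (1 - lam) * p_lm + lam * p_knn
```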
1 code implementation • ACL 2021 • Weijia Shi, Mandar Joshi, Luke Zettlemoyer
Short textual descriptions of entities provide summaries of their key attributes and have been shown to be useful sources of background knowledge for tasks such as entity linking and question answering.
1 code implementation • 9 Jun 2021 • Weijia Shi, Mandar Joshi, Luke Zettlemoyer
Short textual descriptions of entities provide summaries of their key attributes and have been shown to be useful sources of background knowledge for tasks such as entity linking and question answering.
1 code implementation • EMNLP 2020 • Xingyu Fu, Weijia Shi, Xiaodong Yu, Zian Zhao, Dan Roth
Cross-lingual Entity Linking (XEL), the problem of grounding mentions of entities in a foreign language text into an English knowledge base such as Wikipedia, has seen a lot of research in recent years, with a range of promising techniques.
1 code implementation • EACL 2021 • Muhao Chen, Weijia Shi, Ben Zhou, Dan Roth
Much research effort has been devoted to multilingual knowledge graph (KG) embedding methods that address the entity alignment task, which seeks to match entities in different language-specific KGs that refer to the same real-world object.
Ranked #19 on Entity Alignment on DBP15k zh-en
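A sketch of how alignment is typically read off multilingual KG embeddings once entities from two language-specific KGs share an embedding space: each source entity is matched to its nearest target entity. The training objective itself is not shown.

```python
# Nearest-neighbor entity alignment in a shared embedding space (illustrative).
import numpy as np

def align_entities(src_embs: np.ndarray, tgt_embs: np.ndarray) -> np.ndarray:
    src = src_embs / np.linalg.norm(src_embs, axis=1, keepdims=True)
    tgt = tgt_embs / np.linalg.norm(tgt_embs, axis=1, keepdims=True)
    return (src @ tgt.T).argmax(axis=1)   # index of best-matching target entity
```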
no code implementations • 5 Apr 2020 • Weijia Shi, Andy Shih, Adnan Darwiche, Arthur Choi
We consider the compilation of a binary neural network's decision function into tractable representations such as Ordered Binary Decision Diagrams (OBDDs) and Sentential Decision Diagrams (SDDs).
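A toy illustration of the starting point for such a compilation: a single neuron over binary inputs defines a Boolean function, which can be enumerated exhaustively before being turned into a tractable representation such as an OBDD or SDD. The weights and threshold are made up, and real compilers avoid full enumeration.

```python
# A binary-input threshold neuron as a Boolean function, listed as a truth table.
from itertools import product

weights, threshold = [2, -1, 1], 1        # hypothetical binary neuron

def neuron(x):
    return int(sum(w * xi for w, xi in zip(weights, x)) >= threshold)

truth_table = {x: neuron(x) for x in product([0, 1], repeat=len(weights))}
for assignment, output in truth_table.items():
    print(assignment, "->", output)
```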
no code implementations • IJCNLP 2019 • Weijia Shi, Muhao Chen, Pei Zhou, Kai-Wei Chang
Contextualized word embedding models, such as ELMo, generate meaningful representations of words and their context.
1 code implementation • IJCNLP 2019 • Pei Zhou, Weijia Shi, Jieyu Zhao, Kuan-Hao Huang, Muhao Chen, Ryan Cotterell, Kai-Wei Chang
Recent studies have shown that word embeddings exhibit gender bias inherited from the training corpora.
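One common way to quantify this kind of bias (not necessarily the paper's exact metric) is to project word vectors onto a gender direction such as he - she.

```python
# Bias score of a word as its normalized projection onto the he - she direction.
import numpy as np

def gender_bias_score(word_vec: np.ndarray, he_vec: np.ndarray, she_vec: np.ndarray) -> float:
    direction = he_vec - she_vec
    direction = direction / np.linalg.norm(direction)
    return float(np.dot(word_vec, direction) / np.linalg.norm(word_vec))
```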
no code implementations • WS 2019 • Weijia Shi, Muhao Chen, Yingtao Tian, Kai-Wei Chang
Bilingual word embeddings, which represent lexicons of different languages in a shared embedding space, are essential for supporting semantic and knowledge transfers in a variety of cross-lingual NLP tasks.
1 code implementation • 26 Nov 2018 • Xuelu Chen, Muhao Chen, Weijia Shi, Yizhou Sun, Carlo Zaniolo
However, many KGs model uncertain knowledge, typically representing the inherent uncertainty of relation facts with a confidence score, and embedding such uncertain knowledge remains an unresolved challenge.
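A sketch of the setting described above: each fact carries a confidence score, and an embedding model is trained so that its plausibility score for (head, relation, tail) matches that confidence. The bilinear scoring function and MSE loss are illustrative placeholders, not the paper's model.

```python
# Fit embedding-based plausibility scores to per-fact confidence values.
import torch

def plausibility(h: torch.Tensor, r: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    """Map an embedding score into [0, 1] so it is comparable to a confidence."""
    return torch.sigmoid((h * r * t).sum(-1))

def uncertain_kg_loss(h, r, t, confidence: torch.Tensor) -> torch.Tensor:
    return ((plausibility(h, r, t) - confidence) ** 2).mean()
```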