Search Results for author: Kevin Clark

Found 17 papers, 10 papers with code

Directly Fine-Tuning Diffusion Models on Differentiable Rewards

no code implementations • 29 Sep 2023 • Kevin Clark, Paul Vicol, Kevin Swersky, David J Fleet

We present Direct Reward Fine-Tuning (DRaFT), a simple and effective method for fine-tuning diffusion models to maximize differentiable reward functions, such as scores from human preference models.
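
A minimal sketch of the idea with toy stand-ins (ToyDenoiser, toy_reward, and the short deterministic sampling loop are illustrative, not the paper's models or code): sample while keeping the computation graph, score the sample with a differentiable reward, and update the denoiser by gradient ascent on that reward.

```python
# Sketch of reward fine-tuning through a differentiable sampling loop.
# The denoiser, reward, and schedule below are toy stand-ins.
import torch
import torch.nn as nn

class ToyDenoiser(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 64), nn.ReLU(), nn.Linear(64, dim))

    def forward(self, x, t):
        # Predict the noise in x at (scalar) timestep t.
        t_feat = torch.full((x.shape[0], 1), float(t))
        return self.net(torch.cat([x, t_feat], dim=-1))

def toy_reward(x):
    # Stand-in for a differentiable reward (e.g. a human-preference score).
    return -(x ** 2).mean(dim=-1)

denoiser = ToyDenoiser()
opt = torch.optim.Adam(denoiser.parameters(), lr=1e-4)
num_steps, dim, batch = 10, 32, 8

for it in range(100):
    x = torch.randn(batch, dim)              # start from pure noise
    for t in reversed(range(1, num_steps + 1)):
        eps = denoiser(x, t / num_steps)     # keep the graph: no torch.no_grad()
        x = x - eps / num_steps              # toy deterministic update
    loss = -toy_reward(x).mean()             # maximize reward = minimize its negative
    opt.zero_grad()
    loss.backward()                          # gradients flow through all sampling steps
    opt.step()
```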

Intriguing properties of generative classifiers

1 code implementation • 28 Sep 2023 • Priyank Jaini, Kevin Clark, Robert Geirhos

What is the best paradigm to recognize objects -- discriminative inference (fast but potentially prone to shortcut learning) or using a generative model (slow but potentially more robust)?

Object Recognition

Text-to-Image Diffusion Models are Zero-Shot Classifiers

no code implementations • 27 Mar 2023 • Kevin Clark, Priyank Jaini

The key idea is using a diffusion model's ability to denoise a noised image given a text description of a label as a proxy for that label's likelihood.
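
A rough sketch of that procedure, assuming hypothetical `cond_denoiser` and `embed_prompt` stand-ins rather than a real text-to-image model: noise the image, ask the denoiser to recover the noise under each candidate label's prompt, and pick the label with the lowest error.

```python
# Sketch: score each label by how well a text-conditional denoiser predicts the
# added noise when conditioned on that label's prompt; lowest error wins.
# `cond_denoiser` and `embed_prompt` are illustrative stand-ins, not real APIs.
import torch

def classify(image, label_prompts, cond_denoiser, embed_prompt, n_trials=8):
    errors = []
    for prompt in label_prompts:
        cond = embed_prompt(prompt)
        total = 0.0
        for _ in range(n_trials):
            t = torch.rand(())                      # random noise level in (0, 1)
            noise = torch.randn_like(image)
            noised = (1 - t) * image + t * noise    # toy noising schedule
            pred = cond_denoiser(noised, t, cond)   # predict the added noise
            total += ((pred - noise) ** 2).mean().item()
        errors.append(total / n_trials)
    return int(torch.tensor(errors).argmin())       # index of best-fitting label

if __name__ == "__main__":
    # Toy stand-ins so the sketch runs end to end.
    dummy_denoiser = lambda x, t, c: torch.zeros_like(x)
    dummy_embed = lambda p: torch.zeros(4)
    img = torch.randn(3, 8, 8)
    print(classify(img, ["a photo of a cat", "a photo of a dog"], dummy_denoiser, dummy_embed))
```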

Attribute • Contrastive Learning • +2

Meta-Learning Fast Weight Language Models

no code implementations • 5 Dec 2022 • Kevin Clark, Kelvin Guu, Ming-Wei Chang, Panupong Pasupat, Geoffrey Hinton, Mohammad Norouzi

Dynamic evaluation of language models (LMs) adapts model parameters at test time using gradient information from previous tokens and substantially improves LM performance.
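
A minimal sketch of plain dynamic evaluation (the baseline described here, not the paper's meta-learned fast-weight update), using a toy LSTM language model: each chunk of a test document is scored first, then used for a gradient step so later chunks benefit from the adapted weights.

```python
# Sketch of dynamic evaluation: take gradient steps on earlier tokens of a test
# document before predicting later ones. Toy LM; not the paper's method.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, chunk = 100, 16
lm = nn.Sequential(nn.Embedding(vocab, 32), nn.LSTM(32, 64, batch_first=True))
head = nn.Linear(64, vocab)
opt = torch.optim.SGD(list(lm.parameters()) + list(head.parameters()), lr=1e-2)

tokens = torch.randint(0, vocab, (1, 128))       # a "test document"
total_nll, count = 0.0, 0
for start in range(0, tokens.shape[1] - chunk, chunk):
    x = tokens[:, start:start + chunk]
    y = tokens[:, start + 1:start + chunk + 1]
    hidden, _ = lm(x)
    logits = head(hidden)
    loss = F.cross_entropy(logits.reshape(-1, vocab), y.reshape(-1))
    total_nll += loss.item() * y.numel()         # evaluate this chunk first...
    count += y.numel()
    opt.zero_grad()
    loss.backward()                              # ...then adapt the weights on it
    opt.step()                                   # so later chunks see the update
print("dynamic-eval NLL per token:", total_nll / count)
```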

Language Modelling • Meta-Learning

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

17 code implementations • ICLR 2020 • Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning

Then, instead of training a model that predicts the original identities of the corrupted tokens, we train a discriminative model that predicts whether each token in the corrupted input was replaced by a generator sample or not.
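
A toy sketch of that replaced-token-detection objective (tiny stand-in generator and discriminator, not the released ELECTRA code): the generator fills in masked positions, and the discriminator labels every token of the corrupted sequence as original or replaced.

```python
# Sketch of ELECTRA-style replaced token detection with toy models.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, dim, seq, batch = 1000, 64, 32, 4
generator = nn.Sequential(nn.Embedding(vocab, dim), nn.Linear(dim, vocab))
discriminator = nn.Sequential(nn.Embedding(vocab, dim), nn.Linear(dim, 1))

tokens = torch.randint(0, vocab, (batch, seq))
mask = torch.rand(batch, seq) < 0.15                 # mask ~15% of positions
masked = tokens.clone()
masked[mask] = 0                                     # id 0 plays the role of [MASK]

gen_logits = generator(masked)                       # (batch, seq, vocab)
samples = torch.distributions.Categorical(logits=gen_logits).sample()
corrupted = torch.where(mask, samples, tokens)       # replace only masked positions
is_replaced = (corrupted != tokens).float()          # the generator may guess correctly

# Generator trained with a masked-LM loss on the masked positions.
gen_loss = F.cross_entropy(gen_logits[mask], tokens[mask])
# Discriminator trained to detect replacements over *all* positions.
disc_logits = discriminator(corrupted).squeeze(-1)   # (batch, seq)
disc_loss = F.binary_cross_entropy_with_logits(disc_logits, is_replaced)
loss = gen_loss + 50.0 * disc_loss                   # weighting like the paper's lambda
print(loss.item())
```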

Language Modelling • Masked Language Modeling • +3

What Does BERT Look At? An Analysis of BERT's Attention

2 code implementations • WS 2019 • Kevin Clark, Urvashi Khandelwal, Omer Levy, Christopher D. Manning

Large pre-trained neural networks such as BERT have had great recent success in NLP, motivating a growing body of research investigating what aspects of language they are able to learn from unlabeled data.

Language Modelling • Sentence

Sample Efficient Text Summarization Using a Single Pre-Trained Transformer

2 code implementations • 21 May 2019 • Urvashi Khandelwal, Kevin Clark, Dan Jurafsky, Lukasz Kaiser

Language model (LM) pre-training has resulted in impressive performance and sample efficiency on a variety of language understanding tasks.

Ranked #1 on Text Summarization on DUC 2004 Task 1 (ROUGE-2 metric)

Abstractive Text Summarization • Language Modelling

Semi-Supervised Sequence Modeling with Cross-View Training

2 code implementations • EMNLP 2018 • Kevin Clark, Minh-Thang Luong, Christopher D. Manning, Quoc V. Le

We therefore propose Cross-View Training (CVT), a semi-supervised learning algorithm that improves the representations of a Bi-LSTM sentence encoder using a mix of labeled and unlabeled data.
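
A toy sketch of a CVT-style objective, with a GRU standing in for the Bi-LSTM and two illustrative restricted-view students (forward-only and backward-only halves): a supervised loss on labeled data plus a consistency loss that pushes student predictions toward the full-view prediction on unlabeled data. This is an assumption-laden sketch, not the released CVT code.

```python
# Sketch of a CVT-style objective: supervised loss on labeled data plus a
# consistency loss matching restricted-view "students" to the full-view prediction.
import torch
import torch.nn as nn
import torch.nn.functional as F

num_classes, dim = 5, 16
encoder = nn.GRU(dim, 32, batch_first=True, bidirectional=True)  # stands in for the Bi-LSTM
teacher_head = nn.Linear(64, num_classes)                        # sees the full representation
student_fwd = nn.Linear(32, num_classes)                         # sees only the forward half
student_bwd = nn.Linear(32, num_classes)                         # sees only the backward half

def losses(labeled_x, labels, unlabeled_x):
    # Supervised loss on labeled data (full view).
    h_lab, _ = encoder(labeled_x)
    sup = F.cross_entropy(teacher_head(h_lab).transpose(1, 2), labels)

    # Consistency loss on unlabeled data: students match the frozen full-view targets.
    h_unl, _ = encoder(unlabeled_x)
    with torch.no_grad():
        target = F.softmax(teacher_head(h_unl), dim=-1)
    cons = 0.0
    for student, view in [(student_fwd, h_unl[..., :32]), (student_bwd, h_unl[..., 32:])]:
        logp = F.log_softmax(student(view), dim=-1)
        cons = cons + F.kl_div(logp, target, reduction="batchmean")
    return sup + cons

x_lab = torch.randn(2, 7, dim)
y_lab = torch.randint(0, num_classes, (2, 7))
x_unl = torch.randn(4, 7, dim)
print(losses(x_lab, y_lab, x_unl).item())
```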

CCG Supertagging • Dependency Parsing • +7

Cross-View Training for Semi-Supervised Learning

no code implementations • ICLR 2018 • Kevin Clark, Thang Luong, Quoc V. Le

The students can learn from the teacher (the full model) because the teacher sees more of each example.

Ranked #4 on Chunking on CoNLL 2000 (using extra training data)

Chunking

Improving Coreference Resolution by Learning Entity-Level Distributed Representations

1 code implementation • ACL 2016 • Kevin Clark, Christopher D. Manning

A long-standing challenge in coreference resolution has been the incorporation of entity-level information - features defined over clusters of mentions instead of mention pairs.
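
A toy sketch of one way to build such an entity-level feature (illustrative sizes and modules, not the paper's architecture): pool the mention-pair representations between two clusters into a single cluster-pair representation before deciding whether the clusters should merge.

```python
# Sketch of an entity-level (cluster-pair) representation: pool mention-pair
# representations across two clusters instead of scoring a single mention pair.
import torch
import torch.nn as nn

pair_dim = 32
mention_pair_encoder = nn.Sequential(nn.Linear(2 * 8, pair_dim), nn.ReLU())
cluster_pair_scorer = nn.Linear(2 * pair_dim, 1)     # uses max- and mean-pooled features

def cluster_pair_score(cluster_a, cluster_b):
    # cluster_a: (n_a, 8), cluster_b: (n_b, 8) mention embeddings.
    pairs = torch.cat(
        [cluster_a.unsqueeze(1).expand(-1, cluster_b.shape[0], -1),
         cluster_b.unsqueeze(0).expand(cluster_a.shape[0], -1, -1)], dim=-1)
    rep = mention_pair_encoder(pairs.reshape(-1, 16))        # all mention-pair reps
    pooled = torch.cat([rep.max(dim=0).values, rep.mean(dim=0)])
    return cluster_pair_scorer(pooled)                       # should the clusters merge?

a = torch.randn(3, 8)   # a cluster with 3 mentions
b = torch.randn(2, 8)   # a cluster with 2 mentions
print(cluster_pair_score(a, b).item())
```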

Coreference Resolution
