1 code implementation • 7 Sep 2023 • Erik Nijkamp, Tian Xie, Hiroaki Hayashi, Bo Pang, Congying Xia, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil Purushwalkam, Tong Niu, Wojciech Kryściński, Lidiya Murakhovs'ka, Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat, Chien-Sheng Wu, Silvio Savarese, Yingbo Zhou, Shafiq Joty, Caiming Xiong
Most open-source LLMs, however, are limited in their ability to support longer sequence lengths, which is a key requirement for many tasks that require inference over a long input context.
no code implementations • 29 Sep 2021 • Ben Krause, Nikhil Naik, Wenhao Liu, Ali Madani
Predicting the fitness, i.e., the functional value, of a protein sequence is an important and challenging task in biology, particularly due to the scarcity of assay-labeled data.
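As background for this task, a common zero-shot baseline scores each candidate sequence by its log-likelihood under a pretrained protein language model, on the intuition that higher-likelihood sequences tend to be more functional. A minimal sketch of such a scorer, assuming the per-token log-probabilities come from some protein LM (a generic baseline, not necessarily this paper's method):

```python
import numpy as np

def fitness_proxy(log_probs: np.ndarray, token_ids: np.ndarray) -> float:
    """Score a sequence by mean per-token log-likelihood under an LM.

    log_probs: (T, V) array of log p(x_t = v | x_<t) from any protein LM.
    token_ids: (T,) array of the observed amino-acid token ids.
    """
    per_token = log_probs[np.arange(len(token_ids)), token_ids]
    return float(per_token.mean())  # higher => predicted more fit

# Usage: rank candidate variants by their proxy scores, e.g.
# scores = [fitness_proxy(lm(seq), seq_ids) for seq, seq_ids in variants]
```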
1 code implementation • NeurIPS 2021 • Alvin Chan, Ali Madani, Ben Krause, Nikhil Naik
Attribute extrapolation in sample generation is challenging for deep neural networks operating beyond the training distribution.
no code implementations • 18 Oct 2020 • Nazneen Fatema Rajani, Ben Krause, Wenpeng Yin, Tong Niu, Richard Socher, Caiming Xiong
Interpretability techniques in NLP have mainly focused on understanding individual predictions using attention visualization or gradient-based saliency maps over tokens.
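For reference, a gradient-based saliency map over tokens can be computed by taking the gradient of the predicted class score with respect to the token embeddings and reducing it to one scalar per token. A minimal PyTorch sketch, with a toy mean-pool classifier standing in for a real model (an assumption for illustration):

```python
import torch
import torch.nn as nn

# Toy classifier: mean-pooled embeddings -> linear head (illustrative only).
vocab, dim, n_classes = 1000, 64, 2
embed = nn.Embedding(vocab, dim)
head = nn.Linear(dim, n_classes)

tokens = torch.tensor([[12, 845, 3, 99]])          # (1, T) token ids
emb = embed(tokens).detach().requires_grad_(True)  # hook gradients at the embeddings
logits = head(emb.mean(dim=1))                     # (1, n_classes)
logits[0, logits.argmax()].backward()              # d(predicted score)/d(embeddings)

saliency = emb.grad.norm(dim=-1).squeeze(0)        # one L2 score per token
print(saliency)                                    # larger norm ~= more influential token
```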
3 code implementations • Findings (EMNLP) 2021 • Ben Krause, Akhilesh Deepak Gotmare, Bryan McCann, Nitish Shirish Keskar, Shafiq Joty, Richard Socher, Nazneen Fatema Rajani
While large-scale language models (LMs) are able to imitate the distribution of natural language well enough to generate realistic text, it is difficult to control which regions of the distribution they generate.
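One widely used family of remedies steers a frozen LM at decoding time: by Bayes' rule, p(x_t | x_{<t}, c) is proportional to p_LM(x_t | x_{<t}) * p(c | x_{1:t})^w for a desired attribute c, so next-token probabilities can be reweighted with an attribute model. A minimal sketch of that reweighting, where the attribute log-probabilities and the weight w are assumed inputs (a generic recipe, not a full reproduction of this paper's method):

```python
import numpy as np

def steer_next_token(lm_log_probs, attr_log_probs, w=1.0):
    """Reweight next-token log-probs toward a desired attribute.

    lm_log_probs:   (V,) log p(x_t | x_<t) from the base LM.
    attr_log_probs: (V,) log p(attribute | x_{1:t-1}, x_t) per candidate token.
    """
    scores = lm_log_probs + w * attr_log_probs  # Bayes-rule combination in log space
    scores -= scores.max()                      # subtract max for numerical stability
    probs = np.exp(scores)
    return probs / probs.sum()                  # renormalized next-token distribution

# Sample the next token from the steered distribution:
# probs = steer_next_token(lm_lp, attr_lp, w=2.0)
# x_t = np.random.choice(len(probs), p=probs)
```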
no code implementations • 10 Feb 2020 • Yu Bai, Ben Krause, Huan Wang, Caiming Xiong, Richard Socher
We propose "Taylorized training" as an initiative towards better understanding neural network training at finite width.
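Concretely, Taylorized training trains the k-th order Taylor expansion of the network around its initialization, with k = 1 recovering linearized (NTK-style) training. As a sketch, using standard Taylor notation (assumed here, since the snippet above does not spell it out):

```latex
f^{(k)}(x;\theta)
  \;=\; \sum_{j=0}^{k} \frac{1}{j!}\,
        \frac{\mathrm{d}^{j}}{\mathrm{d}s^{j}}\,
        f\bigl(x;\,\theta_0 + s(\theta - \theta_0)\bigr)\Big|_{s=0}
```

For k = 1 this reduces to f(x; theta_0) plus the first-order term in (theta - theta_0), i.e. the linearized model.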
1 code implementation • 17 Apr 2019 • Ben Krause, Emmanuel Kahembwe, Iain Murray, Steve Renals
This research note combines two methods that have recently improved the state of the art in language modeling: Transformers and dynamic evaluation (a sketch of the latter follows this entry).
Ranked #1 on Language Modelling on Hutter Prize
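Dynamic evaluation adapts the model's parameters to the recent sequence history with gradient descent while evaluating, so the model can exploit local, re-occurring patterns at test time. A minimal PyTorch-style sketch of the core loop (plain SGD here; the papers use more refined update rules, so treat this as an illustration of the idea only):

```python
import torch

def dynamic_eval(model, loss_fn, segments, lr=1e-4):
    """Evaluate over consecutive segments, updating parameters after each one.

    segments: iterable of (inputs, targets) covering the test sequence in order.
    Returns the mean test loss, scored before each adaptation step.
    """
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    total, n = 0.0, 0
    for inputs, targets in segments:
        loss = loss_fn(model(inputs), targets)
        total, n = total + loss.item(), n + 1  # score first (this counts as test loss)...
        opt.zero_grad()
        loss.backward()                        # ...then adapt on the same segment
        opt.step()
    return total / max(n, 1)
```

Because the parameters drift during evaluation, they should be restored from a saved checkpoint before the model is reused.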
1 code implementation • 18 Sep 2018 • Joachim Fainberg, Ben Krause, Mihai Dobre, Marco Damonte, Emmanuel Kahembwe, Daniel Duma, Bonnie Webber, Federico Fancellu
Conversational agents are gaining popularity with the increasing ubiquity of smart devices.
no code implementations • 28 Sep 2017 • Ben Krause, Marco Damonte, Mihai Dobre, Daniel Duma, Joachim Fainberg, Federico Fancellu, Emmanuel Kahembwe, Jianpeng Cheng, Bonnie Webber
We present Edina, the University of Edinburgh's social bot for the Amazon Alexa Prize competition.
3 code implementations • ICML 2018 • Ben Krause, Emmanuel Kahembwe, Iain Murray, Steve Renals
We present methodology for using dynamic evaluation to improve neural sequence models.
Ranked #10 on Language Modelling on Hutter Prize
1 code implementation • 26 Sep 2016 • Ben Krause, Liang Lu, Iain Murray, Steve Renals
We introduce multiplicative LSTM (mLSTM), a recurrent neural network architecture for sequence modelling that combines the long short-term memory (LSTM) and multiplicative recurrent neural network architectures (a one-step sketch follows this entry).
Ranked #14 on Language Modelling on Hutter Prize
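The core change in mLSTM is a multiplicative intermediate state m_t = (W_mx x_t) * (W_mh h_{t-1}), taken elementwise, which replaces h_{t-1} in the LSTM gate computations and so gives each input its own hidden-to-hidden transition. A minimal NumPy sketch of one step (standard stacked-gate LSTM layout; biases omitted for brevity):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlstm_step(x, h_prev, c_prev, Wmx, Wmh, Wx, Wm):
    """One mLSTM step. x: (d_in,), h_prev and c_prev: (d_h,).

    Wmx: (d_h, d_in), Wmh: (d_h, d_h)       multiplicative projections
    Wx:  (4*d_h, d_in), Wm: (4*d_h, d_h)    gate weights (i, f, o, g stacked)
    """
    m = (Wmx @ x) * (Wmh @ h_prev)          # input-dependent recurrent state
    z = Wx @ x + Wm @ m                     # m replaces h_prev in all gate inputs
    i, f, o, g = np.split(z, 4)
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)
    return h, c
```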
no code implementations • 16 Oct 2015 • Ben Krause
Recurrent Neural Networks (RNNs) have long been recognized for their potential to model complex time series.