1 code implementation • 7 Sep 2023 • Erik Nijkamp, Tian Xie, Hiroaki Hayashi, Bo Pang, Congying Xia, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil Purushwalkam, Tong Niu, Wojciech Kryściński, Lidiya Murakhovs'ka, Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat, Chien-Sheng Wu, Silvio Savarese, Yingbo Zhou, Shafiq Joty, Caiming Xiong
Most open-source LLMs, however, are limited in their ability to support longer sequence lengths, which is a key requirement for many tasks that require inference over a long input context.
no code implementations • 29 Sep 2021 • Ben Krause, Nikhil Naik, Wenhao Liu, Ali Madani
Predicting the fitness, i.e., the functional value, of a protein sequence is an important and challenging task in biology, particularly due to the scarcity of assay-labeled data.
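As background for this task, a common zero-shot baseline scores each candidate sequence by its log-likelihood under a pretrained protein language model, on the intuition that higher-likelihood sequences tend to be more functional. A minimal sketch of such a scorer, assuming the per-token log-probabilities come from some protein LM (a generic baseline, not necessarily this paper's method):

```python
import numpy as np

def fitness_proxy(log_probs: np.ndarray, token_ids: np.ndarray) -> float:
    """Score a sequence by mean per-token log-likelihood under an LM.

    log_probs: (T, V) array of log p(x_t = v | x_<t) from any protein LM.
    token_ids: (T,) array of the observed amino-acid token ids.
    """
    per_token = log_probs[np.arange(len(token_ids)), token_ids]
    return float(per_token.mean())  # higher => predicted more fit

# Usage: rank candidate variants by their proxy scores, e.g.
# scores = [fitness_proxy(lm(seq), seq_ids) for seq, seq_ids in variants]
```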
1 code implementation • NeurIPS 2021 • Alvin Chan, Ali Madani, Ben Krause, Nikhil Naik
Attribute extrapolation in sample generation is challenging for deep neural networks operating beyond the training distribution.
no code implementations • 18 Oct 2020 • Nazneen Fatema Rajani, Ben Krause, Wenpeng Yin, Tong Niu, Richard Socher, Caiming Xiong
Interpretability techniques in NLP have mainly focused on understanding individual predictions using attention visualization or gradient-based saliency maps over tokens.
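For reference, a gradient-based saliency map over tokens can be computed by taking the gradient of the predicted class score with respect to the token embeddings and reducing it to one scalar per token. A minimal PyTorch sketch, with a toy mean-pool classifier standing in for a real model (an assumption for illustration):

```python
import torch
import torch.nn as nn

# Toy classifier: mean-pooled embeddings -> linear head (illustrative only).
vocab, dim, n_classes = 1000, 64, 2
embed = nn.Embedding(vocab, dim)
head = nn.Linear(dim, n_classes)

tokens = torch.tensor([[12, 845, 3, 99]])          # (1, T) token ids
emb = embed(tokens).detach().requires_grad_(True)  # hook gradients at the embeddings
logits = head(emb.mean(dim=1))                     # (1, n_classes)
logits[0, logits.argmax()].backward()              # d(predicted score)/d(embeddings)

saliency = emb.grad.norm(dim=-1).squeeze(0)        # one L2 score per token
print(saliency)                                    # larger norm ~= more influential token
```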
3 code implementations • Findings (EMNLP) 2021 • Ben Krause, Akhilesh Deepak Gotmare, Bryan McCann, Nitish Shirish Keskar, Shafiq Joty, Richard Socher, Nazneen Fatema Rajani
While large-scale language models (LMs) are able to imitate the distribution of natural language well enough to generate realistic text, it is difficult to control which regions of the distribution they generate.
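One widely used family of remedies steers a frozen LM at decoding time: by Bayes' rule, p(x_t | x_{<t}, c) is proportional to p_LM(x_t | x_{<t}) * p(c | x_{1:t})^w for a desired attribute c, so next-token probabilities can be reweighted with an attribute model. A minimal sketch of that reweighting, where the attribute log-probabilities and the weight w are assumed inputs (a generic recipe, not a full reproduction of this paper's method):

```python
import numpy as np

def steer_next_token(lm_log_probs, attr_log_probs, w=1.0):
    """Reweight next-token log-probs toward a desired attribute.

    lm_log_probs:   (V,) log p(x_t | x_<t) from the base LM.
    attr_log_probs: (V,) log p(attribute | x_{1:t-1}, x_t) per candidate token.
    """
    scores = lm_log_probs + w * attr_log_probs  # Bayes-rule combination in log space
    scores -= scores.max()                      # subtract max for numerical stability
    probs = np.exp(scores)
    return probs / probs.sum()                  # renormalized next-token distribution

# Sample the next token from the steered distribution:
# probs = steer_next_token(lm_lp, attr_lp, w=2.0)
# x_t = np.random.choice(len(probs), p=probs)
```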
no code implementations • 10 Feb 2020 • Yu Bai, Ben Krause, Huan Wang, Caiming Xiong, Richard Socher
We propose "Taylorized training" as an initiative towards better understanding neural network training at finite width.
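Concretely, Taylorized training trains the k-th order Taylor expansion of the network around its initialization, with k = 1 recovering linearized (NTK-style) training. As a sketch, using standard Taylor notation (assumed here, since the snippet above does not spell it out):

```latex
f^{(k)}(x;\theta)
  \;=\; \sum_{j=0}^{k} \frac{1}{j!}\,
        \frac{\mathrm{d}^{j}}{\mathrm{d}s^{j}}\,
        f\bigl(x;\,\theta_0 + s(\theta - \theta_0)\bigr)\Big|_{s=0}
```

For k = 1 this reduces to f(x; theta_0) plus the first-order term in (theta - theta_0), i.e. the linearized model.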
1 code implementation • 17 Apr 2019 • Ben Krause, Emmanuel Kahembwe, Iain Murray, Steve Renals
This research note combines two methods that have recently improved the state of the art in language modeling: Transformers and dynamic evaluation (a sketch of the latter follows this entry).
Ranked #1 on Language Modelling on Hutter Prize
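Dynamic evaluation adapts the model's parameters to the recent sequence history with gradient descent while evaluating, so the model can exploit local, re-occurring patterns at test time. A minimal PyTorch-style sketch of the core loop (plain SGD here; the papers use more refined update rules, so treat this as an illustration of the idea only):

```python
import torch

def dynamic_eval(model, loss_fn, segments, lr=1e-4):
    """Evaluate over consecutive segments, updating parameters after each one.

    segments: iterable of (inputs, targets) covering the test sequence in order.
    Returns the mean test loss, scored before each adaptation step.
    """
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    total, n = 0.0, 0
    for inputs, targets in segments:
        loss = loss_fn(model(inputs), targets)
        total, n = total + loss.item(), n + 1  # score first (this counts as test loss)...
        opt.zero_grad()
        loss.backward()                        # ...then adapt on the same segment
        opt.step()
    return total / max(n, 1)
```

Because the parameters drift during evaluation, they should be restored from a saved checkpoint before the model is reused.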
1 code implementation • 18 Sep 2018 • Joachim Fainberg, Ben Krause, Mihai Dobre, Marco Damonte, Emmanuel Kahembwe, Daniel Duma, Bonnie Webber, Federico Fancellu
Conversational agents are gaining popularity with the increasing ubiquity of smart devices.
no code implementations • 28 Sep 2017 • Ben Krause, Marco Damonte, Mihai Dobre, Daniel Duma, Joachim Fainberg, Federico Fancellu, Emmanuel Kahembwe, Jianpeng Cheng, Bonnie Webber
We present Edina, the University of Edinburgh's social bot for the Amazon Alexa Prize competition.
3 code implementations • ICML 2018 • Ben Krause, Emmanuel Kahembwe, Iain Murray, Steve Renals
We present methodology for using dynamic evaluation to improve neural sequence models.
Ranked #10 on Language Modelling on Hutter Prize
1 code implementation • 26 Sep 2016 • Ben Krause, Liang Lu, Iain Murray, Steve Renals
We introduce multiplicative LSTM (mLSTM), a recurrent neural network architecture for sequence modelling that combines the long short-term memory (LSTM) and multiplicative recurrent neural network architectures (a one-step sketch follows this entry).
Ranked #14 on Language Modelling on Hutter Prize
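The core change in mLSTM is a multiplicative intermediate state m_t = (W_mx x_t) * (W_mh h_{t-1}), taken elementwise, which replaces h_{t-1} in the LSTM gate computations and so gives each input its own hidden-to-hidden transition. A minimal NumPy sketch of one step (standard stacked-gate LSTM layout; biases omitted for brevity):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlstm_step(x, h_prev, c_prev, Wmx, Wmh, Wx, Wm):
    """One mLSTM step. x: (d_in,), h_prev and c_prev: (d_h,).

    Wmx: (d_h, d_in), Wmh: (d_h, d_h)       multiplicative projections
    Wx:  (4*d_h, d_in), Wm: (4*d_h, d_h)    gate weights (i, f, o, g stacked)
    """
    m = (Wmx @ x) * (Wmh @ h_prev)          # input-dependent recurrent state
    z = Wx @ x + Wm @ m                     # m replaces h_prev in all gate inputs
    i, f, o, g = np.split(z, 4)
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)
    return h, c
```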
no code implementations • 16 Oct 2015 • Ben Krause
Recurrent Neural Networks (RNNs) have long been recognized for their potential to model complex time series.