1 code implementation • 20 Mar 2024 • Nathan Lambert, Valentina Pyatkin, Jacob Morrison, LJ Miranda, Bill Yuchen Lin, Khyathi Chandu, Nouha Dziri, Sachin Kumar, Tom Zick, Yejin Choi, Noah A. Smith, Hannaneh Hajishirzi
In this paper, we present RewardBench, a benchmark dataset and codebase for evaluation, to enhance scientific understanding of reward models.
1 code implementation • 31 Jan 2024 • Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Harsh Jha, Sachin Kumar, Li Lucy, Xinxi Lyu, Nathan Lambert, Ian Magnusson, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Abhilasha Ravichander, Kyle Richardson, Zejiang Shen, Emma Strubell, Nishant Subramani, Oyvind Tafjord, Pete Walsh, Luke Zettlemoyer, Noah A. Smith, Hannaneh Hajishirzi, Iz Beltagy, Dirk Groeneveld, Jesse Dodge, Kyle Lo
Language models have become a critical technology for tackling a wide range of natural language processing tasks, yet many details about how the best-performing language models were developed are not reported.
no code implementations • 16 Nov 2023 • YuHan Liu, Shangbin Feng, Xiaochuang Han, Vidhisha Balachandran, Chan Young Park, Sachin Kumar, Yulia Tsvetkov
In this work, we take a first step towards designing summarization systems that are faithful to the author's intent, not only the semantic content of the article.
no code implementations • 13 Nov 2023 • Sachin Kumar, Chan Young Park, Yulia Tsvetkov
GEN-Z is generative, as it measures the LM likelihood of input text, conditioned on natural language descriptions of labels.
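The core scoring idea described above can be sketched generically: score each label by the likelihood of the input text conditioned on a natural-language description of that label, and pick the argmax. This is a minimal illustration, not GEN-Z's actual implementation; `toy_log_likelihood` is a stand-in (based on word overlap) for a real language model's conditional log-probability, and the label descriptions are hypothetical.

```python
import math

def toy_log_likelihood(context: str, text: str) -> float:
    # Stand-in for a real LM score (sum of token log-probs of `text`
    # given `context`); here approximated by word overlap.
    ctx_words = set(context.lower().split())
    words = text.lower().split()
    hits = sum(1 for w in words if w in ctx_words)
    # Add-one smoothing avoids log(0); length-normalizing avoids
    # penalizing longer inputs.
    return math.log((hits + 1) / (len(words) + 1))

def generative_zero_shot_classify(text: str, label_descriptions: dict) -> str:
    """Pick the label whose description best 'explains' the input,
    i.e. maximizes log p(text | description) under the (toy) LM."""
    scores = {label: toy_log_likelihood(desc, text)
              for label, desc in label_descriptions.items()}
    return max(scores, key=scores.get)

# Hypothetical label descriptions for sentiment classification.
labels = {
    "positive": "This movie review expresses a positive sentiment: great wonderful loved",
    "negative": "This movie review expresses a negative sentiment: terrible awful hated",
}
print(generative_zero_shot_classify("I loved this great film", labels))
```

Swapping `toy_log_likelihood` for an actual LM's conditional log-likelihood recovers the generative classification recipe the abstract describes.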
no code implementations • 1 Jun 2023 • Melanie Sclar, Sachin Kumar, Peter West, Alane Suhr, Yejin Choi, Yulia Tsvetkov
We present SymbolicToM, a plug-and-play approach to reason about the belief states of multiple characters in reading comprehension tasks via explicit symbolic representation.
no code implementations • 24 May 2023 • Xiaochuang Han, Sachin Kumar, Yulia Tsvetkov, Marjan Ghazvininejad
Diffusion-based language models are emerging as a promising alternative to autoregressive LMs: they approach the competence of autoregressive LMs while offering nuanced controllability at inference time.
no code implementations • 23 May 2023 • Orevaoghene Ahia, Sachin Kumar, Hila Gonen, Jungo Kasai, David R. Mortensen, Noah A. Smith, Yulia Tsvetkov
Language models have graduated from being research prototypes to commercialized products offered as web APIs, and recent works have highlighted the multilingual capabilities of these products.
2 code implementations • 31 Mar 2023 • Leon Derczynski, Hannah Rose Kirk, Vidhisha Balachandran, Sachin Kumar, Yulia Tsvetkov, M. R. Leiser, Saif Mohammad
However, there is no risk-centric framework for documenting the complexity of a landscape in which some risks are shared across models and contexts, while others are specific, and where certain conditions may be required for risks to manifest as harms.
1 code implementation • 20 Dec 2022 • Tianxing He, Jingyu Zhang, Tianle Wang, Sachin Kumar, Kyunghyun Cho, James Glass, Yulia Tsvetkov
In this work, we explore a useful but often neglected methodology for robustness analysis of text generation evaluation metrics: stress tests with synthetic data.
1 code implementation • 31 Oct 2022 • Xiaochuang Han, Sachin Kumar, Yulia Tsvetkov
Despite the growing success of diffusion models in continuous-valued domains (e.g., images), similar efforts for discrete domains such as text have yet to match the performance of autoregressive language models.
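For context, the continuous-domain forward (noising) process these works build on can be written in closed form: x_t is a rescaled x_0 plus Gaussian noise. The sketch below shows that generic step applied to a vector (e.g., a word embedding); it is a textbook illustration under standard diffusion assumptions, not this paper's specific parameterization.

```python
import math
import random

def forward_diffuse(x0, alpha_bar, rng):
    """Sample x_t ~ N(sqrt(alpha_bar) * x0, (1 - alpha_bar) * I):
    the closed-form forward (noising) step of continuous diffusion.
    alpha_bar near 1 keeps the signal; near 0 yields pure noise."""
    scale = math.sqrt(alpha_bar)
    noise_std = math.sqrt(1.0 - alpha_bar)
    return [scale * v + noise_std * rng.gauss(0.0, 1.0) for v in x0]

rng = random.Random(0)
x0 = [1.0, -2.0, 0.5]          # e.g., a (toy) word-embedding vector
xt = forward_diffuse(x0, alpha_bar=0.9, rng=rng)
```

A model is then trained to invert this step; applying such a process to text requires either embedding tokens into a continuous space or defining a discrete analogue, which is the gap these papers address.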
no code implementations • 25 Oct 2022 • Melanie Sclar, Peter West, Sachin Kumar, Yulia Tsvetkov, Yejin Choi
Moreover, we uniquely propose iterative distillation of knowledge, where student models from the previous iteration of distillation serve as teacher models in the next iteration.
no code implementations • 14 Oct 2022 • Sachin Kumar, Vidhisha Balachandran, Lucille Njoo, Antonios Anastasopoulos, Yulia Tsvetkov
Recent advances in the capacity of large language models to generate human-like text have resulted in their increased adoption in user-facing settings.
no code implementations • 25 May 2022 • Sachin Kumar, Biswajit Paria, Yulia Tsvetkov
Large pretrained language models generate fluent text but are notoriously hard to controllably sample from.
1 code implementation • EMNLP (MRL) 2021 • Monisha Jegadeesan, Sachin Kumar, John Wieting, Yulia Tsvetkov
We present a novel technique for zero-shot paraphrase generation.
1 code implementation • NeurIPS 2021 • Sachin Kumar, Eric Malmi, Aliaksei Severyn, Yulia Tsvetkov
As large-scale language model pretraining pushes the state-of-the-art in text generation, recent work has turned to controlling attributes of the text such models generate.
no code implementations • ACL 2021 • Sachin Kumar, Antonios Anastasopoulos, Shuly Wintner, Yulia Tsvetkov
State-of-the-art machine translation (MT) systems are typically trained to generate the "standard" target language; however, many languages have multiple varieties (regional varieties, dialects, sociolects, non-native varieties) that are different from the standard language.
no code implementations • 31 Mar 2021 • Lidia Kidane, Sachin Kumar, Yulia Tsvetkov
It has been shown that the performance of neural machine translation (NMT) drops starkly in low-resource conditions, often requiring large amounts of auxiliary data to achieve competitive results.
no code implementations • 21 Dec 2020 • Sachin Kumar, Garima Gupta, Ranjitha Prasad, Arnab Chatterjee, Lovekesh Vig, Gautam Shroff
Advertising channels have evolved from conventional print media, billboards, and radio advertising to online digital advertising, where users are exposed to a sequence of ad campaigns via social networks, display ads, search, etc.
no code implementations • NeurIPS Workshop ICBINB 2020 • Sachin Kumar, Yulia Tsvetkov
We posit that this gap is due to the autoregressive nature of, and architectural requirements for, text generation, as well as a fundamental difference between the definitions of Wasserstein distance in the image and text domains.
1 code implementation • WS 2020 • Zi-Yi Dou, Sachin Kumar, Yulia Tsvetkov
The model uses reinforcement learning to directly optimize a bilingual semantic similarity metric between the summaries generated in a target language and gold summaries in a source language.
no code implementations • 21 Apr 2020 • Tanya Chowdhury, Sachin Kumar, Tanmoy Chakraborty
This problem is exacerbated in multi-document summarization tasks such as summarizing the popular opinion in threads on community question answering (CQA) websites such as Yahoo! Answers.
no code implementations • WS 2019 • Gayatri Bhat, Sachin Kumar, Yulia Tsvetkov
Neural models that eliminate the softmax bottleneck by generating word embeddings (rather than multinomial distributions over a vocabulary) attain faster training with fewer learnable parameters.
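The decoding side of the idea above can be sketched simply: instead of producing a multinomial distribution over the vocabulary, the model emits a continuous vector, and the output word is the vocabulary entry with the nearest embedding. This is a generic nearest-neighbor decode under assumed toy embeddings, not the paper's exact method.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def decode_nearest(predicted_vec, vocab_embeddings):
    """Map a predicted continuous output vector to the vocabulary
    word whose embedding is closest (here by cosine similarity),
    replacing the softmax-over-vocabulary output layer."""
    return max(vocab_embeddings,
               key=lambda w: cosine(predicted_vec, vocab_embeddings[w]))

# Hypothetical 3-dimensional embeddings for a toy vocabulary.
vocab = {
    "cat": [1.0, 0.1, 0.0],
    "dog": [0.9, 0.3, 0.1],
    "car": [0.0, 0.2, 1.0],
}
print(decode_nearest([0.95, 0.05, 0.02], vocab))
```

Because the output layer predicts a fixed-size vector rather than |V| logits, the model needs no |V|-wide projection matrix, which is where the parameter and speed savings come from.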
1 code implementation • IJCNLP 2019 • Sachin Kumar, Shuly Wintner, Noah A. Smith, Yulia Tsvetkov
Despite impressive performance on many text classification tasks, deep neural networks tend to learn frequent superficial patterns that are specific to the training data and do not always generalize well.
1 code implementation • ICLR 2019 • Sachin Kumar, Yulia Tsvetkov
The Softmax function is used in the final layer of nearly all existing sequence-to-sequence models for language generation.
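The softmax referred to here maps a vector of real-valued logits to a probability distribution over the vocabulary. A minimal, numerically stable implementation (a standard formulation, independent of any particular paper):

```python
import math

def softmax(logits):
    """Numerically stable softmax: subtracting the max logit leaves
    the result mathematically unchanged but prevents overflow in exp()."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
```

Its cost scales with vocabulary size, since every logit must be exponentiated and normalized at every generation step; this is the bottleneck that continuous-output alternatives aim to remove.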
no code implementations • 26 Mar 2018 • Sachin Kumar, Sumita Mishra, Pooja Khanna, Pragya
India is an agriculture-based economy, and sugarcane is one of the major crops produced in northern India.
no code implementations • 6 Mar 2018 • Sachin Kumar, Sumita Mishra, Pallavi Asthana, Pragya
Leukemia is a hematologic cancer that develops in blood tissue and triggers the rapid production of immature, abnormally shaped white blood cells.
no code implementations • 14 Aug 2013 • Sachin Kumar, Ashish Kumar, Pinaki Mitra, Girish Sundaram
For converting speech into English text, the HTK and Julius tools are used; for converting the English text query into an SQL query, we implemented a system that performs rule-based translation from English-language queries to SQL.
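Rule-based translation of this kind can be illustrated with a single pattern-to-template rule: match a fixed English phrasing and fill an SQL template. The rule, table, and column names below are hypothetical examples, not the system's actual rule set.

```python
import re

def english_to_sql(query: str) -> str:
    """Toy rule-based translation: match one fixed English pattern
    and fill an SQL template. A real system applies many such rules
    in sequence until one matches."""
    # Hypothetical rule: "show all <table> where <col> is <value>"
    m = re.match(r"show all (\w+) where (\w+) is (\w+)", query.lower())
    if m:
        table, col, value = m.groups()
        return f"SELECT * FROM {table} WHERE {col} = '{value}';"
    raise ValueError("no rule matched")

print(english_to_sql("show all employees where city is Delhi"))
```

Note that the sketch lowercases the input before matching, so literal values come out lowercased; a production rule set would preserve case for values and handle many more sentence patterns.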