no code implementations • 2 May 2024 • Dhananjay Ashok, Barnabás Póczos
While most research on controllable text generation has focused on steering base Language Models, the emerging instruction-tuning and prompting paradigm offers an alternate approach to controllability.
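A minimal sketch of what prompting-paradigm controllability looks like in practice, assuming an instruction-tuned model behind a hypothetical `instruct(prompt)` callable; the constraint is stated in natural language rather than imposed on a base LM's decoding procedure.

```python
# Illustrative sketch only; `instruct` is a hypothetical callable wrapping
# an instruction-tuned LLM, not a real API.
def controlled_generation(topic, constraint, instruct):
    # The control signal is ordinary natural language in the prompt,
    # not a modified decoding procedure on a base Language Model.
    prompt = f"Write a short passage about {topic}. Constraint: {constraint}."
    return instruct(prompt)
```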
no code implementations • 28 Jul 2023 • Matthew Barker, Emma Kallina, Dhananjay Ashok, Katherine M. Collins, Ashley Casovan, Adrian Weller, Ameet Talwalkar, Valerie Chen, Umang Bhatt
We propose FeedbackLogs, addenda to existing documentation of ML pipelines, to track the input of multiple stakeholders.
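A minimal sketch of what a FeedbackLog-style record might look like; the field names and structure below are illustrative assumptions, not the schema proposed in the paper.

```python
# Illustrative FeedbackLog-style record; field names are assumptions,
# not the paper's schema.
from dataclasses import dataclass, field
from datetime import date

@dataclass
class FeedbackEntry:
    stakeholder: str   # who gave the input (e.g. domain expert, auditor)
    stage: str         # the ML pipeline stage the input concerns
    feedback: str      # the stakeholder's input itself
    response: str      # how the team incorporated or declined it
    recorded_on: date = field(default_factory=date.today)

# Appended alongside the pipeline's existing documentation.
feedback_log: list[FeedbackEntry] = []
```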
no code implementations • 24 May 2023 • Dhananjay Ashok, Zachary C. Lipton
In a surprising turn, Large Language Models (LLMs), together with a growing arsenal of prompt-based heuristics, now offer powerful off-the-shelf, few-shot solutions to myriad classic NLP problems.
Ranked #3 on Zero-shot Named Entity Recognition (NER) on CrossNER (using extra training data)
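A minimal sketch of few-shot prompting for NER, assuming a hypothetical `complete(prompt)` helper around an LLM; the prompt template below is illustrative and is not the paper's exact format.

```python
# Illustrative few-shot NER prompt; `complete` is a hypothetical helper
# that sends the prompt to an LLM and returns its text completion.
FEW_SHOT = """Extract the named entities and their types.

Sentence: Alan Turing was born in London.
Entities: Alan Turing (person); London (location)

Sentence: {sentence}
Entities:"""

def prompt_ner(sentence, complete):
    return complete(FEW_SHOT.format(sentence=sentence)).strip()
```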
1 code implementation • 24 May 2023 • Dhananjay Ashok, Atharva Kulkarni, Hai Pham, Barnabás Póczos
Our method outperforms the very LLM that was used to generate the annotated dataset: Few-Shot Prompting on GPT3.5 achieves only 58%, 61%, and 64% correction accuracy on the respective datasets, consistently below our model, despite using nearly 800 times as many parameters.
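A minimal sketch of the LLM-as-annotator step implied here, assuming a hypothetical `llm(prompt)` callable and an illustrative prompt wording; a much smaller model fine-tuned on the generated corrections can then be evaluated against the LLM itself.

```python
# Illustrative LLM-as-annotator step; `llm` is a hypothetical callable,
# and the prompt wording is an assumption, not the paper's template.
def build_correction_dataset(claims, llm):
    prompt = ("Correct the factual error in the following claim.\n"
              "Claim: {claim}\n"
              "Corrected claim:")
    # Pairs of (erroneous claim, LLM-proposed correction) form the
    # training set for a much smaller correction model.
    return [(c, llm(prompt.format(claim=c)).strip()) for c in claims]
```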
no code implementations • 7 Jul 2022 • Dhananjay Ashok, Vineel Nagisetty, Christopher Srinivasa, Vijay Ganesh
We present a novel hybrid algorithm for training Deep Neural Networks that combines the state-of-the-art Gradient Descent (GD) method with a Mixed Integer Linear Programming (MILP) solver; it outperforms GD and its variants in accuracy, as well as in resource and data efficiency, on both regression and classification tasks.
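A minimal sketch of one way GD and a MILP solver can be combined, assuming an L1 training loss so that refitting the final linear layer is expressible as a linear program (which a MILP solver handles exactly); this illustrates the hybrid idea only, not the authors' algorithm.

```python
# Sketch: after some GD epochs produce hidden features H for targets y,
# refit the final linear layer exactly by minimizing sum |H w - y|,
# encoded as an LP and handed to a MILP solver (CBC via PuLP).
import numpy as np
import pulp

def refit_last_layer(H, y):
    n, d = H.shape
    prob = pulp.LpProblem("last_layer_refit", pulp.LpMinimize)
    w = [pulp.LpVariable(f"w{j}") for j in range(d)]              # free weights
    r = [pulp.LpVariable(f"r{i}", lowBound=0) for i in range(n)]  # |residual| bounds
    prob += pulp.lpSum(r)  # objective: total absolute error
    for i in range(n):
        pred = pulp.lpSum(float(H[i, j]) * w[j] for j in range(d))
        prob += pred - float(y[i]) <= r[i]
        prob += float(y[i]) - pred <= r[i]
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return np.array([v.value() for v in w])

H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # toy hidden features
y = np.array([1.0, 2.0, 3.1])                       # toy targets
print(refit_last_layer(H, y))                       # refitted final-layer weights
```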
no code implementations • 21 Oct 2020 • Dhananjay Ashok, Joseph Scott, Sebastian Wetzel, Maysum Panju, Vijay Ganesh
Our method, the logic-guided genetic algorithm (LGGA), takes as input a set of labelled data points and auxiliary truths (ATs), i.e., mathematical facts known a priori about the unknown function the regressor aims to learn, and outputs a specially generated and curated dataset that can be used with any SR method.
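A minimal sketch of how an auxiliary truth can generate new consistent data points, assuming a symmetry-style AT such as evenness, f(x) = f(-x); the transform set and the simple de-duplication step here are illustrative, not LGGA's actual procedure.

```python
# Illustrative AT-driven augmentation: each transform maps a labelled
# point (x, y) to another point the unknown function must also satisfy.
def augment_with_auxiliary_truths(points, transforms):
    augmented = set(points)
    for x, y in points:
        for t in transforms:
            augmented.add(t(x, y))  # new point implied by an AT
    return sorted(augmented)

# Example AT: evenness, f(x) = f(-x), so (x, y) implies (-x, y).
even = lambda x, y: (-x, y)
print(augment_with_auxiliary_truths([(1.0, 2.5), (2.0, 7.1)], [even]))
```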