1 code implementation • 22 Apr 2024 • Jan-Philipp Fränken, Eric Zelikman, Rafael Rafailov, Kanishk Gandhi, Tobias Gerstenberg, Noah D. Goodman
On single-turn dialogue and summarization, a SAMI-trained mistral-7b outperforms the initial pretrained model, with win rates between 66% and 77%.
1 code implementation • 17 Apr 2024 • Jan-Philipp Fränken, Kanishk Gandhi, Tori Qiu, Ayesha Khawaja, Noah D. Goodman, Tobias Gerstenberg
We collected moral permissibility and intention judgments from human participants for a subset of our items and compared these judgments to those from two language models (GPT-4 and Claude-2) across eight conditions.
1 code implementation • 1 Apr 2024 • Kanishk Gandhi, Denise Lee, Gabriel Grand, Muxin Liu, Winson Cheng, Archit Sharma, Noah D. Goodman
In this paper, we show how language models can be taught to search by representing the process of search in language, as a flattened string: a stream of search (SoS).
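To make the flattening concrete, here is a minimal Python sketch that serializes a depth-first search into a single token stream; the toy graph, token vocabulary, and separator are illustrative assumptions, not the paper's actual trace format.

```python
# Minimal sketch: serialize a depth-first search into a flat token stream.
# The graph, token names, and separator are illustrative assumptions.
def dfs_stream(graph, start, goal):
    tokens, stack = [], [(start, [start])]
    while stack:
        node, path = stack.pop()
        tokens.append(f"expand {node}")
        if node == goal:
            tokens.append("goal " + "->".join(path))
            break
        children = graph.get(node, [])
        if children:
            tokens.append("children " + " ".join(children))
        else:
            tokens.append("backtrack")
        for child in reversed(children):
            stack.append((child, path + [child]))
    return " | ".join(tokens)

graph = {"A": ["B", "C"], "B": ["D"], "C": ["G"], "D": []}
print(dfs_stream(graph, "A", "G"))
# expand A | children B C | expand B | children D | expand D | backtrack | ...
```

A language model trained on many such streams sees dead ends and backtracking in-context, which is what lets it learn, and later improve on, the search procedure itself.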
1 code implementation • 28 Mar 2024 • Chinmaya Andukuri, Jan-Philipp Fränken, Tobias Gerstenberg, Noah D. Goodman
After two iterations of self-improvement, the Questioner asks better questions, allowing it to generate responses that are preferred over responses from the initial model on 72% of tasks.
1 code implementation • 14 Mar 2024 • Eric Zelikman, Georges Harik, Yijia Shao, Varuna Jayasiri, Nick Haber, Noah D. Goodman
Crucially, these improvements require no fine-tuning on these tasks.
3 code implementations • 12 Mar 2024 • Zhengxuan Wu, Atticus Geiger, Aryaman Arora, Jing Huang, Zheng Wang, Noah D. Goodman, Christopher D. Manning, Christopher Potts
Interventions on model-internal states are fundamental operations in many areas of AI, including model editing, steering, robustness, and interpretability.
no code implementations • 5 Mar 2024 • Joy He-Yueya, Noah D. Goodman, Emma Brunskill
We propose an alternative approach that uses Language Models (LMs) as educational experts to assess the impact of various instructions on learning outcomes.
no code implementations • 27 Feb 2024 • Michael Y. Li, Emily B. Fox, Noah D. Goodman
We evaluate our method in three common settings in probabilistic modeling: searching within a restricted space of models, searching over an open-ended space, and improving classic models under natural language constraints (e.g., this model should be interpretable to an ecologist).
1 code implementation • 23 Jan 2024 • Zhengxuan Wu, Atticus Geiger, Jing Huang, Aryaman Arora, Thomas Icard, Christopher Potts, Noah D. Goodman
We respond to the recent paper by Makelov et al. (2023), which reviews subspace interchange intervention methods like distributed alignment search (DAS; Geiger et al. 2023) and claims that these methods potentially cause "interpretability illusions".
1 code implementation • 26 Oct 2023 • Alex Tamkin, Mohammad Taufeeque, Noah D. Goodman
In this setting, our approach overcomes the superposition problem by assigning states to distinct codes, and we find that we can make the neural network behave as if it is in a different state by activating the code for that state.
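A rough sketch of the codebook mechanism, assuming a single bottleneck layer, a random (rather than learned) codebook, and one code per state; the real method trains the codebook end to end and may combine several codes.

```python
import torch

# Sketch of a codebook bottleneck (illustrative stand-ins throughout).
# Hidden states snap to their nearest code, and writing a chosen code
# into the bottleneck steers the network toward that state.
codebook = torch.randn(512, 64)            # 512 codes of width 64

def quantize(h):                           # h: (batch, 64) hidden states
    idx = torch.cdist(h, codebook).argmin(dim=1)   # nearest-code assignment
    return codebook[idx], idx

h = torch.randn(8, 64)
h_q, codes = quantize(h)                   # discrete state assignment
steered = codebook[42].expand(8, 64)       # "activate" code 42 for all inputs
```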
1 code implementation • 26 Oct 2023 • Jan-Philipp Fränken, Sam Kwok, Peixuan Ye, Kanishk Gandhi, Dilip Arumugam, Jared Moore, Alex Tamkin, Tobias Gerstenberg, Noah D. Goodman
We explore the idea of aligning an AI assistant by inverting a model of users' (unknown) preferences from observed interactions.
no code implementations • 5 Oct 2023 • Jiayuan Mao, Xuelin Yang, Xikun Zhang, Noah D. Goodman, Jiajun Wu
First, there is a lack of diversity in both event types and natural language descriptions; second, causal relationships based on manually defined heuristics differ from human judgments.
no code implementations • 11 Sep 2023 • Ruocheng Wang, Eric Zelikman, Gabriel Poesia, Yewen Pu, Nick Haber, Noah D. Goodman
Because of the prohibitive cost of generation with state-of-the-art LLMs, we introduce an intermediate step that filters the set of hypotheses to be implemented as programs: we either ask the LLM to summarize them into a smaller set of hypotheses, or ask human annotators to select a subset.
1 code implementation • 22 Jun 2023 • Lionel Wong, Gabriel Grand, Alexander K. Lew, Noah D. Goodman, Vikash K. Mansinghka, Jacob Andreas, Joshua B. Tenenbaum
Our architecture integrates two computational tools that have not previously come together: we model thinking with probabilistic programs, an expressive representation for commonsense reasoning; and we model meaning construction with large language models (LLMs), which support broad-coverage translation from natural language utterances to code expressions in a probabilistic programming language.
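A toy sketch of the two components, assuming a hand-written world model and a hard-coded condition standing in for the LLM's translation of an utterance into code (both are made up here, echoing the tug-of-war examples common in this literature).

```python
import random

# Toy sketch: a probabilistic program as the world model, plus a condition
# standing in for the LLM's language-to-code translation.

def world():
    strength = {p: random.gauss(0, 1) for p in ["alice", "bob"]}
    lazy = {p: random.random() < 0.3 for p in ["alice", "bob"]}
    pulling = {p: strength[p] * (0.5 if lazy[p] else 1.0) for p in strength}
    return strength, pulling

# Stand-in for the LLM translating "Alice beat Bob at tug-of-war":
def condition(strength, pulling):
    return pulling["alice"] > pulling["bob"]

# Rejection sampling: posterior over Alice's strength given the utterance.
samples = []
while len(samples) < 5000:
    s, pull = world()
    if condition(s, pull):
        samples.append(s["alice"])
print(sum(samples) / len(samples))   # above the prior mean of 0
```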
no code implementations • NeurIPS 2023 • Kanishk Gandhi, Jan-Philipp Fränken, Tobias Gerstenberg, Noah D. Goodman
Using our framework, we create a new social reasoning benchmark (BigToM) for LLMs, which consists of 25 controls and 5,000 model-written evaluations.
1 code implementation • 16 Jun 2023 • Eric Zelikman, Qian Huang, Percy Liang, Nick Haber, Noah D. Goodman
Language model training in distributed settings is limited by the communication cost of gradient exchanges.
no code implementations • 6 Jun 2023 • Gabriel Poesia, Kanishk Gandhi, Eric Zelikman, Noah D. Goodman
In experiments on the PrOntoQA, ProofWriter, and Syllogism Validity datasets, LogicGuide significantly improves the performance of GPT-3, GPT-3.5 Turbo, and LLaMA (accuracy gains of up to 35%), while drastically reducing content effects: the interference between unwanted prior assumptions and reasoning, which humans and language models suffer from.
no code implementations • 30 May 2023 • Kanishk Gandhi, Dorsa Sadigh, Noah D. Goodman
Existing approaches to solving strategic games rely on extensive training, yielding strategies that do not generalize to new scenarios or games without retraining.
no code implementations • 19 May 2023 • Dhara Yu, Noah D. Goodman, Jesse Mu
Humans teach others about the world through language and demonstration.
1 code implementation • NeurIPS 2023 • Zhengxuan Wu, Atticus Geiger, Thomas Icard, Christopher Potts, Noah D. Goodman
With Boundless DAS, we discover that Alpaca does this by implementing a causal model with two interpretable boolean variables.
no code implementations • 11 May 2023 • Polina Tsvilodub, Michael Franke, Robert D. Hawkins, Noah D. Goodman
When faced with a polar question, speakers often provide overinformative answers going beyond a simple "yes" or "no".
no code implementations • 5 May 2023 • Dilip Arumugam, Mark K. Ho, Noah D. Goodman, Benjamin Van Roy
All biological and artificial agents must learn and make decisions given limits on their ability to process information.
1 code implementation • 16 Apr 2023 • Joy He-Yueya, Gabriel Poesia, Rose E. Wang, Noah D. Goodman
Automatically generating high-quality step-by-step solutions to math word problems has many applications in education.
1 code implementation • NeurIPS 2023 • Ben Prystawski, Michael Y. Li, Noah D. Goodman
We investigate why and how chain-of-thought reasoning is useful in language models, testing the hypothesis that reasoning is effective when training data consists of overlapping local clusters of variables that influence each other strongly.
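The hypothesis can be made concrete on a toy chain X → Z → Y, where (as an illustrative assumption) training data only ever pins down the local conditionals; chaining through the intermediate variable recovers the unobserved direct dependency.

```python
import numpy as np

# Toy chain X -> Z -> Y: only the local conditionals are "observed",
# and step-by-step reasoning marginalizes out Z to recover p(Y | X).
p_z_given_x = np.array([[0.9, 0.1],    # rows: x, cols: z
                        [0.2, 0.8]])
p_y_given_z = np.array([[0.7, 0.3],    # rows: z, cols: y
                        [0.1, 0.9]])

p_y_given_x = p_z_given_x @ p_y_given_z   # chain through Z, step by step
print(p_y_given_x)
```

Estimating p(Y | X) directly would require (X, Y) co-occurrences that never appear in locally clustered data, whereas the chained computation uses only quantities the local data determines.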
no code implementations • 5 Mar 2023 • Atticus Geiger, Zhengxuan Wu, Christopher Potts, Thomas Icard, Noah D. Goodman
In DAS, we find the alignment between high-level and low-level models using gradient descent rather than brute-force search, and we allow individual neurons to play multiple distinct roles by analyzing representations in non-standard bases (distributed representations).
1 code implementation • 20 Dec 2022 • Eric Zelikman, Qian Huang, Gabriel Poesia, Noah D. Goodman, Nick Haber
Despite recent success in large language model (LLM) reasoning, LLMs struggle with hierarchical multi-step reasoning tasks like generating complex programs.
Ranked #8 on Code Generation on HumanEval
1 code implementation • 30 Nov 2022 • Joy Hsu, Jiajun Wu, Noah D. Goodman
In contrast, low-level and high-level visual features from standard computer vision models pretrained on natural images do not support correct generalization.
1 code implementation • 29 Nov 2022 • Gabriel Poesia, Noah D. Goodman
We explore this idea in a case study on 5 sections of beginning algebra on the Khan Academy platform.
no code implementations • 30 Oct 2022 • Dilip Arumugam, Mark K. Ho, Noah D. Goodman, Benjamin Van Roy
Throughout the cognitive-science literature, there is widespread agreement that decision-making agents operating in the real world do so under limited information-processing capabilities and without access to unbounded cognitive or computational resources.
1 code implementation • 16 Sep 2022 • Ben Prystawski, Paul Thibodeau, Christopher Potts, Noah D. Goodman
Probabilistic models of language understanding are valuable tools for investigating human language use.
1 code implementation • 18 May 2022 • Fei Fang, Kunal Sinha, Noah D. Goodman, Christopher Potts, Elisa Kreiss
It seems likely that these patterns are shaped by the environment a speaker is exposed to in complex ways.
1 code implementation • 28 Mar 2022 • Eric Zelikman, Yuhuai Wu, Jesse Mu, Noah D. Goodman
We show that STaR significantly improves performance on multiple datasets compared to a model fine-tuned to directly predict final answers, and performs comparably to fine-tuning a 30× larger state-of-the-art language model on CommonsenseQA.
Ranked #17 on Common Sense Reasoning on CommonsenseQA
1 code implementation • NAACL 2022 • Zhengxuan Wu, Atticus Geiger, Josh Rozner, Elisa Kreiss, Hanson Lu, Thomas Icard, Christopher Potts, Noah D. Goodman
Distillation efforts have led to language models that are more compact and efficient without serious drops in performance.
2 code implementations • 1 Dec 2021 • Atticus Geiger, Zhengxuan Wu, Hanson Lu, Josh Rozner, Elisa Kreiss, Thomas Icard, Noah D. Goodman, Christopher Potts
In IIT, we (1) align variables in a causal model (e.g., a deterministic program or Bayesian network) with representations in a neural model and (2) train the neural model to match the counterfactual behavior of the causal model on a base input when the aligned representations in both models are set to the values they would take for a source input.
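A minimal PyTorch sketch of the two steps; the causal model, the alignment (first four hidden units), and the architecture are toy assumptions, not the paper's setup.

```python
import torch
import torch.nn as nn

# Toy IIT sketch: the high-level causal model computes y = (a AND b) OR c
# with intermediate variable V = a AND b; we align V with the first 4
# hidden units of a small MLP (all choices here are illustrative).
torch.manual_seed(0)
ALIGNED = slice(0, 4)

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(3, 16), nn.ReLU())
        self.dec = nn.Linear(16, 1)

    def forward(self, x, patch=None):
        h = self.enc(x)
        if patch is not None:          # interchange intervention: overwrite
            h = h.clone()              # the aligned subspace with activations
            h[:, ALIGNED] = patch      # cached from the source input
        return self.dec(h).squeeze(-1), h

def causal_y(x, v=None):
    v = x[:, 0] * x[:, 1] if v is None else v   # V = a AND b unless intervened
    return torch.maximum(v, x[:, 2])            # OR with c

model = MLP()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    base = torch.randint(0, 2, (64, 3)).float()
    src = torch.randint(0, 2, (64, 3)).float()
    _, h_src = model(src)                            # source-run activations
    y_cf = causal_y(base, v=src[:, 0] * src[:, 1])   # counterfactual label
    logits_cf, _ = model(base, patch=h_src[:, ALIGNED])
    logits_std, _ = model(base)                      # ordinary behavior too
    loss = bce(logits_cf, y_cf) + bce(logits_std, causal_y(base))
    opt.zero_grad(); loss.backward(); opt.step()
```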
1 code implementation • Findings (EMNLP) 2021 • Rose E. Wang, Julia White, Jesse Mu, Noah D. Goodman
We propose a method that uses a population of neural listeners to regularize speaker training.
1 code implementation • 17 Sep 2021 • Robert D. Hawkins, Megumi Sano, Noah D. Goodman, Judith E. Fan
From photorealistic sketches to schematic diagrams, drawing provides a versatile medium for communicating about the visual world.
1 code implementation • 28 Jul 2021 • Michael Henry Tessler, Jason Madeano, Pedro A. Tsividis, Brin Harper, Noah D. Goodman, Joshua B. Tenenbaum
The video game paradigm we pioneer here is thus a rich test bed for developing AI systems capable of acquiring and transmitting cultural knowledge.
1 code implementation • 16 Apr 2021 • Elisa Kreiss, Fei Fang, Noah D. Goodman, Christopher Potts
Current deep learning models often achieve excellent results on benchmark image-to-text datasets but fail to generate texts that are useful in practice.
1 code implementation • 12 Apr 2021 • Robert D. Hawkins, Michael Franke, Michael C. Frank, Adele E. Goldberg, Kenny Smith, Thomas L. Griffiths, Noah D. Goodman
Languages are powerful solutions to coordination problems: they provide stable, shared expectations about how the words we say correspond to the beliefs and intentions in our heads.
1 code implementation • 31 May 2020 • Julia White, Jesse Mu, Noah D. Goodman
A hallmark of human language is the ability to effectively and efficiently convey contextually relevant information.
1 code implementation • 4 Feb 2020 • Robert D. Hawkins, Noah D. Goodman, Adele E. Goldberg, Thomas L. Griffiths
A key property of linguistic conventions is that they hold over an entire community of speakers, allowing us to communicate efficiently even with people we have never met before.
1 code implementation • 16 Dec 2019 • Robert D. Hawkins, Michael C. Frank, Noah D. Goodman
The language we use over the course of conversation changes as we establish common ground and learn what our partner finds meaningful.
1 code implementation • CONLL 2020 • Robert D. Hawkins, Minae Kwon, Dorsa Sadigh, Noah D. Goodman
To communicate with new partners in new contexts, humans rapidly form new linguistic conventions.
1 code implementation • 12 Sep 2019 • Ishita Dasgupta, Demi Guo, Samuel J. Gershman, Noah D. Goodman
Analyzing performance on these diagnostic tests indicates a lack of systematicity in the representations and decision rules, and reveals a set of heuristic strategies.
1 code implementation • WS 2019 • Allen Nie, Erin D. Bennett, Noah D. Goodman
We demonstrate that our strategy is sufficient to generate highly plausible explanations for general open-domain phenomena compared to other models trained on different datasets.
1 code implementation • ICCV 2019 • Panos Achlioptas, Judy Fan, Robert X. D. Hawkins, Noah D. Goodman, Leonidas J. Guibas
We also find that these models are amenable to zero-shot transfer learning to novel object classes (e.g., transfer from training on chairs to testing on lamps), as well as to real-world images drawn from furniture catalogs.
no code implementations • ICLR 2019 • Panos Achlioptas, Judy E. Fan, Robert X. D. Hawkins, Noah D. Goodman, Leo Guibas
We further show that a neural speaker that is 'listener-aware' (one that plans its utterances according to how an imagined listener would interpret its words in context) produces more discriminative referring expressions than a 'listener-unaware' speaker, as measured by human performance in identifying the correct object.
1 code implementation • 19 Mar 2019 • Judith Degen, Robert D. Hawkins, Caroline Graf, Elisa Kreiss, Noah D. Goodman
Crucially, we relax the assumption that informativeness is computed with respect to a deterministic Boolean semantics, in favor of a non-deterministic continuous semantics.
1 code implementation • 15 Mar 2019 • Desmond C. Ong, Harold Soh, Jamil Zaki, Noah D. Goodman
Affective Computing is a rapidly growing field spurred by advancements in artificial intelligence, but it is often held back by the inability to translate psychological theories of emotion into tractable computational models.
no code implementations • 21 Nov 2018 • Jonathan P. Chen, Fritz Obermeyer, Vladimir Lyapunov, Lionel Gueguen, Noah D. Goodman
Our algorithm outperforms our current production baseline based on k-means clustering.
1 code implementation • 18 Oct 2018 • Eli Bingham, Jonathan P. Chen, Martin Jankowiak, Fritz Obermeyer, Neeraj Pradhan, Theofanis Karaletsos, Rohit Singh, Paul Szerlip, Paul Horsfall, Noah D. Goodman
Pyro is a probabilistic programming language built on Python as a platform for developing advanced probabilistic models in AI research.
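For flavor, a minimal Pyro program and variational inference loop; this is a standard beta-Bernoulli example written for illustration, not code from the paper.

```python
import torch
import pyro
import pyro.distributions as dist
from torch.distributions import constraints
from pyro.infer import SVI, Trace_ELBO
from pyro.optim import Adam

# Minimal Pyro program: infer a coin's bias from observed flips.
data = torch.tensor([1., 1., 0., 1., 1., 0., 1., 1.])

def model(data):
    bias = pyro.sample("bias", dist.Beta(1., 1.))      # prior over the bias
    with pyro.plate("flips", len(data)):
        pyro.sample("obs", dist.Bernoulli(bias), obs=data)

def guide(data):
    a = pyro.param("a", torch.tensor(1.), constraint=constraints.positive)
    b = pyro.param("b", torch.tensor(1.), constraint=constraints.positive)
    pyro.sample("bias", dist.Beta(a, b))               # variational posterior

svi = SVI(model, guide, Adam({"lr": 0.02}), loss=Trace_ELBO())
for _ in range(1000):
    svi.step(data)
```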
no code implementations • WS 2019 • Reuben Cohn-Gordon, Noah D. Goodman, Christopher Potts
Recent Iterated Response (IR) models of pragmatics conceptualize language use as a recursive process in which agents reason about each other to increase communicative efficiency.
1 code implementation • 24 Jul 2018 • Robert D. Hawkins, Hyowon Gweon, Noah D. Goodman
In Experiment 1, we manipulated the presence or absence of occlusions in a director-matcher task and found that speakers spontaneously produced more informative descriptions to account for "known unknowns" in their partner's private view.
1 code implementation • TACL 2018 • Fereshte Khani, Noah D. Goodman, Percy Liang
We study sequential language games in which two players, each with private information, communicate to achieve a common goal.
1 code implementation • 12 Feb 2018 • Ishita Dasgupta, Demi Guo, Andreas Stuhlmüller, Samuel J. Gershman, Noah D. Goodman
Further, we find that augmenting training with our dataset improves test performance on our dataset without loss of performance on the original training dataset.
3 code implementations • 12 Oct 2017 • Allen Nie, Erin D. Bennett, Noah D. Goodman
Learning effective representations of sentences is one of the core missions of natural language understanding.
1 code implementation • NeurIPS 2017 • N. Siddharth, Brooks Paige, Jan-Willem van de Meent, Alban Desmaison, Noah D. Goodman, Pushmeet Kohli, Frank Wood, Philip H. S. Torr
We propose to learn such representations using model architectures that generalise from standard VAEs, employing a general graphical model structure in the encoder and decoder.
1 code implementation • TACL 2017 • Will Monroe, Robert X. D. Hawkins, Noah D. Goodman, Christopher Potts
We present a model of pragmatic referring expression interpretation in a grounded communication task (identifying colors from descriptions) that draws upon predictions from two recurrent neural network classifiers, a speaker and a listener, unified by a recursive pragmatic reasoning framework.
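The recursive pragmatic reasoning at the heart of this framework (the Rational Speech Acts pattern) fits in a few lines of numpy; the lexicon and rationality parameter below are illustrative stand-ins, not the paper's trained classifiers.

```python
import numpy as np

# Toy RSA: 3 colors, 3 utterances; lexicon[u, m] = 1 if utterance u
# is literally true of color m (illustrative values).
lexicon = np.array([
    [1., 1., 0.],   # "blue"
    [0., 1., 1.],   # "teal"
    [0., 0., 1.],   # "green"
])
prior = np.ones(3) / 3
alpha = 4.0          # speaker rationality

def normalize(m, axis):
    return m / m.sum(axis=axis, keepdims=True)

L0 = normalize(lexicon * prior, axis=1)                       # literal listener P(m|u)
S1 = normalize(np.exp(alpha * np.log(L0 + 1e-9)).T, axis=1)   # speaker P(u|m)
L1 = normalize(S1.T * prior, axis=1)                          # pragmatic listener P(m|u)

print(np.round(L1, 3))
```

In the paper, the literal speaker and listener are recurrent neural classifiers rather than a hand-specified lexicon, but the recursive reasoning that unifies them has this shape.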
no code implementations • 22 Nov 2016 • N. Siddharth, Brooks Paige, Alban Desmaison, Jan-Willem van de Meent, Frank Wood, Noah D. Goodman, Pushmeet Kohli, Philip H. S. Torr
We develop a framework for incorporating structured graphical models in the \emph{encoders} of variational autoencoders (VAEs) that allows us to induce interpretable representations through approximate variational inference.
1 code implementation • 18 Oct 2016 • Daniel Ritchie, Paul Horsfall, Noah D. Goodman
This paper proposes a system for amortized inference in probabilistic programming languages (PPLs).
1 code implementation • 9 Aug 2016 • Michael Henry Tessler, Noah D. Goodman
Language provides simple ways of communicating generalizable knowledge to each other (e.g., "Birds fly", "John hikes", "Fire makes smoke").
1 code implementation • EMNLP 2016 • Will Monroe, Noah D. Goodman, Christopher Potts
The production of color language is essential for grounded language generation.
1 code implementation • NeurIPS 2016 • Daniel Ritchie, Anna Thomas, Pat Hanrahan, Noah D. Goodman
Probabilistic inference algorithms such as Sequential Monte Carlo (SMC) provide powerful tools for constraining procedural models in computer graphics, but they require many samples to produce desirable results.
no code implementations • 18 Dec 2015 • Owain Evans, Andreas Stuhlmüller, Noah D. Goodman
If we assume that choices are approximately optimal according to some utility function, we can treat preference inference as Bayesian inverse planning.
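A minimal grid-approximation sketch of that inversion under approximate (softmax) optimality; the two options, the noise level, and the observed choices are made up for illustration.

```python
import numpy as np

# Bayesian inverse planning sketch: infer how much an agent values
# option A over option B from noisy (softmax-optimal) choices.
beta = 2.0                            # assumed decision noise (rationality)
utilities = np.linspace(-2, 2, 81)    # grid over u(A) - u(B)
prior = np.ones_like(utilities) / len(utilities)

choices = [1, 1, 0, 1, 1]             # observed: 1 = chose A, 0 = chose B

p_A = 1 / (1 + np.exp(-beta * utilities))   # Boltzmann choice rule
likelihood = np.prod([p_A if c else 1 - p_A for c in choices], axis=0)
posterior = prior * likelihood
posterior /= posterior.sum()

print("posterior mean of u(A) - u(B):", (utilities * posterior).sum())
```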
no code implementations • 9 Sep 2015 • Andreas Stuhlmüller, Robert X. D. Hawkins, N. Siddharth, Noah D. Goodman
When models are expressed as probabilistic programs, the models themselves are highly structured objects that can be used to derive annealing sequences that are more sensitive to domain structure.
no code implementations • 7 Sep 2015 • Daniel Ritchie, Andreas Stuhlmüller, Noah D. Goodman
Lightweight, source-to-source transformation approaches to implementing MCMC for probabilistic programming languages are popular for their simplicity, support of existing deterministic code, and ability to execute on existing fast runtimes.
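To give the flavor of the lightweight approach, here is a hand-rolled single-site Metropolis-Hastings sketch over a trace of named random choices; real lightweight implementations derive the choice addressing automatically via source-to-source transformation, which this toy skips.

```python
import random, math

# Toy single-site MH over a trace of named random choices.

def program(trace):
    # Model: bias ~ Uniform(0, 1); 8 coin flips observed, 6 of them heads.
    bias = trace.setdefault("bias", random.random())
    return 6 * math.log(bias) + 2 * math.log(1 - bias)   # log-likelihood

trace = {}
logp = program(trace)
samples = []
for _ in range(20000):
    proposal = dict(trace)
    proposal["bias"] = random.random()        # resample one site from its prior
    logp_new = program(proposal)
    if math.log(random.random()) < logp_new - logp:   # MH accept/reject
        trace, logp = proposal, logp_new
    samples.append(trace["bias"])
print(sum(samples[5000:]) / 15000)            # ≈ 0.7, the Beta(7, 3) mean
```

Because the proposal resamples a site from its prior, the acceptance ratio reduces to the likelihood ratio, which is why the loop above is correct despite its brevity.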
no code implementations • 15 Jun 2012 • Andreas Stuhlmüller, Noah D. Goodman
This factored sum-product network makes (potentially cyclic) dependencies between subproblems explicit, and corresponds to a system of equations for the marginal distribution.
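As a toy instance of such a system of equations, consider two mutually recursive stochastic functions (a hypothetical example, not from the paper); their marginals satisfy a cyclic system that fixed-point iteration solves without enumerating runs.

```python
import random

# Hypothetical mutually recursive program:
#   f() returns flip(0.5) with prob 0.3, otherwise g()
#   g() returns False     with prob 0.2, otherwise f()

def flip(p):
    return random.random() < p

def f():
    return flip(0.5) if flip(0.3) else g()

def g():
    return False if flip(0.2) else f()

# The marginals satisfy the cyclic system:
#   qf = 0.3 * 0.5 + 0.7 * qg
#   qg = 0.8 * qf
qf = qg = 0.0
for _ in range(200):
    qf, qg = 0.3 * 0.5 + 0.7 * qg, 0.8 * qf

mc = sum(f() for _ in range(100_000)) / 100_000
print(f"fixed point: {qf:.4f}   Monte Carlo: {mc:.4f}")   # both ≈ 0.3409
```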