Search Results for author: Jonathan Tu

Found 3 papers, 0 papers with code

Understanding the Inner Workings of Language Models Through Representation Dissimilarity

no code implementations • 23 Oct 2023 • Davis Brown, Charles Godfrey, Nicholas Konz, Jonathan Tu, Henry Kvinge

As language models are applied to an increasing number of real-world applications, understanding their inner workings has become an important issue in model trust, interpretability, and transparency.

Language Modelling

Attributing Learned Concepts in Neural Networks to Training Data

no code implementations • 4 Oct 2023 • Nicholas Konz, Charles Godfrey, Madelyn Shapiro, Jonathan Tu, Henry Kvinge, Davis Brown

By now there is substantial evidence that deep learning models learn certain human-interpretable features as part of their internal representations of data.
