Search Results for author: Jonathan Ragan-Kelley

Found 16 papers, 10 papers with code

Hydra: Sequentially-Dependent Draft Heads for Medusa Decoding

no code implementations • 7 Feb 2024 • Zachary Ankner, Rishab Parthasarathy, Aniruddha Nrusimha, Christopher Rinard, Jonathan Ragan-Kelley, William Brandon

In this work, we propose Hydra heads, a sequentially dependent, drop-in replacement for standard draft heads that significantly improves speculation accuracy.
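
As a rough illustration of what "sequentially dependent" means here, the toy sketch below contrasts independent draft heads (each sees only the base model's last hidden state) with heads that also condition on the token proposed by the previous head. The shapes, the single-matrix heads, and the names `medusa_draft`/`hydra_draft` are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab, n_heads = 64, 100, 3

# Toy stand-ins for learned weights (hypothetical shapes, not the paper's).
embed = rng.normal(size=(vocab, d_model))
head_w = [rng.normal(size=(2 * d_model, vocab)) for _ in range(n_heads)]

def medusa_draft(h):
    """Independent draft heads: every head sees only the base hidden state h."""
    return [int(np.argmax(h @ w[:d_model])) for w in head_w]

def hydra_draft(h):
    """Sequentially dependent draft heads: head k also sees the embedding of the
    token proposed by head k-1, so later guesses condition on earlier ones."""
    tokens, prev = [], np.zeros(d_model)
    for w in head_w:
        tok = int(np.argmax(np.concatenate([h, prev]) @ w))
        tokens.append(tok)
        prev = embed[tok]
    return tokens

h = rng.normal(size=d_model)          # base model's last hidden state
print("independent draft:", medusa_draft(h))
print("sequential draft: ", hydra_draft(h))
```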

How to guess a gradient

no code implementations • 7 Dec 2023 • Utkarsh Singhal, Brian Cheung, Kartik Chandra, Jonathan Ragan-Kelley, Joshua B. Tenenbaum, Tomaso A. Poggio, Stella X. Yu

We study how to narrow the gap in optimization performance between methods that calculate exact gradients and those that use directional derivatives.
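
The baseline being improved on can be sketched in a few lines: probe the loss along a random direction and scale that direction by the measured directional derivative, which gives an unbiased (if noisy) guess of the gradient. The quadratic objective and the finite-difference probe below are stand-ins, not the paper's estimators.

```python
import numpy as np

rng = np.random.default_rng(0)

def loss(w):
    """Toy objective standing in for a network's training loss."""
    return np.sum((w - 1.0) ** 2)

def directional_derivative(f, w, v, eps=1e-6):
    """Central finite difference of f at w along v (a stand-in for an exact
    forward-mode Jacobian-vector product)."""
    return (f(w + eps * v) - f(w - eps * v)) / (2 * eps)

w = np.zeros(10)
for _ in range(500):
    v = rng.normal(size=w.shape)                       # random probe direction
    g_hat = directional_derivative(loss, w, v) * v     # unbiased gradient guess
    w -= 0.01 * g_hat

print("loss after guessed-gradient descent:", loss(w))
```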

Striped Attention: Faster Ring Attention for Causal Transformers

1 code implementation • 15 Nov 2023 • William Brandon, Aniruddha Nrusimha, Kevin Qian, Zachary Ankner, Tian Jin, Zhiye Song, Jonathan Ragan-Kelley

In experiments running Striped Attention on A100 GPUs and TPUv4s, we are able to achieve up to 1.45x end-to-end throughput improvements over the original Ring Attention algorithm on causal transformer training at a sequence length of 256k.
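
The core trick is a different token-to-device layout: instead of giving each device a contiguous block of the sequence, tokens are dealt out round-robin ("striped"), so the causal mask no longer concentrates work on the devices holding the latest tokens. The toy tally below counts causal (query, key) pairs per device as a rough proxy; the real algorithm balances work within each round of ring attention, which this proxy glosses over.

```python
seq_len, n_devices = 16, 4

def causal_work(partition):
    """(query, key) pairs with key <= query that each device's query shard
    must attend to: a crude proxy for per-device work under a causal mask."""
    return [sum(q + 1 for q in queries) for queries in partition]

# Blockwise partition, as in the original Ring Attention layout.
blocks = [list(range(d * seq_len // n_devices, (d + 1) * seq_len // n_devices))
          for d in range(n_devices)]

# Striped partition: token i goes to device i % n_devices.
stripes = [list(range(d, seq_len, n_devices)) for d in range(n_devices)]

print("block work per device :", causal_work(blocks))   # [10, 26, 42, 58]: skewed
print("stripe work per device:", causal_work(stripes))  # [28, 32, 36, 40]: far more even
```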

The Cost of Down-Scaling Language Models: Fact Recall Deteriorates before In-Context Learning

no code implementations • 7 Oct 2023 • Tian Jin, Nolan Clement, Xin Dong, Vaishnavh Nagarajan, Michael Carbin, Jonathan Ragan-Kelley, Gintare Karolina Dziugaite

We study two natural scaling techniques -- weight pruning and simply training a smaller or larger model, which we refer to as dense scaling -- and their effects on two core capabilities of LLMs: (a) recalling facts presented during pre-training and (b) processing information presented in-context during inference.
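
For concreteness, here is a minimal numpy sketch of the two scaling knobs being compared: unstructured magnitude pruning of an existing weight matrix versus simply instantiating a narrower ("dense-scaled") layer. The pruning criterion and width factor are illustrative; the paper's actual pruning and training recipes are more involved.

```python
import numpy as np

rng = np.random.default_rng(0)

def magnitude_prune(w, sparsity):
    """Zero out the smallest-magnitude weights (unstructured weight pruning)."""
    k = int(sparsity * w.size)
    threshold = np.partition(np.abs(w).ravel(), k)[k]
    return np.where(np.abs(w) < threshold, 0.0, w)

def dense_scale(fan_in, fan_out, width_factor):
    """'Dense scaling': simply instantiate a narrower (or wider) layer."""
    return rng.normal(size=(int(fan_in * width_factor), fan_out))

w = rng.normal(size=(512, 512))
print("pruned nonzeros :", np.count_nonzero(magnitude_prune(w, 0.7)))
print("dense-scaled shape:", dense_scale(512, 512, 0.5).shape)
```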

In-Context Learning

Differentiating Metropolis-Hastings to Optimize Intractable Densities

1 code implementation • 13 Jun 2023 • Gaurav Arya, Ruben Seyer, Frank Schäfer, Kartik Chandra, Alexander K. Lew, Mathieu Huot, Vikash K. Mansinghka, Jonathan Ragan-Kelley, Christopher Rackauckas, Moritz Schauer

We develop an algorithm for automatic differentiation of Metropolis-Hastings samplers, allowing us to differentiate through probabilistic inference, even if the model has discrete components within it.
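
For reference, the object being differentiated is an ordinary Metropolis-Hastings sampler; the sketch below is a plain random-walk MH sampler on a toy target with a parameter theta. Differentiating expectations of its output with respect to theta is exactly what the discrete accept/reject step normally prevents and is what the paper's algorithm adds; this sketch shows only the forward sampler.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_density(x, theta):
    """Unnormalized log-density of a toy target: a Gaussian with mean theta."""
    return -0.5 * (x - theta) ** 2

def metropolis_hastings(theta, n_steps=5000, step=1.0):
    """Plain random-walk Metropolis-Hastings for the toy target."""
    x, samples = 0.0, []
    for _ in range(n_steps):
        prop = x + step * rng.normal()
        # Symmetric proposal, so the acceptance ratio is just the density ratio.
        if np.log(rng.uniform()) < log_density(prop, theta) - log_density(x, theta):
            x = prop
        samples.append(x)
    return np.array(samples)

samples = metropolis_hastings(theta=2.0)
print("posterior mean estimate:", samples.mean())   # close to 2.0
```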

Designing Perceptual Puzzles by Differentiating Probabilistic Programs

no code implementations • 26 Apr 2022 • Kartik Chandra, Tzu-Mao Li, Joshua Tenenbaum, Jonathan Ragan-Kelley

We design new visual illusions by finding "adversarial examples" for principled models of human perception -- specifically, for probabilistic models, which treat vision as Bayesian inference.
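
The recipe, stripped to a toy: pick a differentiable model of an observer, then optimize the stimulus to maximize the gap between what the observer reports and the physical ground truth, while keeping the stimulus valid. The sketch below uses a crude gray-world color-constancy heuristic as a stand-in observer and finite-difference gradients; the paper instead differentiates through genuine probabilistic (Bayesian) perception models.

```python
import numpy as np

def perceived_reflectance(image, patch):
    """Toy gray-world observer: estimate the illuminant as the mean luminance
    and divide it out.  A stand-in for the paper's Bayesian observer models."""
    return image[patch] / image.mean()

image = np.full((8, 8), 0.5)                     # flat mid-gray scene
patch = (4, 4)                                   # pixel we hold physically fixed
reference = perceived_reflectance(image, patch)  # appearance on a flat surround

surround = np.ones(image.shape, dtype=bool)
surround[patch] = False
eps, lr = 1e-4, 0.2

def objective(img):
    # How far the patch's perceived reflectance drifts from its reference appearance.
    return abs(perceived_reflectance(img, patch) - reference)

# Adversarial search over the surround only, by finite-difference gradient ascent.
for _ in range(200):
    grad = np.zeros_like(image)
    for idx in zip(*np.nonzero(surround)):
        bumped = image.copy()
        bumped[idx] += eps
        grad[idx] = (objective(bumped) - objective(image)) / eps
    image[surround] = np.clip(image[surround] + lr * grad[surround], 0.0, 1.0)

# The patch's pixel value never changed, but the brightened surround makes the
# toy observer report a much lower reflectance: a simultaneous-contrast illusion.
print("reference appearance:", reference,
      " adversarial appearance:", perceived_reflectance(image, patch))
```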

Color Constancy • Probabilistic Programming

Differentiable Vector Graphics Rasterization for Editing and Learning

1 code implementation • ACM Transactions on Graphics 2020 • Tzu-Mao Li, Michal Lukáč, Michaël Gharbi, Jonathan Ragan-Kelley

We introduce a differentiable rasterizer that bridges the vector graphics and raster image domains, enabling powerful raster-based loss functions, optimization procedures, and machine learning techniques to edit and generate vector content.
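
The idea can be miniaturized: replace the hard inside/outside test of a primitive with a smooth coverage function, and raster-domain losses become differentiable with respect to the vector parameters. The sketch below fits a single soft disc to a raster target; the sigmoid coverage and finite-difference gradients are stand-ins for the paper's rasterizer and its analytic derivatives, and none of this uses the released diffvg code.

```python
import numpy as np

H = W = 64
ys, xs = np.mgrid[0:H, 0:W].astype(np.float64)

def soft_rasterize(cx, cy, r, blur=2.0):
    """Soft coverage of a disc: sigmoid of the signed distance to its boundary."""
    signed = r - np.sqrt((xs - cx) ** 2 + (ys - cy) ** 2)   # > 0 inside
    return 1.0 / (1.0 + np.exp(-signed / blur))

# Raster target: a hard disc standing in for an input image to vectorize.
target = (np.sqrt((xs - 40.0) ** 2 + (ys - 24.0) ** 2) < 12.0).astype(np.float64)

def loss(p):
    return np.mean((soft_rasterize(*p) - target) ** 2)

# Gradient descent on the vector parameters (center, radius) through the
# rasterizer; finite differences stand in for analytic derivatives here.
params, h = np.array([30.0, 20.0, 8.0]), 1e-3
for _ in range(400):
    grad = np.array([(loss(params + h * e) - loss(params - h * e)) / (2 * h)
                     for e in np.eye(3)])
    params -= 100.0 * grad

print("fitted (cx, cy, r):", params.round(2))   # approaches (40, 24, 12)
```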

Vector Graphics

DiffTaichi: Differentiable Programming for Physical Simulation

2 code implementations • ICLR 2020 • Yuanming Hu, Luke Anderson, Tzu-Mao Li, Qi Sun, Nathan Carr, Jonathan Ragan-Kelley, Frédo Durand

We present DiffTaichi, a new differentiable programming language tailored for building high-performance differentiable physical simulators.
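
DiffTaichi builds on the Taichi language, whose kernels can be recorded on a tape and replayed backward for gradients. A minimal sketch (assuming a recent open-source Taichi release where the autodiff tape is exposed as `ti.ad.Tape`; older releases call it `ti.Tape`) differentiates a tiny spring simulation's final position with respect to its initial velocity.

```python
import taichi as ti

ti.init(arch=ti.cpu)

steps, dt, k = 128, 0.02, 4.0
x = ti.field(dtype=ti.f32, shape=steps, needs_grad=True)   # position per timestep
v = ti.field(dtype=ti.f32, shape=steps, needs_grad=True)   # velocity per timestep
loss = ti.field(dtype=ti.f32, shape=(), needs_grad=True)

@ti.kernel
def step(t: ti.i32):
    # Semi-implicit Euler integration of a unit mass on a spring.
    v[t] = v[t - 1] - dt * k * x[t - 1]
    x[t] = x[t - 1] + dt * v[t]

@ti.kernel
def compute_loss():
    # Squared distance of the final position from a target position of 1.0.
    loss[None] = (x[steps - 1] - 1.0) ** 2

x[0], v[0] = 0.0, 1.0
with ti.ad.Tape(loss=loss):          # records kernel launches, replays them backward
    for t in range(1, steps):
        step(t)
    compute_loss()

print("d loss / d v0 =", v.grad[0])  # gradient w.r.t. the initial velocity
```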

Physical Simulations

Gradient Descent: The Ultimate Optimizer

2 code implementations • 29 Sep 2019 • Kartik Chandra, Audrey Xie, Jonathan Ragan-Kelley, Erik Meijer

This allows us to easily apply the method to other optimizers and hyperparameters (e.g. momentum coefficients).
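
The underlying move is to treat the optimizer's own hyperparameters as parameters and take gradient steps on them too. The sketch below hand-derives the single hypergradient for the learning rate of plain SGD on a toy quadratic; the paper's contribution is obtaining such hypergradients automatically via automatic differentiation (and stacking the trick recursively), which this sketch does not do.

```python
import numpy as np

# Toy quadratic objective standing in for a training loss.
A = np.diag([1.0, 10.0])
def grad(w):
    return A @ w

w = np.array([5.0, 5.0])
alpha, kappa = 0.01, 1e-6        # learning rate and its own (hyper-)learning rate
g_prev = np.zeros_like(w)

for _ in range(200):
    g = grad(w)
    # The previous step was w <- w - alpha * g_prev, so
    #   d loss / d alpha = grad(w) . (d w / d alpha) = -(g . g_prev).
    # Take a gradient step on alpha itself before using it.
    alpha += kappa * float(g @ g_prev)
    w -= alpha * g
    g_prev = g

print("final loss:", 0.5 * float(w @ A @ w), " adapted alpha:", alpha)
```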

BIG-bench Machine Learning • Hyperparameter Optimization

Programming Heterogeneous Systems from an Image Processing DSL

3 code implementations • 28 Oct 2016 • Jing Pu, Steven Bell, Xuan Yang, Jeff Setter, Stephen Richardson, Jonathan Ragan-Kelley, Mark Horowitz

We address this problem by extending the image processing language, Halide, so users can specify which portions of their applications should become hardware accelerators, and then we provide a compiler that uses this code to automatically create the accelerator along with the "glue" code needed for the user's application to access this hardware.

Software Engineering

A Systematic Approach to Blocking Convolutional Neural Networks

1 code implementation • 14 Jun 2016 • Xuan Yang, Jing Pu, Blaine Burton Rister, Nikhil Bhagdikar, Stephen Richardson, Shahar Kvatinsky, Jonathan Ragan-Kelley, Ardavan Pedram, Mark Horowitz

Convolutional Neural Networks (CNNs) are the state-of-the-art solution for many computer vision problems, and many researchers have explored optimized implementations.

Blocking

Opt: A Domain Specific Language for Non-linear Least Squares Optimization in Graphics and Imaging

no code implementations • 22 Apr 2016 • Zachary DeVito, Michael Mara, Michael Zollhöfer, Gilbert Bernstein, Jonathan Ragan-Kelley, Christian Theobalt, Pat Hanrahan, Matthew Fisher, Matthias Nießner

Many graphics and vision problems can be expressed as non-linear least squares optimizations of objective functions over visual data, such as images and meshes.
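
The kind of problem Opt targets looks like the toy below: a handful of unknowns, a stack of residuals, and a Gauss-Newton loop. Opt generates optimized GPU solvers of this shape from a declarative energy specification; this hand-written numpy version only illustrates the non-linear least squares structure, not Opt's DSL or code generation.

```python
import numpy as np

# Toy non-linear least squares problem: fit y = exp(a * x) + b to noisy samples.
rng = np.random.default_rng(0)
xs = np.linspace(0.0, 1.0, 50)
ys = np.exp(0.8 * xs) + 0.3 + 0.01 * rng.normal(size=xs.size)

def residuals(p):
    a, b = p
    return np.exp(a * xs) + b - ys

def jacobian(p):
    a, _ = p
    return np.stack([xs * np.exp(a * xs), np.ones_like(xs)], axis=1)

# Gauss-Newton: linearize the residuals and solve a linear least-squares
# system for the update at each iteration.
p = np.array([0.0, 0.0])
for _ in range(10):
    r, J = residuals(p), jacobian(p)
    p -= np.linalg.lstsq(J, r, rcond=None)[0]

print("estimated (a, b):", p.round(3))   # close to (0.8, 0.3)
```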
