no code implementations • 26 Mar 2024 • Mehdy Dousty, David J. Fleet, José Zariffa
A comprehensive evaluation of function in home and community settings requires a hand grasp taxonomy for individuals with impaired hand function.
no code implementations • 20 Dec 2023 • Saurabh Saxena, Junhwa Hur, Charles Herrmann, Deqing Sun, David J. Fleet
In contrast, we advocate a generic, task-agnostic diffusion model, with several advancements such as log-scale depth parameterization to enable joint modeling of indoor and outdoor scenes, conditioning on the field-of-view (FOV) to handle scale ambiguity and synthetically augmenting FOV during training to generalize beyond the limited camera intrinsics in training datasets.
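As an illustrative sketch (not the paper's implementation), log-scale depth parameterization can be read as mapping metric depth to a bounded range in log space, so near (indoor) and far (outdoor) depths share representational capacity; the depth bounds below are hypothetical:

```python
import numpy as np

def log_depth_normalize(depth, d_min=0.5, d_max=80.0):
    """Map metric depth (meters) to [-1, 1] in log space.

    Log-scale parameterization allocates more range to near depths,
    which helps one model cover both indoor and outdoor scenes.
    d_min/d_max are illustrative bounds, not values from the paper.
    """
    depth = np.clip(depth, d_min, d_max)
    t = (np.log(depth) - np.log(d_min)) / (np.log(d_max) - np.log(d_min))
    return 2.0 * t - 1.0

def log_depth_denormalize(x, d_min=0.5, d_max=80.0):
    # Inverse map: [-1, 1] back to metric depth in meters.
    t = (x + 1.0) / 2.0
    return np.exp(t * (np.log(d_max) - np.log(d_min)) + np.log(d_min))
```

A 0.5 m indoor depth and an 80 m outdoor depth then land at the two ends of the same normalized interval.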
no code implementations • NeurIPS 2023 • Saurabh Saxena, Charles Herrmann, Junhwa Hur, Abhishek Kar, Mohammad Norouzi, Deqing Sun, David J. Fleet
Denoising diffusion probabilistic models have transformed image generation with their impressive fidelity and diversity.
no code implementations • 17 Apr 2023 • Shekoofeh Azizi, Simon Kornblith, Chitwan Saharia, Mohammad Norouzi, David J. Fleet
Deep generative models are becoming increasingly powerful, now generating diverse high fidelity photo-realistic samples given text prompts.
no code implementations • 28 Feb 2023 • Saurabh Saxena, Abhishek Kar, Mohammad Norouzi, David J. Fleet
To cope with the limited availability of data for supervised training, we leverage pre-training on self-supervised image-to-image translation tasks.
Ranked #22 on Monocular Depth Estimation on NYU-Depth V2 (using extra training data)
1 code implementation • CVPR 2023 • Sara Sabour, Suhani Vora, Daniel Duckworth, Ivan Krasin, David J. Fleet, Andrea Tagliasacchi
To cope with distractors, we advocate a form of robust estimation for NeRF training, modeling distractors in training data as outliers of an optimization problem.
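The idea of modeling distractors as outliers of an optimization problem can be sketched on a toy scalar problem: repeatedly refit while discarding the highest-residual observations. This is an illustrative analogue only, not the paper's NeRF training procedure:

```python
import numpy as np

def trimmed_mean_fit(y, keep_frac=0.8, iters=10):
    """Robustly estimate a scalar location parameter by iteratively
    discarding the highest-residual fraction of observations,
    treating them as outliers (distractors)."""
    est = np.mean(y)
    for _ in range(iters):
        resid = np.abs(y - est)               # residual per observation
        k = max(1, int(keep_frac * len(y)))   # number of inliers to keep
        inliers = y[np.argsort(resid)[:k]]    # lowest-residual subset
        est = np.mean(inliers)                # refit on inliers only
    return est
```

With 20% of the data replaced by gross outliers, the trimmed estimate recovers the inlier value while the plain mean does not.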
no code implementations • CVPR 2023 • Su Wang, Chitwan Saharia, Ceslee Montgomery, Jordi Pont-Tuset, Shai Noy, Stefano Pellegrini, Yasumasa Onoe, Sarah Laszlo, David J. Fleet, Radu Soricut, Jason Baldridge, Mohammad Norouzi, Peter Anderson, William Chan
Through extensive human evaluation on EditBench, we find that object-masking during training leads to across-the-board improvements in text-image alignment -- such that Imagen Editor is preferred over DALL-E 2 and Stable Diffusion -- and, as a cohort, these models are better at object-rendering than text-rendering, and handle material/color/size attributes better than count/shape attributes.
1 code implementation • 19 Oct 2022 • Renjie Liao, Simon Kornblith, Mengye Ren, David J. Fleet, Geoffrey Hinton
We revisit the challenging problem of training Gaussian-Bernoulli restricted Boltzmann machines (GRBMs), introducing two innovations.
1 code implementation • ICCV 2023 • Ting Chen, Lala Li, Saurabh Saxena, Geoffrey Hinton, David J. Fleet
Panoptic segmentation assigns semantic and instance ID labels to every pixel of an image.
no code implementations • 5 Oct 2022 • Jonathan Ho, William Chan, Chitwan Saharia, Jay Whang, Ruiqi Gao, Alexey Gritsenko, Diederik P. Kingma, Ben Poole, Mohammad Norouzi, David J. Fleet, Tim Salimans
We present Imagen Video, a text-conditional video generation system based on a cascade of video diffusion models.
Ranked #1 on Video Generation on LAION-400M
1 code implementation • 15 Jun 2022 • Ting Chen, Saurabh Saxena, Lala Li, Tsung-Yi Lin, David J. Fleet, Geoffrey Hinton
Despite the diversity of these tasks, by formulating the output of each task as a sequence of discrete tokens with a unified interface, we show that one can train a neural network with a single model architecture and loss function on all these tasks, with no task-specific customization.
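A minimal sketch of the "outputs as discrete tokens" idea, using bounding boxes as an example: quantize each continuous coordinate into one of a fixed number of bins, so it becomes a token a sequence model can emit. The bin count and image size below are hypothetical, not the paper's settings:

```python
def box_to_tokens(box, num_bins=1000, image_size=640):
    """Quantize a box (xmin, ymin, xmax, ymax) in pixels into
    discrete coordinate tokens with a shared vocabulary."""
    return [min(int(c / image_size * num_bins), num_bins - 1) for c in box]

def tokens_to_box(tokens, num_bins=1000, image_size=640):
    # Dequantize: map each token back to its bin center in pixels.
    return [(t + 0.5) / num_bins * image_size for t in tokens]
```

Round-tripping a box through tokens loses at most half a bin width per coordinate, which is the price of the unified discrete interface.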
1 code implementation • 1 Jun 2022 • Shayan Shekarforoush, David B. Lindell, David J. Fleet, Marcus A. Brubaker
Coordinate networks like Multiplicative Filter Networks (MFNs) and BACON offer some control over the frequency spectrum used to represent continuous signals such as images or 3D volumes.
3 code implementations • 7 Apr 2022 • Jonathan Ho, Tim Salimans, Alexey Gritsenko, William Chan, Mohammad Norouzi, David J. Fleet
Generating temporally coherent high fidelity video is an important milestone in generative modeling research.
1 code implementation • CVPR 2022 • Klaus Greff, Francois Belletti, Lucas Beyer, Carl Doersch, Yilun Du, Daniel Duckworth, David J. Fleet, Dan Gnanapragasam, Florian Golemo, Charles Herrmann, Thomas Kipf, Abhijit Kundu, Dmitry Lagun, Issam Laradji, Hsueh-Ti Liu, Henning Meyer, Yishu Miao, Derek Nowrouzezahrai, Cengiz Oztireli, Etienne Pot, Noha Radwan, Daniel Rebain, Sara Sabour, Mehdi S. M. Sajjadi, Matan Sela, Vincent Sitzmann, Austin Stone, Deqing Sun, Suhani Vora, Ziyu Wang, Tianhao Wu, Kwang Moo Yi, Fangcheng Zhong, Andrea Tagliasacchi
Data is the driving force of machine learning, with the amount and quality of training data often being more important for the performance of a system than architecture and training details.
5 code implementations • 10 Nov 2021 • Chitwan Saharia, William Chan, Huiwen Chang, Chris A. Lee, Jonathan Ho, Tim Salimans, David J. Fleet, Mohammad Norouzi
We expect this standardized evaluation protocol to play a role in advancing image-to-image translation research.
Ranked #1 on Colorization on ImageNet ctest10k
6 code implementations • ICLR 2022 • Ting Chen, Saurabh Saxena, Lala Li, David J. Fleet, Geoffrey Hinton
We present Pix2Seq, a simple and generic framework for object detection.
Ranked #77 on Object Detection on COCO minival (using extra training data)
no code implementations • 30 May 2021 • Jonathan Ho, Chitwan Saharia, William Chan, David J. Fleet, Mohammad Norouzi, Tim Salimans
We show that cascaded diffusion models are capable of generating high fidelity images on the class-conditional ImageNet generation benchmark, without any assistance from auxiliary image classifiers to boost sample quality.
Ranked #2 on Image Generation on ImageNet 64x64
4 code implementations • 15 Apr 2021 • Chitwan Saharia, Jonathan Ho, William Chan, Tim Salimans, David J. Fleet, Mohammad Norouzi
We present SR3, an approach to image Super-Resolution via Repeated Refinement.
1 code implementation • 17 Feb 2021 • Fartash Faghri, Sven Gowal, Cristina Vasconcelos, David J. Fleet, Fabian Pedregosa, Nicolas Le Roux
We demonstrate that the choice of optimizer, neural network architecture, and regularizer significantly affect the adversarial robustness of linear neural networks, providing guarantees without the need for adversarial training.
no code implementations • 27 Nov 2020 • Sara Sabour, Andrea Tagliasacchi, Soroosh Yazdani, Geoffrey E. Hinton, David J. Fleet
Capsule networks aim to parse images into a hierarchy of objects, parts and relations.
1 code implementation • 9 Jul 2020 • Fartash Faghri, David Duvenaud, David J. Fleet, Jimmy Ba
We introduce a method, Gradient Clustering, to minimize the variance of the average mini-batch gradient with stratified sampling.
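The variance-reduction mechanism behind stratified sampling can be shown on a toy mean-estimation problem (an illustrative sketch, not the paper's gradient-clustering algorithm): sampling a fixed number of points from each cluster, instead of uniformly from the pooled data, removes the between-cluster sampling noise.

```python
import random
import statistics

def mean_estimate_srs(data, n, rng):
    # Simple random sampling: estimate the mean from n uniform draws.
    return statistics.mean(rng.sample(data, n))

def mean_estimate_stratified(strata, per_stratum, rng):
    # Stratified sampling: draw per_stratum points from each stratum
    # and combine stratum means weighted by stratum size.
    total = sum(len(s) for s in strata)
    est = 0.0
    for s in strata:
        est += len(s) / total * statistics.mean(rng.sample(s, per_stratum))
    return est
```

When the strata (here, clusters of similar values, standing in for clusters of similar gradients) are well separated, the stratified estimator has far lower variance for the same sample budget.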
1 code implementation • NeurIPS 2020 • Sajad Norouzi, David J. Fleet, Mohammad Norouzi
We introduce Exemplar VAEs, a family of generative models that bridge the gap between parametric and non-parametric, exemplar based generative models.
1 code implementation • 18 Feb 2020 • Micha Livne, Kevin Swersky, David J. Fleet
MIM learning encourages high mutual information between observations and latent variables, and is robust against posterior collapse.
Ranked #1 on Question Answering on YahooCQA (using extra training data)
1 code implementation • 8 Oct 2019 • Micha Livne, Kevin Swersky, David J. Fleet
Experiments show that MIM learns representations with high mutual information, consistent encoding and decoding distributions, effective latent clustering, and data log likelihood comparable to VAE, while avoiding posterior collapse.
no code implementations • 4 Oct 2019 • Micha Livne, Kevin Swersky, David J. Fleet
We introduce the Mutual Information Machine (MIM), a novel formulation of representation learning, using a joint distribution over the observations and latent state in an encoder/decoder framework.
no code implementations • 4 Dec 2018 • Micha Livne, Leonid Sigal, Marcus A. Brubaker, David J. Fleet
To our knowledge, this is the first approach to take physics into account without explicit a priori knowledge of the environment or body dimensions.
no code implementations • 5 Nov 2018 • Micha Livne, David J. Fleet
Unlike autoencoders, the bottleneck does not limit model expressiveness, similar to flow-based maximum likelihood models.
10 code implementations • 18 Jul 2017 • Fartash Faghri, David J. Fleet, Jamie Ryan Kiros, Sanja Fidler
We present a new technique for learning visual-semantic embeddings for cross-modal retrieval.
Ranked #23 on Cross-Modal Retrieval on Flickr30k
no code implementations • 24 Nov 2015 • Yanshuai Cao, David J. Fleet
We introduce a framework for analyzing transductive combination of Gaussian process (GP) experts, where independently trained GP experts are combined in a way that depends on test point location, in order to scale GPs to big data.
2 code implementations • 16 Nov 2015 • Sara Sabour, Yanshuai Cao, Fartash Faghri, David J. Fleet
We show that the representation of an image in a deep neural network (DNN) can be manipulated to mimic those of other natural images, with only minor, imperceptible perturbations to the original image.
no code implementations • NeurIPS 2015 • Mohammad Norouzi, Maxwell D. Collins, Matthew Johnson, David J. Fleet, Pushmeet Kohli
In this paper, we present an algorithm for optimizing the split functions at all levels of the tree jointly with the leaf parameters, based on a global objective.
no code implementations • 19 Jun 2015 • Mohammad Norouzi, Maxwell D. Collins, David J. Fleet, Pushmeet Kohli
We develop a convex-concave upper bound on the classification loss for a one-level decision tree, and optimize the bound by stochastic gradient descent at each internal node of the tree.
no code implementations • CVPR 2015 • Marcus A. Brubaker, Ali Punjani, David J. Fleet
A new framework for estimation is introduced which relies on modern stochastic optimization techniques to scale to large datasets.
no code implementations • 28 Oct 2014 • Yanshuai Cao, David J. Fleet
In this work, we propose a generalized product of experts (gPoE) framework for combining the predictions of multiple probabilistic models.
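A product of Gaussian experts has a simple closed form: precisions add, and means combine precision-weighted. The sketch below uses a plain per-expert weight as a stand-in for the paper's weighting scheme (which is chosen per test point), so treat the weights as hypothetical:

```python
def gpoe_combine(mus, vars_, weights):
    """Combine Gaussian expert predictions (means, variances) in a
    generalized product-of-experts. Each expert's precision 1/var is
    scaled by its weight; weights here are a simplified stand-in for
    the paper's test-point-dependent weighting."""
    precs = [w / v for w, v in zip(weights, vars_)]
    prec = sum(precs)                                   # combined precision
    mu = sum(p * m for p, m in zip(precs, mus)) / prec  # weighted mean
    return mu, 1.0 / prec
```

Two equally weighted, equally confident experts at 0 and 2 combine to a prediction at 1 with half the variance of either expert.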
no code implementations • CVPR 2014 • Gerard Pons-Moll, David J. Fleet, Bodo Rosenhahn
We advocate the inference of qualitative information about 3D human pose, called posebits, from images.
no code implementations • NeurIPS 2013 • Yanshuai Cao, Marcus A. Brubaker, David J. Fleet, Aaron Hertzmann
We propose an efficient optimization algorithm for selecting a subset of training data to induce sparsity for Gaussian process regression.
2 code implementations • 11 Jul 2013 • Mohammad Norouzi, Ali Punjani, David J. Fleet
There is growing interest in representing image data and feature descriptors using compact binary codes for fast near neighbor search.
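What makes compact binary codes fast in practice is that comparing two codes reduces to an XOR and a popcount. A minimal brute-force Hamming-space search (illustrative; the paper is about more sophisticated multi-index search over such codes):

```python
def hamming(a, b):
    # Hamming distance between two binary codes stored as Python ints:
    # XOR the codes, then count the set bits.
    return bin(a ^ b).count("1")

def nearest(query, codes):
    # Exhaustive exact nearest neighbor in Hamming space;
    # returns the index of the closest code.
    return min(range(len(codes)), key=lambda i: hamming(query, codes[i]))
```

Even this naive scan is cheap per comparison, which is why binary codes are attractive for large-scale near neighbor search.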
1 code implementation • CVPR 2013 • Mohammad Norouzi, David J. Fleet
A fundamental limitation of quantization techniques like the k-means clustering algorithm is the storage and runtime cost associated with the large numbers of clusters required to keep quantization errors small and model fidelity high.
no code implementations • NeurIPS 2012 • Mohammad Norouzi, David J. Fleet, Ruslan R. Salakhutdinov
Motivated by large-scale multimedia applications we propose to learn mappings from high-dimensional data to binary codes that preserve semantic similarity.