no code implementations • 3 Apr 2024 • Ying Shen, Yizhe Zhang, Shuangfei Zhai, Lifu Huang, Joshua M. Susskind, Jiatao Gu
This paper introduces a domain-general framework for many-to-many image generation, capable of producing interrelated image series from a given set of images, offering a scalable approach that obviates the need for task-specific solutions across different multi-image scenarios.
no code implementations • 7 Dec 2023 • Vimal Thilak, Chen Huang, Omid Saremi, Laurent Dinh, Hanlin Goh, Preetum Nakkiran, Joshua M. Susskind, Etai Littwin
In this paper, we introduce LiDAR (Linear Discriminant Analysis Rank), a metric designed to measure the quality of representations within JE architectures.
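The core idea behind an LDA-rank metric like LiDAR can be sketched as follows: whiten the between-class scatter of the representations by the within-class scatter, then measure how many directions of the result carry discriminative signal via an entropy-based effective rank. The function name and all implementation details below are illustrative assumptions, not the authors' code.

```python
import numpy as np

def lda_effective_rank(feats, labels, eps=1e-6):
    """Entropy-based effective rank of an LDA-style matrix (sketch).

    Whiten the between-class scatter by the within-class scatter, then
    return exp(entropy) of the normalized eigenvalue distribution.
    """
    d = feats.shape[1]
    mu = feats.mean(axis=0)
    S_w = eps * np.eye(d)          # within-class scatter (regularized)
    S_b = np.zeros((d, d))         # between-class scatter
    for c in np.unique(labels):
        fc = feats[labels == c]
        mc = fc.mean(axis=0)
        S_w += (fc - mc).T @ (fc - mc)
        diff = (mc - mu)[:, None]
        S_b += len(fc) * diff @ diff.T
    # Whitened between-class scatter: S_w^{-1/2} S_b S_w^{-1/2}
    w, U = np.linalg.eigh(S_w)
    W = U @ np.diag(w ** -0.5) @ U.T
    lam = np.clip(np.linalg.eigvalsh(W @ S_b @ W), 0.0, None) + eps
    p = lam / lam.sum()            # eigenvalue distribution
    return float(np.exp(-(p * np.log(p)).sum()))  # exp(entropy)
```

The result ranges from 1 (all discriminative signal in one direction) up to the feature dimension, so it can be compared across representations of the same width.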
no code implementations • 27 Nov 2023 • Yuyang Wang, Ahmed A. Elhag, Navdeep Jaitly, Joshua M. Susskind, Miguel Angel Bautista
In this paper we tackle the problem of generating conformers of a molecule in 3D space given its molecular graph.
no code implementations • 12 Oct 2023 • Xiaoming Zhao, Alex Colburn, Fangchang Ma, Miguel Angel Bautista, Joshua M. Susskind, Alexander G. Schwing
In contrast, for dynamic scenes, scene-specific optimization techniques exist, but, to the best of our knowledge, there is currently no generalized method for dynamic novel view synthesis from a given monocular video.
no code implementations • 24 May 2023 • Ahmed A. Elhag, Yuyang Wang, Joshua M. Susskind, Miguel Angel Bautista
Our approach allows sampling continuous functions on manifolds and is invariant to rigid and isometric transformations of the manifold.
no code implementations • 1 Mar 2023 • Peiye Zhuang, Samira Abnar, Jiatao Gu, Alex Schwing, Joshua M. Susskind, Miguel Ángel Bautista
Diffusion probabilistic models have quickly become a major approach for generative modeling of images, 3D geometry, video and other domains.
no code implementations • 29 Sep 2021 • Shuangfei Zhai, Walter Talbott, Nitish Srivastava, Chen Huang, Hanlin Goh, Ruixiang Zhang, Joshua M. Susskind
We introduce Dot Product Attention Free Transformer (DAFT), an efficient variant of Transformers that eliminates the query-key dot product in self-attention.
Ranked #623 on Image Classification on ImageNet
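The flavor of attention-free mixing can be illustrated with the simplest variant of the idea (a hedged sketch, not the exact DAFT layer): keys compete through a softmax over sequence positions, values are pooled with those weights into a global context, and the sigmoid of the query gates that context elementwise, avoiding any T×T dot-product matrix.

```python
import numpy as np

def aft_simple(Q, K, V):
    """Attention-free token mixing without a query-key dot product.

    Q, K, V: arrays of shape (T, d). Cost is O(T*d) rather than the
    O(T^2 * d) of standard self-attention.
    """
    # Softmax over the sequence axis, independently per feature dim
    w = np.exp(K - K.max(axis=0, keepdims=True))
    w = w / w.sum(axis=0, keepdims=True)            # (T, d)
    pooled = (w * V).sum(axis=0, keepdims=True)     # (1, d) global context
    gate = 1.0 / (1.0 + np.exp(-Q))                 # sigmoid(Q)
    return gate * pooled                            # broadcast to (T, d)
```

Because the pooled context is shared across positions, each output token differs only through its query gate; richer variants reintroduce position dependence with learned biases.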
no code implementations • 12 Jul 2021 • Pengsheng Guo, Miguel Angel Bautista, Alex Colburn, Liang Yang, Daniel Ulbricht, Joshua M. Susskind, Qi Shan
We study the problem of novel view synthesis from sparse source observations of a scene comprised of 3D objects.
no code implementations • 2 Jul 2021 • Shih-Yu Sun, Vimal Thilak, Etai Littwin, Omid Saremi, Joshua M. Susskind
Deep linear networks trained with gradient descent yield low-rank solutions, a phenomenon typically studied in matrix factorization.
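The low-rank bias is easy to see in a toy experiment (my own minimal setup, not the paper's): factor a rank-1 target map through two linear layers trained with plain gradient descent from a small initialization, and the learned end-to-end matrix concentrates its spectrum in one singular value.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 6
u, v = rng.normal(size=(d, 1)), rng.normal(size=(d, 1))
T = u @ v.T                      # rank-1 target linear map
W1 = 0.01 * rng.normal(size=(d, d))   # small init favors low-rank solutions
W2 = 0.01 * rng.normal(size=(d, d))
lr = 0.05
for _ in range(2000):
    E = W2 @ W1 - T              # residual of the end-to-end map
    W2 -= lr * E @ W1.T          # gradient of 0.5 * ||W2 W1 - T||_F^2
    W1 -= lr * W2.T @ E
s = np.linalg.svd(W2 @ W1, compute_uv=False)
print(s[0] / s.sum())            # fraction of the spectrum in the top singular value
```

With a larger initialization scale the same training run lands on higher-rank solutions, which is what makes the implicit bias worth studying.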
no code implementations • 1 Jul 2021 • Etai Littwin, Omid Saremi, Shuangfei Zhai, Vimal Thilak, Hanlin Goh, Joshua M. Susskind, Greg Yang
We analyze the learning dynamics of infinitely wide neural networks with a finite-sized bottleneck.
1 code implementation • ICCV 2021 • Terrance DeVries, Miguel Angel Bautista, Nitish Srivastava, Graham W. Taylor, Joshua M. Susskind
In this paper, we introduce Generative Scene Networks (GSN), which learns to decompose scenes into a collection of many local radiance fields that can be rendered from a free moving camera.
Ranked #1 on Scene Generation on VizDoom
no code implementations • 1 Jan 2021 • Yue Wu, Shuangfei Zhai, Nitish Srivastava, Joshua M. Susskind, Jian Zhang, Ruslan Salakhutdinov, Hanlin Goh
Offline Reinforcement Learning promises to learn effective policies from previously collected, static datasets without the need for exploration.
2 code implementations • ICCV 2021 • Mike Roberts, Jason Ramapuram, Anurag Ranjan, Atulit Kumar, Miguel Angel Bautista, Nathan Paczan, Russ Webb, Joshua M. Susskind
To create our dataset, we leverage a large repository of synthetic scenes created by professional artists, and we generate 77,400 images of 461 indoor scenes with detailed per-pixel labels and corresponding ground truth geometry.
no code implementations • 27 Jun 2020 • Miguel Angel Bautista, Walter Talbott, Shuangfei Zhai, Nitish Srivastava, Joshua M. Susskind
State-of-the-art learning-based monocular 3D reconstruction methods learn priors over object categories on the training set, and as a result struggle to achieve reasonable generalization to object categories unseen during training.
1 code implementation • NeurIPS 2019 • Shuangfei Zhai, Walter Talbott, Carlos Guestrin, Joshua M. Susskind
In contrast to a traditional view where the discriminator learns a constant function when reaching convergence, here we show that it can provide useful information for downstream tasks, e.g., feature extraction for classification.
no code implementations • 28 Oct 2019 • Alaaeldin El-Nouby, Shuangfei Zhai, Graham W. Taylor, Joshua M. Susskind
Deep neural networks require collecting and annotating large amounts of data to train successfully.
Ranked #44 on Self-Supervised Action Recognition on UCF101
no code implementations • 25 Sep 2019 • Shuangfei Zhai, Carlos Guestrin, Joshua M. Susskind
At inference time, the HBAE consists of two sampling steps: first a latent code for the input is sampled, and then this code is passed to the conditional generator to output a stochastic reconstruction.
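The two-step inference described above can be sketched with stand-in components (the encoder and generator here are toy placeholders I made up, not the paper's networks): sample a latent code from the input's posterior, then feed it to a conditional generator whose output is itself stochastic.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x):
    """Hypothetical encoder: mean and scale of a latent posterior."""
    return np.full(2, x.mean()), np.full(2, 0.1)

def generate(z, out_dim):
    """Hypothetical conditional generator: stochastic given the code z."""
    return np.full(out_dim, z.sum()) + 0.05 * rng.normal(size=out_dim)

x = rng.normal(size=4)
mu, sigma = encode(x)
z = mu + sigma * rng.normal(size=mu.shape)   # step 1: sample a latent code
x_hat = generate(z, x.size)                  # step 2: stochastic reconstruction
```

Repeating either step yields a different reconstruction of the same input, which is what distinguishes this from a deterministic autoencoder pass.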