Search Results for author: Thomas Funkhouser

Found 66 papers, 33 papers with code

Gaussian3Diff: 3D Gaussian Diffusion for 3D Full Head Synthesis and Editing

no code implementations 5 Dec 2023 Yushi Lan, Feitong Tan, Di Qiu, Qiangeng Xu, Kyle Genova, Zeng Huang, Sean Fanello, Rohit Pandey, Thomas Funkhouser, Chen Change Loy, Yinda Zhang

We present a novel framework for generating photorealistic 3D human heads and subsequently manipulating and reposing them with remarkable flexibility.

Face Model

TidyBot: Personalized Robot Assistance with Large Language Models

1 code implementation 9 May 2023 Jimmy Wu, Rika Antonova, Adam Kan, Marion Lepert, Andy Zeng, Shuran Song, Jeannette Bohg, Szymon Rusinkiewicz, Thomas Funkhouser

For a robot to personalize physical assistance effectively, it must learn user preferences that can be generally reapplied to future scenarios.

Polynomial Neural Fields for Subband Decomposition and Manipulation

1 code implementation 9 Feb 2023 Guandao Yang, Sagie Benaim, Varun Jampani, Kyle Genova, Jonathan T. Barron, Thomas Funkhouser, Bharath Hariharan, Serge Belongie

We use this framework to design Fourier PNFs, which match state-of-the-art performance in signal representation tasks that use neural fields.

FutureHuman3D: Forecasting Complex Long-Term 3D Human Behavior from Video Observations

no code implementations 25 Nov 2022 Christian Diller, Thomas Funkhouser, Angela Dai

Thus, we design our method to only require 2D RGB data while being able to generate 3D human motion sequences.

Pose Prediction

Learning Pneumatic Non-Prehensile Manipulation with a Mobile Blower

1 code implementation 5 Apr 2022 Jimmy Wu, Xingyuan Sun, Andy Zeng, Shuran Song, Szymon Rusinkiewicz, Thomas Funkhouser

We investigate pneumatic non-prehensile manipulation (i.e., blowing) as a means of efficiently moving scattered objects into a target receptacle.

Neural Dual Contouring

2 code implementations 4 Feb 2022 Zhiqin Chen, Andrea Tagliasacchi, Thomas Funkhouser, Hao Zhang

We introduce neural dual contouring (NDC), a new data-driven approach to mesh reconstruction based on dual contouring (DC).

Surface Reconstruction

Revisiting 3D Object Detection From an Egocentric Perspective

no code implementations NeurIPS 2021 Boyang Deng, Charles R. Qi, Mahyar Najibi, Thomas Funkhouser, Yin Zhou, Dragomir Anguelov

Given the insight that SDE would benefit from more accurate geometry descriptions, we propose to represent objects as amodal contours, specifically amodal star-shaped polygons, and devise a simple model, StarPoly, to predict such contours.

3D Object Detection Autonomous Driving +2

Urban Radiance Fields

no code implementations CVPR 2022 Konstantinos Rematas, Andrew Liu, Pratul P. Srinivasan, Jonathan T. Barron, Andrea Tagliasacchi, Thomas Funkhouser, Vittorio Ferrari

The goal of this work is to perform 3D reconstruction and novel view synthesis from data captured by scanning platforms commonly deployed for world mapping in urban outdoor environments (e.g., Street View).

3D Reconstruction Novel View Synthesis

Scene Representation Transformer: Geometry-Free Novel View Synthesis Through Set-Latent Scene Representations

1 code implementation CVPR 2022 Mehdi S. M. Sajjadi, Henning Meyer, Etienne Pot, Urs Bergmann, Klaus Greff, Noha Radwan, Suhani Vora, Mario Lucic, Daniel Duckworth, Alexey Dosovitskiy, Jakob Uszkoreit, Thomas Funkhouser, Andrea Tagliasacchi

In this work, we propose the Scene Representation Transformer (SRT), a method which processes posed or unposed RGB images of a new area, infers a "set-latent scene representation", and synthesises novel views, all in a single feed-forward pass.

Novel View Synthesis Semantic Segmentation

Learning 3D Semantic Segmentation with only 2D Image Supervision

no code implementations 21 Oct 2021 Kyle Genova, Xiaoqi Yin, Abhijit Kundu, Caroline Pantofaru, Forrester Cole, Avneesh Sud, Brian Brewington, Brian Shucker, Thomas Funkhouser

With the recent growth of urban mapping and autonomous driving efforts, there has been an explosion of raw 3D data collected from terrestrial platforms with lidar scanners and color cameras.

3D Semantic Segmentation Autonomous Driving +1

Multiresolution Deep Implicit Functions for 3D Shape Representation

no code implementations ICCV 2021 Zhang Chen, Yinda Zhang, Kyle Genova, Sean Fanello, Sofien Bouaziz, Christian Haene, Ruofei Du, Cem Keskin, Thomas Funkhouser, Danhang Tang

To the best of our knowledge, MDIF is the first deep implicit function model that can at the same time (1) represent different levels of detail and allow progressive decoding; (2) support both encoder-decoder inference and decoder-only latent optimization, and fulfill multiple applications; (3) perform detailed decoder-only shape completion.

3D Reconstruction 3D Shape Representation

Spatial Intention Maps for Multi-Agent Mobile Manipulation

1 code implementation 23 Mar 2021 Jimmy Wu, Xingyuan Sun, Andy Zeng, Shuran Song, Szymon Rusinkiewicz, Thomas Funkhouser

The ability to communicate intention enables decentralized multi-agent robots to collaborate while performing physical tasks.

IBRNet: Learning Multi-View Image-Based Rendering

1 code implementation CVPR 2021 Qianqian Wang, Zhicheng Wang, Kyle Genova, Pratul Srinivasan, Howard Zhou, Jonathan T. Barron, Ricardo Martin-Brualla, Noah Snavely, Thomas Funkhouser

Unlike neural scene representation work that optimizes per-scene functions for rendering, we learn a generic view interpolation function that generalizes to novel scenes.

Neural Rendering Novel View Synthesis

P4Contrast: Contrastive Learning with Pairs of Point-Pixel Pairs for RGB-D Scene Understanding

no code implementations 24 Dec 2020 Yunze Liu, Li Yi, Shanghang Zhang, Qingnan Fan, Thomas Funkhouser, Hao Dong

Self-supervised representation learning is a critical problem in computer vision, as it provides a way to pretrain feature extractors on large unlabeled datasets that can be used as an initialization for more efficient and effective training on downstream tasks.

Contrastive Learning Representation Learning +1

Object-Centric Neural Scene Rendering

no code implementations 15 Dec 2020 Michelle Guo, Alireza Fathi, Jiajun Wu, Thomas Funkhouser

We present a method for composing photorealistic scenes from captured images of objects.

Object

Forecasting Characteristic 3D Poses of Human Actions

no code implementations CVPR 2022 Christian Diller, Thomas Funkhouser, Angela Dai

To predict characteristic poses, we propose a probabilistic approach that models the possible multi-modality in the distribution of likely characteristic poses.

Human motion prediction motion prediction +1

Learning to Infer Semantic Parameters for 3D Shape Editing

no code implementations 9 Nov 2020 Fangyin Wei, Elena Sizikova, Avneesh Sud, Szymon Rusinkiewicz, Thomas Funkhouser

Many applications in 3D shape design and augmentation require the ability to make specific edits to an object's semantic parameters (e.g., the pose of a person's arm or the length of an airplane's wing) while preserving as much existing detail as possible.

Multi-Frame to Single-Frame: Knowledge Distillation for 3D Object Detection

no code implementations 24 Sep 2020 Yue Wang, Alireza Fathi, Jiajun Wu, Thomas Funkhouser, Justin Solomon

A common dilemma in 3D object detection for autonomous driving is that high-quality, dense point clouds are only available during training, but not testing.

3D Object Detection Autonomous Driving +3

Complete & Label: A Domain Adaptation Approach to Semantic Segmentation of LiDAR Point Clouds

no code implementations CVPR 2021 Li Yi, Boqing Gong, Thomas Funkhouser

We study an unsupervised domain adaptation problem for the semantic labeling of 3D point clouds, with a particular focus on domain discrepancies induced by different LiDAR sensors.

Semantic Segmentation Unsupervised Domain Adaptation

Spatial Action Maps for Mobile Manipulation

1 code implementation 20 Apr 2020 Jimmy Wu, Xingyuan Sun, Andy Zeng, Shuran Song, Johnny Lee, Szymon Rusinkiewicz, Thomas Funkhouser

Typical end-to-end formulations for learning robotic navigation involve predicting a small set of steering command actions (e.g., step forward, turn left, turn right, etc.)

Q-Learning Value prediction

Local Implicit Grid Representations for 3D Scenes

1 code implementation 19 Mar 2020 Chiyu Max Jiang, Avneesh Sud, Ameesh Makadia, Jingwei Huang, Matthias Nießner, Thomas Funkhouser

Then, we use the decoder as a component in a shape optimization that solves for a set of latent codes on a regular grid of overlapping crops such that an interpolation of the decoded local shapes matches a partial or noisy observation.

3D Shape Representation Surface Reconstruction

Adversarial Texture Optimization from RGB-D Scans

1 code implementation CVPR 2020 Jingwei Huang, Justus Thies, Angela Dai, Abhijit Kundu, Chiyu Max Jiang, Leonidas Guibas, Matthias Nießner, Thomas Funkhouser

In this work, we present a novel approach for color texture generation using a conditional adversarial loss obtained from weakly-supervised views.

Surface Reconstruction Texture Synthesis

Local Deep Implicit Functions for 3D Shape

1 code implementation CVPR 2020 Kyle Genova, Forrester Cole, Avneesh Sud, Aaron Sarna, Thomas Funkhouser

The goal of this project is to learn a 3D shape representation that enables accurate surface reconstruction, compact storage, efficient computation, consistency for similar shapes, generalization across diverse shape categories, and inference from depth camera observations.

3D Shape Representation Surface Reconstruction

Grasping in the Wild: Learning 6DoF Closed-Loop Grasping from Low-Cost Demonstrations

no code implementations 9 Dec 2019 Shuran Song, Andy Zeng, Johnny Lee, Thomas Funkhouser

A key aspect of our grasping model is that it uses "action-view" based rendering to simulate future states with respect to different possible actions.

Rescan: Inductive Instance Segmentation for Indoor RGBD Scans

no code implementations ICCV 2019 Maciej Halber, Yifei Shi, Kai Xu, Thomas Funkhouser

In depth-sensing applications ranging from home robotics to AR/VR, it will be common to acquire 3D scans of interior spaces repeatedly at sparse time intervals (e.g., as part of regular daily use).

Instance Segmentation Segmentation +1

Neural Illumination: Lighting Prediction for Indoor Environments

no code implementations CVPR 2019 Shuran Song, Thomas Funkhouser

This paper addresses the task of estimating the light arriving from all directions to a 3D point observed at a selected pixel in an RGB image.

Learning Shape Templates with Structured Implicit Functions

1 code implementation ICCV 2019 Kyle Genova, Forrester Cole, Daniel Vlasic, Aaron Sarna, William T. Freeman, Thomas Funkhouser

To allow for widely varying geometry and topology, we choose an implicit surface representation based on composition of local shape elements.

Semantic Segmentation

TossingBot: Learning to Throw Arbitrary Objects with Residual Physics

no code implementations 27 Mar 2019 Andy Zeng, Shuran Song, Johnny Lee, Alberto Rodriguez, Thomas Funkhouser

In this work, we propose an end-to-end formulation that jointly learns to infer control parameters for grasping and throwing motion primitives from visual observations (images of arbitrary objects in a bin) through trial and error.

Friction

TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes

1 code implementation CVPR 2019 Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Nießner, Leonidas Guibas

We introduce TextureNet, a neural network architecture designed to extract features from high-resolution signals associated with 3D surface meshes (e.g., color texture maps).

3D Semantic Segmentation

Structure-Aware Shape Synthesis

no code implementations 4 Aug 2018 Elena Balashova, Vivek Singh, Jiangping Wang, Brian Teixeira, Terrence Chen, Thomas Funkhouser

We propose a new procedure to guide training of a data-driven shape generative model using a structure-aware loss function.

Im2Pano3D: Extrapolating 360° Structure and Semantics Beyond the Field of View

no code implementations CVPR 2018 Shuran Song, Andy Zeng, Angel X. Chang, Manolis Savva, Silvio Savarese, Thomas Funkhouser

We present Im2Pano3D, a convolutional neural network that generates a dense prediction of 3D structure and a probability distribution of semantic labels for a full 360° panoramic view of an indoor scene when given only a partial observation (<=50%) in the form of an RGB-D image.

Learning Synergies between Pushing and Grasping with Self-supervised Deep Reinforcement Learning

4 code implementations 27 Mar 2018 Andy Zeng, Shuran Song, Stefan Welker, Johnny Lee, Alberto Rodriguez, Thomas Funkhouser

Skilled robotic manipulation benefits from complex synergies between non-prehensile (e.g., pushing) and prehensile (e.g., grasping) actions: pushing can help rearrange cluttered objects to make space for arms and fingers; likewise, grasping can help displace objects to make pushing movements more precise and collision-free.

Q-Learning reinforcement-learning +1

Deep Depth Completion of a Single RGB-D Image

1 code implementation CVPR 2018 Yinda Zhang, Thomas Funkhouser

The goal of our work is to complete the depth channel of an RGB-D image.

Depth Completion Depth Estimation

Im2Pano3D: Extrapolating 360° Structure and Semantics Beyond the Field of View

no code implementations 12 Dec 2017 Shuran Song, Andy Zeng, Angel X. Chang, Manolis Savva, Silvio Savarese, Thomas Funkhouser

We present Im2Pano3D, a convolutional neural network that generates a dense prediction of 3D structure and a probability distribution of semantic labels for a full 360° panoramic view of an indoor scene when given only a partial observation (<=50%) in the form of an RGB-D image.

MINOS: Multimodal Indoor Simulator for Navigation in Complex Environments

2 code implementations 11 Dec 2017 Manolis Savva, Angel X. Chang, Alexey Dosovitskiy, Thomas Funkhouser, Vladlen Koltun

We present MINOS, a simulator designed to support the development of multisensory models for goal-directed navigation in complex indoor environments.

Navigate reinforcement-learning +1

Interactive 3D Modeling with a Generative Adversarial Network

no code implementations 16 Jun 2017 Jerry Liu, Fisher Yu, Thomas Funkhouser

This paper proposes the idea of using a generative adversarial network (GAN) to assist a novice user in designing real-world shapes with a simple interface.

Generative Adversarial Network

Dilated Residual Networks

3 code implementations CVPR 2017 Fisher Yu, Vladlen Koltun, Thomas Funkhouser

Convolutional networks for image classification progressively reduce resolution until the image is represented by tiny feature maps in which the spatial structure of the scene is no longer discernible.

Classification General Classification +4

Learning Where to Look: Data-Driven Viewpoint Set Selection for 3D Scenes

no code implementations 7 Apr 2017 Kyle Genova, Manolis Savva, Angel X. Chang, Thomas Funkhouser

We provide a search algorithm that generates a sampling of likely candidate views according to the example distribution, and a set selection algorithm that chooses a subset of the candidates that jointly cover the example distribution.

Segmentation Semantic Segmentation

Physically-Based Rendering for Indoor Scene Understanding Using Convolutional Neural Networks

no code implementations CVPR 2017 Yinda Zhang, Shuran Song, Ersin Yumer, Manolis Savva, Joon-Young Lee, Hailin Jin, Thomas Funkhouser

One of the bottlenecks in training for better representations is the amount of available per-pixel ground truth data that is required for core scene understanding tasks such as semantic segmentation, normal prediction, and object edge detection.

Boundary Detection Edge Detection +4

Semantic Scene Completion from a Single Depth Image

3 code implementations CVPR 2017 Shuran Song, Fisher Yu, Andy Zeng, Angel X. Chang, Manolis Savva, Thomas Funkhouser

This paper focuses on semantic scene completion, a task for producing a complete 3D voxel representation of volumetric occupancy and semantic labels for a scene from a single-view depth map observation.

3D Semantic Scene Completion

Fine-To-Coarse Global Registration of RGB-D Scans

no code implementations CVPR 2017 Maciej Halber, Thomas Funkhouser

RGB-D scanning of indoor environments is important for many applications, including real estate, interior design, and virtual reality.

3DMatch: Learning Local Geometric Descriptors from RGB-D Reconstructions

2 code implementations CVPR 2017 Andy Zeng, Shuran Song, Matthias Nießner, Matthew Fisher, Jianxiong Xiao, Thomas Funkhouser

To amass training data for our model, we propose a self-supervised feature learning method that leverages the millions of correspondence labels found in existing RGB-D reconstructions.

3D Reconstruction Point Cloud Registration

LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop

4 code implementations 10 Jun 2015 Fisher Yu, Ari Seff, Yinda Zhang, Shuran Song, Thomas Funkhouser, Jianxiong Xiao

While there has been remarkable progress in the performance of visual recognition algorithms, the state-of-the-art models tend to be exceptionally data-hungry.

Semantic Alignment of LiDAR Data at City Scale

no code implementations CVPR 2015 Fisher Yu, Jianxiong Xiao, Thomas Funkhouser

This paper describes an automatic algorithm for global alignment of LiDAR data collected with Google Street View cars in urban environments.

Pose Estimation
