Search Results for author: Fisher Yu

Found 96 papers, 66 papers with code

UniDepth: Universal Monocular Metric Depth Estimation

1 code implementation27 Mar 2024 Luigi Piccinelli, Yung-Hsu Yang, Christos Sakaridis, Mattia Segu, Siyuan Li, Luc van Gool, Fisher Yu

However, the remarkable accuracy of recent MMDE methods is confined to their training domains.

 Ranked #1 on Monocular Depth Estimation on NYU-Depth V2 (using extra training data)

Monocular Depth Estimation

DexDribbler: Learning Dexterous Soccer Manipulation via Dynamic Supervision

1 code implementation21 Mar 2024 Yutong Hu, Kehan Wen, Fisher Yu

Learning dexterous locomotion policy for legged robots is becoming increasingly popular due to its ability to handle diverse terrains and resemble intelligent behaviors.

S$^3$M-Net: Joint Learning of Semantic Segmentation and Stereo Matching for Autonomous Driving

no code implementations21 Jan 2024 Zhiyuan Wu, Yi Feng, Chuang-Wei Liu, Fisher Yu, Qijun Chen, Rui Fan

Hence, in this article, we introduce S$^3$M-Net, a novel joint learning framework developed to perform semantic segmentation and stereo matching simultaneously.

Autonomous Driving Scene Understanding +2

ICGNet: A Unified Approach for Instance-Centric Grasping

no code implementations18 Jan 2024 René Zurbrügg, Yifan Liu, Francis Engelmann, Suryansh Kumar, Marco Hutter, Vaishakh Patil, Fisher Yu

Executing a successful grasp in a cluttered environment requires multiple levels of scene understanding: First, the robot needs to analyze the geometric properties of individual objects to find feasible grasps.

Object Object Reconstruction +1

MuRF: Multi-Baseline Radiance Fields

1 code implementation7 Dec 2023 Haofei Xu, Anpei Chen, Yuedong Chen, Christos Sakaridis, Yulun Zhang, Marc Pollefeys, Andreas Geiger, Fisher Yu

We present Multi-Baseline Radiance Fields (MuRF), a general feed-forward approach to solving sparse view synthesis under multiple different baseline settings (small and large baselines, and different number of input views).

Zero-shot Generalization

Gaussian Grouping: Segment and Edit Anything in 3D Scenes

1 code implementation1 Dec 2023 Mingqiao Ye, Martin Danelljan, Fisher Yu, Lei Ke

To address this issue, we propose Gaussian Grouping, which extends Gaussian Splatting to jointly reconstruct and segment anything in open-world 3D scenes.

Colorization Novel View Synthesis +2

Real-Time Motion Prediction via Heterogeneous Polyline Transformer with Relative Pose Encoding

1 code implementation NeurIPS 2023 Zhejun Zhang, Alexander Liniger, Christos Sakaridis, Fisher Yu, Luc van Gool

The real-world deployment of an autonomous driving system requires its components to run on-board and in real-time, including the motion prediction module that predicts the future trajectories of surrounding traffic participants.

Autonomous Driving motion prediction

COOLer: Class-Incremental Learning for Appearance-Based Multiple Object Tracking

1 code implementation4 Oct 2023 Zhizheng Liu, Mattia Segu, Fisher Yu

Continual learning allows a model to learn multiple tasks sequentially while retaining the old knowledge without the training data of the preceding tasks.

Class Incremental Learning Disentanglement +2

DARTH: Holistic Test-time Adaptation for Multiple Object Tracking

1 code implementation ICCV 2023 Mattia Segu, Bernt Schiele, Fisher Yu

However, the nature of a MOT system is manifold - requiring object detection and instance association - and adapting all its components is non-trivial.

Autonomous Driving Multiple Object Tracking +4

Distilling ODE Solvers of Diffusion Models into Smaller Steps

no code implementations28 Sep 2023 Sanghwan Kim, Hao Tang, Fisher Yu

Notably, our method incurs negligible computational overhead compared to previous distillation techniques, facilitating straightforward and rapid integration with existing samplers.

Denoising Knowledge Distillation

Three Ways to Improve Verbo-visual Fusion for Dense 3D Visual Grounding

no code implementations8 Sep 2023 Ozan Unal, Christos Sakaridis, Suman Saha, Fisher Yu, Luc van Gool

A common formulation to tackle 3D visual grounding is grounding-by-detection, where localization is done via bounding boxes.

3D Instance Segmentation Object +3

MolGrapher: Graph-based Visual Recognition of Chemical Structures

1 code implementation ICCV 2023 Lucas Morin, Martin Danelljan, Maria Isabel Agea, Ahmed Nassar, Valery Weber, Ingmar Meijer, Peter Staar, Fisher Yu

In addition, we introduce a large-scale benchmark of annotated real molecule images, USPTO-30K, to spur research on this critical topic.

Video OWL-ViT: Temporally-consistent open-world localization in video

no code implementations ICCV 2023 Georg Heigold, Matthias Minderer, Alexey Gritsenko, Alex Bewley, Daniel Keysers, Mario Lučić, Fisher Yu, Thomas Kipf

Our model is end-to-end trainable on video data and enjoys improved temporal consistency compared to tracking-by-detection baselines, while retaining the open-world capabilities of the backbone detector.

Object Object Localization

Dual Aggregation Transformer for Image Super-Resolution

1 code implementation ICCV 2023 Zheng Chen, Yulun Zhang, Jinjin Gu, Linghe Kong, Xiaokang Yang, Fisher Yu

Based on the above idea, we propose a novel Transformer model, Dual Aggregation Transformer (DAT), for image SR. Our DAT aggregates features across spatial and channel dimensions, in the inter-block and intra-block dual manner.

Image Super-Resolution

Strategic Preys Make Acute Predators: Enhancing Camouflaged Object Detectors by Generating Camouflaged Objects

1 code implementation6 Aug 2023 Chunming He, Kai Li, Yachao Zhang, Yulun Zhang, Zhenhua Guo, Xiu Li, Martin Danelljan, Fisher Yu

On the prey side, we propose an adversarial training framework, Camouflageator, which introduces an auxiliary generator to generate more camouflaged objects that are harder for a COD method to detect.

object-detection Object Detection

Segment Anything Meets Point Tracking

1 code implementation3 Jul 2023 Frano Rajič, Lei Ke, Yu-Wing Tai, Chi-Keung Tang, Martin Danelljan, Fisher Yu

The Segment Anything Model (SAM) has established itself as a powerful zero-shot image segmentation model, enabled by efficient point-centric annotation and prompt-based models.

Interactive Video Object Segmentation Object +5

Condition-Invariant Semantic Segmentation

1 code implementation27 May 2023 Christos Sakaridis, David Bruggemann, Fisher Yu, Luc van Gool

Motivated by these findings, we propose to leverage stylization in performing feature-level adaptation by aligning the internal network features extracted by the encoder of the network from the original and the stylized view of each input image with a novel feature invariance loss.

Segmentation Semantic Segmentation +1

Maskomaly:Zero-Shot Mask Anomaly Segmentation

no code implementations26 May 2023 Jan Ackermann, Christos Sakaridis, Fisher Yu

We present a simple and practical framework for anomaly segmentation called Maskomaly.

Informativeness Segmentation +1

How To Not Train Your Dragon: Training-free Embodied Object Goal Navigation with Semantic Frontiers

no code implementations26 May 2023 Junting Chen, Guohao Li, Suryansh Kumar, Bernard Ghanem, Fisher Yu

Our method propagates semantics on the scene graphs based on language priors and scene statistics to introduce semantic knowledge to the geometric frontiers.

Imitation Learning Navigate +2

iDisc: Internal Discretization for Monocular Depth Estimation

2 code implementations CVPR 2023 Luigi Piccinelli, Christos Sakaridis, Fisher Yu

Our method sets the new state of the art with significant improvements on NYU-Depth v2 and KITTI, outperforming all published methods on the official KITTI benchmark.

Autonomous Driving Monocular Depth Estimation +3

TrafficBots: Towards World Models for Autonomous Driving Simulation and Motion Prediction

2 code implementations7 Mar 2023 Zhejun Zhang, Alexander Liniger, Dengxin Dai, Fisher Yu, Luc van Gool

We present TrafficBots, a multi-agent policy built upon motion prediction and end-to-end driving, and based on TrafficBots we obtain a world model tailored for the planning module of autonomous vehicles.

Autonomous Driving Model-based Reinforcement Learning +1

A Multiplicative Value Function for Safe and Efficient Reinforcement Learning

1 code implementation7 Mar 2023 Nick Bührer, Zhejun Zhang, Alexander Liniger, Fisher Yu, Luc van Gool

To this end, we propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic.

Navigate reinforcement-learning +3

Uncertainty-Driven Dense Two-View Structure from Motion

no code implementations1 Feb 2023 Weirong Chen, Suryansh Kumar, Fisher Yu

This work introduces an effective and practical solution to the dense two-view structure from motion (SfM) problem.

Depth Estimation Optical Flow Estimation +2

BiBench: Benchmarking and Analyzing Network Binarization

1 code implementation26 Jan 2023 Haotong Qin, Mingyuan Zhang, Yifu Ding, Aoyu Li, Zhongang Cai, Ziwei Liu, Fisher Yu, Xianglong Liu

Network binarization emerges as one of the most promising compression approaches offering extraordinary computation and memory savings by minimizing the bit-width.

Benchmarking Binarization

3DPPE: 3D Point Positional Encoding for Transformer-based Multi-Camera 3D Object Detection

1 code implementation ICCV 2023 Changyong Shu, Jiajun Deng, Fisher Yu, Yifan Liu

Although 3D measurements are not available at the inference time of monocular 3D object detection, 3DPPE uses predicted depth to approximate the real point positions.

Monocular 3D Object Detection object-detection

CC-3DT: Panoramic 3D Object Tracking via Cross-Camera Fusion

no code implementations2 Dec 2022 Tobias Fischer, Yung-Hsu Yang, Suryansh Kumar, Min Sun, Fisher Yu

To track the 3D locations and trajectories of the other traffic participants at any given time, modern autonomous vehicles are equipped with multiple cameras that cover the vehicle's full surroundings.

3D Object Tracking Autonomous Vehicles +2

3DPPE: 3D Point Positional Encoding for Multi-Camera 3D Object Detection Transformers

1 code implementation27 Nov 2022 Changyong Shu, Jiajun Deng, Fisher Yu, Yifan Liu

Although 3D measurements are not available at the inference time of monocular 3D object detection, 3DPPE uses predicted depth to approximate the real point positions.

Monocular 3D Object Detection Monocular Depth Estimation +1

Unifying Flow, Stereo and Depth Estimation

1 code implementation10 Nov 2022 Haofei Xu, Jing Zhang, Jianfei Cai, Hamid Rezatofighi, Fisher Yu, DaCheng Tao, Andreas Geiger

We present a unified formulation and model for three motion and 3D perception tasks: optical flow, rectified stereo matching and unrectified stereo depth estimation from posed images.

Optical Flow Estimation Stereo Depth Estimation +1

Normalization Perturbation: A Simple Domain Generalization Method for Real-World Domain Shifts

no code implementations8 Nov 2022 Qi Fan, Mattia Segu, Yu-Wing Tai, Fisher Yu, Chi-Keung Tang, Bernt Schiele, Dengxin Dai

Thus, we propose to perturb the channel statistics of source domain features to synthesize various latent styles, so that the trained deep model can perceive diverse potential domains and generalizes well even without observations of target domain data in training.

Autonomous Driving Domain Generalization

Learning Deep Sensorimotor Policies for Vision-based Autonomous Drone Racing

no code implementations26 Oct 2022 Jiawei Fu, Yunlong Song, Yan Wu, Fisher Yu, Davide Scaramuzza

The resulting policy directly infers control commands with feature representations learned from raw images, forgoing the need for globally-consistent state estimation, trajectory planning, and handcrafted control design.

Contrastive Learning Trajectory Planning

Composite Learning for Robust and Effective Dense Predictions

no code implementations13 Oct 2022 Menelaos Kanakis, Thomas E. Huang, David Bruggemann, Fisher Yu, Luc van Gool

In this paper, we find that jointly training a dense prediction (target) task with a self-supervised (auxiliary) task can consistently improve the performance of the target task, while eliminating the need for labeling auxiliary tasks.

Boundary Detection Monocular Depth Estimation +3

Fast Hierarchical Learning for Few-Shot Object Detection

no code implementations10 Oct 2022 Yihang She, Goutam Bhat, Martin Danelljan, Fisher Yu

These approaches however suffer from ``catastrophic forgetting'' issue due to finetuning of base detector, leading to sub-optimal performance on the base classes.

Few-Shot Object Detection Object +2

Spatio-Temporal Action Detection Under Large Motion

no code implementations6 Sep 2022 Gurkirt Singh, Vasileios Choutas, Suman Saha, Fisher Yu, Luc van Gool

Current methods for spatiotemporal action tube detection often extend a bounding box proposal at a given keyframe into a 3D temporal cuboid and pool features from nearby frames.

Action Detection

Video Mask Transfiner for High-Quality Video Instance Segmentation

1 code implementation28 Jul 2022 Lei Ke, Henghui Ding, Martin Danelljan, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu

While Video Instance Segmentation (VIS) has seen rapid progress, current approaches struggle to predict high-quality masks with accurate boundary details.

Instance Segmentation Semantic Segmentation +2

Tracking Every Thing in the Wild

1 code implementation26 Jul 2022 Siyuan Li, Martin Danelljan, Henghui Ding, Thomas E. Huang, Fisher Yu

Our experiments show that TETA evaluates trackers more comprehensively, and TETer achieves significant improvements on the challenging large-scale datasets BDD100K and TAO compared to the state-of-the-art.

Benchmarking Classification +2

Learning Online Multi-Sensor Depth Fusion

1 code implementation7 Apr 2022 Erik Sandström, Martin R. Oswald, Suryansh Kumar, Silvan Weder, Fisher Yu, Cristian Sminchisescu, Luc van Gool

Multi-sensor depth fusion is able to substantially improve the robustness and accuracy of 3D reconstruction methods, but existing techniques are not robust enough to handle sensors which operate with diverse value ranges as well as noise and outlier statistics.

3D Reconstruction Mixed Reality +1

LiDAR Snowfall Simulation for Robust 3D Object Detection

1 code implementation CVPR 2022 Martin Hahner, Christos Sakaridis, Mario Bijelic, Felix Heide, Fisher Yu, Dengxin Dai, Luc van Gool

Due to the difficulty of collecting and annotating training data in this setting, we propose a physically based method to simulate the effect of snowfall on real clear-weather LiDAR point clouds.

Autonomous Driving Object +3

Transforming Model Prediction for Tracking

1 code implementation CVPR 2022 Christoph Mayer, Martin Danelljan, Goutam Bhat, Matthieu Paul, Danda Pani Paudel, Fisher Yu, Luc van Gool

Optimization based tracking methods have been widely successful by integrating a target model prediction module, providing effective global reasoning by minimizing an objective function.

Ranked #19 on Visual Object Tracking on LaSOT (Precision metric)

Inductive Bias Visual Object Tracking

RePaint: Inpainting using Denoising Diffusion Probabilistic Models

3 code implementations CVPR 2022 Andreas Lugmayr, Martin Danelljan, Andres Romero, Fisher Yu, Radu Timofte, Luc van Gool

In this work, we propose RePaint: A Denoising Diffusion Probabilistic Model (DDPM) based inpainting approach that is applicable to even extreme masks.

Denoising Image Inpainting

SAGA: Stochastic Whole-Body Grasping with Contact

1 code implementation19 Dec 2021 Yan Wu, Jiahao Wang, Yan Zhang, Siwei Zhang, Otmar Hilliges, Fisher Yu, Siyu Tang

Given an initial pose and the generated whole-body grasping pose as the start and end of the motion respectively, we design a novel contact-aware generative motion infilling module to generate a diverse set of grasp-oriented motions.

Object

Normalizing Flow as a Flexible Fidelity Objective for Photo-Realistic Super-resolution

no code implementations5 Nov 2021 Andreas Lugmayr, Martin Danelljan, Fisher Yu, Luc van Gool, Radu Timofte

Super-resolution is an ill-posed problem, where a ground-truth high-resolution image represents only one possibility in the space of plausible solutions.

Super-Resolution

Dense Prediction with Attentive Feature Aggregation

no code implementations1 Nov 2021 Yung-Hsu Yang, Thomas E. Huang, Min Sun, Samuel Rota Bulò, Peter Kontschieder, Fisher Yu

Our experiments show consistent and significant improvements on challenging semantic segmentation benchmarks, including Cityscapes, BDD100K, and Mapillary Vistas, at negligible computational and parameter overhead.

Boundary Detection Semantic Segmentation

TACS: Taxonomy Adaptive Cross-Domain Semantic Segmentation

1 code implementation10 Sep 2021 Rui Gong, Martin Danelljan, Dengxin Dai, Danda Pani Paudel, Ajad Chhatkuli, Fisher Yu, Luc van Gool

In many real-world settings, the target domain task requires a different taxonomy than the one imposed by the source domain.

Contrastive Learning Domain Adaptation +1

End-to-End Urban Driving by Imitating a Reinforcement Learning Coach

2 code implementations ICCV 2021 Zhejun Zhang, Alexander Liniger, Dengxin Dai, Fisher Yu, Luc van Gool

Our end-to-end agent achieves a 78% success rate while generalizing to a new town and new weather on the NoCrash-dense benchmark and state-of-the-art performance on the challenging public routes of the CARLA LeaderBoard.

Autonomous Driving Imitation Learning +2

Deep Reparametrization of Multi-Frame Super-Resolution and Denoising

2 code implementations ICCV 2021 Goutam Bhat, Martin Danelljan, Fisher Yu, Luc van Gool, Radu Timofte

The deep reparametrization allows us to directly model the image formation process in the latent space, and to integrate learned image priors into the prediction.

Burst Image Super-Resolution Denoising +2

On the Practicality of Deterministic Epistemic Uncertainty

2 code implementations1 Jul 2021 Janis Postels, Mattia Segu, Tao Sun, Luca Sieber, Luc van Gool, Fisher Yu, Federico Tombari

We find that, while DUMs scale to realistic vision tasks and perform well on OOD detection, the practicality of current methods is undermined by poor calibration under distributional shifts.

Out of Distribution (OOD) Detection Semantic Segmentation +1

Robust Object Detection via Instance-Level Temporal Cycle Confusion

1 code implementation ICCV 2021 Xin Wang, Thomas E. Huang, Benlin Liu, Fisher Yu, Xiaolong Wang, Joseph E. Gonzalez, Trevor Darrell

Building reliable object detectors that are robust to domain shifts, such as various changes in context, viewpoint, and object appearances, is critical for real-world applications.

Object object-detection +2

Warp Consistency for Unsupervised Learning of Dense Correspondences

1 code implementation ICCV 2021 Prune Truong, Martin Danelljan, Fisher Yu, Luc van Gool

From our observations and empirical results, we design a general unsupervised objective employing two of the derived constraints.

Dense Pixel Correspondence Estimation

Monocular Quasi-Dense 3D Object Tracking

1 code implementation12 Mar 2021 Hou-Ning Hu, Yung-Hsu Yang, Tobias Fischer, Trevor Darrell, Fisher Yu, Min Sun

Experiments on our proposed simulation data and real-world benchmarks, including KITTI, nuScenes, and Waymo datasets, show that our tracking framework offers robust object association and tracking on urban-driving scenarios.

3D Object Tracking Autonomous Driving +3

Exploring Cross-Image Pixel Contrast for Semantic Segmentation

5 code implementations ICCV 2021 Wenguan Wang, Tianfei Zhou, Fisher Yu, Jifeng Dai, Ender Konukoglu, Luc van Gool

Inspired by the recent advance in unsupervised contrastive representation learning, we propose a pixel-wise contrastive framework for semantic segmentation in the fully supervised setting.

Metric Learning Optical Character Recognition (OCR) +3

Instance-Aware Predictive Navigation in Multi-Agent Environments

1 code implementation14 Jan 2021 Jinkun Cao, Xin Wang, Trevor Darrell, Fisher Yu

To decide the action at each step, we seek the action sequence that can lead to safe future states based on the prediction module outputs by repeatedly sampling likely action sequences.

Quasi-Dense Similarity Learning for Multiple Object Tracking

3 code implementations CVPR 2021 Jiangmiao Pang, Linlu Qiu, Xia Li, Haofeng Chen, Qi Li, Trevor Darrell, Fisher Yu

Compared to methods with similar detectors, it boosts almost 10 points of MOTA and significantly decreases the number of ID switches on BDD100K and Waymo datasets.

Contrastive Learning Metric Learning +4

Frustratingly Simple Few-Shot Object Detection

5 code implementations ICML 2020 Xin Wang, Thomas E. Huang, Trevor Darrell, Joseph E. Gonzalez, Fisher Yu

Such a simple approach outperforms the meta-learning methods by roughly 2~20 points on current benchmarks and sometimes even doubles the accuracy of the prior methods.

Few-Shot Object Detection Meta-Learning +2

Task-Aware Feature Generation for Zero-Shot Compositional Learning

1 code implementation11 Jun 2019 Xin Wang, Fisher Yu, Trevor Darrell, Joseph E. Gonzalez

In this work, we propose a task-aware feature generation (TFG) framework for compositional learning, which generates features of novel visual concepts by transferring knowledge from previously seen concepts.

Novel Concepts Zero-Shot Learning

TAFE-Net: Task-Aware Feature Embeddings for Low Shot Learning

1 code implementation CVPR 2019 Xin Wang, Fisher Yu, Ruth Wang, Trevor Darrell, Joseph E. Gonzalez

We show that TAFE-Net is highly effective in generalizing to new tasks or concepts and evaluate the TAFE-Net on a range of benchmarks in zero-shot and few-shot learning.

Attribute Few-Shot Learning +1

Hierarchical Discrete Distribution Decomposition for Match Density Estimation

2 code implementations CVPR 2019 Zhichao Yin, Trevor Darrell, Fisher Yu

Explicit representations of the global match distributions of pixel-wise correspondences between pairs of images are desirable for uncertainty estimation and downstream applications.

Density Estimation Optical Flow Estimation +2

Few-shot Object Detection via Feature Reweighting

4 code implementations ICCV 2019 Bingyi Kang, Zhuang Liu, Xin Wang, Fisher Yu, Jiashi Feng, Trevor Darrell

The feature learner extracts meta features that are generalizable to detect novel object classes, using training data from base classes with sufficient samples.

Few-Shot Learning Few-Shot Object Detection +3

Disentangling Propagation and Generation for Video Prediction

1 code implementation ICCV 2019 Hang Gao, Huazhe Xu, Qi-Zhi Cai, Ruth Wang, Fisher Yu, Trevor Darrell

A dynamic scene has two types of elements: those that move fluidly and can be predicted from previous frames, and those which are disoccluded (exposed) and cannot be extrapolated.

Predict Future Video Frames

Joint Monocular 3D Vehicle Detection and Tracking

1 code implementation ICCV 2019 Hou-Ning Hu, Qi-Zhi Cai, Dequan Wang, Ji Lin, Min Sun, Philipp Krähenbühl, Trevor Darrell, Fisher Yu

The framework can not only associate detections of vehicles in motion over time, but also estimate their complete 3D bounding box information from a sequence of 2D images captured on a moving platform.

3D Object Detection 3D Pose Estimation +4

Deep Object-Centric Policies for Autonomous Driving

no code implementations13 Nov 2018 Dequan Wang, Coline Devin, Qi-Zhi Cai, Fisher Yu, Trevor Darrell

While learning visuomotor skills in an end-to-end manner is appealing, deep neural networks are often uninterpretable and fail in surprising ways.

Autonomous Driving Object

Deep Mixture of Experts via Shallow Embedding

no code implementations5 Jun 2018 Xin Wang, Fisher Yu, Lisa Dunlap, Yi-An Ma, Ruth Wang, Azalia Mirhoseini, Trevor Darrell, Joseph E. Gonzalez

Larger networks generally have greater representational power at the cost of increased computational complexity.

Few-Shot Learning Zero-Shot Learning

PairedCycleGAN: Asymmetric Style Transfer for Applying and Removing Makeup

no code implementations CVPR 2018 Huiwen Chang, Jingwan Lu, Fisher Yu, Adam Finkelstein

This paper introduces an automatic method for editing a portrait photo so that the subject appears to be wearing makeup in the style of another person in a reference photo.

Style Transfer

BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning

4 code implementations CVPR 2020 Fisher Yu, Haofeng Chen, Xin Wang, Wenqi Xian, Yingying Chen, Fangchen Liu, Vashisht Madhavan, Trevor Darrell

Datasets drive vision progress, yet existing driving datasets are impoverished in terms of visual content and supported tasks to study multitask learning for autonomous driving.

Autonomous Driving Domain Adaptation +8

Reinforcement Learning from Imperfect Demonstrations

no code implementations ICLR 2018 Yang Gao, Huazhe Xu, Ji Lin, Fisher Yu, Sergey Levine, Trevor Darrell

We propose a unified reinforcement learning algorithm, Normalized Actor-Critic (NAC), that effectively normalizes the Q-function, reducing the Q-values of actions unseen in the demonstration data.

reinforcement-learning Reinforcement Learning (RL)

SkipNet: Learning Dynamic Routing in Convolutional Networks

2 code implementations ECCV 2018 Xin Wang, Fisher Yu, Zi-Yi Dou, Trevor Darrell, Joseph E. Gonzalez

While deeper convolutional networks are needed to achieve maximum accuracy in visual perception tasks, for many inputs shallower networks are sufficient.

Decision Making

Deep Layer Aggregation

7 code implementations CVPR 2018 Fisher Yu, Dequan Wang, Evan Shelhamer, Trevor Darrell

We augment standard architectures with deeper aggregation to better fuse information across layers.

Image Classification

Interactive 3D Modeling with a Generative Adversarial Network

no code implementations16 Jun 2017 Jerry Liu, Fisher Yu, Thomas Funkhouser

This paper proposes the idea of using a generative adversarial network (GAN) to assist a novice user in designing real-world shapes with a simple interface.

Generative Adversarial Network

IDK Cascades: Fast Deep Learning by Learning not to Overthink

no code implementations3 Jun 2017 Xin Wang, Yujia Luo, Daniel Crankshaw, Alexey Tumanov, Fisher Yu, Joseph E. Gonzalez

Advances in deep learning have led to substantial increases in prediction accuracy but have been accompanied by increases in the cost of rendering predictions.

Dialogue Generation

Dilated Residual Networks

3 code implementations CVPR 2017 Fisher Yu, Vladlen Koltun, Thomas Funkhouser

Convolutional networks for image classification progressively reduce resolution until the image is represented by tiny feature maps in which the spatial structure of the scene is no longer discernible.

Classification General Classification +4

FCNs in the Wild: Pixel-level Adversarial and Constraint-based Adaptation

3 code implementations8 Dec 2016 Judy Hoffman, Dequan Wang, Fisher Yu, Trevor Darrell

In this paper, we introduce the first domain adaptive semantic segmentation method, proposing an unsupervised adversarial approach to pixel prediction problems.

Semantic Segmentation Synthetic-to-Real Translation

End-to-end Learning of Driving Models from Large-scale Video Datasets

2 code implementations CVPR 2017 Huazhe Xu, Yang Gao, Fisher Yu, Trevor Darrell

Robust perception-action models should be learned from training data with diverse visual appearances and realistic behaviors, yet current approaches to deep visuomotor policy learning have been generally limited to in-situ models learned from a single vehicle or a simulation environment.

Scene Segmentation

Scribbler: Controlling Deep Image Synthesis with Sketch and Color

1 code implementation CVPR 2017 Patsorn Sangkloy, Jingwan Lu, Chen Fang, Fisher Yu, James Hays

In this paper, we propose a deep adversarial image synthesis architecture that is conditioned on sketched boundaries and sparse color strokes to generate realistic cars, bedrooms, or faces.

Colorization Image Generation

Semantic Scene Completion from a Single Depth Image

3 code implementations CVPR 2017 Shuran Song, Fisher Yu, Andy Zeng, Angel X. Chang, Manolis Savva, Thomas Funkhouser

This paper focuses on semantic scene completion, a task for producing a complete 3D voxel representation of volumetric occupancy and semantic labels for a scene from a single-view depth map observation.

3D Semantic Scene Completion

Multi-Scale Context Aggregation by Dilated Convolutions

8 code implementations23 Nov 2015 Fisher Yu, Vladlen Koltun

State-of-the-art models for semantic segmentation are based on adaptations of convolutional networks that had originally been designed for image classification.

General Classification Real-Time Semantic Segmentation +1

LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop

4 code implementations10 Jun 2015 Fisher Yu, Ari Seff, yinda zhang, Shuran Song, Thomas Funkhouser, Jianxiong Xiao

While there has been remarkable progress in the performance of visual recognition algorithms, the state-of-the-art models tend to be exceptionally data-hungry.

Semantic Alignment of LiDAR Data at City Scale

no code implementations CVPR 2015 Fisher Yu, Jianxiong Xiao, Thomas Funkhouser

This paper describes an automatic algorithm for global alignment of LiDAR data collected with Google Street View cars in urban environments.

Pose Estimation

3D ShapeNets: A Deep Representation for Volumetric Shapes

no code implementations CVPR 2015 Zhirong Wu, Shuran Song, Aditya Khosla, Fisher Yu, Linguang Zhang, Xiaoou Tang, Jianxiong Xiao

Our model, 3D ShapeNets, learns the distribution of complex 3D shapes across different object categories and arbitrary poses from raw CAD data, and discovers hierarchical compositional part representations automatically.

Ranked #35 on 3D Point Cloud Classification on ModelNet40 (Mean Accuracy metric)

3D Point Cloud Classification 3D Shape Representation +2

3D Reconstruction from Accidental Motion

no code implementations CVPR 2014 Fisher Yu, David Gallup

We have discovered that 3D reconstruction can be achieved from asingle still photographic capture due to accidental motions of thephotographer, even while attempting to hold the camera still.

3D Reconstruction Semantic Segmentation

Cannot find the paper you are looking for? You can Submit a new open access paper.