Search Results for author: Yu-Wing Tai

Found 115 papers, 59 papers with code

FED-NeRF: Achieve High 3D Consistency and Temporal Coherence for Face Video Editing on Dynamic NeRF

1 code implementation • 5 Jan 2024 • Hao Zhang, Yu-Wing Tai, Chi-Keung Tang

However, achieving simultaneously multi-view consistency and temporal coherence while editing video sequences remains a formidable challenge.

Video Editing

Paper
Code

Inpaint4DNeRF: Promptable Spatio-Temporal NeRF Inpainting with Generative Diffusion Models

no code implementations • 30 Dec 2023 • Han Jiang, Haosen Sun, Ruoxuan Li, Chi-Keung Tang, Yu-Wing Tai

Second and the remaining problem is thus 3D multiview consistency among all completed images, now guided by the seed images and their 3D proxies.

Paper
Add Code

Prompt2NeRF-PIL: Fast NeRF Generation via Pretrained Implicit Latent

no code implementations • 5 Dec 2023 • Jianmeng Liu, Yuyao Zhang, Zeyuan Meng, Yu-Wing Tai, Chi-Keung Tang

This paper explores promptable NeRF generation (e.g., text prompt or single image prompt) for direct conditioning and fast generation of NeRF parameters for the underlying 3D scenes, thus undoing complex intermediate steps while providing full 3D generation with conditional control.

3D Generation 3D Reconstruction

Paper
Add Code

DragVideo: Interactive Drag-style Video Editing

1 code implementation • 3 Dec 2023 • Yufan Deng, Ruida Wang, Yuhao Zhang, Yu-Wing Tai, Chi-Keung Tang

The main issues are: 1) how to perform direct and accurate user control in editing; 2) how to execute editings like changing shape, expression, and layout without unsightly distortion and artifacts to the edited content; and 3) how to maintain spatio-temporal consistency of video after editing.

Video Editing Video Generation

Paper
Code

SANeRF-HQ: Segment Anything for NeRF in High Quality

no code implementations • 3 Dec 2023 • Yichen Liu, Benran Hu, Chi-Keung Tang, Yu-Wing Tai

Recently, the Segment Anything Model (SAM) has showcased remarkable capabilities of zero-shot segmentation, while NeRF (Neural Radiance Fields) has gained popularity as a method for various 3D problems beyond novel view synthesis.

Novel View Synthesis Object +4

Paper
Add Code

C3Net: Compound Conditioned ControlNet for Multimodal Content Generation

no code implementations • 29 Nov 2023 • Juntao Zhang, Yuehuai Liu, Yu-Wing Tai, Chi-Keung Tang

Specifically, C3Net first aligns the conditions from multi-modalities to the same semantic latent space using modality-specific encoders based on contrastive training.

multimodal generation

Paper
Add Code

Stable Segment Anything Model

1 code implementation • 27 Nov 2023 • Qi Fan, Xin Tao, Lei Ke, Mingqiao Ye, Yuan Zhang, Pengfei Wan, Zhongyuan Wang, Yu-Wing Tai, Chi-Keung Tang

Thus, our solution, termed Stable-SAM, offers several advantages: 1) improved SAM's segmentation stability across a wide range of prompt qualities, while 2) retaining SAM's powerful promptable segmentation efficiency and generality, with 3) minimal learnable parameters (0.08 M) and fast adaptation (by 1 training epoch).

Segmentation

Paper
Code

Deceptive-Human: Prompt-to-NeRF 3D Human Generation with 3D-Consistent Synthetic Images

1 code implementation • 27 Nov 2023 • Shiu-hong Kao, Xinhang Liu, Yu-Wing Tai, Chi-Keung Tang

This paper presents Deceptive-Human, a novel Prompt-to-NeRF framework capitalizing on state-of-the-art control diffusion models (e.g., ControlNet) to generate a high-quality controllable 3D human NeRF.

Density Estimation

Paper
Code

EgoPCA: A New Framework for Egocentric Hand-Object Interaction Understanding

no code implementations • ICCV 2023 • Yue Xu, Yong-Lu Li, Zhemin Huang, Michael Xu Liu, Cewu Lu, Yu-Wing Tai, Chi-Keung Tang

With the surge in attention to Egocentric Hand-Object Interaction (Ego-HOI), large-scale datasets such as Ego4D and EPIC-KITCHENS have been proposed.

Action Recognition Temporal Action Localization

Paper
Add Code

Scene-Generalizable Interactive Segmentation of Radiance Fields

no code implementations • 9 Aug 2023 • Songlin Tang, Wenjie Pei, Xin Tao, Tanghui Jia, Guangming Lu, Yu-Wing Tai

Existing methods for interactive segmentation in radiance fields entail scene-specific optimization and thus cannot generalize across different scenes, which greatly limits their applicability.

Interactive Segmentation Segmentation +1

Paper
Add Code

Feature Decoupling-Recycling Network for Fast Interactive Segmentation

no code implementations • 7 Aug 2023 • Huimin Zeng, Weinong Wang, Xin Tao, Zhiwei Xiong, Yu-Wing Tai, Wenjie Pei

First, our model decouples the learning of source image semantics from the encoding of user guidance to process two types of input domains separately.

Image Segmentation Interactive Segmentation +3

Paper
Add Code

Cascade-DETR: Delving into High-Quality Universal Object Detection

1 code implementation • ICCV 2023 • Mingqiao Ye, Lei Ke, Siyuan Li, Yu-Wing Tai, Chi-Keung Tang, Martin Danelljan, Fisher Yu

While dominating on the COCO benchmark, recent Transformer-based detection methods are not competitive in diverse domains.

Object object-detection +2

Paper
Code

Segment Anything Meets Point Tracking

1 code implementation • 3 Jul 2023 • Frano Rajič, Lei Ke, Yu-Wing Tai, Chi-Keung Tang, Martin Danelljan, Fisher Yu

The Segment Anything Model (SAM) has established itself as a powerful zero-shot image segmentation model, enabled by efficient point-centric annotation and prompt-based models.

Interactive Video Object Segmentation Object +5

901

Paper
Code

UniBoost: Unsupervised Unimodal Pre-training for Boosting Zero-shot Vision-Language Tasks

no code implementations • 7 Jun 2023 • Yanan Sun, Zihan Zhong, Qi Fan, Chi-Keung Tang, Yu-Wing Tai

Our thorough studies validate that models pre-trained as such can learn rich representations of both modalities, improving their ability to understand how images and text relate to each other.

Semantic Segmentation

Paper
Add Code

Segment Anything in High Quality

2 code implementations • NeurIPS 2023 • Lei Ke, Mingqiao Ye, Martin Danelljan, Yifan Liu, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu

HQ-SAM is only trained on the introduced dataset of 44k masks, which takes only 4 hours on 8 GPUs.

Ranked #1 on Zero-Shot Instance Segmentation on LVIS v1.0 val

Zero-Shot Instance Segmentation Zero Shot Segmentation

13,419

Paper
Code

FaceDNeRF: Semantics-Driven Face Reconstruction, Prompt Editing and Relighting with Diffusion Models

2 code implementations • NeurIPS 2023 • Hao Zhang, Yanbo Xu, Tianyuan Dai, Yu-Wing Tai, Chi-Keung Tang

The ability to create high-quality 3D faces from a single image has become increasingly important with wide applications in video conferencing, AR/VR, and advanced video editing in movie industries.

3D Face Reconstruction Video Editing +1

Paper
Code

Distill Gold from Massive Ores: Efficient Dataset Distillation via Critical Samples Selection

1 code implementation • 28 May 2023 • Yue Xu, Yong-Lu Li, Kaitong Cui, Ziyu Wang, Cewu Lu, Yu-Wing Tai, Chi-Keung Tang

Our method consistently enhances the distillation algorithms, even on much larger-scale and more heterogeneous datasets, e.g., ImageNet-1K and Kinetics-400.

Paper
Code

Deceptive-NeRF: Enhancing NeRF Reconstruction using Pseudo-Observations from Diffusion Models

no code implementations • 24 May 2023 • Xinhang Liu, Jiaben Chen, Shiu-hong Kao, Yu-Wing Tai, Chi-Keung Tang

We introduce Deceptive-NeRF, a novel methodology for few-shot NeRF reconstruction, which leverages diffusion models to synthesize plausible pseudo-observations to improve the reconstruction.

Paper
Add Code

Registering Neural Radiance Fields as 3D Density Images

no code implementations • 22 May 2023 • Han Jiang, Ruoxuan Li, Haosen Sun, Yu-Wing Tai, Chi-Keung Tang

No significant work has been done to directly merge two partially overlapping scenes using NeRF representations.

Contrastive Learning

Paper
Add Code

Instance Neural Radiance Field

1 code implementation • ICCV 2023 • Yichen Liu, Benran Hu, Junkai Huang, Yu-Wing Tai, Chi-Keung Tang

This paper presents one of the first learning-based NeRF 3D instance segmentation pipelines, dubbed Instance Neural Radiance Field, or Instance-NeRF.

3D Instance Segmentation Panoptic Segmentation +1

Paper
Code

Mask-Free Video Instance Segmentation

1 code implementation • CVPR 2023 • Lei Ke, Martin Danelljan, Henghui Ding, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu

A consistency loss is then enforced on the found matches.

Ranked #1 on Video Instance Segmentation on Youtube-VIS (trained with no video masks)

Instance Segmentation Optical Flow Estimation +4

349

Paper
Code

Clean-NeRF: Reformulating NeRF to account for View-Dependent Observations

no code implementations • 26 Mar 2023 • Xinhang Liu, Yu-Wing Tai, Chi-Keung Tang

This paper analyzes the NeRF's struggles in such settings and proposes Clean-NeRF for accurate 3D reconstruction and novel view rendering in complex scenes.

3D Reconstruction Density Estimation +3

Paper
Add Code

Ultrahigh Resolution Image/Video Matting With Spatio-Temporal Sparsity

1 code implementation • CVPR 2023 • Yanan Sun, Chi-Keung Tang, Yu-Wing Tai

Instead, our method resorts to spatial and temporal sparsity for solving general UHR matting.

Image Matting Video Matting

Paper
Code

Compression-Aware Video Super-Resolution

1 code implementation • CVPR 2023 • Yingwei Wang, Xu Jia, Xin Tao, Takashi Isobe, Huchuan Lu, Yu-Wing Tai

Videos stored on mobile devices or delivered on the Internet are usually in compressed format and are of various unknown compression parameters, but most video super-resolution (VSR) methods often assume ideal inputs resulting in large performance gap between experimental settings and real-world applications.

Model Compression Video Enhancement +1

Paper
Code

ONeRF: Unsupervised 3D Object Segmentation from Multiple Views

no code implementations • 22 Nov 2022 • Shengnan Liang, Yichen Liu, Shangzhe Wu, Yu-Wing Tai, Chi-Keung Tang

We present ONeRF, a method that automatically segments and reconstructs object instances in 3D from multi-view RGB images without any additional manual annotations.

3D scene Editing Object +1

Paper
Add Code

FLNeRF: 3D Facial Landmarks Estimation in Neural Radiance Fields

1 code implementation • 21 Nov 2022 • Hao Zhang, Tianyuan Dai, Yu-Wing Tai, Chi-Keung Tang

This paper presents the first significant work on directly predicting 3D face landmarks on neural radiance fields (NeRFs).

Paper
Code

NeRF-RPN: A general framework for object detection in NeRFs

2 code implementations • CVPR 2023 • Benran Hu, Junkai Huang, Yichen Liu, Yu-Wing Tai, Chi-Keung Tang

This paper presents the first significant object detection framework, NeRF-RPN, which directly operates on NeRF.

object-detection Object Detection

210

Paper
Code

H-VFI: Hierarchical Frame Interpolation for Videos with Large Motions

no code implementations • 21 Nov 2022 • Changlin Li, Guangyang Wu, Yanan Sun, Xin Tao, Chi-Keung Tang, Yu-Wing Tai

The learnt deformable kernel is then utilized in convolving the input frames for predicting the interpolated frame.

Video Frame Interpolation

Paper
Add Code

Normalization Perturbation: A Simple Domain Generalization Method for Real-World Domain Shifts

no code implementations • 8 Nov 2022 • Qi Fan, Mattia Segu, Yu-Wing Tai, Fisher Yu, Chi-Keung Tang, Bernt Schiele, Dengxin Dai

Thus, we propose to perturb the channel statistics of source domain features to synthesize various latent styles, so that the trained deep model can perceive diverse potential domains and generalizes well even without observations of target domain data in training.

Autonomous Driving Domain Generalization

Paper
Add Code
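
The perturbation described in the Normalization Perturbation entry above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `perturb_channel_statistics`, the Gaussian noise centered at 1, and the default `noise_std` are hypothetical choices made for this example.

```python
import math
import random
from typing import List, Optional

# One sample's features: a list of channels, each a flat list of activations.
FeatureMap = List[List[float]]


def perturb_channel_statistics(
    features: FeatureMap,
    noise_std: float = 0.5,  # hypothetical default, not taken from the paper
    rng: Optional[random.Random] = None,
    eps: float = 1e-5,
) -> FeatureMap:
    """Rescale each channel's mean and std by random factors near 1."""
    if rng is None:
        rng = random.Random()
    out: FeatureMap = []
    for channel in features:
        mean = sum(channel) / len(channel)
        var = sum((x - mean) ** 2 for x in channel) / len(channel)
        std = math.sqrt(var + eps)
        # Assumed noise model: independent Gaussian factors for mean and std.
        alpha = rng.gauss(1.0, noise_std)
        beta = rng.gauss(1.0, noise_std)
        # Normalize the channel, then re-apply the perturbed statistics.
        out.append([(x - mean) / std * (beta * std) + alpha * mean for x in channel])
    return out


features = [[1.0, 2.0, 3.0, 4.0], [10.0, 10.5, 9.5, 10.0]]
styled = perturb_channel_statistics(features, rng=random.Random(0))
```

With `noise_std=0.0` both factors equal 1 and the features pass through unchanged, which is a convenient sanity check; such a perturbation would normally be applied only during training.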

SDRTV-to-HDRTV Conversion via Spatial-Temporal Feature Fusion

no code implementations • 4 Nov 2022 • Kepeng Xu, Li Xu, Gang He, Chang Wu, Zijia Ma, Ming Sun, Yu-Wing Tai

To evaluate the performance of the proposed method, we construct a corresponding multi-frame dataset using HDR video of the HDR10 standard to conduct a comprehensive evaluation of different methods.

Paper
Add Code

Scene Text Image Super-Resolution via Content Perceptual Loss and Criss-Cross Transformer Blocks

no code implementations • 13 Oct 2022 • Rui Qin, Bin Wang, Yu-Wing Tai

The CP Loss supervises the text reconstruction with content semantics by multi-scale text recognition features, which effectively incorporates content awareness into the framework.

Image Reconstruction Image Super-Resolution +1

Paper
Add Code

Unsupervised Multi-View Object Segmentation Using Radiance Field Propagation

no code implementations • 2 Oct 2022 • Xinhang Liu, Jiaben Chen, Huai Yu, Yu-Wing Tai, Chi-Keung Tang

The core of our method is a novel propagation strategy for individual objects' radiance fields with a bidirectional photometric loss, enabling an unsupervised partitioning of a scene into salient or meaningful regions corresponding to different object instances.

3D Object Editing Object +2

Paper
Add Code

DeViT: Deformed Vision Transformers in Video Inpainting

no code implementations • 28 Sep 2022 • Jiayin Cai, Changlin Li, Xin Tao, Chun Yuan, Yu-Wing Tai

This paper proposes a novel video inpainting method.

Video Inpainting

Paper
Add Code

Occlusion-Aware Instance Segmentation via BiLayer Network Architectures

1 code implementation • 8 Aug 2022 • Lei Ke, Yu-Wing Tai, Chi-Keung Tang

Unlike previous instance segmentation methods, we model image formation as a composition of two overlapping layers, and propose Bilayer Convolutional Network (BCNet), where the top layer detects occluding objects (occluders) and the bottom layer infers partially occluded instances (occludees).

Instance Segmentation Segmentation +2

509

Paper
Code

Video Mask Transfiner for High-Quality Video Instance Segmentation

1 code implementation • 28 Jul 2022 • Lei Ke, Henghui Ding, Martin Danelljan, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu

While Video Instance Segmentation (VIS) has seen rapid progress, current approaches struggle to predict high-quality masks with accurate boundary details.

Ranked #1 on Video Instance Segmentation on HQ-YTVIS

Instance Segmentation Semantic Segmentation +2

Paper
Code

Self-Support Few-Shot Semantic Segmentation

1 code implementation • 23 Jul 2022 • Qi Fan, Wenjie Pei, Yu-Wing Tai, Chi-Keung Tang

Motivated by the simple Gestalt principle that pixels belonging to the same object are more similar than those to different objects of same class, we propose a novel self-support matching strategy to alleviate this problem, which uses query prototypes to match query features, where the query prototypes are collected from high-confidence query predictions.

Ranked #12 on Few-Shot Semantic Segmentation on PASCAL-5i (5-Shot)

Few-Shot Semantic Segmentation Segmentation +1

Paper
Code

Learning Sequence Representations by Non-local Recurrent Neural Memory

1 code implementation • 20 Jul 2022 • Wenjie Pei, Xin Feng, Canmiao Fu, Qiong Cao, Guangming Lu, Yu-Wing Tai

The key challenge of sequence representation learning is to capture the long-range temporal dependencies.

Representation Learning

Paper
Code

GCoNet+: A Stronger Group Collaborative Co-Salient Object Detector

2 code implementations • 30 May 2022 • Peng Zheng, Huazhu Fu, Deng-Ping Fan, Qi Fan, Jie Qin, Yu-Wing Tai, Chi-Keung Tang, Luc van Gool

In this paper, we present a novel end-to-end group collaborative learning network, termed GCoNet+, which can effectively and efficiently (250 fps) identify co-salient objects in natural scenes.

Ranked #1 on Co-Salient Object Detection on CoCA

Co-Salient Object Detection Object +2

152

Paper
Code

Human Instance Matting via Mutual Guidance and Multi-Instance Refinement

1 code implementation • CVPR 2022 • Yanan Sun, Chi-Keung Tang, Yu-Wing Tai

A new instance matting metric called instance matting quality (IMQ) is proposed, which addresses the absence of a unified and fair means of evaluation emphasizing both instance recognition and matting quality.

Image Matting Instance Segmentation +1

Paper
Code

Interactiveness Field in Human-Object Interactions

1 code implementation • CVPR 2022 • Xinpeng Liu, Yong-Lu Li, Xiaoqian Wu, Yu-Wing Tai, Cewu Lu, Chi-Keung Tang

Human-Object Interaction (HOI) detection plays a core role in activity understanding.

Human-Object Interaction Detection Object

Paper
Code

Look Back and Forth: Video Super-Resolution with Explicit Temporal Difference Modeling

1 code implementation • CVPR 2022 • Takashi Isobe, Xu Jia, Xin Tao, Changlin Li, Ruihuang Li, Yongjie Shi, Jing Mu, Huchuan Lu, Yu-Wing Tai

Instead of directly feeding consecutive frames into a VSR model, we propose to compute the temporal difference between frames and divide those pixels into two subsets according to the level of difference.

Motion Compensation Optical Flow Estimation +1

Paper
Code
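
The pixel split described in the "Look Back and Forth" entry above can be illustrated with a small sketch. It assumes grayscale frames stored as nested lists and a single fixed `threshold`; both the threshold and the function name `split_by_temporal_difference` are hypothetical, and the paper's actual partitioning rule may differ.

```python
from typing import List, Tuple

Frame = List[List[float]]  # grayscale frame as rows of pixel intensities
Mask = List[List[bool]]


def split_by_temporal_difference(
    prev: Frame, curr: Frame, threshold: float
) -> Tuple[Mask, Mask]:
    """Return (low_diff_mask, high_diff_mask) for two same-sized frames.

    `threshold` is an illustrative tuning parameter, not a value from the paper.
    """
    if len(prev) != len(curr) or any(len(a) != len(b) for a, b in zip(prev, curr)):
        raise ValueError("frames must have the same shape")
    low: Mask = []
    high: Mask = []
    for row_prev, row_curr in zip(prev, curr):
        # Absolute per-pixel difference between consecutive frames.
        diffs = [abs(c - p) for p, c in zip(row_prev, row_curr)]
        high.append([d > threshold for d in diffs])
        low.append([d <= threshold for d in diffs])
    return low, high


prev_frame = [[0.0, 0.5], [1.0, 0.2]]
curr_frame = [[0.0, 0.9], [0.4, 0.25]]
low_mask, high_mask = split_by_temporal_difference(prev_frame, curr_frame, threshold=0.1)
```

The two masks are complementary, so each pixel lands in exactly one subset and the subsets can then be handled by separate processing paths.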

HAA4D: Few-Shot Human Atomic Action Recognition via 3D Spatio-Temporal Skeletal Alignment

no code implementations • 15 Feb 2022 • Mu-Ruei Tseng, Abhishek Gupta, Chi-Keung Tang, Yu-Wing Tai

All training and testing 3D skeletons in HAA4D are globally aligned, using a deep alignment model to the same global space, making each skeleton face the negative z-direction.

Atomic action recognition

Paper
Add Code

Transcoded Video Restoration by Temporal Spatial Auxiliary Network

1 code implementation • 15 Dec 2021 • Li Xu, Gang He, Jinjia Zhou, Jie Lei, Weiying Xie, Yunsong Li, Yu-Wing Tai

On most video platforms, such as YouTube and TikTok, the played videos have usually undergone multiple video encodings such as hardware encoding by recording devices, software encoding by video editing apps, and single/multiple video transcoding by video application servers.

Video Editing Video Restoration

Paper
Code

NeRF-SR: High-Quality Neural Radiance Fields using Supersampling

1 code implementation • 3 Dec 2021 • Chen Wang, Xian Wu, Yuan-Chen Guo, Song-Hai Zhang, Yu-Wing Tai, Shi-Min Hu

We present NeRF-SR, a solution for high-resolution (HR) novel view synthesis with mostly low-resolution (LR) inputs.

Novel View Synthesis Vocal Bursts Intensity Prediction

128

Paper
Code

Mask Transfiner for High-Quality Instance Segmentation

2 code implementations • CVPR 2022 • Lei Ke, Martin Danelljan, Xia Li, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu

Instead of operating on regular dense tensors, our Mask Transfiner decomposes and represents the image regions as a quadtree.

Ranked #1 on Instance Segmentation on BDD100K val

Instance Segmentation Segmentation +2

520

Paper
Code

Occlusion-Aware Video Object Inpainting

no code implementations • ICCV 2021 • Lei Ke, Yu-Wing Tai, Chi-Keung Tang

To facilitate this new research, we construct the first large-scale video object inpainting benchmark YouTube-VOI to provide realistic occlusion scenarios with both occluded and visible object masks available.

Object Texture Synthesis +1

Paper
Add Code

Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation

1 code implementation • NeurIPS 2021 • Lei Ke, Xia Li, Martin Danelljan, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu

We propose Prototypical Cross-Attention Network (PCAN), capable of leveraging rich spatio-temporal information for online multiple object tracking and segmentation.

Ranked #1 on Video Instance Segmentation on BDD100K val

Multi-Object Tracking and Segmentation Multiple Object Track and Segmentation +3

359

Paper
Code

Rethinking Space-Time Networks with Improved Memory Coverage for Efficient Video Object Segmentation

3 code implementations • NeurIPS 2021 • Ho Kei Cheng, Yu-Wing Tai, Chi-Keung Tang

This paper presents a simple yet effective approach to modeling space-time correspondences in the context of video object segmentation.

Ranked #7 on Video Object Segmentation on YouTube-VOS 2019

Semantic Segmentation Semi-Supervised Video Object Segmentation +1

521

Paper
Code

Few-Shot Video Object Detection

1 code implementation • 30 Apr 2021 • Qi Fan, Chi-Keung Tang, Yu-Wing Tai

We introduce Few-Shot Video Object Detection (FSVOD) with three contributions to real-world visual learning challenges in our highly diverse and dynamic world: 1) a large-scale video dataset FSVOD-500 comprising 500 classes with class-balanced videos in each category for few-shot learning; 2) a novel Tube Proposal Network (TPN) to generate high-quality video tube proposals for aggregating feature representation for the target video object which can be highly dynamic; 3) a strategically improved Temporal Matching Network (TMN+) for matching representative query tube features with better discriminative ability thus achieving higher diversity.

Few-Shot Video Object Detection Object +2

344

Paper
Code

Deep Video Matting via Spatio-Temporal Alignment and Aggregation

1 code implementation • CVPR 2021 • Yanan Sun, Guanzhi Wang, Qiao Gu, Chi-Keung Tang, Yu-Wing Tai

Despite the significant progress made by deep learning in natural image matting, there has been so far no representative work on deep learning for video matting due to the inherent technical challenges in reasoning temporal domain and lack of large-scale video matting datasets.

Image Matting Optical Flow Estimation +1

Paper
Code

Few-Shot Model Adaptation for Customized Facial Landmark Detection, Segmentation, Stylization and Shadow Removal

no code implementations • 19 Apr 2021 • Zhen Wei, Bingkun Liu, Weinong Wang, Yu-Wing Tai

Thus, there is always a great demand in customized data annotations.

Facial Landmark Detection Shadow Removal

Paper
Add Code

Semantic Image Matting

1 code implementation • CVPR 2021 • Yanan Sun, Chi-Keung Tang, Yu-Wing Tai

Specifically, we consider and learn 20 classes of matting patterns, and propose to extend the conventional trimap to semantic trimap.

Ranked #1 on Semantic Image Matting on Semantic Image Matting Dataset

Semantic Image Matting Transparent objects

216

Paper
Code

Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers

1 code implementation • CVPR 2021 • Lei Ke, Yu-Wing Tai, Chi-Keung Tang

Segmenting highly-overlapping objects is challenging, because typically no distinction is made between real object contours and occlusion boundaries.

Ranked #1 on Instance Segmentation on KINS

Amodal Instance Segmentation Boundary Detection +4

509

Paper
Code

Group Collaborative Learning for Co-Salient Object Detection

1 code implementation • CVPR 2021 • Qi Fan, Deng-Ping Fan, Huazhu Fu, Chi-Keung Tang, Ling Shao, Yu-Wing Tai

We present a novel group collaborative learning framework (GCoNet) capable of detecting co-salient objects in real time (16ms), by simultaneously mining consensus representations at group level based on the two necessary criteria: 1) intra-group compactness to better formulate the consistency among co-salient objects by capturing their inherent shared attributes using our novel group affinity module; 2) inter-group separability to effectively suppress the influence of noisy objects on the output by introducing our new group collaborating module conditioning the inconsistent consensus.

Ranked #5 on Co-Salient Object Detection on CoCA

Co-Salient Object Detection Object +2

Paper
Code

Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion

5 code implementations • CVPR 2021 • Ho Kei Cheng, Yu-Wing Tai, Chi-Keung Tang

We present Modular interactive VOS (MiVOS) framework which decouples interaction-to-mask and mask propagation, allowing for higher generalizability and better performance.

Ranked #1 on Interactive Video Object Segmentation on DAVIS 2017 (using extra training data)

Interactive Video Object Segmentation Semantic Segmentation +2

446

Paper
Code

PRIN/SPRIN: On Extracting Point-wise Rotation Invariant Features

2 code implementations • 24 Feb 2021 • Yang You, Yujing Lou, Ruoxi Shi, Qi Liu, Yu-Wing Tai, Lizhuang Ma, Weiming Wang, Cewu Lu

Spherical Voxel Convolution and Point Re-sampling are proposed to extract rotation invariant features for each point.

3D Feature Matching Data Augmentation

Paper
Code

Semi-Supervised Few-Shot Atomic Action Recognition

1 code implementation • 17 Nov 2020 • Xiaoyuan Ni, Sizhe Song, Yu-Wing Tai, Chi-Keung Tang

Although excellent progress has been made, the performance on action recognition still heavily relies on specific datasets, which are difficult to extend to new action classes due to labor-intensive labeling.

Atomic action recognition

Paper
Code

HAA500: Human-Centric Atomic Action Dataset with Curated Videos

no code implementations • ICCV 2021 • Jihoon Chung, Cheng-hsin Wuu, Hsuan-ru Yang, Yu-Wing Tai, Chi-Keung Tang

We contribute HAA500, a manually annotated human-centric atomic action dataset for action recognition on 500 classes with over 591K labeled frames.

Ranked #1 on Action Recognition on HAA500

Action Classification Action Recognition

Paper
Add Code

Pose-Guided High-Resolution Appearance Transfer via Progressive Training

no code implementations • 27 Aug 2020 • Ji Liu, Heshan Liu, Mang-Tik Chiu, Yu-Wing Tai, Chi-Keung Tang

We propose a novel pose-guided appearance transfer network for transferring a given reference appearance to a target pose in unprecedented image resolution (1024 * 1024), given respectively an image of the reference and target person.

Video Generation Vocal Bursts Intensity Prediction

Paper
Add Code

GSNet: Joint Vehicle Pose and Shape Reconstruction with Geometrical and Scene-aware Supervision

1 code implementation • ECCV 2020 • Lei Ke, Shichao Li, Yanan Sun, Yu-Wing Tai, Chi-Keung Tang

GSNet utilizes a unique four-way feature extraction and fusion scheme and directly regresses 6DoF poses and shapes in a single forward pass.

Ranked #1 on Autonomous Driving on ApolloCar3D

3D Car Instance Understanding 3D Pose Estimation +11

134

Paper
Code

Commonality-Parsing Network across Shape and Appearance for Partially Supervised Instance Segmentation

1 code implementation • ECCV 2020 • Qi Fan, Lei Ke, Wenjie Pei, Chi-Keung Tang, Yu-Wing Tai

We propose to learn the underlying class-agnostic commonalities that can be generalized from mask-annotated categories to novel categories.

Ranked #79 on Instance Segmentation on COCO test-dev

Instance Segmentation Segmentation +1

344

Paper
Code

Fully Convolutional Networks for Continuous Sign Language Recognition

no code implementations • ECCV 2020 • Ka Leong Cheng, Zhaoyang Yang, Qifeng Chen, Yu-Wing Tai

Continuous sign language recognition (SLR) is a challenging task that requires learning on both spatial and temporal dimensions of signing frame sequences.

Sentence Sign Language Recognition

Paper
Add Code

Dense Hybrid Recurrent Multi-view Stereo Net with Dynamic Consistency Checking

2 code implementations • ECCV 2020 • Jian-Feng Yan, Zizhuang Wei, Hongwei Yi, Mingyu Ding, Runze Zhang, Yisong Chen, Guoping Wang, Yu-Wing Tai

In this paper, we propose an efficient and effective dense hybrid recurrent multi-view stereo net with dynamic consistency checking, namely $D^{2}$HC-RMVSNet, for accurate dense point cloud reconstruction.

Point cloud reconstruction

110

Paper
Code

Dive Deeper Into Box for Object Detection

no code implementations • ECCV 2020 • Ran Chen, Yong Liu, Mengdan Zhang, Shu Liu, Bei Yu, Yu-Wing Tai

Anchor-free methods have defined the new frontier in state-of-the-art object detection research, where accurate bounding box estimation is the key to the success of these methods.

Object object-detection +1

Paper
Add Code

Fast Video Object Segmentation With Temporal Aggregation Network and Dynamic Template Matching

no code implementations • CVPR 2020 • Xuhua Huang, Jiarui Xu, Yu-Wing Tai, Chi-Keung Tang

Significant progress has been made in Video Object Segmentation (VOS), the video object tracking task in its finest level.

Ranked #71 on Semi-Supervised Video Object Segmentation on DAVIS 2016

Object One-Shot Learning +6

Paper
Add Code

Cascaded deep monocular 3D human pose estimation with evolutionary training data

1 code implementation • CVPR 2020 • Shichao Li, Lei Ke, Kevin Pratama, Yu-Wing Tai, Chi-Keung Tang, Kwang-Ting Cheng

End-to-end deep representation learning has achieved remarkable accuracy for monocular 3D human pose estimation, yet these models may fail for unseen poses with limited and fixed training data.

Ranked #13 on Weakly-supervised 3D Human Pose Estimation on Human3.6M

Data Augmentation Monocular 3D Human Pose Estimation +3

327

Paper
Code

One-Shot Object Detection without Fine-Tuning

1 code implementation • 8 May 2020 • Xiang Li, Lin Zhang, Yau Pun Chen, Yu-Wing Tai, Chi-Keung Tang

Deep learning has revolutionized object detection thanks to large-scale datasets, but their object categories are still arguably very limited.

Metric Learning Object +2

Paper
Code

CascadePSP: Toward Class-Agnostic and Very High-Resolution Segmentation via Global and Local Refinement

2 code implementations • CVPR 2020 • Ho Kei Cheng, Jihoon Chung, Yu-Wing Tai, Chi-Keung Tang

In this paper, we propose a novel approach to address the high-resolution segmentation problem without using any high-resolution training data.

Ranked #1 on Semantic Segmentation on BIG (using extra training data)

4k Land Cover Classification +3

788

Paper
Code

Learning Video Object Segmentation from Unlabeled Videos

1 code implementation • CVPR 2020 • Xiankai Lu, Wenguan Wang, Jianbing Shen, Yu-Wing Tai, David Crandall, Steven C. H. Hoi

We propose a new method for video object segmentation (VOS) that addresses object pattern learning from unlabeled videos, unlike most existing methods which rely heavily on extensive annotated data.

Ranked #4 on Unsupervised Video Object Segmentation on DAVIS 2017 (test-dev)

Object Representation Learning +6

Paper
Code

Spatial-Scale Aligned Network for Fine-Grained Recognition

no code implementations • 5 Jan 2020 • Lizhao Gao, Hai-Hua Xu, Chong Sun, Junling Liu, Yu-Wing Tai

Existing approaches for fine-grained visual recognition focus on learning marginal region-based representations while neglecting the spatial and scale misalignments, leading to inferior performance.

Fine-Grained Visual Recognition

Paper
Add Code

Pyramid Multi-view Stereo Net with Self-adaptive View Aggregation

1 code implementation • ECCV 2020 • Hongwei Yi, Zizhuang Wei, Mingyu Ding, Runze Zhang, Yisong Chen, Guoping Wang, Yu-Wing Tai

In this paper, we propose an effective and efficient pyramid multi-view stereo (MVS) net with self-adaptive view aggregation for accurate and complete dense point cloud reconstruction.

3D Point Cloud Reconstruction Depth Estimation +1

Paper
Code

Reflective Decoding Network for Image Captioning

no code implementations • ICCV 2019 • Lei Ke, Wenjie Pei, Ruiyu Li, Xiaoyong Shen, Yu-Wing Tai

State-of-the-art image captioning methods mostly focus on improving visual features; less attention has been paid to utilizing the inherent properties of language to boost captioning performance.

Ranked #4 on Image Captioning on MS COCO

Image Captioning Position +1

Paper
Add Code

Push for Center Learning via Orthogonalization and Subspace Masking for Person Re-Identification

no code implementations • 28 Aug 2019 • Weinong Wang, Wenjie Pei, Qiong Cao, Shu Liu, Yu-Wing Tai

Person re-identification aims to identify whether pairs of images belong to the same person or not.

Person Re-Identification

Paper
Add Code

Non-local Recurrent Neural Memory for Supervised Sequence Modeling

no code implementations • ICCV 2019 • Canmiao Fu, Wenjie Pei, Qiong Cao, Chaopeng Zhang, Yong Zhao, Xiaoyong Shen, Yu-Wing Tai

Typical methods for supervised sequence modeling are built upon the recurrent neural networks to capture temporal dependencies.

Action Recognition Sentiment Analysis

Paper
Add Code

Cross-Domain Adaptation for Animal Pose Estimation

no code implementations • ICCV 2019 • Jinkun Cao, Hongyang Tang, Hao-Shu Fang, Xiaoyong Shen, Cewu Lu, Yu-Wing Tai

Therefore, the easily available human pose dataset, which is of a much larger scale than our labeled animal dataset, provides important prior knowledge to boost up the performance on animal pose estimation.

Animal Pose Estimation Domain Adaptation

Paper
Add Code

Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector

3 code implementations • CVPR 2020 • Qi Fan, Wei Zhuo, Chi-Keung Tang, Yu-Wing Tai

To train our network, we contribute a new dataset that contains 1000 categories of various objects with high-quality annotations.

Ranked #21 on Few-Shot Object Detection on MS-COCO (10-shot)

Few-Shot Object Detection Object +2

375

Paper
Code

SF-Net: Structured Feature Network for Continuous Sign Language Recognition

no code implementations • 4 Aug 2019 • Zhaoyang Yang, Zhenmei Shi, Xiaoyong Shen, Yu-Wing Tai

The proposed SF-Net extracts features in a structured manner and gradually encodes information at the frame level, the gloss level and the sentence level into the feature representation.

Sentence Sign Language Recognition

Paper
Add Code

DAWN: Dual Augmented Memory Network for Unsupervised Video Object Tracking

no code implementations • 2 Aug 2019 • Zhenmei Shi, Haoyang Fang, Yu-Wing Tai, Chi-Keung Tang

Our Dual Augmented Memory Network (DAWN) is unique in remembering both target and background, and using an improved attention LSTM memory to guide the focus on memorized features.

Video Object Tracking Visual Tracking

Paper
Add Code

FSS-1000: A 1000-Class Dataset for Few-Shot Segmentation

1 code implementation • CVPR 2020 • Xiang Li, Tianhan Wei, Yau Pun Chen, Yu-Wing Tai, Chi-Keung Tang

In this paper, we are interested in few-shot object segmentation where the number of annotated training examples is limited to only 5.

Ranked #20 on Few-Shot Semantic Segmentation on FSS-1000 (5-shot)

Few-Shot Semantic Segmentation Object +2

268

Paper
Code

StableNet: Semi-Online, Multi-Scale Deep Video Stabilization

no code implementations • 24 Jul 2019 • Chia-Hung Huang, Hang Yin, Yu-Wing Tai, Chi-Keung Tang

Video stabilization algorithms are of greater importance nowadays with the prevalence of hand-held devices which unavoidably produce videos with undesirable shaky motions.

Video Stabilization

Paper
Add Code

Landmark Assisted CycleGAN for Cartoon Face Generation

no code implementations • 2 Jul 2019 • Ruizheng Wu, Xiaodong Gu, Xin Tao, Xiaoyong Shen, Yu-Wing Tai, Jiaya Jia

In this paper, we are interested in generating a cartoon face of a person by using unpaired training data between real faces and cartoon ones.

Face Generation

Paper
Add Code

Memory-Attended Recurrent Network for Video Captioning

1 code implementation • CVPR 2019 • Wenjie Pei, Jiyuan Zhang, Xiangrong Wang, Lei Ke, Xiaoyong Shen, Yu-Wing Tai

Typical techniques for video captioning follow the encoder-decoder framework, which can only focus on one source video being processed.

Video Captioning

Paper
Code

LADN: Local Adversarial Disentangling Network for Facial Makeup and De-Makeup

1 code implementation • ICCV 2019 • Qiao Gu, Guanzhi Wang, Mang Tik Chiu, Yu-Wing Tai, Chi-Keung Tang

Central to our method are multiple and overlapping local adversarial discriminators in a content-style disentangling network for achieving local detail transfer between facial images, with the use of asymmetric loss functions for dramatic makeup styles with high-frequency details.

Style Transfer

177

Paper
Code

Pointwise Rotation-Invariant Network with Adaptive Sampling and 3D Spherical Voxel Convolution

1 code implementation • 23 Nov 2018 • Yang You, Yujing Lou, Qi Liu, Yu-Wing Tai, Lizhuang Ma, Cewu Lu, Weiming Wang

Point cloud analysis without pose priors is very challenging in real applications, as the orientations of point clouds are often unknown.

3D Feature Matching Data Augmentation

Paper
Code

Physics-Based Generative Adversarial Models for Image Restoration and Beyond

no code implementations • 2 Aug 2018 • Jinshan Pan, Jiangxin Dong, Yang Liu, Jiawei Zhang, Jimmy Ren, Jinhui Tang, Yu-Wing Tai, Ming-Hsuan Yang

We present an algorithm to directly solve numerous image restoration problems (e.g., image deblurring, image dehazing, image deraining, etc.).

Deblurring Image Deblurring +3

Paper
Add Code

Pairwise Body-Part Attention for Recognizing Human-Object Interactions

1 code implementation • ECCV 2018 • Hao-Shu Fang, Jinkun Cao, Yu-Wing Tai, Cewu Lu

We propose a new pairwise body-part attention model which can learn to focus on crucial parts and their correlations for HOI recognition.

Ranked #5 on Human-Object Interaction Detection on HICO

feature selection Human-Object Interaction Detection +1

Paper
Code

Learning Dual Convolutional Neural Networks for Low-Level Vision

no code implementations • CVPR 2018 • Jinshan Pan, Sifei Liu, Deqing Sun, Jiawei Zhang, Yang Liu, Jimmy Ren, Zechao Li, Jinhui Tang, Huchuan Lu, Yu-Wing Tai, Ming-Hsuan Yang

These problems usually involve the estimation of two components of the target signals: structures and details.

Rain Removal Super-Resolution

Paper
Add Code

Weakly and Semi Supervised Human Body Part Parsing via Pose-Guided Knowledge Transfer

1 code implementation • CVPR 2018 • Hao-Shu Fang, Guansong Lu, Xiaolin Fang, Jianwen Xie, Yu-Wing Tai, Cewu Lu

In this paper, we present a novel method to generate synthetic human part segmentation data using easily-obtained human keypoint annotations.

Ranked #4 on Human Part Segmentation on PASCAL-Part (using extra training data)

Human Parsing Human Part Segmentation +3

300

Paper
Code

MAVOT: Memory-Augmented Video Object Tracking

no code implementations • 26 Nov 2017 • Boyu Liu, Yanzhao Wang, Yu-Wing Tai, Chi-Keung Tang

We introduce a one-shot learning approach for video object tracking.

Object One-Shot Learning +2

Paper
Add Code

Deep High Dynamic Range Imaging with Large Foreground Motions

1 code implementation • ECCV 2018 • Shangzhe Wu, Jiarui Xu, Yu-Wing Tai, Chi-Keung Tang

In state-of-the-art deep HDR imaging, input images are first aligned using optical flows before merging, which are still error-prone due to occlusion and large motions.

Translation Vocal Bursts Intensity Prediction

180

Paper
Code

Image Generation from Sketch Constraint Using Contextual GAN

1 code implementation • ECCV 2018 • Yongyi Lu, Shangzhe Wu, Yu-Wing Tai, Chi-Keung Tang

We train a generative adversarial network, i.e., contextual GAN, to learn the joint distribution of sketch and the corresponding image by using joint images.

Image-to-Image Translation Translation

Paper
Code

Deep Video Generation, Prediction and Completion of Human Action Sequences

no code implementations • ECCV 2018 • Haoye Cai, Chunyan Bai, Yu-Wing Tai, Chi-Keung Tang

In the second stage, a skeleton-to-image network is trained, which is used to generate a human action video given the complete human pose sequence generated in the first stage.

Ranked #5 on Human action generation on NTU RGB+D 2D

Human action generation Video Generation +1

Paper
Add Code

Adversarial Attacks Beyond the Image Space

no code implementations • CVPR 2019 • Xiaohui Zeng, Chenxi Liu, Yu-Siang Wang, Weichao Qiu, Lingxi Xie, Yu-Wing Tai, Chi Keung Tang, Alan L. Yuille

Though image-space adversaries can be interpreted as per-pixel albedo change, we verify that they cannot be well explained along these physically meaningful dimensions, which often have a non-local effect.

Question Answering Visual Question Answering

Paper
Add Code

Learning Discriminative Data Fitting Functions for Blind Image Deblurring

no code implementations • ICCV 2017 • Jinshan Pan, Jiangxin Dong, Yu-Wing Tai, Zhixun Su, Ming-Hsuan Yang

Solving blind image deblurring usually requires defining a data fitting function and image priors.

Blind Image Deblurring Image Deblurring +1

Paper
Add Code

Image Dehazing using Bilinear Composition Loss Function

no code implementations • 1 Oct 2017 • Hui Yang, Jinshan Pan, Qiong Yan, Wenxiu Sun, Jimmy Ren, Yu-Wing Tai

In this paper, we introduce a bilinear composition loss function to address the problem of image dehazing.

Blocking Image Dehazing

Paper
Add Code

Weakly- and Self-Supervised Learning for Content-Aware Deep Image Retargeting

3 code implementations • ICCV 2017 • Donghyeon Cho, Jinsun Park, Tae-Hyun Oh, Yu-Wing Tai, In So Kweon

Our method implicitly learns an attention map, which leads to a content-aware shift map for image retargeting.

Image Retargeting Self-Supervised Learning

Paper
Code

Attribute-Guided Face Generation Using Conditional CycleGAN

no code implementations • ECCV 2018 • Yongyi Lu, Yu-Wing Tai, Chi-Keung Tang

We are interested in attribute-guided face generation: given a low-res face input image and an attribute vector that can be extracted from a high-res image (attribute image), our new method generates a high-res face image for the low-res input that satisfies the given attributes.

Attribute Face Generation +2

Paper
Add Code

A Unified Approach of Multi-scale Deep and Hand-crafted Features for Defocus Estimation

1 code implementation • CVPR 2017 • Jinsun Park, Yu-Wing Tai, Donghyeon Cho, In So Kweon

In this paper, we introduce robust and synergetic hand-crafted features and a simple but efficient deep feature from a convolutional neural network (CNN) architecture for defocus estimation.

Ranked #2 on Defocus Estimation on CUHK - Blur Detection Dataset

Defocus Estimation Image Generation

Paper
Code

Accurate Single Stage Detector Using Recurrent Rolling Convolution

2 code implementations • CVPR 2017 • Jimmy Ren, Xiaohao Chen, Jianbo Liu, Wenxiu Sun, Jiahao Pang, Qiong Yan, Yu-Wing Tai, Li Xu

In this paper, we propose a novel single stage end-to-end trainable object detection network to overcome this limitation.

3D Object Detection Object +2

362

Paper
Code

RMPE: Regional Multi-person Pose Estimation

14 code implementations • ICCV 2017 • Hao-Shu Fang, Shuqin Xie, Yu-Wing Tai, Cewu Lu

In this paper, we propose a novel regional multi-person pose estimation (RMPE) framework to facilitate pose estimation in the presence of inaccurate human bounding boxes.

Ranked #1 on Pose Estimation on UAV-Human

2D Human Pose Estimation Human Detection +2

7,708

Paper
Code

Refining Geometry from Depth Sensors using IR Shading Images

no code implementations • 18 Aug 2016 • Gyeongmin Choe, Jaesik Park, Yu-Wing Tai, In So Kweon

To resolve the ambiguity in our model between the normals and distances, we utilize an initial 3D mesh from the Kinect fusion and multi-view information to reliably estimate surface details that were not captured and reconstructed by the Kinect fusion.

Paper
Add Code

Network Trimming: A Data-Driven Neuron Pruning Approach towards Efficient Deep Architectures

7 code implementations • 12 Jul 2016 • Hengyuan Hu, Rui Peng, Yu-Wing Tai, Chi-Keung Tang

We alternate the pruning and retraining to further reduce zero activations in a network.

Efficient Neural Network

406

Paper
Code

Efficient and Robust Color Consistency for Community Photo Collections

no code implementations • CVPR 2016 • Jaesik Park, Yu-Wing Tai, Sudipta N. Sinha, In So Kweon

We present a robust low-rank matrix factorization method to estimate the unknown parameters of this model.

Paper
Add Code

Deep Saliency with Encoded Low level Distance Map and High Level Features

2 code implementations • CVPR 2016 • Gayoung Lee, Yu-Wing Tai, Junmo Kim

Recent advances in saliency detection have utilized deep learning to obtain high level features to detect salient regions in a scene.

Saliency Detection

Paper
Code

Look, Listen and Learn - A Multimodal LSTM for Speaker Identification

no code implementations • 13 Feb 2016 • Jimmy Ren, Yongtao Hu, Yu-Wing Tai, Chuan Wang, Li Xu, Wenxiu Sun, Qiong Yan

This task requires not only collective perception over both visual and auditory signals but also the robustness to handle severe quality degradations and unconstrained content variations.

Speaker Identification

Paper
Add Code

RGB-Guided Hyperspectral Image Upsampling

no code implementations • ICCV 2015 • Hyeokhyen Kwon, Yu-Wing Tai

In contrast, the latest imaging sensors capture an RGB image with a resolution several times larger than that of a hyperspectral image.

Paper
Add Code

Fast Randomized Singular Value Thresholding for Low-rank Optimization

no code implementations • 1 Sep 2015 • Tae-Hyun Oh, Yasuyuki Matsushita, Yu-Wing Tai, In So Kweon

The problems related to NNM, or WNNM, can be solved iteratively by applying a closed-form proximal operator, called Singular Value Thresholding (SVT), or Weighted SVT, but they suffer from high computational cost of Singular Value Decomposition (SVD) at each iteration.

Clustering

Paper
Add Code

Fast Randomized Singular Value Thresholding for Nuclear Norm Minimization

no code implementations • CVPR 2015 • Tae-Hyun Oh, Yasuyuki Matsushita, Yu-Wing Tai, In So Kweon

The problems related to NNM (or WNNM) can be solved iteratively by applying a closed-form proximal operator, called Singular Value Thresholding (SVT) (or Weighted SVT), but they suffer from high computational cost to compute a Singular Value Decomposition (SVD) at each iteration.

Clustering

Paper
Add Code

Data-Driven Depth Map Refinement via Multi-Scale Sparse Representation

no code implementations • CVPR 2015 • Hyeokhyen Kwon, Yu-Wing Tai, Stephen Lin

Depth maps captured by consumer-level depth cameras such as Kinect are usually degraded by noise, missing values, and quantization.

Dictionary Learning Quantization

Paper
Add Code

Accurate Depth Map Estimation From a Lenslet Light Field Camera

no code implementations • CVPR 2015 • Hae-Gon Jeon, Jaesik Park, Gyeongmin Choe, Jinsun Park, Yunsu Bok, Yu-Wing Tai, In So Kweon

This paper introduces an algorithm that accurately estimates depth maps using a lenslet light field camera.

Depth Estimation

Paper
Add Code

Partial Sum Minimization of Singular Values in Robust PCA: Algorithm and Applications

no code implementations • 4 Mar 2015 • Tae-Hyun Oh, Yu-Wing Tai, Jean-Charles Bazin, Hyeongwoo Kim, In So Kweon

Robust Principal Component Analysis (RPCA) via rank minimization is a powerful tool for recovering underlying low-rank structure of clean data corrupted with sparse noise/outliers.

Edge Detection

Paper
Add Code

Salient Region Detection via High-Dimensional Color Transform

no code implementations • CVPR 2014 • Jiwhan Kim, Dongyoon Han, Yu-Wing Tai, Junmo Kim

By mapping a low dimensional RGB color to a feature vector in a high-dimensional color space, we show that we can linearly separate the salient regions from the background by finding an optimal linear combination of color coefficients in the high-dimensional color space.

Vocal Bursts Intensity Prediction

Paper
Add Code

Calibrating a Non-isotropic Near Point Light Source using a Plane

no code implementations • CVPR 2014 • Jaesik Park, Sudipta N. Sinha, Yasuyuki Matsushita, Yu-Wing Tai, In So Kweon

We show that a non-isotropic near point light source rigidly attached to a camera can be calibrated using multiple images of a weakly textured planar scene.

Position

Paper
Add Code

Exploiting Shading Cues in Kinect IR Images for Geometry Refinement

no code implementations • CVPR 2014 • Gyeongmin Choe, Jaesik Park, Yu-Wing Tai, In So Kweon

To resolve ambiguity in our model between normals and distance, we utilize an initial 3D mesh from the Kinect fusion and multi-view information to reliably estimate surface details that were not reconstructed by the Kinect fusion.

Paper
Add Code

Shading-Based Shape Refinement of RGB-D Images

no code implementations • CVPR 2013 • Lap-Fai Yu, Sai-Kit Yeung, Yu-Wing Tai, Stephen Lin

We present a shading-based shape refinement algorithm which uses a noisy, incomplete depth map from Kinect to help resolve ambiguities in shape-from-shading.

Paper
Add Code
