Search Results for author: Kai Han

Found 122 papers, 81 papers with code

GhostNetV3: Exploring the Training Strategies for Compact Models

no code implementations • 17 Apr 2024 • Zhenhua Liu, Zhiwei Hao, Kai Han, Yehui Tang, Yunhe Wang

In this paper, by systematically investigating the impact of different training ingredients, we introduce a strong training strategy for compact models.

Knowledge Distillation object-detection +1

SPTNet: An Efficient Alternative Framework for Generalized Category Discovery with Spatial Prompt Tuning

no code implementations • 20 Mar 2024 • Hongjun Wang, Sagar Vaze, Kai Han

We thoroughly evaluate our SPTNet on standard benchmarks and demonstrate that our method outperforms existing GCD methods.

SAM-DiffSR: Structure-Modulated Diffusion Model for Image Super-Resolution

1 code implementation • 27 Feb 2024 • Chengcheng Wang, Zhiwei Hao, Yehui Tang, Jianyuan Guo, Yujie Yang, Kai Han, Yunhe Wang

In this paper, we propose the SAM-DiffSR model, which can utilize the fine-grained structure information from SAM in the process of sampling noise to improve the image quality without additional computational cost during inference.

Image Super-Resolution

DenseMamba: State Space Models with Dense Hidden Connection for Efficient Large Language Models

1 code implementation • 26 Feb 2024 • Wei He, Kai Han, Yehui Tang, Chengcheng Wang, Yujie Yang, Tianyu Guo, Yunhe Wang

Large language models (LLMs) face a daunting challenge due to the excessive computational and memory requirements of the commonly used Transformer architecture.

Assortment Planning with Sponsored Products

no code implementations • 9 Feb 2024 • Shaojie Tang, Shuzhang Cai, Jing Yuan, Kai Han

In the rapidly evolving landscape of retail, assortment planning plays a crucial role in determining the success of a business.

Combinatorial Optimization

Data-efficient Large Vision Models through Sequential Autoregression

1 code implementation • 7 Feb 2024 • Jianyuan Guo, Zhiwei Hao, Chengcheng Wang, Yehui Tang, Han Wu, Han Hu, Kai Han, Chang Xu

Training general-purpose vision models on purely sequential visual data, eschewing linguistic inputs, has heralded a new frontier in visual understanding.

Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

1 code implementation • 6 Feb 2024 • Jianyuan Guo, Hanting Chen, Chengcheng Wang, Kai Han, Chang Xu, Yunhe Wang

Recent advancements in large language models have sparked interest in their extraordinary and near-superhuman capabilities, leading researchers to explore methods for evaluating and optimizing these abilities, which is called superalignment.

Few-Shot Learning Knowledge Distillation +1

A Survey on Transformer Compression

no code implementations • 5 Feb 2024 • Yehui Tang, Yunhe Wang, Jianyuan Guo, Zhijun Tu, Kai Han, Hailin Hu, DaCheng Tao

Model compression methods reduce the memory and computational cost of Transformer, which is a necessary step to implement large language/vision models on practical devices.

Knowledge Distillation Model Compression +1

Rethinking Optimization and Architecture for Tiny Language Models

1 code implementation • 5 Feb 2024 • Yehui Tang, Fangcheng Liu, Yunsheng Ni, Yuchuan Tian, Zheyuan Bai, Yi-Qi Hu, Sichao Liu, Shangling Jui, Kai Han, Yunhe Wang

Several design formulas are empirically proved especially effective for tiny language models, including tokenizer compression, architecture tweaking, parameter inheritance and multiple-round training.

Language Modelling

FROSTER: Frozen CLIP Is A Strong Teacher for Open-Vocabulary Action Recognition

no code implementations • 5 Feb 2024 • Xiaohu Huang, Hao Zhou, Kun Yao, Kai Han

To address these issues, FROSTER employs a residual feature distillation approach to ensure that CLIP retains its generalization capability while effectively adapting to the action recognition task.

Open Vocabulary Action Recognition

An Empirical Study of Scaling Law for OCR

1 code implementation • 29 Dec 2023 • Miao Rang, Zhenni Bi, Chuanjian Liu, Yunhe Wang, Kai Han

The laws of model size, data volume, computation and model performance have been extensively studied in the field of Natural Language Processing (NLP).

Ranked #1 on Scene Text Recognition on ICDAR2013 (using extra training data)

Optical Character Recognition Optical Character Recognition (OCR) +1

106

PanGu-$\pi$: Enhancing Language Model Architectures via Nonlinearity Compensation

no code implementations • 27 Dec 2023 • Yunhe Wang, Hanting Chen, Yehui Tang, Tianyu Guo, Kai Han, Ying Nie, Xutao Wang, Hailin Hu, Zheyuan Bai, Yun Wang, Fangcheng Liu, Zhicheng Liu, Jianyuan Guo, Sinan Zeng, Yinchen Zhang, Qinghua Xu, Qun Liu, Jun Yao, Chao Xu, DaCheng Tao

We then demonstrate that the proposed approach is significantly effective for enhancing the model nonlinearity through carefully designed ablations; thus, we present a new efficient model architecture for establishing modern, namely, PanGu-$\pi$.

Language Modelling

LightCLIP: Learning Multi-Level Interaction for Lightweight Vision-Language Models

no code implementations • 1 Dec 2023 • Ying Nie, Wei He, Kai Han, Yehui Tang, Tianyu Guo, Fanyi Du, Yunhe Wang

Moreover, based on the observation that the accuracy of CLIP model does not increase correspondingly as the parameters of text encoder increase, an extra objective of masked language modeling (MLM) is leveraged for maximizing the potential of the shortened text encoder.

Image Classification Language Modelling +3

Charting New Territories: Exploring the Geographic and Geospatial Capabilities of Multimodal LLMs

1 code implementation • 24 Nov 2023 • Jonathan Roberts, Timo Lüddecke, Rehan Sheikh, Kai Han, Samuel Albanie

Multimodal large language models (MLLMs) have shown remarkable capabilities across a broad range of tasks but their knowledge and abilities in the geographic and geospatial domains are yet to be explored, despite potential wide-ranging benefits to navigation, environmental research, urban development, and disaster response.

Disaster Response

One-for-All: Bridge the Gap Between Heterogeneous Architectures in Knowledge Distillation

1 code implementation • NeurIPS 2023 • Zhiwei Hao, Jianyuan Guo, Kai Han, Yehui Tang, Han Hu, Yunhe Wang, Chang Xu

To tackle the challenge in distilling heterogeneous models, we propose a simple yet effective one-for-all KD framework called OFA-KD, which significantly improves the distillation performance between heterogeneous architectures.

Knowledge Distillation

SD4Match: Learning to Prompt Stable Diffusion Model for Semantic Matching

no code implementations • 26 Oct 2023 • Xinghui Li, Jingyi Lu, Kai Han, Victor Prisacariu

In this paper, we address the challenge of matching semantically similar keypoints across image pairs.

Gold-YOLO: Efficient Object Detector via Gather-and-Distribute Mechanism

3 code implementations • NeurIPS 2023 • Chengcheng Wang, Wei He, Ying Nie, Jianyuan Guo, Chuanjian Liu, Kai Han, Yunhe Wang

In the past years, YOLO-series models have emerged as the leading approaches in the area of real-time object detection.

object-detection Real-Time Object Detection

1,111

Boosting Semantic Segmentation from the Perspective of Explicit Class Embeddings

no code implementations • ICCV 2023 • Yuhe Liu, Chuanjian Liu, Kai Han, Quan Tang, Zengchang Qin

Following this observation, we propose ECENet, a new segmentation paradigm, in which class embeddings are obtained and enhanced explicitly during interacting with multi-stage image features.

Segmentation Semantic Segmentation

Practical Parallel Algorithms for Non-Monotone Submodular Maximization

no code implementations • 21 Aug 2023 • Shuang Cui, Kai Han, Jing Tang, He Huang, Xueying Li, Aakas Zhiyuli, Hanxiao Li

Submodular maximization has found extensive applications in various domains within the field of artificial intelligence, including but not limited to machine learning, computer vision, and natural language processing.

Guide3D: Create 3D Avatars from Text and Image Guidance

no code implementations • 18 Aug 2023 • Yukang Cao, Yan-Pei Cao, Kai Han, Ying Shan, Kwan-Yee K. Wong

To this end, we introduce Guide3D, a zero-shot text-and-image-guided generative model for 3D avatar generation based on diffusion models.

3D Generation Text to 3D +1

Category Feature Transformer for Semantic Segmentation

1 code implementation • 10 Aug 2023 • Quan Tang, Chuanjian Liu, Fagui Liu, Yifan Liu, Jun Jiang, BoWen Zhang, Kai Han, Yunhe Wang

Aggregation of multi-stage features has been revealed to play a significant role in semantic segmentation.

Segmentation Semantic Segmentation

ParameterNet: Parameters Are All You Need

no code implementations • 26 Jun 2023 • Kai Han, Yunhe Wang, Jianyuan Guo, Enhua Wu

In the language domain, LLaMA-1B enhanced with ParameterNet achieves 2% higher accuracy over vanilla LLaMA.

GPT4Image: Can Large Pre-trained Models Help Vision Models on Perception Tasks?

1 code implementation • 1 Jun 2023 • Ning Ding, Yehui Tang, Zhongqian Fu, Chao Xu, Kai Han, Yunhe Wang

We present a new learning paradigm in which the knowledge extracted from large pre-trained models is utilized to help models like CNN and ViT learn enhanced representations and achieve better performance.

Descriptive Image Classification

1,111

ViCo: Plug-and-play Visual Condition for Personalized Text-to-image Generation

1 code implementation • 1 Jun 2023 • Shaozhe Hao, Kai Han, Shihao Zhao, Kwan-Yee K. Wong

Personalized text-to-image generation using diffusion models has recently emerged and garnered significant interest.

Text-to-Image Generation

231

GPT4GEO: How a Language Model Sees the World's Geography

no code implementations • 30 May 2023 • Jonathan Roberts, Timo Lüddecke, Sowmen Das, Kai Han, Samuel Albanie

Large language models (LLMs) have shown remarkable capabilities across a broad range of tasks involving question answering and the generation of coherent text and code.

Disaster Response Language Modelling +2

VanillaKD: Revisit the Power of Vanilla Knowledge Distillation from Small Scale to Large Scale

1 code implementation • 25 May 2023 • Zhiwei Hao, Jianyuan Guo, Kai Han, Han Hu, Chang Xu, Yunhe Wang

The tremendous success of large models trained on extensive datasets demonstrates that scale is a key ingredient in achieving superior results.

Data Augmentation Knowledge Distillation

Learning Semi-supervised Gaussian Mixture Models for Generalized Category Discovery

1 code implementation • ICCV 2023 • Bingchen Zhao, Xin Wen, Kai Han

In this paper, we address the problem of generalized category discovery (GCD), i.e., given a set of images where part of them are labelled and the rest are not, the task is to automatically cluster the images in the unlabelled data, leveraging the information from the labelled data, while the unlabelled data contain images from the labelled classes and also new ones.

Contrastive Learning Image Classification +2

SimSC: A Simple Framework for Semantic Correspondence with Temperature Learning

no code implementations • 3 May 2023 • Xinghui Li, Kai Han, Xingchen Wan, Victor Adrian Prisacariu

This module is trained together with the backbone and the temperature is updated online.

Semantic correspondence

SATIN: A Multi-Task Metadataset for Classifying Satellite Imagery using Vision-Language Models

no code implementations • 23 Apr 2023 • Jonathan Roberts, Kai Han, Samuel Albanie

In this work, we introduce SATellite ImageNet (SATIN), a metadataset curated from 27 existing remotely sensed datasets, and comprehensively evaluate the zero-shot transfer classification capabilities of a broad range of vision-language (VL) models on SATIN.

Classification Image Classification

CiPR: An Efficient Framework with Cross-instance Positive Relations for Generalized Category Discovery

1 code implementation • 14 Apr 2023 • Shaozhe Hao, Kai Han, Kwan-Yee K. Wong

GCD considers the open-world problem of automatically clustering a partially labelled dataset, in which the unlabelled data may contain instances from both novel categories and labelled classes.

Clustering Contrastive Learning +1

What's in a Name? Beyond Class Indices for Image Recognition

no code implementations • 5 Apr 2023 • Kai Han, Yandong Li, Sagar Vaze, Jie Li, Xuhui Jia

In this paper, we reconsider the recognition problem and task a vision-language model to assign class names to images given only a large and essentially unconstrained vocabulary of categories as prior information.

Language Modelling Object Recognition

DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion Models

1 code implementation • 3 Apr 2023 • Yukang Cao, Yan-Pei Cao, Kai Han, Ying Shan, Kwan-Yee K. Wong

We present DreamAvatar, a text-and-shape guided framework for generating high-quality 3D human avatars with controllable poses.

5,626

Open-Vocabulary Semantic Segmentation with Decoupled One-Pass Network

1 code implementation • ICCV 2023 • Cong Han, Yujie Zhong, Dengjie Li, Kai Han, Lin Ma

Recently, the open-vocabulary semantic segmentation problem has attracted increasing attention and the best performing methods are based on two-stream networks: one stream for proposal mask generation and the other for segment classification using a pretrained visual-language model.

Classification Language Modelling +3

SeSDF: Self-evolved Signed Distance Field for Implicit 3D Clothed Human Reconstruction

no code implementations • CVPR 2023 • Yukang Cao, Kai Han, Kwan-Yee K. Wong

We propose a flexible framework which, by leveraging the parametric SMPL-X model, can take an arbitrary number of input images to reconstruct a clothed human model under an uncalibrated setting.

Learning Attention as Disentangler for Compositional Zero-shot Learning

1 code implementation • CVPR 2023 • Shaozhe Hao, Kai Han, Kwan-Yee K. Wong

The key to CZSL is learning the disentanglement of the attribute-object composition.

Attribute Compositional Zero-Shot Learning +1

Diffusion-Based 3D Human Pose Estimation with Multi-Hypothesis Aggregation

1 code implementation • ICCV 2023 • Wenkang Shan, Zhenhua Liu, Xinfeng Zhang, Zhao Wang, Kai Han, Shanshe Wang, Siwei Ma, Wen Gao

On the other hand, JPMA is proposed to assemble multiple hypotheses generated by D3DP into a single 3D pose for practical use.

Ranked #2 on Multi-Hypotheses 3D Human Pose Estimation on Human3.6M

3D Pose Estimation Monocular 3D Human Pose Estimation +1

128

Masked Image Modeling with Local Multi-Scale Reconstruction

1 code implementation • CVPR 2023 • Haoqing Wang, Yehui Tang, Yunhe Wang, Jianyuan Guo, Zhi-Hong Deng, Kai Han

The lower layers are not explicitly guided and the interaction among their patches is only used for calculating new activations.

Representation Learning

1,111

Network Expansion for Practical Training Acceleration

1 code implementation • CVPR 2023 • Ning Ding, Yehui Tang, Kai Han, Chao Xu, Yunhe Wang

Recently, the sizes of deep neural networks and training datasets both increase drastically to pursue better performance in a practical sense.

1,111

Redistribution of Weights and Activations for AdderNet Quantization

no code implementations • 20 Dec 2022 • Ying Nie, Kai Han, Haikang Diao, Chuanjian Liu, Enhua Wu, Yunhe Wang

To this end, we first thoroughly analyze the difference on distributions of weights and activations in AdderNet and then propose a new quantization algorithm by redistributing the weights and the activations.

Quantization

FastMIM: Expediting Masked Image Modeling Pre-training for Vision

1 code implementation • 13 Dec 2022 • Jianyuan Guo, Kai Han, Han Wu, Yehui Tang, Yunhe Wang, Chang Xu

This paper presents FastMIM, a simple and generic framework for expediting masked image modeling with the following two steps: (i) pre-training vision backbones with low-resolution input images; and (ii) reconstructing Histograms of Oriented Gradients (HOG) feature instead of original RGB values of the input images.

GhostNetV2: Enhance Cheap Operation with Long-Range Attention

15 code implementations • 23 Nov 2022 • Yehui Tang, Kai Han, Jianyuan Guo, Chang Xu, Chao Xu, Yunhe Wang

The convolutional operation can only capture local information in a window region, which prevents performance from being further improved.

29,758

Novel Class Discovery without Forgetting

no code implementations • 21 Jul 2022 • K J Joseph, Sujoy Paul, Gaurav Aggarwal, Soma Biswas, Piyush Rai, Kai Han, Vineeth N Balasubramanian

Inspired by this, we identify and formulate a new, pragmatic problem setting of NCDwF: Novel Class Discovery without Forgetting, which tasks a machine learning model to incrementally discover novel categories of instances from unlabeled data, while maintaining its performance on the previously seen categories.

Novel Class Discovery

Network Amplification With Efficient MACs Allocation

2 code implementations • Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops 2022 • Chuanjian Liu, Kai Han, An Xiao, Ying Nie, Wei Zhang, Yunhe Wang

In particular, the proposed method is used to enlarge models sourced by GhostNet, we achieve state-of-the-art 80.9% and 84.3% ImageNet top-1 accuracies under the setting of 600M and 4.4B MACs, respectively.

Vision GNN: An Image is Worth Graph of Nodes

11 code implementations • 1 Jun 2022 • Kai Han, Yunhe Wang, Jianyuan Guo, Yehui Tang, Enhua Wu

In this paper, we propose to represent the image as a graph structure and introduce a new Vision GNN (ViG) architecture to extract graph-level feature for visual tasks.

Ranked #365 on Image Classification on ImageNet

Image Classification Object Detection

3,802

Spacing Loss for Discovering Novel Categories

1 code implementation • 22 Apr 2022 • K J Joseph, Sujoy Paul, Gaurav Aggarwal, Soma Biswas, Piyush Rai, Kai Han, Vineeth N Balasubramanian

Novel Class Discovery (NCD) is a learning paradigm, where a machine learning model is tasked to semantically group instances from unlabeled data, by utilizing labeled instances from a disjoint set of classes.

Novel Class Discovery

365

JIFF: Jointly-aligned Implicit Face Function for High Quality Single View Clothed Human Reconstruction

no code implementations • CVPR 2022 • Yukang Cao, GuanYing Chen, Kai Han, Wenqi Yang, Kwan-Yee K. Wong

In this paper, we focus on improving the quality of face in the reconstruction and propose a novel Jointly-aligned Implicit Face Function (JIFF) that combines the merits of the implicit function based approach and model based approach.

3D Human Reconstruction Face Model +1

SharpContour: A Contour-based Boundary Refinement Approach for Efficient and Accurate Instance Segmentation

no code implementations • CVPR 2022 • Chenming Zhu, Xuanye Zhang, Yanran Li, Liangdong Qiu, Kai Han, Xiaoguang Han

Contour-based models are efficient and generic to be incorporated with any existing segmentation methods, but they often generate over-smoothed contour and tend to fail on corner areas.

Instance Segmentation Segmentation +1

GhostNets on Heterogeneous Devices via Cheap Operations

8 code implementations • 10 Jan 2022 • Kai Han, Yunhe Wang, Chang Xu, Jianyuan Guo, Chunjing Xu, Enhua Wu, Qi Tian

The proposed C-Ghost module can be taken as a plug-and-play component to upgrade existing convolutional neural networks.

3,801

Generalized Category Discovery

1 code implementation • CVPR 2022 • Sagar Vaze, Kai Han, Andrea Vedaldi, Andrew Zisserman

Here, the unlabelled images may come from labelled classes or from novel ones.

Ranked #1 on Open-World Semi-Supervised Learning on CIFAR-10 (Seen accuracy (50% Labeled) metric)

Fine-Grained Visual Recognition Open-World Semi-Supervised Learning +1

177

PyramidTNT: Improved Transformer-in-Transformer Baselines with Pyramid Architecture

1 code implementation • 4 Jan 2022 • Kai Han, Jianyuan Guo, Yehui Tang, Yunhe Wang

We hope this new baseline will be helpful to the further research and application of vision transformer.

3,801

Instance-Aware Dynamic Neural Network Quantization

4 code implementations • CVPR 2022 • Zhenhua Liu, Yunhe Wang, Kai Han, Siwei Ma, Wen Gao

However, natural images are of huge diversity with abundant content and using such a universal quantization configuration for all samples is not an optimal strategy.

Quantization

1,111

An Image Patch is a Wave: Phase-Aware Vision MLP

10 code implementations • CVPR 2022 • Yehui Tang, Kai Han, Jianyuan Guo, Chang Xu, Yanxi Li, Chao Xu, Yunhe Wang

To dynamically aggregate tokens, we propose to represent each token as a wave function with two parts, amplitude and phase.

Image Classification object-detection +2

3,801

Open-Set Recognition: a Good Closed-Set Classifier is All You Need?

2 code implementations • ICLR 2022 • Sagar Vaze, Kai Han, Andrea Vedaldi, Andrew Zisserman

In this paper, we first demonstrate that the ability of a classifier to make the 'none-of-above' decision is highly correlated with its accuracy on the closed-set classes.

Ranked #10 on Out-of-Distribution Detection on CIFAR-100 vs CIFAR-10

Open Set Learning Out-of-Distribution Detection

247

Symmetry-Enhanced Attention Network for Acute Ischemic Infarct Segmentation with Non-Contrast CT Images

1 code implementation • 11 Oct 2021 • Kongming Liang, Kai Han, Xiuli Li, Xiaoqing Cheng, Yiming Li, Yizhou Wang, Yizhou Yu

In this paper, we propose a symmetry enhanced attention network (SEAN) for acute ischemic infarct segmentation.

Learning Versatile Convolution Filters for Efficient Visual Recognition

no code implementations • 20 Sep 2021 • Kai Han, Yunhe Wang, Chang Xu, Chunjing Xu, Enhua Wu, DaCheng Tao

A series of secondary filters can be derived from a primary filter with the help of binary masks.

Hire-MLP: Vision MLP via Hierarchical Rearrangement

10 code implementations • CVPR 2022 • Jianyuan Guo, Yehui Tang, Kai Han, Xinghao Chen, Han Wu, Chao Xu, Chang Xu, Yunhe Wang

Previous vision MLPs such as MLP-Mixer and ResMLP accept linearly flattened image patches as input, making them inflexible for different input sizes and hard to capture spatial information.

Image Classification object-detection +2

160

Greedy Network Enlarging

1 code implementation • 31 Jul 2021 • Chuanjian Liu, Kai Han, An Xiao, Yiping Deng, Wei Zhang, Chunjing Xu, Yunhe Wang

Recent studies on deep convolutional neural networks present a simple paradigm of architecture design, i.e., models with more MACs typically achieve better accuracy, such as EfficientNet and RegNet.

Real-time Keypoints Detection for Autonomous Recovery of the Unmanned Ground Vehicle

no code implementations • 27 Jul 2021 • Jie Li, Sheng Zhang, Kai Han, Xia Yuan, Chunxia Zhao, Yu Liu

UGV-KPNet is computationally efficient with a small number of parameters and provides pixel-level accurate keypoints detection results in real-time.

Keypoint Detection

CMT: Convolutional Neural Networks Meet Vision Transformers

14 code implementations • CVPR 2022 • Jianyuan Guo, Kai Han, Han Wu, Yehui Tang, Xinghao Chen, Yunhe Wang, Chang Xu

Vision transformers have been successfully applied to image recognition tasks due to their ability to capture long-range dependencies within an image.

556

Novel Visual Category Discovery with Dual Ranking Statistics and Mutual Knowledge Distillation

no code implementations • NeurIPS 2021 • Bingchen Zhao, Kai Han

In this paper, we tackle the problem of novel visual category discovery, i.e., grouping unlabelled images from new classes into different semantic partitions by leveraging a labelled dataset that contains images from other different but relevant categories.

Fine-Grained Visual Recognition Knowledge Distillation

Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation

1 code implementation • 3 Jul 2021 • Zhiwei Hao, Jianyuan Guo, Ding Jia, Kai Han, Yehui Tang, Chao Zhang, Han Hu, Yunhe Wang

Specifically, we train a tiny student model to match a pre-trained teacher model in the patch-level manifold space.

Knowledge Distillation Model Compression +1

Augmented Shortcuts for Vision Transformers

4 code implementations • NeurIPS 2021 • Yehui Tang, Kai Han, Chang Xu, An Xiao, Yiping Deng, Chao Xu, Yunhe Wang

Transformer models have achieved great progress on computer vision tasks recently.

3,801

AutoNovel: Automatically Discovering and Learning Novel Visual Categories

1 code implementation • 29 Jun 2021 • Kai Han, Sylvestre-Alvise Rebuffi, Sébastien Ehrhardt, Andrea Vedaldi, Andrew Zisserman

We present a new approach called AutoNovel to address this problem by combining three ideas: (1) we suggest that the common approach of bootstrapping an image representation using the labelled data only introduces an unwanted bias, and that this can be avoided by using self-supervised learning to train the representation from scratch on the union of labelled and unlabelled data; (2) we use ranking statistics to transfer the model's knowledge of the labelled classes to the problem of clustering the unlabelled images; and, (3) we train the data representation by optimizing a joint objective function on the labelled and unlabelled subsets of the data, improving both the supervised classification of the labelled data, and the clustering of the unlabelled data.

Ranked #1 on Novel Class Discovery on SVHN

Clustering Image Clustering +2

219

Post-Training Quantization for Vision Transformer

no code implementations • NeurIPS 2021 • Zhenhua Liu, Yunhe Wang, Kai Han, Siwei Ma, Wen Gao

Recently, transformer has achieved remarkable performance on a variety of computer vision applications.

Quantization

Positive-Unlabeled Data Purification in the Wild for Object Detection

no code implementations • CVPR 2021 • Jianyuan Guo, Kai Han, Han Wu, Chao Zhang, Xinghao Chen, Chunjing Xu, Chang Xu, Yunhe Wang

In this paper, we present a positive-unlabeled learning based scheme to expand training data by purifying valuable images from massive unlabeled ones, where the original training data are viewed as positive data and the unlabeled images in the wild are unlabeled data.

Knowledge Distillation object-detection +1

ReNAS: Relativistic Evaluation of Neural Architecture Search

7 code implementations • CVPR 2021 • Yixing Xu, Yunhe Wang, Kai Han, Yehui Tang, Shangling Jui, Chunjing Xu, Chang Xu

An effective and efficient architecture performance evaluation scheme is essential for the success of Neural Architecture Search (NAS).

Neural Architecture Search

1,111

Patch Slimming for Efficient Vision Transformers

no code implementations • CVPR 2022 • Yehui Tang, Kai Han, Yunhe Wang, Chang Xu, Jianyuan Guo, Chao Xu, DaCheng Tao

We first identify the effective patches in the last layer and then use them to guide the patch selection process of previous layers.

Ranked #8 on Efficient ViTs on ImageNet-1K (with DeiT-T)

Efficient ViTs

Dynamic Resolution Network

3 code implementations • NeurIPS 2021 • Mingjian Zhu, Kai Han, Enhua Wu, Qiulin Zhang, Ying Nie, Zhenzhong Lan, Yunhe Wang

To this end, we propose a novel dynamic-resolution network (DRNet) in which the input resolution is determined dynamically based on each input sample.

334

Dense Reconstruction of Transparent Objects by Altering Incident Light Paths Through Refraction

no code implementations • 20 May 2021 • Kai Han, Kwan-Yee K. Wong, Miaomiao Liu

We present a simple setup that allows us to alter the incident light paths before light rays enter the object by immersing the object partially in a liquid, and develop a method for recovering the object surface through reconstructing and triangulating such incident light paths.

Object Surface Reconstruction +1

Joint Representation Learning and Novel Category Discovery on Single- and Multi-modal Data

no code implementations • ICCV 2021 • Xuhui Jia, Kai Han, Yukun Zhu, Bradley Green

This paper studies the problem of novel category discovery on single- and multi-modal data with labels from different but relevant categories.

Contrastive Learning Representation Learning

Vision Transformer Pruning

2 code implementations • 17 Apr 2021 • Mingjian Zhu, Yehui Tang, Kai Han

Vision transformer has achieved competitive performance on a variety of computer vision applications.

433

Contrastive Learning based Hybrid Networks for Long-Tailed Image Classification

no code implementations • CVPR 2021 • Peng Wang, Kai Han, Xiu-Shen Wei, Lei Zhang, Lei Wang

Learning discriminative image representations plays a vital role in long-tailed image classification because it can ease the classifier learning in imbalanced cases.

Ranked #10 on Long-tail Learning on CIFAR-10-LT (ρ=10)

Classification Contrastive Learning +4

Distilling Object Detectors via Decoupled Features

1 code implementation • CVPR 2021 • Jianyuan Guo, Kai Han, Yunhe Wang, Han Wu, Xinghao Chen, Chunjing Xu, Chang Xu

To this end, we present a novel distillation algorithm via decoupled features (DeFeat) for learning a better student detector.

Image Classification Knowledge Distillation +3

109

Learning Frequency Domain Approximation for Binary Neural Networks

3 code implementations • NeurIPS 2021 • Yixing Xu, Kai Han, Chang Xu, Yehui Tang, Chunjing Xu, Yunhe Wang

Binary neural networks (BNNs) represent original full-precision weights and activations into 1-bit with sign function.

334

Transformer in Transformer

12 code implementations • NeurIPS 2021 • Kai Han, An Xiao, Enhua Wu, Jianyuan Guo, Chunjing Xu, Yunhe Wang

In this paper, we point out that the attention inside these local patches is also essential for building visual transformers with high performance and we explore a new architecture, namely, Transformer iN Transformer (TNT).

Ranked #9 on Fine-Grained Image Classification on Oxford-IIIT Pet Dataset

Fine-Grained Image Classification Sentence

29,747

AdderNet and its Minimalist Hardware Design for Energy-Efficient Artificial Intelligence

no code implementations • 25 Jan 2021 • Yunhe Wang, Mingqiang Huang, Kai Han, Hanting Chen, Wei Zhang, Chunjing Xu, DaCheng Tao

With a comprehensive comparison on the performance, power consumption, hardware resource consumption and network generalization capability, we conclude the AdderNet is able to surpass all the other competitors including the classical CNN, novel memristor-network, XNOR-Net and the shift-kernel based network, indicating its great potential in future high performance and energy-efficient artificial intelligence applications.

Quantization

Fixed Viewpoint Mirror Surface Reconstruction under an Uncalibrated Camera

1 code implementation • 23 Jan 2021 • Kai Han, Miaomiao Liu, Dirk Schnieders, Kwan-Yee K. Wong

This paper addresses the problem of mirror surface reconstruction, and proposes a solution based on observing the reflections of a moving reference plane on the mirror surface.

Surface Reconstruction

GhostSR: Learning Ghost Features for Efficient Image Super-Resolution

4 code implementations • 21 Jan 2021 • Ying Nie, Kai Han, Zhenhua Liu, Chuanjian Liu, Yunhe Wang

Based on the observation that many features in SISR models are also similar to each other, we propose to use shift operation to generate the redundant features (i.e., ghost features).

Image Super-Resolution

Paper
Code

A Flexible Framework for Discovering Novel Categories with Contrastive Learning

no code implementations • 1 Jan 2021 • Xuhui Jia, Kai Han, Yukun Zhu, Bradley Green

This paper studies the problem of novel category discovery on single- and multi-modal data with labels from different but relevant categories.

Contrastive Learning Representation Learning

Paper
Add Code

A Survey on Visual Transformer

no code implementations • 23 Dec 2020 • Kai Han, Yunhe Wang, Hanting Chen, Xinghao Chen, Jianyuan Guo, Zhenhua Liu, Yehui Tang, An Xiao, Chunjing Xu, Yixing Xu, Zhaohui Yang, Yiman Zhang, Dacheng Tao

Transformer, first applied to the field of natural language processing, is a type of deep neural network mainly based on the self-attention mechanism.

Image Classification Inductive Bias

Paper
Add Code

$\mathbb{X}$Resolution Correspondence Networks

1 code implementation • 17 Dec 2020 • Georgi Tinchev, Shuda Li, Kai Han, David Mitchell, Rigas Kouskouridas

In this paper, we aim at establishing accurate dense correspondences between a pair of images with overlapping field of view under challenging illumination variation, viewpoint changes, and style differences.

Paper
Code

Model Rubik’s Cube: Twisting Resolution, Depth and Width for TinyNets

3 code implementations • NeurIPS 2020 • Kai Han, Yunhe Wang, Qiulin Zhang, Wei Zhang, Chunjing Xu, Tong Zhang

To this end, we summarize a tiny formula for downsizing neural architectures through a series of smaller models derived from the EfficientNet-B0 with the FLOPs constraint.

Image Classification

3,801

Paper
Code

Dynamic Feature Pyramid Networks for Object Detection

1 code implementation • 1 Dec 2020 • Mingjian Zhu, Kai Han, Changbin Yu, Yunhe Wang

One way to enhance the FPN is to enrich the spatial information by expanding the receptive fields, which promises to substantially improve the detection accuracy.

Object object-detection +1

Paper
Code

VEGA: Towards an End-to-End Configurable AutoML Pipeline

1 code implementation • 3 Nov 2020 • Bochao Wang, Hang Xu, Jiajin Zhang, Chen Chen, Xiaozhi Fang, Yixing Xu, Ning Kang, Lanqing Hong, Chenhan Jiang, Xinyue Cai, Jiawei Li, Fengwei Zhou, Yong Li, Zhicheng Liu, Xinghao Chen, Kai Han, Han Shu, Dehua Song, Yunhe Wang, Wei Zhang, Chunjing Xu, Zhenguo Li, Wenzhi Liu, Tong Zhang

Automated Machine Learning (AutoML) is an important industrial solution for the automatic discovery and deployment of machine learning models.

BIG-bench Machine Learning Data Augmentation +3

834

Paper
Code

Model Rubik's Cube: Twisting Resolution, Depth and Width for TinyNets

10 code implementations • 28 Oct 2020 • Kai Han, Yunhe Wang, Qiulin Zhang, Wei Zhang, Chunjing Xu, Tong Zhang

To this end, we summarize a tiny formula for downsizing neural architectures through a series of smaller models derived from the EfficientNet-B0 with the FLOPs constraint.

Ranked #695 on Image Classification on ImageNet

Image Classification Rubik's Cube

29,758

Paper
Code

Deterministic Approximation for Submodular Maximization over a Matroid in Nearly Linear Time

no code implementations • NeurIPS 2020 • Kai Han, Zongmai Cao, Shuang Cui, Benwei Wu

We study the problem of maximizing a non-monotone, non-negative submodular function subject to a matroid constraint.

Paper
Add Code

Training Binary Neural Networks through Learning with Noisy Supervision

1 code implementation • ICML 2020 • Kai Han, Yunhe Wang, Yixing Xu, Chunjing Xu, Enhua Wu, Chang Xu

This paper formalizes the binarization operations over neural networks from a learning perspective.

Binarization

Paper
Code

Searching for Low-Bit Weights in Quantized Neural Networks

1 code implementation • NeurIPS 2020 • Zhaohui Yang, Yunhe Wang, Kai Han, Chunjing Xu, Chao Xu, Dacheng Tao, Chang Xu

Quantized neural networks with low-bit weights and activations are attractive for developing AI accelerators.

Image Classification Quantization +1

Paper
Code

Revisiting Modified Greedy Algorithm for Monotone Submodular Maximization with a Knapsack Constraint

no code implementations • 12 Aug 2020 • Jing Tang, Xueyan Tang, Andrew Lim, Kai Han, Chongshou Li, Junsong Yuan

Second, we enhance the modified greedy algorithm to derive a data-dependent upper bound on the optimum.

Paper
Add Code

Deep Photometric Stereo for Non-Lambertian Surfaces

1 code implementation • 26 Jul 2020 • Guan-Ying Chen, Kai Han, Boxin Shi, Yasuyuki Matsushita, Kwan-Yee K. Wong

To deal with the uncalibrated scenario where light directions are unknown, we introduce a new convolutional network, named LCNet, to estimate light directions from input images.

Paper
Code

LSD-C: Linearly Separable Deep Clusters

1 code implementation • 17 Jun 2020 • Sylvestre-Alvise Rebuffi, Sebastien Ehrhardt, Kai Han, Andrea Vedaldi, Andrew Zisserman

We present LSD-C, a novel method to identify clusters in an unlabeled dataset.

Clustering Data Augmentation +4

Paper
Code

Dual-Resolution Correspondence Networks

1 code implementation • NeurIPS 2020 • Xinghui Li, Kai Han, Shuda Li, Victor Adrian Prisacariu

The fine-resolution feature maps are used to obtain the final dense correspondences guided by the refined coarse 4D correlation tensor.

Paper
Code

Efficient Approximation Algorithms for Adaptive Influence Maximization

2 code implementations • 14 Apr 2020 • Keke Huang, Jing Tang, Kai Han, Xiaokui Xiao, Wei Chen, Aixin Sun, Xueyan Tang, Andrew Lim

In this paper, we propose the first practical algorithm for the adaptive IM problem that could provide the worst-case approximation guarantee of $1-\mathrm{e}^{\rho_b(\varepsilon-1)}$, where $\rho_b=1-(1-1/b)^b$ and $\varepsilon \in (0, 1)$ is a user-specified parameter.

Social and Information Networks

Paper
Code
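
As a worked illustration of the guarantee quoted in the entry above, the following Python sketch evaluates $1-\mathrm{e}^{\rho_b(\varepsilon-1)}$ with $\rho_b=1-(1-1/b)^b$. The function names and the sample values of $b$ and $\varepsilon$ are assumptions chosen only for illustration; they are not taken from the paper.

```python
import math


def rho(b: int) -> float:
    """rho_b = 1 - (1 - 1/b)^b, as defined in the abstract snippet."""
    if b < 1:
        raise ValueError("b must be a positive integer")
    return 1.0 - (1.0 - 1.0 / b) ** b


def approximation_guarantee(b: int, eps: float) -> float:
    """Worst-case guarantee 1 - exp(rho_b * (eps - 1)) for eps in (0, 1)."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    return 1.0 - math.exp(rho(b) * (eps - 1.0))


if __name__ == "__main__":
    # Hypothetical sample values of b, with a fixed eps of 0.1.
    for b in (1, 2, 10):
        print(b, approximation_guarantee(b, eps=0.1))
```

Since $(1-1/b)^b$ tends to $1/\mathrm{e}$ as $b$ grows, $\rho_b$ decreases from 1 toward $1-1/\mathrm{e}$, and for a fixed $b$ a smaller $\varepsilon$ gives a stronger guarantee.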

Anisotropic Convolutional Networks for 3D Semantic Scene Completion

1 code implementation • CVPR 2020 • Jie Li, Kai Han, Peng Wang, Yu Liu, Xia Yuan

In contrast to the standard 3D convolution that is limited to a fixed 3D receptive field, our module is capable of modeling the dimensional anisotropy voxel-wise.

Ranked #4 on 3D Semantic Scene Completion from a single RGB image on NYUv2

3D Semantic Scene Completion from a single RGB image

Paper
Code

Correspondence Networks with Adaptive Neighbourhood Consensus

1 code implementation • CVPR 2020 • Shuda Li, Kai Han, Theo W. Costain, Henry Howard-Jenkins, Victor Prisacariu

This is a challenging task due to large intra-class variations and a lack of dense pixel level annotations.

Ranked #11 on Semantic correspondence on PF-PASCAL

Semantic correspondence

Paper
Code

Learning Inverse Rendering of Faces from Real-world Videos

1 code implementation • 26 Mar 2020 • Yuda Qiu, Zhangyang Xiong, Kai Han, Zhongyuan Wang, Zixiang Xiong, Xiaoguang Han

To alleviate this problem, we propose a weakly supervised training approach to train our model on real face videos, based on the assumption of consistency of albedo and normal across different frames, thus bridging the gap between real and synthetic face images.

Inverse Rendering

Paper
Code

Hit-Detector: Hierarchical Trinity Architecture Search for Object Detection

1 code implementation • CVPR 2020 • Jianyuan Guo, Kai Han, Yunhe Wang, Chao Zhang, Zhaohui Yang, Han Wu, Xinghao Chen, Chang Xu

To this end, we propose a hierarchical trinity search framework to simultaneously discover efficient architectures for all components (i.e., backbone, neck, and head) of an object detector in an end-to-end manner.

Image Classification Neural Architecture Search +3

114

Paper
Code

Automatically Discovering and Learning New Visual Categories with Ranking Statistics

1 code implementation • ICLR 2020 • Kai Han, Sylvestre-Alvise Rebuffi, Sebastien Ehrhardt, Andrea Vedaldi, Andrew Zisserman

In this work we address this problem by combining three ideas: (1) we suggest that the common approach of bootstrapping an image representation using the labeled data only introduces an unwanted bias, and that this can be avoided by using self-supervised learning to train the representation from scratch on the union of labelled and unlabelled data; (2) we use rank statistics to transfer the model's knowledge of the labelled classes to the problem of clustering the unlabelled images; and, (3) we train the data representation by optimizing a joint objective function on the labelled and unlabelled subsets of the data, improving both the supervised classification of the labelled data, and the clustering of the unlabelled data.

Clustering General Classification +1

219

Paper
Code

Widening and Squeezing: Towards Accurate and Efficient QNNs

no code implementations • 3 Feb 2020 • Chuanjian Liu, Kai Han, Yunhe Wang, Hanting Chen, Qi Tian, Chunjing Xu

Quantization neural networks (QNNs) are very attractive to the industry because of their extremely cheap calculation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.

Quantization

Paper
Add Code

GhostNet: More Features from Cheap Operations

34 code implementations • CVPR 2020 • Kai Han, Yunhe Wang, Qi Tian, Jianyuan Guo, Chunjing Xu, Chang Xu

Deploying convolutional neural networks (CNNs) on embedded devices is difficult due to the limited memory and computation resources.

Ranked #867 on Image Classification on ImageNet

Image Classification

29,758

Paper
Code

Beyond Human Parts: Dual Part-Aligned Representations for Person Re-Identification

1 code implementation • ICCV 2019 • Jianyuan Guo, Yuhui Yuan, Lang Huang, Chao Zhang, Jinge Yao, Kai Han

On the other hand, there still exist many useful contextual cues that do not fall into the scope of predefined human parts or attributes.

Ranked #59 on Person Re-Identification on DukeMTMC-reID

Human Parsing Person Re-Identification

Paper
Code

ReNAS: Relativistic Evaluation of Neural Architecture Search

4 code implementations • 30 Sep 2019 • Yixing Xu, Yunhe Wang, Kai Han, Yehui Tang, Shangling Jui, Chunjing Xu, Chang Xu

An effective and efficient architecture performance evaluation scheme is essential for the success of Neural Architecture Search (NAS).

Neural Architecture Search

1,111

Paper
Code

Balanced Binary Neural Networks with Gated Residual

1 code implementation • 26 Sep 2019 • Mingzhu Shen, Xianglong Liu, Ruihao Gong, Kai Han

In this paper, we attempt to maintain the information propagated in the forward process and propose Balanced Binary Neural Networks with Gated Residual (BBG for short).

Ranked #972 on Image Classification on ImageNet

Binarization General Classification +1

757

Paper
Code

Positive-Unlabeled Compression on the Cloud

2 code implementations • NeurIPS 2019 • Yixing Xu, Yunhe Wang, Hanting Chen, Kai Han, Chunjing Xu, Dacheng Tao, Chang Xu

In practice, only a small portion of the original training set is required as positive examples, and more useful training examples can be obtained from the massive unlabeled data on the cloud through a PU classifier with an attention-based multi-scale feature extractor.

Knowledge Distillation

1,111

Paper
Code

Searching for Accurate Binary Neural Architectures

no code implementations • 16 Sep 2019 • Mingzhu Shen, Kai Han, Chunjing Xu, Yunhe Wang

Binary neural networks have attracted tremendous attention because they can be deployed efficiently on mobile devices.

Paper
Add Code

Learning to Discover Novel Visual Categories via Deep Transfer Clustering

1 code implementation • ICCV 2019 • Kai Han, Andrea Vedaldi, Andrew Zisserman

The second contribution is a method to estimate the number of classes in the unlabelled data.

Clustering Transfer Learning

151

Paper
Code

Full-Stack Filters to Build Minimum Viable CNNs

1 code implementation • 6 Aug 2019 • Kai Han, Yunhe Wang, Yixing Xu, Chunjing Xu, Dacheng Tao, Chang Xu

Existing works have typically decreased the number or size of the convolution filters required for a minimum viable CNN on edge devices.

Paper
Code

Attribute Aware Pooling for Pedestrian Attribute Recognition

no code implementations • 27 Jul 2019 • Kai Han, Yunhe Wang, Han Shu, Chuanjian Liu, Chunjing Xu, Chang Xu

This paper expands the strength of deep convolutional neural networks (CNNs) to the pedestrian attribute recognition problem by devising a novel attribute aware pooling algorithm.

Attribute Pedestrian Attribute Recognition

Paper
Add Code

Learning Instance-wise Sparsity for Accelerating Deep Models

no code implementations • 27 Jul 2019 • Chuanjian Liu, Yunhe Wang, Kai Han, Chunjing Xu, Chang Xu

Exploring deep convolutional neural networks of high efficiency and low memory usage is essential for a wide variety of machine learning tasks.

Paper
Add Code

Co-Evolutionary Compression for Unpaired Image Translation

2 code implementations • ICCV 2019 • Han Shu, Yunhe Wang, Xu Jia, Kai Han, Hanting Chen, Chunjing Xu, Qi Tian, Chang Xu

Generative adversarial networks (GANs) have been successfully used for many computer vision tasks, especially image-to-image translation.

Image-to-Image Translation Translation

237

Paper
Code

Learning Transparent Object Matting

1 code implementation • 25 Jul 2019 • Guan-Ying Chen, Kai Han, Kwan-Yee K. Wong

In this paper, we formulate transparent object matting as a refractive flow estimation problem, and propose a deep learning framework, called TOM-Net, for learning the refractive flow.

Image Matting Object +1

Paper
Code

Semi-Supervised Learning with Scarce Annotations

1 code implementation • 21 May 2019 • Sylvestre-Alvise Rebuffi, Sebastien Ehrhardt, Kai Han, Andrea Vedaldi, Andrew Zisserman

The first is a simple but effective one: we leverage the power of transfer learning among different tasks and self-supervision to initialize a good representation of the data without making use of any label.

Multi-class Classification Self-Supervised Learning +1

Paper
Code

Unsupervised Image Matching and Object Discovery as Optimization

1 code implementation • CVPR 2019 • Huy V. Vo, Francis Bach, Minsu Cho, Kai Han, Yann LeCun, Patrick Perez, Jean Ponce

Learning with complete or partial supervision is powerful but relies on ever-growing human annotation efforts.

Ranked #2 on Single-object colocalization on Object Discovery

Object Object Discovery +2

Paper
Code

Self-calibrating Deep Photometric Stereo Networks

1 code implementation • CVPR 2019 • Guan-Ying Chen, Kai Han, Boxin Shi, Yasuyuki Matsushita, Kwan-Yee K. Wong

This paper proposes an uncalibrated photometric stereo method for non-Lambertian scenes based on deep learning.

170

Paper
Code

Attribute-Aware Attention Model for Fine-grained Representation Learning

1 code implementation • 2 Jan 2019 • Kai Han, Jianyuan Guo, Chao Zhang, Mingjian Zhu

Based on the considerations above, we propose a novel Attribute-Aware Attention Model ($A^3M$), which can learn local attribute representation and global category representation simultaneously in an end-to-end manner.

Ranked #4 on Fine-Grained Image Classification on CompCars

Attribute Fine-Grained Image Classification +4

156

Paper
Code

Greedy Hash: Towards Fast Optimization for Accurate Hash Coding in CNN

2 code implementations • NeurIPS 2018 • Shupeng Su, Chao Zhang, Kai Han, Yonghong Tian

To convert the input into binary code, hashing algorithms have been widely used for approximate nearest neighbor search on large-scale image sets due to their computation and storage efficiency.

Deep Hashing

Paper
Code

PS-FCN: A Flexible Learning Framework for Photometric Stereo

1 code implementation • ECCV 2018 • Guan-Ying Chen, Kai Han, Kwan-Yee K. Wong

This paper addresses the problem of photometric stereo for non-Lambertian surfaces.

Paper
Code

TOM-Net: Learning Transparent Object Matting from a Single Image

1 code implementation • CVPR 2018 • Guan-Ying Chen, Kai Han, Kwan-Yee K. Wong

In this paper, we first formulate transparent object matting as a refractive flow estimation problem.

Image Matting Object +1

Paper
Code

AutoEncoder Inspired Unsupervised Feature Selection

1 code implementation • 23 Oct 2017 • Kai Han, Yunhe Wang, Chao Zhang, Chao Li, Chao Xu

High-dimensional data in many areas, such as computer vision and machine learning tasks, brings computational and analytical difficulty.

BIG-bench Machine Learning feature selection

Paper
Code

SCNet: Learning Semantic Correspondence

1 code implementation • ICCV 2017 • Kai Han, Rafael S. Rezende, Bumsub Ham, Kwan-Yee K. Wong, Minsu Cho, Cordelia Schmid, Jean Ponce

This paper addresses the problem of establishing semantic correspondences between images depicting different instances of the same object or scene category.

Semantic correspondence

Paper
Code

Mirror Surface Reconstruction Under an Uncalibrated Camera

no code implementations • CVPR 2016 • Kai Han, Kwan-Yee K. Wong, Dirk Schnieders, Miaomiao Liu

Unlike previous approaches which require tedious work to calibrate the camera, our method can recover both the camera intrinsics and extrinsics together with the mirror surface from reflections of the reference plane under at least three unknown distinct poses.

Surface Reconstruction

Paper
Add Code

A Fixed Viewpoint Approach for Dense Reconstruction of Transparent Objects

no code implementations • CVPR 2015 • Kai Han, Kwan-Yee K. Wong, Miaomiao Liu

In this paper, we develop a fixed viewpoint approach for dense surface reconstruction of transparent objects based on refraction of light.

Object Surface Reconstruction +1

Paper
Add Code
