no code implementations • 17 Apr 2024 • Zhenhua Liu, Zhiwei Hao, Kai Han, Yehui Tang, Yunhe Wang
In this paper, by systematically investigating the impact of different training ingredients, we introduce a strong training strategy for compact models.
no code implementations • 20 Mar 2024 • Hongjun Wang, Sagar Vaze, Kai Han
We thoroughly evaluate our SPTNet on standard benchmarks and demonstrate that our method outperforms existing GCD methods.
1 code implementation • 27 Feb 2024 • Chengcheng Wang, Zhiwei Hao, Yehui Tang, Jianyuan Guo, Yujie Yang, Kai Han, Yunhe Wang
In this paper, we propose the SAM-DiffSR model, which can utilize the fine-grained structure information from SAM in the process of sampling noise to improve the image quality without additional computational cost during inference.
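The mechanism described — steering diffusion sampling noise with segmentation structure — can be sketched minimally as region-wise mean-shifted Gaussian noise. The mask, region ids, and offsets below are illustrative stand-ins, not SAM-DiffSR's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 8x8 segmentation mask with two regions, standing in for a
# fine-grained SAM mask; the per-region offsets are illustrative values.
mask = np.zeros((8, 8), dtype=int)
mask[:, 4:] = 1
region_offset = {0: -0.5, 1: 0.5}

def structure_modulated_noise(mask, region_offset, rng):
    """Sample Gaussian noise whose mean is shifted per segmentation region,
    injecting structure information into the noise at no extra inference cost."""
    noise = rng.standard_normal(mask.shape)
    for region, mu in region_offset.items():
        noise[mask == region] += mu
    return noise

noise = structure_modulated_noise(mask, region_offset, rng)
```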
1 code implementation • 26 Feb 2024 • Wei He, Kai Han, Yehui Tang, Chengcheng Wang, Yujie Yang, Tianyu Guo, Yunhe Wang
Large language models (LLMs) face a daunting challenge due to the excessive computational and memory requirements of the commonly used Transformer architecture.
no code implementations • 9 Feb 2024 • Shaojie Tang, Shuzhang Cai, Jing Yuan, Kai Han
In the rapidly evolving landscape of retail, assortment planning plays a crucial role in determining the success of a business.
1 code implementation • 7 Feb 2024 • Jianyuan Guo, Zhiwei Hao, Chengcheng Wang, Yehui Tang, Han Wu, Han Hu, Kai Han, Chang Xu
Training general-purpose vision models on purely sequential visual data, eschewing linguistic inputs, has heralded a new frontier in visual understanding.
1 code implementation • 6 Feb 2024 • Jianyuan Guo, Hanting Chen, Chengcheng Wang, Kai Han, Chang Xu, Yunhe Wang
Recent advancements in large language models have sparked interest in their extraordinary and near-superhuman capabilities, leading researchers to explore methods for evaluating and optimizing these abilities, which is called superalignment.
no code implementations • 5 Feb 2024 • Yehui Tang, Yunhe Wang, Jianyuan Guo, Zhijun Tu, Kai Han, Hailin Hu, DaCheng Tao
Model compression methods reduce the memory and computational cost of Transformer, which is a necessary step to implement large language/vision models on practical devices.
1 code implementation • 5 Feb 2024 • Yehui Tang, Fangcheng Liu, Yunsheng Ni, Yuchuan Tian, Zheyuan Bai, Yi-Qi Hu, Sichao Liu, Shangling Jui, Kai Han, Yunhe Wang
Several design formulas are empirically shown to be especially effective for tiny language models, including tokenizer compression, architecture tweaking, parameter inheritance, and multiple-round training.
no code implementations • 5 Feb 2024 • Xiaohu Huang, Hao Zhou, Kun Yao, Kai Han
To address these issues, FROSTER employs a residual feature distillation approach to ensure that CLIP retains its generalization capability while effectively adapting to the action recognition task.
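FROSTER's residual feature distillation operates on CLIP video features; a toy numpy sketch of the idea — keep an identity shortcut from the frozen feature while penalizing drift from it — is below. The names, the blend weight `alpha`, and the mean-squared loss form are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 16  # feature dimension (illustrative)

f_frozen = rng.standard_normal(d)  # frozen CLIP feature (generalization anchor)
f_tuned = rng.standard_normal(d)   # feature after action-recognition tuning

def residual_distill(f_tuned, f_frozen, alpha=0.1):
    """Blend the tuned feature with a scaled shortcut from the frozen feature,
    then measure how far the result drifts from frozen CLIP: the shortcut
    preserves generalization, the loss constrains the adaptation."""
    f_student = f_tuned + alpha * f_frozen      # residual combination
    diff = f_student - f_frozen
    loss = float(diff @ diff) / d               # mean squared distillation loss
    return f_student, loss

f_student, loss = residual_distill(f_tuned, f_frozen)
```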
1 code implementation • 29 Dec 2023 • Miao Rang, Zhenni Bi, Chuanjian Liu, Yunhe Wang, Kai Han
The laws of model size, data volume, computation and model performance have been extensively studied in the field of Natural Language Processing (NLP).
Ranked #1 on Scene Text Recognition on ICDAR2013 (using extra training data)
Optical Character Recognition (OCR)
no code implementations • 27 Dec 2023 • Yunhe Wang, Hanting Chen, Yehui Tang, Tianyu Guo, Kai Han, Ying Nie, Xutao Wang, Hailin Hu, Zheyuan Bai, Yun Wang, Fangcheng Liu, Zhicheng Liu, Jianyuan Guo, Sinan Zeng, Yinchen Zhang, Qinghua Xu, Qun Liu, Jun Yao, Chao Xu, DaCheng Tao
We then demonstrate through carefully designed ablations that the proposed approach is significantly effective for enhancing model nonlinearity; thus, we present a new efficient model architecture, namely PanGu-$\pi$.
no code implementations • 1 Dec 2023 • Ying Nie, Wei He, Kai Han, Yehui Tang, Tianyu Guo, Fanyi Du, Yunhe Wang
Moreover, based on the observation that the accuracy of the CLIP model does not increase correspondingly as the parameters of the text encoder increase, an extra objective of masked language modeling (MLM) is leveraged to maximize the potential of the shortened text encoder.
1 code implementation • 24 Nov 2023 • Jonathan Roberts, Timo Lüddecke, Rehan Sheikh, Kai Han, Samuel Albanie
Multimodal large language models (MLLMs) have shown remarkable capabilities across a broad range of tasks but their knowledge and abilities in the geographic and geospatial domains are yet to be explored, despite potential wide-ranging benefits to navigation, environmental research, urban development, and disaster response.
1 code implementation • NeurIPS 2023 • Zhiwei Hao, Jianyuan Guo, Kai Han, Yehui Tang, Han Hu, Yunhe Wang, Chang Xu
To tackle the challenge in distilling heterogeneous models, we propose a simple yet effective one-for-all KD framework called OFA-KD, which significantly improves the distillation performance between heterogeneous architectures.
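The OFA-KD specifics aren't given in this excerpt, but the standard workaround for heterogeneous teacher/student pairs — distilling in the shared logit space with a temperature-softened KL divergence rather than matching architecture-specific features — can be sketched as follows (the temperature and logits are illustrative):

```python
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()           # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def kd_loss(student_logits, teacher_logits, T=4.0):
    """KL(teacher || student) on temperature-softened class probabilities.
    Working in the logit space sidesteps mismatched feature shapes, the
    usual obstacle when distilling between heterogeneous architectures."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(np.sum(p * (np.log(p) - np.log(q)))) * T * T

teacher = [2.0, 0.5, -1.0]
```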
no code implementations • 26 Oct 2023 • Xinghui Li, Jingyi Lu, Kai Han, Victor Prisacariu
In this paper, we address the challenge of matching semantically similar keypoints across image pairs.
3 code implementations • NeurIPS 2023 • Chengcheng Wang, Wei He, Ying Nie, Jianyuan Guo, Chuanjian Liu, Kai Han, Yunhe Wang
In the past years, YOLO-series models have emerged as the leading approaches in the area of real-time object detection.
no code implementations • ICCV 2023 • Yuhe Liu, Chuanjian Liu, Kai Han, Quan Tang, Zengchang Qin
Following this observation, we propose ECENet, a new segmentation paradigm, in which class embeddings are obtained and enhanced explicitly during interacting with multi-stage image features.
no code implementations • 21 Aug 2023 • Shuang Cui, Kai Han, Jing Tang, He Huang, Xueying Li, Aakas Zhiyuli, Hanxiao Li
Submodular maximization has found extensive applications in various domains within the field of artificial intelligence, including but not limited to machine learning, computer vision, and natural language processing.
no code implementations • 18 Aug 2023 • Yukang Cao, Yan-Pei Cao, Kai Han, Ying Shan, Kwan-Yee K. Wong
To this end, we introduce Guide3D, a zero-shot text-and-image-guided generative model for 3D avatar generation based on diffusion models.
1 code implementation • 10 Aug 2023 • Quan Tang, Chuanjian Liu, Fagui Liu, Yifan Liu, Jun Jiang, BoWen Zhang, Kai Han, Yunhe Wang
Aggregation of multi-stage features has been revealed to play a significant role in semantic segmentation.
no code implementations • 26 Jun 2023 • Kai Han, Yunhe Wang, Jianyuan Guo, Enhua Wu
In the language domain, LLaMA-1B enhanced with ParameterNet achieves 2\% higher accuracy over vanilla LLaMA.
1 code implementation • 1 Jun 2023 • Ning Ding, Yehui Tang, Zhongqian Fu, Chao Xu, Kai Han, Yunhe Wang
We present a new learning paradigm in which the knowledge extracted from large pre-trained models is utilized to help models like CNN and ViT learn enhanced representations and achieve better performance.
1 code implementation • 1 Jun 2023 • Shaozhe Hao, Kai Han, Shihao Zhao, Kwan-Yee K. Wong
Personalized text-to-image generation using diffusion models has recently emerged and garnered significant interest.
no code implementations • 30 May 2023 • Jonathan Roberts, Timo Lüddecke, Sowmen Das, Kai Han, Samuel Albanie
Large language models (LLMs) have shown remarkable capabilities across a broad range of tasks involving question answering and the generation of coherent text and code.
1 code implementation • 25 May 2023 • Zhiwei Hao, Jianyuan Guo, Kai Han, Han Hu, Chang Xu, Yunhe Wang
The tremendous success of large models trained on extensive datasets demonstrates that scale is a key ingredient in achieving superior results.
1 code implementation • ICCV 2023 • Bingchen Zhao, Xin Wen, Kai Han
In this paper, we address the problem of generalized category discovery (GCD): given a set of images of which only some are labelled, the task is to automatically cluster the unlabelled images by leveraging information from the labelled data, where the unlabelled data may contain images from both the labelled classes and new ones.
no code implementations • 3 May 2023 • Xinghui Li, Kai Han, Xingchen Wan, Victor Adrian Prisacariu
This module is trained together with the backbone and the temperature is updated online.
no code implementations • 23 Apr 2023 • Jonathan Roberts, Kai Han, Samuel Albanie
In this work, we introduce SATellite ImageNet (SATIN), a metadataset curated from 27 existing remotely sensed datasets, and comprehensively evaluate the zero-shot transfer classification capabilities of a broad range of vision-language (VL) models on SATIN.
1 code implementation • 14 Apr 2023 • Shaozhe Hao, Kai Han, Kwan-Yee K. Wong
GCD considers the open-world problem of automatically clustering a partially labelled dataset, in which the unlabelled data may contain instances from both novel categories and labelled classes.
no code implementations • 5 Apr 2023 • Kai Han, Yandong Li, Sagar Vaze, Jie Li, Xuhui Jia
In this paper, we reconsider the recognition problem and task a vision-language model to assign class names to images given only a large and essentially unconstrained vocabulary of categories as prior information.
1 code implementation • 3 Apr 2023 • Yukang Cao, Yan-Pei Cao, Kai Han, Ying Shan, Kwan-Yee K. Wong
We present DreamAvatar, a text-and-shape guided framework for generating high-quality 3D human avatars with controllable poses.
1 code implementation • ICCV 2023 • Cong Han, Yujie Zhong, Dengjie Li, Kai Han, Lin Ma
Recently, the open-vocabulary semantic segmentation problem has attracted increasing attention, and the best-performing methods are based on two-stream networks: one stream for proposal mask generation and the other for segment classification using a pretrained visual-language model.
no code implementations • CVPR 2023 • Yukang Cao, Kai Han, Kwan-Yee K. Wong
We propose a flexible framework which, by leveraging the parametric SMPL-X model, can take an arbitrary number of input images to reconstruct a clothed human model under an uncalibrated setting.
1 code implementation • CVPR 2023 • Shaozhe Hao, Kai Han, Kwan-Yee K. Wong
The key to CZSL is learning the disentanglement of the attribute-object composition.
1 code implementation • ICCV 2023 • Wenkang Shan, Zhenhua Liu, Xinfeng Zhang, Zhao Wang, Kai Han, Shanshe Wang, Siwei Ma, Wen Gao
On the other hand, JPMA is proposed to assemble multiple hypotheses generated by D3DP into a single 3D pose for practical use.
1 code implementation • CVPR 2023 • Haoqing Wang, Yehui Tang, Yunhe Wang, Jianyuan Guo, Zhi-Hong Deng, Kai Han
The lower layers are not explicitly guided and the interaction among their patches is only used for calculating new activations.
1 code implementation • CVPR 2023 • Ning Ding, Yehui Tang, Kai Han, Chao Xu, Yunhe Wang
Recently, the sizes of deep neural networks and training datasets both increase drastically to pursue better performance in a practical sense.
no code implementations • 20 Dec 2022 • Ying Nie, Kai Han, Haikang Diao, Chuanjian Liu, Enhua Wu, Yunhe Wang
To this end, we first thoroughly analyze the difference on distributions of weights and activations in AdderNet and then propose a new quantization algorithm by redistributing the weights and the activations.
1 code implementation • 13 Dec 2022 • Jianyuan Guo, Kai Han, Han Wu, Yehui Tang, Yunhe Wang, Chang Xu
This paper presents FastMIM, a simple and generic framework for expediting masked image modeling with the following two steps: (i) pre-training vision backbones with low-resolution input images; and (ii) reconstructing Histograms of Oriented Gradients (HOG) feature instead of original RGB values of the input images.
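Step (ii) above regresses HOG features rather than raw pixels; a minimal numpy sketch of a HOG-style target (magnitude-weighted unsigned orientation histogram, L1-normalized) is below. FastMIM's exact HOG configuration (cell layout, bin count, normalization) may differ:

```python
import numpy as np

def hog_target(img, n_bins=9):
    """Orientation histogram of gradients, magnitude-weighted and
    L1-normalized -- a HOG-style regression target for masked modeling."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)                # unsigned orientation
    bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)
    total = hist.sum()
    return hist / total if total > 0 else hist

# A horizontal intensity ramp: all gradient energy falls in the 0-degree bin.
patch = np.tile(np.arange(8.0), (8, 1))
target = hog_target(patch)
```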
15 code implementations • 23 Nov 2022 • Yehui Tang, Kai Han, Jianyuan Guo, Chang Xu, Chao Xu, Yunhe Wang
The convolutional operation can only capture local information in a window region, which prevents performance from being further improved.
no code implementations • 21 Jul 2022 • K J Joseph, Sujoy Paul, Gaurav Aggarwal, Soma Biswas, Piyush Rai, Kai Han, Vineeth N Balasubramanian
Inspired by this, we identify and formulate a new, pragmatic problem setting of NCDwF: Novel Class Discovery without Forgetting, which tasks a machine learning model to incrementally discover novel categories of instances from unlabeled data, while maintaining its performance on the previously seen categories.
2 code implementations • Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops 2022 • Chuanjian Liu, Kai Han, An Xiao, Ying Nie, Wei Zhang, Yunhe Wang
In particular, when the proposed method is used to enlarge models sourced from GhostNet, we achieve state-of-the-art 80.9% and 84.3% ImageNet top-1 accuracy under the settings of 600M and 4.4B MACs, respectively.
11 code implementations • 1 Jun 2022 • Kai Han, Yunhe Wang, Jianyuan Guo, Yehui Tang, Enhua Wu
In this paper, we propose to represent the image as a graph structure and introduce a new Vision GNN (ViG) architecture to extract graph-level features for visual tasks.
Ranked #365 on Image Classification on ImageNet
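Building the graph over image patches can be sketched as a kNN graph on patch features followed by a simple max-relative aggregation (one common graph-conv variant); the sizes and the aggregation choice below are illustrative, not ViG's exact configuration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 16, 8, 4  # patches (graph nodes), feature dim, neighbours

x = rng.standard_normal((n, d))  # patch features become graph nodes

# kNN graph over patch features: each node connects to its k nearest patches.
dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
np.fill_diagonal(dist, np.inf)                 # exclude self-loops
neighbours = np.argsort(dist, axis=1)[:, :k]   # (n, k) edge list

# Max-relative aggregation: each node gathers the largest feature offset
# among its neighbours, then concatenates it with its own feature.
agg = np.max(x[neighbours] - x[:, None, :], axis=1)
x_new = np.concatenate([x, agg], axis=1)       # (n, 2d) updated node features
```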
1 code implementation • 22 Apr 2022 • K J Joseph, Sujoy Paul, Gaurav Aggarwal, Soma Biswas, Piyush Rai, Kai Han, Vineeth N Balasubramanian
Novel Class Discovery (NCD) is a learning paradigm, where a machine learning model is tasked to semantically group instances from unlabeled data, by utilizing labeled instances from a disjoint set of classes.
no code implementations • CVPR 2022 • Yukang Cao, GuanYing Chen, Kai Han, Wenqi Yang, Kwan-Yee K. Wong
In this paper, we focus on improving the quality of the face in the reconstruction and propose a novel Jointly-aligned Implicit Face Function (JIFF) that combines the merits of the implicit function based approach and the model based approach.
no code implementations • CVPR 2022 • Chenming Zhu, Xuanye Zhang, Yanran Li, Liangdong Qiu, Kai Han, Xiaoguang Han
Contour-based models are efficient and generic enough to be incorporated with any existing segmentation method, but they often generate over-smoothed contours and tend to fail in corner areas.
8 code implementations • 10 Jan 2022 • Kai Han, Yunhe Wang, Chang Xu, Jianyuan Guo, Chunjing Xu, Enhua Wu, Qi Tian
The proposed C-Ghost module can be taken as a plug-and-play component to upgrade existing convolutional neural networks.
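The Ghost idea — generate a few "intrinsic" features with an expensive primary transform, then clone them into "ghost" features with cheap per-channel operations — can be sketched with dense stand-ins for the convolutions. The shapes and the elementwise cheap op below are illustrative, not the C-Ghost module's actual depthwise convolution:

```python
import numpy as np

rng = np.random.default_rng(0)

def ghost_features(x, w_primary, cheap_scale):
    """Ghost-style feature generation: an expensive primary transform makes
    intrinsic features; a cheap per-feature operation (stand-in for a
    depthwise conv) clones them into ghost features; concatenate both."""
    intrinsic = x @ w_primary            # expensive primary features
    ghost = intrinsic * cheap_scale      # cheap elementwise operation
    return np.concatenate([intrinsic, ghost], axis=-1)

x = rng.standard_normal((5, 8))
w = rng.standard_normal((8, 4))
scale = rng.standard_normal(4)
out = ghost_features(x, w, scale)        # (5, 8): 4 intrinsic + 4 ghost
```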
1 code implementation • CVPR 2022 • Sagar Vaze, Kai Han, Andrea Vedaldi, Andrew Zisserman
Here, the unlabelled images may come from labelled classes or from novel ones.
Ranked #1 on Open-World Semi-Supervised Learning on CIFAR-10 (Seen accuracy (50% Labeled) metric)
Fine-Grained Visual Recognition, Open-World Semi-Supervised Learning
1 code implementation • 4 Jan 2022 • Kai Han, Jianyuan Guo, Yehui Tang, Yunhe Wang
We hope this new baseline will be helpful to the further research and application of vision transformer.
4 code implementations • CVPR 2022 • Zhenhua Liu, Yunhe Wang, Kai Han, Siwei Ma, Wen Gao
However, natural images are of huge diversity with abundant content and using such a universal quantization configuration for all samples is not an optimal strategy.
10 code implementations • CVPR 2022 • Yehui Tang, Kai Han, Jianyuan Guo, Chang Xu, Yanxi Li, Chao Xu, Yunhe Wang
To dynamically aggregate tokens, we propose to represent each token as a wave function with two parts, amplitude and phase.
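The wave representation can be sketched directly: each token becomes a complex number `A * exp(i * theta)`, so summing tokens lets them interfere constructively or destructively depending on their relative phases. In the paper the phase is learned; here it is random, and the sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 8  # tokens and channels (illustrative sizes)

amplitude = np.abs(rng.standard_normal((n, d)))  # real-valued token content
phase = rng.uniform(0.0, 2.0 * np.pi, (n, d))    # learned in the paper; random here

# Each token as a wave z_j = A_j * exp(i * theta_j); aggregation is a
# complex sum, so phases control how tokens reinforce or cancel each other.
waves = amplitude * np.exp(1j * phase)
mixed = np.abs(waves.sum(axis=0))                # back to a real-valued feature
```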
2 code implementations • ICLR 2022 • Sagar Vaze, Kai Han, Andrea Vedaldi, Andrew Zisserman
In this paper, we first demonstrate that the ability of a classifier to make the 'none-of-above' decision is highly correlated with its accuracy on the closed-set classes.
Ranked #10 on Out-of-Distribution Detection on CIFAR-100 vs CIFAR-10
1 code implementation • 11 Oct 2021 • Kongming Liang, Kai Han, Xiuli Li, Xiaoqing Cheng, Yiming Li, Yizhou Wang, Yizhou Yu
In this paper, we propose a symmetry enhanced attention network (SEAN) for acute ischemic infarct segmentation.
no code implementations • 20 Sep 2021 • Kai Han, Yunhe Wang, Chang Xu, Chunjing Xu, Enhua Wu, DaCheng Tao
A series of secondary filters can be derived from a primary filter with the help of binary masks.
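Deriving secondary filters from one primary filter via binary masks can be sketched in a few lines; the filter size, number of masks, and random masks below are illustrative, since in the actual method the masks are learned:

```python
import numpy as np

rng = np.random.default_rng(0)

# One "primary" 3x3 filter; secondary filters are derived from it with
# elementwise binary masks, so only the primary weights and the cheap
# binary masks need to be stored.
primary = rng.standard_normal((3, 3))
masks = (rng.random((4, 3, 3)) > 0.5).astype(primary.dtype)
secondary = masks * primary        # a bank of 4 derived filters
```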
10 code implementations • CVPR 2022 • Jianyuan Guo, Yehui Tang, Kai Han, Xinghao Chen, Han Wu, Chao Xu, Chang Xu, Yunhe Wang
Previous vision MLPs such as MLP-Mixer and ResMLP accept linearly flattened image patches as input, making them inflexible for different input sizes and hard to capture spatial information.
1 code implementation • 31 Jul 2021 • Chuanjian Liu, Kai Han, An Xiao, Yiping Deng, Wei Zhang, Chunjing Xu, Yunhe Wang
Recent studies on deep convolutional neural networks present a simple paradigm of architecture design, i.e., models with more MACs typically achieve better accuracy, such as EfficientNet and RegNet.
no code implementations • 27 Jul 2021 • Jie Li, Sheng Zhang, Kai Han, Xia Yuan, Chunxia Zhao, Yu Liu
UGV-KPNet is computationally efficient with a small number of parameters and provides pixel-level accurate keypoints detection results in real-time.
14 code implementations • CVPR 2022 • Jianyuan Guo, Kai Han, Han Wu, Yehui Tang, Xinghao Chen, Yunhe Wang, Chang Xu
Vision transformers have been successfully applied to image recognition tasks due to their ability to capture long-range dependencies within an image.
no code implementations • NeurIPS 2021 • Bingchen Zhao, Kai Han
In this paper, we tackle the problem of novel visual category discovery, i.e., grouping unlabelled images from new classes into different semantic partitions by leveraging a labelled dataset that contains images from other different but relevant categories.
1 code implementation • 3 Jul 2021 • Zhiwei Hao, Jianyuan Guo, Ding Jia, Kai Han, Yehui Tang, Chao Zhang, Han Hu, Yunhe Wang
Specifically, we train a tiny student model to match a pre-trained teacher model in the patch-level manifold space.
4 code implementations • NeurIPS 2021 • Yehui Tang, Kai Han, Chang Xu, An Xiao, Yiping Deng, Chao Xu, Yunhe Wang
Transformer models have achieved great progress on computer vision tasks recently.
1 code implementation • 29 Jun 2021 • Kai Han, Sylvestre-Alvise Rebuffi, Sébastien Ehrhardt, Andrea Vedaldi, Andrew Zisserman
We present a new approach called AutoNovel to address this problem by combining three ideas: (1) we suggest that the common approach of bootstrapping an image representation using the labelled data only introduces an unwanted bias, and that this can be avoided by using self-supervised learning to train the representation from scratch on the union of labelled and unlabelled data; (2) we use ranking statistics to transfer the model's knowledge of the labelled classes to the problem of clustering the unlabelled images; and, (3) we train the data representation by optimizing a joint objective function on the labelled and unlabelled subsets of the data, improving both the supervised classification of the labelled data, and the clustering of the unlabelled data.
Ranked #1 on Novel Class Discovery on SVHN
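Idea (2), transfer via ranking statistics, admits a compact sketch: two unlabelled samples get a positive pairwise pseudo-label when the sets of their top-k most activated feature dimensions coincide. The value of k and the toy features below are illustrative:

```python
import numpy as np

def same_class_by_ranking(f1, f2, k=3):
    """Pairwise pseudo-label via ranking statistics: treat two samples as
    the same class when their top-k most activated feature dimensions
    coincide as sets (robust to small magnitude differences)."""
    return set(np.argsort(-f1)[:k]) == set(np.argsort(-f2)[:k])

a = np.array([5.0, 4.0, 3.0, 0.1, 0.2])  # top-3 dims: {0, 1, 2}
b = np.array([4.0, 5.0, 2.5, 0.0, 0.3])  # top-3 dims: {0, 1, 2}
c = np.array([0.1, 0.2, 3.0, 5.0, 4.0])  # top-3 dims: {2, 3, 4}
```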
no code implementations • NeurIPS 2021 • Zhenhua Liu, Yunhe Wang, Kai Han, Siwei Ma, Wen Gao
Recently, transformer has achieved remarkable performance on a variety of computer vision applications.
no code implementations • CVPR 2021 • Jianyuan Guo, Kai Han, Han Wu, Chao Zhang, Xinghao Chen, Chunjing Xu, Chang Xu, Yunhe Wang
In this paper, we present a positive-unlabeled learning based scheme to expand training data by purifying valuable images from massive unlabeled ones, where the original training data are viewed as positive data and the unlabeled images in the wild are unlabeled data.
7 code implementations • CVPR 2021 • Yixing Xu, Yunhe Wang, Kai Han, Yehui Tang, Shangling Jui, Chunjing Xu, Chang Xu
An effective and efficient architecture performance evaluation scheme is essential for the success of Neural Architecture Search (NAS).
no code implementations • CVPR 2022 • Yehui Tang, Kai Han, Yunhe Wang, Chang Xu, Jianyuan Guo, Chao Xu, DaCheng Tao
We first identify the effective patches in the last layer and then use them to guide the patch selection process of previous layers.
Ranked #8 on Efficient ViTs on ImageNet-1K (with DeiT-T)
3 code implementations • NeurIPS 2021 • Mingjian Zhu, Kai Han, Enhua Wu, Qiulin Zhang, Ying Nie, Zhenzhong Lan, Yunhe Wang
To this end, we propose a novel dynamic-resolution network (DRNet) in which the input resolution is determined dynamically based on each input sample.
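Per-sample resolution selection can be sketched as a routing rule driven by a cheap predictor's confidence: easy samples get small inputs, hard ones full resolution. The thresholds and candidate sizes below are illustrative stand-ins for DRNet's learned resolution predictor:

```python
def route_resolution(confidence, resolutions=(96, 128, 224), thresholds=(0.9, 0.6)):
    """Pick an input resolution per sample from a confidence score:
    high confidence -> smallest input, low confidence -> full resolution."""
    if confidence >= thresholds[0]:
        return resolutions[0]
    if confidence >= thresholds[1]:
        return resolutions[1]
    return resolutions[2]
```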
no code implementations • 20 May 2021 • Kai Han, Kwan-Yee K. Wong, Miaomiao Liu
We present a simple setup that allows us to alter the incident light paths before light rays enter the object by immersing the object partially in a liquid, and develop a method for recovering the object surface through reconstructing and triangulating such incident light paths.
no code implementations • ICCV 2021 • Xuhui Jia, Kai Han, Yukun Zhu, Bradley Green
This paper studies the problem of novel category discovery on single- and multi-modal data with labels from different but relevant categories.
2 code implementations • 17 Apr 2021 • Mingjian Zhu, Yehui Tang, Kai Han
Vision transformer has achieved competitive performance on a variety of computer vision applications.
no code implementations • CVPR 2021 • Peng Wang, Kai Han, Xiu-Shen Wei, Lei Zhang, Lei Wang
Learning discriminative image representations plays a vital role in long-tailed image classification because it can ease the classifier learning in imbalanced cases.
Ranked #10 on Long-tail Learning on CIFAR-10-LT (ρ=10)
1 code implementation • CVPR 2021 • Jianyuan Guo, Kai Han, Yunhe Wang, Han Wu, Xinghao Chen, Chunjing Xu, Chang Xu
To this end, we present a novel distillation algorithm via decoupled features (DeFeat) for learning a better student detector.
3 code implementations • NeurIPS 2021 • Yixing Xu, Kai Han, Chang Xu, Yehui Tang, Chunjing Xu, Yunhe Wang
Binary neural networks (BNNs) represent original full-precision weights and activations into 1-bit with sign function.
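The sign-function binarization mentioned above, with the common per-tensor scaling factor (mean absolute value) used in many BNN recipes, looks like this in numpy (the scaling choice is one standard variant, not necessarily this paper's):

```python
import numpy as np

def binarize(w):
    """1-bit quantization: keep only the sign of each weight, plus a single
    per-tensor scale (mean absolute value) to preserve magnitude."""
    alpha = np.abs(w).mean()
    return alpha * np.sign(w), alpha

w = np.array([0.3, -0.7, 1.2, -0.1])
wb, alpha = binarize(w)
```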
12 code implementations • NeurIPS 2021 • Kai Han, An Xiao, Enhua Wu, Jianyuan Guo, Chunjing Xu, Yunhe Wang
In this paper, we point out that the attention inside these local patches is also essential for building visual transformers with high performance, and we explore a new architecture, namely, Transformer iN Transformer (TNT).
no code implementations • 25 Jan 2021 • Yunhe Wang, Mingqiang Huang, Kai Han, Hanting Chen, Wei Zhang, Chunjing Xu, DaCheng Tao
After a comprehensive comparison of performance, power consumption, hardware resource consumption, and network generalization capability, we conclude that AdderNet is able to surpass all the other competitors, including the classical CNN, the novel memristor-network, XNOR-Net, and the shift-kernel based network, indicating its great potential in future high-performance and energy-efficient artificial intelligence applications.
1 code implementation • 23 Jan 2021 • Kai Han, Miaomiao Liu, Dirk Schnieders, Kwan-Yee K. Wong
This paper addresses the problem of mirror surface reconstruction, and proposes a solution based on observing the reflections of a moving reference plane on the mirror surface.
4 code implementations • 21 Jan 2021 • Ying Nie, Kai Han, Zhenhua Liu, Chuanjian Liu, Yunhe Wang
Based on the observation that many features in SISR models are also similar to each other, we propose to use the shift operation to generate the redundant features (i.e., ghost features).
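Generating ghost feature maps by spatial shifts rather than extra convolutions can be sketched with `np.roll`; the shift offsets below are illustrative (in the actual method they are learned):

```python
import numpy as np

def ghost_by_shift(feat, shifts=((0, 1), (1, 0), (0, -1))):
    """Produce redundant 'ghost' feature maps by spatially shifting one
    intrinsic map, avoiding the cost of computing extra convolutions."""
    ghosts = [np.roll(feat, s, axis=(0, 1)) for s in shifts]
    return np.stack([feat] + ghosts)   # intrinsic map + its shifted ghosts

feat = np.arange(16.0).reshape(4, 4)
out = ghost_by_shift(feat)
```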
no code implementations • 23 Dec 2020 • Kai Han, Yunhe Wang, Hanting Chen, Xinghao Chen, Jianyuan Guo, Zhenhua Liu, Yehui Tang, An Xiao, Chunjing Xu, Yixing Xu, Zhaohui Yang, Yiman Zhang, DaCheng Tao
Transformer, first applied to the field of natural language processing, is a type of deep neural network mainly based on the self-attention mechanism.
1 code implementation • 17 Dec 2020 • Georgi Tinchev, Shuda Li, Kai Han, David Mitchell, Rigas Kouskouridas
In this paper, we aim at establishing accurate dense correspondences between a pair of images with overlapping field of view under challenging illumination variation, viewpoint changes, and style differences.
3 code implementations • NeurIPS 2020 • Kai Han, Yunhe Wang, Qiulin Zhang, Wei Zhang, Chunjing Xu, Tong Zhang
To this end, we summarize a tiny formula for downsizing neural architectures through a series of smaller models derived from the EfficientNet-B0 with the FLOPs constraint.
1 code implementation • 1 Dec 2020 • Mingjian Zhu, Kai Han, Changbin Yu, Yunhe Wang
An attempt to enhance the FPN is enriching the spatial information by expanding the receptive fields, which is promising to largely improve the detection accuracy.
1 code implementation • 3 Nov 2020 • Bochao Wang, Hang Xu, Jiajin Zhang, Chen Chen, Xiaozhi Fang, Yixing Xu, Ning Kang, Lanqing Hong, Chenhan Jiang, Xinyue Cai, Jiawei Li, Fengwei Zhou, Yong Li, Zhicheng Liu, Xinghao Chen, Kai Han, Han Shu, Dehua Song, Yunhe Wang, Wei Zhang, Chunjing Xu, Zhenguo Li, Wenzhi Liu, Tong Zhang
Automated Machine Learning (AutoML) is an important industrial solution for automatic discovery and deployment of the machine learning models.
no code implementations • NeurIPS 2020 • Kai Han, Zongmai Cao, Shuang Cui, Benwei Wu
We study the problem of maximizing a non-monotone, non-negative submodular function subject to a matroid constraint.
1 code implementation • ICML 2020 • Kai Han, Yunhe Wang, Yixing Xu, Chunjing Xu, Enhua Wu, Chang Xu
This paper formalizes the binarization operations over neural networks from a learning perspective.
1 code implementation • NeurIPS 2020 • Zhaohui Yang, Yunhe Wang, Kai Han, Chunjing Xu, Chao Xu, DaCheng Tao, Chang Xu
Quantized neural networks with low-bit weights and activations are attractive for developing AI accelerators.
no code implementations • 12 Aug 2020 • Jing Tang, Xueyan Tang, Andrew Lim, Kai Han, Chongshou Li, Junsong Yuan
Second, we enhance the modified greedy algorithm to derive a data-dependent upper bound on the optimum.
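A data-dependent upper bound of this kind can be sketched for a monotone submodular example (set cover) under a cardinality constraint: after greedy finishes, OPT is at most the achieved value plus the sum of the k largest remaining marginal gains. This is the standard bound, a sketch rather than the paper's exact construction:

```python
def coverage(sets, S):
    """Monotone submodular objective: number of elements covered by chosen sets."""
    covered = set()
    for i in S:
        covered |= sets[i]
    return len(covered)

def greedy_with_bound(sets, k):
    """Greedy maximization under |S| <= k, plus the data-dependent bound
    OPT <= f(S) + sum of the k largest remaining marginal gains."""
    S = []
    for _ in range(k):
        gains = {i: coverage(sets, S + [i]) - coverage(sets, S)
                 for i in range(len(sets)) if i not in S}
        S.append(max(gains, key=gains.get))
    remaining = sorted((coverage(sets, S + [i]) - coverage(sets, S)
                        for i in range(len(sets)) if i not in S), reverse=True)
    value = coverage(sets, S)
    return S, value, value + sum(remaining[:k])

sets = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {6, 7}]
S, val, ub = greedy_with_bound(sets, 2)
```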
1 code implementation • 26 Jul 2020 • Guan-Ying Chen, Kai Han, Boxin Shi, Yasuyuki Matsushita, Kwan-Yee K. Wong
To deal with the uncalibrated scenario where light directions are unknown, we introduce a new convolutional network, named LCNet, to estimate light directions from input images.
1 code implementation • 17 Jun 2020 • Sylvestre-Alvise Rebuffi, Sebastien Ehrhardt, Kai Han, Andrea Vedaldi, Andrew Zisserman
We present LSD-C, a novel method to identify clusters in an unlabeled dataset.
1 code implementation • NeurIPS 2020 • Xinghui Li, Kai Han, Shuda Li, Victor Adrian Prisacariu
The fine-resolution feature maps are used to obtain the final dense correspondences guided by the refined coarse 4D correlation tensor.
2 code implementations • 14 Apr 2020 • Keke Huang, Jing Tang, Kai Han, Xiaokui Xiao, Wei Chen, Aixin Sun, Xueyan Tang, Andrew Lim
In this paper, we propose the first practical algorithm for the adaptive IM problem that could provide the worst-case approximation guarantee of $1-\mathrm{e}^{\rho_b(\varepsilon-1)}$, where $\rho_b=1-(1-1/b)^b$ and $\varepsilon \in (0, 1)$ is a user-specified parameter.
Social and Information Networks
1 code implementation • CVPR 2020 • Jie Li, Kai Han, Peng Wang, Yu Liu, Xia Yuan
In contrast to the standard 3D convolution that is limited to a fixed 3D receptive field, our module is capable of modeling the dimensional anisotropy voxel-wisely.
1 code implementation • CVPR 2020 • Shuda Li, Kai Han, Theo W. Costain, Henry Howard-Jenkins, Victor Prisacariu
This is a challenging task due to large intra-class variations and a lack of dense pixel level annotations.
Ranked #11 on Semantic correspondence on PF-PASCAL
1 code implementation • 26 Mar 2020 • Yuda Qiu, Zhangyang Xiong, Kai Han, Zhongyuan Wang, Zixiang Xiong, Xiaoguang Han
To alleviate this problem, we propose a weakly supervised training approach to train our model on real face videos, based on the assumption of consistency of albedo and normal across different frames, thus bridging the gap between real and synthetic face images.
1 code implementation • CVPR 2020 • Jianyuan Guo, Kai Han, Yunhe Wang, Chao Zhang, Zhaohui Yang, Han Wu, Xinghao Chen, Chang Xu
To this end, we propose a hierarchical trinity search framework to simultaneously discover efficient architectures for all components (i.e., backbone, neck, and head) of an object detector in an end-to-end manner.
1 code implementation • ICLR 2020 • Kai Han, Sylvestre-Alvise Rebuffi, Sebastien Ehrhardt, Andrea Vedaldi, Andrew Zisserman
In this work we address this problem by combining three ideas: (1) we suggest that the common approach of bootstrapping an image representation using the labeled data only introduces an unwanted bias, and that this can be avoided by using self-supervised learning to train the representation from scratch on the union of labelled and unlabelled data; (2) we use rank statistics to transfer the model's knowledge of the labelled classes to the problem of clustering the unlabelled images; and, (3) we train the data representation by optimizing a joint objective function on the labelled and unlabelled subsets of the data, improving both the supervised classification of the labelled data, and the clustering of the unlabelled data.
no code implementations • 3 Feb 2020 • Chuanjian Liu, Kai Han, Yunhe Wang, Hanting Chen, Qi Tian, Chunjing Xu
Quantization neural networks (QNNs) are very attractive to the industry because of their extremely cheap calculation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
34 code implementations • CVPR 2020 • Kai Han, Yunhe Wang, Qi Tian, Jianyuan Guo, Chunjing Xu, Chang Xu
Deploying convolutional neural networks (CNNs) on embedded devices is difficult due to the limited memory and computation resources.
Ranked #867 on Image Classification on ImageNet
1 code implementation • ICCV 2019 • Jianyuan Guo, Yuhui Yuan, Lang Huang, Chao Zhang, Jinge Yao, Kai Han
On the other hand, there still exist many useful contextual cues that do not fall into the scope of predefined human parts or attributes.
Ranked #59 on Person Re-Identification on DukeMTMC-reID
1 code implementation • 26 Sep 2019 • Mingzhu Shen, Xianglong Liu, Ruihao Gong, Kai Han
In this paper, we attempt to maintain the information propagated in the forward process and propose Balanced Binary Neural Networks with Gated Residual (BBG for short).
Ranked #972 on Image Classification on ImageNet
2 code implementations • NeurIPS 2019 • Yixing Xu, Yunhe Wang, Hanting Chen, Kai Han, Chunjing Xu, DaCheng Tao, Chang Xu
In practice, only a small portion of the original training set is required as positive examples and more useful training examples can be obtained from the massive unlabeled data on the cloud through a PU classifier with an attention based multi-scale feature extractor.
no code implementations • 16 Sep 2019 • Mingzhu Shen, Kai Han, Chunjing Xu, Yunhe Wang
Binary neural networks have attracted tremendous attention due to the efficiency for deploying them on mobile devices.
1 code implementation • ICCV 2019 • Kai Han, Andrea Vedaldi, Andrew Zisserman
The second contribution is a method to estimate the number of classes in the unlabelled data.
1 code implementation • 6 Aug 2019 • Kai Han, Yunhe Wang, Yixing Xu, Chunjing Xu, DaCheng Tao, Chang Xu
Existing works typically reduce the number or size of convolution filters to obtain a minimum viable CNN for edge devices.
no code implementations • 27 Jul 2019 • Kai Han, Yunhe Wang, Han Shu, Chuanjian Liu, Chunjing Xu, Chang Xu
This paper expands the strength of deep convolutional neural networks (CNNs) to the pedestrian attribute recognition problem by devising a novel attribute aware pooling algorithm.
no code implementations • 27 Jul 2019 • Chuanjian Liu, Yunhe Wang, Kai Han, Chunjing Xu, Chang Xu
Exploring deep convolutional neural networks of high efficiency and low memory usage is essential for a wide variety of machine learning tasks.
2 code implementations • ICCV 2019 • Han Shu, Yunhe Wang, Xu Jia, Kai Han, Hanting Chen, Chunjing Xu, Qi Tian, Chang Xu
Generative adversarial networks (GANs) have been successfully applied to numerous computer vision tasks, especially image-to-image translation.
1 code implementation • 25 Jul 2019 • Guan-Ying Chen, Kai Han, Kwan-Yee K. Wong
In this paper, we formulate transparent object matting as a refractive flow estimation problem, and propose a deep learning framework, called TOM-Net, for learning the refractive flow.
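A refractive flow, as used here, is a per-pixel displacement field telling where each foreground pixel samples the background. Applying such a flow is just bilinear warping; a generic sketch (the names are illustrative, not from the paper's code):

```python
import numpy as np

def warp_by_flow(bg, flow):
    """Sample background image bg (H, W) at positions shifted by flow (H, W, 2)."""
    h, w = bg.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # flow[..., 0] is horizontal displacement, flow[..., 1] vertical
    sx = np.clip(xs + flow[..., 0], 0, w - 1)
    sy = np.clip(ys + flow[..., 1], 0, h - 1)
    x0, y0 = np.floor(sx).astype(int), np.floor(sy).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    wx, wy = sx - x0, sy - y0
    top = bg[y0, x0] * (1 - wx) + bg[y0, x1] * wx     # blend along x
    bot = bg[y1, x0] * (1 - wx) + bg[y1, x1] * wx
    return top * (1 - wy) + bot * wy                  # blend along y

bg = np.arange(16, dtype=np.float64).reshape(4, 4)
flow = np.zeros((4, 4, 2))
flow[..., 0] = 1.0                # shift sampling one pixel to the right
print(warp_by_flow(bg, flow))
```

Estimating the flow (rather than applying it) is the learning problem the paper tackles.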
1 code implementation • 21 May 2019 • Sylvestre-Alvise Rebuffi, Sebastien Ehrhardt, Kai Han, Andrea Vedaldi, Andrew Zisserman
The first is a simple but effective one: we leverage the power of transfer learning among different tasks and self-supervision to initialize a good representation of the data without making use of any label.
1 code implementation • CVPR 2019 • Huy V. Vo, Francis Bach, Minsu Cho, Kai Han, Yann Lecun, Patrick Perez, Jean Ponce
Learning with complete or partial supervision is powerful but relies on ever-growing human annotation efforts.
Ranked #2 on Single-object colocalization on Object Discovery
1 code implementation • CVPR 2019 • Guan-Ying Chen, Kai Han, Boxin Shi, Yasuyuki Matsushita, Kwan-Yee K. Wong
This paper proposes an uncalibrated photometric stereo method for non-Lambertian scenes based on deep learning.
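For context, the classical calibrated Lambertian baseline that such deep methods generalize: with known unit light directions L and observed intensities I = L (albedo * n), per-pixel normals and albedo follow from least squares. A sketch of that textbook formulation (not the paper's uncalibrated network):

```python
import numpy as np

def lambertian_ps(intensities, lights):
    """Classic photometric stereo.

    intensities: (num_lights, num_pixels); lights: (num_lights, 3).
    Returns unit normals (3, num_pixels) and per-pixel albedo.
    """
    g, *_ = np.linalg.lstsq(lights, intensities, rcond=None)  # g = albedo * n
    albedo = np.linalg.norm(g, axis=0)
    normals = g / np.maximum(albedo, 1e-12)
    return normals, albedo

# synthetic check: one pixel with known normal and albedo, no shadows
n_true = np.array([0.0, 0.6, 0.8])
lights = np.array([[0, 0, 1], [1, 0, 1], [0, 1, 1]], dtype=float)
lights /= np.linalg.norm(lights, axis=1, keepdims=True)
I = 0.5 * lights @ n_true                  # albedo = 0.5
n_est, rho = lambertian_ps(I[:, None], lights)
print(np.allclose(n_est[:, 0], n_true), np.allclose(rho, 0.5))
```

The uncalibrated, non-Lambertian setting drops the assumption that `lights` is known and that reflectance is diffuse, which is what makes learning necessary.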
1 code implementation • 2 Jan 2019 • Kai Han, Jianyuan Guo, Chao Zhang, Mingjian Zhu
Based on the considerations above, we propose a novel Attribute-Aware Attention Model ($A^3M$), which can learn local attribute representation and global category representation simultaneously in an end-to-end manner.
Ranked #4 on Fine-Grained Image Classification on CompCars
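Attribute-aware attention of this kind typically scores each local region against an attribute query and pools the regions with softmax weights. A generic single-query sketch (names and shapes are hypothetical, not from $A^3M$):

```python
import numpy as np

def attention_pool(local_feats, query):
    """local_feats: (num_regions, dim); query: (dim,). Softmax-weighted sum."""
    scores = local_feats @ query                     # relevance of each region
    scores -= scores.max()                           # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over regions
    return weights @ local_feats                     # (dim,) pooled descriptor

rng = np.random.default_rng(0)
feats = rng.normal(size=(6, 4))   # 6 regions, 4-dim features
q = rng.normal(size=4)            # one attribute query
pooled = attention_pool(feats, q)
assert pooled.shape == (4,)
```

Because the weights are differentiable in both the features and the query, the attribute and category branches can be trained jointly end to end, as the abstract states.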
2 code implementations • NeurIPS 2018 • Shupeng Su, Chao Zhang, Kai Han, Yonghong Tian
To convert inputs into binary codes, hashing algorithms have been widely used for approximate nearest neighbor search on large-scale image sets, thanks to their computation and storage efficiency.
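The payoff of binary codes is that nearest-neighbor search reduces to Hamming distance, computable with XOR and popcount. A minimal sketch over packed codes (generic background, not this paper's hashing scheme):

```python
import numpy as np

def hamming_search(db_codes, query_code, topk=3):
    """db_codes: (n,) uint64 packed binary codes; returns indices and
    Hamming distances of the topk closest codes."""
    x = db_codes ^ query_code                          # differing bits
    # popcount via Python ints (numpy gained bitwise_count only in 2.0)
    dists = np.array([bin(int(v)).count("1") for v in x])
    order = np.argsort(dists, kind="stable")[:topk]
    return order, dists[order]

db = np.array([0b1010, 0b1011, 0b0000, 0b1111], dtype=np.uint64)
idx, d = hamming_search(db, np.uint64(0b1010), topk=2)
print(idx, d)   # index 0 at distance 0, then index 1 at distance 1
```

With 64-bit codes, one database entry costs 8 bytes and one comparison costs a couple of machine instructions, which is the efficiency the abstract refers to.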
1 code implementation • ECCV 2018 • Guan-Ying Chen, Kai Han, Kwan-Yee K. Wong
This paper addresses the problem of photometric stereo for non-Lambertian surfaces.
1 code implementation • CVPR 2018 • Guan-Ying Chen, Kai Han, Kwan-Yee K. Wong
In this paper, we first formulate transparent object matting as a refractive flow estimation problem.
1 code implementation • 23 Oct 2017 • Kai Han, Yunhe Wang, Chao Zhang, Chao Li, Chao Xu
High-dimensional data in areas such as computer vision and machine learning brings computational and analytical difficulties.
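A common first response to such high dimensionality is linear dimensionality reduction; PCA via the SVD is the canonical example (generic background, not this paper's method):

```python
import numpy as np

def pca(X, k):
    """Project rows of X (n_samples, n_features) onto the top-k principal
    components. Returns the projected scores and the components."""
    Xc = X - X.mean(axis=0)                        # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T, Vt[:k]                   # (n, k) scores, (k, d) axes

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 50))
Z, comps = pca(X, k=5)
assert Z.shape == (200, 5) and comps.shape == (5, 50)
```

The components returned by `svd` are orthonormal, so the projection preserves as much variance as any 5-dimensional linear map can.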
1 code implementation • ICCV 2017 • Kai Han, Rafael S. Rezende, Bumsub Ham, Kwan-Yee K. Wong, Minsu Cho, Cordelia Schmid, Jean Ponce
This paper addresses the problem of establishing semantic correspondences between images depicting different instances of the same object or scene category.
no code implementations • CVPR 2016 • Kai Han, Kwan-Yee K. Wong, Dirk Schnieders, Miaomiao Liu
Unlike previous approaches, which require tedious camera calibration, our method recovers both the camera intrinsics and extrinsics, together with the mirror surface, from reflections of the reference plane under at least three unknown distinct poses.
no code implementations • CVPR 2015 • Kai Han, Kwan-Yee K. Wong, Miaomiao Liu
In this paper, we develop a fixed viewpoint approach for dense surface reconstruction of transparent objects based on refraction of light.
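Refraction-based reconstruction rests on Snell's law. In vector form, for a unit incident direction d, unit surface normal n, and index ratio eta = n1/n2, the refracted direction is t = eta*d + (eta*cos_i - cos_t)*n. A sketch of that geometric primitive (standard optics, not the paper's full pipeline):

```python
import numpy as np

def refract(d, n, eta):
    """Refract unit direction d through a surface with unit normal n
    (pointing against d); eta = n1 / n2.
    Returns None on total internal reflection."""
    cos_i = -np.dot(d, n)                  # cosine of the incidence angle
    sin2_t = eta**2 * (1.0 - cos_i**2)     # Snell: sin_t = eta * sin_i
    if sin2_t > 1.0:
        return None                        # total internal reflection
    cos_t = np.sqrt(1.0 - sin2_t)
    return eta * d + (eta * cos_i - cos_t) * n

# a straight-on ray passes undeviated; oblique rays bend toward the normal
n = np.array([0.0, 0.0, 1.0])
d = np.array([0.0, 0.0, -1.0])
print(refract(d, n, 1.0 / 1.5))            # -> [0. 0. -1.]
```

Tracing such refracted rays from a fixed viewpoint, and intersecting them with observations of a reference target, is the kind of constraint a fixed-viewpoint transparent-surface method builds on.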