1 code implementation • 22 Dec 2023 • Nannan Li, Qing Liu, Krishna Kumar Singh, Yilin Wang, Jianming Zhang, Bryan A. Plummer, Zhe Lin
In this paper, we propose UniHuman, a unified model that addresses multiple facets of human image editing in real-world settings.
no code implementations • 17 Aug 2023 • Wei Song, Jun Zhou, Mingjie Wang, Hongchen Tan, Nannan Li, Xiuping Liu
In this work, we propose a novel multimodal fusion network for point cloud completion, which can simultaneously fuse visual and textual information to predict the semantic and geometric characteristics of incomplete shapes effectively.
no code implementations • 5 May 2023 • Yiyi Zhang, Zhiwen Ying, Ying Zheng, Cuiling Wu, Nannan Li, Jun Wang, Xianzhong Feng, Xiaogang Xu
Plant leaf identification is crucial for biodiversity protection and conservation, and it has gradually attracted academic attention in recent years.
1 code implementation • ICCV 2023 • Nannan Li, Kevin J. Shih, Bryan A. Plummer
Then we reconstruct the input image by sampling from the permuted textures for patch-level disentanglement.
1 code implementation • 13 Jul 2022 • Nannan Li, Bryan A. Plummer
Thus, the source attribute information can often be hidden in the disentangled features, leading to unwanted image editing effects.
1 code implementation • 28 May 2022 • Yu Pan, Zeyong Su, Ao Liu, Jingquan Wang, Nannan Li, Zenglin Xu
To address this problem, we propose a universal weight initialization paradigm that generalizes the Xavier and Kaiming methods and is widely applicable to arbitrary TCNNs.
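As a rough illustration of how a single variance-scaling rule can subsume both Xavier and Kaiming initialization, consider the sketch below. It is only the shared starting point: the paper's actual derivation for tensorial layers (TCNNs) is more involved, and the function name and `mode` parameter here are illustrative, not from the paper.

```python
import math
import random

def variance_scaling_init(fan_in, fan_out, gain=1.0, mode="avg", seed=0):
    """Draw a (fan_in x fan_out) weight matrix with Var(W) = gain^2 / fan.

    mode="avg", gain=1.0      -> Xavier/Glorot: Var(W) = 2 / (fan_in + fan_out)
    mode="in",  gain=sqrt(2)  -> Kaiming/He (ReLU): Var(W) = 2 / fan_in
    """
    fan = {"avg": (fan_in + fan_out) / 2.0, "in": fan_in, "out": fan_out}[mode]
    std = gain / math.sqrt(fan)
    rng = random.Random(seed)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]
```

Choosing `fan` and `gain` per layer type is what a "universal" paradigm has to automate; for tensor-decomposed layers the effective fan depends on the decomposition ranks.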
1 code implementation • 13 Feb 2022 • Nannan Li, Yaran Chen, Weifan Li, Zixiang Ding, Dongbin Zhao
In this paper, we propose broad attention, which improves performance by incorporating the attention relationships of different layers of a vision transformer; the resulting model is called BViT.
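A minimal sketch of the idea of fusing attention outputs across layers is given below. The fusion rule (a weighted sum) and all function names are assumptions for illustration only; BViT's actual broad-attention formulation should be taken from the paper.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(q, k, v):
    # standard scaled dot-product attention for a single layer
    return softmax(q @ k.T / np.sqrt(q.shape[-1])) @ v

def broad_attention(layer_qkv, weights=None):
    """Fuse the attention outputs of every transformer layer.

    layer_qkv: list of (Q, K, V) triples, one per layer.
    A weighted sum is used as the fusion rule purely for illustration.
    """
    outs = [attention(q, k, v) for q, k, v in layer_qkv]
    if weights is None:
        weights = [1.0 / len(outs)] * len(outs)
    return sum(w * o for w, o in zip(weights, outs))
```

The point of such cross-layer fusion is that shallow layers attend to different structures than deep ones, so combining them gives richer token representations than the last layer alone.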
no code implementations • 15 Nov 2021 • Zixiang Ding, Yaran Chen, Nannan Li, Dongbin Zhao, C. L. Philip Chen
Moreover, multi-scale feature fusion and knowledge embedding are proposed to improve the performance of BCNN with shallow topology.
1 code implementation • 8 Oct 2021 • Jiaqi Li, Haoran Li, Yaran Chen, Zixiang Ding, Nannan Li, Mingjun Ma, Zicheng Duan, Dongbing Zhao
Compared with the traditional rule-based pruning method, this pipeline saves human labor and achieves a higher compression ratio with lower accuracy loss.
no code implementations • 6 Jul 2021 • Leitian Tao, Li Mi, Nannan Li, Xianhang Cheng, Yaosi Hu, Zhenzhong Chen
For a typical Scene Graph Generation (SGG) method, there is often a large gap in the performance of the predicates' head classes and tail classes.
no code implementations • 22 Sep 2020 • Nannan Li, Yu Pan, Yaran Chen, Zixiang Ding, Dongbin Zhao, Zenglin Xu
Interestingly, we discover that some of the rank elements are sensitive and usually aggregate in a narrow region, namely an interest region.
no code implementations • 18 Sep 2020 • Zixiang Ding, Yaran Chen, Nannan Li, Dongbin Zhao
To address this consequent issue, we give two solutions: 1) we propose the Confident Learning Rate (CLR), which scales architecture-weight updates by the confidence of the gradient and increases with the training time of the over-parameterized BCNN; 2) we combine partial channel connections with edge normalization, which can further improve memory efficiency.
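The core of a confidence-scaled learning rate can be sketched in a few lines. The linear ramp below is an illustrative schedule only; the paper's exact confidence function for CLR may differ.

```python
def confident_lr(base_lr, step, total_steps):
    """Confidence-scaled learning rate for architecture-weight updates.

    The confidence factor ramps from 0 to 1 over training, so early
    (unreliable) architecture gradients are damped and late ones are
    trusted. Illustrative linear schedule, not the paper's exact form.
    """
    confidence = min(1.0, max(0.0, step / total_steps))
    return base_lr * confidence
```

In a DARTS-style search loop, `confident_lr` would replace the fixed learning rate of the architecture-weight optimizer while the network weights keep their own schedule.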
1 code implementation • 21 Jul 2020 • Nannan Li, Zhenzhong Chen
Constructing adversarial examples under a black-box threat model degrades the original images by introducing visual distortion.
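To make the distortion trade-off concrete, here is a toy score-based black-box attack on a hypothetical stand-in model (both `toy_model` and the random-search strategy are illustrative assumptions, not the paper's method): the attacker only queries model outputs, and the L2 distortion it accumulates is exactly the visual cost the sentence above refers to.

```python
import random

def toy_model(x):
    # hypothetical stand-in classifier: "confidence" that the label is 1
    return sum(x) / len(x)

def black_box_attack(x, step=0.05, max_queries=500, seed=0):
    """Greedy random-search black-box attack sketch.

    Proposes small per-pixel perturbations, keeps only those that lower
    the model's score, stops once the decision flips (score < 0.5), and
    reports the L2 distortion introduced along the way.
    """
    rng = random.Random(seed)
    adv = list(x)
    best = toy_model(adv)
    for _ in range(max_queries):
        i = rng.randrange(len(adv))
        cand = list(adv)
        cand[i] = min(1.0, max(0.0, cand[i] + rng.choice([-step, step])))
        score = toy_model(cand)
        if score < best:              # keep only score-reducing perturbations
            adv, best = cand, score
        if best < 0.5:                # decision flipped: attack succeeded
            break
    distortion = sum((a - b) ** 2 for a, b in zip(adv, x)) ** 0.5
    return adv, distortion
```

Minimizing `distortion` subject to the flip succeeding is what distinguishes a perceptually careful attack from naive perturbation.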
no code implementations • 24 Mar 2020 • Nannan Li, Zhenzhong Chen
Adversarial learning has shown its advances in generating natural and diverse descriptions in image captioning.
no code implementations • 18 Jan 2020 • Zixiang Ding, Yaran Chen, Nannan Li, Dongbin Zhao, Zhiquan Sun, C. L. Philip Chen
In this paper, we propose Broad Neural Architecture Search (BNAS), in which we elaborately design a broad, scalable architecture dubbed Broad Convolutional Neural Network (BCNN) to solve the above issue.
no code implementations • 23 Dec 2019 • Kun Shao, Zhentao Tang, Yuanheng Zhu, Nannan Li, Dongbin Zhao
In this paper, we survey the progress of DRL methods, including value-based, policy gradient, and model-based algorithms, and compare their main techniques and properties.
no code implementations • 20 Mar 2019 • Haohao Li, Shengfa Wang, Nannan Li, Zhixun Su, Ximin Liu
The different intrinsic representations (features) focus on different geometric properties to describe the same 3D shape, which makes the representations related.
1 code implementation • CVPR 2019 • Jia-Xing Zhong, Nannan Li, Weijie Kong, Shan Liu, Thomas H. Li, Ge Li
Remarkably, we obtain a frame-level AUC score of 82.12% on UCF-Crime.
Tasks: Anomaly Detection in Surveillance Videos · Multiple Instance Learning · +3
no code implementations • 6 Nov 2018 • Weijie Kong, Nannan Li, Shan Liu, Thomas Li, Ge Li
Despite tremendous progress in temporal action detection, state-of-the-art methods still suffer from sharp performance deterioration when localizing the start and end boundaries of actions.
no code implementations • 9 Jul 2018 • Jia-Xing Zhong, Nannan Li, Weijie Kong, Tao Zhang, Thomas H. Li, Ge Li
Weakly supervised temporal action detection is a Herculean task in understanding untrimmed videos, since no supervisory signal except the video-level category label is available on training data.
no code implementations • 22 Jan 2018 • Qianye Yang, Nannan Li, Zixu Zhao, Xingyu Fan, Eric I-Chao Chang, Yan Xu
Based on our proposed framework, we first propose a method for cross-modality registration that fuses the deformation fields to exploit the cross-modality information from translated modalities.
1 code implementation • 22 Jun 2017 • Jingjia Huang, Nannan Li, Tao Zhang, Ge Li
Existing action detection algorithms usually generate action proposals through an extensive search over the video at multiple temporal scales, which brings about huge computational overhead and deviates from the human perception procedure.
no code implementations • 23 Aug 2016 • Nannan Li, Dan Xu, Zhenqiang Ying, Zhihao Li, Ge Li
In this paper, we address the problem of searching action proposals in unconstrained video clips.