Search Results for author: Qi Chu

Found 38 papers, 14 papers with code

Transformer based Pluralistic Image Completion with Reduced Information Loss

1 code implementation • 31 Mar 2024 • Qiankun Liu, Yuqi Jiang, Zhentao Tan, Dongdong Chen, Ying Fu, Qi Chu, Gang Hua, Nenghai Yu

The indices of quantized pixels are used as tokens for the inputs and prediction targets of the transformer.

Decoder Image Inpainting +1

150

Leveraging Large Language Models for Relevance Judgments in Legal Case Retrieval

no code implementations • 27 Mar 2024 • Shengjie Ma, Chong Chen, Qi Chu, Jiaxin Mao

Nonetheless, the method of employing a general large language model for reliable relevance judgments in legal case retrieval is yet to be thoroughly explored.

Language Modelling Large Language Model +1

MiM-ISTD: Mamba-in-Mamba for Efficient Infrared Small Target Detection

1 code implementation • 4 Mar 2024 • Tianxiang Chen, Zhentao Tan, Tao Gong, Qi Chu, Yue Wu, Bin Liu, Jieping Ye, Nenghai Yu

Inspired by Mamba, a recent basic model with linear complexity for long-distance modeling, we explore the potential of this state space model for the ISTD task in terms of effectiveness and efficiency in this paper.

Sentence

Bootstrapping Audio-Visual Segmentation by Strengthening Audio Cues

no code implementations • 4 Feb 2024 • Tianxiang Chen, Zhentao Tan, Tao Gong, Qi Chu, Yue Wu, Bin Liu, Le Lu, Jieping Ye, Nenghai Yu

This bidirectional interaction narrows the modality imbalance, facilitating more effective learning of integrated audio-visual representations.

Decoder Representation Learning

TCI-Former: Thermal Conduction-Inspired Transformer for Infrared Small Target Detection

no code implementations • 3 Feb 2024 • Tianxiang Chen, Zhentao Tan, Qi Chu, Yue Wu, Bin Liu, Nenghai Yu

We abstract this process as the directional movement of feature map pixels to target areas through convolution, pooling and interactions with surrounding pixels, which is analogous to the movement of thermal particles constrained by surrounding variables and particles.

Towards More Unified In-context Visual Understanding

no code implementations • 5 Dec 2023 • Dianmo Sheng, Dongdong Chen, Zhentao Tan, Qiankun Liu, Qi Chu, Jianmin Bao, Tao Gong, Bin Liu, Shengwei Xu, Nenghai Yu

Thanks to this design, the model is capable of handling in-context vision understanding tasks with multimodal output in a unified pipeline. Experimental results demonstrate that our model achieves competitive performance compared with specialized models and previous ICL baselines.

Decoder Image Captioning +2

GUPNet++: Geometry Uncertainty Propagation Network for Monocular 3D Object Detection

1 code implementation • 24 Oct 2023 • Yan Lu, Xinzhu Ma, Lei Yang, Tianzhu Zhang, Yating Liu, Qi Chu, Tong He, Yonghui Li, Wanli Ouyang

It models the uncertainty propagation relationship of the geometry projection during training, improving the stability and efficiency of the end-to-end model learning.

Monocular 3D Object Detection object-detection

126

Exploiting Modality-Specific Features For Multi-Modal Manipulation Detection And Grounding

no code implementations • 22 Sep 2023 • Jiazhen Wang, Bin Liu, Changtao Miao, Zhiwei Zhao, Wanyi Zhuang, Qi Chu, Nenghai Yu

Existing methods for multi-modal manipulation detection and grounding primarily focus on fusing vision-language features to make predictions, while overlooking the importance of modality-specific features, leading to sub-optimal results.

MotionGPT: Finetuned LLMs Are General-Purpose Motion Generators

no code implementations • 19 Jun 2023 • Yaqi Zhang, Di Huang, Bin Liu, Shixiang Tang, Yan Lu, Lu Chen, Lei Bai, Qi Chu, Nenghai Yu, Wanli Ouyang

Generating realistic human motion from given action descriptions has advanced significantly, driven by the emerging demand for digital humans.

EVOPOSE: A Recursive Transformer For 3D Human Pose Estimation With Kinematic Structure Priors

no code implementations • 16 Jun 2023 • Yaqi Zhang, Yan Lu, Bin Liu, Zhiwei Zhao, Qi Chu, Nenghai Yu

Transformers are popular in recent 3D human pose estimation, where they use long-term modeling to lift 2D keypoints into 3D space.

Ranked #85 on 3D Human Pose Estimation on Human3.6M

3D Human Pose Estimation

Exploring the Application of Large-scale Pre-trained Models on Adverse Weather Removal

no code implementations • 15 Jun 2023 • Zhentao Tan, Yue Wu, Qiankun Liu, Qi Chu, Le Lu, Jieping Ye, Nenghai Yu

Inspired by the many successful applications of large-scale pre-trained models (e.g., CLIP), in this paper we explore their potential benefits for this task from two aspects, spatial feature representation learning and semantic information embedding: 1) for spatial feature representation learning, we design a Spatially-Adaptive Residual (SAR) Encoder to extract degraded areas adaptively.

Image Restoration Representation Learning

HQ-50K: A Large-scale, High-quality Dataset for Image Restoration

1 code implementation • 8 Jun 2023 • Qinhong Yang, Dongdong Chen, Zhentao Tan, Qiankun Liu, Qi Chu, Jianmin Bao, Lu Yuan, Gang Hua, Nenghai Yu

This paper introduces a new large-scale image restoration dataset, called HQ-50K, which contains 50,000 high-quality images with rich texture details and semantic diversity.

Denoising Image Restoration +2

Multi-spectral Class Center Network for Face Manipulation Detection and Localization

1 code implementation • 18 May 2023 • Changtao Miao, Qi Chu, Zhentao Tan, Zhenchao Jin, Wanyi Zhuang, Yue Wu, Bin Liu, Honggang Hu, Nenghai Yu

Next, a novel Multi-Spectral Class Center Network (MSCCNet) is proposed for face manipulation detection and localization.

Face Swapping

Clothes-Invariant Feature Learning by Causal Intervention for Clothes-Changing Person Re-identification

no code implementations • 10 May 2023 • Xulin Li, Yan Lu, Bin Liu, Yuenan Hou, Yating Liu, Qi Chu, Wanli Ouyang, Nenghai Yu

Clothes-invariant feature extraction is critical to clothes-changing person re-identification (CC-ReID).

Clothes Changing Person Re-Identification

X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using CLIP and StableDiffusion

1 code implementation • 7 Dec 2022 • Hanqing Zhao, Dianmo Sheng, Jianmin Bao, Dongdong Chen, Dong Chen, Fang Wen, Lu Yuan, Ce Liu, Wenbo Zhou, Qi Chu, Weiming Zhang, Nenghai Yu

We demonstrate for the first time that using a text2image model to generate images, or a zero-shot recognition model to filter noisily crawled images, for different object categories is a feasible way to make Copy-Paste truly scalable.

Ranked #7 on Instance Segmentation on LVIS v1.0 val

Data Augmentation Instance Segmentation +5

UIA-ViT: Unsupervised Inconsistency-Aware Method based on Vision Transformer for Face Forgery Detection

no code implementations • 23 Oct 2022 • Wanyi Zhuang, Qi Chu, Zhentao Tan, Qiankun Liu, Haojie Yuan, Changtao Miao, Zixiang Luo, Nenghai Yu

UPCL is designed for learning the consistency-related representation with progressively optimized pseudo annotations.

Representation Learning

Counterfactual Intervention Feature Transfer for Visible-Infrared Person Re-identification

no code implementations • 1 Aug 2022 • Xulin Li, Yan Lu, Bin Liu, Yating Liu, Guojun Yin, Qi Chu, Jinyang Huang, Feng Zhu, Rui Zhao, Nenghai Yu

However, we find that existing graph-based methods for the visible-infrared person re-identification (VI-ReID) task generalize poorly because of two issues: 1) the train-test modality balance gap, which is a property of the VI-ReID task.

counterfactual Person Re-Identification

Towards Intrinsic Common Discriminative Features Learning for Face Forgery Detection using Adversarial Learning

no code implementations • 8 Jul 2022 • Wanyi Zhuang, Qi Chu, Haojie Yuan, Changtao Miao, Bin Liu, Nenghai Yu

Existing face forgery detection methods usually treat face forgery detection as a binary classification problem and adopt deep convolution neural networks to learn discriminative features.

Binary Classification Classification +1

Reduce Information Loss in Transformers for Pluralistic Image Inpainting

1 code implementation • CVPR 2022 • Qiankun Liu, Zhentao Tan, Dongdong Chen, Qi Chu, Xiyang Dai, Yinpeng Chen, Mengchen Liu, Lu Yuan, Nenghai Yu

The indices of quantized pixels are used as tokens for the inputs and prediction targets of the transformer.

Ranked #6 on Seeing Beyond the Visible on KITTI360-EX

Image Inpainting Quantization +1

150

Online Multi-Object Tracking with Unsupervised Re-Identification Learning and Occlusion Estimation

no code implementations • 4 Jan 2022 • Qiankun Liu, Dongdong Chen, Qi Chu, Lu Yuan, Bin Liu, Lei Zhang, Nenghai Yu

In addition, this re-identification practice still cannot track highly occluded objects when they are missed by the detector.

Ranked #7 on Multi-Object Tracking on MOT16 (using extra training data)

Multi-Object Tracking Object +2

Unsupervised Finetuning

no code implementations • 18 Oct 2021 • Suichan Li, Dongdong Chen, Yinpeng Chen, Lu Yuan, Lei Zhang, Qi Chu, Bin Liu, Nenghai Yu

This problem is more challenging than its supervised counterpart, because the low data density of the small-scale target data is unfavorable for unsupervised learning, which damages the pretrained representation and yields a poor representation in the target domain.

Temporal RoI Align for Video Object Recognition

1 code implementation • 8 Sep 2021 • Tao Gong, Kai Chen, Xinjiang Wang, Qi Chu, Feng Zhu, Dahua Lin, Nenghai Yu, Huamin Feng

In this work, considering that the features of the same object instance are highly similar among frames in a video, a novel Temporal RoI Align operator is proposed to extract features from other frames' feature maps for current-frame proposals by utilizing feature similarity.

Ranked #1 on Video Instance Segmentation on YouTube-VIS

Instance Segmentation Object +5

3,385

ISNet: Integrate Image-Level and Semantic-Level Context for Semantic Segmentation

1 code implementation • ICCV 2021 • Zhenchao Jin, Bin Liu, Qi Chu, Nenghai Yu

Third, we compute the similarities between each pixel representation and the image-level and semantic-level contextual information, respectively.

Image Segmentation Semantic Segmentation

735

Mining Contextual Information Beyond Image for Semantic Segmentation

1 code implementation • ICCV 2021 • Zhenchao Jin, Tao Gong, Dongdong Yu, Qi Chu, Jian Wang, Changhu Wang, Jie Shao

To address this, this paper proposes to mine the contextual information beyond individual images to further augment the pixel representations.

Image Segmentation Segmentation +1

Abnormal Behavior Detection Based on Target Analysis

no code implementations • 29 Jul 2021 • Luchuan Song, Bin Liu, Huihui Zhu, Qi Chu, Nenghai Yu

To this end, we propose a multivariate fusion method that analyzes each target through three branches: object, action and motion.

Object

Cascaded Residual Density Network for Crowd Counting

no code implementations • 29 Jul 2021 • Kun Zhao, Luchuan Song, Bin Liu, Qi Chu, Nenghai Yu

Crowd counting is a challenging task due to issues such as scale variation and perspective variation in real crowd scenes.

Crowd Counting

Geometry Uncertainty Projection Network for Monocular 3D Object Detection

1 code implementation • ICCV 2021 • Yan Lu, Xinzhu Ma, Lei Yang, Tianzhu Zhang, Yating Liu, Qi Chu, Junjie Yan, Wanli Ouyang

In this paper, we propose a Geometry Uncertainty Projection Network (GUP Net) to tackle the error amplification problem at both inference and training stages.

Ranked #2 on 3D Object Detection From Monocular Images on Waymo Open Dataset

3D Object Detection From Monocular Images Depth Estimation +3

126

Improve Unsupervised Pretraining for Few-label Transfer

no code implementations • ICCV 2021 • Suichan Li, Dongdong Chen, Yinpeng Chen, Lu Yuan, Lei Zhang, Qi Chu, Bin Liu, Nenghai Yu

Unsupervised pretraining has achieved great success and many recent works have shown unsupervised pretraining can achieve comparable or even slightly better transfer performance than supervised pretraining on downstream target datasets.

Clustering Contrastive Learning

Towards Generalizable and Robust Face Manipulation Detection via Bag-of-local-feature

no code implementations • 14 Mar 2021 • Changtao Miao, Qi Chu, Weihai Li, Tao Gong, Wanyi Zhuang, Nenghai Yu

Over the past several years, in order to address the malicious abuse of facial manipulation technology, face manipulation detection has attracted considerable attention and achieved remarkable progress.

Diverse Semantic Image Synthesis via Probability Distribution Modeling

1 code implementation • CVPR 2021 • Zhentao Tan, Menglei Chai, Dongdong Chen, Jing Liao, Qi Chu, Bin Liu, Gang Hua, Nenghai Yu

In this paper, we propose a novel diverse semantic image synthesis framework from the perspective of semantic class distributions, which naturally supports diverse generation at semantic or even instance level.

Ranked #1 on Image-to-Image Translation on ADE20K Labels-to-Photos (LPIPS metric)

Image-to-Image Translation

First demonstration of early warning gravitational wave alerts

no code implementations • 8 Feb 2021 • Ryan Magee, Deep Chatterjee, Leo P. Singer, Surabhi Sachdev, Manoj Kovalam, Geoffrey Mo, Stuart Anderson, Patrick Brady, Patrick Brockill, Kipp Cannon, Tito Dal Canton, Qi Chu, Patrick Clearwater, Alex Codoreanu, Marco Drago, Patrick Godwin, Shaon Ghosh, Giuseppe Greco, Chad Hanna, Shasvath J. Kapadia, Erik Katsavounidis, Victor Oloworaran, Alexander E. Pace, Fiona Panther, Anwarul Patwary, Roberto De Pietri, Brandon Piotrzkowski, Tanner Prestegard, Luca Rei, Anala K. Sreekumar, Marek J. Szczepańczyk, Vinaya Valsan, Aaron Viets, Madeline Wade, Linqing Wen, John Zweizig

We present results from an end-to-end mock data challenge that detects binary neutron star mergers and alerts partner facilities before merger.

High Energy Astrophysical Phenomena

Are Fewer Labels Possible for Few-shot Learning?

no code implementations • 10 Dec 2020 • Suichan Li, Dongdong Chen, Yinpeng Chen, Lu Yuan, Lei Zhang, Qi Chu, Nenghai Yu

We conduct experiments on 10 different few-shot target datasets, and our average few-shot performance outperforms both vanilla inductive unsupervised transfer and supervised transfer by a large margin.

Clustering Few-Shot Learning

Efficient Semantic Image Synthesis via Class-Adaptive Normalization

1 code implementation • 8 Dec 2020 • Zhentao Tan, Dongdong Chen, Qi Chu, Menglei Chai, Jing Liao, Mingming He, Lu Yuan, Gang Hua, Nenghai Yu

Spatially-adaptive normalization (SPADE) has recently been remarkably successful in conditional semantic image synthesis (Park et al., 2019); it modulates the normalized activation with spatially-varying transformations learned from semantic layouts to prevent the semantic information from being washed away.

Image Generation

MichiGAN: Multi-Input-Conditioned Hair Image Generation for Portrait Editing

1 code implementation • 30 Oct 2020 • Zhentao Tan, Menglei Chai, Dongdong Chen, Jing Liao, Qi Chu, Lu Yuan, Sergey Tulyakov, Nenghai Yu

In this paper, we present MichiGAN (Multi-Input-Conditioned Hair Image GAN), a novel conditional image generation method for interactive portrait hair manipulation.

Conditional Image Generation

291

Rethinking Spatially-Adaptive Normalization

no code implementations • 6 Apr 2020 • Zhentao Tan, Dongdong Chen, Qi Chu, Menglei Chai, Jing Liao, Mingming He, Lu Yuan, Nenghai Yu

Despite its impressive performance, a more thorough understanding of where its advantages actually come from is still needed to help reduce the significant computation and parameter overheads introduced by these new structures.

Image Generation

Density-Aware Graph for Deep Semi-Supervised Visual Recognition

no code implementations • CVPR 2020 • Suichan Li, Bin Liu, Dong-Dong Chen, Qi Chu, Lu Yuan, Nenghai Yu

Motivated by these limitations, this paper proposes to solve the SSL problem by building a novel density-aware graph, based on which the neighborhood information can be easily leveraged and the feature learning and label propagation can also be trained in an end-to-end way.

Pseudo Label

Cross-modality Person re-identification with Shared-Specific Feature Transfer

no code implementations • CVPR 2020 • Yan Lu, Yue Wu, Bin Liu, Tianzhu Zhang, Baopu Li, Qi Chu, Nenghai Yu

In this paper, we tackle the above limitation by proposing a novel cross-modality shared-specific feature transfer algorithm (termed cm-SSFT) to explore the potential of both the modality-shared information and the modality-specific characteristics to boost the re-identification performance.

Cross-Modality Person Re-identification Person Re-Identification

Online Multi-Object Tracking Using CNN-based Single Object Tracker with Spatial-Temporal Attention Mechanism

no code implementations • ICCV 2017 • Qi Chu, Wanli Ouyang, Hongsheng Li, Xiaogang Wang, Bin Liu, Nenghai Yu

The visibility map of the target is learned and used for inferring the spatial attention map.

Computational Efficiency Multi-Object Tracking +2
