Search Results for author: Lingxi Xie

Found 135 papers, 63 papers with code

SIMILE: Introducing Sequential Information towards More Effective Imitation Learning

no code implementations • ICLR 2019 • Yutong Bai, Lingxi Xie

Reinforcement learning (RL) is a metaheuristic that aims to teach an agent to interact with an environment and maximize the reward in a complex task.

Imitation Learning OpenAI Gym +3

AlignZeg: Mitigating Objective Misalignment for Zero-shot Semantic Segmentation

no code implementations • 8 Apr 2024 • Jiannan Ge, Lingxi Xie, Hongtao Xie, Pandeng Li, Xiaopeng Zhang, Yongdong Zhang, Qi Tian

(1) Mutually-Refined Proposal Extraction.

Image Segmentation Segmentation +3

GaussianObject: Just Taking Four Images to Get A High-Quality 3D Object with Gaussian Splatting

1 code implementation • 15 Feb 2024 • Chen Yang, Sikuang Li, Jiemin Fang, Ruofan Liang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian

Then we construct a Gaussian repair model based on diffusion models to supplement the omitted object information, where Gaussians are further refined.

Neural Rendering Object

609

ChatterBox: Multi-round Multimodal Referring and Grounding

1 code implementation • 24 Jan 2024 • Yunjie Tian, Tianren Ma, Lingxi Xie, Jihao Qiu, Xi Tang, Yuan Zhang, Jianbin Jiao, Qi Tian, Qixiang Ye

In this study, we establish a baseline for a new task named multimodal multi-round referring and grounding (MRG), opening up a promising direction for instance-level multimodal dialogues.

Language Modelling Visual Grounding

VMamba: Visual State Space Model

2 code implementations • 18 Jan 2024 • Yue Liu, Yunjie Tian, Yuzhong Zhao, Hongtian Yu, Lingxi Xie, YaoWei Wang, Qixiang Ye, Yunfan Liu

Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) have long been the predominant backbone networks for visual representation learning.

Computational Efficiency Representation Learning

1,368

Incorporating Visual Experts to Resolve the Information Loss in Multimodal Large Language Models

no code implementations • 6 Jan 2024 • Xin He, Longhui Wei, Lingxi Xie, Qi Tian

Multimodal Large Language Models (MLLMs) are experiencing rapid growth, yielding a plethora of noteworthy contributions in recent months.

Instruction Following

Cascade-Zero123: One Image to Highly Consistent 3D with Self-Prompted Nearby Views

no code implementations • 7 Dec 2023 • Yabo Chen, Jiemin Fang, YuYang Huang, Taoran Yi, Xiaopeng Zhang, Lingxi Xie, Xinggang Wang, Wenrui Dai, Hongkai Xiong, Qi Tian

We propose a cascade generation framework constructed with two Zero-1-to-3 models, named Cascade-Zero123, to tackle this issue, which progressively extracts 3D information from the source image.

Transparent objects

Segment Any 3D Gaussians

no code implementations • 1 Dec 2023 • Jiazhong Cen, Jiemin Fang, Chen Yang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian

Interactive 3D segmentation in radiance fields is an appealing task because of its importance in 3D scene understanding and manipulation.

Interactive Segmentation Scene Understanding +1

Parameter Efficient Fine-tuning via Cross Block Orchestration for Segment Anything Model

no code implementations • 28 Nov 2023 • Zelin Peng, Zhengqin Xu, Zhilin Zeng, Lingxi Xie, Qi Tian, Wei Shen

Parameter-efficient fine-tuning (PEFT) is an effective methodology to unleash the potential of large foundation models in novel scenarios with limited training data.

Image Classification Image Segmentation +2

GaussianEditor: Editing 3D Gaussians Delicately with Text Instructions

no code implementations • 27 Nov 2023 • Jiemin Fang, Junjie Wang, Xiaopeng Zhang, Lingxi Xie, Qi Tian

Specifically, we first extract the region of interest (RoI) corresponding to the text instruction, aligning it to 3D Gaussians.

3D scene Editing

One-bit Supervision for Image Classification: Problem, Solution, and Beyond

no code implementations • 26 Nov 2023 • Hengtong Hu, Lingxi Xie, Xinyue Hue, Richang Hong, Qi Tian

An intriguing property of the setting is that the burden of annotation is largely alleviated in comparison to offering the accurate label.

Active Learning Image Classification +2

GaussianDreamer: Fast Generation from Text to 3D Gaussians by Bridging 2D and 3D Diffusion Models

1 code implementation • 12 Oct 2023 • Taoran Yi, Jiemin Fang, Junjie Wang, Guanjun Wu, Lingxi Xie, Xiaopeng Zhang, Wenyu Liu, Qi Tian, Xinggang Wang

In recent times, the generation of 3D assets from text prompts has shown impressive results.

Text to 3D

514

4D Gaussian Splatting for Real-Time Dynamic Scene Rendering

1 code implementation • 12 Oct 2023 • Guanjun Wu, Taoran Yi, Jiemin Fang, Lingxi Xie, Xiaopeng Zhang, Wei Wei, Wenyu Liu, Qi Tian, Xinggang Wang

Representing and rendering dynamic scenes has been an important but challenging task.

1,640

QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models

1 code implementation • 26 Sep 2023 • Yuhui Xu, Lingxi Xie, Xiaotao Gu, Xin Chen, Heng Chang, Hengheng Zhang, Zhengsu Chen, Xiaopeng Zhang, Qi Tian

Recent years have witnessed the rapid development of large language models (LLMs).

Quantization

5,881

Towards AGI in Computer Vision: Lessons Learned from GPT and Large Language Models

no code implementations • 14 Jun 2023 • Lingxi Xie, Longhui Wei, Xiaopeng Zhang, Kaifeng Bi, Xiaotao Gu, Jianlong Chang, Qi Tian

In this paper, we start with a conceptual definition of AGI and briefly review how NLP solves a wide range of tasks via a chat system.

Visual Tuning

no code implementations • 10 May 2023 • Bruce X. B. Yu, Jianlong Chang, Haixin Wang, Lingbo Liu, Shijie Wang, Zhiyu Wang, Junfan Lin, Lingxi Xie, Haojie Li, Zhouchen Lin, Qi Tian, Chang Wen Chen

With the surprising development of pre-trained visual foundation models, visual tuning jumped out of the standard modus operandi that fine-tunes the whole pre-trained model or just the fully connected layer.

Segment Anything in 3D with Radiance Fields

1 code implementation • NeurIPS 2023 • Jiazhong Cen, Jiemin Fang, Zanwei Zhou, Chen Yang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian

The Segment Anything Model (SAM) emerges as a powerful vision foundation model to generate high-quality 2D segmentation results.

Inverse Rendering Segmentation

783

Pipeline MoE: A Flexible MoE Implementation with Pipeline Parallelism

no code implementations • 22 Apr 2023 • Xin Chen, Hengheng Zhang, Xiaotao Gu, Kaifeng Bi, Lingxi Xie, Qi Tian

The Mixture of Experts (MoE) model has become an important choice for large language models because of its scalability with sublinear computational complexity for training and inference.

Focus on Your Target: A Dual Teacher-Student Framework for Domain-adaptive Semantic Segmentation

no code implementations • ICCV 2023 • Xinyue Huo, Lingxi Xie, Wengang Zhou, Houqiang Li, Qi Tian

Currently, a popular UDA framework lies in self-training which endows the model with two-fold abilities: (i) learning reliable semantics from the labeled images in the source domain, and (ii) adapting to the target domain via generating pseudo labels on the unlabeled images.

Semantic Segmentation Unsupervised Domain Adaptation

USAGE: A Unified Seed Area Generation Paradigm for Weakly Supervised Semantic Segmentation

no code implementations • ICCV 2023 • Zelin Peng, Guanchun Wang, Lingxi Xie, Dongsheng Jiang, Wei Shen, Qi Tian

Seed area generation is usually the starting point of weakly supervised semantic segmentation (WSSS).

Multi-Label Classification Weakly supervised Semantic Segmentation +1

Fast-iTPN: Integrally Pre-Trained Transformer Pyramid Network with Token Migration

1 code implementation • CVPR 2023 • Yunjie Tian, Lingxi Xie, Jihao Qiu, Jianbin Jiao, YaoWei Wang, Qi Tian, Qixiang Ye

iTPN is born with two elaborated designs: 1) The first pre-trained feature pyramid upon vision transformer (ViT).

object-detection Object Detection +1

149

Understanding and Mitigating Overfitting in Prompt Tuning for Vision-Language Models

1 code implementation • 4 Nov 2022 • Chengcheng Ma, Yang Liu, Jiankang Deng, Lingxi Xie, WeiMing Dong, Changsheng Xu

Pretrained vision-language models (VLMs) such as CLIP have shown impressive generalization capability in downstream vision tasks with appropriate text prompts.

object-detection Open Vocabulary Object Detection +2

Pangu-Weather: A 3D High-Resolution Model for Fast and Accurate Global Weather Forecast

3 code implementations • 3 Nov 2022 • Kaifeng Bi, Lingxi Xie, Hengheng Zhang, Xin Chen, Xiaotao Gu, Qi Tian

In this paper, we present Pangu-Weather, a deep learning based system for fast and accurate global weather forecast.

920

Learnable Distribution Calibration for Few-Shot Class-Incremental Learning

no code implementations • 1 Oct 2022 • Binghao Liu, Boyu Yang, Lingxi Xie, Ren Wang, Qi Tian, Qixiang Ye

LDC is built upon a parameterized calibration unit (PCU), which initializes biased distributions for all classes based on classifier vectors (memory-free) and a single covariance matrix.

Few-Shot Class-Incremental Learning Few-Shot Learning +2

Fine-Grained Semantically Aligned Vision-Language Pre-Training

1 code implementation • 4 Aug 2022 • Juncheng Li, Xin He, Longhui Wei, Long Qian, Linchao Zhu, Lingxi Xie, Yueting Zhuang, Qi Tian, Siliang Tang

Large-scale vision-language pre-training has shown impressive advances in a wide range of downstream tasks.

object-detection Object Detection +1

Skeleton-Parted Graph Scattering Networks for 3D Human Motion Prediction

1 code implementation • 31 Jul 2022 • Maosen Li, Siheng Chen, Zijing Zhang, Lingxi Xie, Qi Tian, Ya Zhang

To address the first issue, we propose adaptive graph scattering, which leverages multiple trainable band-pass graph filters to decompose pose features into richer graph spectrum bands.

Human motion prediction motion prediction

Visual Recognition by Request

1 code implementation • CVPR 2023 • Chufeng Tang, Lingxi Xie, Xiaopeng Zhang, Xiaolin Hu, Qi Tian

Humans have the ability to recognize visual semantics at unlimited granularity, but existing visual recognition algorithms cannot achieve this goal.

Instance Segmentation Semantic Segmentation

Active Pointly-Supervised Instance Segmentation

1 code implementation • 23 Jul 2022 • Chufeng Tang, Lingxi Xie, Gang Zhang, Xiaopeng Zhang, Qi Tian, Xiaolin Hu

In this paper, we present an economic active learning setting, named active pointly-supervised instance segmentation (APIS), which starts with box-level annotations and iteratively samples a point within the box and asks if it falls on the object.

Active Learning Instance Segmentation +2

A Survey on Label-efficient Deep Image Segmentation: Bridging the Gap between Weak Supervision and Dense Prediction

no code implementations • 4 Jul 2022 • Wei Shen, Zelin Peng, Xuehui Wang, Huayu Wang, Jiazhong Cen, Dongsheng Jiang, Lingxi Xie, Xiaokang Yang, Qi Tian

Next, we summarize the existing label-efficient image segmentation methods from a unified perspective that discusses an important question: how to bridge the gap between weak supervision and dense prediction -- the current methods are mostly based on heuristic priors, such as cross-pixel similarity, cross-label constraint, cross-view consistency, and cross-image relation.

Image Segmentation Instance Segmentation +2

HiViT: Hierarchical Vision Transformer Meets Masked Image Modeling

1 code implementation • 30 May 2022 • Xiaosong Zhang, Yunjie Tian, Wei Huang, Qixiang Ye, Qi Dai, Lingxi Xie, Qi Tian

A key idea of efficient implementation is to discard the masked image patches (or tokens) throughout the target network (encoder), which requires the encoder to be a plain vision transformer (e.g., ViT), even though hierarchical vision transformers (e.g., Swin Transformer) have potentially better properties in formulating vision inputs.

Transfer Learning

Fast Dynamic Radiance Fields with Time-Aware Neural Voxels

1 code implementation • 30 May 2022 • Jiemin Fang, Taoran Yi, Xinggang Wang, Lingxi Xie, Xiaopeng Zhang, Wenyu Liu, Matthias Nießner, Qi Tian

A multi-distance interpolation method is proposed and applied on voxel features to model both small and large motions.

307

CenterNet++ for Object Detection

2 code implementations • 18 Apr 2022 • Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang, Qi Tian

Our approach, named CenterNet, detects each object as a triplet of keypoints (the top-left and bottom-right corners and the center keypoint).

Ranked #35 on Object Detection on COCO test-dev

Object object-detection +1

177

Domain-Agnostic Prior for Transfer Semantic Segmentation

no code implementations • CVPR 2022 • Xinyue Huo, Lingxi Xie, Hengtong Hu, Wengang Zhou, Houqiang Li, Qi Tian

Unsupervised domain adaptation (UDA) is an important topic in the computer vision community.

Representation Learning Semantic Segmentation +1

Beyond Masking: Demystifying Token-Based Pre-Training for Vision Transformers

1 code implementation • 27 Mar 2022 • Yunjie Tian, Lingxi Xie, Jiemin Fang, Mengnan Shi, Junran Peng, Xiaopeng Zhang, Jianbin Jiao, Qi Tian, Qixiang Ye

The past year has witnessed a rapid development of masked image modeling (MIM).

TAPE: Task-Agnostic Prior Embedding for Image Restoration

no code implementations • 11 Mar 2022 • Lin Liu, Lingxi Xie, Xiaopeng Zhang, Shanxin Yuan, Xiangyu Chen, Wengang Zhou, Houqiang Li, Qi Tian

In this paper, we propose a novel approach that embeds a task-agnostic prior into a transformer.

Image Restoration

MVP: Multimodality-guided Visual Pre-training

no code implementations • 10 Mar 2022 • Longhui Wei, Lingxi Xie, Wengang Zhou, Houqiang Li, Qi Tian

Recently, masked image modeling (MIM) has become a promising direction for visual pre-training.

Language Modelling

One-Bit Active Query With Contrastive Pairs

no code implementations • CVPR 2022 • Yuhang Zhang, Xiaopeng Zhang, Lingxi Xie, Jie Li, Robert C. Qiu, Hengtong Hu, Qi Tian

The Yes query is treated as positive pairs of the queried category for contrastive pulling, while the No query is treated as hard negative pairs for contrastive repelling.

Active Learning Contrastive Learning

Exploring Complicated Search Spaces with Interleaving-Free Sampling

no code implementations • 5 Dec 2021 • Yunjie Tian, Lingxi Xie, Jiemin Fang, Jianbin Jiao, Qixiang Ye, Qi Tian

In this paper, we build the search algorithm upon a complicated search space with long-distance connections, and show that existing weight-sharing search algorithms mostly fail due to the existence of interleaved connections.

Neural Architecture Search

NeuSample: Neural Sample Field for Efficient View Synthesis

1 code implementation • 30 Nov 2021 • Jiemin Fang, Lingxi Xie, Xinggang Wang, Xiaopeng Zhang, Wenyu Liu, Qi Tian

Neural radiance fields (NeRF) have shown great potential in representing 3D scenes and synthesizing novel views, but the computational overhead of NeRF at the inference stage is still heavy.

Semantic-Aware Generation for Self-Supervised Visual Representation Learning

1 code implementation • 25 Nov 2021 • Yunjie Tian, Lingxi Xie, Xiaopeng Zhang, Jiemin Fang, Haohang Xu, Wei Huang, Jianbin Jiao, Qi Tian, Qixiang Ye

In this paper, we propose a self-supervised visual representation learning approach which involves both generative and discriminative proxies, where we focus on the former part by requiring the target network to recover the original image based on the mid-level features.

Ranked #63 on Semantic Segmentation on Cityscapes test

Representation Learning Semantic Segmentation

Consensus Synergizes with Memory: A Simple Approach for Anomaly Segmentation in Urban Scenes

no code implementations • 24 Nov 2021 • Jiazhong Cen, Zenkun Jiang, Lingxi Xie, Qi Tian, Xiaokang Yang, Wei Shen

Anomaly segmentation is a crucial task for safety-critical applications, such as autonomous driving in urban scenes, where the goal is to detect out-of-distribution (OOD) objects with categories which are unseen during training.

Ranked #10 on Anomaly Detection on Fishyscapes L&F

Anomaly Detection Autonomous Driving +1

CIPS-3D: A 3D-Aware Generator of GANs Based on Conditionally-Independent Pixel Synthesis

1 code implementation • 19 Oct 2021 • Peng Zhou, Lingxi Xie, Bingbing Ni, Qi Tian

The style-based GAN (StyleGAN) architecture achieved state-of-the-art results for generating high-quality images, but it lacks explicit and precise control over camera poses.

Ranked #1 on 3D-Aware Image Synthesis on FFHQ 256 x 256

3D-Aware Image Synthesis Transfer Learning

605

Vibration-based Uncertainty Estimation for Learning from Limited Supervision

no code implementations • 29 Sep 2021 • Hengtong Hu, Lingxi Xie, Yinquan Wang, Richang Hong, Meng Wang, Qi Tian

We investigate the problem of estimating uncertainty for training data, so that deep neural networks can make use of the results for learning from limited supervision.

Active Learning

Deep Encryption: Protecting Pre-Trained Neural Networks with Confusion Neurons

no code implementations • 29 Sep 2021 • Mengbiao Zhao, Shixiong Xu, Jianlong Chang, Lingxi Xie, Jie Chen, Qi Tian

Having consumed huge amounts of training data and computational resource, large-scale pre-trained models are often considered key assets of AI service providers.

Position

Rectifying the Shortcut Learning of Background for Few-Shot Learning

1 code implementation • NeurIPS 2021 • Xu Luo, Longhui Wei, Liangjian Wen, Jinrong Yang, Lingxi Xie, Zenglin Xu, Qi Tian

The category gap between training and evaluation has been characterised as one of the main obstacles to the success of Few-Shot Learning (FSL).

Ranked #20 on Few-Shot Image Classification on Mini-Imagenet 5-way (5-shot)

Few-Shot Image Classification Few-Shot Learning

101

Bag of Instances Aggregation Boosts Self-supervised Distillation

1 code implementation • ICLR 2022 • Haohang Xu, Jiemin Fang, Xiaopeng Zhang, Lingxi Xie, Xinggang Wang, Wenrui Dai, Hongkai Xiong, Qi Tian

Here, a bag of instances denotes a set of similar samples constructed by the teacher and grouped within a bag, and the goal of distillation is to aggregate compact representations over the student with respect to instances in a bag.

Contrastive Learning Self-Supervised Learning

ATSO: Asynchronous Teacher-Student Optimization for Semi-Supervised Image Segmentation

no code implementations • CVPR 2021 • Xinyue Huo, Lingxi Xie, Jianzhong He, Zijie Yang, Wengang Zhou, Houqiang Li, Qi Tian

Semi-supervised learning is a useful tool for image segmentation, mainly due to its ability to extract knowledge from unlabeled data to assist learning from labeled data.

Continual Learning Image Segmentation +3

Exploring the Diversity and Invariance in Yourself for Visual Pre-Training Task

no code implementations • 1 Jun 2021 • Longhui Wei, Lingxi Xie, Wengang Zhou, Houqiang Li, Qi Tian

By simply pulling the different augmented views of each image together or other novel mechanisms, they can learn much unsupervised knowledge and significantly improve the transfer performance of pre-training models.

Self-Supervised Learning

MSG-Transformer: Exchanging Local Spatial Information by Manipulating Messenger Tokens

3 code implementations • CVPR 2022 • Jiemin Fang, Lingxi Xie, Xinggang Wang, Xiaopeng Zhang, Wenyu Liu, Qi Tian

Transformers have offered a new methodology of designing neural networks for visual recognition.

Image Classification object-detection +1

What Is Considered Complete for Visual Recognition?

no code implementations • 28 May 2021 • Lingxi Xie, Xiaopeng Zhang, Longhui Wei, Jianlong Chang, Qi Tian

This is an opinion paper.

Conformer: Local Features Coupling Global Representations for Visual Recognition

4 code implementations • ICCV 2021 • Zhiliang Peng, Wei Huang, Shanzhi Gu, Lingxi Xie, YaoWei Wang, Jianbin Jiao, Qixiang Ye

Within a Convolutional Neural Network (CNN), the convolution operations are good at extracting local features but have difficulty capturing global representations.

Ranked #322 on Image Classification on ImageNet

Image Classification Instance Segmentation +4

3,137

Visformer: The Vision-friendly Transformer

5 code implementations • ICCV 2021 • Zhengsu Chen, Lingxi Xie, Jianwei Niu, Xuefeng Liu, Longhui Wei, Qi Tian

The past year has witnessed the rapid development of applying the Transformer module to vision problems.

Ranked #507 on Image Classification on ImageNet

Image Classification

29,648

Location-Sensitive Visual Recognition with Cross-IOU Loss

1 code implementation • 11 Apr 2021 • Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian

Object detection, instance segmentation, and pose estimation are popular visual recognition tasks which require localizing the object by internal or boundary landmarks.

Ranked #56 on Object Detection on COCO test-dev

2D Human Pose Estimation Instance Segmentation +5

154

Spatiotemporal Transformer for Video-based Person Re-identification

no code implementations • 30 Mar 2021 • Tianyu Zhang, Longhui Wei, Lingxi Xie, Zijie Zhuang, Yongfei Zhang, Bo Li, Qi Tian

Recently, the Transformer module has been transplanted from natural language processing to computer vision.

Video-Based Person Re-Identification

MagDR: Mask-guided Detection and Reconstruction for Defending Deepfakes

no code implementations • CVPR 2021 • Zhikai Chen, Lingxi Xie, Shanmin Pang, Yong He, Bo Zhang

This paper presents MagDR, a mask-guided detection and reconstruction pipeline for defending deepfakes from adversarial attacks.

Interactive Fusion of Multi-level Features for Compositional Activity Recognition

1 code implementation • 10 Dec 2020 • Rui Yan, Lingxi Xie, Xiangbo Shu, Jinhui Tang

To understand a complex action, multiple sources of information, including appearance, positional, and semantic features, need to be integrated.

Action Recognition

UnrealPerson: An Adaptive Pipeline towards Costless Person Re-identification

1 code implementation • CVPR 2021 • Tianyu Zhang, Lingxi Xie, Longhui Wei, Zijie Zhuang, Yongfei Zhang, Bo Li, Qi Tian

The main difficulty of person re-identification (ReID) lies in collecting annotated data and transferring the model across different domains.

Domain Adaptation Image Generation +1

Seed the Views: Hierarchical Semantic Alignment for Contrastive Representation Learning

no code implementations • 4 Dec 2020 • Haohang Xu, Xiaopeng Zhang, Hao Li, Lingxi Xie, Hongkai Xiong, Qi Tian

In this paper, we propose a hierarchical semantic alignment strategy that expands the views generated by a single image to cross-sample and multi-level representations, and models the invariance to semantically similar images in a hierarchical way.

Contrastive Learning Representation Learning +2

Batch Normalization with Enhanced Linear Transformation

1 code implementation • 28 Nov 2020 • Yuhui Xu, Lingxi Xie, Cihang Xie, Jieru Mei, Siyuan Qiao, Wei Shen, Hongkai Xiong, Alan Yuille

Batch normalization (BN) is a fundamental unit in modern deep networks, in which a linear transformation module was designed for improving BN's flexibility of fitting complex data distributions.

Omni-GAN: On the Secrets of cGANs and Beyond

3 code implementations • ICCV 2021 • Peng Zhou, Lingxi Xie, Bingbing Ni, Cong Geng, Qi Tian

The conditional generative adversarial network (cGAN) is a powerful tool for generating high-quality images, but existing approaches mostly suffer from unsatisfying performance or the risk of mode collapse.

Ranked #8 on Conditional Image Generation on ImageNet 128x128

Conditional Image Generation Generative Adversarial Network

Heterogeneous Contrastive Learning: Encoding Spatial Information for Compact Visual Representations

no code implementations • 19 Nov 2020 • Xinyue Huo, Lingxi Xie, Longhui Wei, Xiaopeng Zhang, Hao Li, Zijie Yang, Wengang Zhou, Houqiang Li, Qi Tian

Contrastive learning has achieved great success in self-supervised visual representation learning, but existing approaches mostly ignored spatial information which is often crucial for visual representation.

Contrastive Learning Data Augmentation +1

Privileged Knowledge Distillation for Online Action Detection

no code implementations • 18 Nov 2020 • Peisen Zhao, Lingxi Xie, Ya Zhang, Yanfeng Wang, Qi Tian

Knowledge distillation is employed to transfer the privileged information from the offline teacher to the online student.

Ranked #11 on Online Action Detection on TVSeries

Knowledge Distillation Online Action Detection

Can Semantic Labels Assist Self-Supervised Visual Representation Learning?

no code implementations • 17 Nov 2020 • Longhui Wei, Lingxi Xie, Jianzhong He, Jianlong Chang, Xiaopeng Zhang, Wengang Zhou, Houqiang Li, Qi Tian

Recently, contrastive learning has largely advanced the progress of unsupervised visual representation learning.

Contrastive Learning Representation Learning +1

One-bit Supervision for Image Classification

1 code implementation • NeurIPS 2020 • Hengtong Hu, Lingxi Xie, Zewei Du, Richang Hong, Qi Tian

Instead of training a model upon the accurate label of each sample, our setting requires the model to query with a predicted label of each sample and learn from the answer whether the guess is correct.

Classification General Classification +1

Reinforced Axial Refinement Network for Monocular 3D Object Detection

no code implementations • ECCV 2020 • Lijie Liu, Chufan Wu, Jiwen Lu, Lingxi Xie, Jie Zhou, Qi Tian

Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.

Ranked #16 on Vehicle Pose Estimation on KITTI Cars Hard

Monocular 3D Object Detection Object +2

Weight-Sharing Neural Architecture Search: A Battle to Shrink the Optimization Gap

no code implementations • 4 Aug 2020 • Lingxi Xie, Xin Chen, Kaifeng Bi, Longhui Wei, Yuhui Xu, Zhengsu Chen, Lanfei Wang, An Xiao, Jianlong Chang, Xiaopeng Zhang, Qi Tian

Neural architecture search (NAS) has attracted increasing attention in both academia and industry.

Neural Architecture Search

Corner Proposal Network for Anchor-free, Two-stage Object Detection

1 code implementation • ECCV 2020 • Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian

On the MS-COCO dataset, CPN achieves an AP of 49.2%, which is competitive among state-of-the-art object detection methods.

Ranked #94 on Object Detection on COCO test-dev

Computational Efficiency Object +3

193

Polar Relative Positional Encoding for Video-Language Segmentation

no code implementations • 20 Jul 2020 • Ke Ning, Lingxi Xie, Fei Wu, Qi Tian

In this paper, we propose a novel Polar Relative Positional Encoding (PRPE) mechanism that represents spatial relations in a "linguistic" way, i.e., in terms of direction and range.

Ranked #11 on Referring Expression Segmentation on J-HMDB

Referring Expression Segmentation Sentence

Social Adaptive Module for Weakly-supervised Group Activity Recognition

no code implementations • ECCV 2020 • Rui Yan, Lingxi Xie, Jinhui Tang, Xiangbo Shu, Qi Tian

This paper presents a new task named weakly-supervised group activity recognition (GAR) which differs from conventional GAR tasks in that only video-level labels are available, yet the important persons within each frame are not provided even in the training data.

Group Activity Recognition

Universal-to-Specific Framework for Complex Action Recognition

no code implementations • 13 Jul 2020 • Peisen Zhao, Lingxi Xie, Ya Zhang, Qi Tian

The U2S framework is composed of three subnetworks: a universal network, a category-specific network, and a mask network.

Action Recognition Decision Making

Discretization-Aware Architecture Search

1 code implementation • 7 Jul 2020 • Yunjie Tian, Chang Liu, Lingxi Xie, Jianbin Jiao, Qixiang Ye

The search cost of neural architecture search (NAS) has been largely reduced by weight-sharing methods.

Image Classification Neural Architecture Search

GOLD-NAS: Gradual, One-Level, Differentiable

1 code implementation • 7 Jul 2020 • Kaifeng Bi, Lingxi Xie, Xin Chen, Longhui Wei, Qi Tian

There is a large body of literature on neural architecture search, but most existing work has used heuristic rules that largely constrain the search flexibility.

Image Classification Neural Architecture Search

Searching towards Class-Aware Generators for Conditional Generative Adversarial Networks

1 code implementation • 25 Jun 2020 • Peng Zhou, Lingxi Xie, Xiaopeng Zhang, Bingbing Ni, Qi Tian

To learn the sampling policy, a Markov decision process is embedded into the search algorithm and a moving average is applied for better stability.

Image Generation

ATSO: Asynchronous Teacher-Student Optimization for Semi-Supervised Medical Image Segmentation

no code implementations • 24 Jun 2020 • Xinyue Huo, Lingxi Xie, Jianzhong He, Zijie Yang, Qi Tian

This paper focuses on a popular pipeline known as self learning, and points out a weakness named lazy learning that refers to the difficulty for a model to learn from the pseudo labels generated by itself.

Autonomous Driving Image Segmentation +4

Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks

no code implementations • 17 Apr 2020 • Xin Chen, Lingxi Xie, Jun Wu, Longhui Wei, Yuhui Xu, Qi Tian

We alleviate this issue by training a graph convolutional network to fit the performance of sampled sub-networks so that the impact of random errors becomes minimal.

Neural Architecture Search

Unsupervised Person Re-identification via Softened Similarity Learning

1 code implementation • CVPR 2020 • Yutian Lin, Lingxi Xie, Yu Wu, Chenggang Yan, Qi Tian

Person re-identification (re-ID) is an important topic in computer vision.

Clustering General Classification +2

Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio

1 code implementation • CVPR 2020 • Zhengsu Chen, Jianwei Niu, Lingxi Xie, Xuefeng Liu, Longhui Wei, Qi Tian

Automatically designing computationally efficient neural networks has received much attention in recent years.

Image Classification Network Pruning

Creating Something from Nothing: Unsupervised Knowledge Distillation for Cross-Modal Hashing

1 code implementation • CVPR 2020 • Hengtong Hu, Lingxi Xie, Richang Hong, Qi Tian

In recent years, cross-modal hashing (CMH) has attracted increasing attention, mainly because of its potential ability to map contents from different modalities, especially vision and language, into the same space, so that cross-modal data retrieval becomes efficient.

Knowledge Distillation Retrieval

Circumventing Outliers of AutoAugment with Knowledge Distillation

1 code implementation • ECCV 2020 • Longhui Wei, An Xiao, Lingxi Xie, Xin Chen, Xiaopeng Zhang, Qi Tian

AutoAugment has been a powerful algorithm that improves the accuracy of many vision tasks, yet it is sensitive to the operator space as well as hyper-parameters, and an improper setting may degenerate network optimization.

Ranked #185 on Image Classification on ImageNet

Data Augmentation General Classification +2

Paper
Code

Bottom-Up Temporal Action Localization with Mutual Regularization

1 code implementation • ECCV 2020 • Peisen Zhao, Lingxi Xie, Chen Ju, Ya Zhang, Yan-Feng Wang, Qi Tian

To alleviate this problem, we introduce two regularization terms to mutually regularize the learning procedure: the Intra-phase Consistency (IntraC) regularization is proposed to make the predictions verified inside each phase; and the Inter-phase Consistency (InterC) regularization is proposed to keep consistency between these phases.

Temporal Action Localization

Paper
Code

Rethinking the Distribution Gap of Person Re-identification with Camera-based Batch Normalization

1 code implementation • ECCV 2020 • Zijie Zhuang, Longhui Wei, Lingxi Xie, Tianyu Zhang, Hengheng Zhang, Haozhe Wu, Haizhou Ai, Qi Tian

The fundamental difficulty in person re-identification (ReID) lies in learning the correspondence among individual cameras.

Ranked #16 on Unsupervised Domain Adaptation on Duke to Market

Direct Transfer Person Re-identification Domain Adaptive Person Re-Identification +2

104

Paper
Code

Latency-Aware Differentiable Neural Architecture Search

1 code implementation • 17 Jan 2020 • Yuhui Xu, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Bowen Shi, Qi Tian, Hongkai Xiong

However, these methods have difficulty optimizing the network, so the searched network is often unfriendly to hardware.

Neural Architecture Search

Paper
Code

Wasserstein-Bounded Generative Adversarial Networks

no code implementations • ICLR 2020 • Peng Zhou, Bingbing Ni, Lingxi Xie, Xiaopeng Zhang, Hang Wang, Cong Geng, Qi Tian

In the field of Generative Adversarial Networks (GANs), how to design a stable training strategy remains an open problem.

Paper
Add Code

Scalable NAS with Factorizable Architectural Parameters

no code implementations • 31 Dec 2019 • Lanfei Wang, Lingxi Xie, Tianyi Zhang, Jun Guo, Qi Tian

Neural Architecture Search (NAS) is an emerging topic in machine learning and computer vision.

Image Classification Neural Architecture Search

Paper
Add Code

Progressive DARTS: Bridging the Optimization Gap for NAS in the Wild

4 code implementations • 23 Dec 2019 • Xin Chen, Lingxi Xie, Jun Wu, Qi Tian

With the rapid development of neural architecture search (NAS), researchers found powerful network architectures for a wide range of vision tasks.

Neural Architecture Search

360

Paper
Code

Appending Adversarial Frames for Universal Video Attack

no code implementations • 10 Dec 2019 • Zhikai Chen, Lingxi Xie, Shanmin Pang, Yong He, Qi Tian

There have been many efforts in attacking image classification models with adversarial perturbations, but the same topic on video classification has not yet been thoroughly studied.

Classification General Classification +2

Paper
Add Code

Stabilizing DARTS with Amended Gradient Estimation on Architectural Parameters

1 code implementation • 25 Oct 2019 • Kaifeng Bi, Changping Hu, Lingxi Xie, Xin Chen, Longhui Wei, Qi Tian

Our approach bridges the gap from two aspects, namely, amending the estimation on the architectural gradients, and unifying the hyper-parameter settings in the search and re-training stages.

Neural Architecture Search

Paper
Code

Fast Non-Local Neural Networks with Spectral Residual Learning

1 code implementation • MM '19: Proceedings of the 27th ACM International Conference on Multimedia 2019 • Lu Chi, Guiyu Tian, Yadong Mu, Lingxi Xie, Qi Tian

We show its equivalence to conducting residual learning in some spectral domain and carefully re-formulate a variety of neural layers into their spectral forms, such as ReLU or convolutions.

Pose Estimation Video Classification

Paper
Code

Pruning from Scratch

1 code implementation • 27 Sep 2019 • Yulong Wang, Xiaolu Zhang, Lingxi Xie, Jun Zhou, Hang Su, Bo Zhang, Xiaolin Hu

Network pruning is an important research field aiming at reducing computational costs of neural networks.

Network Pruning

Paper
Code

Single Camera Training for Person Re-identification

1 code implementation • 24 Sep 2019 • Tianyu Zhang, Lingxi Xie, Longhui Wei, Yongfei Zhang, Bo Li, Qi Tian

Differently, this paper investigates ReID in an unexplored single-camera-training (SCT) setting, where each person in the training set appears in only one camera.

Metric Learning Person Re-Identification

Paper
Code

Data Augmentation Revisited: Rethinking the Distribution Gap between Clean and Augmented Data

no code implementations • 19 Sep 2019 • Zhuoxun He, Lingxi Xie, Xin Chen, Ya Zhang, Yan-Feng Wang, Qi Tian

Data augmentation has been widely applied as an effective methodology to improve generalization, in particular when training deep neural networks.

Data Augmentation Image Classification +2

Paper
Add Code

PC-DARTS: Partial Channel Connections for Memory-Efficient Architecture Search

8 code implementations • ICLR 2020 • Yuhui Xu, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Guo-Jun Qi, Qi Tian, Hongkai Xiong

Differentiable architecture search (DARTS) provided a fast solution in finding effective network architectures, but suffered from large memory and computing overheads in jointly training a super-network and searching for an optimal architecture.

Ranked #20 on Neural Architecture Search on CIFAR-10

Neural Architecture Search

429

Paper
Code

Defending Adversarial Attacks by Correcting logits

no code implementations • 26 Jun 2019 • Yifeng Li, Lingxi Xie, Ya Zhang, Rui Zhang, Yanfeng Wang, Qi Tian

Generating and eliminating adversarial examples has been an intriguing topic in the field of deep learning.

Paper
Add Code

Progressive Differentiable Architecture Search: Bridging the Depth Gap between Search and Evaluation

4 code implementations • ICCV 2019 • Xin Chen, Lingxi Xie, Jun Wu, Qi Tian

Recently, differentiable search methods have made major progress in reducing the computational costs of neural architecture search.

Neural Architecture Search

360

Paper
Code

CenterNet: Keypoint Triplets for Object Detection

20 code implementations • ICCV 2019 • Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang, Qi Tian

In object detection, keypoint-based approaches often suffer from a large number of incorrect object bounding boxes, arguably due to the lack of an additional look into the cropped regions.

Ranked #116 on Object Detection on COCO test-dev

Object object-detection +1

1,851

Paper
Code

Semantic-Aware Knowledge Preservation for Zero-Shot Sketch-Based Image Retrieval

1 code implementation • ICCV 2019 • Qing Liu, Lingxi Xie, Huiyu Wang, Alan Yuille

Sketch-based image retrieval (SBIR) is widely recognized as an important vision problem which implies a wide range of real-world applications.

Domain Adaptation Retrieval +2

Paper
Code

Thickened 2D Networks for Efficient 3D Medical Image Segmentation

no code implementations • 2 Apr 2019 • Qihang Yu, Yingda Xia, Lingxi Xie, Elliot K. Fishman, Alan L. Yuille

With this design, we achieve a higher performance while maintaining a lower inference latency on a few abdominal organs from CT scans, in particular when the organ has a peculiar 3D shape and thus strongly requires contextual information, demonstrating our method's effectiveness and ability in capturing 3D information.

Image Segmentation Medical Image Segmentation +2

Paper
Add Code

SIXray: A Large-scale Security Inspection X-ray Benchmark for Prohibited Item Discovery in Overlapping Images

1 code implementation • 2 Jan 2019 • Caijing Miao, Lingxi Xie, Fang Wan, Chi Su, Hongye Liu, Jianbin Jiao, Qixiang Ye

In particular, the advantage of CHR is more significant in the scenarios with fewer positive training samples, which demonstrates its potential application in real-world security inspection.

Object Localization

116

Paper
Code

Identity-Enhanced Network for Facial Expression Recognition

no code implementations • 11 Dec 2018 • Yanwei Li, Xingang Wang, Shilei Zhang, Lingxi Xie, Wenqi Wu, Hongyuan Yu, Zheng Zhu

Facial expression recognition is a challenging task, arguably because of large intra-class variations and high inter-class similarities.

Facial Expression Recognition Facial Expression Recognition (FER) +1

Paper
Add Code

Attention-guided Unified Network for Panoptic Segmentation

no code implementations • CVPR 2019 • Yanwei Li, Xinze Chen, Zheng Zhu, Lingxi Xie, Guan Huang, Dalong Du, Xingang Wang

This paper studies panoptic segmentation, a recently proposed task which segments foreground (FG) objects at the instance level as well as background (BG) contents at the semantic level.

Ranked #24 on Panoptic Segmentation on COCO test-dev

Panoptic Segmentation Segmentation

Paper
Add Code

Elastic Boundary Projection for 3D Medical Image Segmentation

2 code implementations • CVPR 2019 • Tianwei Ni, Lingxi Xie, Huangjie Zheng, Elliot K. Fishman, Alan L. Yuille

The key observation is that, although the object is a 3D volume, what we really need in segmentation is to find its boundary which is a 2D surface.

3D Medical Imaging Segmentation Image Segmentation +3

Paper
Code

CRAVES: Controlling Robotic Arm with a Vision-based Economic System

1 code implementation • CVPR 2019 • Yiming Zuo, Weichao Qiu, Lingxi Xie, Fangwei Zhong, Yizhou Wang, Alan L. Yuille

We also construct a vision-based control system for task accomplishment, for which we train a reinforcement learning agent in a virtual environment and apply it to the real world.

3D Pose Estimation Domain Adaptation

262

Paper
Code

Iterative Reorganization with Weak Spatial Constraints: Solving Arbitrary Jigsaw Puzzles for Unsupervised Representation Learning

1 code implementation • CVPR 2019 • Chen Wei, Lingxi Xie, Xutong Ren, Yingda Xia, Chi Su, Jiaying Liu, Qi Tian, Alan L. Yuille

We consider spatial contexts, for which we solve so-called jigsaw puzzles, i.e., each image is cut into grids and then disordered, and the goal is to recover the correct configuration.

General Classification Image Classification +4

Paper
Code

Snapshot Distillation: Teacher-Student Optimization in One Generation

no code implementations • CVPR 2019 • Chenglin Yang, Lingxi Xie, Chi Su, Alan L. Yuille

Optimizing a deep neural network is a fundamental task in computer vision, yet direct training methods often suffer from over-fitting.

Image Classification object-detection +2

Paper
Add Code

Generalized Coarse-to-Fine Visual Recognition with Progressive Training

no code implementations • 29 Nov 2018 • Xutong Ren, Lingxi Xie, Chen Wei, Siyuan Qiao, Chi Su, Jiaying Liu, Qi Tian, Elliot K. Fishman, Alan L. Yuille

Computer vision is difficult, partly because the desired mathematical function connecting input and output data is often complex, fuzzy and thus hard to learn.

Image Classification Object Localization +1

Paper
Add Code

Phase Collaborative Network for Two-Phase Medical Image Segmentation

no code implementations • 28 Nov 2018 • Huangjie Zheng, Lingxi Xie, Tianwei Ni, Ya Zhang, Yan-Feng Wang, Qi Tian, Elliot K. Fishman, Alan L. Yuille

However, in medical image analysis, fusing predictions from two phases is often difficult, because (i) there is a domain gap between the two phases, and (ii) the semantic labels do not correspond pixel-wise, even for images scanned from the same patient.

Image Segmentation Medical Image Segmentation +3

Paper
Add Code

Semantic Part Detection via Matching: Learning to Generalize to Novel Viewpoints from Limited Training Data

1 code implementation • ICCV 2019 • Yutong Bai, Qing Liu, Lingxi Xie, Weichao Qiu, Yan Zheng, Alan Yuille

In particular, this enables images in the training dataset to be matched to a virtual 3D model of the object (for simplicity, we assume that the object viewpoint can be estimated by standard techniques).

Clustering Object +1

Paper
Code

Accelerating Deep Neural Networks with Spatial Bottleneck Modules

no code implementations • 7 Sep 2018 • Junran Peng, Lingxi Xie, Zhao-Xiang Zhang, Tieniu Tan, Jingdong Wang

This paper presents an efficient module named spatial bottleneck for accelerating the convolutional layers in deep neural networks.

Paper
Add Code

Infinite Curriculum Learning for Efficiently Detecting Gastric Ulcers in WCE Images

no code implementations • 7 Sep 2018 • Xiaolu Zhang, Shiwan Zhao, Lingxi Xie

This paper considers WCE-based gastric ulcer detection, in which the major challenge is to detect the lesions in a local region.

Binary Classification

Paper
Add Code

Attention-based Pyramid Aggregation Network for Visual Place Recognition

no code implementations • 1 Aug 2018 • Yingying Zhu, Jiong Wang, Lingxi Xie, Liang Zheng

Visual place recognition is challenging in the urban environment and is usually viewed as a large scale image retrieval task.

Image Retrieval Retrieval +1

Paper
Add Code

Multi-Scale Coarse-to-Fine Segmentation for Screening Pancreatic Ductal Adenocarcinoma

no code implementations • 9 Jul 2018 • Zhuotun Zhu, Yingda Xia, Lingxi Xie, Elliot K. Fishman, Alan L. Yuille

We propose an intuitive approach of detecting pancreatic ductal adenocarcinoma (PDAC), the most common type of pancreatic cancer, by checking abdominal CT scans.

General Classification Segmentation +1

Paper
Add Code

G2C: A Generator-to-Classifier Framework Integrating Multi-Stained Visual Cues for Pathological Glomerulus Classification

no code implementations • 30 Jun 2018 • Bingzhe Wu, Xiaolu Zhang, Shiwan Zhao, Lingxi Xie, Caihong Zeng, Zhihong Liu, Guangyu Sun

Given an input image from a specified stain, several generators are first applied to estimate its appearances in other staining methods, and a classifier follows to combine visual cues from different stains for prediction (whether it is pathological, or which type of pathology it has).

Classification Decision Making +2

Paper
Add Code

Knowledge Distillation in Generations: More Tolerant Teachers Educate Better Students

no code implementations • 15 May 2018 • Chenglin Yang, Lingxi Xie, Siyuan Qiao, Alan Yuille

We focus on the problem of training a deep neural network in generations.

General Classification Image Classification +1

Paper
Add Code

Joint Shape Representation and Classification for Detecting PDAC

no code implementations • 27 Apr 2018 • Fengze Liu, Lingxi Xie, Yingda Xia, Elliot K. Fishman, Alan L. Yuille

Shape representation and classification are performed in a joint manner, both to exploit the knowledge that PDAC often changes the shape of the pancreas and to prevent over-fitting.

Classification General Classification +1

Paper
Add Code

Multi-Scale Spatially-Asymmetric Recalibration for Image Classification

no code implementations • ECCV 2018 • Yan Wang, Lingxi Xie, Siyuan Qiao, Ya Zhang, Wenjun Zhang, Alan L. Yuille

Convolution is spatially-symmetric, i.e., the visual features are independent of their position in the image, which limits its ability to utilize contextual cues for visual recognition.

Classification General Classification +2

Paper
Add Code

Bridging the Gap Between 2D and 3D Organ Segmentation with Volumetric Fusion Net

no code implementations • 2 Apr 2018 • Yingda Xia, Lingxi Xie, Fengze Liu, Zhuotun Zhu, Elliot K. Fishman, Alan L. Yuille

There has been a debate on whether to use 2D or 3D deep neural networks for volumetric organ segmentation.

Organ Segmentation Segmentation

Paper
Add Code

SampleAhead: Online Classifier-Sampler Communication for Learning from Synthesized Data

no code implementations • 1 Apr 2018 • Qi Chen, Weichao Qiu, Yi Zhang, Lingxi Xie, Alan Yuille

But this raises an important problem in active vision: given an infinite data space, how to effectively sample a finite subset to train a visual classifier?

Classification General Classification

Paper
Add Code

Adversarial Attacks Beyond the Image Space

no code implementations • CVPR 2019 • Xiaohui Zeng, Chenxi Liu, Yu-Siang Wang, Weichao Qiu, Lingxi Xie, Yu-Wing Tai, Chi Keung Tang, Alan L. Yuille

Though image-space adversaries can be interpreted as per-pixel albedo change, we verify that they cannot be well explained along these physically meaningful dimensions, which often have a non-local effect.

Question Answering Visual Question Answering

Paper
Add Code

Visual Concepts and Compositional Voting

no code implementations • 13 Nov 2017 • Jianyu Wang, Zhishuai Zhang, Cihang Xie, Yuyin Zhou, Vittal Premachandran, Jun Zhu, Lingxi Xie, Alan Yuille

We use clustering algorithms to study the population activities of the features and extract a set of visual concepts which we show are visually tight and correspond to semantic parts of vehicles.

Clustering Semantic Part Detection

Paper
Add Code

DeepVoting: A Robust and Explainable Deep Network for Semantic Part Detection under Partial Occlusion

no code implementations • CVPR 2018 • Zhishuai Zhang, Cihang Xie, Jian-Yu Wang, Lingxi Xie, Alan L. Yuille

The first layer extracts the evidence of local visual cues, and the second layer performs a voting mechanism by utilizing the spatial relationship between visual cues and semantic parts.

Semantic Part Detection

Paper
Add Code

Recurrent Saliency Transformation Network: Incorporating Multi-Stage Visual Cues for Small Organ Segmentation

2 code implementations • CVPR 2018 • Qihang Yu, Lingxi Xie, Yan Wang, Yuyin Zhou, Elliot K. Fishman, Alan L. Yuille

The key innovation is a saliency transformation module, which repeatedly converts the segmentation probability map from the previous iteration into spatial weights and applies these weights to the current iteration.

Ranked #1 on Pancreas Segmentation on TCIA Pancreas-CT Dataset

Organ Segmentation Pancreas Segmentation +1

105

Paper
Code

Detecting Semantic Parts on Partially Occluded Objects

no code implementations • 25 Jul 2017 • Jianyu Wang, Cihang Xie, Zhishuai Zhang, Jun Zhu, Lingxi Xie, Alan Yuille

Our approach detects semantic parts by accumulating the confidence of local visual cues.

Clustering Semantic Part Detection

Paper
Add Code

Deep Supervision for Pancreatic Cyst Segmentation in Abdominal CT Scans

no code implementations • 22 Jun 2017 • Yuyin Zhou, Lingxi Xie, Elliot K. Fishman, Alan L. Yuille

Inspired by the high relevance between the location of a pancreas and its cystic region, we introduce extra deep supervision into the segmentation network, so that cyst segmentation can be improved with the help of relatively easier pancreas segmentation.

Pancreas Segmentation Segmentation

Paper
Add Code

Adversarial Examples for Semantic Segmentation and Object Detection

2 code implementations • ICCV 2017 • Cihang Xie, Jian-Yu Wang, Zhishuai Zhang, Yuyin Zhou, Lingxi Xie, Alan Yuille

Our observation is that both segmentation and detection are based on classifying multiple targets on an image (e.g., the basic target is a pixel or a receptive field in segmentation, and an object proposal in detection), which inspires us to optimize a loss function over a set of pixels/proposals for generating adversarial perturbations.

Adversarial Attack Object +4

122

Paper
Code

SORT: Second-Order Response Transform for Visual Recognition

no code implementations • ICCV 2017 • Yan Wang, Lingxi Xie, Chenxi Liu, Ya Zhang, Wenjun Zhang, Alan Yuille

In this paper, we reveal the importance and benefits of introducing second-order operations into deep neural networks.

Paper
Add Code

Genetic CNN

1 code implementation • ICCV 2017 • Lingxi Xie, Alan Yuille

The deep Convolutional Neural Network (CNN) is the state-of-the-art solution for large-scale visual recognition.

Object Recognition

Paper
Code

Deep Collaborative Learning for Visual Recognition

no code implementations • 3 Mar 2017 • Yan Wang, Lingxi Xie, Ya Zhang, Wenjun Zhang, Alan Yuille

We formulate the function of a convolutional layer as learning a large visual vocabulary, and propose an alternative way, namely Deep Collaborative Learning (DCL), to reduce the computational complexity.

General Classification Image Classification

Paper
Add Code

A Fixed-Point Model for Pancreas Segmentation in Abdominal CT Scans

3 code implementations • 25 Dec 2016 • Yuyin Zhou, Lingxi Xie, Wei Shen, Yan Wang, Elliot K. Fishman, Alan L. Yuille

Deep neural networks have been widely adopted for automatic organ segmentation from abdominal CT scans.

Organ Segmentation Pancreas Segmentation +1

105

Paper
Code

Object Recognition with and without Objects

1 code implementation • 20 Nov 2016 • Zhuotun Zhu, Lingxi Xie, Alan L. Yuille

While recent deep neural networks have achieved a promising performance on object recognition, they rely implicitly on the visual contents of the whole image.

Object Object Recognition

Paper
Code

Geometric Neural Phrase Pooling: Modeling the Spatial Co-occurrence of Neurons

no code implementations • 21 Jul 2016 • Lingxi Xie, Qi Tian, John Flynn, Jingdong Wang, Alan Yuille

For this, we consider the neurons in the hidden layer as neural words, and construct a set of geometric neural phrases on top of them.

Image Classification

Paper
Add Code

InterActive: Inter-Layer Activeness Propagation

no code implementations • CVPR 2016 • Lingxi Xie, Liang Zheng, Jingdong Wang, Alan Yuille, Qi Tian

An increasing number of computer vision tasks can be tackled with deep features, which are the intermediate outputs of a pre-trained Convolutional Neural Network.

Descriptive General Classification

Paper
Add Code

DisturbLabel: Regularizing CNN on the Loss Layer

2 code implementations • CVPR 2016 • Lingxi Xie, Jingdong Wang, Zhen Wei, Meng Wang, Qi Tian

For a long time, we have been combating over-fitting in the CNN training process with model regularization, including weight decay, model averaging, data augmentation, etc.

Data Augmentation

Paper
Code

RIDE: Reversal Invariant Descriptor Enhancement

no code implementations • ICCV 2015 • Lingxi Xie, Jingdong Wang, Weiyao Lin, Bo Zhang, Qi Tian

In many fine-grained object recognition datasets, image orientation (left/right) might vary from sample to sample.

Object Recognition

Paper
Add Code

Fidelity-Naturalness Evaluation of Single Image Super Resolution

no code implementations • 21 Nov 2015 • Xuan Dong, Yu Zhu, Weixin Li, Lingxi Xie, Alex Wong, Alan Yuille

In this paper, we propose to use both fidelity (the difference from the original images) and naturalness (human visual perception of super-resolved images) for evaluation.

Image Quality Assessment Image Super-Resolution

Paper
Add Code

Orientational Pyramid Matching for Recognizing Indoor Scenes

no code implementations • CVPR 2014 • Lingxi Xie, Jingdong Wang, Baining Guo, Bo Zhang, Qi Tian

The novelty is that OPM uses the 3D orientations to form the pyramid and produce the pooling regions, unlike SPM, which uses the spatial positions to form the pyramid.

General Classification Scene Classification +1

Paper
Add Code
