Search Results for author: Alan Yuille

Found 244 papers, 127 papers with code

Learning a Category-level Object Pose Estimator without Pose Annotations

no code implementations • 8 Apr 2024 • Fengrui Tian, Yaoyao Liu, Adam Kortylewski, Yueqi Duan, Shaoyi Du, Alan Yuille, Angtian Wang

Instead of using manually annotated images, we leverage diffusion models (e.g., Zero-1-to-3) to generate a set of images under controlled pose differences and propose to learn our object pose estimator with those images.

Object Pose Estimation

ViTamin: Designing Scalable Vision Models in the Vision-Language Era

1 code implementation • 2 Apr 2024 • Jieneng Chen, Qihang Yu, Xiaohui Shen, Alan Yuille, Liang-Chieh Chen

To this end, we introduce ViTamin, a new vision model tailored for VLMs.

Exploiting Structural Consistency of Chest Anatomy for Unsupervised Anomaly Detection in Radiography Images

1 code implementation • 13 Mar 2024 • Tiange Xiang, Yixiao Zhang, Yongyi Lu, Alan Yuille, Chaoyi Zhang, Weidong Cai, Zongwei Zhou

To this end, we propose a Simple Space-Aware Memory Matrix for In-painting and Detecting anomalies from radiography images (abbreviated as SimSID).

Anatomy Image Reconstruction +1

A Bayesian Approach to OOD Robustness in Image Classification

no code implementations • 12 Mar 2024 • Prakhar Kaushik, Adam Kortylewski, Alan Yuille

This enables us to learn a transitional dictionary of vMF kernels that are intermediate between the source and target domains and train the generative model on this dictionary using the annotations on the source domain, followed by iterative refinement.

Image Classification

From Pixel to Cancer: Cellular Automata in Computed Tomography

1 code implementation • 11 Mar 2024 • Yuxiang Lai, Xiaoxi Chen, Angtian Wang, Alan Yuille, Zongwei Zhou

AI for cancer detection encounters the bottleneck of data scarcity, annotation difficulty, and low prevalence of early tumors.

Computed Tomography (CT)

Radiative Gaussian Splatting for Efficient X-ray Novel View Synthesis

1 code implementation • 7 Mar 2024 • Yuanhao Cai, Yixun Liang, Jiahao Wang, Angtian Wang, Yulun Zhang, Xiaokang Yang, Zongwei Zhou, Alan Yuille

X-ray is widely applied for transmission imaging due to its stronger penetration than natural light.

Novel View Synthesis

Towards Generalizable Tumor Synthesis

1 code implementation • 29 Feb 2024 • Qi Chen, Xiaoxi Chen, Haorui Song, Zhiwei Xiong, Alan Yuille, Chen Wei, Zongwei Zhou

Tumor synthesis enables the creation of artificial tumors in medical images, facilitating the training of AI models for tumor detection and segmentation.

Computed Tomography (CT)

Leveraging AI Predicted and Expert Revised Annotations in Interactive Segmentation: Continual Tuning or Full Training?

1 code implementation • 29 Feb 2024 • Tiezheng Zhang, Xiaoxi Chen, Chongyu Qu, Alan Yuille, Zongwei Zhou

Human experts revise the annotations predicted by AI, and in turn, AI improves its predictions by learning from these revised annotations.

Interactive Segmentation

PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter

no code implementations • 16 Feb 2024 • Junfei Xiao, Zheng Xu, Alan Yuille, Shen Yan, Boyu Wang

Our research undertakes a thorough exploration of the state-of-the-art perceiver resampler architecture and builds a strong baseline.

Language Modelling Question Answering +1

Source-Free and Image-Only Unsupervised Domain Adaptation for Category Level Object Pose Estimation

no code implementations • 19 Jan 2024 • Prakhar Kaushik, Aayush Mishra, Adam Kortylewski, Alan Yuille

We focus on individual locally robust mesh vertex features and iteratively update them based on their proximity to corresponding features in the target domain even when the global pose is not correct.

Ranked #1 on Unsupervised Domain Adaptation on OOD-CV

Pose Estimation Unsupervised Domain Adaptation

SPFormer: Enhancing Vision Transformer with Superpixel Representation

no code implementations • 5 Jan 2024 • Jieru Mei, Liang-Chieh Chen, Alan Yuille, Cihang Xie

In this work, we introduce SPFormer, a novel Vision Transformer enhanced by superpixel representation.

Superpixels

HISR: Hybrid Implicit Surface Representation for Photorealistic 3D Human Reconstruction

no code implementations • 28 Dec 2023 • Angtian Wang, Yuanlu Xu, Nikolaos Sarafianos, Robert Maier, Edmond Boyer, Alan Yuille, Tony Tung

This representation is composed of two surface layers that represent opaque and translucent regions on the clothed human body.

3D Human Reconstruction

A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties

1 code implementation • 21 Dec 2023 • Junfei Xiao, Ziqi Zhou, Wenxuan Li, Shiyi Lan, Jieru Mei, Zhiding Yu, Alan Yuille, Yuyin Zhou, Cihang Xie

Instead of relying solely on category-specific annotations, ProLab uses descriptive properties grounded in common sense knowledge for supervising segmentation models.

Common Sense Reasoning Descriptive +1

Continual Adversarial Defense

no code implementations • 15 Dec 2023 • Qian Wang, Yaoyao Liu, Hefei Ling, Yingwei Li, Qihao Liu, Ping Li, Jiazhong Chen, Alan Yuille, Ning Yu

In response to the rapidly evolving nature of adversarial attacks against visual classifiers on a monthly basis, numerous defenses have been proposed to generalize against as many known attacks as possible.

Adversarial Defense Continual Learning +2

Causal-CoG: A Causal-Effect Look at Context Generation for Boosting Multi-modal Language Models

no code implementations • 9 Dec 2023 • Shitian Zhao, Zhuowan Li, Yadong Lu, Alan Yuille, Yan Wang

We propose Causal Context Generation, Causal-CoG, which is a prompting strategy that engages contextual information to enhance precise VQA during inference.

Question Answering Visual Question Answering

Rejuvenating image-GPT as Strong Visual Representation Learners

1 code implementation • 4 Dec 2023 • Sucheng Ren, Zeyu Wang, Hongru Zhu, Junfei Xiao, Alan Yuille, Cihang Xie

This paper enhances image-GPT (iGPT), one of the pioneering works that introduce autoregressive pretraining to predict next pixels for visual representation learning.

Representation Learning

SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference

1 code implementation • 4 Dec 2023 • Feng Wang, Jieru Mei, Alan Yuille

Specifically, we replace the traditional self-attention block of CLIP vision encoder's last layer by our CSA module and reuse its pretrained projection matrices of query, key, and value, leading to a training-free adaptation approach for CLIP's zero-shot semantic segmentation.

Segmentation Semantic Segmentation +2

Sequential Modeling Enables Scalable Learning for Large Vision Models

1 code implementation • 1 Dec 2023 • Yutong Bai, Xinyang Geng, Karttikeya Mangalam, Amir Bar, Alan Yuille, Trevor Darrell, Jitendra Malik, Alexei A Efros

We introduce a novel sequential modeling approach which enables learning a Large Vision Model (LVM) without making use of any linguistic data.

Prompt-Based Exemplar Super-Compression and Regeneration for Class-Incremental Learning

1 code implementation • 30 Nov 2023 • Ruxiao Duan, Yaoyao Liu, Jieneng Chen, Adam Kortylewski, Alan Yuille

Replay-based methods in class-incremental learning (CIL) have attained remarkable success, as replaying the exemplars of old classes can significantly mitigate catastrophic forgetting.

Class Incremental Learning Data Augmentation +1

MaXTron: Mask Transformer with Trajectory Attention for Video Panoptic Segmentation

1 code implementation • 30 Nov 2023 • Ju He, Qihang Yu, Inkyu Shin, Xueqing Deng, Xiaohui Shen, Alan Yuille, Liang-Chieh Chen

To alleviate the issue, we propose to adapt the trajectory attention for both the dense pixel features and object queries, aiming to improve the short-term and long-term tracking results, respectively.

Ranked #1 on Video Panoptic Segmentation on VIPSeg

Object Video Classification +3

Learning Part Segmentation from Synthetic Animals

no code implementations • 30 Nov 2023 • Jiawei Peng, Ju He, Prakhar Kaushik, Zihao Xiao, Jiteng Mu, Alan Yuille

We then benchmark Syn-to-Real animal part segmentation from SAP to PartImageNet, namely SynRealPart, with existing semantic segmentation domain adaptation methods and further improve them as our second contribution.

Domain Adaptation Pseudo Label +2

Instruct2Attack: Language-Guided Semantic Adversarial Attacks

no code implementations • 27 Nov 2023 • Jiang Liu, Chen Wei, Yuxiang Guo, Heng Yu, Alan Yuille, Soheil Feizi, Chun Pong Lau, Rama Chellappa

We propose Instruct2Attack (I2A), a language-guided semantic attack that generates semantically meaningful perturbations according to free-form language instructions.

IG Captioner: Information Gain Captioners are Strong Zero-shot Classifiers

no code implementations • 27 Nov 2023 • Chenglin Yang, Siyuan Qiao, Yuan Cao, Yu Zhang, Tao Zhu, Alan Yuille, Jiahui Yu

To tackle this problem, we redesign the scoring objective for the captioner to alleviate the distributional bias and focus on measuring the gain of information brought by the visual inputs.

Caption Generation Language Modelling +2

Structure-Aware Sparse-View X-ray 3D Reconstruction

1 code implementation • 18 Nov 2023 • Yuanhao Cai, Jiahao Wang, Alan Yuille, Zongwei Zhou, Angtian Wang

In this paper, we propose a framework, Structure-Aware X-ray Neural Radiodensity Fields (SAX-NeRF), for sparse-view X-ray 3D reconstruction.

Ranked #1 on Low-Dose X-Ray CT Reconstruction on X3D

3D Reconstruction Low-Dose X-Ray CT Reconstruction +1

De-Diffusion Makes Text a Strong Cross-Modal Interface

1 code implementation • 1 Nov 2023 • Chen Wei, Chenxi Liu, Siyuan Qiao, Zhishuai Zhang, Alan Yuille, Jiahui Yu

We demonstrate text as a strong cross-modal interface.

3D-Aware Visual Question Answering about Parts, Poses and Occlusions

2 code implementations • NeurIPS 2023 • Xingrui Wang, Wufei Ma, Zhuowan Li, Adam Kortylewski, Alan Yuille

In this work, we introduce the task of 3D-aware VQA, which focuses on challenging questions that require a compositional reasoning over the 3D structure of visual scenes.

Question Answering Visual Question Answering

Synthetic Data as Validation

no code implementations • 24 Oct 2023 • Qixin Hu, Alan Yuille, Zongwei Zhou

Specifically, the DSC score for liver tumor segmentation improves from 26.7% (95% CI: 22.6%-30.9%) to 34.5% (30.8%-38.2%) when evaluated on an in-domain dataset and from 31.1% (26.0%-36.2%) to 35.4% (32.1%-38.7%) on an out-domain dataset.

Computed Tomography (CT) Continual Learning +1

Acquiring Weak Annotations for Tumor Localization in Temporal and Volumetric Data

1 code implementation • 23 Oct 2023 • Yu-Cheng Chou, Bowen Li, Deng-Ping Fan, Alan Yuille, Zongwei Zhou

In summary, this research proposes an efficient annotation strategy for tumor detection and localization that is less accurate than per-pixel annotations but useful for creating large-scale datasets for screening tumors in various medical modalities.

Weakly-supervised Learning

3D TransUNet: Advancing Medical Image Segmentation through Vision Transformers

2 code implementations • 11 Oct 2023 • Jieneng Chen, Jieru Mei, Xianhang Li, Yongyi Lu, Qihang Yu, Qingyue Wei, Xiangde Luo, Yutong Xie, Ehsan Adeli, Yan Wang, Matthew Lungren, Lei Xing, Le Lu, Alan Yuille, Yuyin Zhou

In this paper, we extend the 2D TransUNet architecture to a 3D network by building upon the state-of-the-art nnU-Net architecture, and fully exploring Transformers' potential in both the encoder and decoder design.

Image Segmentation Medical Image Segmentation +3

FedConv: Enhancing Convolutional Neural Networks for Handling Data Heterogeneity in Federated Learning

1 code implementation • 6 Oct 2023 • Peiran Xu, Zeyu Wang, Jieru Mei, Liangqiong Qu, Alan Yuille, Cihang Xie, Yuyin Zhou

Federated learning (FL) is an emerging paradigm in machine learning, where a shared model is collaboratively learned using data from multiple devices to mitigate the risk of data leakage.

Federated Learning

Understanding Pan-Sharpening via Generalized Inverse

no code implementations • 4 Oct 2023 • Shiqi Liu, Yutong Bai, Xinyang Han, Alan Yuille

By the generalized inverse theory, we derived two forms of general inverse matrix formulations that can correspond to the two prominent classes of Pan-sharpening methods, that is, component substitution and multi-resolution analysis methods.

Boosting Dermatoscopic Lesion Segmentation via Diffusion Models with Visual and Textual Prompts

no code implementations • 4 Oct 2023 • Shiyi Du, Xiaosong Wang, Yongyi Lu, Yuyin Zhou, Shaoting Zhang, Alan Yuille, Kang Li, Zongwei Zhou

Image synthesis approaches, e.g., generative adversarial networks, have been popular as a form of data augmentation in medical image analysis tasks.

Data Augmentation Image Generation +2

Animal3D: A Comprehensive Dataset of 3D Animal Pose and Shape

no code implementations • ICCV 2023 • Jiacong Xu, Yi Zhang, Jiawei Peng, Wufei Ma, Artur Jesslen, Pengliang Ji, Qixin Hu, Jiehua Zhang, Qihao Liu, Jiahao Wang, Wei Ji, Chen Wang, Xiaoding Yuan, Prakhar Kaushik, Guofeng Zhang, Jie Liu, Yushan Xie, Yawen Cui, Alan Yuille, Adam Kortylewski

Animal3D consists of 3379 images collected from 40 mammal species, high-quality annotations of 26 keypoints, and importantly the pose and shape parameters of the SMAL model.

Ranked #1 on Animal Pose Estimation on Animal3D

Animal Pose Estimation

3D-Aware Neural Body Fitting for Occlusion Robust 3D Human Pose Estimation

1 code implementation • ICCV 2023 • Yi Zhang, Pengliang Ji, Angtian Wang, Jieru Mei, Adam Kortylewski, Alan Yuille

Motivated by the recent success of generative models in rigid object pose estimation, we propose 3D-aware Neural Body Fitting (3DNBF) - an approximate analysis-by-synthesis approach to 3D human pose estimation with SOTA performance and occlusion robustness.

3D Human Pose Estimation Contrastive Learning

Early Detection and Localization of Pancreatic Cancer by Label-Free Tumor Synthesis

1 code implementation • 6 Aug 2023 • Bowen Li, Yu-Cheng Chou, Shuwen Sun, Hualin Qiao, Alan Yuille, Zongwei Zhou

We further investigate the per-voxel segmentation performance of pancreatic tumors if AI is trained on a combination of CT scans with synthetic tumors and CT scans with annotated large tumors at an advanced stage.

Specificity

SwinMM: Masked Multi-view with Swin Transformers for 3D Medical Image Segmentation

1 code implementation • 24 Jul 2023 • YiQing Wang, Zihan Li, Jieru Mei, Zihao Wei, Li Liu, Chen Wang, Shengtian Sang, Alan Yuille, Cihang Xie, Yuyin Zhou

To address this limitation, we present Masked Multi-view with Swin Transformers (SwinMM), a novel multi-view pipeline for enabling accurate and data-efficient self-supervised medical image analysis.

Contrastive Learning Image Reconstruction +4

Generating Images with 3D Annotations Using Diffusion Models

no code implementations • 13 Jun 2023 • Wufei Ma, Qihao Liu, Jiahao Wang, Angtian Wang, Xiaoding Yuan, Yi Zhang, Zihao Xiao, Guofeng Zhang, Beijia Lu, Ruxiao Duan, Yongrui Qi, Adam Kortylewski, Yaoyao Liu, Alan Yuille

With explicit 3D geometry control, we can easily change the 3D structures of the objects in the generated images and obtain ground-truth 3D annotations automatically.

3D Pose Estimation Style Transfer

Compositor: Bottom-up Clustering and Compositing for Robust Part and Object Segmentation

1 code implementation • CVPR 2023 • Ju He, Jieneng Chen, Ming-Xian Lin, Qihang Yu, Alan Yuille

Compositor achieves state-of-the-art performance on PartImageNet and Pascal-Part by outperforming previous methods by around 0.9% and 1.3% on PartImageNet, 0.4% and 1.7% on Pascal-Part in terms of part and object mIoU and demonstrates better robustness against occlusion by around 4.4% and 7.1% on part and object respectively.

Clustering Object +2

Discovering Failure Modes of Text-guided Diffusion Models via Adversarial Search

no code implementations • 1 Jun 2023 • Qihao Liu, Adam Kortylewski, Yutong Bai, Song Bai, Alan Yuille

(2) We find regions in the latent space that lead to distorted images independent of the text prompt, suggesting that parts of the latent space are not well-structured.

Adversarial Attack Efficient Exploration +1

Continual Learning for Abdominal Multi-Organ and Tumor Segmentation

1 code implementation • 1 Jun 2023 • Yixiao Zhang, Xinyi Li, Huimiao Chen, Alan Yuille, Yaoyao Liu, Zongwei Zhou

The ability to dynamically extend a model to new data and classes is critical for multiple organ and tumor segmentation.

Continual Learning Organ Segmentation +2

Neural Textured Deformable Meshes for Robust Analysis-by-Synthesis

no code implementations • 31 May 2023 • Angtian Wang, Wufei Ma, Alan Yuille, Adam Kortylewski

Human vision demonstrates higher robustness than current AI algorithms under out-of-distribution scenarios.

Robust Category-Level 3D Pose Estimation from Synthetic Data

no code implementations • 25 May 2023 • Jiahao Yang, Wufei Ma, Angtian Wang, Xiaoding Yuan, Alan Yuille, Adam Kortylewski

In this work, we aim to narrow the performance gap between models trained on synthetic data and few real images and fully supervised models trained on large-scale data.

3D Pose Estimation 3D Reconstruction +4

Robust 3D-aware Object Classification via Discriminative Render-and-Compare

no code implementations • 24 May 2023 • Artur Jesslen, Guofeng Zhang, Angtian Wang, Alan Yuille, Adam Kortylewski

Using differentiable rendering, we estimate the 3D object pose by minimizing the reconstruction error between the mesh and the feature representation of the target image.

Classification Image Classification +2

AbdomenAtlas-8K: Annotating 8,000 CT Volumes for Multi-Organ Segmentation in Three Weeks

1 code implementation • NeurIPS 2023 • Chongyu Qu, Tiezheng Zhang, Hualin Qiao, Jie Liu, Yucheng Tang, Alan Yuille, Zongwei Zhou

Annotating medical images, particularly for organ segmentation, is laborious and time-consuming.

8k Active Learning +2

OOD-CV-v2: An extended Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images

no code implementations • 17 Apr 2023 • Bingchen Zhao, Jiahao Wang, Wufei Ma, Artur Jesslen, Siwei Yang, Shaozuo Yu, Oliver Zendel, Christian Theobalt, Alan Yuille, Adam Kortylewski

Enhancing the robustness of vision algorithms in real-world scenarios is challenging.

3D Pose Estimation Benchmarking +4

Diffusion Models as Masked Autoencoders

no code implementations • ICCV 2023 • Chen Wei, Karttikeya Mangalam, Po-Yao Huang, Yanghao Li, Haoqi Fan, Hu Xu, Huiyu Wang, Cihang Xie, Alan Yuille, Christoph Feichtenhofer

There has been a longstanding belief that generation can facilitate a true understanding of visual data.

Denoising Image Inpainting

Label-Free Liver Tumor Segmentation

1 code implementation • CVPR 2023 • Qixin Hu, Yixiong Chen, Junfei Xiao, Shuwen Sun, Jieneng Chen, Alan Yuille, Zongwei Zhou

We demonstrate that AI models can accurately segment liver tumors without the need for manual annotation by using synthetic tumors in CT scans.

Ranked #1 on Tumor Segmentation on LiTS17

Segmentation Tumor Segmentation

InstMove: Instance Motion for Object-centric Video Segmentation

1 code implementation • CVPR 2023 • Qihao Liu, Junfeng Wu, Yi Jiang, Xiang Bai, Alan Yuille, Song Bai

A common solution is to use optical flow to provide motion information, but essentially it only considers pixel-level motion, which still relies on appearance similarity and hence is often inaccurate under occlusion and fast movement.

Object Optical Flow Estimation +3

PoseExaminer: Automated Testing of Out-of-Distribution Robustness in Human Pose and Shape Estimation

1 code implementation • CVPR 2023 • Qihao Liu, Adam Kortylewski, Alan Yuille

We introduce a learning-based testing method, termed PoseExaminer, that automatically diagnoses HPS algorithms by searching over the parameter space of human pose images to find the failure modes.

Multi-agent Reinforcement Learning

CancerUniT: Towards a Single Unified Model for Effective Detection, Segmentation, and Diagnosis of Eight Major Cancers Using a Large Collection of CT Scans

no code implementations • ICCV 2023 • Jieneng Chen, Yingda Xia, Jiawen Yao, Ke Yan, Jianpeng Zhang, Le Lu, Fakai Wang, Bo Zhou, Mingyan Qiu, Qihang Yu, Mingze Yuan, Wei Fang, Yuxing Tang, Minfeng Xu, Jian Zhou, Yuqian Zhao, Qifeng Wang, Xianghua Ye, Xiaoli Yin, Yu Shi, Xin Chen, Jingren Zhou, Alan Yuille, Zaiyi Liu, Ling Zhang

Human readers or radiologists routinely perform full-body multi-organ multi-disease detection and diagnosis in clinical practice, while most medical AI systems are built to focus on single organs with a narrow list of a few diseases.

Organ Segmentation Representation Learning +1

Benchmarking Robustness in Neural Radiance Fields

no code implementations • 10 Jan 2023 • Chen Wang, Angtian Wang, Junbo Li, Alan Yuille, Cihang Xie

We find that NeRF-based models are significantly degraded in the presence of corruption, and are more sensitive to a different set of corruptions than image recognition models.

Benchmarking Camera Calibration +2

Learning Road Scene-level Representations via Semantic Region Prediction

no code implementations • 2 Jan 2023 • Zihao Xiao, Alan Yuille, Yi-Ting Chen

In this work, we tackle two vital tasks in automated driving systems, i.e., driver intent prediction and risk object identification from egocentric images.

CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection

2 code implementations • ICCV 2023 • Jie Liu, Yixiao Zhang, Jie-Neng Chen, Junfei Xiao, Yongyi Lu, Bennett A. Landman, Yixuan Yuan, Alan Yuille, Yucheng Tang, Zongwei Zhou

The proposed model is developed from an assembly of 14 datasets, using a total of 3,410 CT scans for training and then evaluated on 6,162 external CT scans from 3 additional datasets.

Ranked #1 on Organ Segmentation on BTCV

Organ Segmentation Segmentation +1

Unleashing the Power of Visual Prompting At the Pixel Level

1 code implementation • 20 Dec 2022 • Junyang Wu, Xianhang Li, Chen Wei, Huiyu Wang, Alan Yuille, Yuyin Zhou, Cihang Xie

This paper presents a simple and effective visual prompting method for adapting pre-trained models to downstream recognition tasks.

Visual Prompting

AsyInst: Asymmetric Affinity with DepthGrad and Color for Box-Supervised Instance Segmentation

no code implementations • 7 Dec 2022 • Siwei Yang, Longlong Jing, Junfei Xiao, Hang Zhao, Alan Yuille, Yingwei Li

Through systematic analysis, we found that the commonly used pairwise affinity loss has two limitations: (1) it works with color affinity but leads to inferior performance with other modalities such as depth gradient, (2) the original affinity loss does not prevent trivial predictions as intended but actually accelerates this process due to the affinity loss term being symmetric.

Box-supervised Instance Segmentation Segmentation +2

Localization vs. Semantics: Visual Representations in Unimodal and Multimodal Models

no code implementations • 1 Dec 2022 • Zhuowan Li, Cihang Xie, Benjamin Van Durme, Alan Yuille

Despite the impressive advancements achieved through vision-and-language pretraining, it remains unclear whether this joint learning paradigm can help understand each individual modality.

Attribute Representation Learning

Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning

2 code implementations • CVPR 2023 • Zhuowan Li, Xingrui Wang, Elias Stengel-Eskin, Adam Kortylewski, Wufei Ma, Benjamin Van Durme, Alan Yuille

Visual Question Answering (VQA) models often perform poorly on out-of-distribution data and struggle on domain generalization.

Domain Generalization Question Answering +2

LUMix: Improving Mixup by Better Modelling Label Uncertainty

no code implementations • 29 Nov 2022 • Shuyang Sun, Jie-Neng Chen, Ruifei He, Alan Yuille, Philip Torr, Song Bai

LUMix is simple as it can be implemented in just a few lines of code and can be universally applied to any deep networks, e.g., CNNs and Vision Transformers, with minimal computational cost.

Data Augmentation

SMAUG: Sparse Masked Autoencoder for Efficient Video-Language Pre-training

no code implementations • ICCV 2023 • Yuanze Lin, Chen Wei, Huiyu Wang, Alan Yuille, Cihang Xie

Coupling all these designs allows our method to enjoy both competitive performance on text-to-video retrieval and video question answering tasks and pre-training costs that are lower by 1.9X or more.

Question Answering Retrieval +3

Synthetic Tumors Make AI Segment Tumors Better

1 code implementation • 26 Oct 2022 • Qixin Hu, Junfei Xiao, Yixiong Chen, Shuwen Sun, Jie-Neng Chen, Alan Yuille, Zongwei Zhou

We develop a novel strategy to generate synthetic tumors.

Tumor Segmentation

Delving into Masked Autoencoders for Multi-Label Thorax Disease Classification

1 code implementation • 23 Oct 2022 • Junfei Xiao, Yutong Bai, Alan Yuille, Zongwei Zhou

We hope that this study can direct future research on the application of Transformers to a larger variety of medical imaging tasks.

Computational Efficiency Transfer Learning

1st Place Solution of The Robust Vision Challenge 2022 Semantic Segmentation Track

1 code implementation • 23 Oct 2022 • Junfei Xiao, Zhichao Xu, Shiyi Lan, Zhiding Yu, Alan Yuille, Anima Anandkumar

The model is trained on a composite dataset consisting of images from 9 datasets (ADE20K, Cityscapes, Mapillary Vistas, ScanNet, VIPER, WildDash 2, IDD, BDD, and COCO) with a simple dataset balancing strategy.

Segmentation Semantic Segmentation

Context-Enhanced Stereo Transformer

1 code implementation • 21 Oct 2022 • Weiyu Guo, Zhaoshuo Li, Yongkui Yang, Zheng Wang, Russell H. Taylor, Mathias Unberath, Alan Yuille, Yingwei Li

We construct our stereo depth estimation model, Context Enhanced Stereo Transformer (CSTR), by plugging CEP into the state-of-the-art stereo depth estimation method Stereo Transformer.

Stereo Depth Estimation Stereo Matching

MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models

2 code implementations • 4 Oct 2022 • Chenglin Yang, Siyuan Qiao, Qihang Yu, Xiaoding Yuan, Yukun Zhu, Alan Yuille, Hartwig Adam, Liang-Chieh Chen

The tiny-MOAT family is also benchmarked on downstream tasks, serving as a baseline for the community.

Ranked #1 on Object Detection on MS COCO

Image Classification Instance Segmentation +2

Robust Category-Level 6D Pose Estimation with Coarse-to-Fine Rendering of Neural Features

1 code implementation • 12 Sep 2022 • Wufei Ma, Angtian Wang, Alan Yuille, Adam Kortylewski

We consider the problem of category-level 6D pose estimation from a single RGB image.

6D Pose Estimation Contrastive Learning +1

Masked Autoencoders Enable Efficient Knowledge Distillers

1 code implementation • CVPR 2023 • Yutong Bai, Zeyu Wang, Junfei Xiao, Chen Wei, Huiyu Wang, Alan Yuille, Yuyin Zhou, Cihang Xie

For example, by distilling the knowledge from an MAE pre-trained ViT-L into a ViT-B, our method achieves 84.0% ImageNet top-1 accuracy, outperforming the baseline of directly distilling a fine-tuned ViT-L by 1.2%.

Knowledge Distillation

Explicit Occlusion Reasoning for Multi-person 3D Human Pose Estimation

no code implementations • 29 Jul 2022 • Qihao Liu, Yi Zhang, Song Bai, Alan Yuille

Inspired by the remarkable ability of humans to infer occluded joints from visible cues, we develop a method to explicitly model this process that significantly improves bottom-up multi-person human pose estimation with or without occlusions.

Ranked #10 on 3D Multi-Person Pose Estimation (absolute) on MuPoTS-3D

3D Human Pose Estimation 3D Multi-Person Pose Estimation (absolute) +2

In Defense of Online Models for Video Instance Segmentation

1 code implementation • 21 Jul 2022 • Junfeng Wu, Qihao Liu, Yi Jiang, Song Bai, Alan Yuille, Xiang Bai

In recent years, video instance segmentation (VIS) has been largely advanced by offline models, while online models gradually attracted less attention possibly due to their inferior performance.

Ranked #9 on Video Instance Segmentation on YouTube-VIS validation (using extra training data)

Contrastive Learning Instance Segmentation +5

kMaX-DeepLab: k-means Mask Transformer

2 code implementations • 8 Jul 2022 • Qihang Yu, Huiyu Wang, Siyuan Qiao, Maxwell Collins, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

However, we observe that most existing transformer-based vision models simply borrow the idea from NLP, neglecting the crucial difference between languages and images, particularly the extremely large sequence length of spatially flattened pixel features.

Ranked #2 on Panoptic Segmentation on COCO test-dev

Clustering Object Detection +1

Unsupervised Domain Adaptation through Shape Modeling for Medical Image Segmentation

1 code implementation • 6 Jul 2022 • Yuan Yao, Fengze Liu, Zongwei Zhou, Yan Wang, Wei Shen, Alan Yuille, Yongyi Lu

Previous methods proposed Variational Autoencoder (VAE) based models to learn the distribution of shape for a particular organ and used it to automatically evaluate the quality of a segmentation prediction by fitting it into the learned shape distribution.

Image Segmentation Pancreas Segmentation +3

CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation

2 code implementations • CVPR 2022 • Qihang Yu, Huiyu Wang, Dahun Kim, Siyuan Qiao, Maxwell Collins, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

We propose Clustering Mask Transformer (CMT-DeepLab), a transformer-based framework for panoptic segmentation designed around clustering.

Ranked #6 on Panoptic Segmentation on COCO test-dev

Clustering Panoptic Segmentation +1

A Simple Data Mixing Prior for Improving Self-Supervised Learning

1 code implementation • CVPR 2022 • Sucheng Ren, Huiyu Wang, Zhengqi Gao, Shengfeng He, Alan Yuille, Yuyin Zhou, Cihang Xie

More notably, our SDMP is the first method that successfully leverages data mixing to improve (rather than hurt) the performance of Vision Transformers in the self-supervised setting.

Representation Learning Self-Supervised Learning

VoGE: A Differentiable Volume Renderer using Gaussian Ellipsoids for Analysis-by-Synthesis

1 code implementation • 30 May 2022 • Angtian Wang, Peng Wang, Jian Sun, Adam Kortylewski, Alan Yuille

Gaussian reconstruction kernels were proposed by Westover (1990) and studied by the computer graphics community in the 1990s; they give a representation of 3D object geometry that is an alternative to meshes and point clouds.

Pose Estimation

In Defense of Image Pre-Training for Spatiotemporal Recognition

1 code implementation • 3 May 2022 • Xianhang Li, Huiyu Wang, Chen Wei, Jieru Mei, Alan Yuille, Yuyin Zhou, Cihang Xie

Inspired by this observation, we hypothesize that the key to effectively leveraging image pre-training lies in the decomposition of learning spatial and temporal features, and revisiting image pre-training as the appearance prior to initializing 3D kernels.

STS Video Recognition

Paper
Code

Fast AdvProp

1 code implementation • ICLR 2022 • Jieru Mei, Yucheng Han, Yutong Bai, Yixiao Zhang, Yingwei Li, Xianhang Li, Alan Yuille, Cihang Xie

Specifically, our modifications in Fast AdvProp are guided by the hypothesis that disentangled learning with adversarial examples is the key for performance improvements, while other training recipes (e.g., paired clean and adversarial training samples, multi-step adversarial attackers) could be largely simplified.

Data Augmentation object-detection +1

Paper
Code

SwapMix: Diagnosing and Regularizing the Over-Reliance on Visual Context in Visual Question Answering

1 code implementation • CVPR 2022 • Vipul Gupta, Zhuowan Li, Adam Kortylewski, Chenyu Zhang, Yingwei Li, Alan Yuille

By swapping the context object features, the model reliance on context can be suppressed effectively.

Data Augmentation Question Answering +1

Paper
Code

CP2: Copy-Paste Contrastive Pretraining for Semantic Segmentation

1 code implementation • 22 Mar 2022 • Feng Wang, Huiyu Wang, Chen Wei, Alan Yuille, Wei Shen

Recent advances in self-supervised contrastive learning yield good image-level representation, which favors classification tasks but usually neglects pixel-level detailed information, leading to unsatisfactory transfer performance to dense prediction tasks such as semantic segmentation.

Contrastive Learning Representation Learning +2

Paper
Code

DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection

1 code implementation • CVPR 2022 • Yingwei Li, Adams Wei Yu, Tianjian Meng, Ben Caine, Jiquan Ngiam, Daiyi Peng, Junyang Shen, Bo Wu, Yifeng Lu, Denny Zhou, Quoc V. Le, Alan Yuille, Mingxing Tan

In this paper, we propose two novel techniques: InverseAug that inverses geometric-related augmentations, e.g., rotation, to enable accurate geometric alignment between lidar points and image pixels, and LearnableAlign that leverages cross-attention to dynamically capture the correlations between image and lidar features during fusion.

3D Object Detection Autonomous Driving +2

2,781

Paper
Code

Point-Level Region Contrast for Object Detection Pre-Training

1 code implementation • CVPR 2022 • Yutong Bai, Xinlei Chen, Alexander Kirillov, Alan Yuille, Alexander C. Berg

In this work we present point-level region contrast, a self-supervised pre-training approach for the task of object detection.

Contrastive Learning Knowledge Distillation +2

Paper
Code

Lite Vision Transformer with Enhanced Self-Attention

1 code implementation • CVPR 2022 • Chenglin Yang, Yilin Wang, Jianming Zhang, He Zhang, Zijun Wei, Zhe Lin, Alan Yuille

We propose Lite Vision Transformer (LVT), a novel light-weight transformer network with two enhanced self-attention mechanisms to improve model performance for mobile deployment.

Panoptic Segmentation Segmentation

127

Paper
Code

Masked Feature Prediction for Self-Supervised Visual Pre-Training

5 code implementations • CVPR 2022 • Chen Wei, Haoqi Fan, Saining Xie, Chao-yuan Wu, Alan Yuille, Christoph Feichtenhofer

We present Masked Feature Prediction (MaskFeat) for self-supervised pre-training of video models.

Ranked #8 on Action Recognition on AVA v2.2 (using extra training data)

Action Classification Action Recognition +1

6,258

Paper
Code

MT-TransUNet: Mediating Multi-Task Tokens in Transformers for Skin Lesion Segmentation and Classification

1 code implementation • 3 Dec 2021 • Jingye Chen, Jieneng Chen, Zongwei Zhou, Bin Li, Alan Yuille, Yongyi Lu

However, these approaches formulated skin cancer diagnosis as a simple classification task, dismissing the potential benefit from lesion segmentation.

Classification Computational Efficiency +4

Paper
Code

PartImageNet: A Large, High-Quality Dataset of Parts

1 code implementation • 2 Dec 2021 • Ju He, Shuo Yang, Shaokang Yang, Adam Kortylewski, Xiaoding Yuan, Jie-Neng Chen, Shuai Liu, Cheng Yang, Qihang Yu, Alan Yuille

To help address this problem, we propose PartImageNet, a large, high-quality dataset with part segmentation annotations.

Activity Recognition Few-Shot Learning +6

108

Paper
Code

OOD-CV: A Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images

no code implementations • 29 Nov 2021 • Bingchen Zhao, Shaozuo Yu, Wufei Ma, Mingxin Yu, Shenxiao Mei, Angtian Wang, Ju He, Alan Yuille, Adam Kortylewski

One reason is that existing robustness benchmarks are limited, as they either rely on synthetic data or ignore the effects of individual nuisance factors.

3D Pose Estimation Benchmarking +5

Paper
Add Code

Learning from Temporal Gradient for Semi-supervised Action Recognition

1 code implementation • CVPR 2022 • Junfei Xiao, Longlong Jing, Lin Zhang, Ju He, Qi She, Zongwei Zhou, Alan Yuille, Yingwei Li

Our method achieves the state-of-the-art performance on three video action recognition benchmarks (i.e., Kinetics-400, UCF-101, and HMDB-51) under several typical semi-supervised settings (i.e., different ratios of labeled data).

Action Recognition Temporal Action Localization

Paper
Code

TransMix: Attend to Mix for Vision Transformers

2 code implementations • CVPR 2022 • Jie-Neng Chen, Shuyang Sun, Ju He, Philip Torr, Alan Yuille, Song Bai

The confidence of the label will be larger if the corresponding input image is weighted higher by the attention map.

Instance Segmentation object-detection +3

565

Paper
Code

Occluded Video Instance Segmentation: Dataset and ICCV 2021 Challenge

no code implementations • 15 Nov 2021 • Jiyang Qi, Yan Gao, Yao Hu, Xinggang Wang, Xiaoyu Liu, Xiang Bai, Serge Belongie, Alan Yuille, Philip H. S. Torr, Song Bai

To promote the development of occlusion understanding, we collect a large-scale dataset called OVIS for video instance segmentation in the occluded scenario.

Instance Segmentation Object Recognition +3

Paper
Add Code

Searching for TrioNet: Combining Convolution with Local and Global Self-Attention

no code implementations • 15 Nov 2021 • Huaijin Pi, Huiyu Wang, Yingwei Li, Zizhang Li, Alan Yuille

In order to effectively search in this huge architecture space, we propose Hierarchical Sampling for better training of the supernet.

Neural Architecture Search

Paper
Add Code

iBOT: Image BERT Pre-Training with Online Tokenizer

1 code implementation • 15 Nov 2021 • Jinghao Zhou, Chen Wei, Huiyu Wang, Wei Shen, Cihang Xie, Alan Yuille, Tao Kong

We present a self-supervised framework iBOT that can perform masked prediction with an online tokenizer.

Ranked #1 on Unsupervised Image Classification on ImageNet

Instance Segmentation Language Modelling +6

617

Paper
Code

Are Transformers More Robust Than CNNs?

1 code implementation • NeurIPS 2021 • Yutong Bai, Jieru Mei, Alan Yuille, Cihang Xie

Transformer emerges as a powerful tool for visual recognition.

Ranked #1 on Adversarial Robustness on Stylized ImageNet

Adversarial Robustness

173

Paper
Code

Neural View Synthesis and Matching for Semi-Supervised Few-Shot Learning of 3D Pose

1 code implementation • NeurIPS 2021 • Angtian Wang, Shenxiao Mei, Alan Yuille, Adam Kortylewski

The model is initialized from a few labelled images and is subsequently used to synthesize feature representations of unseen 3D views.

3D Pose Estimation Few-Shot Learning

Paper
Code

A Light-weight Interpretable Compositional Model for Nuclei Detection and Weakly-Supervised Segmentation

no code implementations • 26 Oct 2021 • Yixiao Zhang, Adam Kortylewski, Qing Liu, Seyoun Park, Benjamin Green, Elizabeth Engle, Guillermo Almodovar, Ryan Walk, Sigfredo Soto-Diaz, Janis Taube, Alex Szalay, Alan Yuille

It only requires annotations on isolated nuclei, rather than on all nuclei in the dataset.

Segmentation Weakly supervised segmentation

Paper
Add Code

Nuisance-Label Supervision: Robustness Improvement by Free Labels

no code implementations • 14 Oct 2021 • Xinyue Wei, Weichao Qiu, Yi Zhang, Zihao Xiao, Alan Yuille

Nuisance factors are those irrelevant to a task, and an ideal model should be invariant to them.

Action Recognition Data Augmentation

Paper
Add Code

Calibrating Concepts and Operations: Towards Symbolic Reasoning on Real Images

1 code implementation • ICCV 2021 • Zhuowan Li, Elias Stengel-Eskin, Yixiao Zhang, Cihang Xie, Quan Tran, Benjamin Van Durme, Alan Yuille

Our experiments show CCO substantially boosts the performance of neural symbolic methods on real images.

Question Answering Visual Question Answering

Paper
Code

Image BERT Pre-training with Online Tokenizer

no code implementations • ICLR 2022 • Jinghao Zhou, Chen Wei, Huiyu Wang, Wei Shen, Cihang Xie, Alan Yuille, Tao Kong

The success of language Transformers is primarily attributed to the pretext task of masked language modeling (MLM), where texts are first tokenized into semantically meaningful pieces.

Image Classification Instance Segmentation +5

Paper
Add Code

SAME: Deformable Image Registration based on Self-supervised Anatomical Embeddings

no code implementations • 23 Sep 2021 • Fengze Liu, Ke Yan, Adam Harrison, Dazhou Guo, Le Lu, Alan Yuille, Lingyun Huang, Guotong Xie, Jing Xiao, Xianghua Ye, Dakai Jin

In this work, we introduce a fast and accurate method for unsupervised 3D medical image registration.

Image Registration Medical Image Registration

Paper
Add Code

RobustART: Benchmarking Robustness on Architecture Design and Training Techniques

1 code implementation • 11 Sep 2021 • Shiyu Tang, Ruihao Gong, Yan Wang, Aishan Liu, Jiakai Wang, Xinyun Chen, Fengwei Yu, Xianglong Liu, Dawn Song, Alan Yuille, Philip H. S. Torr, DaCheng Tao

Thus, we propose RobustART, the first comprehensive Robustness investigation benchmark on ImageNet regarding ARchitecture design (49 human-designed off-the-shelf architectures and 1200+ networks from neural architecture search) and Training techniques (10+ techniques, e.g., data augmentation) towards diverse noises (adversarial, natural, and system noises).

Adversarial Robustness Benchmarking +2

143

Paper
Code

Exploring Simple 3D Multi-Object Tracking for Autonomous Driving

3 code implementations • ICCV 2021 • Chenxu Luo, Xiaodong Yang, Alan Yuille

3D multi-object tracking in LiDAR point clouds is a key ingredient for self-driving vehicles.

3D Multi-Object Tracking Autonomous Driving +4

163

Paper
Code

Locally Enhanced Self-Attention: Combining Self-Attention and Convolution as Local and Context Terms

3 code implementations • 12 Jul 2021 • Chenglin Yang, Siyuan Qiao, Adam Kortylewski, Alan Yuille

Self-Attention has become prevalent in computer vision models.

Instance Segmentation object-detection +2

Paper
Code

Progressive Stage-wise Learning for Unsupervised Feature Representation Enhancement

no code implementations • CVPR 2021 • Zefan Li, Chenxi Liu, Alan Yuille, Bingbing Ni, Wenjun Zhang, Wen Gao

For a given unsupervised task, we design multilevel tasks and define different learning stages for the deep network.

Paper
Add Code

Simulated Adversarial Testing of Face Recognition Models

no code implementations • CVPR 2022 • Nataniel Ruiz, Adam Kortylewski, Weichao Qiu, Cihang Xie, Sarah Adel Bargal, Alan Yuille, Stan Sclaroff

In this work, we propose a framework for learning how to test machine learning algorithms using simulators in an adversarial manner in order to find weaknesses in the model before deploying it in critical scenarios.

BIG-bench Machine Learning Face Recognition

Paper
Add Code

Glance-and-Gaze Vision Transformer

1 code implementation • NeurIPS 2021 • Qihang Yu, Yingda Xia, Yutong Bai, Yongyi Lu, Alan Yuille, Wei Shen

It is motivated by the Glance and Gaze behavior of human beings when recognizing objects in natural scenes, with the ability to efficiently model both long-range dependencies and local context.

Paper
Code

Rethinking Re-Sampling in Imbalanced Semi-Supervised Learning

1 code implementation • 1 Jun 2021 • Ju He, Adam Kortylewski, Shaokang Yang, Shuai Liu, Cheng Yang, Changhu Wang, Alan Yuille

In particular, we decouple the training of the representation and the classifier, and systematically investigate the effects of different data re-sampling techniques when training the whole network including a classifier as well as fine-tuning the feature extractor only.

Paper
Code

Visual analogy: Deep learning versus compositional models

no code implementations • 14 May 2021 • Nicholas Ichien, Qing Liu, Shuhao Fu, Keith J. Holyoak, Alan Yuille, Hongjing Lu

We compared human performance to that of two recent deep learning models (Siamese Network and Relation Network) directly trained to solve these analogy problems, as well as to that of a compositional model that assesses relational similarity between part-based representations.

Relation Network Visual Analogies

Paper
Add Code

Auto-FedAvg: Learnable Federated Averaging for Multi-Institutional Medical Image Segmentation

no code implementations • 20 Apr 2021 • Yingda Xia, Dong Yang, Wenqi Li, Andriy Myronenko, Daguang Xu, Hirofumi Obinata, Hitoshi Mori, Peng An, Stephanie Harmon, Evrim Turkbey, Baris Turkbey, Bradford Wood, Francesca Patella, Elvira Stellato, Gianpaolo Carrafiello, Anna Ierardi, Alan Yuille, Holger Roth

In this work, we design a new data-driven approach, namely Auto-FedAvg, where aggregation weights are dynamically adjusted, depending on data distributions across data silos and the current training progress of the models.

Federated Learning Image Segmentation +3

Paper
Add Code

Self-Supervised Pillar Motion Learning for Autonomous Driving

1 code implementation • CVPR 2021 • Chenxu Luo, Xiaodong Yang, Alan Yuille

Autonomous driving can benefit from motion behavior comprehension when interacting with diverse traffic participants in highly dynamic environments.

Autonomous Driving Motion Estimation

117

Paper
Code

A-SDF: Learning Disentangled Signed Distance Functions for Articulated Shape Representation

1 code implementation • ICCV 2021 • Jiteng Mu, Weichao Qiu, Adam Kortylewski, Alan Yuille, Nuno Vasconcelos, Xiaolong Wang

To deal with the large shape variance, we introduce Articulated Signed Distance Functions (A-SDF) to represent articulated shapes with a disentangled latent space, where we have separate codes for encoding shape and articulation.

Test-time Adaptation

Paper
Code

CateNorm: Categorical Normalization for Robust Medical Image Segmentation

1 code implementation • 29 Mar 2021 • Junfei Xiao, Lequan Yu, Zongwei Zhou, Yutong Bai, Lei Xing, Alan Yuille, Yuyin Zhou

We propose a new normalization strategy, named categorical normalization (CateNorm), to normalize the activations according to categorical statistics.

Image Segmentation Medical Image Segmentation +2

Paper
Code

Learning Part Segmentation through Unsupervised Domain Adaptation from Synthetic Vehicles

1 code implementation • CVPR 2022 • Qing Liu, Adam Kortylewski, Zhishuai Zhang, Zizhang Li, Mengqi Guo, Qihao Liu, Xiaoding Yuan, Jiteng Mu, Weichao Qiu, Alan Yuille

We believe our dataset provides a rich testbed to study UDA for part segmentation and will help to significantly push forward research in this area.

Geometric Matching Segmentation +2

Paper
Code

Weakly Supervised Instance Segmentation for Videos with Temporal Mask Consistency

no code implementations • CVPR 2021 • Qing Liu, Vignesh Ramanathan, Dhruv Mahajan, Alan Yuille, Zhenheng Yang

However, existing approaches which rely only on image-level class labels predominantly suffer from errors due to (a) partial segmentation of objects and (b) missing object predictions.

Instance Segmentation Relation Network +3

Paper
Add Code

Understanding Catastrophic Forgetting and Remembering in Continual Learning with Optimal Relevance Mapping

1 code implementation • 22 Feb 2021 • Prakhar Kaushik, Alex Gain, Adam Kortylewski, Alan Yuille

Additionally, current approaches that deal with forgetting ignore the problem of catastrophic remembering, i.e., the worsening ability to discriminate between data from different tasks.

Ranked #1 on Continual Learning on ImageNet-50 (5 tasks)

Continual Learning

Paper
Code

CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning

1 code implementation • CVPR 2021 • Chen Wei, Kihyuk Sohn, Clayton Mellina, Alan Yuille, Fan Yang

Semi-supervised learning on class-imbalanced data, although a realistic problem, has been understudied.

Paper
Code

Occluded Video Instance Segmentation: A Benchmark

2 code implementations • 2 Feb 2021 • Jiyang Qi, Yan Gao, Yao Hu, Xinggang Wang, Xiaoyu Liu, Xiang Bai, Serge Belongie, Alan Yuille, Philip H. S. Torr, Song Bai

On the OVIS dataset, the highest AP achieved by state-of-the-art algorithms is only 16.3, which reveals that we are still at a nascent stage for understanding objects, instances, and videos in a real-world scenario.

Ranked #39 on Video Instance Segmentation on OVIS validation

Instance Segmentation Segmentation +3

Paper
Code

NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation

1 code implementation • ICLR 2021 • Angtian Wang, Adam Kortylewski, Alan Yuille

Using differentiable rendering we estimate the 3D object pose by minimizing the reconstruction error between NeMo and the feature representation of the target image.

3D Pose Estimation Contrastive Learning

Paper
Code

CORL: Compositional Representation Learning for Few-Shot Classification

no code implementations • 28 Jan 2021 • Ju He, Adam Kortylewski, Alan Yuille

In particular, during meta-learning, we train a knowledge base that consists of a dictionary of component representations and a dictionary of component activation maps that encode common spatial activation patterns of components.

Classification Few-Shot Image Classification +3

Paper
Add Code

Meticulous Object Segmentation

1 code implementation • 13 Dec 2020 • Chenglin Yang, Yilin Wang, Jianming Zhang, He Zhang, Zhe Lin, Alan Yuille

To evaluate segmentation quality near object boundaries, we propose the Meticulosity Quality (MQ) score considering both the mask coverage and boundary precision.

2k 4k +4

Paper
Code

Mask Guided Matting via Progressive Refinement Network

1 code implementation • CVPR 2021 • Qihang Yu, Jianming Zhang, He Zhang, Yilin Wang, Zhe Lin, Ning Xu, Yutong Bai, Alan Yuille

We propose Mask Guided (MG) Matting, a robust matting framework that takes a general coarse mask as guidance.

Image Matting

314

Paper
Code

ViP-DeepLab: Learning Visual Perception with Depth-aware Video Panoptic Segmentation

1 code implementation • CVPR 2021 • Siyuan Qiao, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

We name this joint task Depth-aware Video Panoptic Segmentation, and propose a new evaluation metric along with two derived datasets for it, which will be made available to the public.

Ranked #1 on Video Panoptic Segmentation on Cityscapes-VPS (using extra training data)

Depth-aware Video Panoptic Segmentation Monocular Depth Estimation +2

212

Paper
Code

Robust Instance Segmentation through Reasoning about Multi-Object Occlusion

1 code implementation • CVPR 2021 • Xiaoding Yuan, Adam Kortylewski, Yihong Sun, Alan Yuille

The improved segmentation masks are, in turn, integrated into the network in a top-down manner to improve the image classification.

Image Classification Instance Segmentation +3

Paper
Code

MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers

3 code implementations • CVPR 2021 • Huiyu Wang, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

As a result, MaX-DeepLab shows a significant 7.1% PQ gain in the box-free regime on the challenging COCO dataset, closing the gap between box-based and box-free methods for the first time.

Ranked #12 on Panoptic Segmentation on COCO test-dev

Panoptic Segmentation

982

Paper
Code

Unsupervised Part Discovery via Feature Alignment

no code implementations • 1 Dec 2020 • Mengqi Guo, Yutong Bai, Zhishuai Zhang, Adam Kortylewski, Alan Yuille

Specifically, given a training image, we find a set of similar images that show instances of the same object category in the same pose, through an affine alignment of their corresponding feature maps.

Object Object Recognition

Paper
Add Code

Robustness Out of the Box: Compositional Representations Naturally Defend Against Black-Box Patch Attacks

no code implementations • 1 Dec 2020 • Christian Cosgrove, Adam Kortylewski, Chenglin Yang, Alan Yuille

Second, we find that compositional deep networks, which have part-based representations that lead to innate robustness to natural occlusion, are robust to patch attacks on PASCAL3D+ and the German Traffic Sign Recognition Benchmark, without adversarial training.

Traffic Sign Recognition

Paper
Add Code

Batch Normalization with Enhanced Linear Transformation

1 code implementation • 28 Nov 2020 • Yuhui Xu, Lingxi Xie, Cihang Xie, Jieru Mei, Siyuan Qiao, Wei Shen, Hongkai Xiong, Alan Yuille

Batch normalization (BN) is a fundamental unit in modern deep networks, in which a linear transformation module was designed for improving BN's flexibility of fitting complex data distributions.

Paper
Code

Can Temporal Information Help with Contrastive Self-Supervised Learning?

no code implementations • 25 Nov 2020 • Yutong Bai, Haoqi Fan, Ishan Misra, Ganesh Venkatesh, Yongyi Lu, Yuyin Zhou, Qihang Yu, Vikas Chandra, Alan Yuille

To this end, we present Temporal-aware Contrastive self-supervised learning (TaCo) as a general paradigm to enhance video CSL.

Data Augmentation Representation Learning +2

Paper
Add Code

Amodal Segmentation through Out-of-Task and Out-of-Distribution Generalization with a Bayesian Model

1 code implementation • CVPR 2022 • Yihong Sun, Adam Kortylewski, Alan Yuille

Moreover, by leveraging an outlier process, Bayesian models can further generalize out-of-distribution to segment partially occluded objects and to predict their amodal object boundaries.

Amodal Instance Segmentation Object +2

Paper
Code

Shape-Texture Debiased Neural Network Training

1 code implementation • ICLR 2021 • Yingwei Li, Qihang Yu, Mingxing Tan, Jieru Mei, Peng Tang, Wei Shen, Alan Yuille, Cihang Xie

To prevent models from exclusively attending to a single cue in representation learning, we augment training data with images with conflicting shape and texture information (e.g., an image of chimpanzee shape but with lemon texture) and, most importantly, provide the corresponding supervisions from shape and texture simultaneously.

Ranked #598 on Image Classification on ImageNet

Adversarial Robustness Data Augmentation +2

108

Paper
Code

CO2: Consistent Contrast for Unsupervised Visual Representation Learning

no code implementations • ICLR 2021 • Chen Wei, Huiyu Wang, Wei Shen, Alan Yuille

Regarding the similarity of the query crop to each crop from other images as "unlabeled", the consistency term takes the corresponding similarity of a positive crop as a pseudo label, and encourages consistency between these two similarities.

Contrastive Learning Image Classification +5

Paper
Add Code

CoKe: Localized Contrastive Learning for Robust Keypoint Detection

no code implementations • 29 Sep 2020 • Yutong Bai, Angtian Wang, Adam Kortylewski, Alan Yuille

In this paper, we introduce a contrastive learning framework for keypoint detection (CoKe).

Contrastive Learning Keypoint Detection +1

Paper
Add Code

Lymph Node Gross Tumor Volume Detection in Oncology Imaging via Relationship Learning Using Graph Neural Network

no code implementations • 29 Aug 2020 • Chun-Hung Chao, Zhuotun Zhu, Dazhou Guo, Ke Yan, Tsung-Ying Ho, Jinzheng Cai, Adam P. Harrison, Xianghua Ye, Jing Xiao, Alan Yuille, Min Sun, Le Lu, Dakai Jin

Specifically, we first utilize a 3D convolutional neural network with ROI-pooling to extract the GTV$_{LN}$'s instance-wise appearance features.

Clinical Knowledge

Paper
Add Code

Lymph Node Gross Tumor Volume Detection and Segmentation via Distance-based Gating using 3D CT/PET Imaging in Radiotherapy

no code implementations • 27 Aug 2020 • Zhuotun Zhu, Dakai Jin, Ke Yan, Tsung-Ying Ho, Xianghua Ye, Dazhou Guo, Chun-Hung Chao, Jing Xiao, Alan Yuille, Le Lu

Finding, identifying and segmenting suspicious cancer metastasized lymph nodes from 3D multi-modality imaging is a clinical task of paramount importance.

Paper
Add Code

ASAP-Net: Attention and Structure Aware Point Cloud Sequence Segmentation

1 code implementation • 12 Aug 2020 • Hanwen Cao, Yongyi Lu, Cewu Lu, Bo Pang, Gongshen Liu, Alan Yuille

In this paper, we further improve spatio-temporal point cloud feature learning with a flexible module called ASAP considering both attention and structure information across frames, which we find to be two important factors for successful segmentation in dynamic point clouds.

Segmentation

Paper
Code

Probabilistic Multi-modal Trajectory Prediction with Lane Attention for Autonomous Vehicles

no code implementations • 6 Jul 2020 • Chenxu Luo, Lin Sun, Dariush Dabiri, Alan Yuille

As for vehicles, their trajectories are significantly influenced by the lane geometry and how to effectively use the lane information is of active interest.

Autonomous Vehicles Motion Forecasting +1

Paper
Add Code

Uncertainty-aware multi-view co-training for semi-supervised medical image segmentation and domain adaptation

no code implementations • 28 Jun 2020 • Yingda Xia, Dong Yang, Zhiding Yu, Fengze Liu, Jinzheng Cai, Lequan Yu, Zhuotun Zhu, Daguang Xu, Alan Yuille, Holger Roth

Experiments on the NIH pancreas segmentation dataset and a multi-organ segmentation dataset show state-of-the-art performance of the proposed framework on semi-supervised medical image segmentation.

Image Segmentation Organ Segmentation +6

Paper
Add Code

Compositional Convolutional Neural Networks: A Robust and Interpretable Model for Object Recognition under Occlusion

no code implementations • 28 Jun 2020 • Adam Kortylewski, Qing Liu, Angtian Wang, Yihong Sun, Alan Yuille

The structure of the compositional model enables CompositionalNets to decompose images into objects and context, as well as to further decompose object representations in terms of individual parts and the objects' pose.

Image Classification object-detection +2

Paper
Add Code

Smooth Adversarial Training

1 code implementation • 25 Jun 2020 • Cihang Xie, Mingxing Tan, Boqing Gong, Alan Yuille, Quoc V. Le

SAT also works well with larger networks: it helps EfficientNet-L1 to achieve 82.2% accuracy and 58.6% robustness on ImageNet, outperforming the previous state-of-the-art defense by 9.5% for accuracy and 11.6% for robustness.

Ranked #1 on Adversarial Defense on ImageNet (non-targeted PGD, max perturbation=4)

Adversarial Defense Adversarial Robustness

Paper
Code

DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution

6 code implementations • CVPR 2021 • Siyuan Qiao, Liang-Chieh Chen, Alan Yuille

In this paper, we explore this mechanism in the backbone design for object detection.

Ranked #2 on Object Detection on AI-TOD

Instance Segmentation Object +4

27,678

Paper
Code

Detecting Scatteredly-Distributed, Small, and Critically Important Objects in 3D Oncology Imaging via Decision Stratification

no code implementations • 27 May 2020 • Zhuotun Zhu, Ke Yan, Dakai Jin, Jinzheng Cai, Tsung-Ying Ho, Adam P. Harrison, Dazhou Guo, Chun-Hung Chao, Xianghua Ye, Jing Xiao, Alan Yuille, Le Lu

We focus on the detection and segmentation of oncology-significant (or suspicious cancer metastasized) lymph nodes (OSLNs), which has not been studied before as a computational task.

Paper
Add Code

JSSR: A Joint Synthesis, Segmentation, and Registration System for 3D Multi-Modal Image Alignment of Large-scale Pathological CT Scans

no code implementations • ECCV 2020 • Fengze Liu, Jingzheng Cai, Yuankai Huo, Chi-Tung Cheng, Ashwin Raju, Dakai Jin, Jing Xiao, Alan Yuille, Le Lu, Chien-Hung Liao, Adam P. Harrison

We extensively evaluate our JSSR system on a large-scale medical image dataset containing 1,485 patient CT imaging studies of four different phases (i.e., 5,940 3D CT scans with pathological livers) on the registration, segmentation and synthesis tasks.

Image Registration Multi-Task Learning +2

Paper
Add Code

Robust Object Detection under Occlusion with Context-Aware CompositionalNets

no code implementations • CVPR 2020 • Angtian Wang, Yihong Sun, Adam Kortylewski, Alan Yuille

In this work, we propose to overcome two limitations of CompositionalNets which will enable them to detect partially occluded objects: 1) CompositionalNets, as well as other DCNN architectures, do not explicitly separate the representation of the context from the object itself.

Object object-detection +1

Paper
Add Code

Domain Adaptive Relational Reasoning for 3D Multi-Organ Segmentation

no code implementations • 18 May 2020 • Shuhao Fu, Yongyi Lu, Yan Wang, Yuyin Zhou, Wei Shen, Elliot Fishman, Alan Yuille

In this paper, we present a novel unsupervised domain adaptation (UDA) method, named Domain Adaptive Relational Reasoning (DARR), to generalize 3D multi-organ segmentation models to medical data collected from different scanners and/or protocols (domains).

Organ Segmentation Relational Reasoning +3

Paper
Add Code

Organ at Risk Segmentation for Head and Neck Cancer using Stratified Learning and Neural Architecture Search

no code implementations • CVPR 2020 • Dazhou Guo, Dakai Jin, Zhuotun Zhu, Tsung-Ying Ho, Adam P. Harrison, Chun-Hung Chao, Jing Xiao, Alan Yuille, Chien-Yu Lin, Le Lu

This is the goal of our work, where we introduce stratified organ at risk segmentation (SOARS), an approach that stratifies OARs into anchor, mid-level, and small & hard (S&H) categories.

Anatomy Neural Architecture Search +1

Paper
Add Code

PatchAttack: A Black-box Texture-based Attack with Reinforcement Learning

2 code implementations • ECCV 2020 • Chenglin Yang, Adam Kortylewski, Cihang Xie, Yinzhi Cao, Alan Yuille

PatchAttack induces misclassifications by superimposing small textured patches on the input image.

Adversarial Defense Clustering +2

Paper
Code

Context-Aware Group Captioning via Self-Attention and Contrastive Features

no code implementations • CVPR 2020 • Zhuowan Li, Quan Tran, Long Mai, Zhe Lin, Alan Yuille

In this paper, we introduce a new task, context-aware group captioning, which aims to describe a group of target images in the context of another group of related reference images.

Image Captioning

Paper
Add Code

Neural Architecture Search for Lightweight Non-Local Networks

2 code implementations • CVPR 2020 • Yingwei Li, Xiaojie Jin, Jieru Mei, Xiaochen Lian, Linjie Yang, Cihang Xie, Qihang Yu, Yuyin Zhou, Song Bai, Alan Yuille

However, it has been rarely explored to embed the NL blocks in mobile neural networks, mainly due to the following challenges: 1) NL blocks generally have heavy computation cost which makes it difficult to be applied in applications where computational resources are limited, and 2) it is an open problem to discover an optimal configuration to embed NL blocks into mobile neural networks.

Ranked #60 on Neural Architecture Search on ImageNet

Image Classification Neural Architecture Search

105

Paper
Code

Are Labels Necessary for Neural Architecture Search?

2 code implementations • ECCV 2020 • Chenxi Liu, Piotr Dollár, Kaiming He, Ross Girshick, Alan Yuille, Saining Xie

Existing neural network architectures in computer vision -- whether designed by humans or by machines -- were typically found using both images and their associated labels.

Neural Architecture Search

2,109

Paper
Code

Synthesize then Compare: Detecting Failures and Anomalies for Semantic Segmentation

1 code implementation • ECCV 2020 • Yingda Xia, Yi Zhang, Fengze Liu, Wei Shen, Alan Yuille

The ability to detect failures and anomalies is a fundamental requirement for building reliable systems for computer vision applications, especially safety-critical applications of semantic segmentation, such as autonomous driving and medical image analysis.

Ranked #8 on Anomaly Detection on Road Anomaly (using extra training data)

Anomaly Detection Autonomous Driving +3

Paper
Code

Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation

5 code implementations • ECCV 2020 • Huiyu Wang, Yukun Zhu, Bradley Green, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

In this paper, we attempt to remove this constraint by factorizing 2D self-attention into two 1D self-attentions.

Ranked #4 on Panoptic Segmentation on Cityscapes val (using extra training data)

Image Classification Panoptic Segmentation +1

1,139

Paper
Code

Compositional Convolutional Neural Networks: A Deep Architecture with Innate Robustness to Partial Occlusion

1 code implementation • CVPR 2020 • Adam Kortylewski, Ju He, Qing Liu, Alan Yuille

Inspired by the success of compositional models at classifying partially occluded objects, we propose to integrate compositional models and DCNNs into a unified deep model with innate robustness to partial occlusion.

General Classification

108

Paper
Code

When Radiology Report Generation Meets Knowledge Graph

no code implementations • 19 Feb 2020 • Yixiao Zhang, Xiaosong Wang, Ziyue Xu, Qihang Yu, Alan Yuille, Daguang Xu

In addition, we proposed a new evaluation metric for radiology image reporting with the assistance of the same composed graph.

Graph Embedding Image Captioning

Paper
Add Code

Object as Hotspots: An Anchor-Free 3D Object Detection Approach via Firing of Hotspots

no code implementations • ECCV 2020 • Qi Chen, Lin Sun, Zhixin Wang, Kui Jia, Alan Yuille

Accurate 3D object detection in LiDAR based point clouds suffers from the challenges of data sparsity and irregularities.

Ranked #3 on 3D Object Detection on KITTI Pedestrians Moderate

3D Object Detection Object +2

Paper
Add Code

AtomNAS: Fine-Grained End-to-End Neural Architecture Search

1 code implementation • ICLR 2020 • Jieru Mei, Yingwei Li, Xiaochen Lian, Xiaojie Jin, Linjie Yang, Alan Yuille, Jianchao Yang

We propose a fine-grained search space comprised of atomic blocks, a minimal search unit that is much smaller than the ones used in recent NAS algorithms.

Ranked #61 on Neural Architecture Search on ImageNet

Neural Architecture Search

223

Paper
Code

Learning from Synthetic Animals

2 code implementations • CVPR 2020 • Jiteng Mu, Weichao Qiu, Gregory Hager, Alan Yuille

Despite great success in human parsing, progress for parsing other deformable articulated objects, like animals, is still limited by the lack of labeled data.

Domain Adaptation Human Parsing +1

Paper
Code

Identity Preserve Transform: Understand What Activity Classification Models Have Learnt

no code implementations • 13 Dec 2019 • Jialing Lyu, Weichao Qiu, Xinyue Wei, Yi Zhang, Alan Yuille, Zheng-Jun Zha

This can explain why an activity classification model usually fails to generalize to datasets it is not trained on.

Classification General Classification

Paper
Add Code

DASZL: Dynamic Action Signatures for Zero-shot Learning

no code implementations • 8 Dec 2019 • Tae Soo Kim, Jonathan D. Jones, Michael Peven, Zihao Xiao, Jin Bai, Yi Zhang, Weichao Qiu, Alan Yuille, Gregory D. Hager

There are many realistic applications of activity recognition where the set of potential activity descriptions is combinatorially large.

Action Detection Activity Detection +3

Paper
Add Code

RSA: Randomized Simulation as Augmentation for Robust Human Action Recognition

no code implementations • 3 Dec 2019 • Yi Zhang, Xinyue Wei, Weichao Qiu, Zihao Xiao, Gregory D. Hager, Alan Yuille

In this paper, we propose the Randomized Simulation as Augmentation (RSA) framework which augments real-world training data with synthetic data to improve the robustness of action recognition networks.

Action Recognition Temporal Action Localization

Paper
Add Code

Identifying Model Weakness with Adversarial Examiner

no code implementations • 25 Nov 2019 • Michelle Shu, Chenxi Liu, Weichao Qiu, Alan Yuille

Unlike the existing strategy of always giving the same (distribution of) test data, the adversarial examiner dynamically selects the next test data to hand out based on the testing history so far, with the goal of undermining the model's performance.

Autonomous Driving

Paper
Add Code

Deeply Shape-guided Cascade for Instance Segmentation

1 code implementation • CVPR 2021 • Hao Ding, Siyuan Qiao, Alan Yuille, Wei Shen

The key to a successful cascade architecture for precise instance segmentation is to fully leverage the relationship between bounding box detection and mask segmentation across multiple stages.

Instance Segmentation Region Proposal +2

Paper
Code

Adversarial Examples Improve Image Recognition

6 code implementations • CVPR 2020 • Cihang Xie, Mingxing Tan, Boqing Gong, Jiang Wang, Alan Yuille, Quoc V. Le

We show that AdvProp improves a wide range of models on various image recognition tasks and performs better when the models are bigger.

Ranked #4 on Domain Generalization on VizWiz-Classification

Domain Generalization Image Classification

29,624

Paper
Code

Rethinking Normalization and Elimination Singularity in Neural Networks

1 code implementation • 21 Nov 2019 • Siyuan Qiao, Huiyu Wang, Chenxi Liu, Wei Shen, Alan Yuille

To address this issue, we propose BatchChannel Normalization (BCN), which uses batch knowledge to avoid the elimination singularities in the training of channel-normalized models.

Image Classification Instance Segmentation +4

Paper
Code

Localizing Occluders with Compositional Convolutional Networks

no code implementations • 18 Nov 2019 • Adam Kortylewski, Qing Liu, Huiyu Wang, Zhishuai Zhang, Alan Yuille

Our experimental results demonstrate that the proposed extensions increase the model's performance at localizing occluders as well as at classifying partially occluded objects.

Paper
Add Code

Grouped Spatial-Temporal Aggregation for Efficient Action Recognition

1 code implementation • ICCV 2019 • Chenxu Luo, Alan Yuille

This decomposition is more parameter-efficient and enables us to quantitatively analyze the contributions of spatial and temporal features in different layers.

Action Recognition

Paper
Code

Universal Physical Camouflage Attacks on Object Detectors

2 code implementations • CVPR 2020 • Lifeng Huang, Chengying Gao, Yuyin Zhou, Cihang Xie, Alan Yuille, Changqing Zou, Ning Liu

In this paper, we study physical adversarial attacks on object detectors in the wild.

Object Region Proposal

Paper
Code

TDAPNet: Prototype Network with Recurrent Top-Down Attention for Robust Object Classification under Partial Occlusion

no code implementations • 9 Sep 2019 • Mingqing Xiao, Adam Kortylewski, Ruihai Wu, Siyuan Qiao, Wei Shen, Alan Yuille

Despite their great success in object classification, deep convolutional neural networks suffer from a severe drop in generalization performance under occlusion due to the inconsistency between training and testing data.

General Classification Object +1

Paper
Add Code

Hyper-Pairing Network for Multi-Phase Pancreatic Ductal Adenocarcinoma Segmentation

no code implementations • 3 Sep 2019 • Yuyin Zhou, Yingwei Li, Zhishuai Zhang, Yan Wang, Angtian Wang, Elliot Fishman, Alan Yuille, Seyoun Park

Pancreatic ductal adenocarcinoma (PDAC) is one of the most lethal cancers with an overall five-year survival rate of 8%.

Paper
Add Code

FusionNet: Incorporating Shape and Texture for Abnormality Detection in 3D Abdominal CT Scans

no code implementations • 21 Aug 2019 • Fengze Liu, Yuyin Zhou, Elliot Fishman, Alan Yuille

Second, a FusionNet is proposed to take both the binary mask and CT image as input and perform a binary classification.

3D Classification Anomaly Detection +4

Paper
Add Code

Deep Differentiable Random Forests for Age Estimation

no code implementations • 23 Jul 2019 • Wei Shen, Yilu Guo, Yan Wang, Kai Zhao, Bo Wang, Alan Yuille

Both of them connect split nodes to the top layer of convolutional neural networks (CNNs) and deal with inhomogeneous data by jointly learning input-dependent data partitions at the split nodes and age distributions at the leaf nodes.

Age Estimation regression

Paper
Add Code

Multi-Scale Attentional Network for Multi-Focal Segmentation of Active Bleed after Pelvic Fractures

no code implementations • 23 Jun 2019 • Yuyin Zhou, David Dreizin, Yingwei Li, Zhishuai Zhang, Yan Wang, Alan Yuille

Trauma is the worldwide leading cause of death and disability in those younger than 45 years, and pelvic fractures are a major source of morbidity and mortality.

Segmentation

Paper
Add Code

Intriguing properties of adversarial training at scale

no code implementations • ICLR 2020 • Cihang Xie, Alan Yuille

This two-domain hypothesis may explain the issue of BN when training with a mixture of clean and adversarial images, as estimating normalization statistics of this mixture distribution is challenging.

Adversarial Robustness

Paper
Add Code

V-NAS: Neural Architecture Search for Volumetric Medical Image Segmentation

no code implementations • 6 Jun 2019 • Zhuotun Zhu, Chenxi Liu, Dong Yang, Alan Yuille, Daguang Xu

Deep learning algorithms, in particular 2D and 3D fully convolutional neural networks (FCNs), have rapidly become the mainstream methodology for volumetric medical image segmentation.

Image Segmentation Neural Architecture Search +3

Paper
Add Code

Combining Compositional Models and Deep Networks For Robust Object Classification under Occlusion

no code implementations • 28 May 2019 • Adam Kortylewski, Qing Liu, Huiyu Wang, Zhishuai Zhang, Alan Yuille

In this work, we combine DCNNs and compositional object models to retain the best of both approaches: a discriminative model that is robust to partial occlusion and mask attacks.

General Classification Image Classification +1

Paper
Add Code

Robustness of Object Recognition under Extreme Occlusion in Humans and Computational Models

1 code implementation • 11 May 2019 • Hongru Zhu, Peng Tang, Jeongho Park, Soojin Park, Alan Yuille

We test both humans and the above-mentioned computational models in a challenging task of object recognition under extreme occlusion, where target objects are heavily occluded by irrelevant real objects in real backgrounds.

Object Object Recognition

Paper
Code

Structured Prediction using cGANs with Fusion Discriminator

no code implementations • ICLR 2019 • Faisal Mahmood, Wenhao Xu, Nicholas J. Durr, Jeremiah W. Johnson, Alan Yuille

We propose the fusion discriminator, a single unified framework for incorporating conditional information into a generative adversarial network (GAN) for a variety of distinct structured prediction tasks, including image synthesis, semantic segmentation, and depth estimation.

Depth Estimation Generative Adversarial Network +3

Paper
Add Code

Prior-aware Neural Network for Partially-Supervised Multi-Organ Segmentation

no code implementations • ICCV 2019 • Yuyin Zhou, Zhe Li, Song Bai, Chong Wang, Xinlei Chen, Mei Han, Elliot Fishman, Alan Yuille

Accurate multi-organ abdominal CT segmentation is essential to many clinical applications such as computer-aided intervention.

Medical Image Segmentation Organ Segmentation +2

Paper
Add Code

Semantic-Aware Knowledge Preservation for Zero-Shot Sketch-Based Image Retrieval

1 code implementation • ICCV 2019 • Qing Liu, Lingxi Xie, Huiyu Wang, Alan Yuille

Sketch-based image retrieval (SBIR) is widely recognized as an important vision problem which implies a wide range of real-world applications.

Domain Adaptation Retrieval +2

Paper
Code

An Alarm System For Segmentation Algorithm Based On Shape Model

no code implementations • ICLR 2019 • Fengze Liu, Yingda Xia, Dong Yang, Alan Yuille, Daguang Xu

Motivated by this, in this paper, we learn a feature space using the shape information, which is a strong prior shared among different datasets and robust to the appearance variation of input data. The shape feature is captured using a Variational Auto-Encoder (VAE) network that is trained with only the ground truth masks.

Segmentation

Paper
Add Code

Micro-Batch Training with Batch-Channel Normalization and Weight Standardization

8 code implementations • 25 Mar 2019 • Siyuan Qiao, Huiyu Wang, Chenxi Liu, Wei Shen, Alan Yuille

Batch Normalization (BN) has become an out-of-box technique to improve deep network training.

Ranked #76 on Instance Segmentation on COCO minival

Image Classification Instance Segmentation +5

47,189

Paper
Code

Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation

12 code implementations • CVPR 2019 • Chenxi Liu, Liang-Chieh Chen, Florian Schroff, Hartwig Adam, Wei Hua, Alan Yuille, Li Fei-Fei

Therefore, we propose to search the network level structure in addition to the cell level structure, which forms a hierarchical architecture search space.

Ranked #7 on Semantic Segmentation on PASCAL VOC 2012 val

Image Classification Image Segmentation +3

76,563

Paper
Code

CLEVR-Ref+: Diagnosing Visual Reasoning with Referring Expressions

3 code implementations • CVPR 2019 • Runtao Liu, Chenxi Liu, Yutong Bai, Alan Yuille

Yet there has been evidence that current benchmark datasets suffer from bias, and current state-of-the-art models cannot be easily evaluated on their intermediate reasoning process.

Ranked #1 on Referring Expression Segmentation on CLEVR-Ref+

Image Segmentation object-detection +8

Paper
Code

ELASTIC: Improving CNNs with Dynamic Scaling Policies

1 code implementation • CVPR 2019 • Huiyu Wang, Aniruddha Kembhavi, Ali Farhadi, Alan Yuille, Mohammad Rastegari

We formulate the scaling policy as a non-linear function inside the network's structure that (a) is learned from data, (b) is instance specific, (c) does not add extra computation, and (d) can be applied on any network architecture.

General Classification Multi-Label Classification +1

Paper
Code

Feature Denoising for Improving Adversarial Robustness

2 code implementations • CVPR 2019 • Cihang Xie, Yuxin Wu, Laurens van der Maaten, Alan Yuille, Kaiming He

This study suggests that adversarial perturbations on images lead to noise in the features constructed by these networks.

Ranked #1 on Adversarial Defense on CAAD 2018

Adversarial Defense Adversarial Robustness +2

670

Paper
Code

Learning Transferable Adversarial Examples via Ghost Networks

1 code implementation • 9 Dec 2018 • Yingwei Li, Song Bai, Yuyin Zhou, Cihang Xie, Zhishuai Zhang, Alan Yuille

The critical principle of ghost networks is to apply feature-level perturbations to an existing model to potentially create a huge set of diverse models.

Adversarial Attack

Paper
Code

Neural Rejuvenation: Improving Deep Network Training by Enhancing Computational Resource Utilization

1 code implementation • CVPR 2019 • Siyuan Qiao, Zhe Lin, Jianming Zhang, Alan Yuille

By simply replacing standard optimizers with Neural Rejuvenation, we are able to improve the performances of neural networks by a very large margin while using similar training efforts and maintaining their original resource usages.

Network Pruning Neural Architecture Search

Paper
Code

3D Semi-Supervised Learning with Uncertainty-Aware Multi-View Co-Training

no code implementations • 29 Nov 2018 • Yingda Xia, Fengze Liu, Dong Yang, Jinzheng Cai, Lequan Yu, Zhuotun Zhu, Daguang Xu, Alan Yuille, Holger Roth

Meanwhile, a fully-supervised method based on our approach achieved state-of-the-art performances on both the LiTS liver tumor segmentation and the Medical Segmentation Decathlon (MSD) challenge, demonstrating the robustness and value of our framework, even when fully supervised training is feasible.

Image Segmentation Medical Image Segmentation +3

Paper
Add Code

Semantic Part Detection via Matching: Learning to Generalize to Novel Viewpoints from Limited Training Data

1 code implementation • ICCV 2019 • Yutong Bai, Qing Liu, Lingxi Xie, Weichao Qiu, Yan Zheng, Alan Yuille

In particular, this enables images in the training dataset to be matched to a virtual 3D model of the object (for simplicity, we assume that the object viewpoint can be estimated by standard techniques).

Clustering Object +1

Paper
Code

Robust Face Detection via Learning Small Faces on Hard Images

1 code implementation • 28 Nov 2018 • Zhishuai Zhang, Wei Shen, Siyuan Qiao, Yan Wang, Bo Wang, Alan Yuille

In this paper, we propose that the robustness of a face detector against hard faces can be improved by learning small faces on hard images.

Ranked #8 on Face Detection on WIDER Face (Hard)

Face Detection

139

Paper
Code

OriNet: A Fully Convolutional Network for 3D Human Pose Estimation

1 code implementation • 12 Nov 2018 • Chenxu Luo, Xiao Chu, Alan Yuille

We use limb orientations as a new way to represent 3D poses and bind the orientation together with the bounding box of each limb region to better associate images and predictions.

Ranked #76 on 3D Human Pose Estimation on MPI-INF-3DHP (AUC metric)

3D Human Pose Estimation

Paper
Code

Every Pixel Counts ++: Joint Learning of Geometry and Motion with 3D Holistic Understanding

1 code implementation • 14 Oct 2018 • Chenxu Luo, Zhenheng Yang, Peng Wang, Yang Wang, Wei Xu, Ram Nevatia, Alan Yuille

Performance on the five tasks of depth estimation, optical flow estimation, odometry, moving object segmentation and scene flow estimation shows that our approach outperforms other SoTA methods.

Ranked #1 on Scene Flow Estimation on KITTI 2015 Scene Flow Training

Depth Estimation Optical Flow Estimation +2

Paper
Code

Weakly Supervised Region Proposal Network and Object Detection

no code implementations • ECCV 2018 • Peng Tang, Xinggang Wang, Angtian Wang, Yongluan Yan, Wenyu Liu, Junzhou Huang, Alan Yuille

The Convolutional Neural Network (CNN) based region proposal generation method (i.e., region proposal network), trained using bounding box annotations, is an essential component in modern fully supervised object detectors.

Object object-detection +2

Paper
Add Code

Rethinking Monocular Depth Estimation with Adversarial Training

no code implementations • 22 Aug 2018 • Richard Chen, Faisal Mahmood, Alan Yuille, Nicholas J. Durr

Most existing approaches treat depth estimation as a regression problem with a local pixel-wise loss function.

Monocular Depth Estimation

Paper
Add Code

PCL: Proposal Cluster Learning for Weakly Supervised Object Detection

4 code implementations • 9 Jul 2018 • Peng Tang, Xinggang Wang, Song Bai, Wei Shen, Xiang Bai, Wenyu Liu, Alan Yuille

The iterative instance classifier refinement is implemented online using multiple streams in convolutional neural networks, where the first is an MIL network and the others are for instance classifier refinement supervised by the preceding one.

Ranked #1 on Weakly Supervised Object Detection on ImageNet

Multiple Instance Learning Object +3

245

Paper
Code

Resisting Large Data Variations via Introspective Transformation Network

no code implementations • 16 May 2018 • Yunhan Zhao, Ye Tian, Charless Fowlkes, Wei Shen, Alan Yuille

Experimental results verify that our approach significantly improves the ability of deep networks to resist large variations between training and testing data and achieves classification accuracy improvements on several benchmark datasets, including MNIST, affNIST, SVHN, CIFAR-10 and miniImageNet.

Data Augmentation Few-Shot Learning

Paper
Add Code

Knowledge Distillation in Generations: More Tolerant Teachers Educate Better Students

no code implementations • 15 May 2018 • Chenglin Yang, Lingxi Xie, Siyuan Qiao, Alan Yuille

We focus on the problem of training a deep neural network in generations.

General Classification Image Classification +1

Paper
Add Code

SampleAhead: Online Classifier-Sampler Communication for Learning from Synthesized Data

no code implementations • 1 Apr 2018 • Qi Chen, Weichao Qiu, Yi Zhang, Lingxi Xie, Alan Yuille

But this raises an important problem in active vision: given an infinite data space, how to effectively sample a finite subset to train a visual classifier?

Classification General Classification

Paper
Add Code

Adversarial Attacks and Defences Competition

1 code implementation • 31 Mar 2018 • Alexey Kurakin, Ian Goodfellow, Samy Bengio, Yinpeng Dong, Fangzhou Liao, Ming Liang, Tianyu Pang, Jun Zhu, Xiaolin Hu, Cihang Xie, Jian-Yu Wang, Zhishuai Zhang, Zhou Ren, Alan Yuille, Sangxia Huang, Yao Zhao, Yuzhe Zhao, Zhonglin Han, Junjiajia Long, Yerkebulan Berdibekov, Takuya Akiba, Seiya Tokui, Motoki Abe

To accelerate research on adversarial examples and robustness of machine learning classifiers, Google Brain organized a NIPS 2017 competition that encouraged researchers to develop new methods to generate adversarial examples as well as to develop new ways to defend against them.

BIG-bench Machine Learning

145

Paper
Code

Scene Graph Parsing as Dependency Parsing

2 code implementations • NAACL 2018 • Yu-Siang Wang, Chenxi Liu, Xiaohui Zeng, Alan Yuille

The scene graphs generated by our learned neural dependency parser achieve an F-score similarity of 49.67% to ground truth graphs on our evaluation set, surpassing the best previous approaches by 5%.

Dependency Parsing Image Retrieval +2

Paper
Code

Improving Transferability of Adversarial Examples with Input Diversity

2 code implementations • CVPR 2019 • Cihang Xie, Zhishuai Zhang, Yuyin Zhou, Song Bai, Jian-Yu Wang, Zhou Ren, Alan Yuille

We hope that our proposed attack strategy can serve as a strong benchmark baseline for evaluating the robustness of networks to adversaries and the effectiveness of different defense methods in the future.

Adversarial Attack Image Classification

159

Paper
Code

Deep Co-Training for Semi-Supervised Image Recognition

1 code implementation • ECCV 2018 • Siyuan Qiao, Wei Shen, Zhishuai Zhang, Bo Wang, Alan Yuille

We present Deep Co-Training, a deep learning based method inspired by the Co-Training framework.

Paper
Code

Unleashing the Potential of CNNs for Interpretable Few-Shot Learning

no code implementations • ICLR 2018 • Boyang Deng, Qing Liu, Siyuan Qiao, Alan Yuille

Our models are based on the idea of encoding objects in terms of visual concepts, which are interpretable visual cues represented by the feature vectors within CNNs.

Few-Shot Learning

Paper
Add Code

Deep Regression Forests for Age Estimation

2 code implementations • CVPR 2018 • Wei Shen, Yilu Guo, Yan Wang, Kai Zhao, Bo Wang, Alan Yuille

Age estimation from facial images is typically cast as a nonlinear regression problem.

Ranked #6 on Age Estimation on FGNET

Age Estimation regression

Paper
Code

Progressive Neural Architecture Search

17 code implementations • ECCV 2018 • Chenxi Liu, Barret Zoph, Maxim Neumann, Jonathon Shlens, Wei Hua, Li-Jia Li, Li Fei-Fei, Alan Yuille, Jonathan Huang, Kevin Murphy

We propose a new method for learning the structure of convolutional neural networks (CNNs) that is more efficient than recent state-of-the-art methods based on reinforcement learning and evolutionary algorithms.

Ranked #15 on Neural Architecture Search on NAS-Bench-201, ImageNet-16-120 (Accuracy (Val) metric)

Evolutionary Algorithms General Classification +3

76,564

Paper
Code
