Search Results for author: Wentao Zhu

Found 63 papers, 24 papers with code

Efficient Action Counting with Dynamic Queries

1 code implementation • 3 Mar 2024 • Zishi Li, Xiaoxuan Ma, Qiuyan Shang, Wentao Zhu, Hai Ci, Yu Qiao, Yizhou Wang

Temporal repetition counting aims to quantify the repeated action cycles within a video.

OpenMEDLab: An Open-source Platform for Multi-modality Foundation Models in Medicine

no code implementations • 28 Feb 2024 • Xiaosong Wang, Xiaofan Zhang, Guotai Wang, Junjun He, Zhongyu Li, Wentao Zhu, Yi Guo, Qi Dou, Xiaoxiao Li, Dequan Wang, Liang Hong, Qicheng Lao, Tong Ruan, Yukun Zhou, Yixue Li, Jie Zhao, Kang Li, Xin Sun, Lifeng Zhu, Shaoting Zhang

The emerging trend of advancing generalist artificial intelligence, such as GPTv4 and Gemini, has reshaped the landscape of research (academia and industry) in machine learning and many other research areas.

Transfer Learning

Language Models Represent Beliefs of Self and Others

no code implementations • 28 Feb 2024 • Wentao Zhu, Zhining Zhang, Yizhou Wang

Understanding and attributing mental states, known as Theory of Mind (ToM), emerges as a fundamental capability for human social reasoning.

Causal Inference

Real-time Holistic Robot Pose Estimation with Unknown States

1 code implementation • 8 Feb 2024 • Shikun Ban, Juling Fan, Wentao Zhu, Xiaoxuan Ma, Yu Qiao, Yizhou Wang

We propose an end-to-end pipeline for real-time, holistic robot pose estimation from a single RGB image, even in the absence of known robot states.

Ranked #1 on Robot Pose Estimation on DREAM-dataset

6D Pose Estimation using RGB Robot Pose Estimation

Efficient Selective Audio Masked Multimodal Bottleneck Transformer for Audio-Video Classification

no code implementations • 8 Jan 2024 • Wentao Zhu

To learn from multimodal videos effectively, in this work, we propose a novel audio-video recognition approach termed audio video Transformer, AVT, leveraging the effective spatio-temporal representation by the video Transformer to improve action recognition accuracy.

Action Recognition Contrastive Learning +2

Efficient Multiscale Multimodal Bottleneck Transformer for Audio-Video Classification

no code implementations • 8 Jan 2024 • Wentao Zhu

In recent years, researchers have combined audio and video signals to deal with challenges where actions are not well represented or captured by visual cues.

Representation Learning Video Classification

TPC-ViT: Token Propagation Controller for Efficient Vision Transformer

no code implementations • 3 Jan 2024 • Wentao Zhu

Previous approaches that employ gradual token reduction to address this challenge assume that token redundancy in one layer implies redundancy in all the following layers.

Token Reduction

Deformable Audio Transformer for Audio Event Detection

no code implementations • 24 Dec 2023 • Wentao Zhu

Hence, we introduce a learnable input adaptor to alleviate this issue, and DATAR achieves state-of-the-art performance.

Event Detection

ChimpACT: A Longitudinal Dataset for Understanding Chimpanzee Behaviors

1 code implementation • NeurIPS 2023 • Xiaoxuan Ma, Stephan P. Kaufhold, Jiajun Su, Wentao Zhu, Jack Terwilliger, Andres Meza, Yixin Zhu, Federico Rossano, Yizhou Wang

ChimpACT is both comprehensive and challenging, consisting of 163 videos with a cumulative 160,500 frames, each richly annotated with detection, identification, pose estimation, and fine-grained spatiotemporal behavior labels.

Action Detection Pose Estimation

A Multi-Scale Spatial Transformer U-Net for Simultaneously Automatic Reorientation and Segmentation of 3D Nuclear Cardiac Images

no code implementations • 16 Oct 2023 • Yangfan Ni, Duo Zhang, Gege Ma, Lijun Lu, Zhongke Huang, Wentao Zhu

Accurate reorientation and segmentation of the left ventricle (LV) is essential for the quantitative analysis of myocardial perfusion imaging (MPI), in which one critical step is to reorient the reconstructed transaxial nuclear cardiac images into standard short-axis slices for subsequent image processing.

LV Segmentation Segmentation

UPL-SFDA: Uncertainty-aware Pseudo Label Guided Source-Free Domain Adaptation for Medical Image Segmentation

1 code implementation • 19 Sep 2023 • Jianghao Wu, Guotai Wang, Ran Gu, Tao Lu, Yinan Chen, Wentao Zhu, Tom Vercauteren, Sébastien Ourselin, Shaoting Zhang

The different predictions in these duplicated heads are used to obtain pseudo labels for unlabeled target-domain images and their uncertainty to identify reliable pseudo labels.

Brain Segmentation Image Segmentation +5

BROW: Better featuRes fOr Whole slide image based on self-distillation

no code implementations • 15 Sep 2023 • Yuanfeng Wu, Shaojie Li, Zhiqiang Du, Wentao Zhu

Hence, we proposed BROW, a foundation model for extracting better feature representations for WSIs, which can be conveniently adapted to downstream tasks without or with slight fine-tuning.

Instance Segmentation Semantic Segmentation

Classification of lung cancer subtypes on CT images with synthetic pathological priors

no code implementations • 9 Aug 2023 • Wentao Zhu, Yuan Jin, Gege Ma, Geng Chen, Jan Egger, Shaoting Zhang, Dimitris N. Metaxas

Accurate diagnosis of the pathological subtypes of lung cancer is of significant importance for follow-up treatment and prognosis management.

Computed Tomography (CT)

Human Motion Generation: A Survey

no code implementations • 20 Jul 2023 • Wentao Zhu, Xiaoxuan Ma, Dongwoo Ro, Hai Ci, Jinlu Zhang, Jiaxin Shi, Feng Gao, Qi Tian, Yizhou Wang

In this survey, we present a comprehensive literature review of human motion generation, which, to the best of our knowledge, is the first of its kind in this field.

AutoShot: A Short Video Dataset and State-of-the-Art Shot Boundary Detection

1 code implementation • Submitted to ICLR 2022 • Wentao Zhu, Yufang Huang, Xiufeng Xie, Wenxian Liu, Jincan Deng, Debing Zhang, Zhangyang Wang, Ji Liu

For video content creation and understanding, the shot boundary detection (SBD) is one of the most essential components in various scenarios.

Ranked #1 on Camera shot boundary detection on ClipShots

Boundary Detection Neural Architecture Search

Selective Structured State-Spaces for Long-Form Video Understanding

no code implementations • CVPR 2023 • Jue Wang, Wentao Zhu, Pichao Wang, Xiang Yu, Linda Liu, Mohamed Omar, Raffay Hamid

To address this limitation, we present a novel Selective S4 (i.e., S5) model that employs a lightweight mask generator to adaptively select informative image tokens, resulting in more efficient and accurate modeling of long-term spatiotemporal dependencies in videos.

Ranked #2 on Video Classification on Breakfast

Contrastive Learning Token Reduction +2

3D Human Mesh Estimation from Virtual Markers

1 code implementation • CVPR 2023 • Xiaoxuan Ma, Jiajun Su, Chunyu Wang, Wentao Zhu, Yizhou Wang

Advanced motion capture systems solve the problem by placing dense physical markers on the body surface, which allows realistic meshes to be extracted from their non-rigid motions.

Ranked #1 on 3D Human Pose Estimation on Surreal

3D Human Pose Estimation 3D Pose Estimation

Multiscale Audio Spectrogram Transformer for Efficient Audio Classification

no code implementations • 19 Mar 2023 • Wentao Zhu, Mohamed Omar

Audio events have a hierarchical architecture in both time and frequency and can be grouped together to construct more abstract semantic audio classes.

Ranked #12 on Audio Classification on VGGSound

Audio Classification Representation Learning

Dynamic Inference With Grounding Based Vision and Language Models

no code implementations • CVPR 2023 • Burak Uzkent, Amanmeet Garg, Wentao Zhu, Keval Doshi, Jingru Yi, Xiaolong Wang, Mohamed Omar

For example, recent image and language models with more than 200M parameters have been proposed to learn visual grounding in the pre-training step and show impressive results on downstream vision and language tasks.

Language Modelling Referring Expression +3

GFPose: Learning 3D Human Pose Prior with Gradient Fields

1 code implementation • CVPR 2023 • Hai Ci, Mingdong Wu, Wentao Zhu, Xiaoxuan Ma, Hao Dong, Fangwei Zhong, Yizhou Wang

During the denoising process, GFPose implicitly incorporates pose priors in gradients and unifies various discriminative and generative tasks in an elegant framework.

Ranked #1 on Multi-Hypotheses 3D Human Pose Estimation on Human3.6M

Denoising Monocular 3D Human Pose Estimation +1

Intelligent Computing: The Latest Advances, Challenges and Future

no code implementations • 21 Nov 2022 • Shiqiang Zhu, Ting Yu, Tao Xu, Hongyang Chen, Schahram Dustdar, Sylvain Gigan, Deniz Gunduz, Ekram Hossain, Yaochu Jin, Feng Lin, Bo Liu, Zhiguo Wan, Ji Zhang, Zhifeng Zhao, Wentao Zhu, Zuoning Chen, Tariq Durrani, Huaimin Wang, Jiangxing Wu, Tongyi Zhang, Yunhe Pan

In recent years, we have witnessed the emergence of intelligent computing, a new computing paradigm that is reshaping traditional computing and promoting digital revolution in the era of big data, artificial intelligence and internet-of-things with new computing theories, architectures, methods, systems, and applications.

MONAI: An open-source framework for deep learning in healthcare

1 code implementation • 4 Nov 2022 • M. Jorge Cardoso, Wenqi Li, Richard Brown, Nic Ma, Eric Kerfoot, Yiheng Wang, Benjamin Murrey, Can Zhao, Dong Yang, Vishwesh Nath, Yufan He, Ziyue Xu, Ali Hatamizadeh, Andriy Myronenko, Wentao Zhu, Yun Liu, Mingxin Zheng, Yucheng Tang, Isaac Yang, Michael Zephyr, Behrooz Hashemian, Sachidanand Alle, Mohammad Zalbagi Darestani, Charlie Budd, Marc Modat, Tom Vercauteren, Guotai Wang, Yiwen Li, Yipeng Hu, Yunguan Fu, Benjamin Gorman, Hans Johnson, Brad Genereaux, Barbaros S. Erdal, Vikash Gupta, Andres Diaz-Pinto, Andre Dourson, Lena Maier-Hein, Paul F. Jaeger, Michael Baumgartner, Jayashree Kalpathy-Cramer, Mona Flores, Justin Kirby, Lee A. D. Cooper, Holger R. Roth, Daguang Xu, David Bericat, Ralf Floca, S. Kevin Zhou, Haris Shuaib, Keyvan Farahani, Klaus H. Maier-Hein, Stephen Aylward, Prerna Dogra, Sebastien Ourselin, Andrew Feng

For AI models to be used clinically, they need to be made safe, reproducible and robust, and the underlying software framework must be aware of the particularities (e.g., geometry, physiology, physics) of medical data being processed.

Medical Image Classification medical image detection +2

MotionBERT: A Unified Perspective on Learning Human Motion Representations

1 code implementation • ICCV 2023 • Wentao Zhu, Xiaoxuan Ma, Zhaoyang Liu, Libin Liu, Wayne Wu, Yizhou Wang

We present a unified perspective on tackling various human-centric video tasks by learning human motion representations from large-scale and heterogeneous data resources.

Ranked #1 on Monocular 3D Human Pose Estimation on Human3.6M (using extra training data)

3D Pose Estimation Action Recognition +3

AVT: Audio-Video Transformer for Multimodal Action Recognition

no code implementations • Submitted to ICLR 2022 • Wentao Zhu, Jingru Yi, Kevin Hsu, Xiaohang Sun, Xiang Hao, Linda Liu, Mohamed Omar

AVT uses a combination of video and audio signals to improve action recognition accuracy, leveraging the effective spatio-temporal representation by the video Transformer.

Ranked #4 on Multi-modal Classification on VGG-Sound

Action Recognition Audio Classification +3

Multiscale Multimodal Transformer for Multimodal Action Recognition

no code implementations • Submitted to ICLR 2022 • Wentao Zhu, Jingru Yi, Xiaohang Sun, Xiang Hao, Linda Liu, Mohamed Omar

In this work, we develop a multiscale multimodal Transformer (MMT) that employs hierarchical representation learning.

Ranked #1 on Multi-modal Classification on VGG-Sound

Action Recognition Audio Classification +2

Anti-Retroactive Interference for Lifelong Learning

1 code implementation • 27 Aug 2022 • Runqi Wang, Yuxiang Bao, Baochang Zhang, Jianzhuang Liu, Wentao Zhu, Guodong Guo

Second, according to the similarity between incremental knowledge and base knowledge, we design an adaptive fusion of incremental knowledge, which helps the model allocate capacity to the knowledge of different difficulties.

Meta-Learning

CelebV-HQ: A Large-Scale Video Facial Attributes Dataset

1 code implementation • 25 Jul 2022 • Hao Zhu, Wayne Wu, Wentao Zhu, Liming Jiang, Siwei Tang, Li Zhang, Ziwei Liu, Chen Change Loy

Large-scale datasets have played indispensable roles in the recent success of face generation/editing and significantly facilitated the advances of emerging research fields.

Ranked #1 on Unconditional Video Generation on CelebV-HQ

Attribute Face Generation +1

Faster VoxelPose: Real-time 3D Human Pose Estimation by Orthographic Projection

1 code implementation • 22 Jul 2022 • Hang Ye, Wentao Zhu, Chunyu Wang, Rujie Wu, Yizhou Wang

While the voxel-based methods have achieved promising results for multi-person 3D pose estimation from multi-cameras, they suffer from heavy computation burdens, especially for large scenes.

Ranked #5 on 3D Multi-Person Pose Estimation on Campus

3D Multi-Person Pose Estimation 3D Pose Estimation

Confidence Dimension for Deep Learning based on Hoeffding Inequality and Relative Evaluation

no code implementations • 17 Mar 2022 • Runqi Wang, Linlin Yang, Baochang Zhang, Wentao Zhu, David Doermann, Guodong Guo

Research on the generalization ability of deep neural networks (DNNs) has recently attracted a great deal of attention.

Image Classification object-detection +1

Adversarial Contrastive Self-Supervised Learning

no code implementations • 26 Feb 2022 • Wentao Zhu, Hang Shang, Tingxun Lv, Chao Liao, Sen Yang, Ji Liu

Recently, learning from vast unlabeled data, especially self-supervised learning, has been emerging and attracted widespread attention.

Self-Supervised Learning

Associative Adversarial Learning Based on Selective Attack

no code implementations • 28 Dec 2021 • Runqi Wang, Xiaoyue Duan, Baochang Zhang, Song Xue, Wentao Zhu, David Doermann, Guodong Guo

We show that our method improves the recognition accuracy of adversarial training on ImageNet by 8.32% compared with the baseline.

Adversarial Robustness Few-Shot Learning +2

MoCaNet: Motion Retargeting in-the-wild via Canonicalization Networks

no code implementations • 19 Dec 2021 • Wentao Zhu, Zhuoqian Yang, Ziang Di, Wayne Wu, Yizhou Wang, Chen Change Loy

Trained with the canonicalization operations and the derived regularizations, our method learns to factorize a skeleton sequence into three independent semantic subspaces, i.e., motion, structure, and view angle.

3D Reconstruction Action Analysis +2

Self-Supervised Monocular Depth and Ego-Motion Estimation in Endoscopy: Appearance Flow to the Rescue

1 code implementation • 15 Dec 2021 • Shuwei Shao, Zhongcai Pei, Weihai Chen, Wentao Zhu, Xingming Wu, Dianmin Sun, Baochang Zhang

Recently, self-supervised learning technology has been applied to calculate depth and ego-motion from monocular videos, achieving remarkable performance in autonomous driving scenarios.

Depth Estimation Motion Estimation +1

Towards Comprehensive Monocular Depth Estimation: Multiple Heads Are Better Than One

no code implementations • 16 Nov 2021 • Shuwei Shao, Ran Li, Zhongcai Pei, Zhong Liu, Weihai Chen, Wentao Zhu, Xingming Wu, Baochang Zhang

In this work, we investigate the phenomenon and propose to integrate the strengths of multiple weak depth predictors to build a comprehensive and accurate depth predictor, which is critical for many real-world applications, e.g., 3D reconstruction.

3D Reconstruction Ensemble Learning +2

Joint Channel and Weight Pruning for Model Acceleration on Moblie Devices

1 code implementation • 15 Oct 2021 • Tianli Zhao, Xi Sheryl Zhang, Wentao Zhu, Jiaxing Wang, Sen Yang, Ji Liu, Jian Cheng

In this paper, we present a unified framework with Joint Channel pruning and Weight pruning (JCW) that achieves a better Pareto frontier between latency and accuracy than previous model compression approaches.

Model Compression

SpeechNAS: Towards Better Trade-off between Latency and Accuracy for Large-Scale Speaker Verification

1 code implementation • 18 Sep 2021 • Wentao Zhu, Tianlong Kong, Shun Lu, Jixiang Li, Dawei Zhang, Feng Deng, Xiaorui Wang, Sen Yang, Ji Liu

Recently, x-vector has been a successful and popular approach for speaker verification, which employs a time delay neural network (TDNN) and statistics pooling to extract speaker characterizing embedding from variable-length utterances.

Ranked #1 on Speaker Verification on VoxCeleb1

Neural Architecture Search Speaker Recognition +2

Shifted Chunk Transformer for Spatio-Temporal Representational Learning

no code implementations • NeurIPS 2021 • Xuefan Zha, Wentao Zhu, Tingxun Lv, Sen Yang, Ji Liu

However, the pure-Transformer based spatio-temporal learning can be prohibitively costly on memory and computation to extract fine-grained features from a tiny patch.

Action Anticipation Action Recognition +4

Federated Whole Prostate Segmentation in MRI with Personalized Neural Architectures

no code implementations • 16 Jul 2021 • Holger R. Roth, Dong Yang, Wenqi Li, Andriy Myronenko, Wentao Zhu, Ziyue Xu, Xiaosong Wang, Daguang Xu

Building robust deep learning-based models requires diverse training data, ideally from several sources.

Federated Learning Neural Architecture Search

Test-Time Training for Deformable Multi-Scale Image Registration

no code implementations • 25 Mar 2021 • Wentao Zhu, Yufang Huang, Daguang Xu, Zhen Qian, Wei Fan, Xiaohui Xie

Registration is a fundamental task in medical robotics and is often a crucial step for many downstream tasks such as motion analysis, intra-operative tracking and image segmentation.

Image Registration Image Segmentation +1

Deformable Gabor Feature Networks for Biomedical Image Classification

no code implementations • 7 Dec 2020 • Xuan Gong, Xin Xia, Wentao Zhu, Baochang Zhang, David Doermann, Lian Zhuo

In recent years, deep learning has dominated progress in the field of medical image analysis.

Classification General Classification +2

Federated Semi-Supervised Learning for COVID Region Segmentation in Chest CT using Multi-National Data from China, Italy, Japan

no code implementations • 23 Nov 2020 • Dong Yang, Ziyue Xu, Wenqi Li, Andriy Myronenko, Holger R. Roth, Stephanie Harmon, Sheng Xu, Baris Turkbey, Evrim Turkbey, Xiaosong Wang, Wentao Zhu, Gianpaolo Carrafiello, Francesca Patella, Maurizio Cariati, Hirofumi Obinata, Hitoshi Mori, Kaku Tamura, Peng An, Bradford J. Wood, Daguang Xu

To facilitate CT analysis, recent efforts have focused on computer-aided characterization and diagnosis, which has shown promising results.

Federated Learning Management

Cycle-Consistent Adversarial Autoencoders for Unsupervised Text Style Transfer

no code implementations • COLING 2020 • Yufang Huang, Wentao Zhu, Deyi Xiong, Yiye Zhang, Changjian Hu, Feiyu Xu

Unsupervised text style transfer is full of challenges due to the lack of parallel data and difficulties in content preservation.

Style Transfer Text Style Transfer +1

Multi-Domain Image Completion for Random Missing Input Data

no code implementations • 10 Jul 2020 • Liyue Shen, Wentao Zhu, Xiaosong Wang, Lei Xing, John M. Pauly, Baris Turkbey, Stephanie Anne Harmon, Thomas Hogue Sanford, Sherif Mehralivand, Peter Choyke, Bradford Wood, Daguang Xu

Multi-domain data are widely leveraged in vision applications taking advantage of complementary information from different modalities, e.g., brain tumor segmentation from multi-parametric magnetic resonance imaging (MRI).

Brain Tumor Segmentation Disentanglement +3

Cardiac Segmentation on Late Gadolinium Enhancement MRI: A Benchmark Study from Multi-Sequence Cardiac MR Segmentation Challenge

no code implementations • 22 Jun 2020 • Xiahai Zhuang, Jiahang Xu, Xinzhe Luo, Chen Chen, Cheng Ouyang, Daniel Rueckert, Victor M. Campello, Karim Lekadir, Sulaiman Vesal, Nishant Ravikumar, Yashu Liu, Gongning Luo, Jingkun Chen, Hongwei Li, Buntheng Ly, Maxime Sermesant, Holger Roth, Wentao Zhu, Jiexiang Wang, Xinghao Ding, Xinyue Wang, Sen Yang, Lei Li

In addition, the paired MS-CMR images could enable algorithms to combine the complementary information from the other sequences for the segmentation of LGE CMR.

Cardiac Segmentation Management +1

LAMP: Large Deep Nets with Automated Model Parallelism for Image Segmentation

1 code implementation • 22 Jun 2020 • Wentao Zhu, Can Zhao, Wenqi Li, Holger Roth, Ziyue Xu, Daguang Xu

In this work, we introduce Large deep 3D ConvNets with Automated Model Parallelism (LAMP) and investigate the impact of both input's and deep 3D ConvNets' size on segmentation accuracy.

Image Segmentation Segmentation +1

TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting

no code implementations • CVPR 2020 • Zhuoqian Yang, Wentao Zhu, Wayne Wu, Chen Qian, Qiang Zhou, Bolei Zhou, Chen Change Loy

We present a lightweight video motion retargeting approach TransMoMo that is capable of transferring motion of a person in a source video realistically to another video of a target person.

motion retargeting

NeurReg: Neural Registration and Its Application to Image Segmentation

1 code implementation • 4 Oct 2019 • Wentao Zhu, Andriy Myronenko, Ziyue Xu, Wenqi Li, Holger Roth, Yufang Huang, Fausto Milletari, Daguang Xu

Furthermore, we design three segmentation frameworks based on the proposed registration framework: 1) atlas-based segmentation, 2) joint learning of both segmentation and registration tasks, and 3) multi-task learning with atlas-based segmentation as an intermediate feature.

Image Registration Image Segmentation +3

Cardiac Segmentation of LGE MRI with Noisy Labels

no code implementations • 2 Oct 2019 • Holger Roth, Wentao Zhu, Dong Yang, Ziyue Xu, Daguang Xu

In the first step, we register a small set of five LGE cardiac magnetic resonance (CMR) images with ground truth labels to a set of 40 target LGE CMR images without annotation.

Cardiac Segmentation Data Augmentation +2

Privacy-preserving Federated Brain Tumour Segmentation

no code implementations • 2 Oct 2019 • Wenqi Li, Fausto Milletarì, Daguang Xu, Nicola Rieke, Jonny Hancox, Wentao Zhu, Maximilian Baust, Yan Cheng, Sébastien Ourselin, M. Jorge Cardoso, Andrew Feng

Due to medical data privacy regulations, it is often infeasible to collect and share patient data in a centralised data lake.

Federated Learning Privacy Preserving +1

Neural Multi-Scale Self-Supervised Registration for Echocardiogram Dense Tracking

no code implementations • 18 Jun 2019 • Wentao Zhu, Yufang Huang, Mani A. Vannan, Shizhen Liu, Daguang Xu, Wei Fan, Zhen Qian, Xiaohui Xie

In this work, we propose a neural multi-scale self-supervised registration (NMSR) method for automated myocardial and cardiac blood flow dense tracking.

Deep Learning for Automated Medical Image Analysis

no code implementations • 12 Mar 2019 • Wentao Zhu

Second, we will demonstrate how to use weakly labeled data for mammogram breast cancer diagnosis by efficiently designing deep learning for multi-instance learning.

Anatomy Lung Nodule Detection

AnatomyNet: Deep Learning for Fast and Fully Automated Whole-volume Segmentation of Head and Neck Anatomy

2 code implementations • 15 Aug 2018 • Wentao Zhu, Yufang Huang, Liang Zeng, Xuming Chen, Yong Liu, Zhen Qian, Nan Du, Wei Fan, Xiaohui Xie

Methods: Our deep learning model, called AnatomyNet, segments OARs from head and neck CT images in an end-to-end fashion, receiving whole-volume HaN CT images as input and generating masks of all OARs of interest in one shot.

Ranked #1 on Medical Image Segmentation on MICCAI 2015 Head and Neck Challenge

3D Medical Imaging Segmentation Anatomy

DeepEM: Deep 3D ConvNets With EM For Weakly Supervised Pulmonary Nodule Detection

2 code implementations • 14 May 2018 • Wentao Zhu, Yeeleng S. Vang, Yufang Huang, Xiaohui Xie

Recently deep learning has been witnessing widespread adoption in various medical image applications.

Computed Tomography (CT) Lung Nodule Detection

DeepLung: Deep 3D Dual Path Nets for Automated Pulmonary Nodule Detection and Classification

2 code implementations • 25 Jan 2018 • Wentao Zhu, Chaochun Liu, Wei Fan, Xiaohui Xie

DeepLung consists of two components, nodule detection (identifying the locations of candidate nodules) and classification (classifying candidate nodules into benign or malignant).

Ranked #5 on Lung Nodule Classification on LIDC-IDRI

Classification Computed Tomography (CT) +2

Adversarial Deep Structured Nets for Mass Segmentation from Mammograms

1 code implementation • 24 Oct 2017 • Wentao Zhu, Xiang Xiang, Trac D. Tran, Gregory D. Hager, Xiaohui Xie

Mass segmentation provides effective morphological features which are important for mass diagnosis.

Mass Segmentation From Mammograms Position +1

DeepLung: 3D Deep Convolutional Nets for Automated Pulmonary Nodule Detection and Classification

no code implementations • 16 Sep 2017 • Wentao Zhu, Chaochun Liu, Wei Fan, Xiaohui Xie

Considering the 3D nature of lung CT data, two 3D networks are designed for the nodule detection and classification respectively.

Automated Pulmonary Nodule Detection And Classification Classification +1

Deep Multi-instance Networks with Sparse Label Assignment for Whole Mammogram Classification

1 code implementation • 23 May 2017 • Wentao Zhu, Qi Lou, Yeeleng Scott Vang, Xiaohui Xie

Inspired by the success of using deep convolutional features for natural image analysis and multi-instance learning (MIL) for labeling a set of instances/patches, we propose end-to-end trained deep multi-instance networks for mass classification based on whole mammogram without the aforementioned ROIs.

Ranked #9 on Suspicous (BIRADS 4,5)-no suspicous (BIRADS 1,2,3) per image classification on InBreast

Classification General Classification +2

Leak Event Identification in Water Systems Using High Order CRF

no code implementations • 12 Mar 2017 • Qing Han, Wentao Zhu, Yang Shi

Today, detection of anomalous events in civil infrastructures (e.g., water pipe breaks and leaks) is time consuming and often takes hours or days.

Vocal Bursts Intensity Prediction

Deep Multi-instance Networks with Sparse Label Assignment for Whole Mammogram Classification

no code implementations • 18 Dec 2016 • Wentao Zhu, Qi Lou, Yeeleng Scott Vang, Xiaohui Xie

Inspired by the success of using deep convolutional features for natural image analysis and multi-instance learning for labeling a set of instances/patches, we propose end-to-end trained deep multi-instance networks for mass classification based on whole mammogram without the aforementioned costly need to annotate the training data.

Classification General Classification +1

Adversarial Deep Structural Networks for Mammographic Mass Segmentation

1 code implementation • 18 Dec 2016 • Wentao Zhu, Xiang Xiang, Trac D. Tran, Xiaohui Xie

Experimental results on two public datasets, INbreast and DDSM-BCRP, show that our end-to-end network combined with adversarial training achieves state-of-the-art results.

Position Segmentation

Co-occurrence Feature Learning for Skeleton based Action Recognition using Regularized Deep LSTM Networks

no code implementations • 24 Mar 2016 • Wentao Zhu, Cuiling Lan, Junliang Xing, Wen-Jun Zeng, Yanghao Li, Li Shen, Xiaohui Xie

Skeleton based action recognition distinguishes human actions using the trajectories of skeleton joints, which provide a very good representation for describing actions.

Action Recognition Skeleton Based Action Recognition +1

Deep Trans-layer Unsupervised Networks for Representation Learning

no code implementations • 27 Sep 2015 • Wentao Zhu, Jun Miao, Laiyun Qing, Xilin Chen

Compared to traditional deep learning methods, the implemented feature learning method has far fewer parameters and is validated in several typical experiments, such as digit recognition on MNIST and MNIST variations, object recognition on the Caltech 101 dataset and face verification on the LFW dataset.

Face Verification Object Recognition +1

Constrained Extreme Learning Machines: A Study on Classification Cases

1 code implementation • 25 Jan 2015 • Wentao Zhu, Jun Miao, Laiyun Qing

Extreme learning machine (ELM) is an extremely fast learning method whose strong performance on pattern recognition tasks has been demonstrated by many researchers and engineers.

Classification General Classification
