Search Results for author: Zhirong Wu

Found 31 papers, 20 papers with code

Exploring Transferability for Randomized Smoothing

no code implementations • 14 Dec 2023 • Kai Qiu, Huishuai Zhang, Zhirong Wu, Stephen Lin

However, the model robustness, which is a critical aspect for safety, is often optimized for each specific task rather than at the pretraining stage.


NuTime: Numerically Multi-Scaled Embedding for Large-Scale Time Series Pretraining

no code implementations • 11 Oct 2023 • Chenguo Lin, Xumeng Wen, Wei Cao, Congrui Huang, Jiang Bian, Stephen Lin, Zhirong Wu

In this work, we make key technical contributions that are tailored to the numerical properties of time-series data and allow the model to scale to large datasets, e.g., millions of temporal sequences.

Learning Semantic Representations Temporal Sequences +1


Associative Transformer

1 code implementation • 22 Sep 2023 • Yuwei Sun, Hideya Ochiai, Zhirong Wu, Stephen Lin, Ryota Kanai

Existing studies such as the Coordination method employ iterative cross-attention mechanisms with a bottleneck to enable the sparse association of inputs.

Artificial Global Workspace Inductive Bias +2


Bootstrapping Objectness from Videos by Relaxed Common Fate and Visual Grouping

1 code implementation • CVPR 2023 • Long Lian, Zhirong Wu, Stella X. Yu

The Gestalt law of common fate, i.e., what move at the same speed belong together, has inspired unsupervised object discovery based on motion segmentation.

Ranked #1 on Unsupervised Object Segmentation on FBMS-59

Motion Segmentation Object +7


Randomized Quantization: A Generic Augmentation for Data Agnostic Self-supervised Learning

1 code implementation • ICCV 2023 • Huimin Wu, Chenyang Lei, Xiao Sun, Peng-Shuai Wang, Qifeng Chen, Kwang-Ting Cheng, Stephen Lin, Zhirong Wu

Self-supervised representation learning follows a paradigm of withholding some part of the data and tasking the network to predict it from the remaining part.

Data Augmentation Quantization +2


Improving Unsupervised Video Object Segmentation with Motion-Appearance Synergy

no code implementations • 17 Dec 2022 • Long Lian, Zhirong Wu, Stella X. Yu

Previous methods in unsupervised video object segmentation (UVOS) have demonstrated the effectiveness of motion as either input or supervision for segmentation.

Misconceptions Object +5


ClipCrop: Conditioned Cropping Driven by Vision-Language Model

no code implementations • 21 Nov 2022 • Zhihang Zhong, Mingxi Cheng, Zhirong Wu, Yuhui Yuan, Yinqiang Zheng, Ji Li, Han Hu, Stephen Lin, Yoichi Sato, Imari Sato

Image cropping has progressed tremendously under the data-driven paradigm.

Decoder Image Cropping +1


Animation from Blur: Multi-modal Blur Decomposition with Motion Guidance

1 code implementation • 20 Jul 2022 • Zhihang Zhong, Xiao Sun, Zhirong Wu, Yinqiang Zheng, Stephen Lin, Imari Sato

Existing solutions to this problem estimate a single image sequence without considering the motion ambiguity for each region.

Optical Flow Estimation Quantization


Extreme Masking for Learning Instance and Distributed Visual Representations

1 code implementation • 9 Jun 2022 • Zhirong Wu, Zihang Lai, Xiao Sun, Stephen Lin

The paper presents a scalable approach for learning spatially distributed visual representations over individual tokens and a holistic instance representation simultaneously.

Data Augmentation Representation Learning


Bringing Rolling Shutter Images Alive with Dual Reversed Distortion

1 code implementation • 12 Mar 2022 • Zhihang Zhong, Mingdeng Cao, Xiao Sun, Zhirong Wu, Zhongyi Zhou, Yinqiang Zheng, Stephen Lin, Imari Sato

In this paper, instead of two consecutive frames, we propose to exploit a pair of images captured by dual RS cameras with reversed RS directions for this highly challenging task.

Optical Flow Estimation


A Simple Multi-Modality Transfer Learning Baseline for Sign Language Translation

4 code implementations • CVPR 2022 • Yutong Chen, Fangyun Wei, Xiao Sun, Zhirong Wu, Stephen Lin

Concretely, we pretrain the sign-to-gloss visual network on the general domain of human actions and the within-domain of a sign-to-gloss dataset, and pretrain the gloss-to-text translation network on the general domain of a multilingual corpus and the within-domain of a gloss-to-text corpus.

Ranked #3 on Sign Language Translation on CSL-Daily

Sign Language Recognition Sign Language Translation +2

213


Debiased Learning from Naturally Imbalanced Pseudo-Labels

1 code implementation • CVPR 2022 • Xudong Wang, Zhirong Wu, Long Lian, Stella X. Yu

Our key insight is that pseudo-labels are naturally imbalanced due to intrinsic data similarity, even when a model is trained on balanced source data and evaluated on balanced target data.

Ranked #1 on Few-Shot Image Classification on ImageNet - 0-Shot (using extra training data)

counterfactual Counterfactual Reasoning +4


Towards Tokenized Human Dynamics Representation

1 code implementation • 22 Nov 2021 • Kenneth Li, Xiao Sun, Zhirong Wu, Fangyun Wei, Stephen Lin

For human action understanding, a popular research direction is to analyze short video clips with unambiguous semantic content, such as jumping and drinking.

Action Segmentation Action Understanding +3


One-Shot Generative Domain Adaptation

no code implementations • ICCV 2023 • Ceyuan Yang, Yujun Shen, Zhiyi Zhang, Yinghao Xu, Jiapeng Zhu, Zhirong Wu, Bolei Zhou

We then equip the well-learned discriminator backbone with an attribute classifier to ensure that the generator captures the appropriate characters from the reference.

Attribute Domain Adaptation +1


The Emergence of Objectness: Learning Zero-Shot Segmentation from Videos

1 code implementation • NeurIPS 2021 • Runtao Liu, Zhirong Wu, Stella X. Yu, Stephen Lin

Our model starts with two separate pathways: an appearance pathway that outputs feature-based region segmentation for a single image, and a motion pathway that outputs motion features for a pair of images.

Ranked #7 on Unsupervised Object Segmentation on FBMS-59

Contrastive Learning Image Segmentation +6


Self-supervised Discovery of Human Actons from Long Kinematic Videos

no code implementations • 29 Sep 2021 • Kenneth Li, Xiao Sun, Zhirong Wu, Fangyun Wei, Stephen Lin

However, methods for understanding short semantic actions cannot be directly translated to long kinematic sequences such as dancing, where it becomes challenging even to semantically label the human movements.

Action Understanding Sentence


Aligning Pretraining for Detection via Object-Level Contrastive Learning

1 code implementation • NeurIPS 2021 • Fangyun Wei, Yue Gao, Zhirong Wu, Han Hu, Stephen Lin

Image-level contrastive representation learning has proven to be highly effective as a generic model for transfer learning.

Contrastive Learning Object +6

170


Instance Localization for Self-supervised Detection Pretraining

1 code implementation • CVPR 2021 • Ceyuan Yang, Zhirong Wu, Bolei Zhou, Stephen Lin

The pretext task is to predict the instance category given the composited images as well as the foreground bounding boxes.

Classification General Classification +6

145


Consistent Instance Classification for Unsupervised Representation Learning

no code implementations • 1 Jan 2021 • Depu Meng, Zigang Geng, Zhirong Wu, Bin Xiao, Houqiang Li, Jingdong Wang

The proposed consistent instance classification (ConIC) approach simultaneously optimizes the classification loss and an additional consistency loss explicitly penalizing the feature dissimilarity between the augmented views from the same instance.

Classification General Classification +2


Unsupervised 3D Learning for Shape Analysis via Multiresolution Instance Discrimination

1 code implementation • 3 Aug 2020 • Peng-Shuai Wang, Yu-Qi Yang, Qian-Fang Zou, Zhirong Wu, Yang Liu, Xin Tong

Although unsupervised feature learning has demonstrated its advantages in reducing the workload of data labeling and network design in many fields, existing unsupervised 3D learning methods still cannot offer a generic network for various shape analysis tasks with competitive performance to supervised methods.

Ranked #2 on 3D Semantic Segmentation on PartNet

3D Point Cloud Linear Classification 3D Semantic Segmentation

707


What makes instance discrimination good for transfer learning?

no code implementations • ICLR 2021 • Nanxuan Zhao, Zhirong Wu, Rynson W. H. Lau, Stephen Lin

Contrastive visual pretraining based on the instance discrimination pretext task has made significant progress.

object-detection Object Detection +1


A Transductive Approach for Video Object Segmentation

1 code implementation • CVPR 2020 • Yizhuo Zhang, Zhirong Wu, Houwen Peng, Stephen Lin

Semi-supervised video object segmentation aims to separate a target object from a video sequence, given the mask in the first frame.

Ranked #15 on Semi-Supervised Video Object Segmentation on DAVIS (no YouTube-VOS training)

Instance Segmentation Object +4

155


Distilling Localization for Self-Supervised Representation Learning

no code implementations • 14 Apr 2020 • Nanxuan Zhao, Zhirong Wu, Rynson W. H. Lau, Stephen Lin

To address this problem, we propose a data-driven approach for learning invariance to backgrounds.

Colorization Contrastive Learning +8


Deep Metric Transfer for Label Propagation with Limited Annotated Data

1 code implementation • 20 Dec 2018 • Bin Liu, Zhirong Wu, Han Hu, Stephen Lin

In this paper, we propose a generic framework that utilizes unlabeled data to aid generalization for all three tasks.

Metric Learning Object Recognition +1


Improving Generalization via Scalable Neighborhood Component Analysis

2 code implementations • ECCV 2018 • Zhirong Wu, Alexei A. Efros, Stella X. Yu

Current major approaches to visual recognition follow an end-to-end formulation that classifies an input image into one of the pre-determined set of semantic categories.

136


Unsupervised Feature Learning via Non-Parametric Instance Discrimination

4 code implementations • CVPR 2018 • Zhirong Wu, Yuanjun Xiong, Stella X. Yu, Dahua Lin

Neural net classifiers trained on data with annotated class labels can also capture apparent visual similarity among categories without being directed to do so.

Ranked #40 on Semi-Supervised Image Classification on ImageNet - 1% labeled data (Top 5 Accuracy metric)

General Classification object-detection +4

3,095


Unsupervised Feature Learning via Non-Parametric Instance-level Discrimination

14 code implementations • 5 May 2018 • Zhirong Wu, Yuanjun Xiong, Stella Yu, Dahua Lin

Neural net classifiers trained on data with annotated class labels can also capture apparent visual similarity among categories without being directed to do so.

Ranked #13 on Contrastive Learning on imagenet-1k

Contrastive Learning General Classification +3

3,230


Temporal Action Detection with Structured Segment Networks

6 code implementations • ICCV 2017 • Yue Zhao, Yuanjun Xiong, Li-Min Wang, Zhirong Wu, Xiaoou Tang, Dahua Lin

Detecting actions in untrimmed videos is an important yet challenging task.

Ranked #6 on Action Recognition on THUMOS’14

Action Detection Action Recognition +1

3,924


Deep Markov Random Field for Image Modeling

1 code implementation • 7 Sep 2016 • Zhirong Wu, Dahua Lin, Xiaoou Tang

Markov Random Fields (MRFs), a formulation widely used in generative image modeling, have long been plagued by the lack of expressive power.


Adjustable Bounded Rectifiers: Towards Deep Binary Representations

no code implementations • 19 Nov 2015 • Zhirong Wu, Dahua Lin, Xiaoou Tang

This suggests that the semantic structure of a neural network may be manifested through a guided binarization process.

Binarization


3D ShapeNets: A Deep Representation for Volumetric Shapes

no code implementations • CVPR 2015 • Zhirong Wu, Shuran Song, Aditya Khosla, Fisher Yu, Linguang Zhang, Xiaoou Tang, Jianxiong Xiao

Our model, 3D ShapeNets, learns the distribution of complex 3D shapes across different object categories and arbitrary poses from raw CAD data, and discovers hierarchical compositional part representations automatically.

Ranked #35 on 3D Point Cloud Classification on ModelNet40 (Mean Accuracy metric)

3D Point Cloud Classification 3D Shape Representation +2

