Search Results for author: Fanyi Xiao

Found 28 papers, 11 papers with code

Gen2Det: Generate to Detect

no code implementations • 7 Dec 2023 • Saksham Suri, Fanyi Xiao, Animesh Sinha, Sean Chang Culatana, Raghuraman Krishnamoorthi, Chenchen Zhu, Abhinav Shrivastava

In the long-tailed detection setting on LVIS, Gen2Det improves the performance on rare categories by a large margin while also significantly improving the performance on other categories, e.g., we see an improvement of 2.13 Box AP and 1.84 Mask AP over just training on real data on LVIS with Mask R-CNN.

Image Generation Object +2

Diversify, Don't Fine-Tune: Scaling Up Visual Recognition Training with Synthetic Images

no code implementations • 4 Dec 2023 • Zhuoran Yu, Chenchen Zhu, Sean Culatana, Raghuraman Krishnamoorthi, Fanyi Xiao, Yong Jae Lee

We present a new framework leveraging off-the-shelf generative models to generate synthetic training images, addressing multiple challenges: class name ambiguity, lack of diversity in naive prompts, and domain shifts.

Domain Generalization Text-to-Image Generation

EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything

1 code implementation • 1 Dec 2023 • Yunyang Xiong, Bala Varadarajan, Lemeng Wu, Xiaoyu Xiang, Fanyi Xiao, Chenchen Zhu, Xiaoliang Dai, Dilin Wang, Fei Sun, Forrest Iandola, Raghuraman Krishnamoorthi, Vikas Chandra

On segment anything tasks such as zero-shot instance segmentation, our EfficientSAMs with SAMI-pretrained lightweight image encoders perform favorably, with a significant gain (e.g., ~4 AP on COCO/LVIS) over other fast SAM models.

Ranked #3 on Zero-Shot Instance Segmentation on LVIS v1.0 val

Image Classification Instance Segmentation +5

1,753

EgoObjects: A Large-Scale Egocentric Dataset for Fine-Grained Object Understanding

1 code implementation • ICCV 2023 • Chenchen Zhu, Fanyi Xiao, Andres Alvarado, Yasmine Babaei, Jiabo Hu, Hichem El-Mohri, Sean Chang Culatana, Roshan Sumbaly, Zhicheng Yan

To bootstrap research on EgoObjects, we present a suite of 4 benchmark tasks around egocentric object understanding, including a novel instance-level and the classical category-level object detection.

Continual Learning Object +2

Exploring Open-Vocabulary Semantic Segmentation without Human Labels

no code implementations • 1 Jun 2023 • Jun Chen, Deyao Zhu, Guocheng Qian, Bernard Ghanem, Zhicheng Yan, Chenchen Zhu, Fanyi Xiao, Mohamed Elhoseiny, Sean Chang Culatana

Although VL models have acquired extensive knowledge of visual concepts, it is non-trivial to exploit that knowledge for the task of semantic segmentation, as these models are usually trained at the image level.

Open Vocabulary Semantic Segmentation Segmentation +3

Going Denser with Open-Vocabulary Part Segmentation

2 code implementations • ICCV 2023 • Peize Sun, Shoufa Chen, Chenchen Zhu, Fanyi Xiao, Ping Luo, Saining Xie, Zhicheng Yan

In this paper, we propose a detector with the ability to predict both open-vocabulary objects and their part segmentation.

Object object-detection +3

361

Exploring Open-Vocabulary Semantic Segmentation from CLIP Vision Encoder Distillation Only

no code implementations • ICCV 2023 • Jun Chen, Deyao Zhu, Guocheng Qian, Bernard Ghanem, Zhicheng Yan, Chenchen Zhu, Fanyi Xiao, Sean Chang Culatana, Mohamed Elhoseiny

Semantic segmentation is a crucial task in computer vision that involves segmenting images into semantically meaningful regions at the pixel level.

Open Vocabulary Semantic Segmentation Segmentation +3

3rd Continual Learning Workshop Challenge on Egocentric Category and Instance Level Object Understanding

1 code implementation • 13 Dec 2022 • Lorenzo Pellegrini, Chenchen Zhu, Fanyi Xiao, Zhicheng Yan, Antonio Carta, Matthias De Lange, Vincenzo Lomonaco, Roshan Sumbaly, Pau Rodriguez, David Vazquez

Continual Learning, also known as Lifelong or Incremental Learning, has recently gained renewed interest among the Artificial Intelligence research community.

Continual Learning Incremental Learning +3

SCVRL: Shuffled Contrastive Video Representation Learning

no code implementations • 24 May 2022 • Michael Dorkenwald, Fanyi Xiao, Biagio Brattoli, Joseph Tighe, Davide Modolo

We propose SCVRL, a novel contrastive-based framework for self-supervised learning for videos.

Contrastive Learning Representation Learning +1

Hierarchical Self-supervised Representation Learning for Movie Understanding

no code implementations • CVPR 2022 • Fanyi Xiao, Kaustav Kundu, Joseph Tighe, Davide Modolo

Most self-supervised video representation learning approaches focus on action recognition.

Action Recognition Contrastive Learning +1

MaCLR: Motion-aware Contrastive Learning of Representations for Videos

1 code implementation • 17 Jun 2021 • Fanyi Xiao, Joseph Tighe, Davide Modolo

We present MaCLR, a novel method to explicitly perform cross-modal self-supervised video representation learning from visual and motion modalities.

Action Detection Action Recognition +2

YolactEdge: Real-time Instance Segmentation on the Edge

2 code implementations • 22 Dec 2020 • Haotian Liu, Rafael A. Rivera Soto, Fanyi Xiao, Yong Jae Lee

We propose YolactEdge, the first competitive instance segmentation approach that runs on small edge devices at real-time speeds.

Real-time Instance Segmentation Semantic Segmentation

1,257

MARS: Mixed Virtual and Real Wearable Sensors for Human Activity Recognition with Multi-Domain Deep Learning Model

no code implementations • 20 Sep 2020 • Ling Pei, Songpengcheng Xia, Lei Chu, Fanyi Xiao, Qi Wu, Wenxian Yu, Robert Qiu

Together with the rapid development of the Internet of Things (IoT), human activity recognition (HAR) using wearable Inertial Measurement Units (IMUs) has become a promising technology for many research areas.

Human Activity Recognition Transfer Learning

Delving Deeper into Anti-aliasing in ConvNets

2 code implementations • 21 Aug 2020 • Xueyan Zou, Fanyi Xiao, Zhiding Yu, Yong Jae Lee

Aliasing refers to the phenomenon in which high-frequency signals degenerate into completely different ones after sampling.

Instance Segmentation Segmentation +1

187

A Deep Learning Method for Complex Human Activity Recognition Using Virtual Wearable Sensors

no code implementations • 4 Mar 2020 • Fanyi Xiao, Ling Pei, Lei Chu, Danping Zou, Wenxian Yu, Yifan Zhu, Tao Li

The experimental results show that the proposed method can surprisingly converge in a few iterations and achieve an accuracy of 91.15% on a real IMU dataset, demonstrating the efficiency and effectiveness of the proposed method.

Human Activity Recognition Transfer Learning

Audiovisual SlowFast Networks for Video Recognition

3 code implementations • 23 Jan 2020 • Fanyi Xiao, Yong Jae Lee, Kristen Grauman, Jitendra Malik, Christoph Feichtenhofer

We present Audiovisual SlowFast Networks, an architecture for integrated audiovisual perception.

Action Classification Video Recognition

6,274

YOLACT++: Better Real-time Instance Segmentation

36 code implementations • 3 Dec 2019 • Daniel Bolya, Chong Zhou, Fanyi Xiao, Yong Jae Lee

Then we produce instance masks by linearly combining the prototypes with the mask coefficients.

Ranked #15 on Real-time Instance Segmentation on MSCOCO (using extra training data)

Real-time Instance Segmentation Segmentation +1

4,925

Identity From Here, Pose From There: Self-Supervised Disentanglement and Generation of Objects Using Unlabeled Videos

no code implementations • ICCV 2019 • Fanyi Xiao, Haotian Liu, Yong Jae Lee

We propose a novel approach that disentangles the identity and pose of objects for image generation.

Disentanglement Image Generation

STEP: Spatio-Temporal Progressive Learning for Video Action Detection

1 code implementation • CVPR 2019 • Xitong Yang, Xiaodong Yang, Ming-Yu Liu, Fanyi Xiao, Larry Davis, Jan Kautz

In this paper, we propose the Spatio-TEmporal Progressive (STEP) action detector, a progressive learning framework for spatio-temporal action detection in videos.

Ranked #7 on Action Detection on UCF101-24

Action Detection Action Recognition

245

YOLACT: Real-time Instance Segmentation

48 code implementations • ICCV 2019 • Daniel Bolya, Chong Zhou, Fanyi Xiao, Yong Jae Lee

Then we produce instance masks by linearly combining the prototypes with the mask coefficients.

Ranked #21 on Real-time Instance Segmentation on MSCOCO (using extra training data)

Real-time Instance Segmentation Segmentation +2

27,806

Video Object Detection with an Aligned Spatial-Temporal Memory

no code implementations • ECCV 2018 • Fanyi Xiao, Yong Jae Lee

We introduce Spatial-Temporal Memory Networks for video object detection.

Object object-detection +1

Who Will Share My Image? Predicting the Content Diffusion Path in Online Social Networks

no code implementations • 25 May 2017 • Wenjian Hu, Krishna Kumar Singh, Fanyi Xiao, Jinyoung Han, Chen-Nee Chuah, Yong Jae Lee

Content popularity prediction has been extensively studied due to its importance and interest for both users and hosts of social media sites like Facebook, Instagram, Twitter, and Pinterest.

Weakly-supervised Visual Grounding of Phrases with Linguistic Structures

no code implementations • CVPR 2017 • Fanyi Xiao, Leonid Sigal, Yong Jae Lee

We propose a weakly-supervised approach that takes image-sentence pairs as input and learns to visually ground (i.e., localize) arbitrary linguistic phrases, in the form of spatial attention masks.

Sentence Visual Grounding

Track and Segment: An Iterative Unsupervised Approach for Video Object Proposals

no code implementations • CVPR 2016 • Fanyi Xiao, Yong Jae Lee

We present an unsupervised approach that generates a diverse, ranked set of bounding box and segmentation video object proposals (spatio-temporal tubes that localize the foreground objects) in an unannotated video.

Segmentation

Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection

no code implementations • CVPR 2016 • Krishna Kumar Singh, Fanyi Xiao, Yong Jae Lee

The status quo approach to training object detectors requires expensive bounding box annotations.

Object object-detection +1

Discovering the Spatial Extent of Relative Attributes

no code implementations • ICCV 2015 • Fanyi Xiao, Yong Jae Lee

We present a weakly-supervised approach that discovers the spatial extent of relative attributes, given only pairs of ordered images.

Attribute

Transitive Distance Clustering with K-Means Duality

no code implementations • CVPR 2014 • Zhiding Yu, Chunjing Xu, Deyu Meng, Zhuo Hui, Fanyi Xiao, Wenbo Liu, Jianzhuang Liu

We propose a very intuitive and simple approximation for the conventional spectral clustering methods.

Clustering Image Segmentation +1

Multi-Task Regularization with Covariance Dictionary for Linear Classifiers

no code implementations • 21 Oct 2013 • Fanyi Xiao, Ruikun Luo, Zhiding Yu

In this paper we propose a multi-task linear classifier learning problem called D-SVM (Dictionary SVM).

Transfer Learning valid
