no code implementations • 20 Jan 2024 • Yanlong Zang, Han Yang, Jiaxu Miao, Yi Yang
Image-based virtual try-on systems, which fit new garments onto human portraits, are gaining research attention. An ideal pipeline should preserve the static features of clothes (like textures and logos) while also generating dynamic elements (e.g., shadows, folds) that adapt to the model's pose and environment. Previous works fail specifically at generating dynamic features, as they trivially preserve the warped in-shop clothes with a predicted alpha mask by composition. To break the dilemma between over-preservation and texture loss, we propose a novel diffusion-based product-level virtual try-on pipeline, i.e., PLTON, which can preserve the fine details of logos and embroideries while producing realistic clothes shading and wrinkles. The main insights are threefold: 1) Adaptive Dynamic Rendering: we take a pre-trained diffusion model as a generative prior and tame it with image features, training a dynamic extractor from scratch to generate dynamic tokens that preserve high-fidelity semantic information.
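The alpha-mask composition that prior try-on methods rely on (and that PLTON moves away from) can be sketched as follows. This is a minimal illustration, not the paper's code; the array names and toy values are assumptions for demonstration:

```python
import numpy as np

def alpha_composite(warped_cloth, generated, alpha):
    """Naive try-on composition: paste the warped in-shop garment over
    the generated person image using a predicted alpha mask. Static
    textures are preserved exactly, but generated shadows and folds are
    overwritten wherever alpha is close to 1 -- the over-preservation
    problem described above."""
    return alpha * warped_cloth + (1.0 - alpha) * generated

# Toy 2x2 single-channel example: top row keeps the garment texture,
# bottom row keeps the generated shading.
warped = np.full((2, 2), 0.9)                   # bright garment texture
gen = np.full((2, 2), 0.4)                      # generated shading
mask = np.array([[1.0, 1.0], [0.0, 0.0]])       # predicted alpha mask
out = alpha_composite(warped, gen, mask)
```

Because the mask is binary per region, no pixel blends static texture with generated shading, which is exactly the trade-off the diffusion-based pipeline is meant to resolve.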
no code implementations • ICCV 2023 • Chen Liang, Wenguan Wang, Jiaxu Miao, Yi Yang
Recent advances in semi-supervised semantic segmentation have been heavily reliant on pseudo labeling to compensate for limited labeled data, disregarding the valuable relational knowledge among semantic concepts.
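The pseudo-labeling baseline referred to above can be sketched in a few lines. This is a hedged illustration of the standard recipe, not this paper's method; the shapes and threshold are assumptions:

```python
import numpy as np

def pseudo_labels(probs, threshold=0.9):
    """Standard pseudo labeling for semi-supervised segmentation: turn
    per-pixel class probabilities of shape (C, H, W) into hard labels,
    marking low-confidence pixels with -1 so they are ignored in the
    loss. Note that each pixel is treated independently -- no relational
    knowledge among semantic concepts is used, which is the gap the
    paper targets."""
    conf = probs.max(axis=0)          # (H, W) max class probability
    labels = probs.argmax(axis=0)     # (H, W) most likely class
    labels[conf < threshold] = -1     # ignore uncertain pixels
    return labels
```

A segmentation model is then retrained on the labeled data plus these hard labels, typically with an ignore index for the -1 entries.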
no code implementations • CVPR 2023 • Mengze Li, Han Wang, Wenqiao Zhang, Jiaxu Miao, Zhou Zhao, Shengyu Zhang, Wei Ji, Fei Wu
WINNER first builds the language decomposition tree in a bottom-up manner, upon which the structural attention mechanism and top-down feature backtracking jointly build a multi-modal decomposition tree, permitting a hierarchical understanding of unstructured videos.
1 code implementation • ICCV 2023 • Liangqi Li, Jiaxu Miao, Dahu Shi, Wenming Tan, Ye Ren, Yi Yang, ShiLiang Pu
Current methods for open-vocabulary object detection (OVOD) rely on a pre-trained vision-language model (VLM) to acquire the recognition ability.
no code implementations • CVPR 2023 • Jiaxu Miao, Zongxin Yang, Leilei Fan, Yi Yang
In this work, we propose FedSeg, a basic federated learning approach for class-heterogeneous semantic segmentation.
3 code implementations • 5 Oct 2022 • Chen Liang, Wenguan Wang, Jiaxu Miao, Yi Yang
Going beyond this, we propose GMMSeg, a new family of segmentation models that rely on a dense generative classifier for the joint distribution p(pixel feature, class).
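A dense generative classifier of this kind can be illustrated with a simplified sketch: fit a class-conditional density p(x | c) and a prior p(c), then classify via Bayes' rule on the joint p(x, c). For brevity this uses a single diagonal Gaussian per class rather than GMMSeg's full mixture; all function names here are hypothetical:

```python
import numpy as np

def fit_generative_classifier(features, labels, n_classes):
    """Fit a diagonal-Gaussian density p(x | c) and prior p(c) per
    class, modeling the joint p(x, c) rather than only the posterior
    p(c | x) as a discriminative softmax classifier would."""
    params = []
    for c in range(n_classes):
        x = features[labels == c]
        params.append((x.mean(0), x.var(0) + 1e-6, len(x) / len(features)))
    return params

def predict(params, x):
    """Classify by argmax_c log p(x | c) + log p(c) (Bayes' rule)."""
    scores = []
    for mean, var, prior in params:
        log_lik = -0.5 * (((x - mean) ** 2) / var
                          + np.log(2 * np.pi * var)).sum(-1)
        scores.append(log_lik + np.log(prior))
    return np.argmax(np.stack(scores, axis=-1), axis=-1)
```

Because the class densities are explicit, such a classifier can also score how likely a feature is under any class at all, which discriminative pixel classifiers cannot do.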
1 code implementation • 19 Jul 2022 • Haitian Zeng, Xin Yu, Jiaxu Miao, Yi Yang
We propose MHR-Net, a novel method for recovering Non-Rigid Shapes from Motion (NRSfM).
2 code implementations • 22 Mar 2022 • Zongxin Yang, Jiaxu Miao, Yunchao Wei, Wenguan Wang, Xiaohan Wang, Yi Yang
This paper delves into the challenges of achieving scalable and effective multi-object modeling for semi-supervised Video Object Segmentation (VOS).
1 code implementation • 18 Mar 2022 • Chen Liang, Wenguan Wang, Tianfei Zhou, Jiaxu Miao, Yawei Luo, Yi Yang
We explore the task of language-guided video segmentation (LVS).
Ranked #7 on Referring Expression Segmentation on A2D Sentences
no code implementations • ACL 2022 • Mengze Li, Tianbao Wang, Haoyu Zhang, Shengyu Zhang, Zhou Zhao, Jiaxu Miao, Wenqiao Zhang, Wenming Tan, Jin Wang, Peng Wang, ShiLiang Pu, Fei Wu
To achieve effective grounding under a limited annotation budget, we investigate one-shot video grounding, and learn to ground natural language in all video frames with solely one frame labeled, in an end-to-end manner.
1 code implementation • CVPR 2022 • Yunqiu Xu, Yifan Sun, Zongxin Yang, Jiaxu Miao, Yi Yang
How to align the source and target domains is critical to the CDWSOD accuracy.
Ranked #1 on Weakly Supervised Object Detection on Clipart1k
1 code implementation • CVPR 2022 • Jiaxu Miao, Xiaohan Wang, Yu Wu, Wei Li, Xu Zhang, Yunchao Wei, Yi Yang
In contrast, our large-scale VIdeo Panoptic Segmentation in the Wild (VIPSeg) dataset provides 3,536 videos and 84,750 frames with pixel-level panoptic annotations, covering a wide range of real-world scenarios and categories.
no code implementations • CVPR 2021 • Jiaxu Miao, Yunchao Wei, Yu Wu, Chen Liang, Guangrui Li, Yi Yang
To the best of our knowledge, our VSPW is the first attempt to tackle the challenging video scene parsing task in the wild by considering diverse scenarios.
no code implementations • CVPR 2020 • Jiaxu Miao, Yunchao Wei, Yi Yang
Interactive video object segmentation (iVOS) aims at efficiently harvesting high-quality segmentation masks of the target object in a video with user interactions.
Ranked #5 on Interactive Video Object Segmentation on DAVIS 2017 (AUC-J metric)
1 code implementation • ICCV 2019 • Jiaxu Miao, Yu Wu, Ping Liu, Yuhang Ding, Yi Yang
Our method largely outperforms existing person re-id methods on three occlusion datasets, while maintaining top performance on two holistic datasets.