no code implementations • ECCV 2020 • Tong He, Yifan Liu, Chunhua Shen, Xinlong Wang, Changming Sun
However, these methods are unaware of the instance context and fail to realize the boundary and geometric information of an instance, which are critical to separate adjacent objects.
no code implementations • 20 May 2024 • Yifan Liu, Chenchen Kuai, Haoxuan Ma, Xishun Liao, Brian Yueshuai He, Jiaqi Ma
In our evaluation using the OpenStreetMap (OSM) POI dataset, our approach achieves a 93.4% accuracy and a 96.1% F-1 score in POI classification, and a 91.7% accuracy with a 92.3% F-1 score in activity inference.
no code implementations • 15 May 2024 • Yifan Liu, You Wang, Guang Li
Model Predictive Control (MPC)-based trajectory planning has been widely used in robotics, and incorporating Control Barrier Function (CBF) constraints into MPC can greatly improve its obstacle avoidance efficiency.
no code implementations • 17 Apr 2024 • Kangning Zhang, Yingjie Qin, Ruilong Su, Yifan Liu, Jiarui Jin, Weinan Zhang, Yong Yu
After obtaining separate behavior and modal representations, we design a Behavior-Modal Alignment Module (BMA) to align and fuse the dual representations to solve the misalignment problem.
no code implementations • 19 Mar 2024 • Yifan Liu, Kangning Zhang, Xiangyuan Ren, Yanhua Huang, Jiarui Jin, Yingjie Qin, Ruilong Su, Ruiwen Xu, Weinan Zhang
In AlignRec, the recommendation objective is decomposed into three alignments, namely alignment within contents, alignment between content and categorical ID, and alignment between users and items.
no code implementations • 17 Mar 2024 • Chenxin Li, Hengyu Liu, Yifan Liu, Brandon Y. Feng, Wuyang Li, Xinyu Liu, Zhen Chen, Jing Shao, Yixuan Yuan
In a nutshell, Endora marks a notable breakthrough in the deployment of generative AI for clinical endoscopy research, setting a substantial stage for further advances in medical content generation.
1 code implementation • 27 Feb 2024 • Yi Huang, Jiancheng Huang, Yifan Liu, Mingfu Yan, Jiaxi Lv, Jianzhuang Liu, Wei Xiong, He Zhang, Shifeng Chen, Liangliang Cao
In this survey, we provide an exhaustive overview of existing methods using diffusion models for image editing, covering both theoretical and practical aspects in the field.
1 code implementation • 8 Feb 2024 • Feihu Jin, Yifan Liu, Ying Tan
Large Language Models (LLMs) have demonstrated remarkable performance across diverse tasks and exhibited impressive reasoning abilities by applying zero-shot Chain-of-Thought (CoT) prompting.
1 code implementation • 23 Jan 2024 • Yifan Liu, Chenxin Li, Chen Yang, Yixuan Yuan
To adapt 3DGS for endoscopic scenes, we propose two strategies, Holistic Gaussian Initialization (HGI) and Spatio-temporal Gaussian Tracking (SGT), to handle the non-trivial Gaussian initialization and tissue deformation problems, respectively.
1 code implementation • 18 Jan 2024 • René Zurbrügg, Yifan Liu, Francis Engelmann, Suryansh Kumar, Marco Hutter, Vaishakh Patil, Fisher Yu
Executing a successful grasp in a cluttered environment requires multiple levels of scene understanding: First, the robot needs to analyze the geometric properties of individual objects to find feasible grasps.
no code implementations • 26 Dec 2023 • Liang Xu, Xintao Lv, Yichao Yan, Xin Jin, Shuwen Wu, Congsheng Xu, Yifan Liu, Yizhou Zhou, Fengyun Rao, Xingdong Sheng, Yunhui Liu, Wenjun Zeng, Xiaokang Yang
We also equip Inter-X with versatile annotations of more than 34K fine-grained human part-level textual descriptions, semantic interaction categories, interaction order, and the relationship and personality of the subjects.
no code implementations • 11 Dec 2023 • Yifan Liu, Tiecheng Song
To obtain a more robust model with better cross-domain generalization in the presence of poor data quality, we propose the SCPQ model. We first propose a method for fusing shallow information using an attention mechanism (FSIAM), which fuses feature maps with deep convolved feature maps after fully extracting the global sensory field of shallow information via the attention mechanism module. This fits the data more fully, giving a better sense of the domain in the presence of poor data and thus better multiscale adaptability.
1 code implementation • 26 Nov 2023 • Junhui Yin, Wei Yin, Hao Chen, Xuqian Ren, Zhanyu Ma, Jun Guo, Yifan Liu
These priors ensure the color rendered along rays to be robust to view direction and reduce the inherent ambiguities of density estimated along rays.
no code implementations • 21 Nov 2023 • Jiaxi Lv, Yi Huang, Mingfu Yan, Jiancheng Huang, Jianzhuang Liu, Yifan Liu, Yafei Wen, Xiaoxin Chen, Shifeng Chen
To tackle these issues, we propose GPT4Motion, a training-free framework that leverages the planning capability of large language models such as GPT, the physical simulation strength of Blender, and the excellent image generation ability of text-to-image diffusion models to enhance the quality of video synthesis.
no code implementations • 10 Oct 2023 • Minghan Qin, Yifan Liu, Yuelang Xu, Xiaochen Zhao, Yebin Liu, Haoqian Wang
One crucial aspect of 3D head avatar reconstruction lies in the details of facial expressions.
no code implementations • 1 Oct 2023 • Jiancheng Huang, Yifan Liu, Yi Huang, Shifeng Chen
To address the lack of labelled datasets for these seal-related tasks, we propose Seal2Real, a generative method that generates a large amount of labelled document seal data, and construct a Seal-DB dataset containing 20K images with labels.
no code implementations • 28 Sep 2023 • Jiancheng Huang, Yifan Liu, Jin Qin, Shifeng Chen
Text-conditioned image editing is a recently emerged and highly practical task, and its potential is immeasurable.
no code implementations • 26 Sep 2023 • Jiancheng Huang, Yifan Liu, Shifeng Chen
Learning-based methods have attracted a lot of research attention and led to significant improvements in low-light image enhancement.
no code implementations • ICCV 2023 • Thomas E. Huang, Yifan Liu, Luc van Gool, Fisher Yu
VTD is a promising new direction for exploring the unification of perception tasks in autonomous driving.
1 code implementation • 5 Sep 2023 • Lingyue Fu, Huacan Chai, Shuang Luo, Kounianhua Du, Weiming Zhang, Longteng Fan, Jiayi Lei, Renting Rui, Jianghao Lin, Yuchen Fang, Yifan Liu, Jingkuan Wang, Siyuan Qi, Kangning Zhang, Weinan Zhang, Yong Yu
With the emergence of Large Language Models (LLMs), there has been a significant improvement in the programming capabilities of models, attracting growing attention from researchers.
1 code implementation • 22 Aug 2023 • Bingqing Zhang, Sen Wang, Yifan Liu, Brano Kusy, Xue Li, Jiajun Liu
The ODD score enhances the VOD system in two ways: 1) it enables the VOD system to select superior global reference frames, thereby improving overall accuracy; and 2) it serves as an indicator in the newly designed ODD Scheduler to eliminate the aggregation of frames that are easy to detect, thus accelerating the VOD process.
1 code implementation • 22 Aug 2023 • Biao Wu, Yutong Xie, Zeyu Zhang, Jinchao Ge, Kaspar Yaxley, Suzan Bahadir, Qi Wu, Yifan Liu, Minh-Son To
Intracranial hemorrhage (ICH) is a pathological condition characterized by bleeding inside the skull or brain, which can be attributed to various factors.
1 code implementation • ICCV 2023 • Muzhi Zhu, Hengtao Li, Hao Chen, Chengxiang Fan, Weian Mao, Chenchen Jing, Yifan Liu, Chunhua Shen
In this work, we propose a novel training mechanism termed SegPrompt that uses category information to improve the model's class-agnostic segmentation ability for both known and unknown categories.
1 code implementation • 10 Aug 2023 • Quan Tang, Chuanjian Liu, Fagui Liu, Yifan Liu, Jun Jiang, BoWen Zhang, Kai Han, Yunhe Wang
Aggregation of multi-stage features has been revealed to play a significant role in semantic segmentation.
1 code implementation • ICCV 2023 • Quan Tang, BoWen Zhang, Jiajun Liu, Fagui Liu, Yifan Liu
Experiments suggest that the proposed DToP architecture reduces computational cost by 20% to 35% on average for current semantic segmentation methods based on plain vision transformers, without accuracy degradation.
1 code implementation • ICCV 2023 • Kaining Ying, Qing Zhong, Weian Mao, Zhenhua Wang, Hao Chen, Lin Yuanbo Wu, Yifan Liu, Chengxiang Fan, Yunzhi Zhuge, Chunhua Shen
The discrimination of instance embeddings plays a vital role in associating instances across time for online video instance segmentation (VIS).
Ranked #2 on Video Instance Segmentation on Youtube-VIS 2022 Validation (using extra training data)
1 code implementation • 13 Jun 2023 • Liyang Liu, Zihan Wang, Minh Hieu Phan, BoWen Zhang, Jinchao Ge, Yifan Liu
Current knowledge distillation approaches in semantic segmentation tend to adopt a holistic approach that treats all spatial locations equally.
1 code implementation • 9 Jun 2023 • BoWen Zhang, Liyang Liu, Minh Hieu Phan, Zhi Tian, Chunhua Shen, Yifan Liu
This paper investigates the capability of plain Vision Transformers (ViTs) for semantic segmentation using the encoder-decoder framework and introduces SegViTv2.
Ranked #16 on Semantic Segmentation on ADE20K
no code implementations • 8 Jun 2023 • Yuling Xi, Hao Chen, Ning Wang, Peng Wang, Yanning Zhang, Chunhua Shen, Yifan Liu
In particular, one feature merge branch is designed for instance-level recognition and the other for dense predictions.
no code implementations • 4 Jun 2023 • Jintao Rong, Hao Chen, Tianxiao Chen, Linlin Ou, Xinyi Yu, Yifan Liu
Prompt learning has become a popular approach for adapting large vision-language models, such as CLIP, to downstream tasks.
2 code implementations • NeurIPS 2023 • Lei Ke, Mingqiao Ye, Martin Danelljan, Yifan Liu, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu
HQ-SAM is only trained on the introduced dataset of 44k masks, which takes only 4 hours on 8 GPUs.
Ranked #1 on Zero-Shot Instance Segmentation on LVIS v1.0 val
1 code implementation • 4 May 2023 • Sen Zhao, Wei Wei, Yifan Liu, Ziyang Wang, Wendi Li, Xian-Ling Mao, Shuai Zhu, Minghui Yang, Zujie Wen
Conversational recommendation systems (CRS) aim to acquire users' dynamic preferred attributes in a timely and proactive manner through conversations for item recommendation.
no code implementations • 9 Mar 2023 • Caiyuan Chu, Ya Li, Yifan Liu, Jia-Chen Gu, Quan Liu, Yongxin Ge, Guoping Hu
The key to automatic intention induction is that, for any given set of new data, the sentence representation obtained by the model can be well distinguished from different labels.
no code implementations • 9 Jan 2023 • Xiangyu Li, Gongning Luo, Kuanquan Wang, Hongyu Wang, Jun Liu, Xinjie Liang, Jie Jiang, Zhenghao Song, Chunyue Zheng, Haokai Chi, Mingwang Xu, Yingte He, Xinghua Ma, Jingwen Guo, Yifan Liu, Chuanpu Li, Zeli Chen, Md Mahfuzur Rahman Siddiquee, Andriy Myronenko, Antoine P. Sanner, Anirban Mukhopadhyay, Ahmed E. Othman, Xingyu Zhao, Weiping Liu, Jinhuang Zhang, Xiangyuan Ma, Qinghui Liu, Bradley J. MacIntosh, Wei Liang, Moona Mazher, Abdul Qayyum, Valeriia Abramova, Xavier Lladó, Shuo Li
It is intended to resolve the above-mentioned problems and promote the development of both intracranial hemorrhage segmentation and anisotropic data processing.
1 code implementation • ICCV 2023 • Changyong Shu, Jiajun Deng, Fisher Yu, Yifan Liu
Although 3D measurements are not available at the inference time of monocular 3D object detection, 3DPPE uses predicted depth to approximate the real point positions.
no code implementations • ICCV 2023 • Chen Yang, Meilu Zhu, Yifan Liu, Yixuan Yuan
To this end, we aim to study a novel problem of federated open-set recognition (FedOSR), which learns an open-set recognition (OSR) model under federated paradigm such that it classifies seen classes while at the same time detects unknown classes.
1 code implementation • CVPR 2023 • Ziqin Zhou, BoWen Zhang, Yinjie Lei, Lingqiao Liu, Yifan Liu
Recently, CLIP has been applied to pixel-level zero-shot learning tasks via a two-stage scheme.
1 code implementation • 27 Nov 2022 • Changyong Shu, Jiajun Deng, Fisher Yu, Yifan Liu
Although 3D measurements are not available at the inference time of monocular 3D object detection, 3DPPE uses predicted depth to approximate the real point positions.
no code implementations • 10 Nov 2022 • Yifan Liu, YouBao Tang, Ning Zhang, Ruei-Sung Lin, Haoqian Wang
Temporal action localization (TAL) aims to detect the boundary and identify the class of each action instance in a long untrimmed video.
1 code implementation • 12 Oct 2022 • BoWen Zhang, Zhi Tian, Quan Tang, Xiangxiang Chu, Xiaolin Wei, Chunhua Shen, Yifan Liu
We explore the capability of plain Vision Transformers (ViTs) for semantic segmentation and propose the SegVit.
Ranked #4 on Semantic Segmentation on COCO-Stuff test
1 code implementation • 30 Aug 2022 • Jianlong Yuan, Qian Qi, Fei Du, Zhibin Wang, Fan Wang, Yifan Liu
Inspired by the recent progress on semantic directions on feature-space, we propose to include augmentations in feature space for efficient distillation.
1 code implementation • 28 Aug 2022 • Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Simon Chen, Yifan Liu, Chunhua Shen
To do so, we propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image, and then exploits 3D point cloud data to predict the depth shift and the camera's focal length that allow us to recover 3D scene shapes.
1 code implementation • 24 Aug 2022 • Jianlong Yuan, Jinchao Ge, Zhibin Wang, Yifan Liu
More specifically, we use the pseudo-labels generated by a mean teacher to supervise the student network to achieve a mutual knowledge distillation between the two branches.
1 code implementation • 23 Aug 2022 • Yifan Liu, Wei Wei, Jiayi Liu, Xianling Mao, Rui Fang, Dangyang Chen
Endowing chatbots with a consistent personality plays a vital role for agents to deliver human-like interactions.
1 code implementation • 24 Jul 2022 • Xuqian Ren, Yifan Liu
Experiments have been conducted on our constructed benchmarks to verify that our proposed operator mask-based framework can locate and modify the inharmonious regions in more complex scenes.
no code implementations • 12 Jul 2022 • Yichen Sheng, Yifan Liu, Jianming Zhang, Wei Yin, A. Cengiz Oztireli, He Zhang, Zhe Lin, Eli Shechtman, Bedrich Benes
It can be used to calculate hard shadows in a 2D image based on the projective geometry, providing precise control of the shadows' direction and shape.
no code implementations • 30 Apr 2022 • Jiao Suo, Yifan Liu, Clio Cheng, Keer Wang, Meng Chen, Ho-Yin Chan, Roy Vellaisamy, Ning Xi, Vivian W. Q. Lou, Wen Jung Li
Human limb motion tracking and recognition plays an important role in medical rehabilitation training, lower limb assistance, prosthetics design for amputees, feedback control for assistive robots, etc.
no code implementations • 24 Feb 2022 • Yifan Liu, Chunhua Shen, Changqian Yu, Jingdong Wang
To this end, we perform inference at each frame.
no code implementations • 4 Feb 2022 • Wei Yin, Yifan Liu, Chunhua Shen, Baichuan Sun, Anton Van Den Hengel
The resulting merged semantic segmentation dataset of over 2 Million images enables training a model that achieves performance equal to that of state-of-the-art supervised methods on 7 benchmark datasets, despite not using any images therefrom.
Ranked #1 on Semantic Segmentation on WildDash
no code implementations • 23 Aug 2021 • Yifan Liu, Bin Duo, Qingqing Wu, Xiaojun Yuan, Jun Li, Yonghui Li
This paper investigates an aerial reconfigurable intelligent surface (RIS)-aided communication system under the probabilistic line-of-sight (LoS) channel, where an unmanned aerial vehicle (UAV) equipped with an RIS is deployed to assist two ground nodes in their information exchange.
no code implementations • 13 Aug 2021 • Xuqian Ren, Yifan Liu, Chunlei Song
Image matting, aiming to achieve foreground boundary details, and image harmonization, aiming to make the background compatible with the foreground, are both promising yet challenging tasks.
1 code implementation • NeurIPS 2021 • BoWen Zhang, Yifan Liu, Zhi Tian, Chunhua Shen
This neural representation enables our decoder to leverage the smoothness prior in the semantic label space, and thus makes our decoder more efficient.
1 code implementation • 24 Jul 2021 • Jingjing Jiang, Ziyi Liu, Yifan Liu, Zhixiong Nan, Nanning Zheng
In this paper, we formulate OOD generalization in VQA as a compositional generalization problem and propose a graph generative modeling-based training scheme (X-GGM) to implicitly model the problem.
no code implementations • 5 Jun 2021 • Yifan Liu, Bin Duo, Qingqing Wu, Xiaojun Yuan, Yonghui Li
This paper investigates the achievable rate maximization problem of a downlink unmanned aerial vehicle (UAV)-enabled communication system aided by an intelligent omni-surface (IOS).
1 code implementation • ICCV 2021 • Jianlong Yuan, Yifan Liu, Chunhua Shen, Zhibin Wang, Hao Li
Previous works [3, 27] fail to employ strong augmentation in pseudo label learning efficiently, as the large distribution change caused by strong augmentation harms the batch normalisation statistics.
no code implementations • CVPR 2021 • Yifan Liu, Hao Chen, Yu Chen, Wei Yin, Chunhua Shen
We hope that this simple, extended perceptual loss may serve as a generic structured-output loss that is applicable to most structured output learning tasks.
3 code implementations • 7 Mar 2021 • Wei Yin, Yifan Liu, Chunhua Shen
In this work, we show the importance of the high-order 3D geometric constraints for depth prediction.
no code implementations • 1 Mar 2021 • He Zhang, Zhixiong Nan, Tao Yang, Yifan Liu, Nanning Zheng
In autonomous driving, perceiving the driving behaviors of surrounding agents is important for the ego-vehicle to make a reasonable decision.
3 code implementations • ICCV 2021 • Changyong Shu, Yifan Liu, Jianfei Gao, Zheng Yan, Chunhua Shen
Observing that in semantic segmentation, some layers' feature activations of each channel tend to encode saliency of scene categories (analogous to class activation mapping), we propose to align features channel-wise between the student and teacher networks.
no code implementations • ECCV 2020 • Changqian Yu, Yifan Liu, Changxin Gao, Chunhua Shen, Nong Sang
In this paper, we present a Representative Graph (RepGraph) layer to dynamically sample a few representative features, which dramatically reduces redundancy.
no code implementations • 7 Mar 2020 • Zhikang Zou, Yifan Liu, Shuangjie Xu, Wei Wei, Shiping Wen, Pan Zhou
Extensive experiments on crowd counting datasets (ShanghaiTech, MALL, WorldEXPO'10, and UCSD) show that our HSRNet can deliver superior results over all state-of-the-art approaches.
1 code implementation • ECCV 2020 • Yifan Liu, Chunhua Shen, Changqian Yu, Jingdong Wang
For semantic segmentation, most existing real-time deep models trained with each frame independently may produce inconsistent results for a video sequence.
Ranked #2 on Video Semantic Segmentation on CamVid
2 code implementations • 3 Feb 2020 • Wei Yin, Xinlong Wang, Chunhua Shen, Yifan Liu, Zhi Tian, Songcen Xu, Changming Sun, Dou Renyin
Compared with previous learning objectives, i.e., learning metric depth or relative depth, we propose to learn the affine-invariant depth using our diverse dataset to ensure both generalization and high-quality geometric shapes of scenes.
no code implementations • 5 Sep 2019 • Yifan Liu, Bohan Zhuang, Chunhua Shen, Hao Chen, Wei Yin
Most current methods can be categorized as either: (i) hard parameter sharing, where a subset of the parameters is shared among tasks while other parameters are task-specific; or (ii) soft parameter sharing, where all parameters are task-specific but they are jointly regularized.
no code implementations • 11 Aug 2019 • Yang Zhao, Yifan Liu, Chunhua Shen, Yongsheng Gao, Shengwu Xiong
To this end, we propose an effective lightweight model, namely Mobile Face Alignment Network (MobileFAN), using a simple backbone MobileNetV2 as the encoder and three deconvolutional layers as the decoder.
3 code implementations • ICCV 2019 • Wei Yin, Yifan Liu, Chunhua Shen, Youliang Yan
Monocular depth prediction plays a crucial role in understanding 3D scene geometry.
Ranked #10 on Depth Estimation on NYU-Depth V2
no code implementations • SEMEVAL 2019 • Yifan Liu, Keyu Ding, Yi Zhou
AiFu has won first place in the SemEval-2019 Task 10 "Math Question Answering" competition.
1 code implementation • CVPR 2019 • Yifan Liu, Ke Chen, Chris Liu, Zengchang Qin, Zhenbo Luo, Jingdong Wang
We further propose to distill the structured knowledge from cumbersome networks into compact networks, which is motivated by the fact that semantic segmentation is a structured prediction problem.
1 code implementation • CVPR 2019 • Yifan Liu, Changyong Shun, Jingdong Wang, Chunhua Shen
Here we propose to distill structured knowledge from large networks to compact networks, taking into account the fact that dense prediction is a structured prediction problem.
no code implementations • 1 Nov 2018 • Shuangting Liu, Jia-Qi Zhang, Yuxin Chen, Yifan Liu, Zengchang Qin, Tao Wan
Semantic segmentation is one of the basic topics in computer vision; it aims to assign a semantic label to every pixel of an image.
no code implementations • 2 Nov 2017 • Xinyue Zhu, Yifan Liu, Zengchang Qin, Jiahong Li
In this paper, we propose a data augmentation method using generative adversarial networks (GAN).
no code implementations • 9 May 2017 • Liang Li, Pengyu Li, Yifan Liu, Tao Wan, Zengchang Qin
Under our learning policy, the Seq2Seq model can learn mappings gradually with noises.
4 code implementations • 4 May 2017 • Yifan Liu, Zengchang Qin, Zhenbo Luo, Hua Wang
Learning to generate colorful cartoon images from black-and-white sketches is not only an interesting research problem, but also a potential application in digital entertainment.