Search Results for author: Haosen Yang

Found 9 papers, 4 papers with code

Unsupervised Audio-Visual Segmentation with Modality Alignment

no code implementations · 21 Mar 2024 · Swapnil Bhosale, Haosen Yang, Diptesh Kanojia, Jiankang Deng, Xiatian Zhu

Audio-Visual Segmentation (AVS) aims to identify, at the pixel level, the object in a visual scene that produces a given sound.

Contrastive Learning

Uncertainty-Aware Pseudo-Label Filtering for Source-Free Unsupervised Domain Adaptation

1 code implementation · 17 Mar 2024 · Xi Chen, Haosen Yang, Huicong Zhang, Hongxun Yao, Xiatian Zhu

Source-free unsupervised domain adaptation (SFUDA) aims to enable the utilization of a pre-trained source model in an unlabeled target domain without access to source data.

Contrastive Learning · Memorization +3
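To illustrate the general idea behind uncertainty-aware pseudo-label filtering, here is a minimal PyTorch sketch that keeps only target samples whose predictive entropy falls below a threshold. The entropy measure and the fixed threshold are assumptions for illustration; the paper's actual filtering criterion may differ.

```python
import math
import torch
import torch.nn.functional as F

def filter_pseudo_labels(logits: torch.Tensor, threshold: float = 0.5):
    """Keep target samples whose predictive entropy is below a threshold.

    logits: (N, C) predictions of the pre-trained source model on the
    unlabeled target domain. Returns (kept indices, their pseudo-labels).
    NOTE: illustrative sketch, not the paper's exact method.
    """
    probs = F.softmax(logits, dim=1)
    # Normalized entropy in [0, 1]; low entropy = confident prediction.
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
    entropy = entropy / math.log(logits.size(1))
    keep = (entropy < threshold).nonzero(as_tuple=True)[0]
    return keep, probs.argmax(dim=1)[keep]
```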

WSI-SAM: Multi-resolution Segment Anything Model (SAM) for histopathology whole-slide images

no code implementations · 14 Mar 2024 · Hong Liu, Haosen Yang, Paul J. van Diest, Josien P. W. Pluim, Mitko Veta

In particular, our model outperforms SAM by 4.1 and 2.5 percentage points on a ductal carcinoma in situ (DCIS) segmentation task and a breast cancer metastasis segmentation task (CAMELYON16 dataset), respectively.

Segmentation · Semantic Segmentation +1

Recognize Any Regions

1 code implementation · 2 Nov 2023 · Haosen Yang, Chuofan Ma, Bin Wen, Yi Jiang, Zehuan Yuan, Xiatian Zhu

Understanding the semantics of individual regions or patches within unconstrained images, such as in open-world object detection, represents a critical yet challenging task in computer vision.

Object Detection · Object Recognition +1

Leveraging Foundation models for Unsupervised Audio-Visual Segmentation

no code implementations · 13 Sep 2023 · Swapnil Bhosale, Haosen Yang, Diptesh Kanojia, Xiatian Zhu

Particularly, in situations where existing supervised AVS methods struggle with overlapping foreground objects, our models still excel in accurately segmenting overlapped auditory objects.

Segmentation

Self-supervised Video Representation Learning with Motion-Aware Masked Autoencoders

1 code implementation · 9 Oct 2022 · Haosen Yang, Deng Huang, Bin Wen, Jiannan Wu, Hongxun Yao, Yi Jiang, Xiatian Zhu, Zehuan Yuan

As a result, our model can extract effectively both static appearance and dynamic motion spontaneously, leading to superior spatiotemporal representation learning capability.

Representation Learning · Semantic Segmentation +2
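The motion-aware masking in the title can be sketched as biasing the masked patches toward high-motion regions, so the encoder must reconstruct dynamic content from static context. The frame-difference motion cue, patch size, and mask ratio below are illustrative assumptions, not the paper's exact masking strategy.

```python
import torch

def motion_guided_mask(frames: torch.Tensor, patch: int = 16, mask_ratio: float = 0.75):
    """Select spatial patches to mask, biased toward high-motion regions.

    frames: (T, C, H, W) video clip. Returns a boolean (H/patch, W/patch)
    grid; True = masked. Illustrative sketch only.
    """
    T, C, H, W = frames.shape
    # Motion cue: mean absolute frame difference per pixel, then per patch.
    diff = (frames[1:] - frames[:-1]).abs().mean(dim=(0, 1))        # (H, W)
    patches = diff.unfold(0, patch, patch).unfold(1, patch, patch)  # (H/p, W/p, p, p)
    motion = patches.mean(dim=(-1, -2)).flatten()                   # (H/p * W/p,)
    num_mask = int(mask_ratio * motion.numel())
    # Mask the patches with the largest motion magnitude.
    mask = torch.zeros_like(motion, dtype=torch.bool)
    mask[motion.topk(num_mask).indices] = True
    return mask.view(H // patch, W // patch)
```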

NSNet: Non-saliency Suppression Sampler for Efficient Video Recognition

no code implementations · 21 Jul 2022 · Boyang Xia, Wenhao Wu, Haoran Wang, Rui Su, Dongliang He, Haosen Yang, Xiaoran Fan, Wanli Ouyang

On the video level, a temporal attention module is learned under dual video-level supervision applied to both the salient and the non-salient representations.

Action Recognition · Video Classification +1
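One plausible reading of this design is attention-weighted temporal pooling with two classification heads, one on the salient (attended) summary and one on the non-salient (inversely weighted) summary, each trained with video-level labels. The sketch below is purely illustrative; all dimensions and module choices are assumptions, not NSNet's actual architecture.

```python
import torch
import torch.nn as nn

class TemporalAttentionPool(nn.Module):
    """Generic temporal attention pooling with dual video-level heads
    (an illustrative sketch, not NSNet's architecture)."""

    def __init__(self, dim: int, num_classes: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)
        self.salient_head = nn.Linear(dim, num_classes)
        self.non_salient_head = nn.Linear(dim, num_classes)

    def forward(self, frame_feats: torch.Tensor):
        # frame_feats: (B, T, D) per-frame features.
        attn = torch.softmax(self.score(frame_feats), dim=1)   # (B, T, 1)
        salient = (attn * frame_feats).sum(dim=1)              # (B, D)
        inv = 1.0 - attn                                       # inverse weights
        inv = inv / inv.sum(dim=1, keepdim=True).clamp_min(1e-6)
        non_salient = (inv * frame_feats).sum(dim=1)           # (B, D)
        # Both summaries would receive video-level supervision
        # (e.g. cross-entropy against the video label) during training.
        return self.salient_head(salient), self.non_salient_head(non_salient)
```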

Temporal Action Proposal Generation with Background Constraint

1 code implementation · 15 Dec 2021 · Haosen Yang, Wenhao Wu, Lining Wang, Sheng Jin, Boyang Xia, Hongxun Yao, Hujie Huang

To evaluate the confidence of proposals, existing works typically predict an action score for each proposal, supervised by the temporal Intersection-over-Union (tIoU) between the proposal and the ground truth.

Temporal Action Proposal Generation
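For reference, the temporal Intersection-over-Union (tIoU) used as the supervision signal above has a standard definition over (start, end) segments; the snippet below is that textbook formula, not code from the paper.

```python
def temporal_iou(proposal, ground_truth):
    """Temporal IoU between two segments given as (start, end) pairs,
    in seconds or frames."""
    (ps, pe), (gs, ge) = proposal, ground_truth
    intersection = max(0.0, min(pe, ge) - max(ps, gs))
    union = (pe - ps) + (ge - gs) - intersection
    return intersection / union if union > 0 else 0.0

# Example: a proposal covering [2, 8] against a ground-truth action [4, 10].
print(temporal_iou((2.0, 8.0), (4.0, 10.0)))  # 0.5
```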

Temporal Action Proposal Generation with Transformers

no code implementations · 25 May 2021 · Lining Wang, Haosen Yang, Wenhao Wu, Hongxun Yao, Hujie Huang

Conventionally, the temporal action proposal generation (TAPG) task is divided into two main sub-tasks: boundary prediction and proposal confidence prediction, which rely on frame-level dependencies and proposal-level relationships, respectively.

Temporal Action Proposal Generation
