Search Results for author: Junyu Gao

Found 32 papers, 15 papers with code

Dynamic Proxy Domain Generalizes the Crowd Localization by Better Binary Segmentation

1 code implementation 22 Apr 2024 Junyu Gao, Da Zhang, Xuelong Li

Then, based on the theory, we design a DPD algorithm which is composed of a training paradigm and a proxy domain generator to enhance the domain generalization of the confidence-threshold learner.

Binary Classification Domain Generalization

NWPU-MOC: A Benchmark for Fine-grained Multi-category Object Counting in Aerial Images

1 code implementation 19 Jan 2024 Junyu Gao, Liangliang Zhao, Xuelong Li

Considering the absence of a dataset for this task, a large-scale Dataset (NWPU-MOC) is collected, consisting of 3,416 scenes with a resolution of 1024 $\times$ 1024 pixels, and well-annotated using 14 fine-grained object categories.

Object Object Counting

SamLP: A Customized Segment Anything Model for License Plate Detection

1 code implementation 12 Jan 2024 Haoxuan Ding, Junyu Gao, Yuan Yuan, Qi Wang

Meanwhile, the proposed SamLP has great few-shot and zero-shot learning ability, which shows the potential of transferring vision foundation model.

License Plate Detection Zero-Shot Learning

Test-time Adaptive Vision-and-Language Navigation

no code implementations 22 Nov 2023 Junyu Gao, Xuan Yao, Changsheng Xu

Then, these components are adaptively accumulated to pinpoint a concordant direction for fast model adaptation.

Test-time Adaptation Vision and Language Navigation

Unified Multi-modal Unsupervised Representation Learning for Skeleton-based Action Understanding

1 code implementation 6 Nov 2023 Shengkai Sun, Daizong Liu, Jianfeng Dong, Xiaoye Qu, Junyu Gao, Xun Yang, Xun Wang, Meng Wang

In this manner, our framework is able to learn the unified representations of uni-modal or multi-modal skeleton input, which is flexible to different kinds of modality input for robust action understanding in practical cases.

Action Understanding Representation Learning +1

Learning Transferable Conceptual Prototypes for Interpretable Unsupervised Domain Adaptation

no code implementations 12 Oct 2023 Junyu Gao, Xinhong Ma, Changsheng Xu

Despite the great progress of unsupervised domain adaptation (UDA) with the deep neural networks, current UDA models are opaque and cannot provide promising explanations, limiting their applications in the scenarios that require safe and controllable model decisions.

Decision Making Pseudo Label +2

Multimodal Imbalance-Aware Gradient Modulation for Weakly-supervised Audio-Visual Video Parsing

no code implementations 5 Jul 2023 Jie Fu, Junyu Gao, Changsheng Xu

In this paper, to balance the feature learning processes of different modalities, a dynamic gradient modulation (DGM) mechanism is explored, where a novel and effective metric function is designed to measure the imbalanced feature learning between audio and visual modalities.

Imbalanced Aircraft Data Anomaly Detection

no code implementations 17 May 2023 Hao Yang, Junyu Gao, Yuan Yuan, Xuelong Li

Anomaly detection in temporal data from sensors under aviation scenarios is a practical but challenging task: 1) long temporal data is difficult to extract contextual information with temporal correlation; 2) the anomalous data are rare in time series, causing normal/abnormal imbalance in anomaly detection, making the detector classification degenerate or even fail.

Anomaly Detection Time Series

Cascade Evidential Learning for Open-World Weakly-Supervised Temporal Action Localization

no code implementations CVPR 2023 Mengyuan Chen, Junyu Gao, Changsheng Xu

Targeting at recognizing and localizing action instances with only video-level labels during training, Weakly-supervised Temporal Action Localization (WTAL) has achieved significant progress in recent years.

Open Set Learning Weakly-supervised Temporal Action Localization +1

Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception

1 code implementation CVPR 2023 Junyu Gao, Mengyuan Chen, Changsheng Xu

We argue that, for an event residing in one modality, the modality itself should provide ample presence evidence of this event, while the other complementary modality is encouraged to afford the absence evidence as a reference signal.

Counting Like Human: Anthropoid Crowd Counting on Modeling the Similarity of Objects

no code implementations 2 Dec 2022 Qi Wang, Juncheng Wang, Junyu Gao, Yuan Yuan, Xuelong Li

The mainstream crowd counting methods regress density map and integrate it to obtain counting results.

Crowd Counting

MAFNet: A Multi-Attention Fusion Network for RGB-T Crowd Counting

no code implementations 14 Aug 2022 PengYu Chen, Junyu Gao, Yuan Yuan, Qi Wang

RGB-Thermal (RGB-T) crowd counting is a challenging task, which uses thermal images as complementary information to RGB images to deal with the decreased performance of unimodal RGB-based methods in scenes with low-illumination or similar backgrounds.

Crowd Counting

Crowd Localization from Gaussian Mixture Scoped Knowledge and Scoped Teacher

no code implementations 12 Jun 2022 Juncheng Wang, Junyu Gao, Yuan Yuan, Qi Wang

The core reason of intrinsic scale shift being one of the most essential issues in crowd localization is that it is ubiquitous in crowd scenes and makes scale distribution chaotic.

Learning Muti-expert Distribution Calibration for Long-tailed Video Classification

no code implementations 22 May 2022 Yufan Hu, Junyu Gao, Changsheng Xu

Most existing state-of-the-art video classification methods assume that the training data obey a uniform distribution.

Classification Image Classification +1

Learning Commonsense-aware Moment-Text Alignment for Fast Video Temporal Grounding

1 code implementation 4 Apr 2022 Ziyue Wu, Junyu Gao, Shucheng Huang, Changsheng Xu

Then, a commonsense-aware interaction module is designed to obtain bridged visual and text features by utilizing the learned commonsense concepts.

Natural Language Queries

Fine-grained Temporal Contrastive Learning for Weakly-supervised Temporal Action Localization

1 code implementation CVPR 2022 Junyu Gao, Mengyuan Chen, Changsheng Xu

We target at the task of weakly-supervised action localization (WSAL), where only video-level action labels are available during model training.

Classification Contrastive Learning +4

DR.VIC: Decomposition and Reasoning for Video Individual Counting

2 code implementations CVPR 2022 Tao Han, Lei Bai, Junyu Gao, Qi Wang, Wanli Ouyang

Instead of relying on the Multiple Object Tracking (MOT) techniques, we propose to solve the problem by decomposing all pedestrians into the initial pedestrians who existed in the first frame and the new pedestrians with separate identities in each following frame.

Crowd Counting Density Estimation +2

Weakly-Supervised Video Object Grounding via Causal Intervention

no code implementations 1 Dec 2021 Wei Wang, Junyu Gao, Changsheng Xu

With this in mind, we design a unified causal framework to learn the deconfounded object-relevant association for more accurate and robust video object grounding.

Contrastive Learning Object +1

LDC-Net: A Unified Framework for Localization, Detection and Counting in Dense Crowds

no code implementations 10 Oct 2021 Qi Wang, Tao Han, Junyu Gao, Yuan Yuan, Xuelong Li

The rapid development in visual crowd analysis shows a trend to count people by positioning or even detecting, rather than simply summing a density map.

Visual Crowd Analysis

Unsupervised Domain Adaptive Learning via Synthetic Data for Person Re-identification

no code implementations 12 Sep 2021 Qi Wang, Sikai Bai, Junyu Gao, Yuan Yuan, Xuelong Li

In addition, due to domain gaps between different datasets, the performance is dramatically decreased when re-ID models pre-trained on label-rich datasets (source domain) are directly applied to other unlabeled datasets (target domain).

Person Re-Identification Unsupervised Domain Adaptation

Congested Crowd Instance Localization with Dilated Convolutional Swin Transformer

1 code implementation 2 Aug 2021 Junyu Gao, Maoguo Gong, Xuelong Li

To this end, we propose a Dilated Convolutional Swin Transformer (DCST) for congested crowd scenes.

Crowd Counting Representation Learning

Video Crowd Localization with Multi-focus Gaussian Neighborhood Attention and a Large-Scale Benchmark

1 code implementation 19 Jul 2021 Haopeng Li, Lingbo Liu, Kunlin Yang, Shinan Liu, Junyu Gao, Bin Zhao, Rui Zhang, Jun Hou

Video crowd localization is a crucial yet challenging task, which aims to estimate exact locations of human heads in the given crowded videos.

Health Status Prediction with Local-Global Heterogeneous Behavior Graph

no code implementations 23 Mar 2021 Xuan Ma, Xiaoshan Yang, Junyu Gao, Changsheng Xu

However, these data streams are multi-source and heterogeneous, containing complex temporal structures with local contextual and global temporal aspects, which makes the feature learning and data joint utilization challenging.

Management

Fast Video Moment Retrieval

no code implementations ICCV 2021 Junyu Gao, Changsheng Xu

To tackle this issue, we replace the cross-modal interaction module with a cross-modal common space, in which moment-query alignment is learned and efficient moment search can be performed.

Moment Retrieval Retrieval +1

Active Universal Domain Adaptation

no code implementations ICCV 2021 Xinhong Ma, Junyu Gao, Changsheng Xu

This paper proposes a new paradigm for unsupervised domain adaptation, termed as Active Universal Domain Adaptation (AUDA), which removes all label set assumptions and aims for not only recognizing target samples from source classes but also inferring those from target-private classes by using active learning to annotate a small budget of target data.

Active Learning Universal Domain Adaptation +1

Learning Independent Instance Maps for Crowd Localization

1 code implementation 8 Dec 2020 Junyu Gao, Tao Han, Qi Wang, Yuan Yuan, Xuelong Li

Furthermore, to improve the segmentation quality for different density regions, we present a differentiable Binarization Module (BM) to output structured instance maps.

Binarization Segmentation

Unsupervised Semantic Aggregation and Deformable Template Matching for Semi-Supervised Learning

1 code implementation NeurIPS 2020 Tao Han, Junyu Gao, Yuan Yuan, Qi Wang

In this paper, we combine both to propose an Unsupervised Semantic Aggregation and Deformable Template Matching (USADTM) framework for SSL, which strives to improve the classification performance with few labeled data and then reduce the cost in data annotating.

Template Matching
