Search Results for author: Yuxin Song

Found 10 papers, 7 papers with code

A Survey on Consumer IoT Traffic: Security and Privacy

1 code implementation • 24 Mar 2024 • Yan Jia, Yuxin Song, Zihou Liu, Qingyin Tan, Fangming Wang, Yu Zhang, Zheli Liu

From the security and privacy perspective, this survey seeks out the new characteristics in CIoT traffic analysis, the state-of-the-art progress in CIoT traffic analysis, and the challenges yet to be solved.

Paper
Code

Multi-level graph learning for audio event classification and human-perceived annoyance rating prediction

1 code implementation • 15 Dec 2023 • Yuanbo Hou, Qiaoqiao Ren, Siyang Song, Yuxin Song, Wenwu Wang, Dick Botteldooren

Specifically, this paper proposes a lightweight multi-level graph learning (MLGL) based on local and global semantic graphs to simultaneously perform audio event classification (AEC) and human annoyance rating prediction (ARP).

Graph Learning

Paper
Code

GPT4Vis: What Can GPT-4 Do for Zero-shot Visual Recognition?

2 code implementations • 27 Nov 2023 • Wenhao Wu, Huanjin Yao, Mengxi Zhang, Yuxin Song, Wanli Ouyang, Jingdong Wang

Our study centers on the evaluation of GPT-4's linguistic and visual capabilities in zero-shot visual recognition tasks: Firstly, we explore the potential of its generated rich textual descriptions across various categories to enhance recognition performance without any training.

Zero-Shot Learning

842

Paper
Code

What Can Simple Arithmetic Operations Do for Temporal Modeling?

2 code implementations • ICCV 2023 • Wenhao Wu, Yuxin Song, Zhun Sun, Jingdong Wang, Chang Xu, Wanli Ouyang

We conduct comprehensive ablation studies on the instantiation of ATMs and demonstrate that this module provides powerful temporal modeling capability at a low computational cost.

Ranked #4 on Action Recognition on Something-Something V1

Action Classification Action Recognition +1

Paper
Code

UATVR: Uncertainty-Adaptive Text-Video Retrieval

1 code implementation • ICCV 2023 • Bo Fang, Wenhao Wu, Chang Liu, Yu Zhou, Yuxin Song, Weiping Wang, Xiangbo Shu, Xiangyang Ji, Jingdong Wang

In the refined embedding space, we represent text-video pairs as probabilistic distributions where prototypes are sampled for matching evaluation.

Retrieval Semantic correspondence +1

Paper
Code

GRATIS: Deep Learning Graph Representation with Task-specific Topology and Multi-dimensional Edge Features

1 code implementation • 19 Nov 2022 • Siyang Song, Yuxin Song, Cheng Luo, Zhiyuan Song, Selim Kuzucu, Xi Jia, Zhijiang Guo, Weicheng Xie, Linlin Shen, Hatice Gunes

Our framework is effective, robust and flexible, and is a plug-and-play module that can be combined with different backbones and Graph Neural Networks (GNNs) to generate a task-specific graph representation from various graph and non-graph data.

Graph Representation Learning

Paper
Code

Multi-dimensional Edge-based Audio Event Relational Graph Representation Learning for Acoustic Scene Classification

1 code implementation • 27 Oct 2022 • Yuanbo Hou, Siyang Song, Chuang Yu, Yuxin Song, Wenwu Wang, Dick Botteldooren

Experiments on a polyphonic acoustic scene dataset show that the proposed ERGL achieves competitive performance on ASC by using only a limited number of embeddings of audio events without any data augmentations.

Ranked #1 on Acoustic Scene Classification on TUT Urban Acoustic Scenes 2018

Acoustic Scene Classification Graph Representation Learning +1

Paper
Code

It Takes Two: Masked Appearance-Motion Modeling for Self-supervised Video Transformer Pre-training

no code implementations • 11 Oct 2022 • Yuxin Song, Min Yang, Wenhao Wu, Dongliang He, Fu Li, Jingdong Wang

In order to guide the encoder to fully excavate spatial-temporal features, two separate decoders are used for two pretext tasks of disentangled appearance and motion prediction.

motion prediction

Paper
Add Code

DALG: Deep Attentive Local and Global Modeling for Image Retrieval

no code implementations • 1 Jul 2022 • Yuxin Song, Ruolin Zhu, Min Yang, Dongliang He

Deeply learned representations have achieved superior image retrieval performance in a retrieve-then-rerank manner.

Image Retrieval Representation Learning +1

Paper
Add Code

Semi-Heterogeneous Three-Way Joint Embedding Network for Sketch-Based Image Retrieval

no code implementations • 10 Nov 2019 • Jianjun Lei, Yuxin Song, Bo Peng, Zhanyu Ma, Ling Shao, Yi-Zhe Song

How to align abstract sketches and natural images into a common high-level semantic space remains a key problem in SBIR.

Retrieval Sketch-Based Image Retrieval

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.