Search Results for author: Shaoning Xiao

Found 5 papers, 1 papers with code

Rethinking the Evaluation of Unbiased Scene Graph Generation

no code implementations • 3 Aug 2022 • Xingchen Li, Long Chen, Jian Shao, Shaoning Xiao, Songyang Zhang, Jun Xiao

Current Scene Graph Generation (SGG) methods tend to predict frequent predicate categories and fail to recognize rare ones due to the severe imbalanced distribution of predicates.

Graph Generation Unbiased Scene Graph Generation

Paper
Add Code

Rethinking Multi-Modal Alignment in Video Question Answering from Feature and Sample Perspectives

no code implementations • 25 Apr 2022 • Shaoning Xiao, Long Chen, Kaifeng Gao, Zhao Wang, Yi Yang, Zhimeng Zhang, Jun Xiao

From the view of feature, we break down the video into trajectories and first leverage trajectory feature in VideoQA to enhance the alignment between two modalities.

Question Answering Video Question Answering

Paper
Add Code

Natural Language Video Localization with Learnable Moment Proposals

1 code implementation • EMNLP 2021 • Shaoning Xiao, Long Chen, Jian Shao, Yueting Zhuang, Jun Xiao

Given an untrimmed video and a natural language query, Natural Language Video Localization (NLVL) aims to identify the video moment described by the query.

Paper
Code

Deep Learning for Weakly-Supervised Object Detection and Object Localization: A Survey

no code implementations • 26 May 2021 • Feifei Shao, Long Chen, Jian Shao, Wei Ji, Shaoning Xiao, Lu Ye, Yueting Zhuang, Jun Xiao

With the success of deep neural networks in object detection, both WSOD and WSOL have received unprecedented attention.

Object object-detection +2

Paper
Add Code

Boundary Proposal Network for Two-Stage Natural Language Video Localization

no code implementations • 15 Mar 2021 • Shaoning Xiao, Long Chen, Songyang Zhang, Wei Ji, Jian Shao, Lu Ye, Jun Xiao

State-of-the-art NLVL methods are almost in one-stage fashion, which can be typically grouped into two categories: 1) anchor-based approach: it first pre-defines a series of video segment candidates (e. g., by sliding window), and then does classification for each candidate; 2) anchor-free approach: it directly predicts the probabilities for each video frame as a boundary or intermediate frame inside the positive segment.

Vocal Bursts Valence Prediction

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.