Search Results for author: Xi Zhou

Found 32 papers, 13 papers with code

V2X Cooperative Perception for Autonomous Driving: Recent Advances and Challenges

no code implementations 5 Oct 2023 Tao Huang, Jianan Liu, Xi Zhou, Dinh C. Nguyen, Mostafa Rahimi Azghadi, Yuxuan Xia, Qing-Long Han, Sumei Sun

To address this gap, this paper provides a comprehensive overview of the evolution of CP technologies, spanning from early explorations to recent developments, including advancements in V2X communication technologies.

Autonomous Driving Object Recognition

Domain-adaptive Message Passing Graph Neural Network

1 code implementation 31 Aug 2023 Xiao Shen, Shirui Pan, Kup-Sze Choi, Xi Zhou

Cross-network node classification (CNNC), which aims to classify nodes in a label-deficient target network by transferring the knowledge from a source network with abundant labels, draws increasing attention recently.

Domain Adaptation Node Classification

Masked Spatio-Temporal Structure Prediction for Self-supervised Learning on Point Cloud Videos

1 code implementation ICCV 2023 Zhiqiang Shen, Xiaoxiao Sheng, Hehe Fan, Longguang Wang, Yulan Guo, Qiong Liu, Hao Wen, Xi Zhou

In this paper, we propose a Masked Spatio-Temporal Structure Prediction (MaST-Pre) method to capture the structure of point cloud videos without human annotations.

Self-Supervised Learning Video Understanding

All in One: Exploring Unified Vision-Language Tracking with Multi-Modal Alignment

no code implementations 7 Jul 2023 Chunhui Zhang, Xin Sun, Li Liu, Yiqian Yang, Qiong Liu, Xi Zhou, Yanfeng Wang

This approach achieves feature integration in a unified backbone, removing the need for carefully-designed fusion modules and resulting in a more effective and efficient VL tracking framework.

PointCMP: Contrastive Mask Prediction for Self-supervised Learning on Point Cloud Videos

1 code implementation CVPR 2023 Zhiqiang Shen, Xiaoxiao Sheng, Longguang Wang, Yulan Guo, Qiong Liu, Xi Zhou

Self-supervised learning can extract representations of good quality from solely unlabeled data, which is appealing for point cloud videos due to their high labelling cost.

Self-Supervised Learning Transfer Learning

HiCo: Hierarchical Contrastive Learning for Ultrasound Video Model Pretraining

1 code implementation 10 Oct 2022 Chunhui Zhang, Yixiong Chen, Li Liu, Qiong Liu, Xi Zhou

This work proposes a hierarchical contrastive learning (HiCo) method to improve the transferability for the US video model pretraining.

Contrastive Learning

A Data Driven Method for Multi-step Prediction of Ship Roll Motion in High Sea States

no code implementations 26 Jul 2022 Dan Zhang, Xi Zhou, Zi-Hao Wang, Yan Peng, Shao-Rong Xie

This paper presents a novel data-driven methodology to provide a multi-step prediction of ship roll motions in high sea states.

Feature Selection

You Need to Read Again: Multi-granularity Perception Network for Moment Retrieval in Videos

1 code implementation 25 May 2022 Xin Sun, Xuan Wang, Jialin Gao, Qiong Liu, Xi Zhou

Moment retrieval in videos is a challenging task that aims to retrieve the most relevant video moment in an untrimmed video given a sentence description.

Moment Retrieval Reading Comprehension +2

Composing Answer from Multi-spans for Reading Comprehension

no code implementations 14 Sep 2020 Zhuosheng Zhang, Yiqing Zhang, Hai Zhao, Xi Zhou, Xiang Zhou

This paper presents a novel method to generate answers for non-extraction machine reading comprehension (MRC) tasks whose answers cannot be simply extracted as one span from the given passages.

Machine Reading Comprehension

Filling the Gap of Utterance-aware and Speaker-aware Representation for Multi-turn Dialogue

1 code implementation 14 Sep 2020 Longxiang Liu, Zhuosheng Zhang, Hai Zhao, Xi Zhou, Xiang Zhou

A multi-turn dialogue is composed of multiple utterances from two or more different speaker roles.

Retrieval

Receptive Multi-granularity Representation for Person Re-Identification

no code implementations 31 Aug 2020 Guanshuo Wang, Yufeng Yuan, Jiwei Li, Shiming Ge, Xi Zhou

Current stripe-based feature learning approaches have delivered impressive accuracy, but do not make a proper trade-off between diversity, locality, and robustness, which easily suffers from part semantic inconsistency for the conflict between rigid partition and misalignment.

Person Re-Identification

Focusing and Diffusion: Bidirectional Attentive Graph Convolutional Networks for Skeleton-based Action Recognition

no code implementations 24 Dec 2019 Jialin Gao, Tong He, Xi Zhou, Shiming Ge

A collection of approaches based on graph convolutional networks have proven success in skeleton-based action recognition by exploring neighborhood information and dense dependencies between intra-frame joints.

Action Recognition Skeleton Based Action Recognition

Semantics-aware BERT for Language Understanding

1 code implementation 5 Sep 2019 Zhuosheng Zhang, Yuwei Wu, Hai Zhao, Zuchao Li, Shuailiang Zhang, Xi Zhou, Xiang Zhou

The latest work on language representations carefully integrates contextualized features into language model training, which enables a series of success especially in various machine reading comprehension and natural language inference tasks.

Language Modelling Machine Reading Comprehension +5

DCMN+: Dual Co-Matching Network for Multi-choice Reading Comprehension

2 code implementations 30 Aug 2019 Shuailiang Zhang, Hai Zhao, Yuwei Wu, Zhuosheng Zhang, Xi Zhou, Xiang Zhou

Multi-choice reading comprehension is a challenging task to select an answer from a set of candidate options when given passage and question.

Reading Comprehension Sentence

Relation-Aware Pyramid Network (RapNet) for temporal action proposal

no code implementations 9 Aug 2019 Jialin Gao, Zhixiang Shi, Jiani Li, Yufeng Yuan, Jiwei Li, Xi Zhou

In this technical report, we describe our solution to temporal action proposal (task 1) in ActivityNet Challenge 2019.

Relation

Pixel-Anchor: A Fast Oriented Scene Text Detector with Combined Networks

no code implementations 19 Nov 2018 Yuan Li, Yuanjie Yu, Zefeng Li, Yangkun Lin, Meifang Xu, Jiwei Li, Xi Zhou

Recently, semantic segmentation and general object detection frameworks have been widely adopted by scene text detecting tasks.

Object Detection +2

Cascaded CNN-resBiLSTM-CTC: An End-to-End Acoustic Model For Speech Recognition

no code implementations 29 Oct 2018 Xinpei Zhou, Jiwei Li, Xi Zhou

Automatic speech recognition (ASR) tasks are resolved by end-to-end deep learning models, which benefits us by less preparation of raw data, and easier transformation between languages.

Automatic Speech Recognition (ASR) +2

A novel pyramidal-FSMN architecture with lattice-free MMI for speech recognition

no code implementations 26 Oct 2018 Xuerui Yang, Jiwei Li, Xi Zhou

Deep Feedforward Sequential Memory Network (DFSMN) has shown superior performance on speech recognition tasks.

Sound Audio and Speech Processing

Toward Better Loanword Identification in Uyghur Using Cross-lingual Word Embeddings

no code implementations COLING 2018 Chenggang Mi, Yating Yang, Lei Wang, Xi Zhou, Tonghai Jiang

Neural machine translation models integrating results of loanword identification experiments achieve the best results on OOV translation (with 0.5-0.9 BLEU improvements).

Cross-Lingual Word Embeddings Language Modelling +3

Learning Discriminative Features with Multiple Granularities for Person Re-Identification

16 code implementations 4 Apr 2018 Guanshuo Wang, Yufeng Yuan, Xiong Chen, Jiwei Li, Xi Zhou

Instead of learning on semantic regions, we uniformly partition the images into several stripes, and vary the number of parts in different local branches to obtain local feature representations with multiple granularities.

Ranked #3 on Person Re-Identification on SYSU-30k (using extra training data)

Person Re-Identification Re-Ranking

A Deep Regression Architecture With Two-Stage Re-Initialization for High Performance Facial Landmark Detection

1 code implementation CVPR 2017 Jiangjing Lv, Xiaohu Shao, Junliang Xing, Cheng Cheng, Xi Zhou

At the global stage, given an image with a rough face detection result, the full face region is firstly re-initialized by a supervised spatial transformer network to a canonical shape state and then trained to regress a coarse landmark estimation.

Face Detection Facial Landmark Detection +1

A Bilingual Discourse Corpus and Its Applications

no code implementations LREC 2016 Yang Liu, Jiajun Zhang, Cheng-qing Zong, Yating Yang, Xi Zhou

Existing discourse research only focuses on the monolingual languages and the inconsistency between languages limits the power of the discourse theory in multilingual applications such as machine translation.

Machine Translation Translation
