Search Results for author: Linjun Li

Found 11 papers, 5 papers with code

SPOT: Selective Point Cloud Voting for Better Proposal in Point Cloud Object Detection

no code implementations ECCV 2020 Hongyuan Du, Linjun Li, Bo Liu, Nuno Vasconcelos

The sparsity of point clouds limits deep learning models on capturing long-range dependencies, which makes features extracted by the models ambiguous.

Object object-detection +1

TransFace: Unit-Based Audio-Visual Speech Synthesizer for Talking Head Translation

no code implementations23 Dec 2023 Xize Cheng, Rongjie Huang, Linjun Li, Tao Jin, Zehan Wang, Aoxiong Yin, Minglei Li, Xinyu Duan, Changpeng Yang, Zhou Zhao

However, talking head translation, converting audio-visual speech (i. e., talking head video) from one language into another, still confronts several challenges compared to audio speech: (1) Existing methods invariably rely on cascading, synthesizing via both audio and text, resulting in delays and cascading errors.

Self-Supervised Learning Speech-to-Speech Translation +1

3DRP-Net: 3D Relative Position-aware Network for 3D Visual Grounding

no code implementations25 Jul 2023 Zehan Wang, Haifeng Huang, Yang Zhao, Linjun Li, Xize Cheng, Yichen Zhu, Aoxiong Yin, Zhou Zhao

3D visual grounding aims to localize the target object in a 3D point cloud by a free-form language description.

Object Position +3

Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding

1 code implementation ICCV 2023 Zehan Wang, Haifeng Huang, Yang Zhao, Linjun Li, Xize Cheng, Yichen Zhu, Aoxiong Yin, Zhou Zhao

To accomplish this, we design a novel semantic matching model that analyzes the semantic similarity between object proposals and sentences in a coarse-to-fine manner.

Object Semantic Similarity +3

OpenSR: Open-Modality Speech Recognition via Maintaining Multi-Modality Alignment

1 code implementation10 Jun 2023 Xize Cheng, Tao Jin, Linjun Li, Wang Lin, Xinyu Duan, Zhou Zhao

We demonstrate that OpenSR enables modality transfer from one to any in three different settings (zero-, few- and full-shot), and achieves highly competitive zero-shot performance compared to the existing few-shot and full-shot lip-reading methods.

Audio-Visual Speech Recognition Lip Reading +2

AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation

no code implementations24 May 2023 Rongjie Huang, Huadai Liu, Xize Cheng, Yi Ren, Linjun Li, Zhenhui Ye, Jinzheng He, Lichao Zhang, Jinglin Liu, Xiang Yin, Zhou Zhao

Direct speech-to-speech translation (S2ST) aims to convert speech from one language into another, and has demonstrated significant progress to date.

Speech-to-Speech Translation Translation

Connecting Multi-modal Contrastive Representations

no code implementations NeurIPS 2023 Zehan Wang, Yang Zhao, Xize Cheng, Haifeng Huang, Jiageng Liu, Li Tang, Linjun Li, Yongqi Wang, Aoxiong Yin, Ziang Zhang, Zhou Zhao

This paper proposes a novel training-efficient method for learning MCR without paired data called Connecting Multi-modal Contrastive Representations (C-MCR).

3D Point Cloud Classification counterfactual +4

MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition

2 code implementations ICCV 2023 Xize Cheng, Linjun Li, Tao Jin, Rongjie Huang, Wang Lin, Zehan Wang, Huangdai Liu, Ye Wang, Aoxiong Yin, Zhou Zhao

However, despite researchers exploring cross-lingual translation techniques such as machine translation and audio speech translation to overcome language barriers, there is still a shortage of cross-lingual studies on visual speech.

Lip Reading Machine Translation +4

Exploring Group Video Captioning with Efficient Relational Approximation

no code implementations ICCV 2023 Wang Lin, Tao Jin, Ye Wang, Wenwen Pan, Linjun Li, Xize Cheng, Zhou Zhao

In this study, we propose a new task, group video captioning, which aims to infer the desired content among a group of target videos and describe it with another group of related reference videos.

Video Captioning

Motion Planning Transformers: A Motion Planning Framework for Mobile Robots

1 code implementation5 Jun 2021 Jacob J. Johnson, Uday S. Kalra, Ankit Bhatia, Linjun Li, Ahmed H. Qureshi, Michael C. Yip

A popular technique to improve the efficiency of these planners is to restrict search space in the planning domain.

Motion Planning valid

Cannot find the paper you are looking for? You can Submit a new open access paper.