Search Results for author: Linjun Li

Found 11 papers, 5 papers with code

SPOT: Selective Point Cloud Voting for Better Proposal in Point Cloud Object Detection

no code implementations • ECCV 2020 • Hongyuan Du, Linjun Li, Bo Liu, Nuno Vasconcelos

The sparsity of point clouds limits deep learning models on capturing long-range dependencies, which makes features extracted by the models ambiguous.

Object object-detection +1

Paper
Add Code

TransFace: Unit-Based Audio-Visual Speech Synthesizer for Talking Head Translation

no code implementations • 23 Dec 2023 • Xize Cheng, Rongjie Huang, Linjun Li, Tao Jin, Zehan Wang, Aoxiong Yin, Minglei Li, Xinyu Duan, Changpeng Yang, Zhou Zhao

However, talking head translation, converting audio-visual speech (i. e., talking head video) from one language into another, still confronts several challenges compared to audio speech: (1) Existing methods invariably rely on cascading, synthesizing via both audio and text, resulting in delays and cascading errors.

Self-Supervised Learning Speech-to-Speech Translation +1

Paper
Add Code

3DRP-Net: 3D Relative Position-aware Network for 3D Visual Grounding

no code implementations • 25 Jul 2023 • Zehan Wang, Haifeng Huang, Yang Zhao, Linjun Li, Xize Cheng, Yichen Zhu, Aoxiong Yin, Zhou Zhao

3D visual grounding aims to localize the target object in a 3D point cloud by a free-form language description.

Object Position +3

Paper
Add Code

Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding

1 code implementation • ICCV 2023 • Zehan Wang, Haifeng Huang, Yang Zhao, Linjun Li, Xize Cheng, Yichen Zhu, Aoxiong Yin, Zhou Zhao

To accomplish this, we design a novel semantic matching model that analyzes the semantic similarity between object proposals and sentences in a coarse-to-fine manner.

Object Semantic Similarity +3

Paper
Code

OpenSR: Open-Modality Speech Recognition via Maintaining Multi-Modality Alignment

1 code implementation • 10 Jun 2023 • Xize Cheng, Tao Jin, Linjun Li, Wang Lin, Xinyu Duan, Zhou Zhao

We demonstrate that OpenSR enables modality transfer from one to any in three different settings (zero-, few- and full-shot), and achieves highly competitive zero-shot performance compared to the existing few-shot and full-shot lip-reading methods.

Audio-Visual Speech Recognition Lip Reading +2

Paper
Code

AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation

no code implementations • 24 May 2023 • Rongjie Huang, Huadai Liu, Xize Cheng, Yi Ren, Linjun Li, Zhenhui Ye, Jinzheng He, Lichao Zhang, Jinglin Liu, Xiang Yin, Zhou Zhao

Direct speech-to-speech translation (S2ST) aims to convert speech from one language into another, and has demonstrated significant progress to date.

Speech-to-Speech Translation Translation

Paper
Add Code

Connecting Multi-modal Contrastive Representations

no code implementations • NeurIPS 2023 • Zehan Wang, Yang Zhao, Xize Cheng, Haifeng Huang, Jiageng Liu, Li Tang, Linjun Li, Yongqi Wang, Aoxiong Yin, Ziang Zhang, Zhou Zhao

This paper proposes a novel training-efficient method for learning MCR without paired data called Connecting Multi-modal Contrastive Representations (C-MCR).

3D Point Cloud Classification counterfactual +4

Paper
Add Code

MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition

2 code implementations • ICCV 2023 • Xize Cheng, Linjun Li, Tao Jin, Rongjie Huang, Wang Lin, Zehan Wang, Huangdai Liu, Ye Wang, Aoxiong Yin, Zhou Zhao

However, despite researchers exploring cross-lingual translation techniques such as machine translation and audio speech translation to overcome language barriers, there is still a shortage of cross-lingual studies on visual speech.

Lip Reading Machine Translation +4

157

Paper
Code

Exploring Group Video Captioning with Efficient Relational Approximation

no code implementations • ICCV 2023 • Wang Lin, Tao Jin, Ye Wang, Wenwen Pan, Linjun Li, Xize Cheng, Zhou Zhao

In this study, we propose a new task, group video captioning, which aims to infer the desired content among a group of target videos and describe it with another group of related reference videos.

Video Captioning

Paper
Add Code

Motion Planning Transformers: A Motion Planning Framework for Mobile Robots

1 code implementation • 5 Jun 2021 • Jacob J. Johnson, Uday S. Kalra, Ankit Bhatia, Linjun Li, Ahmed H. Qureshi, Michael C. Yip

A popular technique to improve the efficiency of these planners is to restrict search space in the planning domain.

Motion Planning valid

Paper
Code

MPC-MPNet: Model-Predictive Motion Planning Networks for Fast, Near-Optimal Planning under Kinodynamic Constraints

1 code implementation • 17 Jan 2021 • Linjun Li, Yinglong Miao, Ahmed H. Qureshi, Michael C. Yip

Kinodynamic Motion Planning (KMP) is to find a robot motion subject to concurrent kinematics and dynamics constraints.

Imitation Learning Motion Planning

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.