Search Results for author: Jianbo Ma

Found 3 papers, 2 papers with code

A unified multichannel far-field speech recognition system: combining neural beamforming with attention based end-to-end model

no code implementations • 5 Jan 2024 • Dongdi Zhao, Jianbo Ma, Lu Lu, Jinke Li, Xuan Ji, Lei Zhu, Fuming Fang, Ming Liu, Feijun Jiang

Far-field speech recognition is a challenging task that conventionally relies on signal-processing beamforming to tackle noise and interference problems.

Speech Enhancement, Speech Recognition (+1 more)

V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models

1 code implementation • 18 Aug 2023 • Heng Wang, Jianbo Ma, Santiago Pascual, Richard Cartwright, Weidong Cai

In this paper, we propose a lightweight solution to vision-to-audio generation by leveraging foundation models, specifically CLIP, CLAP, and AudioLDM.

Audio Generation

A low latency attention module for streaming self-supervised speech representation learning

1 code implementation • 27 Feb 2023 • Jianbo Ma, Siqi Pan, Deepak Chandran, Andrea Fanelli, Richard Cartwright

The SA represents our proposal for an efficient streaming implementation of self-supervised speech representation learning (SSRL), while the LLSA solves the latency build-up problem of other streaming attention architectures, such as masked acausal attention (MAA), guaranteeing a latency equal to that of one layer even when multiple layers are stacked.
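
The latency build-up mentioned in this abstract can be illustrated with a toy dependency calculation. This is a minimal sketch under stated assumptions, not the paper's implementation: it assumes a masked acausal attention layer lets each frame attend to all past frames plus a fixed number of future frames, and it tracks which input frames an output frame ends up depending on as such layers are stacked. The names `layer_mask`, `compose`, and `effective_lookahead` are hypothetical and exist only for this example.

```python
def layer_mask(num_frames, lookahead):
    """mask[t][s] is True when frame t may attend to frame s.

    Assumption: each frame sees every past frame plus `lookahead` future frames.
    """
    return [[s <= t + lookahead for s in range(num_frames)]
            for t in range(num_frames)]


def compose(upper, lower):
    """Input dependencies of a layer with mask `upper` stacked on `lower`."""
    n = len(upper)
    return [[any(upper[t][k] and lower[k][s] for k in range(n))
             for s in range(n)]
            for t in range(n)]


def effective_lookahead(num_layers, lookahead, num_frames=32, probe=0):
    """Future input frames that the output at frame `probe` depends on."""
    single = layer_mask(num_frames, lookahead)
    dep = single
    for _ in range(num_layers - 1):
        dep = compose(single, dep)
    last = max(s for s in range(num_frames) if dep[probe][s])
    return last - probe


if __name__ == "__main__":
    print(effective_lookahead(1, 2))  # 2: one layer, two look-ahead frames
    print(effective_lookahead(4, 2))  # 8: the look-ahead accumulates per layer
```

In this toy model the look-ahead grows linearly with depth, which is the build-up the abstract refers to. The LLSA itself is not modelled here; according to the abstract, it keeps the total latency at the one-layer value regardless of depth.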

Automatic Speech Recognition (ASR) (+3 more)
