Search Results for author: Jianbo Ma

Found 3 papers, 2 papers with code

V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models

1 code implementation • 18 Aug 2023 • Heng Wang, Jianbo Ma, Santiago Pascual, Richard Cartwright, Weidong Cai

In this paper, we propose a lightweight solution to this problem by leveraging foundation models, specifically CLIP, CLAP, and AudioLDM.

Audio Generation

A low latency attention module for streaming self-supervised speech representation learning

1 code implementation • 27 Feb 2023 • Jianbo Ma, Siqi Pan, Deepak Chandran, Andrea Fanelli, Richard Cartwright

The streaming attention (SA) represents our proposal for an efficient streaming self-supervised speech representation learning (SSRL) implementation, while the low-latency streaming attention (LLSA) solves the latency build-up problem of other streaming attention architectures, such as the masked acausal attention (MAA), guaranteeing a latency equal to one layer even when multiple layers are stacked.

Automatic Speech Recognition (ASR) +3
