Search Results for author: Zhisheng Zheng

Found 10 papers, 5 papers with code

BAT: Learning to Reason about Spatial Sounds with Large Language Models

no code implementations • 2 Feb 2024 • Zhisheng Zheng, Puyuan Peng, Ziyang Ma, Xie Chen, Eunsol Choi, David Harwath

By integrating Spatial-AST with the LLaMA-2 7B model, BAT transcends standard Sound Event Localization and Detection (SELD) tasks, enabling the model to reason about the relationships between the sounds in its environment.

Event Detection • Language Modelling • +5

EAT: Self-Supervised Pre-Training with Efficient Audio Transformer

1 code implementation • 7 Jan 2024 • Wenxi Chen, Yuzhe Liang, Ziyang Ma, Zhisheng Zheng, Xie Chen

Audio self-supervised learning (SSL) pre-training, which aims to learn good representations from unlabeled audio, has made remarkable progress.

Self-Supervised Learning

emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

2 code implementations • 23 Dec 2023 • Ziyang Ma, Zhisheng Zheng, Jiaxin Ye, Jinchao Li, Zhifu Gao, Shiliang Zhang, Xie Chen

To the best of our knowledge, emotion2vec is the first universal representation model in various emotion-related tasks, filling a gap in the field.

Self-Supervised Learning • Sentiment Analysis • +1

Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning

1 code implementation • 25 Sep 2023 • Guanrou Yang, Ziyang Ma, Zhisheng Zheng, Yakun Song, Zhikang Niu, Xie Chen

Recent years have witnessed significant advancements in self-supervised learning (SSL) methods for speech-processing tasks.

Representation Learning • Self-Supervised Learning • +2

Leveraging Speech PTM, Text LLM, and Emotional TTS for Speech Emotion Recognition

no code implementations • 19 Sep 2023 • Ziyang Ma, Wen Wu, Zhisheng Zheng, Yiwei Guo, Qian Chen, Shiliang Zhang, Xie Chen

In this paper, we explored how to boost speech emotion recognition (SER) with a state-of-the-art speech pre-trained model (PTM), data2vec; a text generation technique, GPT-4; and a speech synthesis technique, Azure TTS.

Data Augmentation • Language Modelling • +5

Unsupervised Active Learning: Optimizing Labeling Cost-Effectiveness for Automatic Speech Recognition

no code implementations • 28 Aug 2023 • Zhisheng Zheng, Ziyang Ma, Yu Wang, Xie Chen

In recent years, speech-based self-supervised learning (SSL) has made significant progress in various tasks, including automatic speech recognition (ASR).

Active Learning • Automatic Speech Recognition • +3

Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation

1 code implementation • 15 Jun 2023 • Ziyang Ma, Zhisheng Zheng, Guanrou Yang, Yu Wang, Chao Zhang, Xie Chen

Our models outperform other SSL models significantly on the LibriSpeech benchmark without the need for iterative re-clustering and re-training.

Ranked #1 on Automatic Speech Recognition on LibriSpeech test-other

Automatic Speech Recognition • Clustering • +4

Front-End Adapter: Adapting Front-End Input of Speech based Self-Supervised Learning for Speech Recognition

no code implementations • 18 Feb 2023 • Xie Chen, Ziyang Ma, Changli Tang, Yujin Wang, Zhisheng Zheng

However, training SSL models is computationally expensive, and a common practice is to fine-tune a released SSL model on the specific task.

Self-Supervised Learning • Speech Recognition • +1

MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets

1 code implementation • 14 Nov 2022 • Ziyang Ma, Zhisheng Zheng, Changli Tang, Yujin Wang, Xie Chen

In this paper, we provide a new perspective on self-supervised speech models based on how the training targets are obtained.

Ranked #40 on Speech Recognition on LibriSpeech test-other

Automatic Speech Recognition • Multi-Task Learning • +3

Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition

no code implementations • 27 Oct 2022 • Yujin Wang, Changli Tang, Ziyang Ma, Zhisheng Zheng, Xie Chen, Wei-Qiang Zhang

Recent years have witnessed great strides in self-supervised learning (SSL) for speech processing.

Automatic Speech Recognition (ASR) • +2
