Search Results for author: Xuenan Xu

Found 8 papers, 3 papers with code

SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound

no code implementations • 30 Apr 2024 • Haohe Liu, Xuenan Xu, Yi Yuan, Mengyue Wu, Wenwu Wang, Mark D. Plumbley

Large language models (LLMs) have significantly advanced audio processing through audio codecs that convert audio into discrete tokens, enabling the application of language modelling techniques to audio data.

Decoder, Language Modelling

T-CLAP: Temporal-Enhanced Contrastive Language-Audio Pretraining

no code implementations • 27 Apr 2024 • Yi Yuan, Zhuo Chen, Xubo Liu, Haohe Liu, Xuenan Xu, Dongya Jia, Yuanzhe Chen, Mark D. Plumbley, Wenwu Wang

Contrastive language-audio pretraining (CLAP) has been developed to align the representations of audio and language, achieving remarkable performance in retrieval and classification tasks.

Retrieval

A Large-scale Dataset for Audio-Language Representation Learning

no code implementations • 20 Sep 2023 • Luoyi Sun, Xuenan Xu, Mengyue Wu, Weidi Xie

To tackle these challenges, we present an innovative and automatic audio caption generation pipeline based on a series of public tools or APIs, and construct a large-scale, high-quality audio-language dataset, named Auto-ACD, comprising over 1.9M audio-text pairs.

Audio captioning, Caption Generation +2

Improving Audio Caption Fluency with Automatic Error Correction

no code implementations • 16 Jun 2023 • Hanxue Zhang, Zeyu Xie, Xuenan Xu, Mengyue Wu, Kai Yu

Automated audio captioning (AAC) is an important cross-modality translation task, aiming at generating descriptions for audio clips.

Audio captioning, Sentence

Audio-text Retrieval in Context

no code implementations • 25 Mar 2022 • Siyu Lou, Xuenan Xu, Mengyue Wu, Kai Yu

Using pre-trained audio features and a descriptor-based aggregation method, we build our contextual audio-text retrieval system.

AudioCaps, Retrieval +1

Can Audio Captions Be Evaluated with Image Caption Metrics?

1 code implementation • 10 Oct 2021 • Zelin Zhou, Zhiling Zhang, Xuenan Xu, Zeyu Xie, Mengyue Wu, Kenny Q. Zhu

Current metrics are found to correlate poorly with human annotations on these datasets.

AudioCaps, Audio captioning +2

THE SJTU SYSTEM FOR DCASE2021 CHALLENGE TASK 6: AUDIO CAPTIONING BASED ON ENCODER PRE-TRAINING AND REINFORCEMENT LEARNING

1 code implementation • DCASE Challenge 2021 • Xuenan Xu, Zeyu Xie, Mengyue Wu, Kai Yu

This report proposes an audio captioning system for Task 6 of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2021 challenge.

Ranked #2 on Audio captioning on Clotho (using extra training data)

Audio captioning, Audio Tagging +3

Audio Caption in a Car Setting with a Sentence-Level Loss

1 code implementation • 31 May 2019 • Xuenan Xu, Heinrich Dinkel, Mengyue Wu, Kai Yu

Captioning has attracted much attention in image and video understanding, while only a small amount of work has examined audio captioning.

Audio captioning, Decoder +6
