Spoken Language Understanding
118 papers with code • 5 benchmarks • 14 datasets
Latest papers with no code
Creating Spoken Dialog Systems in Ultra-Low Resourced Settings
We build on existing lightweight models for intent classification in Flemish; our main contribution is applying augmentation techniques at two levels -- the voice level and the phonetic-transcript level -- to counter the scarcity of labeled data in low-resource languages.
Leveraging cache to enable SLU on tiny devices
Our idea is simple: let the device match new inputs against cached results, and only offload unmatched inputs to the cloud for full inference.
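The match-or-offload idea above can be sketched in a few lines. This is an illustrative sketch only, assuming approximate matching via quantized feature signatures and a FIFO-evicted cache; the class and function names (`SLUCache`, `classify`, `cloud_infer`) are hypothetical, not from the paper.

```python
import hashlib

class SLUCache:
    """Tiny on-device cache mapping input signatures to SLU results (illustrative)."""

    def __init__(self, capacity=128):
        self.capacity = capacity
        self.store = {}  # signature -> cached result

    def _signature(self, audio_features):
        # Stand-in for a learned matcher: hash coarsely quantized features,
        # so near-identical inputs collide on the same cache entry.
        quantized = tuple(round(x, 1) for x in audio_features)
        return hashlib.md5(repr(quantized).encode()).hexdigest()

    def lookup(self, audio_features):
        return self.store.get(self._signature(audio_features))

    def insert(self, audio_features, result):
        if len(self.store) >= self.capacity:
            self.store.pop(next(iter(self.store)))  # FIFO eviction
        self.store[self._signature(audio_features)] = result


def classify(audio_features, cache, cloud_infer):
    """Serve from cache when possible; offload only unmatched inputs."""
    cached = cache.lookup(audio_features)
    if cached is not None:
        return cached                     # served on-device
    result = cloud_infer(audio_features)  # full inference in the cloud
    cache.insert(audio_features, result)
    return result
```

A second, near-identical input then hits the cache and never reaches the cloud, which is the latency and energy win on tiny devices.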
Co-guiding for Multi-intent Spoken Language Understanding
For the first stage, we propose single-task supervised contrastive learning; for the second stage, we propose co-guiding supervised contrastive learning, which incorporates the mutual guidance between the two tasks into the contrastive learning procedure.
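For readers unfamiliar with the supervised contrastive objective that both stages build on, here is a minimal NumPy sketch of the generic supervised contrastive (SupCon-style) loss. This shows only the single-task form; the paper's co-guiding variant, which couples the intent and slot tasks, is not reproduced here, and the function name is illustrative.

```python
import numpy as np

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss: pull same-label embeddings
    together, push different-label embeddings apart (illustrative sketch)."""
    # L2-normalize so similarity is cosine similarity.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = len(labels)
    total = 0.0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        denom = sum(np.exp(sim[i, j]) for j in range(n) if j != i)
        # Average negative log-likelihood of each positive against all others.
        total += -sum(np.log(np.exp(sim[i, j]) / denom)
                      for j in positives) / len(positives)
    return total / n
```

When same-label points sit close together in embedding space, the loss is low; mismatched label assignments drive it up, which is what the training signal exploits.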
ML-LMCL: Mutual Learning and Large-Margin Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding
Specifically, in fine-tuning, we apply mutual learning and train two SLU models on the manual transcripts and the ASR transcripts, respectively, aiming to iteratively share knowledge between these two models.
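The mutual-learning idea above is commonly formulated as each model adding a KL term pulling its predictions toward the other's. The sketch below assumes that standard formulation (cross-entropy plus a symmetric pair of KL regularizers); the function names and the weighting scheme are illustrative, not taken from ML-LMCL.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def kl(p, q, eps=1e-12):
    """KL divergence KL(p || q) for discrete distributions."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def mutual_learning_losses(logits_manual, logits_asr, label, alpha=0.5):
    """Losses for two SLU models trained in parallel: one on manual
    transcripts, one on ASR transcripts, each regularized toward the other."""
    p = softmax(logits_manual)  # model on manual transcripts
    q = softmax(logits_asr)     # model on ASR transcripts
    ce_p = -np.log(p[label])    # each model's own task loss
    ce_q = -np.log(q[label])
    # Knowledge is shared iteratively through the KL terms.
    loss_p = ce_p + alpha * kl(q, p)
    loss_q = ce_q + alpha * kl(p, q)
    return loss_p, loss_q
```

When the two models agree, the KL terms vanish and only the task losses remain; disagreement on ASR-corrupted inputs is exactly where the extra gradient signal helps robustness.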
Generalized zero-shot audio-to-intent classification
Our multimodal training approach improves zero-shot intent classification accuracy on unseen intents by 2.75% on SLURP and by 18.2% on an internal goal-oriented dialog dataset, compared to audio-only training.
Toward Joint Language Modeling for Speech Units and Text
However, in the field of language modeling, very little effort has been made to model them jointly.
Few-Shot Spoken Language Understanding via Joint Speech-Text Models
Recent work on speech representation models jointly pre-trained with text has demonstrated the potential of improving speech representations by encoding speech and text in a shared space.
Improving End-to-End Speech Processing by Efficient Text Data Utilization with Latent Synthesis
For SLU, LaSyn improves our E2E baseline by an absolute 4.1% in intent classification accuracy and 3.8% in slot-filling SLU-F1 on SLURP, and by an absolute 4.49% and 2.25% in exact match (EM) and EM-Tree accuracies on STOP, respectively.
Continual Contrastive Spoken Language Understanding
In this paper, we investigate the problem of learning sequence-to-sequence models for spoken language understanding in a class-incremental learning (CIL) setting, and we propose COCONUT, a CIL method that combines experience replay with contrastive learning.
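The experience-replay half of that combination can be sketched concisely. This is a generic reservoir-sampled replay buffer, assumed as one standard way to implement replay in CIL; COCONUT's contrastive losses are not reproduced, and the class and method names are illustrative.

```python
import random

class ReplayBuffer:
    """Keeps a small uniform sample of past-task examples and mixes
    them into batches for the current task (illustrative sketch)."""

    def __init__(self, capacity=200, seed=0):
        self.capacity = capacity
        self.buffer = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        # Reservoir sampling: every example in the stream ends up in the
        # buffer with equal probability, without storing the whole stream.
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = example

    def mixed_batch(self, new_batch, replay_size=4):
        # Rehearse old classes alongside the new task's batch.
        k = min(replay_size, len(self.buffer))
        return new_batch + self.rng.sample(self.buffer, k)
```

Mixing replayed examples into each new task's batches is what counters catastrophic forgetting of earlier intent classes as new ones arrive.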
UniverSLU: Universal Spoken Language Understanding for Diverse Tasks with Natural Language Instructions
Recent studies leverage large language models with multi-tasking capabilities, using natural language prompts to guide the model's behavior and surpassing the performance of task-specific models.