Search Results for author: Yassir Fathullah

Found 13 papers, 1 paper with code

Teacher-Student Training for Debiasing: General Permutation Debiasing for Large Language Models

no code implementations • 20 Mar 2024 • Adian Liusie, Yassir Fathullah, Mark J. F. Gales

Large Language Models (LLMs) have demonstrated impressive zero-shot capabilities and versatility in NLP tasks; however, they sometimes fail to maintain crucial invariances for specific tasks.

AudioChatLlama: Towards General-Purpose Speech Abilities for LLMs

no code implementations • 12 Nov 2023 • Yassir Fathullah, Chunyang Wu, Egor Lakomkin, Ke Li, Junteng Jia, Yuan Shangguan, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer

In this work, we extend the instruction-tuned Llama-2 model with end-to-end general-purpose speech processing and reasoning abilities while maintaining the wide range of original LLM capabilities, without using any carefully curated paired data.

Question Answering
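
The AudioChatLlama snippet above does not describe the architecture in detail. As a rough, hypothetical illustration of the general recipe such work follows (an audio encoder whose states are projected into the LLM's token-embedding space and used as a soft prompt), here is a minimal PyTorch-style sketch; the module names, dimensions, and the HuggingFace-style `inputs_embeds` call are assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn


class AudioPrefixLM(nn.Module):
    """Illustrative only: feed projected audio-encoder states to a
    decoder-only LLM as a soft prefix ahead of the text embeddings."""

    def __init__(self, audio_encoder, llm, audio_dim=1024, llm_dim=4096):
        super().__init__()
        self.audio_encoder = audio_encoder          # assumed: returns (B, T_audio, audio_dim)
        self.llm = llm                              # assumed: HuggingFace-style causal LM
        self.proj = nn.Linear(audio_dim, llm_dim)   # maps audio states into the LLM embedding space

    def forward(self, audio_feats, text_token_ids):
        audio_states = self.audio_encoder(audio_feats)                  # (B, T_audio, audio_dim)
        audio_embeds = self.proj(audio_states)                          # (B, T_audio, llm_dim)
        text_embeds = self.llm.get_input_embeddings()(text_token_ids)   # (B, T_text, llm_dim)
        prompt = torch.cat([audio_embeds, text_embeds], dim=1)          # audio acts as a prefix
        return self.llm(inputs_embeds=prompt)                           # assumed HF-style keyword
```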

End-to-End Speech Recognition Contextualization with Large Language Models

no code implementations • 19 Sep 2023 • Egor Lakomkin, Chunyang Wu, Yassir Fathullah, Ozlem Kalinli, Michael L. Seltzer, Christian Fuegen

Overall, we demonstrate that by adding only a handful of trainable parameters via adapters, we can unlock contextualized speech recognition capability for the pretrained LLM while keeping the same text-only input functionality.

Language Modelling, speech-recognition +1
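
For the contextualization paper above, the abstract mentions adding a handful of trainable parameters via adapters. Below is a minimal sketch of a standard residual bottleneck adapter of the kind commonly inserted into frozen transformer layers; the placement, sizes, and activation are assumptions, not the paper's actual configuration.

```python
import torch.nn as nn


class BottleneckAdapter(nn.Module):
    """Standard residual bottleneck adapter (down-project, nonlinearity,
    up-project). Only these few parameters are trained while the host
    LLM stays frozen. Dimensions here are illustrative."""

    def __init__(self, hidden_dim=4096, bottleneck_dim=64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.GELU()

    def forward(self, hidden_states):
        # Residual connection keeps the frozen model's behaviour recoverable.
        return hidden_states + self.up(self.act(self.down(hidden_states)))


# Typical usage: freeze the pretrained LLM and train only the adapters.
# for p in llm.parameters():
#     p.requires_grad = False
```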

TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-device ASR Models

no code implementations • 5 Sep 2023 • Yuan Shangguan, Haichuan Yang, Danni Li, Chunyang Wu, Yassir Fathullah, Dilin Wang, Ayushi Dalmia, Raghuraman Krishnamoorthi, Ozlem Kalinli, Junteng Jia, Jay Mahadeokar, Xin Lei, Mike Seltzer, Vikas Chandra

Results demonstrate that our TODM Supernet either matches or surpasses the performance of manually tuned models by up to 3% relative in word error rate (WER), while efficiently keeping the cost of training many models at a small constant.

Automatic Speech Recognition, Automatic Speech Recognition (ASR) +2
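
The TODM abstract refers to training many deployable ASR models at roughly the cost of one, which supernet methods achieve through weight sharing between sub-models. The toy layer below illustrates that idea only; it is not the paper's RNN-T supernet, and the widths and sampling scheme are made up.

```python
import random

import torch
import torch.nn as nn


class SlimmableLinear(nn.Module):
    """Toy weight-sharing layer: smaller sub-models reuse the leading rows of
    one shared weight matrix, so training the supernet trains every
    deployable size at once. Widths here are arbitrary."""

    def __init__(self, in_dim=512, max_out_dim=2048, widths=(512, 1024, 2048)):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(max_out_dim, in_dim) * 0.02)
        self.bias = nn.Parameter(torch.zeros(max_out_dim))
        self.widths = widths

    def forward(self, x, width=None):
        w = width or random.choice(self.widths)   # sample a sub-model width per training step
        return nn.functional.linear(x, self.weight[:w], self.bias[:w])
```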

Prompting Large Language Models with Speech Recognition Abilities

no code implementations • 21 Jul 2023 • Yassir Fathullah, Chunyang Wu, Egor Lakomkin, Junteng Jia, Yuan Shangguan, Ke Li, Jinxi Guo, Wenhan Xiong, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer

Furthermore, we perform ablation studies investigating whether the LLM can be completely frozen during training to maintain its original capabilities, the effect of scaling up the audio encoder, and the effect of increasing the audio encoder striding to generate fewer embeddings.

Abstractive Text Summarization, Automatic Speech Recognition +3
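
One of the ablations mentioned above increases the audio encoder striding so that fewer (but wider) audio embeddings are passed to the LLM. The sketch below shows one simple way to do this by stacking consecutive frames; the exact mechanism used in the paper may differ.

```python
import torch


def stack_frames(encoder_states: torch.Tensor, stride: int) -> torch.Tensor:
    """Concatenate every `stride` consecutive audio-encoder frames so the LLM
    receives `stride`-times fewer, wider audio embeddings. Purely
    illustrative; a projection (e.g. torch.nn.Linear(d * stride, llm_dim))
    back to the LLM width would normally follow."""
    b, t, d = encoder_states.shape
    t = (t // stride) * stride                       # drop any trailing remainder frames
    return encoder_states[:, :t].reshape(b, t // stride, d * stride)
```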

CUED at ProbSum 2023: Hierarchical Ensemble of Summarization Models

1 code implementation • 8 Jun 2023 • Potsawee Manakul, Yassir Fathullah, Adian Liusie, Vyas Raina, Vatsal Raina, Mark Gales

In this paper, we consider the challenge of summarizing patients' medical progress notes in a limited data setting.

Multi-Head State Space Model for Speech Recognition

no code implementations • 21 May 2023 • Yassir Fathullah, Chunyang Wu, Yuan Shangguan, Junteng Jia, Wenhan Xiong, Jay Mahadeokar, Chunxi Liu, Yangyang Shi, Ozlem Kalinli, Mike Seltzer, Mark J. F. Gales

State space models (SSMs) have recently shown promising results on small-scale sequence and language modelling tasks, rivalling and outperforming many attention-based approaches.

Language Modelling, speech-recognition +1
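
For readers unfamiliar with state space models, the function below implements the basic discrete linear recurrence x_t = A x_{t-1} + B u_t, y_t = C x_t that SSM layers build on. It is background only; the paper's multi-head SSM is a learned, parallelised generalisation and is not reproduced here.

```python
import torch


def ssm_scan(A, B, C, u):
    """Minimal discrete linear state-space recurrence:
        x_t = A x_{t-1} + B u_t,   y_t = C x_t
    A is (state, state), B is (state, input), C is (output, state),
    and u is the input sequence of shape (T, input)."""
    x = torch.zeros(A.shape[0])
    ys = []
    for u_t in u:
        x = A @ x + B @ u_t        # update the hidden state
        ys.append(C @ x)           # read out an output per step
    return torch.stack(ys)         # (T, output)
```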

Who Needs Decoders? Efficient Estimation of Sequence-level Attributes

no code implementations • 9 May 2023 • Yassir Fathullah, Puria Radmard, Adian Liusie, Mark J. F. Gales

In these scenarios, where, for example, knowing the quality of a system's output in order to predict poor performance matters more than knowing the output itself, is it possible to bypass autoregressive decoding?

Attribute, Automatic Speech Recognition +4
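
The question posed above, i.e. predicting a sequence-level attribute such as output quality without running the autoregressive decoder, can be illustrated with a small non-autoregressive head over the encoder states. The pooling and layer sizes below are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn


class AttributeHead(nn.Module):
    """Illustrative decoder-free head: pool the encoder states over time and
    regress a sequence-level attribute (e.g. an error-rate proxy) directly,
    so no autoregressive decoding pass is needed."""

    def __init__(self, enc_dim=512, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(enc_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, encoder_states):          # (B, T, enc_dim)
        pooled = encoder_states.mean(dim=1)     # simple mean pooling over time
        return self.mlp(pooled).squeeze(-1)     # (B,) predicted attribute
```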

Self-Distribution Distillation: Efficient Uncertainty Estimation

no code implementations • 15 Mar 2022 • Yassir Fathullah, Mark J. F. Gales

Furthermore, it is possible to build ensembles of these models and apply hierarchical ensemble distillation approaches.

Out-of-Distribution Detection
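
Self-distribution distillation trains a single model to capture a distribution over predictive distributions, so that ensemble-style uncertainty estimates become cheap. The Dirichlet head below is one common parameterisation of that idea, shown only for illustration; it is not necessarily the paper's exact formulation.

```python
import torch.nn as nn
import torch.nn.functional as F


class DirichletHead(nn.Module):
    """One common way to let a single network express a *distribution over*
    class distributions: predict Dirichlet concentrations alpha, whose mean
    gives the prediction and whose total mass indicates confidence."""

    def __init__(self, hidden_dim=512, num_classes=10):
        super().__init__()
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, features):
        alpha = F.softplus(self.fc(features)) + 1.0        # positive concentrations
        probs = alpha / alpha.sum(dim=-1, keepdim=True)    # expected categorical
        # Smaller alpha0 -> flatter Dirichlet -> higher distributional uncertainty.
        return probs, alpha.sum(dim=-1)
```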

Subsequence Based Deep Active Learning for Named Entity Recognition

no code implementations • ACL 2021 • Puria Radmard, Yassir Fathullah, Aldo Lipani

Active Learning (AL) has been successfully applied to Deep Learning in order to drastically reduce the amount of data required to achieve high performance.

Active Learning, named-entity-recognition +3

Ensemble Distillation Approaches for Grammatical Error Correction

no code implementations • 24 Nov 2020 • Yassir Fathullah, Mark Gales, Andrey Malinin

It is, however, more challenging than the standard tasks investigated for distillation, as the prediction of any grammatical correction to a word is highly dependent on both the input sequence and the generated output history for that word.

Grammatical Error Correction
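
The abstract above notes that each corrected word depends on both the source sequence and the generated output history, which is why distillation for such tasks is typically done token by token under teacher forcing. The loss below sketches that setup for an ensemble of teachers; the averaging and reduction choices are assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F


def token_level_distillation_loss(student_logits, teacher_logits_list):
    """Token-level ensemble distillation under teacher forcing: every model is
    conditioned on the same source and the same output history, the teachers'
    per-token distributions are averaged, and the student matches that
    average with a KL term. Shapes: (B, T, V) per model."""
    teacher_probs = torch.stack(
        [F.softmax(t, dim=-1) for t in teacher_logits_list]
    ).mean(dim=0)                                          # ensemble-averaged targets
    log_student = F.log_softmax(student_logits, dim=-1)
    return F.kl_div(log_student, teacher_probs, reduction="batchmean")
```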

Improved Large-margin Softmax Loss for Speaker Diarisation

no code implementations • 10 Nov 2019 • Yassir Fathullah, Chao Zhang, Philip C. Woodland

Speaker diarisation systems nowadays use embeddings generated from speech segments in a bottleneck layer, which need to be discriminative for unseen speakers.
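
As background for the entry above, the function below implements a standard additive-margin (cosine) softmax, one widely used member of the large-margin family that speaker-embedding systems build on. It is not the paper's improved loss; the margin and scale values are arbitrary.

```python
import torch.nn.functional as F


def additive_margin_softmax_loss(embeddings, weights, labels, margin=0.2, scale=30.0):
    """Additive-margin softmax: cosine similarities between L2-normalised
    embeddings and class weights, with the margin subtracted only on the
    true-class logit, which pushes embeddings to be more discriminative."""
    emb = F.normalize(embeddings, dim=-1)          # (B, D) speaker embeddings
    w = F.normalize(weights, dim=-1)               # (num_speakers, D) class weights
    cosine = emb @ w.t()                           # (B, num_speakers)
    one_hot = F.one_hot(labels, w.shape[0]).to(cosine.dtype)
    logits = scale * (cosine - margin * one_hot)   # margin applied to the target class only
    return F.cross_entropy(logits, labels)
```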
