Search Results for author: Yassir Fathullah

Found 13 papers, 1 paper with code

Teacher-Student Training for Debiasing: General Permutation Debiasing for Large Language Models

no code implementations • 20 Mar 2024 • Adian Liusie, Yassir Fathullah, Mark J. F. Gales

Large Language Models (LLMs) have demonstrated impressive zero-shot capabilities and versatility in NLP tasks; however, they sometimes fail to maintain crucial invariances for specific tasks.

AudioChatLlama: Towards General-Purpose Speech Abilities for LLMs

no code implementations • 12 Nov 2023 • Yassir Fathullah, Chunyang Wu, Egor Lakomkin, Ke Li, Junteng Jia, Yuan Shangguan, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer

In this work, we extend the instruction-tuned Llama-2 model with end-to-end general-purpose speech processing and reasoning abilities while maintaining the wide range of original LLM capabilities, without using any carefully curated paired data.

Question Answering
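
The AudioChatLlama snippet above does not describe the architecture in detail. As a rough, hypothetical illustration of the general recipe such work follows (an audio encoder whose states are projected into the LLM's token-embedding space and used as a soft prompt), here is a minimal PyTorch-style sketch; the module names, dimensions, and the HuggingFace-style `inputs_embeds` call are assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn


class AudioPrefixLM(nn.Module):
    """Illustrative only: feed projected audio-encoder states to a
    decoder-only LLM as a soft prefix ahead of the text embeddings."""

    def __init__(self, audio_encoder, llm, audio_dim=1024, llm_dim=4096):
        super().__init__()
        self.audio_encoder = audio_encoder          # assumed: returns (B, T_audio, audio_dim)
        self.llm = llm                              # assumed: HuggingFace-style causal LM
        self.proj = nn.Linear(audio_dim, llm_dim)   # maps audio states into the LLM embedding space

    def forward(self, audio_feats, text_token_ids):
        audio_states = self.audio_encoder(audio_feats)                  # (B, T_audio, audio_dim)
        audio_embeds = self.proj(audio_states)                          # (B, T_audio, llm_dim)
        text_embeds = self.llm.get_input_embeddings()(text_token_ids)   # (B, T_text, llm_dim)
        prompt = torch.cat([audio_embeds, text_embeds], dim=1)          # audio acts as a prefix
        return self.llm(inputs_embeds=prompt)                           # assumed HF-style keyword
```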

End-to-End Speech Recognition Contextualization with Large Language Models

no code implementations • 19 Sep 2023 • Egor Lakomkin, Chunyang Wu, Yassir Fathullah, Ozlem Kalinli, Michael L. Seltzer, Christian Fuegen

Overall, we demonstrate that by adding only a handful of trainable parameters via adapters, we can unlock contextualized speech recognition capability for the pretrained LLM while keeping the same text-only input functionality.

Language Modelling, speech-recognition +1
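
For the contextualization paper above, the abstract mentions adding a handful of trainable parameters via adapters. Below is a minimal sketch of a standard residual bottleneck adapter of the kind commonly inserted into frozen transformer layers; the placement, sizes, and activation are assumptions, not the paper's actual configuration.

```python
import torch.nn as nn


class BottleneckAdapter(nn.Module):
    """Standard residual bottleneck adapter (down-project, nonlinearity,
    up-project). Only these few parameters are trained while the host
    LLM stays frozen. Dimensions here are illustrative."""

    def __init__(self, hidden_dim=4096, bottleneck_dim=64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.GELU()

    def forward(self, hidden_states):
        # Residual connection keeps the frozen model's behaviour recoverable.
        return hidden_states + self.up(self.act(self.down(hidden_states)))


# Typical usage: freeze the pretrained LLM and train only the adapters.
# for p in llm.parameters():
#     p.requires_grad = False
```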

TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-device ASR Models

no code implementations • 5 Sep 2023 • Yuan Shangguan, Haichuan Yang, Danni Li, Chunyang Wu, Yassir Fathullah, Dilin Wang, Ayushi Dalmia, Raghuraman Krishnamoorthi, Ozlem Kalinli, Junteng Jia, Jay Mahadeokar, Xin Lei, Mike Seltzer, Vikas Chandra

Results demonstrate that our TODM Supernet either matches or surpasses the performance of manually tuned models by up to 3% relative in word error rate (WER), while efficiently keeping the cost of training many models at a small constant.

Automatic Speech Recognition, Automatic Speech Recognition (ASR) +2
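
The TODM abstract refers to training many deployable ASR models at roughly the cost of one, which supernet methods achieve through weight sharing between sub-models. The toy layer below illustrates that idea only; it is not the paper's RNN-T supernet, and the widths and sampling scheme are made up.

```python
import random

import torch
import torch.nn as nn


class SlimmableLinear(nn.Module):
    """Toy weight-sharing layer: smaller sub-models reuse the leading rows of
    one shared weight matrix, so training the supernet trains every
    deployable size at once. Widths here are arbitrary."""

    def __init__(self, in_dim=512, max_out_dim=2048, widths=(512, 1024, 2048)):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(max_out_dim, in_dim) * 0.02)
        self.bias = nn.Parameter(torch.zeros(max_out_dim))
        self.widths = widths

    def forward(self, x, width=None):
        w = width or random.choice(self.widths)   # sample a sub-model width per training step
        return nn.functional.linear(x, self.weight[:w], self.bias[:w])
```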

Prompting Large Language Models with Speech Recognition Abilities

no code implementations • 21 Jul 2023 • Yassir Fathullah, Chunyang Wu, Egor Lakomkin, Junteng Jia, Yuan Shangguan, Ke Li, Jinxi Guo, Wenhan Xiong, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer

Furthermore, we perform ablation studies investigating whether the LLM can be completely frozen during training to maintain its original capabilities, the effect of scaling up the audio encoder, and the effect of increasing the audio encoder striding to generate fewer embeddings.

Abstractive Text Summarization, Automatic Speech Recognition +3
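
One of the ablations mentioned above increases the audio encoder striding so that fewer (but wider) audio embeddings are passed to the LLM. The sketch below shows one simple way to do this by stacking consecutive frames; the exact mechanism used in the paper may differ.

```python
import torch


def stack_frames(encoder_states: torch.Tensor, stride: int) -> torch.Tensor:
    """Concatenate every `stride` consecutive audio-encoder frames so the LLM
    receives `stride`-times fewer, wider audio embeddings. Purely
    illustrative; a projection (e.g. torch.nn.Linear(d * stride, llm_dim))
    back to the LLM width would normally follow."""
    b, t, d = encoder_states.shape
    t = (t // stride) * stride                       # drop any trailing remainder frames
    return encoder_states[:, :t].reshape(b, t // stride, d * stride)
```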

CUED at ProbSum 2023: Hierarchical Ensemble of Summarization Models

1 code implementation • 8 Jun 2023 • Potsawee Manakul, Yassir Fathullah, Adian Liusie, Vyas Raina, Vatsal Raina, Mark Gales

In this paper, we consider the challenge of summarizing patients' medical progress notes in a limited data setting.

Multi-Head State Space Model for Speech Recognition

no code implementations • 21 May 2023 • Yassir Fathullah, Chunyang Wu, Yuan Shangguan, Junteng Jia, Wenhan Xiong, Jay Mahadeokar, Chunxi Liu, Yangyang Shi, Ozlem Kalinli, Mike Seltzer, Mark J. F. Gales

State space models (SSMs) have recently shown promising results on small-scale sequence and language modelling tasks, rivalling and outperforming many attention-based approaches.

Language Modelling, speech-recognition +1
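
For readers unfamiliar with state space models, the function below implements the basic discrete linear recurrence x_t = A x_{t-1} + B u_t, y_t = C x_t that SSM layers build on. It is background only; the paper's multi-head SSM is a learned, parallelised generalisation and is not reproduced here.

```python
import torch


def ssm_scan(A, B, C, u):
    """Minimal discrete linear state-space recurrence:
        x_t = A x_{t-1} + B u_t,   y_t = C x_t
    A is (state, state), B is (state, input), C is (output, state),
    and u is the input sequence of shape (T, input)."""
    x = torch.zeros(A.shape[0])
    ys = []
    for u_t in u:
        x = A @ x + B @ u_t        # update the hidden state
        ys.append(C @ x)           # read out an output per step
    return torch.stack(ys)         # (T, output)
```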

Who Needs Decoders? Efficient Estimation of Sequence-level Attributes

no code implementations • 9 May 2023 • Yassir Fathullah, Puria Radmard, Adian Liusie, Mark J. F. Gales

In these scenarios, where, for example, knowing the quality of a system's output in order to predict poor performance matters more than knowing the output itself, is it possible to bypass autoregressive decoding?

Attribute, Automatic Speech Recognition +4
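
The question posed above, i.e. predicting a sequence-level attribute such as output quality without running the autoregressive decoder, can be illustrated with a small non-autoregressive head over the encoder states. The pooling and layer sizes below are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn


class AttributeHead(nn.Module):
    """Illustrative decoder-free head: pool the encoder states over time and
    regress a sequence-level attribute (e.g. an error-rate proxy) directly,
    so no autoregressive decoding pass is needed."""

    def __init__(self, enc_dim=512, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(enc_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, encoder_states):          # (B, T, enc_dim)
        pooled = encoder_states.mean(dim=1)     # simple mean pooling over time
        return self.mlp(pooled).squeeze(-1)     # (B,) predicted attribute
```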

Self-Distribution Distillation: Efficient Uncertainty Estimation

no code implementations • 15 Mar 2022 • Yassir Fathullah, Mark J. F. Gales

Furthermore, it is possible to build ensembles of these models and apply hierarchical ensemble distillation approaches.

Out-of-Distribution Detection
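
Self-distribution distillation trains a single model to capture a distribution over predictive distributions, so that ensemble-style uncertainty estimates become cheap. The Dirichlet head below is one common parameterisation of that idea, shown only for illustration; it is not necessarily the paper's exact formulation.

```python
import torch.nn as nn
import torch.nn.functional as F


class DirichletHead(nn.Module):
    """One common way to let a single network express a *distribution over*
    class distributions: predict Dirichlet concentrations alpha, whose mean
    gives the prediction and whose total mass indicates confidence."""

    def __init__(self, hidden_dim=512, num_classes=10):
        super().__init__()
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, features):
        alpha = F.softplus(self.fc(features)) + 1.0        # positive concentrations
        probs = alpha / alpha.sum(dim=-1, keepdim=True)    # expected categorical
        # Smaller alpha0 -> flatter Dirichlet -> higher distributional uncertainty.
        return probs, alpha.sum(dim=-1)
```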

Subsequence Based Deep Active Learning for Named Entity Recognition

no code implementations • ACL 2021 • Puria Radmard, Yassir Fathullah, Aldo Lipani

Active Learning (AL) has been successfully applied to Deep Learning in order to drastically reduce the amount of data required to achieve high performance.

Active Learning, named-entity-recognition +3

Ensemble Distillation Approaches for Grammatical Error Correction

no code implementations • 24 Nov 2020 • Yassir Fathullah, Mark Gales, Andrey Malinin

It is, however, more challenging than the standard tasks investigated for distillation, as the prediction of any grammatical correction to a word is highly dependent on both the input sequence and the generated output history for that word.

Grammatical Error Correction
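
The abstract above notes that each corrected word depends on both the source sequence and the generated output history, which is why distillation for such tasks is typically done token by token under teacher forcing. The loss below sketches that setup for an ensemble of teachers; the averaging and reduction choices are assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F


def token_level_distillation_loss(student_logits, teacher_logits_list):
    """Token-level ensemble distillation under teacher forcing: every model is
    conditioned on the same source and the same output history, the teachers'
    per-token distributions are averaged, and the student matches that
    average with a KL term. Shapes: (B, T, V) per model."""
    teacher_probs = torch.stack(
        [F.softmax(t, dim=-1) for t in teacher_logits_list]
    ).mean(dim=0)                                          # ensemble-averaged targets
    log_student = F.log_softmax(student_logits, dim=-1)
    return F.kl_div(log_student, teacher_probs, reduction="batchmean")
```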

Improved Large-margin Softmax Loss for Speaker Diarisation

no code implementations • 10 Nov 2019 • Yassir Fathullah, Chao Zhang, Philip C. Woodland

Speaker diarisation systems nowadays use embeddings generated from speech segments in a bottleneck layer, which need to be discriminative for unseen speakers.
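
As background for the entry above, the function below implements a standard additive-margin (cosine) softmax, one widely used member of the large-margin family that speaker-embedding systems build on. It is not the paper's improved loss; the margin and scale values are arbitrary.

```python
import torch.nn.functional as F


def additive_margin_softmax_loss(embeddings, weights, labels, margin=0.2, scale=30.0):
    """Additive-margin softmax: cosine similarities between L2-normalised
    embeddings and class weights, with the margin subtracted only on the
    true-class logit, which pushes embeddings to be more discriminative."""
    emb = F.normalize(embeddings, dim=-1)          # (B, D) speaker embeddings
    w = F.normalize(weights, dim=-1)               # (num_speakers, D) class weights
    cosine = emb @ w.t()                           # (B, num_speakers)
    one_hot = F.one_hot(labels, w.shape[0]).to(cosine.dtype)
    logits = scale * (cosine - margin * one_hot)   # margin applied to the target class only
    return F.cross_entropy(logits, labels)
```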
