Search Results for author: Adian Liusie

Found 17 papers, 7 papers with code

WaterJudge: Quality-Detection Trade-off when Watermarking Large Language Models

no code implementations • 28 Mar 2024 • Piotr Molenda, Adian Liusie, Mark J. F. Gales

Watermarking generative-AI systems, such as LLMs, has gained considerable interest, driven by their enhanced capabilities across a wide range of tasks.

NLG Evaluation

Teacher-Student Training for Debiasing: General Permutation Debiasing for Large Language Models

no code implementations • 20 Mar 2024 • Adian Liusie, Yassir Fathullah, Mark J. F. Gales

Large Language Models (LLMs) have demonstrated impressive zero-shot capabilities and versatility in NLP tasks; however, they sometimes fail to maintain crucial invariances for specific tasks.

Is LLM-as-a-Judge Robust? Investigating Universal Adversarial Attacks on Zero-shot LLM Assessment

no code implementations • 21 Feb 2024 • Vyas Raina, Adian Liusie, Mark Gales

Large Language Models (LLMs) are powerful zero-shot assessors and are increasingly used in real-world situations such as for written exams or benchmarking systems.

Adversarial Robustness Benchmarking

Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM

no code implementations • 4 Jan 2024 • Xiaoding Lu, Zongyi Liu, Adian Liusie, Vyas Raina, Vineet Mudupalli, Yuwen Zhang, William Beauchamp

In conversational AI research, there's a noticeable trend towards developing models with a larger number of parameters, exemplified by models like ChatGPT.

Investigating the Emergent Audio Classification Ability of ASR Foundation Models

1 code implementation • 15 Nov 2023 • Rao Ma, Adian Liusie, Mark J. F. Gales, Kate M. Knill

Text and vision foundation models can perform many tasks in a zero-shot setting, a desirable property that enables these systems to be applied in general and low-resource settings.

Audio Classification Decoder +4

Assessing Distractors in Multiple-Choice Tests

no code implementations • 8 Nov 2023 • Vatsal Raina, Adian Liusie, Mark Gales

Specifically, we define quality in terms of the incorrectness, plausibility and diversity of the distractor options.

Multiple-choice Reading Comprehension

Zero-shot Audio Topic Reranking using Large Language Models

no code implementations • 14 Sep 2023 • Mengjie Qian, Rao Ma, Adian Liusie, Erfan Loweimi, Kate M. Knill, Mark J. F. Gales

A key element of this process is highly rapid, flexible search over large archives, which in MVSE is facilitated by representing video attributes as embeddings.

Information Retrieval Retrieval
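
The embedding-based search mentioned in this entry can be illustrated with a minimal sketch: rank archived items by cosine similarity between a query embedding and precomputed attribute embeddings. The encoder is left abstract and the function below is an illustrative stand-in under that assumption, not the MVSE system or the paper's reranking method.

```python
import numpy as np

def rank_by_similarity(query_emb: np.ndarray, archive_embs: np.ndarray, top_k: int = 10) -> np.ndarray:
    """archive_embs has shape (num_items, dim); returns indices of the top_k most similar items."""
    q = query_emb / np.linalg.norm(query_emb)
    a = archive_embs / np.linalg.norm(archive_embs, axis=1, keepdims=True)
    scores = a @ q                      # cosine similarity against every archived item
    return np.argsort(-scores)[:top_k]  # best matches first
```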

Mitigating Word Bias in Zero-shot Prompt-based Classifiers

1 code implementation • 10 Sep 2023 • Adian Liusie, Potsawee Manakul, Mark J. F. Gales

To address this problem, it is possible to optimise classification thresholds on a labelled data set; however, this sacrifices some of the advantages of prompt-based classifiers.

Zero-Shot Learning

LLM Comparative Assessment: Zero-shot NLG Evaluation through Pairwise Comparisons using Large Language Models

1 code implementation • 15 Jul 2023 • Adian Liusie, Potsawee Manakul, Mark J. F. Gales

Current developments in large language models (LLMs) have enabled impressive zero-shot capabilities across various natural language tasks.

NLG Evaluation Response Generation +1
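
The pairwise-comparison idea named in this title can be sketched as follows: ask an LLM judge which of two candidate texts is better for every ordered pair, then rank candidates by win count. The `judge_prefers_a` callable is a hypothetical wrapper around an LLM prompt; the paper's exact prompting and aggregation may differ.

```python
from itertools import permutations
from typing import Callable, List, Sequence

def rank_by_pairwise_wins(
    candidates: Sequence[str],
    judge_prefers_a: Callable[[str, str], bool],  # hypothetical: True if the judge prefers the first text
) -> List[int]:
    wins = [0] * len(candidates)
    # Compare every ordered pair so both presentation orders are seen (helps with position bias).
    for i, j in permutations(range(len(candidates)), 2):
        if judge_prefers_a(candidates[i], candidates[j]):
            wins[i] += 1
    # Candidate indices sorted from most to fewest wins.
    return sorted(range(len(candidates)), key=lambda k: -wins[k])
```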

Analyzing Multiple-Choice Reading and Listening Comprehension Tests

no code implementations • 3 Jul 2023 • Vatsal Raina, Adian Liusie, Mark Gales

Multiple-choice reading and listening comprehension tests are an important part of language assessment.

Multiple-choice Reading Comprehension +1

CUED at ProbSum 2023: Hierarchical Ensemble of Summarization Models

1 code implementation • 8 Jun 2023 • Potsawee Manakul, Yassir Fathullah, Adian Liusie, Vyas Raina, Vatsal Raina, Mark Gales

In this paper, we consider the challenge of summarizing patients' medical progress notes in a limited data setting.

Who Needs Decoders? Efficient Estimation of Sequence-level Attributes

no code implementations • 9 May 2023 • Yassir Fathullah, Puria Radmard, Adian Liusie, Mark J. F. Gales

In these scenarios, where, for example, knowing the quality of a system's output in order to predict poor performance matters more than knowing the output itself, is it possible to bypass autoregressive decoding?

Attribute Automatic Speech Recognition +4

SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models

3 code implementations • 15 Mar 2023 • Potsawee Manakul, Adian Liusie, Mark J. F. Gales

In this work, we propose "SelfCheckGPT", a simple sampling-based approach that can be used to fact-check the responses of black-box models in a zero-resource fashion, i.e. without an external database.

Fact Checking Hallucination +1
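
A minimal sketch of the sampling-based consistency idea behind this entry: draw several stochastic responses from the same model and flag sentences that the samples do not support. The unigram-overlap scoring below is a simple stand-in for the paper's scoring variants, and `sample_response` is a hypothetical wrapper around whatever LLM API is being checked.

```python
from typing import Callable, List

def unigram_overlap(sentence: str, sample: str) -> float:
    """Fraction of the sentence's word tokens that also appear in a sampled response."""
    sent_tokens = set(sentence.lower().split())
    sample_tokens = set(sample.lower().split())
    if not sent_tokens:
        return 0.0
    return len(sent_tokens & sample_tokens) / len(sent_tokens)

def hallucination_scores(
    response_sentences: List[str],
    sample_response: Callable[[], str],  # hypothetical: re-queries the LLM with temperature > 0
    num_samples: int = 5,
) -> List[float]:
    """Higher score = sentence is less supported by the model's own samples."""
    samples = [sample_response() for _ in range(num_samples)]
    scores = []
    for sentence in response_sentences:
        support = sum(unigram_overlap(sentence, s) for s in samples) / num_samples
        scores.append(1.0 - support)  # low overlap across samples -> likely hallucination
    return scores
```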

Rewarding Chatbots for Real-World Engagement with Millions of Users

no code implementations • 10 Mar 2023 • Robert Irvine, Douglas Boubert, Vyas Raina, Adian Liusie, Ziyi Zhu, Vineet Mudupalli, Aliaksei Korshuk, Zongyi Liu, Fritz Cremer, Valentin Assassi, Christie-Carol Beauchamp, Xiaoding Lu, Thomas Rialan, William Beauchamp

The proposed approach uses automatic pseudo-labels collected from user interactions to train a reward model that can be used to reject low-scoring sample responses generated by the chatbot model at inference time.

Chatbot Language Modelling
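
The rejection mechanism described in this entry can be sketched as best-of-N selection: generate several candidate replies, score each with a learned reward model, and return the highest-scoring one. Here `generate_reply` and `reward_model` are hypothetical stand-ins for the chatbot and the pseudo-label-trained reward model, so this is an illustration of the idea rather than the paper's implementation.

```python
from typing import Callable, List

def best_of_n_reply(
    context: str,
    generate_reply: Callable[[str], str],        # hypothetical: samples one chatbot reply for the context
    reward_model: Callable[[str, str], float],   # hypothetical: scores (context, reply) pairs
    n: int = 8,
) -> str:
    """Sample n candidate replies and keep the one the reward model scores highest."""
    candidates: List[str] = [generate_reply(context) for _ in range(n)]
    return max(candidates, key=lambda reply: reward_model(context, reply))
```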

MQAG: Multiple-choice Question Answering and Generation for Assessing Information Consistency in Summarization

2 code implementations • 28 Jan 2023 • Potsawee Manakul, Adian Liusie, Mark J. F. Gales

In this work, we introduce an alternative scheme based on standard information-theoretic measures in which the information present in the source and summary is directly compared.

Hallucination Multiple-choice +1
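
One way to picture the source-versus-summary comparison described in this entry: answer the same multiple-choice question conditioned on each text and compare the two answer distributions with a statistical distance. The KL divergence used below is an illustrative choice, not necessarily the paper's exact measure, and `answer_distribution` is a hypothetical QA-model call.

```python
import math
from typing import Callable, Sequence

def consistency_gap(
    question: str,
    options: Sequence[str],
    source: str,
    summary: str,
    answer_distribution: Callable[[str, Sequence[str], str], Sequence[float]],
) -> float:
    """Larger values mean the summary leads to different answers than the source."""
    p = answer_distribution(question, options, source)   # answer probabilities given the source
    q = answer_distribution(question, options, summary)  # answer probabilities given the summary
    eps = 1e-8
    # KL(p || q), smoothed so zero probabilities do not blow up the sum.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))
```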

World Knowledge in Multiple Choice Reading Comprehension

1 code implementation • 13 Nov 2022 • Adian Liusie, Vatsal Raina, Mark Gales

Two metrics are described: the expected number of options, which measures whether a passage-free system can identify the answer to a question using world knowledge; and the contextual mutual information, which measures the importance of context for a given question.

General Knowledge Multiple-choice +2
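
A minimal sketch of the first metric's intuition, assuming the effective number of options is taken as the exponential of the entropy of a passage-free system's answer distribution; this definition is an illustrative assumption, not necessarily the paper's exact formulation.

```python
import math
from typing import Sequence

def effective_num_options(answer_probs: Sequence[float]) -> float:
    """exp(entropy): equals K for a uniform distribution over K options, approaches 1 when confident."""
    entropy = -sum(p * math.log(p) for p in answer_probs if p > 0.0)
    return math.exp(entropy)

# A 4-option question a passage-free model answers confidently vs. one it cannot:
print(effective_num_options([0.85, 0.05, 0.05, 0.05]))  # ~1.8 -> largely answerable from world knowledge
print(effective_num_options([0.25, 0.25, 0.25, 0.25]))  # 4.0 -> the passage is needed
```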
