Search Results for author: Steffi Chern

Found 4 papers, 4 papers with code

Can Large Language Models be Trusted for Evaluation? Scalable Meta-Evaluation of LLMs as Evaluators via Agent Debate

1 code implementation • 30 Jan 2024 • Steffi Chern, Ethan Chern, Graham Neubig, Pengfei Liu

Despite the utility of Large Language Models (LLMs) across a wide range of tasks and scenarios, developing a method for reliably evaluating LLMs across varied contexts continues to be challenging.
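Going by the title, the method meta-evaluates LLM evaluators through agent debate. Below is a minimal, illustrative sketch of a generic debate-then-vote evaluation loop; the Agent interface, prompts, and consensus rule are assumptions made for illustration, not the paper's actual protocol.

```python
from typing import Callable, List

Agent = Callable[[str], str]  # any prompt -> reply function (e.g., an LLM API wrapper)

def debate_evaluate(agents: List[Agent], question: str,
                    answer_a: str, answer_b: str, rounds: int = 2) -> str:
    """Have several evaluator agents debate which response is better,
    stopping early on consensus and falling back to a majority vote."""
    transcript: List[str] = []
    verdicts: List[str] = []
    for _ in range(rounds):
        verdicts = []
        for i, agent in enumerate(agents):
            prompt = (
                f"Question: {question}\n"
                f"Response A: {answer_a}\n"
                f"Response B: {answer_b}\n"
                "Debate so far:\n" + "\n".join(transcript) + "\n"
                "Argue briefly which response is better and end with "
                "'Verdict: A' or 'Verdict: B'."
            )
            reply = agent(prompt)
            transcript.append(f"Agent {i}: {reply}")
            verdicts.append("A" if "Verdict: A" in reply else "B")
        if len(set(verdicts)) == 1:  # all agents agree -> stop debating
            break
    return max(set(verdicts), key=verdicts.count)  # majority verdict
```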

Combating Adversarial Attacks with Multi-Agent Debate

1 code implementation • 11 Jan 2024 • Steffi Chern, Zhen Fan, Andy Liu

While state-of-the-art language models have achieved impressive results, they remain susceptible to inference-time adversarial attacks, such as adversarial prompts generated by red teams (arXiv:2209.07858).

Language Modelling
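As the title suggests, the defense routes a potentially adversarial prompt through a round of multi-agent debate rather than a single model call. A rough sketch of that general pattern follows; the agent interface and prompts are invented for illustration and do not reflect the paper's exact setup.

```python
from typing import Callable, List

Agent = Callable[[str], str]  # prompt -> reply

def debated_reply(agents: List[Agent], user_prompt: str, rounds: int = 2) -> str:
    """Each agent answers, then iteratively revises after reading its
    peers' answers; the idea is that debate filters out jailbroken or
    harmful completions a single model might produce."""
    answers = [agent(user_prompt) for agent in agents]
    for _ in range(rounds):
        revised = []
        for i, agent in enumerate(agents):
            peers = "\n---\n".join(a for j, a in enumerate(answers) if j != i)
            revised.append(agent(
                f"User prompt: {user_prompt}\n\n"
                f"Other agents replied:\n{peers}\n\n"
                "Point out anything unsafe or incorrect in these replies, "
                "then give your own revised, safe answer."
            ))
        answers = revised
    return answers[0]  # e.g., take one agent's final answer
```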

Align on the Fly: Adapting Chatbot Behavior to Established Norms

1 code implementation • 26 Dec 2023 • Chunpu Xu, Steffi Chern, Ethan Chern, Ge Zhang, Zekun Wang, Ruibo Liu, Jing Li, Jie Fu, Pengfei Liu

In this paper, we aim to align large language models with the ever-changing, complex, and diverse human values (e.g., social norms) across time and locations.

Chatbot
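The title's "on the fly" adaptation suggests conditioning the model on externally stored norms at inference time rather than retraining it. A toy sketch of that retrieve-then-prompt pattern is below; the norm retriever and prompt format are placeholders, not the paper's implementation.

```python
from typing import Callable, List

def norm_guided_reply(
    llm: Callable[[str], str],                   # prompt -> reply
    retrieve_norms: Callable[[str], List[str]],  # hypothetical norm retriever
    user_msg: str,
) -> str:
    """Fetch the norms relevant to this message, then ask the model to
    answer while following them; updating behavior only requires
    editing the norm store, not retraining the model."""
    norms = retrieve_norms(user_msg)
    system = "Follow these established norms:\n" + "\n".join(f"- {n}" for n in norms)
    return llm(f"{system}\n\nUser: {user_msg}\nAssistant:")
```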

FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios

4 code implementations • 25 Jul 2023 • I-Chun Chern, Steffi Chern, Shiqi Chen, Weizhe Yuan, Kehua Feng, Chunting Zhou, Junxian He, Graham Neubig, Pengfei Liu

With the above challenges in mind, in this paper, we propose FacTool, a task- and domain-agnostic framework for detecting factual errors in texts generated by large language models (e.g., ChatGPT).

Code Generation • Fact Checking • +1
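The abstract describes a tool-augmented factuality-detection framework. A condensed sketch of the general extract-claims / query-tool / verify loop such a framework implies is shown below; every prompt and the search tool are placeholders, not FacTool's actual components.

```python
from typing import Callable, List, Tuple

def check_factuality(
    llm: Callable[[str], str],     # prompt -> reply
    search: Callable[[str], str],  # query -> evidence text (any web/search tool)
    text: str,
) -> List[Tuple[str, str]]:
    """Extract claims, gather evidence per claim via a tool, and ask the
    model whether the evidence supports each claim."""
    claims = [c.strip() for c in
              llm(f"List each factual claim in the text, one per line:\n{text}").splitlines()
              if c.strip()]
    report = []
    for claim in claims:
        query = llm(f"Write one web search query to verify: {claim}")
        evidence = search(query)
        verdict = llm(
            f"Claim: {claim}\nEvidence: {evidence}\n"
            "Reply with exactly 'supported' or 'refuted'."
        )
        report.append((claim, verdict.strip().lower()))
    return report
```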
