Search Results for author: Wenjin Yao

Found 2 papers, 2 papers with code

FreeEval: A Modular Framework for Trustworthy and Efficient Evaluation of Large Language Models

2 code implementations • 9 Apr 2024 • Zhuohao Yu, Chang Gao, Wenjin Yao, Yidong Wang, Zhengran Zeng, Wei Ye, Jindong Wang, Yue Zhang, Shikun Zhang

The rapid development of large language model (LLM) evaluation methodologies and datasets has led to a profound challenge: integrating state-of-the-art evaluation techniques cost-effectively while ensuring reliability, reproducibility, and efficiency.

Fairness Language Modelling +1

159

Paper
Code

KIEval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models

2 code implementations • 23 Feb 2024 • Zhuohao Yu, Chang Gao, Wenjin Yao, Yidong Wang, Wei Ye, Jindong Wang, Xing Xie, Yue Zhang, Shikun Zhang

Automatic evaluation methods for large language models (LLMs) are hindered by data contamination, leading to inflated assessments of their effectiveness.

20,050

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.