no code implementations • 4 May 2024 • Thomas Yu Chow Tam, Sonish Sivarajkumar, Sumit Kapoor, Alisa V Stolyar, Katelyn Polanska, Karleigh R McCarthy, Hunter Osterhoudt, Xizhi Wu, Shyam Visweswaran, Sunyang Fu, Piyush Mathur, Giovanni E. Cacciamani, Cong Sun, Yifan Peng, Yanshan Wang
This review provides a comprehensive overview of the human evaluation approaches used in diverse healthcare applications. It examines the human evaluation of LLMs across various medical specialties, addressing evaluation dimensions, sample types and sizes, the selection and recruitment of evaluators, frameworks and metrics, the evaluation process, and the statistical analysis of the results.
no code implementations • 14 Sep 2022 • Hunter Osterhoudt, Courtney E. Schneider, Haneef A Mohammad, Minmei Shih, Alexandra E. Harper, Leming Zhou, Elizabeth R Skidmore, Yanshan Wang
Although the fidelity assessment for detecting guided and directed verbal cues is valid and feasible for single-site studies, it can become labor-intensive, time-consuming, and expensive in large, multi-site pragmatic trials.