16 Nov 2023 • Ziyi Liu, Isabelle Lee, Yongkang Du, Soumya Sanyal, Jieyu Zhao
In a plethora of recent work, large language models (LLMs) have demonstrated impressive reasoning ability, but many proposed downstream reasoning tasks focus on performance-based evaluation.