no code implementations • 26 Mar 2024 • Masamune Kobayashi, Masato Mita, Mamoru Komachi
Large Language Models (LLMs) have been reported to outperform existing automatic evaluation metrics in some tasks, such as text summarization and machine translation.
1 code implementation • 5 Mar 2024 • Masamune Kobayashi, Masato Mita, Mamoru Komachi
The results of improved correlations by aligning the granularity in the sentence-level meta-evaluation, suggest that edit-based metrics may have been underestimated in existing studies.