no code implementations • 22 Dec 2023 • Max Ku, Dongfu Jiang, Cong Wei, Xiang Yue, Wenhu Chen
We evaluate VIESCORE on seven prominent tasks in conditional image generation and find: (1) VIESCORE (GPT4-v) achieves a high Spearman correlation of 0.3 with human evaluations, while the human-to-human correlation is 0.45.
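Spearman correlation, used above to compare metric scores against human judgments, is just the Pearson correlation of the two rank sequences. A minimal sketch with hypothetical data (VIESCORE's actual evaluation uses its own datasets and protocol):

```python
# Spearman rank correlation between model scores and human ratings.
# The data below is hypothetical, for illustration only.

def rank(values):
    # 1-based ranks, averaging ranks across ties.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of tied positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    # Pearson correlation computed on the ranks of x and y.
    rx, ry = rank(x), rank(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

model_scores = [0.9, 0.4, 0.7, 0.2, 0.6]  # hypothetical metric outputs
human_scores = [5, 1, 4, 2, 3]            # hypothetical human ratings
print(round(spearman(model_scores, human_scores), 3))  # -> 0.9
```

A value of 1.0 would mean the metric orders the images exactly as humans do; the paper's 0.3 versus a 0.45 human-to-human ceiling contextualizes how hard the task is.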
2 code implementations • 27 Nov 2023 • Xiang Yue, Yuansheng Ni, Kai Zhang, Tianyu Zheng, Ruoqi Liu, Ge Zhang, Samuel Stevens, Dongfu Jiang, Weiming Ren, Yuxuan Sun, Cong Wei, Botao Yu, Ruibin Yuan, Renliang Sun, Ming Yin, Boyuan Zheng, Zhenzhu Yang, Yibo Liu, Wenhao Huang, Huan Sun, Yu Su, Wenhu Chen
We introduce MMMU: a new benchmark designed to evaluate multimodal models on massive multi-discipline tasks demanding college-level subject knowledge and deliberate reasoning.
1 code implementation • 1 Oct 2023 • Dongfu Jiang, Yishan Li, Ge Zhang, Wenhao Huang, Bill Yuchen Lin, Wenhu Chen
To quantitatively assess our metric, we evaluate its correlation with human ratings on 5 held-in datasets and 2 held-out datasets, showing that TIGERScore achieves the open-source state-of-the-art correlation with human ratings across these datasets and nearly matches the GPT-4 evaluator.
3 code implementations • 5 Jun 2023 • Dongfu Jiang, Xiang Ren, Bill Yuchen Lin
We present LLM-Blender, an ensembling framework designed to attain consistently superior performance by leveraging the diverse strengths of multiple open-source large language models (LLMs).
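The core ensembling idea can be sketched as rank-then-fuse: compare candidate outputs from different LLMs pairwise, keep the winners, and produce a final answer from the top candidates. The comparator below is a toy length heuristic standing in for LLM-Blender's learned pairwise ranker, and the "fusion" step simply returns the top-ranked candidate rather than generating a new one:

```python
# Hypothetical sketch of ensembling by pairwise ranking, the idea behind
# LLM-Blender. `better` is a toy stand-in for a learned comparator.

from itertools import combinations

def better(a: str, b: str) -> bool:
    # Toy heuristic: prefer the longer answer. A real system would use
    # a trained pairwise ranking model here.
    return len(a) > len(b)

def rank_candidates(candidates):
    # Count pairwise wins for each candidate, then sort by wins.
    wins = {c: 0 for c in candidates}
    for a, b in combinations(candidates, 2):
        wins[a if better(a, b) else b] += 1
    return sorted(candidates, key=lambda c: wins[c], reverse=True)

def blend(candidates, top_k=2):
    # In the paper, a fusion model generates a new answer conditioned on
    # the top-k candidates; this sketch just returns the top-ranked one.
    top = rank_candidates(candidates)[:top_k]
    return top[0]

outputs = ["short", "a longer answer", "mid size"]  # hypothetical LLM outputs
print(blend(outputs))  # -> a longer answer
```

The pairwise formulation matters because absolute quality scores from different models are hard to calibrate, whereas a comparator only has to decide which of two candidates is better.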
no code implementations • 20 Dec 2022 • Dongfu Jiang, Bill Yuchen Lin, Xiang Ren
Pre-trained language models have been successful in natural language generation (NLG) tasks.