Search Results for author: Renren Jin

Found 10 papers, 3 papers with code

LHMKE: A Large-scale Holistic Multi-subject Knowledge Evaluation Benchmark for Chinese Large Language Models

no code implementations • 19 Mar 2024 • Chuang Liu, Renren Jin, Yuqi Ren, Deyi Xiong

Current datasets collect questions from Chinese examinations across different subjects and educational levels to address this issue.

Multiple-choice

OpenEval: Benchmarking Chinese LLMs across Capability, Alignment and Safety

no code implementations • 18 Mar 2024 • Chuang Liu, Linhao Yu, Jiaxuan Li, Renren Jin, Yufei Huang, Ling Shi, Junhui Zhang, Xinmeng Ji, Tingting Cui, Tao Liu, Jinwang Song, Hongying Zan, Sun Li, Deyi Xiong

In addition to these benchmarks, we have implemented a phased public evaluation and benchmark update strategy to ensure that OpenEval is in line with the development of Chinese LLMs or even able to provide cutting-edge benchmark datasets to guide the development of Chinese LLMs.

Benchmarking • Mathematical Reasoning

FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models

no code implementations • 12 Mar 2024 • Yan Liu, Renren Jin, Lin Shi, Zheng Yao, Deyi Xiong

We conduct extensive experiments with a wide range of LLMs on FineMath and find that there is still considerable room for improvement in the mathematical reasoning capability of Chinese LLMs.

Math • Mathematical Reasoning

Do Large Language Models Mirror Cognitive Language Processing?

no code implementations • 28 Feb 2024 • Yuqi Ren, Renren Jin, Tongxuan Zhang, Deyi Xiong

We employ Representational Similarity Analysis (RSA) to measure the alignment between 16 mainstream LLMs and fMRI signals of the brain.

Chatbot • Logical Reasoning • +1

A Comprehensive Evaluation of Quantization Strategies for Large Language Models

no code implementations • 26 Feb 2024 • Renren Jin, Jiangcun Du, Wuwei Huang, Wei Liu, Jian Luan, Bin Wang, Deyi Xiong

Our experimental results indicate that LLMs with 4-bit quantization can retain performance comparable to their non-quantized counterparts, and perplexity can serve as a proxy metric for quantized LLMs on most benchmarks.

Language Modelling • Quantization

FollowEval: A Multi-Dimensional Benchmark for Assessing the Instruction-Following Capability of Large Language Models

no code implementations • 16 Nov 2023 • Yimin Jing, Renren Jin, Jiahao Hu, Huishi Qiu, Xiaohua Wang, Peng Wang, Deyi Xiong

In pursuit of this goal, various benchmarks have been constructed to evaluate the instruction-following capacity of these models.

Instruction Following • Logical Reasoning

Evaluating Large Language Models: A Comprehensive Survey

1 code implementation • 30 Oct 2023 • Zishan Guo, Renren Jin, Chuang Liu, Yufei Huang, Dan Shi, Supryadi, Linhao Yu, Yan Liu, Jiaxuan Li, Bojian Xiong, Deyi Xiong

We hope that this comprehensive overview will stimulate further research interests in the evaluation of LLMs, with the ultimate goal of making evaluation serve as a cornerstone in guiding the responsible development of LLMs.

Large Language Model Alignment: A Survey

no code implementations • 26 Sep 2023 • Tianhao Shen, Renren Jin, Yufei Huang, Chuang Liu, Weilong Dong, Zishan Guo, Xinwei Wu, Yan Liu, Deyi Xiong

We also envision bridging the gap between the AI alignment research community and the researchers engrossed in the capability exploration of LLMs for both capable and safe LLMs.

Language Modelling • Large Language Model

M3KE: A Massive Multi-Level Multi-Subject Knowledge Evaluation Benchmark for Chinese Large Language Models

1 code implementation • 17 May 2023 • Chuang Liu, Renren Jin, Yuqi Ren, Linhao Yu, Tianyu Dong, Xiaohan Peng, Shuting Zhang, Jianxiang Peng, Peiyi Zhang, Qingqing Lyu, Xiaowen Su, Qun Liu, Deyi Xiong

Comprehensively evaluating the capability of large language models in multiple tasks is of great importance.

Instruction Following • Multiple-choice • +1

Informative Language Representation Learning for Massively Multilingual Neural Machine Translation

1 code implementation • COLING 2022 • Renren Jin, Deyi Xiong

Experiment results on two datasets for massively multilingual neural machine translation demonstrate that language-aware multi-head attention benefits both supervised and zero-shot translation and significantly alleviates the off-target translation issue.

Machine Translation • Navigate • +2
