TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Math Word Problem Solving	MATH	Llemma-34B-KPMath-Plus	Accuracy	48.6	# 31
Math Word Problem Solving	MATH	Llemma-34B-KPMath-Plus	Parameters (Billions)	34	# 26
Math Word Problem Solving	MATH	Llama2-13B-KPMath-Plus	Accuracy	41	# 52
Math Word Problem Solving	MATH	Llama2-13B-KPMath-Plus	Parameters (Billions)	13	# 38
Math Word Problem Solving	MATH	DeepSeekMath-7B-KPMath-Plus	Accuracy	48.8	# 29
Math Word Problem Solving	MATH	DeepSeekMath-7B-KPMath-Plus	Parameters (Billions)	7	# 58
Math Word Problem Solving	MATH	Mistral-7B-KPMath-Plus	Accuracy	46.8	# 36
Math Word Problem Solving	MATH	Mistral-7B-KPMath-Plus	Parameters (Billions)	7	# 58

Key-Point-Driven Data Synthesis with its Enhancement on Mathematical Reasoning

4 Mar 2024 · Yiming Huang, Xiao Liu, Yeyun Gong, Zhibin Gou, Yelong Shen, Nan Duan, Weizhu Chen

Large language models (LLMs) have shown great potential in complex reasoning tasks, yet their performance is often hampered by the scarcity of high-quality, reasoning-focused training datasets. To address this challenge, we propose Key-Point-Driven Data Synthesis (KPDDS), a novel data synthesis framework that synthesizes question-answer pairs by leveraging key points and exemplar practices from authentic data sources. KPDDS ensures the generation of novel questions with rigorous quality control and substantial scalability. As a result, we present KPMath, an extensive synthetic dataset tailored for mathematical reasoning, comprising over 800K question-answer pairs. By combining KPMath with additional reasoning-intensive corpora, we create the comprehensive KPMath-Plus dataset. The DeepSeekMath model fine-tuned on KPMath-Plus achieves zero-shot pass@1 accuracies of 83.9% on GSM8K and 48.8% on MATH, and also reaches promising performance on other math reasoning datasets, outperforming competitors in the 7B to 70B parameter range.
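The zero-shot pass@1 figures quoted in the abstract can be read as the percentage of problems whose single generated answer matches the reference. The following is a minimal sketch of that calculation, not the authors' evaluation code: the function names `normalize` and `pass_at_1`, the string normalization rules, and the toy answers are all assumptions made for illustration.

```python
from typing import Sequence


def normalize(answer: str) -> str:
    """Hypothetical normalization: trim whitespace, drop a trailing period
    and thousands separators so that "1,200." and "1200" compare equal."""
    return answer.strip().rstrip(".").replace(",", "").strip()


def pass_at_1(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Percentage of problems whose single prediction matches the reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot score an empty problem set")
    correct = sum(
        normalize(p) == normalize(r) for p, r in zip(predictions, references)
    )
    return 100.0 * correct / len(references)


# Toy answers for illustration only; not taken from GSM8K or MATH.
predictions = ["18", "3.5", "1,200", "7"]
references = ["18", "3.5", "1200", "9"]
print(pass_at_1(predictions, references))
```

Real benchmark harnesses usually apply more careful answer extraction and equivalence checks (for example, comparing fractions or LaTeX expressions), so exact string matching as used here would undercount correct answers on MATH.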

Code

No code implementations yet.

Tasks

GSM8K

Math

Mathematical Reasoning

Math Word Problem Solving

Datasets

GSM8K

MATH

SVAMP

ASDiv

MAWPS

Results from the Paper

Ranked #29 on Math Word Problem Solving on MATH

Methods

No methods listed for this paper.
