Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Automated Theorem Proving	miniF2F-test	LEGO-Prover ChatGPT	Pass@100	47.1	#1
Automated Theorem Proving	miniF2F-valid	LEGO-Prover ChatGPT	Pass@100	57.0	#1

LEGO-Prover: Neural Theorem Proving with Growing Libraries

1 Oct 2023 · Haiming Wang, Huajian Xin, Chuanyang Zheng, Lin Li, Zhengying Liu, Qingxing Cao, Yinya Huang, Jing Xiong, Han Shi, Enze Xie, Jian Yin, Zhenguo Li, Heng Liao, Xiaodan Liang

Despite the success of large language models (LLMs), theorem proving remains one of the hardest reasoning tasks and is far from fully solved. Prior methods using language models have demonstrated promising results, but they still struggle to prove even middle-school-level theorems. One common limitation of these methods is that they assume a fixed theorem library throughout the proving process. However, creating new useful theorems, or even new theories, is not only helpful but necessary for advancing mathematics and proving harder and deeper results. In this work, we present LEGO-Prover, which employs a growing skill library containing verified lemmas as skills to augment the capability of LLMs used in theorem proving. By constructing the proof modularly, LEGO-Prover enables LLMs to use existing skills retrieved from the library and to create new skills during the proving process. These skills are further evolved (by prompting an LLM) to enrich the library on another scale. Modular and reusable skills are constantly added to the library to enable tackling increasingly intricate mathematical problems. Moreover, the learned library further bridges the gap between human proofs and formal proofs by making it easier to impute missing steps. LEGO-Prover advances the state-of-the-art pass rate on miniF2F-valid (48.0% to 57.0%) and miniF2F-test (45.5% to 47.1%). During the proving process, LEGO-Prover also generates over 20,000 skills (theorems/lemmas) and adds them to the growing library. Our ablation study indicates that these newly added skills are indeed helpful for proving theorems, resulting in an improvement from a success rate of 47.1% to 50.4%. We also release our code and all the generated skills.
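
One way to picture the growing library described in the abstract is the short Python sketch below. It is not the authors' implementation: the `Skill` and `SkillLibrary` names, the token-overlap retriever, and the `verify` callback (a stand-in for a formal proof checker) are all assumptions made for illustration. The sketch only shows the two operations the abstract names, retrieving existing skills for a goal and adding newly verified ones.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class Skill:
    """A verified lemma stored in the library (hypothetical structure)."""
    name: str
    statement: str


def _tokens(text: str) -> set:
    """Split a statement into a set of lowercase whitespace-separated tokens."""
    return set(text.lower().split())


@dataclass
class SkillLibrary:
    # `verify` is an assumed stand-in for a formal proof checker: any callable
    # that returns True when a candidate lemma is accepted.
    verify: Callable[[Skill], bool]
    skills: List[Skill] = field(default_factory=list)

    def add(self, skill: Skill) -> bool:
        """Store a skill only if it is new and passes verification."""
        if skill in self.skills or not self.verify(skill):
            return False
        self.skills.append(skill)
        return True

    def retrieve(self, goal: str, k: int = 2) -> List[Skill]:
        """Return up to k skills ranked by Jaccard token overlap with the goal."""
        goal_tokens = _tokens(goal)

        def score(skill: Skill) -> float:
            stmt_tokens = _tokens(skill.statement)
            union = goal_tokens | stmt_tokens
            return len(goal_tokens & stmt_tokens) / len(union) if union else 0.0

        return sorted(self.skills, key=score, reverse=True)[:k]


# Toy usage: the checker here simply rejects statements containing "sorry".
library = SkillLibrary(verify=lambda s: "sorry" not in s.statement)
library.add(Skill("sq_nonneg", "x * x >= 0 for real x"))
library.add(Skill("am_gm_two", "a + b >= 2 * sqrt ( a * b ) for a b >= 0"))
library.add(Skill("unfinished", "sorry"))  # rejected by the toy checker

top = library.retrieve("show that x * x + 1 > 0 for real x", k=1)
print(top[0].name)
```

In this toy run the library ends up holding two skills, and the goal about `x * x + 1` retrieves the `sq_nonneg` lemma because it shares the most tokens with the goal. A real system would replace the token-overlap score with a stronger retriever and the lambda with calls to an actual proof assistant.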

Code

wiio12/LEGO-Prover official

Tasks

Automated Theorem Proving

Datasets

MiniF2F

Results from the Paper

Ranked #1 on Automated Theorem Proving on miniF2F-test (Pass@100 metric)

Methods

Library
