Automated Theorem Proving

70 papers with code • 10 benchmarks • 8 datasets

The goal of Automated Theorem Proving is to automatically generate a proof, given a conjecture (the target theorem) and a knowledge base of known facts, all expressed in a formal language. Automated Theorem Proving is useful in a wide range of applications, including the verification and synthesis of software and hardware systems.

Source: Learning to Prove Theorems by Learning to Generate Theorems
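
As an illustration of the task described above, here is a minimal sketch in Lean 4 (the choice of system is an assumption; the source does not prescribe one). The theorem names `add_comm_by_hand` and `add_comm_automated` are hypothetical. The sketch assumes a recent Lean 4 toolchain in which the `omega` tactic is available without extra imports. It shows a known fact from the library, a conjecture proved by hand, and the same conjecture closed by an automated procedure.

```lean
-- Knowledge base: a fact already available in Lean 4's core library.
#check (Nat.add_comm : ∀ (n m : Nat), n + m = m + n)

-- Conjecture (the target theorem) with a hand-written proof term.
theorem add_comm_by_hand (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- The same conjecture, discharged by an automated decision procedure
-- for linear arithmetic instead of a manually chosen lemma.
theorem add_comm_automated (a b : Nat) : a + b = b + a := by
  omega
```

The second proof is the kind of output an automated prover aims to produce: the statement is given, and the proof is found by search rather than written by a person.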

Benchmarks

These leaderboards track progress in Automated Theorem Proving.

Dataset	Best Model
miniF2F-test	Thor + expert iteration on autoformalised theorems
miniF2F-valid	Lean GPT-f
HolStep (Conditional)	MPNN-DagLSTM
HOList benchmark	4-hop GNN, sub-expression sharing
HolStep (Unconditional)	FormulaNet
Metamath set.mm	Evariste
miniF2F-curriculum	Evariste-7d
CompCert	Proverbot9001
CoqGym	ASTactic
LeanDojo Benchmark	ReProver

Libraries

Use these libraries to find Automated Theorem Proving models and implementations.

eleutherai/gpt-neox • 2 papers • 6,574 stars

Most implemented papers

Holophrasm: a neural Automated Theorem Prover for higher-order logic

dwhalen/holophrasm • 8 Aug 2016

I propose a system for Automated Theorem Proving in higher-order logic using deep learning and eschewing hand-constructed features.

Proof Artifact Co-training for Theorem Proving with Language Models

jesse-michael-han/lean-step-public • ICLR 2022

Labeled data for imitation learning of theorem proving in large libraries of formalized mathematics is scarce, as such libraries require years of concentrated effort by human specialists to build.

Llemma: An Open Language Model For Mathematics

eleutherai/gpt-neox • 16 Oct 2023

We present Llemma, a large language model for mathematics.

HOList: An Environment for Machine Learning of Higher-Order Theorem Proving

tensorflow/deepmath • 5 Apr 2019

We present an environment, benchmark, and deep-learning-driven automated theorem prover for higher-order logic.

MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics

openai/minif2f • ICLR 2022

We present miniF2F, a dataset of formal Olympiad-level mathematics problem statements intended to provide a unified cross-system benchmark for neural theorem proving.

Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs

facebookresearch/minif2f • 21 Oct 2022

In this work, we introduce Draft, Sketch, and Prove (DSP), a method that maps informal proofs to formal proof sketches, and uses the sketches to guide an automated prover by directing its search to easier sub-problems.

LeanDojo: Theorem Proving with Retrieval-Augmented Language Models

lean-dojo/leandojo • NeurIPS 2023

Using data extracted by LeanDojo, we develop ReProver (Retrieval-Augmented Prover), an LLM-based prover augmented with retrieval for selecting premises from a vast math library.

DeepMath - Deep Sequence Models for Premise Selection

JUrban/deepmath • NeurIPS 2016

We study the effectiveness of neural sequence models for premise selection in automated theorem proving, one of the main bottlenecks in the formalization of mathematics.
