
DeepSeek Unveils DeepSeek-Prover-V2: Advancing Neural Theorem Proving with Recursive Proof Search and a New Benchmark

DeepSeek AI has announced the release of DeepSeek-Prover-V2, an open-source large language model designed specifically for formal theorem proving in the Lean 4 environment. This latest iteration builds on the previous work by introducing an innovative recursive theorem-proving pipeline, and it uses DeepSeek-V3 to generate its own high-quality initialization data. The resulting model achieves state-of-the-art performance in neural theorem proving and is accompanied by ProverBench, a new benchmark for evaluating mathematical reasoning capabilities.

The main innovation of DeepSeek-Prover-V2 is its cold-start training procedure. The process begins by prompting the powerful DeepSeek-V3 model to decompose complex mathematical theorems into a series of more manageable subgoals. At the same time, DeepSeek-V3 formalizes these high-level proof steps in Lean 4, which effectively creates a structured sequence of sub-problems.
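To make the idea of subgoal decomposition concrete, here is a minimal Lean 4 sketch of what a decomposed theorem could look like. The theorem, the names `h1` and `h2`, and the use of `sorry` as a placeholder for each unproven subgoal are illustrative assumptions, not output from DeepSeek-V3.

```lean
-- Illustrative only: a toy theorem split into two subgoals.
-- Each `sorry` marks a sub-problem left for a separate proof search.
theorem sum_sq_nonneg (a b : Int) : 0 ≤ a * a + b * b := by
  -- Subgoal 1: the first square is nonnegative.
  have h1 : 0 ≤ a * a := by sorry
  -- Subgoal 2: the second square is nonnegative.
  have h2 : 0 ≤ b * b := by sorry
  -- Final step: combine the two subgoals.
  exact Int.add_nonneg h1 h2
```

Once every `sorry` is replaced by a verified proof, the skeleton becomes a complete formal proof of the original statement.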

To handle the computationally intensive proof search for each subgoal, the researchers used a smaller 7B model. Once all the decomposed steps of a difficult problem have been proven, the complete step-by-step formal proof is paired with the corresponding DeepSeek-V3 chain of thought. This approach lets the model learn from a synthesized dataset that integrates both informal, high-level mathematical reasoning and rigorous formal proofs, providing a strong cold start for subsequent reinforcement learning.

Building on the synthetic cold-start data, the DeepSeek team curated a selection of difficult problems that the 7B prover model could not solve end to end, but whose subgoals had all been successfully proven. By combining the formal proofs of these subgoals, a complete proof of the original problem is constructed. This formal proof is then paired with the DeepSeek-V3 chain of thought that outlines the lemma decomposition, which creates a unified training example of informal reasoning followed by formalization.
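The assembly step described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: `Subgoal`, `build_cold_start_example`, and `toy_prover` are hypothetical names, the prover is a stub that returns canned placeholder strings, and the real pipeline verifies proofs in Lean 4 rather than concatenating text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Subgoal:
    name: str       # hypothetical lemma label, e.g. "h1"
    statement: str  # formal statement of the lemma


def build_cold_start_example(
    theorem: str,
    chain_of_thought: str,
    subgoals: List[Subgoal],
    prove: Callable[[Subgoal], Optional[str]],
) -> Optional[Dict[str, str]]:
    """Pair informal reasoning with a formal proof assembled from sub-proofs.

    Returns None if any subgoal remains unproven.
    """
    sub_proofs: List[str] = []
    for goal in subgoals:
        proof = prove(goal)  # stand-in for the smaller prover's search
        if proof is None:
            return None
        sub_proofs.append(f"have {goal.name} : {goal.statement} := by\n  {proof}")
    return {
        "theorem": theorem,
        "informal": chain_of_thought,
        "formal": "\n".join(sub_proofs),
    }


def toy_prover(goal: Subgoal) -> Optional[str]:
    # Toy stand-in: only "proves" subgoals it has a canned script for.
    canned = {"h1": "<tactic script for h1>", "h2": "<tactic script for h2>"}
    return canned.get(goal.name)


goals = [Subgoal("h1", "0 ≤ a * a"), Subgoal("h2", "0 ≤ b * b")]
example = build_cold_start_example(
    "0 ≤ a * a + b * b", "Each square is nonnegative, so their sum is too.",
    goals, toy_prover,
)
```

The function returns a training example only when every subgoal has a proof, mirroring the condition that all sub-parts must be solved before the full proof is assembled.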

The prover model is then fine-tuned on this synthetic data, followed by a reinforcement learning stage. This stage uses binary correct-or-incorrect feedback as the reward signal, which improves the model's ability to bridge the gap between informal mathematical intuition and the rigorous construction of formal proofs.
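A binary reward of this kind is simple to express in code. The sketch below is an assumption-laden illustration, not the team's implementation: `binary_reward`, `mean_reward`, and `toy_verify` are hypothetical names, and `toy_verify` is a stand-in for an actual Lean 4 check.

```python
from typing import Callable, List


def binary_reward(proof: str, verify: Callable[[str], bool]) -> float:
    """Return 1.0 if the verifier accepts the proof, otherwise 0.0."""
    return 1.0 if verify(proof) else 0.0


def mean_reward(proofs: List[str], verify: Callable[[str], bool]) -> float:
    """Average binary reward over several sampled proofs for one problem."""
    if not proofs:
        return 0.0
    return sum(binary_reward(p, verify) for p in proofs) / len(proofs)


def toy_verify(proof: str) -> bool:
    # Hypothetical stand-in for a Lean 4 check: rejects unfinished proofs.
    return "sorry" not in proof
```

Because the reward carries no partial credit, a proof attempt is only reinforced when it is fully accepted by the verifier.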

The culmination of this training process is DeepSeek-Prover-V2-671B, a model with 671 billion parameters. It delivers state-of-the-art performance in neural theorem proving: it reaches an 88.9% pass ratio on the miniF2F test set and solves 49 out of 658 problems from PutnamBench. The proofs generated by DeepSeek-Prover-V2 for the miniF2F dataset are publicly available for download, allowing further scrutiny and analysis.
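For comparison with the miniF2F percentage, the PutnamBench count can be converted to a rate with a short calculation. The helper name `solve_rate` is ours, introduced only for this illustration.

```python
def solve_rate(solved: int, total: int) -> float:
    """Percentage of problems solved."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * solved / total


putnam_rate = solve_rate(49, 658)
print(f"PutnamBench: {putnam_rate:.1f}% solved")  # prints "PutnamBench: 7.4% solved"
```

The gap between the two figures reflects how much harder the PutnamBench problems are for current provers.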

In addition to the model release, DeepSeek AI has introduced ProverBench, a new benchmark dataset comprising 325 problems. The benchmark is designed to provide a more comprehensive evaluation of mathematical reasoning capabilities across different levels of difficulty.

ProverBench includes 15 problems formalized from recent AIME (American Invitational Mathematics Examination) competitions (AIME 24 and 25), offering authentic challenges at the high-school competition level. The remaining 310 problems are drawn from textbook examples and educational tutorials, providing a diverse and pedagogically grounded collection of formalized mathematical problems that spans various areas of mathematics.

ProverBench aims to facilitate a more comprehensive evaluation of neural theorem provers across both challenging competition problems and fundamental undergraduate-level mathematics.

DeepSeek AI is releasing DeepSeek-Prover-V2 in two model sizes to suit different computational resources: a 7B-parameter model and a 671B-parameter model. DeepSeek-Prover-V2-671B is built on top of DeepSeek-V3-Base. DeepSeek-Prover-V2-7B is built on DeepSeek-Prover-V1.5-Base and features an extended context length of up to 32K tokens, allowing it to handle longer and more complex reasoning sequences.

The release of DeepSeek-Prover-V2 and the introduction of ProverBench represent an important step forward in neural theorem proving. By combining a recursive proof search pipeline with a challenging new benchmark, DeepSeek AI enables the community to develop and evaluate more advanced AI systems capable of formal mathematics.

Link: https://huggingface.co/deepseek-ai/deepseek-prover-v2-671B


2025-04-30 15:46:00
