Microsoft AI Released Phi-4-Reasoning: A 14B Parameter Open-Weight Reasoning Model that Achieves Strong Performance on Complex Reasoning Tasks

Despite notable advances in large language models (LLMs), strong performance on demanding reasoning tasks such as mathematical problem solving, algorithmic planning, and coding remains constrained by model size, training methodology, and inference-time capabilities. Models that perform well on general NLP benchmarks often fail to construct multi-step chains of thought or reflect on intermediate problem-solving states. Moreover, while scaling up model size can improve reasoning ability, it incurs substantial computational and deployment costs, especially for applied use in education, engineering, and decision-support systems.

Microsoft launches the Phi-4 reasoning model family

Microsoft recently introduced the Phi-4 reasoning family, which consists of three models: Phi-4-reasoning, Phi-4-reasoning-plus, and Phi-4-mini-reasoning. These models are derived from the Phi-4 base model (14B parameters) and are specifically trained to handle complex reasoning tasks in mathematics, scientific domains, and software-related problem solving. Each variant addresses a different trade-off between computational efficiency and output accuracy. Phi-4-reasoning is optimized via supervised fine-tuning, while Phi-4-reasoning-plus extends this with outcome-based reinforcement learning, specifically targeting improved performance on high-variance tasks such as competition-level mathematics.

The models are released as open weights with transparent training details and evaluation logs, including benchmark design, and are hosted on Hugging Face for reproducibility and public access.

Technical composition and methodological advances

The Phi-4-reasoning models build on the Phi-4 architecture with targeted improvements to model behavior and the training regime. Key methodological decisions include:

  • Structured supervised fine-tuning (SFT): More than 1.4 million prompts were curated with a focus on "boundary" cases, i.e., problems at the edge of the base Phi-4 model's capabilities. Prompts were sourced and filtered to emphasize multi-step reasoning rather than factual recall, and responses were synthetically generated with o3-mini in high-reasoning-effort mode.
  • Chain-of-thought formatting: To facilitate structured reasoning, the models were trained to generate output with explicit reasoning tags, encouraging separation of the reasoning trace from the final answer.
  • Extended context handling: The RoPE base frequency was adjusted to support a 32K-token context window, allowing deeper solution traces, particularly relevant for multi-turn or long-form questions.
  • Reinforcement learning (Phi-4-reasoning-plus): Using Group Relative Policy Optimization (GRPO), Phi-4-reasoning-plus was further refined on a small curated set of mathematics-focused problems. The reward function was designed to favor correct, concise, and well-structured outputs, while penalizing verbosity, repetition, and formatting violations.
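The separation of reasoning trace and final answer described above can be sketched with a small parser. This is a minimal illustration, assuming the chain-of-thought is wrapped in `<think>...</think>` tags; the exact tag names and output layout are an assumption here, not a confirmed specification of the model's format.

```python
import re


def split_reasoning(output: str) -> tuple[str, str]:
    """Split a model response into (reasoning trace, final answer).

    Assumes the chain-of-thought is wrapped in <think>...</think> tags
    (an assumption about the output format); anything after the closing
    tag is treated as the final answer.
    """
    match = re.search(r"<think>(.*?)</think>", output, flags=re.DOTALL)
    if match is None:
        # No explicit trace found: treat the whole output as the answer.
        return "", output.strip()
    trace = match.group(1).strip()
    answer = output[match.end():].strip()
    return trace, answer


response = "<think>2 + 2 = 4, then double it to get 8.</think> The answer is 8."
trace, answer = split_reasoning(response)
```

Keeping the trace and answer in separate fields makes it straightforward to score only the final answer during evaluation while logging the full reasoning for inspection.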

This data-centric and format-aware training regime supports more effective inference-time compute usage and better generalization across domains, including unseen symbolic reasoning problems.

Evaluation and benchmark performance

Across a broad range of reasoning benchmarks, Phi-4-reasoning and Phi-4-reasoning-plus deliver competitive results relative to significantly larger open-weight models:

Phi-4-reasoning-plus shows strong performance not only on in-domain evaluations but also on planning and combinatorial problems such as TSP and 3SAT, despite no explicit training in these domains. Performance gains were also observed in instruction following (IFEval) and long-context QA (FlenQA), suggesting that the chain-of-thought formulation improves broader model utility.

More importantly, Microsoft reports full accuracy distributions across 50+ generations on variance-sensitive datasets such as AIME 2025, revealing that Phi-4-reasoning-plus matches or exceeds the consistency of models like o3-mini, while remaining well ahead of smaller baselines such as DeepSeek-R1-Distill.
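Reporting a distribution over repeated generations, rather than a single pass@1 score, can be approximated as follows. This is an illustrative sketch, not the paper's exact protocol: the run counts and data are made up, and a real evaluation would use 50+ sampled generations per problem.

```python
from statistics import mean, stdev


def accuracy_distribution(runs: list[list[bool]]) -> dict:
    """Summarize per-run accuracy over repeated independent generations.

    `runs` holds one list per sampling run, with a boolean correctness
    flag per problem. Reporting the spread rather than a single score
    exposes how stable a reasoning model is on small, high-variance
    benchmarks such as AIME.
    """
    per_run = [sum(r) / len(r) for r in runs]
    return {
        "mean_accuracy": mean(per_run),
        "std_accuracy": stdev(per_run) if len(per_run) > 1 else 0.0,
        "min": min(per_run),
        "max": max(per_run),
    }


# Illustrative data: 4 runs over a 5-problem set (hypothetical results).
runs = [
    [True, True, False, True, False],
    [True, False, False, True, True],
    [True, True, True, True, False],
    [True, True, False, False, False],
]
summary = accuracy_distribution(runs)
```

A wide min-to-max spread at similar mean accuracy is exactly the kind of instability that a single-run benchmark number hides.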


Conclusion and implications

The Phi-4 reasoning models represent a methodologically rigorous effort to advance the capabilities of small models in structured reasoning. By combining data-centric training, architectural tuning, and lightweight but well-targeted reinforcement learning, Microsoft demonstrates that 14B-scale models can match or outperform much larger systems on tasks requiring multi-step inference and generalization.

The models' open-weight availability and transparent benchmarking set a precedent for future development of small LLMs, particularly for applied domains where interpretability, cost, and reliability are paramount. Future work is expected to extend reasoning capabilities to additional STEM fields, improve decoding strategies, and explore scalable reinforcement learning over longer horizons.


Check out the Paper, the Hugging Face models, and the Microsoft Blog.



Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for social good. His most recent endeavor is the launch of the AI media platform Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is technically sound and easily understandable to a wide audience. The platform boasts over 2 million monthly views, reflecting its popularity among readers.


2025-05-01 06:53:00
