Together AI Releases DeepSWE: A Fully Open-Source RL-Trained Coding Agent Based on Qwen3-32B That Achieves 59% on SWE-Bench



Together AI has released DeepSWE, a fully open-source software engineering agent trained entirely through reinforcement learning (RL). Built on top of the Qwen3-32B language model, DeepSWE achieves 59% accuracy on the SWE-Bench Verified benchmark and 42.2% Pass@1, topping the leaderboard among open-weight models. The launch marks a significant shift in Together AI's focus, from traditional pretraining pipelines toward autonomous language agents that learn and continuously improve through real-world feedback.

Reinforcement Learning Meets Code Generation

DeepSWE is the result of post-training the Qwen3-32B foundation model using rLLM, Agentica's reinforcement learning framework designed specifically for language agents. Unlike conventional supervised fine-tuning approaches, rLLM enables agents to adapt to real-world workflows through experience. DeepSWE has been trained specifically to solve complex software engineering tasks using a feedback-driven loop rather than static datasets.

The training pipeline incorporates Agentica's R2E-Gym dataset, a software engineering benchmark designed for RL-style agent development. The framework focuses on training language models with action-oriented objectives, such as fixing bugs, completing functions, and editing code, rather than merely predicting next-token distributions. This aligns DeepSWE more closely with how human engineers iterate and learn from outcomes.
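To make the contrast between an outcome-driven signal and static labels concrete, here is a minimal sketch of a sparse, test-based reward. The article does not describe DeepSWE's actual reward design, so everything here is an assumption for illustration: the `outcome_reward` helper, the toy `buggy_abs` and `fixed_abs` candidates, and the test cases are hypothetical and are not part of rLLM or R2E-Gym.

```python
from typing import Callable, Iterable

# A test case pairs positional arguments with the expected return value.
TestCase = tuple[tuple, object]


def outcome_reward(candidate: Callable, tests: Iterable[TestCase]) -> float:
    """Return 1.0 if the candidate passes every test, otherwise 0.0.

    Assumption: a sparse, all-or-nothing reward. A real training setup
    may shape or weight the signal differently.
    """
    for args, expected in tests:
        try:
            if candidate(*args) != expected:
                return 0.0
        except Exception:
            # A crashing candidate counts as a failed attempt.
            return 0.0
    return 1.0


# Two hypothetical "patches" for the same task: compute an absolute value.
def buggy_abs(x):
    return x  # Wrong for negative inputs.


def fixed_abs(x):
    return -x if x < 0 else x


TESTS = [((3,), 3), ((-4,), 4), ((0,), 0)]

rewards = {f.__name__: outcome_reward(f, TESTS) for f in (buggy_abs, fixed_abs)}
print(rewards)
```

The point of the sketch is that the signal comes from executing the candidate and observing the result, not from comparing its text against a reference answer.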

Performance Benchmarks and Capabilities

On SWE-Bench Verified, the most rigorous benchmark for software engineering agents, DeepSWE scores 59% with test-time scaling. This significantly outperforms previous open-weight models. In Pass@1 evaluations, which measure the probability that the agent solves a problem correctly on its first attempt, DeepSWE reaches 42.2%.
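For readers unfamiliar with the metric, the sketch below shows the pass@k estimator commonly used in code-generation evaluation, of which Pass@1 is the k = 1 case. The article does not say exactly how DeepSWE's Pass@1 was computed, so treat this as a general illustration; the function names and the sample counts are made up for the example.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n sampled attempts, c of them correct.

    Computes 1 - C(n - c, k) / C(n, k): the probability that at least one
    of k attempts drawn without replacement from the n samples is correct.
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Too few incorrect samples to fill k draws, so one must be correct.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, each given as (n, c)."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Illustrative numbers only: three problems with 10 attempts each.
example = [(10, 3), (10, 0), (10, 10)]
print(mean_pass_at_k(example, k=1))
```

With k = 1 the estimator reduces to c / n per problem, that is, the fraction of single attempts that succeed.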

These results underscore the power of RL-based training in enhancing agentic behavior, particularly in domains that require iterative reasoning and precise outputs, such as code generation. The model's architecture, inherited from Qwen3-32B, allows it to scale effectively while remaining suitable for real-world applications.

Open Source and Reproducibility at Its Core

One of the standout features of this release is its full transparency. Together AI and Agentica have open-sourced not only the DeepSWE model but also the entire training recipe, including the rLLM framework, the R2E-Gym dataset, and the training scripts. This promotes reproducibility and invites the broader research and developer communities to extend or build on DeepSWE without restrictions.

Both DeepSWE and rLLM are publicly available to developers.

From Language Reasoners to Language Agents

DeepSWE marks a philosophical and practical shift: from building models that reason about language to building agents that learn through interaction. Traditional LLMs have shown strong reasoning capabilities but often lack the ability to adapt to feedback or improve with use. Reinforcement learning enables these models not only to perform well at launch but also to improve over time and adapt to new task distributions and problem domains.

This approach also opens the door to local deployment. Because DeepSWE is fully open-source and modular, it can be extended and retrained for organization-specific use cases. Developers and researchers can build their own agents on top of DeepSWE using rLLM to serve diverse domains such as web navigation, robotics, or autonomous research assistance.

Conclusion

DeepSWE is a milestone in the evolution of generative AI for software engineering. By applying reinforcement learning to large language models such as Qwen3-32B and releasing the entire training infrastructure, Together AI is enabling a future in which agents are not just pretrained and deployed but continually trained and improved. This leap from language understanding to action-oriented agency has significant implications across programming, automation, and intelligent system design.

All credit for this research goes to the researchers of this project.

Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for social good. His most recent endeavor is the launch of an artificial intelligence media platform, Marktechpost, which stands out for in-depth coverage of machine learning and deep learning news that is both technically sound and easily understood by a wide audience. The platform has more than 2 million monthly views, a sign of its popularity with readers.