HOLY SMOKES! A new, 200% faster DeepSeek R1-0528 variant appears from German lab TNG Technology Consulting GmbH

It's been a little more than a month since Chinese AI startup DeepSeek, an offshoot of Hong Kong-based High-Flyer Capital Management, released the latest version of its hit open source model, DeepSeek-R1-0528.
Like its predecessor DeepSeek-R1, which rocked the AI and global business communities with its permissive licensing and its strong performance on reasoning tasks, R1-0528 is already being adapted and remixed by other labs and developers, thanks in large part to that open license.
This week, the 24-year-old German firm TNG Technology Consulting GmbH released one such adaptation: DeepSeek-TNG R1T2 Chimera, the latest model in its Chimera large language model (LLM) family. R1T2 delivers a notable boost in efficiency and speed, scoring at upwards of 90% of R1-0528's intelligence benchmark scores while generating answers with less than 40% of R1-0528's output token count.
That means it produces shorter responses, translating directly into faster inference and lower compute costs. On the model card TNG released for the new R1T2 on AI code-sharing community Hugging Face, the company states that it is "about 20% faster than the regular R1" (released in January) and "more than twice as fast as R1-0528" (DeepSeek's official May update).
Already, the response from the AI developer community has been incredibly positive. "DAMN! DeepSeek R1T2 – 200% faster than R1-0528 & 20% faster than R1," wrote Vaibhav (VB) Srivastav, a senior leader at Hugging Face, on X.
That gain is made possible by TNG's Assembly-of-Experts (AoE) method, a technique for building LLMs by selectively merging the weight tensors (internal parameters) of multiple pre-trained models, which TNG described in a paper it published in May on arXiv, the non-peer-reviewed open-access online journal.
The successor to the original R1T Chimera, R1T2 introduces a new "Tri-Mind" configuration that integrates three parent models: DeepSeek-R1-0528, DeepSeek-R1, and DeepSeek-V3-0324. The result is a model engineered to maintain high reasoning capability while significantly reducing inference cost.
R1T2 is constructed without further fine-tuning or retraining. It inherits the reasoning strength of R1-0528, the structured thought patterns of R1, and the concise, instruction-oriented behavior of V3-0324, yielding a more efficient yet capable model for enterprise and research use.
How Assembly-of-Experts (AoE) differs from Mixture-of-Experts (MoE)
Mixture-of-Experts (MoE) is an architectural design in which different components, or "experts," are activated conditionally. In MoE LLMs such as DeepSeek-V3 or Mixtral, only a subset of the model's experts (for example, 8 out of 256) is active during any given token's forward pass. This allows very large models to achieve higher parameter counts and specialization while keeping inference costs manageable, because only a fraction of the network is evaluated per token.
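To make the routing idea concrete, here is a minimal, self-contained sketch of a top-k MoE layer in PyTorch. The dimensions, expert count, and top-k value are illustrative assumptions for the sketch, not DeepSeek-V3's or Mixtral's actual configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoELayer(nn.Module):
    """Minimal top-k mixture-of-experts layer (all sizes are illustrative)."""
    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)   # scores every expert for each token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                              # x: (tokens, d_model)
        scores = self.router(x)                        # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1) # keep only the top-k experts per token
        weights = F.softmax(weights, dim=-1)           # renormalize over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):                 # evaluate only the selected experts
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

# Only top_k of n_experts run per token, which is why huge MoE models stay affordable.
layer = TinyMoELayer()
tokens = torch.randn(5, 64)
print(layer(tokens).shape)  # torch.Size([5, 64])
```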
Assembly-of-Experts (AoE), by contrast, is a model merging technique, not an architecture. It is used to create a new model out of multiple pre-trained MoE models by selectively interpolating their weight tensors.
The "experts" in AoE refer to the model components being merged, typically the routed expert tensors within MoE layers, not experts that are dynamically activated at runtime.
TNG's implementation of AoE focuses primarily on merging the routed expert tensors, the part of a model most responsible for specialized reasoning, while often retaining the more efficient shared and attention layers from faster models such as V3-0324. This approach lets the resulting Chimera models inherit reasoning strength without replicating the verbosity or latency of the strongest parent models.
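Conceptually, such a merge reduces to a per-tensor choice or interpolation between parent checkpoints. The sketch below illustrates only that idea; the tensor-name matching, the interpolation weight, and the two-parent simplification are assumptions for the example, not TNG's actual pipeline:

```python
import torch

def assemble_experts(reasoning_sd, fast_sd, lam=0.6):
    """
    Illustrative Assembly-of-Experts-style merge of two state dicts:
    routed-expert tensors lean toward the reasoning parent (e.g. R1-0528),
    while attention/shared tensors come from the faster parent (e.g. V3-0324).
    `lam` and the ".experts." name test are assumptions for this sketch.
    """
    merged = {}
    for name, fast_tensor in fast_sd.items():
        if ".experts." in name:
            # Routed expert weights: interpolate toward the reasoning parent.
            merged[name] = lam * reasoning_sd[name] + (1 - lam) * fast_tensor
        else:
            # Attention / shared layers: keep the efficient parent's weights.
            merged[name] = fast_tensor.clone()
    return merged

# Toy demonstration with random "checkpoints" (real DeepSeek tensors are far larger).
fast = {"layers.0.attn.w": torch.randn(4, 4), "layers.0.experts.3.w": torch.randn(4, 4)}
slow = {"layers.0.attn.w": torch.randn(4, 4), "layers.0.experts.3.w": torch.randn(4, 4)}
merged = assemble_experts(slow, fast)
```

Because the merge happens purely in weight space, no gradient updates or training data are involved, which is consistent with the article's point that R1T2 required no further fine-tuning.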
Performance and speed: what the benchmarks actually show
According to benchmark comparisons presented by TNG, R1T2 achieves between 90% and 92% of the reasoning performance of its most intelligent parent, DeepSeek-R1-0528, as measured on the AIME-24, AIME-25, and GPQA-Diamond test sets.
However, unlike DeepSeek-R1-0528, which tends to produce long, detailed answers due to its extended chain-of-thought reasoning, R1T2 is designed to be much more concise. It delivers similarly intelligent responses while using significantly fewer words.
Rather than focusing on raw processing time or tokens per second, TNG measures "speed" in terms of output token count per answer, a practical proxy for both cost and latency. According to benchmarks shared by TNG, R1T2 generates responses using roughly 40% of the tokens required by R1-0528.
That translates to a 60% reduction in output length, which directly cuts inference time and compute load, speeding up responses by 2X, or 200%.
Compared to the original DeepSeek-R1, R1T2 is also around 20% more concise on average, offering meaningful efficiency gains for high-throughput or cost-sensitive deployments.
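As a back-of-the-envelope illustration of what those token counts mean for cost, consider the toy calculation below; the token counts and per-token price are invented for the example and are not TNG's or DeepSeek's figures:

```python
# Toy output-token economics; every number here is an illustrative assumption.
r1_0528_tokens = 10_000                    # hypothetical average output tokens per answer
r1t2_tokens = int(r1_0528_tokens * 0.40)   # R1T2 reportedly needs ~40% of those tokens

price_per_million = 2.50                   # assumed $ per 1M output tokens, not a real quote

def output_cost(tokens: int) -> float:
    """Dollar cost of generating `tokens` output tokens at the assumed price."""
    return tokens / 1_000_000 * price_per_million

saving = 1 - output_cost(r1t2_tokens) / output_cost(r1_0528_tokens)
print(f"R1T2 tokens per answer: {r1t2_tokens}; output-cost reduction: {saving:.0%}")
# -> R1T2 tokens per answer: 4000; output-cost reduction: 60%
```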
This efficiency does not come at the cost of intelligence. As shown in the benchmark chart presented in TNG's technical paper, R1T2 sits in a desirable zone on the intelligence-versus-output-cost curve. It preserves reasoning quality while minimizing verbosity, an outcome critical for enterprise applications where inference speed, throughput, and cost all matter.
Deployment considerations and availability
R1T2 is released under a permissive MIT License and is available now on Hugging Face, meaning it is open source and can be used and built into commercial applications.
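Because the weights are public, loading them should follow the standard Hugging Face transformers pattern. The snippet below is a minimal sketch only: the repo id matches the one cited later in this article, but the dtype/device settings are assumptions, and a model of this size realistically requires a multi-GPU cluster rather than a single machine:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tngtech/DeepSeek-TNG-R1T2-Chimera"  # repo id cited in this article

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # let transformers pick the checkpoint's dtype
    device_map="auto",       # shard across available GPUs; this model needs many
    trust_remote_code=True,  # DeepSeek-V3-family checkpoints ship custom model code
)

messages = [{"role": "user", "content": "Prove that the square root of 2 is irrational."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=1024)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```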
TNG notes that while the model is well suited to general reasoning tasks, it is not currently recommended for use cases requiring function calling or tool use, due to limitations inherited from its DeepSeek-R1 lineage. These may be addressed in future updates.
The company also advises European users to assess compliance with the EU AI Act, which comes into effect on August 2, 2025.
Enterprises operating in the EU should review the relevant provisions or consider halting use of the model after that date if the requirements cannot be met.
However, U.S. companies operating domestically and serving U.S.-based users, or those of other nations, are not subject to the terms of the EU AI Act, which should give them considerable flexibility when using and deploying this free, speedy open source reasoning model. If they serve users in the EU, some provisions of the Act will still apply.
TNG has made prior Chimera variants available through platforms such as OpenRouter and Chutes, where they reportedly processed billions of tokens daily. The release of R1T2 represents a further evolution of this public availability effort.
About TNG Technology Consulting GmbH
TNG Technology Consulting GmbH was founded in January 2001 and is based in Bavaria, Germany. It employs more than 900 people, with a high concentration of PhDs and technical specialists.
The company focuses on software development, artificial intelligence, and DevOps/cloud services, serving major enterprise clients across industries such as telecommunications, insurance, automotive, e-commerce, and logistics.
TNG operates as a values-based consulting partnership. Its unique structure, grounded in operational research and self-management principles, supports a culture of technical innovation.
It actively contributes to open source communities and research, as demonstrated through public releases like R1T2 and the publication of its Assembly-of-Experts methodology.
What it means for enterprise technical decision-makers
For CTOs, AI platform owners, engineering leads, and IT procurement teams, R1T2 offers tangible benefits and strategic options:
- Lower inference costs: With fewer output tokens per task, R1T2 reduces GPU time and energy consumption, translating directly into infrastructure savings, which matters most in high-throughput or real-time environments.
- High reasoning quality without the overhead: It preserves much of the reasoning power of top-tier models like R1-0528, but without their long-windedness. This is ideal for structured tasks (math, programming, logic) where concise answers are preferable.
- Open and modifiable: The MIT License allows full deployment control and customization, enabling private hosting, model alignment, or further training within regulated or air-gapped environments.
- Emerging modularity: The AoE approach suggests a future in which models are built modularly, allowing enterprises to assemble specialized variants by recombining the strengths of existing models rather than retraining from scratch.
- Caveats: Enterprises relying on function calling, tool use, or advanced agent orchestration should note the current limitations, although future Chimera updates may address these gaps.
TNG encourages researchers, developers, and enterprise users to explore the model, test its behavior, and provide feedback. The R1T2 Chimera is available at huggingface.co/tngtech/DeepSeek-TNG-R1T2-Chimera, and technical inquiries can be directed to research@tngtech.com.
For technical background on the benchmarks, TNG's research paper is available on arXiv: 2506.14794.