
Initial reactions to OpenAI’s landmark open source gpt-oss models are decidedly mixed




OpenAI’s long-awaited return to the “open” in its name came yesterday with the release of two new large language models (LLMs): gpt-oss-120b and gpt-oss-20b.

But despite posting benchmark scores on par with OpenAI’s other powerful proprietary AI offerings, the response from the broader AI developer and early-adopter community has so far been all over the map. If this release were a movie premiering and being scored on Rotten Tomatoes, we’d be looking at roughly a 50% split, based on my notes.

First, some background: OpenAI released these two new text-only language models (no image generation or analysis), both under the permissive Apache 2.0 license. It’s the first time since 2019 (before ChatGPT) that the company has done so with a cutting-edge language model.

The entire ChatGPT era of the past 2.7 years has been powered by proprietary or closed-source models, ones controlled by OpenAI that users had to pay to access (or use via a limited free tier), with limited customization and no way to run them offline or on private computing hardware.




But all that changed with yesterday’s release of the gpt-oss pair: the larger and more powerful of the two is designed to run on a single Nvidia H100 GPU, the kind found in a small or mid-sized data center or server, while the smaller one can run on a single consumer laptop or desktop PC like the kind in your home.
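For readers who want to kick the tires themselves, here is a minimal sketch of what running the smaller model locally could look like with the Hugging Face transformers library. The repository id `openai/gpt-oss-20b` and the loading options are assumptions for illustration, not an official quickstart; check the model card before relying on them.

```python
# Minimal local-inference sketch using the Hugging Face transformers library.
# The repo id "openai/gpt-oss-20b" and loading options are assumptions for
# illustration; consult the official model card before use.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openai/gpt-oss-20b"  # assumed Hugging Face repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # spread layers across available GPU(s)/CPU (needs accelerate)
    torch_dtype="auto",  # load in the checkpoint's native precision
)

# Build a chat-style prompt; apply_chat_template renders the model's
# expected format from plain role/content dicts.
messages = [{"role": "user", "content": "Explain what an open-weights model is."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```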

Of course, the models are still very new, with only a matter of hours having passed for the community of AI power users to run and test them independently on their own individual benchmarks (evaluations) and tasks.

Now we’re getting a wave of feedback ranging from optimistic enthusiasm about the capabilities of these powerful, free and efficient models, to barely concealed dissatisfaction and dismay at what some users see as significant problems and limitations, especially compared with the similar wave of Apache 2.0-licensed, powerful open source multimodal LLMs from Chinese startups (which can likewise be downloaded, customized and run locally on U.S. hardware for free, by American companies or by companies elsewhere around the world).

High benchmark scores, but still behind Chinese open source leaders

Benchmark scores put the gpt-oss models ahead of most American open source offerings. According to the independent firm Artificial Analysis, gpt-oss-120b is “the most intelligent American open weights model,” though it still trails Chinese open-weights heavyweights like DeepSeek R1 and Qwen3 235B.

“They benchmaxxed on reasoning, that’s all they did,” wrote self-professed DeepSeek “stan” @teortaxesTex. “Good derivative models won’t get trained off this … nothing new has been enabled … a borderline bragging-rights release.”

Those doubts were echoed by other open source AI commentators, one of whom wrote: “Overall, hugely disappointing, and I was legitimately rooting for this.”

Benchmaxxing on math and coding at the expense of writing?

Other criticism focused on gpt-oss’s apparently narrow usefulness.

Lisan al Gaib (@scaling01) noted that the models excel at math and coding but are “completely lacking taste and common sense.” He added: “so it’s just a math model?”

In creative writing tests, some users found the model injecting equations into poetic outputs. “This is what happens when you benchmaxx,” wrote Teknium, sharing a screenshot in which the model inserted math notation mid-poem.

@kalomaze, a researcher at Prime Intellect, wrote that gpt-oss-120b appears to know only about as much as a good 32B-parameter model does, speculating that OpenAI probably wanted to avoid copyright problems and so likely trained on majority-synthetic data: “very devastating stuff.”

Kyle Corbitt, a former Google engineer and independent AI developer, agreed that the gpt-oss models appear to have been trained mainly on synthetic data, that is, data created by an AI model expressly for the purpose of training new ones, which makes them “very spiky.”

“Great at the tasks they were trained on, really bad at everything else,” Corbitt wrote: impressive on coding and math problems, poor at linguistic tasks like creative writing or report generation.

In other words, the accusation is that OpenAI deliberately trained the models on more synthetic data than real-world facts and figures in order to avoid using copyrighted data scraped from websites and other repositories it doesn’t own or license, something many leading gen AI companies have been accused of in the past and continue to face lawsuits over.

Others speculated that OpenAI may have trained the models on synthetic data primarily to avoid safety and security problems, resulting in worse quality than if they had been trained on real-world (and presumably copyright-protected) data.

Concerning benchmark results from third parties

Moreover, the models’ showing on third-party benchmark tests has underwhelmed some users.

SpeechMap, which measures how willingly LLMs comply with user requests to generate uncensored, biased or politically sensitive outputs, shows gpt-oss-120b compliance scores hovering below 40%, near the bottom of comparable models. That suggests resistance to following user requests and a tendency to hide behind guardrails, perhaps at the expense of providing accurate information.

On Aider’s Polyglot evaluation, gpt-oss-120b scored 41.8% in multi-language reasoning, far below competitors such as Kimi-K2 (59.1%) and DeepSeek-R1 (56.9%).

Some users also said their tests indicate the model is strangely reluctant to generate criticism of China or Russia, in contrast with its treatment of the U.S. and the European Union, raising questions about bias and the filtering of training data.

Other experts applaud the release and what it signals

To be fair, not all the feedback is negative. Software engineer and AI commentator Simon Willison called it a “really impressive” release on X, elaborating in a blog post on the models’ efficiency and their ability to achieve near-parity with OpenAI’s proprietary o3-mini and o4-mini models.

He highlighted their strong performance on reasoning benchmarks, and singled out the new “harmony” prompt template format, which gives developers a more structured way to direct model responses, and the models’ support for third-party tool use as meaningful contributions.
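For context, harmony is a chat-formatting convention rather than an API: each message is wrapped in role (and, for assistant turns, channel) tokens the model was trained to expect. The sketch below illustrates the general shape under that understanding; the exact token names are best-effort assumptions here, and real applications should render prompts with OpenAI’s official openai-harmony renderer or a tokenizer’s built-in chat template rather than hand-rolling strings.

```python
# Illustrative harmony-style prompt rendering. The special-token names below
# follow the published harmony spec as best understood here and should be
# treated as assumptions; in practice, use the official openai-harmony
# package or tokenizer.apply_chat_template to guarantee exact tokens.
def render_harmony(messages: list[dict]) -> str:
    """Render chat messages into a harmony-style prompt string."""
    parts = []
    for msg in messages:
        header = msg["role"]  # e.g. "system", "developer", "user", "assistant"
        # Assistant turns carry a channel: "analysis" for chain-of-thought,
        # "commentary" for tool calls, "final" for the user-facing answer.
        if "channel" in msg:
            header += f"<|channel|>{msg['channel']}"
        parts.append(f"<|start|>{header}<|message|>{msg['content']}<|end|>")
    # Leave an open assistant turn so the model generates the next reply.
    parts.append("<|start|>assistant")
    return "".join(parts)

prompt = render_harmony([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the gpt-oss release."},
])
print(prompt)
```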

In a lengthy X post, Clem Delangue, CEO and co-founder of AI code-sharing community Hugging Face, urged users not to rush to judgment, noting that inference for these models is complicated and that early issues may stem from infrastructure instability and a lack of optimization across hosting providers.

“The power of open source is that there’s no cheating,” Delangue wrote. “We’ll uncover all the strengths and limitations … progressively.”

More cautious was University of Pennsylvania Wharton School professor Ethan Mollick, who wrote on X that “the US likely has the leading open weights models (or close to it),” but wondered whether this was a one-time move from OpenAI. “The lead will evaporate quickly as others catch up,” he noted, adding that it’s unclear what incentive OpenAI has to keep the models updated.

Nathan Lambert, an AI researcher at the open source-focused Allen Institute for AI (AI2) and a prominent commentator, praised the symbolic importance of the release on his Interconnects blog, calling it “a huge step for the open ecosystem, especially for the West and its allies, that the most known brand in the AI space has returned to releasing models openly.”

But he cautioned on X that gpt-oss is “unlikely to meaningfully slow down” [Chinese e-commerce giant Alibaba’s AI team] Qwen, citing Qwen’s usability, performance, and variety.

He said the release represents an important shift in the U.S. toward open models, but that OpenAI still has a “long way” to go to catch up in practice.

A divided verdict

The verdict, for now, is split.

OpenAI’s gpt-oss models are permissively licensed and easy to access.

But while the benchmarks look solid, the real-world “vibes,” as many users describe them, are proving less convincing.

Whether developers can build strong applications and derivatives on top of gpt-oss will determine whether the release is remembered as a breakthrough or a bust.




2025-08-06 20:24:00
