Technology

Cerebras just announced 6 new AI datacenters that process 40M tokens per second — and it could be bad news for Nvidia




Cerebras Systems, the AI hardware startup that has steadily challenged Nvidia's dominance in the AI market, announced Tuesday a major expansion of its datacenter footprint and two key enterprise partnerships that position the company to become the leading provider of high-speed AI inference services.

The company will add six new AI datacenters across North America and Europe, increasing its inference capacity twentyfold to more than 40 million tokens per second. The expansion includes facilities in Dallas, Minneapolis, Oklahoma City, Montreal, New York and France, with 85% of the total capacity located in the United States.

“This year, our goal is to truly satisfy all the demand, and all the new demand we expect will come online as a result of new models like Llama 4 and new DeepSeek models,” said James Wang, director of product marketing at Cerebras, in an interview with VentureBeat. “This is our huge growth initiative this year to satisfy the almost unlimited demand we’re seeing across the board for inference tokens.”

The company's ambitious datacenter expansion represents a bet that the market for high-speed AI inference (the process of running trained AI models to power real-world applications) will grow dramatically as companies seek faster alternatives to GPU-based solutions from Nvidia.

Cerebras plans to scale from 2 million to more than 40 million tokens per second by Q4 2025 across eight datacenters in North America and Europe. (Credit: Cerebras)
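The headline capacity figures can be sanity-checked in a few lines. The 2M and 40M tokens-per-second figures and the 85% US share come from the article; the variable names are purely illustrative:

```python
# Worked example using the capacity figures cited in the article.
current_tps = 2_000_000    # tokens per second today (per the article)
planned_tps = 40_000_000   # tokens per second targeted by Q4 2025
us_share = 0.85            # fraction of capacity hosted in the US

scale_factor = planned_tps / current_tps   # the "twentyfold" increase
us_tps = planned_tps * us_share            # US-hosted share of capacity

print(f"Capacity increase: {scale_factor:.0f}x")
print(f"US-hosted capacity: {us_tps / 1e6:.0f}M tokens/sec")
```

That works out to a 20x jump overall, with roughly 34 million tokens per second hosted domestically.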

Strategic partnerships that bring high-speed AI to developers and financial analysts

Alongside the infrastructure expansion, Cerebras announced partnerships with Hugging Face, the popular AI developer platform, and AlphaSense, a market intelligence platform widely used in the financial services industry.

The Hugging Face integration will allow its five million developers to access Cerebras Inference with a single click, without needing to sign up for Cerebras separately. This gives Cerebras a major distribution channel, particularly for developers working with open-source models such as Llama 3.3 70B.

“Hugging Face is kind of the GitHub of AI and the center of all open-source AI development,” Wang explained. “The integration is super nice and native. You just appear in their list of inference providers. You just check the box and then you can use Cerebras right away.”

The AlphaSense partnership represents a major enterprise customer win, with the financial intelligence platform switching from what Wang described as a “top-three closed-source AI model vendor.” The company, which serves approximately 85% of Fortune 100 companies, is using Cerebras to accelerate its AI-powered search capabilities for market intelligence.

“This is a massive customer win and a very large contract for us,” Wang said. “We sped it up by 10x, so what used to take five seconds or more now happens virtually instantly on Cerebras.”

Mistral’s Le Chat, powered by Cerebras, processes 1,100 tokens per second, outperforming major competitors such as Google’s Gemini, ChatGPT and Claude. (Credit: Cerebras)

How Cerebras is winning the AI inference speed race as reasoning models slow down

Cerebras has positioned itself as a specialist in high-speed inference, claiming that its Wafer-Scale Engine (WSE-3) processor runs AI models 10 to 70 times faster than GPU-based solutions. This speed advantage has become increasingly valuable as AI models evolve toward more complex reasoning capabilities.

“If you listen to Jensen’s comments, reasoning is the next big thing, even according to Nvidia,” Wang said, referring to Nvidia CEO Jensen Huang. “But what he’s not telling you is that reasoning makes the whole thing run 10 times slower, because the model has to think and generate a bunch of internal monologue before it gives you the final answer.”
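A rough sketch of why that overhead matters in wall-clock terms: the 1,100 tokens-per-second figure is the Cerebras-backed Le Chat number cited elsewhere in the article, while the GPU baseline and token counts below are hypothetical assumptions chosen only to illustrate the arithmetic:

```python
# Illustrative latency math for reasoning models.
gpu_tps = 100            # ASSUMED GPU serving throughput (tokens/sec)
cerebras_tps = 1100      # Le Chat on Cerebras, per the article

reasoning_tokens = 2000  # ASSUMED hidden "internal monologue" length
answer_tokens = 200      # ASSUMED visible answer length
total = reasoning_tokens + answer_tokens

gpu_latency = total / gpu_tps            # seconds on the assumed GPU baseline
cerebras_latency = total / cerebras_tps  # seconds at the cited Cerebras rate

print(f"GPU: {gpu_latency:.1f}s, Cerebras: {cerebras_latency:.1f}s")
```

Under these assumptions, the same reasoning trace takes tens of seconds on the GPU baseline but only a couple of seconds at Le Chat's cited throughput, which is the gap Cerebras is betting on.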

This slowdown creates an opportunity for Cerebras, whose specialized hardware is designed to accelerate these more complex AI workloads. The company has already secured high-profile customers including Perplexity AI and Mistral AI, which use Cerebras to power their AI search and assistant products, respectively.

“We help Perplexity become the world’s fastest AI search engine. This just isn’t possible otherwise,” Wang said. “We help Mistral achieve the same feat. Now they have a reason for people to subscribe to Le Chat Pro, whereas before, your model is probably not at the same cutting-edge level as GPT-4.”

Cerebras hardware delivers inference up to 13x faster than GPU solutions across popular AI models such as Llama 3.3 70B and DeepSeek R1 70B. (Credit: Cerebras)

The compelling economics behind Cerebras’ challenge to OpenAI and Nvidia

Cerebras is betting that the combination of speed and cost will make its inference services attractive even to companies already using leading models like GPT-4.

Wang noted that Meta’s Llama 3.3 70B, an open-source model Cerebras has optimized for its hardware, now scores the same on intelligence benchmarks as OpenAI’s GPT-4, while costing far less to run.

“Anyone using GPT-4 today can just move to Llama 3.3 70B as a drop-in replacement,” he explained. “The price for GPT-4 is [about] $4.40 in blended terms. And Llama 3.3 is like 60 cents. We’re about 60 cents, right? So you reduce cost by almost an order of magnitude. And if you use Cerebras, you increase speed by another order of magnitude.”
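The cost claim can be made concrete with the blended per-million-token prices Wang cites; the monthly volume below is an illustrative assumption, not a figure from the article:

```python
# Cost comparison using the blended prices quoted in the article.
gpt4_price = 4.40    # USD per million tokens (blended), per Wang
llama_price = 0.60   # USD per million tokens for Llama 3.3 70B, per Wang

# "Almost an order of magnitude" cheaper:
savings_factor = gpt4_price / llama_price

monthly_tokens = 500  # millions of tokens per month (ILLUSTRATIVE assumption)
gpt4_bill = monthly_tokens * gpt4_price
llama_bill = monthly_tokens * llama_price

print(f"~{savings_factor:.1f}x cheaper: ${gpt4_bill:,.0f} vs ${llama_bill:,.0f}/month")
```

At these prices the ratio is about 7.3x, which is what Wang rounds to "almost an order of magnitude"; the claimed speed gain on Cerebras hardware would then stack on top of the cost saving.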

Inside Cerebras’ tornado-proof datacenters, built for AI resilience

The company is making substantial investments in resilient infrastructure as part of its expansion. Its Oklahoma City facility, scheduled to come online in June 2025, is designed to withstand extreme weather events.

“Oklahoma, as you know, is kind of tornado alley. So this datacenter is actually rated and designed to be fully resistant to tornadoes and seismic activity,” Wang said. “It will withstand the strongest tornado ever recorded. If that thing just goes through, this thing will just keep sending Llama tokens to developers.”

The Oklahoma City facility, which Cerebras is operating in partnership with Scale Datacenter, will run more than 300 Cerebras CS-3 systems and features redundant power stations and custom water-cooling solutions designed for Cerebras’ wafer-scale systems.

Designed to withstand severe weather, the facility will host more than 300 Cerebras CS-3 systems when it opens in June 2025, featuring redundant power and specialized cooling systems. (Credit: Cerebras)

From skepticism to market leadership: How Cerebras is proving its value

The expansion and partnerships announced today mark an important milestone for Cerebras, which has been working to prove itself in an AI hardware market dominated by Nvidia.

“I think the skepticism about customer adoption, which was maybe reasonable when we first launched, has been thoroughly put to bed, just given the diversity of logos we have,” Wang said.

The company is targeting three specific areas where fast inference provides the most value: real-time voice and video processing, reasoning models, and coding applications.

“Coding is one of these in-between areas, between reasoning and regular Q&A, where it takes maybe 30 seconds to a minute to generate all the code,” Wang explained. “Speed directly translates to developer productivity. So having speed there matters.”

By focusing on high-speed inference rather than competing across all AI workloads, Cerebras has found a niche where it can claim leadership over even the largest cloud providers.

“Nobody generally competes against AWS and Azure on their scale. Obviously, we’re not trying to reach full scale like them, but being able to replicate a key segment … on the high-speed inference front, we will have more capacity than them,” Wang said.

Why Cerebras’ US datacenter expansion matters for AI sovereignty and future workloads

The expansion comes as the AI industry increasingly focuses on inference capabilities, with companies moving from experimenting with generative AI to deploying it in production applications where speed and cost efficiency are critical.

With 85% of its inference capacity located in the United States, Cerebras is also positioning itself as a key player in advancing domestic AI infrastructure at a time when technological sovereignty has become a national priority.

“Cerebras is turbocharging the future of US AI leadership with unmatched performance, scale and efficiency. These new global datacenters will serve as the backbone for the next wave of AI innovation.”

As reasoning models such as DeepSeek R1 and OpenAI’s o3 become more prevalent, demand for faster inference solutions is likely to grow. These models, which can take minutes to generate answers on traditional hardware, respond near-instantly on Cerebras systems, according to the company.

For technical decision-makers evaluating AI infrastructure options, Cerebras’ expansion represents a significant new alternative to GPU-based solutions, particularly for applications where response time is critical to the user experience.

Whether or not the company can truly challenge Nvidia’s dominance in the broader AI hardware market, its focus on high-speed inference and its major infrastructure investment demonstrate a clear strategy to own a valuable segment of the rapidly evolving AI landscape.



2025-03-11 12:30:00
