Cutting cloud waste at scale: Akamai saves 70% using AI agents orchestrated by Kubernetes



Particularly in this dawning era of artificial intelligence, cloud costs are at an all-time high. But that is not merely because enterprises are using more compute: they are not using it efficiently. In fact, this year alone, enterprises are expected to waste $44.5 billion on unnecessary cloud spending.

This is an amplified problem for Akamai Technologies: The company has a large and complex cloud infrastructure spanning multiple clouds, not to mention numerous strict security requirements.

To solve this, the cybersecurity and content delivery provider turned to the Kubernetes automation platform Cast AI, whose AI agents help optimize cost, security and speed across cloud environments.

Ultimately, the platform helped Akamai cut between 40% and 70% of its cloud costs, depending on the workload.

“We needed a continuous way to optimize our infrastructure and reduce our cloud costs without sacrificing performance,” Dekel Shavit, chief cloud engineering manager at Akamai, told VentureBeat. “We are the ones processing security events. Delay is not an option. If we cannot respond to a security attack in real time, we have failed.”

Specialized agents that monitor, analyze and act

Kubernetes manages the infrastructure that runs applications, making them easier to deploy, scale and manage, particularly in cloud-native and microservices architectures.

Cast AI has integrated into the Kubernetes ecosystem to help customers scale their clusters and workloads, select the best infrastructure and manage compute lifecycles, explained founder and CEO Laurent Gil. Its core platform is Application Performance Automation (APA), which operates through a team of specialized agents that continuously monitor, analyze and take action to improve application performance, security, efficiency and cost. Companies provision only the compute they need from AWS, Microsoft, Google or others.

APA is powered by several machine learning (ML) models with reinforcement learning (RL) based on historical data and learned patterns, enhanced by an observability stack and heuristics. It is coupled with infrastructure-as-code (IaC) tools on several clouds, making it a completely automated platform.

Gil explained that APA was built on the principle that observability is just a starting point; as he put it, observability is “the foundation, not the goal.” Cast AI also supports incremental adoption, so customers do not have to rip out and replace what they have; they can integrate it into existing tools and workflows. Moreover, nothing leaves the customer's infrastructure; all analysis and actions occur within the customer's dedicated Kubernetes clusters, providing more security and control.

Gil also emphasized the importance of keeping humans involved. “Automation complements human decision-making,” he said, with APA maintaining human-in-the-middle workflows.

Akamai's unique challenges

Shavit explained that Akamai's large and complex cloud infrastructure powers content delivery network (CDN) and cybersecurity services delivered to some of the world's most demanding customers and industries, while complying with strict service level agreements (SLAs) and performance requirements.

He noted that for some of the services they consume, they are probably their vendor's largest customer, adding that they have done “tons of core engineering and reengineering” with their hyperscaler to support their needs.

Moreover, Akamai serves customers of various sizes and industries, including large financial institutions and credit card companies. The company's services are directly related to its customers' security posture.

Ultimately, Akamai needed to balance all this complexity with cost. Shavit noted that real-life attacks on customers can drive demand up by 100X or 1,000X on specific components of the infrastructure. But “scaling our cloud capacity by 1,000X in advance is not financially feasible.”
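To make the arithmetic behind that statement concrete, here is a minimal sketch comparing a fleet held permanently at peak size with one that scales up only while an attack lasts. Every number in it (node count, hourly price, burst duration) is a hypothetical placeholder chosen for illustration, not a figure from Akamai or Cast AI.

```python
def monthly_cost(baseline_nodes, node_hour_price, hours, peak_multiplier, peak_hours):
    """Return (always_peak, scale_on_demand) cost for one billing period.

    always_peak: the fleet is sized for the peak during every hour.
    scale_on_demand: the fleet runs at baseline size and grows to peak
    size only during the hours in which the burst actually happens.
    """
    always_peak = baseline_nodes * peak_multiplier * node_hour_price * hours
    scale_on_demand = node_hour_price * baseline_nodes * (
        (hours - peak_hours) + peak_multiplier * peak_hours
    )
    return always_peak, scale_on_demand


# Hypothetical inputs: 10 baseline nodes at $0.50 per node-hour, a 720-hour
# month, and a single 2-hour burst that needs 1,000X the baseline capacity.
always_peak, on_demand = monthly_cost(
    baseline_nodes=10,
    node_hour_price=0.5,
    hours=720,
    peak_multiplier=1000,
    peak_hours=2,
)
print(f"provisioned for peak all month: ${always_peak:,.0f}")
print(f"scaled up only during the burst: ${on_demand:,.0f}")
```

Under these assumed inputs the always-at-peak fleet costs several hundred times more than the one that scales on demand, which is the gap the quote points at.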

His team considered optimizing on the code side, but the inherent complexity of their business model required focusing on the core infrastructure itself.

Automatically optimizing the Kubernetes infrastructure

What Akamai really needed was a Kubernetes automation platform that could optimize the cost of running its entire core infrastructure in real time across several clouds, Shavit explained, scaling applications up and down based on constantly changing demand. But all this had to be done without sacrificing application performance.
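As a rough illustration of demand-based scaling, the sketch below applies the replica formula documented for the standard Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric). It is not Cast AI's method, which the article does not describe. The function name, the clamping bounds and the sample values are assumptions, and the real autoscaler's tolerance band and stabilization windows are left out.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=100):
    """Scale the replica count in proportion to how far the observed
    metric is from its target, then clamp to the allowed range."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical readings: average CPU utilization in percent, target 60%.
scale_up = desired_replicas(4, current_metric=90, target_metric=60)
scale_down = desired_replicas(4, current_metric=30, target_metric=60)
capped = desired_replicas(4, current_metric=6000, target_metric=60)
print(scale_up, scale_down, capped)
```

The third call shows why an upper bound matters: a 100X spike in the metric would ask for 400 replicas, and the clamp keeps the request within whatever limit the operator has set.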

Before implementing Cast, Shavit noted, Akamai's DevOps team manually tuned all of its Kubernetes workloads just a few times a month. Given the scale and complexity of its infrastructure, this was difficult and costly. By analyzing workloads only sporadically, the team clearly missed any real-time optimization potential.

“Now, hundreds of Cast agents do the same tuning, except they do it every second of every day,” Shavit said.

The core APA features Akamai uses are autoscaling, in-depth Kubernetes automation with bin packing (minimizing the number of bins used), automatic selection of the most cost-efficient compute instances, workload rightsizing, Spot instance automation throughout the entire instance lifecycle, and cost analytics capabilities.
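Bin packing in this setting means fitting pods onto as few nodes as possible. The sketch below uses first-fit decreasing, a textbook heuristic chosen here purely for illustration; the article does not say which algorithm Cast AI uses. The function name, the single CPU dimension and the millicore values are all assumptions.

```python
def first_fit_decreasing(pod_cpu_requests, node_cpu_capacity):
    """Pack pod CPU requests (in millicores) onto identical nodes.

    Pods are placed largest first, each on the first node that still has
    room; a new node is opened only when no existing node fits.
    Returns a list of nodes, each a list of the requests placed on it.
    """
    nodes = []  # pods assigned to each node
    free = []   # remaining capacity of each node
    for request in sorted(pod_cpu_requests, reverse=True):
        if request > node_cpu_capacity:
            raise ValueError(f"pod request {request} exceeds node capacity")
        for i, remaining in enumerate(free):
            if request <= remaining:
                nodes[i].append(request)
                free[i] -= request
                break
        else:
            # No existing node had room, so open a new one.
            nodes.append([request])
            free.append(node_cpu_capacity - request)
    return nodes


# Hypothetical workload: six pods, nodes with 2,000 millicores each.
placement = first_fit_decreasing([500, 1500, 700, 300, 2000, 1000], 2000)
print(len(placement), "nodes:", placement)
```

A production scheduler also has to weigh memory, affinity rules and disruption budgets; the single-dimension version only shows the core idea of filling nodes tightly so that surplus ones can be released.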

“We got insight into cost analytics two minutes in, which is something we had never seen before,” Shavit said. “Once the active agents were deployed, the optimization kicked in automatically, and the savings started to come in.”

Shavit noted that Spot instances, where enterprises can access unused cloud capacity at discounted prices, obviously made business sense, but they turned out to be complicated because of Akamai's complex workloads, particularly Apache Spark. This meant the team needed to either overengineer workloads or put more hands on them, which turned out to be financially counterintuitive.

With Cast AI, they were able to use Spot instances on Spark with “zero investment” from the engineering team or operations. The value of Spot instances was “very clear”; they only needed to find the right tool to be able to use them. Shavit noted that this was one of the reasons they moved forward with Cast.

While saving 2X or 3X on the cloud bill is great, Shavit pointed out that automation without manual intervention is “invaluable.” It has led to “huge” time savings.

Before implementing Cast AI, his team was “constantly moving around knobs and switches” to ensure that their production environments and customers were up to par with the service they needed to invest in.

“The biggest benefit is the fact that we do not need to manage our infrastructure anymore,” said Shavit. “Cast's team of agents is now doing this for us. That has freed our team to focus on what matters most: releasing features faster to our customers.”

Editor’s note: At this month’s VB Transform, Google Cloud's CTO and Highmark Health SVP and chief analyst Richard Clark will discuss the new AI stack in healthcare and the real-world challenges of deploying multi-model AI in a complex environment.



2025-06-16 23:11:00
