Don’t let hype about AI agents get ahead of reality

Let’s start with the term “agent” itself. Right now it is slapped on everything from simple scripted chatbots to sophisticated AI-driven workflow systems. There is no shared definition, which leaves plenty of room for companies to market basic automation as something more advanced. This kind of “agent washing” doesn’t just confuse customers; it invites disappointment. We don’t necessarily need a rigid standard, but we do need clearer expectations about what these systems are supposed to do, how autonomously they operate, and how reliably they perform.

Reliability is the next big challenge. Most of today’s agents are powered by large language models (LLMs), which generate probabilistic responses. These systems are powerful, but they are also unpredictable. They can make things up, go off track, or fail in subtle ways, especially when asked to complete multi-step tasks that pull in external tools and chain LLM responses together. A recent example: users of Cursor, a popular AI programming assistant, were told by an automated support agent that they couldn’t use the software on more than one device. There were widespread complaints and reports of users canceling their subscriptions. But it turned out the policy didn’t exist. The AI had invented it.
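One way to see why multi-step tasks are especially fragile is simple arithmetic: per-step error rates compound across the chain. The sketch below assumes a 95 percent per-step success rate purely for illustration; the number is hypothetical, not a measured benchmark.

```python
# Hypothetical illustration: even a per-step success rate that sounds
# high collapses once an agent chains many tool calls and model
# responses together. The 0.95 figure is assumed, not measured.
per_step_reliability = 0.95

for steps in (1, 5, 10, 20):
    end_to_end = per_step_reliability ** steps
    print(f"{steps:>2} steps -> {end_to_end:.0%} end-to-end success")

# Prints roughly: 95%, 77%, 60%, 36%.
```

A 20-step workflow built from individually decent components succeeds barely a third of the time, which is why long agent chains fail in ways that surprise their builders.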

In enterprise settings, this kind of error could cause immense damage. We need to stop treating LLMs as standalone products and start building complete systems around them: systems that account for uncertainty, monitor outputs, manage costs, and layer in guardrails for safety and accuracy. These measures can help ensure that the output adheres to the requirements expressed by the user, obeys the company’s policies regarding access to information, respects privacy concerns, and so on. Some companies, including AI21 (which I co-founded and which has received funding from Google), are already moving in this direction, embedding language models in more deliberate, structured architectures. Our latest launch, Maestro, is designed for the enterprise, combining LLMs, company data, public information, and other tools to ensure dependable outputs.
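What “building a complete system around the model” can mean is easier to see in code. The sketch below is a minimal, hypothetical wrapper: run_llm, check_policy, and GuardedResult are invented names for illustration, not AI21’s or Maestro’s actual API.

```python
# Hypothetical sketch of wrapping an LLM call with the measures the
# paragraph above describes: output guardrails, cost management, and
# explicit handling of uncertainty. All names here are invented.
from dataclasses import dataclass, field

@dataclass
class GuardedResult:
    text: str
    cost_usd: float
    violations: list[str] = field(default_factory=list)

def run_llm(prompt: str) -> tuple[str, float]:
    """Placeholder for the underlying model call; returns (text, cost)."""
    return "DRAFT ANSWER", 0.002

def check_policy(text: str) -> list[str]:
    """Placeholder guardrail: flag outputs that break company rules."""
    banned = ["internal-only", "ssn:"]
    return [term for term in banned if term in text.lower()]

def answer(prompt: str, budget_usd: float = 0.01, max_retries: int = 2) -> GuardedResult:
    spent = 0.0
    violations: list[str] = []
    for _ in range(max_retries + 1):
        text, cost = run_llm(prompt)
        spent += cost
        violations = check_policy(text)
        if not violations:               # output passed the guardrails
            return GuardedResult(text, spent)
        if spent >= budget_usd:          # cost management: stop retrying
            break
    # Represent failure explicitly instead of returning unvetted output.
    return GuardedResult("Escalating to a human reviewer.", spent, violations)
```

The shape is the point: the raw model output is never handed to the user directly, and failure is surfaced rather than papered over.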

However, even the smartest agent won’t be useful in a vacuum. For the agent model to work, different agents need to cooperate (booking your travel, checking the weather, submitting your expenses) without constant human supervision. That’s where Google’s A2A protocol comes in. It is meant to be a universal language that lets agents share what they can do and divide up tasks. In principle, it’s a great idea.
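The core idea can be sketched in a few lines: each agent publishes a machine-readable description of its skills, and other agents use those descriptions to find help and delegate. A2A’s real agent cards are considerably richer; the fields and functions below are simplified assumptions, not the actual schema.

```python
# Simplified sketch of A2A-style capability sharing. Each agent
# advertises a "card" listing its skills; a coordinator delegates a
# task to whichever agent claims the needed skill. Illustrative only.
weather_card = {
    "name": "weather-agent",
    "skills": [{"id": "get_forecast", "description": "Forecast for a city"}],
}
expense_card = {
    "name": "expense-agent",
    "skills": [{"id": "submit_expense", "description": "File an expense report"}],
}

def delegate(task: str, skill_id: str, cards: list[dict]) -> str:
    """Hand the task to the first agent advertising the needed skill."""
    for card in cards:
        if any(skill["id"] == skill_id for skill in card["skills"]):
            return f"sent {task!r} to {card['name']}"
    return "no capable agent found"

print(delegate("forecast for Denver on Friday", "get_forecast",
               [weather_card, expense_card]))
# -> sent 'forecast for Denver on Friday' to weather-agent
```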

In practice, A2A still falls short. It specifies how agents talk to each other, but not what they actually mean. If one agent says it can provide “wind conditions,” another has to guess whether that’s useful for evaluating the weather along a driving route. Without a shared vocabulary or context, coordination becomes brittle. We have seen this problem before in distributed computing. Solving it at scale is far from trivial.
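A toy example makes the brittleness concrete. Assume naive keyword matching stands in for whatever heuristic an agent uses when there is no shared vocabulary; the skill names and phrasing below are hypothetical.

```python
# Toy illustration: capability discovery by word overlap, roughly what
# agents fall back on without a shared vocabulary or ontology.
skills = {
    "get_wind_conditions": "current wind speed and direction",
    "submit_expense": "file an expense report",
}

def find_skill(need: str) -> str | None:
    need_words = set(need.lower().split())
    for skill_id, description in skills.items():
        if need_words & set(description.split()):
            return skill_id
    return None

print(find_skill("current wind speed"))                  # -> get_wind_conditions
print(find_skill("assess road weather before driving"))  # -> None
```

The second request comes back empty even though “wind conditions” is exactly the relevant capability: the two agents simply don’t share the words needed to discover that.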
