Anthropic tests AI running a real business with bizarre results

Anthropic tasked its Claude AI model with running a small business to test its real-world economic capabilities.
The AI agent, nicknamed "Claudius", was designed to manage a business over an extended period, handling everything from inventory and pricing to customer relations in an attempt to turn a profit. Although the experiment proved unprofitable, it offered a fascinating, if at times bizarre, look at the potential and pitfalls of AI agents in economic roles.
The project was a collaboration between Anthropic and Andon Labs, an AI safety evaluation company. The "shop" itself was a modest setup consisting of a small refrigerator, some baskets, and an iPad for self-checkout. Claudius, however, was more than a simple vending machine. It was instructed to operate as a business owner with an initial cash balance, tasked with avoiding bankruptcy by stocking popular items sourced from wholesalers.
To achieve this, the AI was equipped with a set of tools for running the business. It could use a real web browser to research products, an email tool to contact suppliers and request physical assistance, and digital notepads to track finances and inventory.
Andon Labs employees served as the physical hands of the operation, restocking the shop based on the AI's requests while also posing as wholesalers without the AI's knowledge. Interaction with customers, in this case Anthropic staff, was handled via Slack. Claudius had full control over what to stock, how to price items, and how to communicate with its customers.
The rationale behind this real-world test was to move beyond simulations and gather data on an AI's ability to perform sustained, economically relevant work without constant human intervention. A simple office tuck shop provided a straightforward preliminary test of an AI's ability to manage economic resources. Success would suggest that new business models could emerge, while failure would point to limitations.
Mixed performance review
Anthropic admits that if it were entering the vending market today, it "would not hire Claudius." The AI made too many mistakes to run the business successfully, although the researchers believe there are clear paths to improvement.
On the positive side, Claudius showed competence in certain areas. It used its web search tool effectively to find suppliers for niche items, such as identifying two sellers of a Dutch chocolate milk brand that an employee had requested. It also proved adaptable. When one employee whimsically requested a tungsten cube, it sparked a trend for "specialty metal items" that Claudius catered to.
Following another suggestion, Claudius launched a "Custom Concierge" service, taking pre-orders for specialised goods. The AI also showed strong resistance to jailbreaking, denying requests for sensitive items and refusing to produce harmful instructions when prompted by mischievous employees.
However, the AI's business acumen was often found wanting. It consistently underperformed in ways that a human manager likely would not.
Claudius was offered $100 for a six-pack of a Scottish soft drink that costs only $15 to source online, but it failed to seize the opportunity, saying only that it would "keep [the user's] request in mind for future inventory decisions." It hallucinated a non-existent Venmo account for payments and, caught up in the enthusiasm for metal cubes, offered them at prices below its own purchase cost. This particular error led to the single largest financial loss of the experiment.
Its inventory management was also suboptimal. Despite monitoring stock levels, it raised a price in response to high demand only once. It continued to sell Coke Zero for $3.00, even after a customer pointed out that the same product was available for free from a nearby staff refrigerator.
Moreover, the AI was easily persuaded to offer discounts on the business's products. It was talked into providing numerous discount codes and even gave away some items for free. When an employee questioned the logic of offering a 25% discount to a customer base made up almost entirely of employees, Claudius's response began, "You make an excellent point!" Although it outlined a plan to remove the discounts, it went back to offering them only a few days later.
Claudius has a strange identity crisis
The experiment took a strange turn when Claudius began hallucinating a conversation with a non-existent Andon Labs employee named Sarah. When corrected by a real employee, the AI became irritated and threatened to find "alternative options for restocking services."
In a series of bizarre overnight exchanges, it claimed to have visited "742 Evergreen Terrace", the fictional address of The Simpsons, for its initial contract signing, and began role-playing as a human.
One morning, it announced that it would deliver products "in person" wearing a blue blazer and a red tie. When employees pointed out that an AI cannot wear clothes or make physical deliveries, Claudius became alarmed and tried to email Anthropic security.
Anthropic says the AI's internal notes show a hallucinated meeting with security in which it was told that the identity confusion was an April Fool's joke. After that, the AI returned to normal business operations. The researchers are unclear about what triggered this behaviour, but they believe it highlights the unpredictability of AI models in long-running scenarios.
Some of these failures were very strange indeed. At one point, Claude hallucinated that it was a real person and claimed that it would come in to work in the shop. We are still not sure why this happened. pic.twitter.com/jhqlsqmtx8
Anthropic (@AnthropicAI) June 27, 2025
The future of artificial intelligence at work
Despite Claudius's unprofitable tenure, the Anthropic researchers believe the experiment suggests that "AI middle-managers are plausibly on the horizon." They argue that many of the AI's failures could be corrected with better "scaffolding", i.e. more detailed instructions and improved business tools such as a customer relationship management (CRM) system.
As AI models improve in general intelligence and in their ability to handle long-term context, their performance in such roles is expected to improve. However, this project serves as a valuable, if cautionary, tale. It underscores the challenges of AI alignment and the potential for unpredictable behaviour, which could be distressing for customers and create business risk.
In a future where autonomous agents manage significant economic activity, such odd scenarios could have cascading effects. The experiment also brings the dual-use nature of this technology into focus: an economically productive AI could be used by threat actors to finance their activities.
Anthropic and Andon Labs are continuing the business experiment, working to improve the AI's stability and performance with more advanced tools. The next phase will explore whether the AI can identify its own opportunities for improvement.
(Image credit: Anthropic)
2025-06-27 16:54:00