
Anthropic says some Claude models can now end ‘harmful or abusive’ conversations 

Anthropic has announced new capabilities that allow some of its latest models to end conversations in what the company describes as “rare, extreme cases of persistently harmful or abusive user interactions.” Strikingly, Anthropic says it is doing this not to protect the human user, but rather the AI model itself.

To be clear, the company is not claiming that its Claude AI models are sentient or can be harmed by their conversations with users. In its own words, Anthropic remains “highly uncertain about the potential moral status of Claude and other LLMs, now or in the future.”

However, the announcement points to a recent program created to study what the company calls “model welfare,” and says Anthropic is essentially taking a just-in-case approach, “working to identify and implement low-cost interventions to mitigate risks to model welfare, in case such welfare is possible.”

This latest change is currently limited to Claude Opus 4 and 4.1. And again, it is only supposed to occur in “extreme edge cases,” such as “requests from users for sexual content involving minors and attempts to solicit information that would enable large-scale violence or acts of terror.”

While those types of requests could create legal or publicity problems for Anthropic itself (witness recent reporting on how ChatGPT can reinforce or contribute to its users’ delusional thinking), the company says that in pre-deployment testing, Claude Opus 4 showed a “strong preference against” responding to these requests and a “pattern of apparent distress” when it did so.

As for these new conversation-ending capabilities, the company says: “In all cases, Claude is only to use its conversation-ending ability as a last resort when multiple attempts at redirection have failed and hope of a productive interaction has been exhausted, or when a user explicitly asks Claude to end a chat.”

Anthropic also says Claude has been directed not to use this ability in cases where users might be at imminent risk of harming themselves or others.


When Claude does end a conversation, Anthropic says users will still be able to start new conversations from the same account, and to create new branches of the troublesome conversation by editing their responses.

“We are treating this feature as an ongoing experiment and will continue refining our approach,” the company says.


2025-08-16 15:50:00
