Anthropic Shredded Millions of Physical Books to Train its AI

Today in schnozz-smashingly on-the-nose metaphors for the AI industry’s rapacious destruction of the arts: exactly how Anthropic gathered the data it needed to train its Claude AI model.
As Ars Technica reports, the Google-backed startup didn’t just crib from millions of copyrighted books, a practice that is ethically and legally fraught on its own. No: it cut the book pages out of their bindings, scanned them to make digital files, then threw away all those millions of pages of the original texts. Saying that the AI “devoured” these books wouldn’t merely be colorful language.
The practice was revealed in a copyright ruling on Monday, which turned out to be a major win for Anthropic and the data-hungry tech industry at large. The judge presiding over the case, US District Judge William Alsup, found that Anthropic can train its large language models on books that it bought legally, even without the authors’ explicit permission.
The decision owes, in part, to Anthropic’s method of destructive book scanning, which it is far from the first company to use, according to Ars, but which is notable for its massive scale. In short, it takes advantage of a legal concept known as the first-sale doctrine, which allows a buyer to do what they want with their purchase without the copyright holder intervening. This rule is what allows the secondhand market to exist; otherwise a book publisher, for example, might demand a cut of resales or prevent its books from being resold.
Leave it to AI companies, though, to use this in bad faith. According to the court filing, Anthropic hired Tom Turvey, the former head of partnerships for Google’s book-scanning project, in February 2024 to obtain “all the books in the world” without running into a “legal/practice/business slog,” as Anthropic CEO Dario Amodei described it, per the filing. Turvey came up with a workaround. By buying physical books, Anthropic would be protected by the first-sale doctrine and would no longer have to obtain a license. Stripping the pages out allowed for cheaper and easier scanning. Since Anthropic only used the scanned books internally and tossed out the copies afterward, the judge found this process to be akin to “conserv[ing] space,” Ars noted, meaning it was transformative. Ergo, it’s legally okay.
It’s a specious workaround and flagrantly hypocritical, of course. When Anthropic first got started, the startup took the even more unscrupulous approach of downloading millions of pirated books to feed its AI. Meta did this with millions of pirated books, too, for which it is currently being sued by a group of authors.
It’s also lazy and careless. As Ars notes, plenty of archivists have pioneered approaches for scanning books en masse without having to destroy or alter the originals, including the Internet Archive and Google’s own Google Books (which not too long ago was the subject of its own major copyright battle).
But anything to save a few bucks, and to get that all-too-precious training data. Indeed, the AI industry is running out of high-quality sources of food for its AI, not least because it has been so shortsighted in chowing through them. So screwing over some authors and sending some books to the shredder is, for Big Tech, a small price to pay.
More on AI: Microsoft Faces an Incredibly Embarrassing Problem With Its AI
2025-06-29 17:30:00