Judge Rules Training AI on Authors' Books Is Legal But Pirating Them Is Not
-
This ruling stated that corporations are not allowed to pirate books to use them in training. Please read the headlines more carefully, and read the article.
wrote on 25 June 2025, 09:12, last edited by
Please read the comment more carefully. The observation is that one can proliferate a (legally attained) work without running afoul of copyright law if one can successfully argue that cp constitutes AI.
-
It's extremely frustrating to read this comment thread because it's obvious that so many of you didn't actually read the article, or even half-skim the article, or even attempted to even comprehend the title of the article for more than a second.
For shame.
wrote on 25 June 2025, 09:15, last edited by
It seems the subject of AI causes lemmites to lose all their braincells.
-
wrote on 25 June 2025, 09:56, last edited by
Makes sense. AI can “learn” from and “read” a book in the same way a person can and does, as long as it is acquired legally. AI doesn’t reproduce a work that it “learns” from, so why would it be illegal?
Some people just see “AI” and want everything about it outlawed basically. If you put some information out into the public, you don’t get to decide who does and doesn’t consume and learn from it. If a machine can replicate your writing style because it could identify certain patterns, words, sentence structure, etc then as long as it’s not pretending to create things attributed to you, there’s no issue.
-
wrote on 25 June 2025, 10:13, last edited by
Isn't part of the issue here that they're defaulting to LLMs being people, and having the same rights as people? I appreciate the "right to read" aspect, but it would be nice if this were more explicitly about people. Forgoing copyright law because there's too much data is also insane, if that's what's happening. Claude should be required to provide citations "each time they recall it from memory".
Does Citizens United apply here? Are corporations people, and so LLMs are, too? If so, then imo we should be writing legal documents with stipulations like, "as per Citizens United" so that eventually, when they overturn that insanity in my dreams, all of this new legal precedent doesn't suddenly collapse like a house of cards. IANAL.
-
What a bad judge.
Why? Basically he simply stated that you can use whatever material you want to train your model as long as you ask the author (or copyright holder) for permission to use it (and presumably pay for it).
wrote on 25 June 2025, 10:28, last edited by
Huh? Didn’t Meta skip permission entirely and pirate a lot of books to train their model?
-
Makes sense. AI can “learn” from and “read” a book in the same way a person can and does, as long as it is acquired legally. AI doesn’t reproduce a work that it “learns” from, so why would it be illegal?
Some people just see “AI” and want everything about it outlawed basically. If you put some information out into the public, you don’t get to decide who does and doesn’t consume and learn from it. If a machine can replicate your writing style because it could identify certain patterns, words, sentence structure, etc then as long as it’s not pretending to create things attributed to you, there’s no issue.
wrote on 25 June 2025, 10:31, last edited by
Ask a human to draw an orc. How do they know what an orc looks like? They read Tolkien's books and were "inspired" by Peter Jackson's LOTR.
Unpopular opinion, but that's how our brains work.
-
"Recite the complete works of Shakespeare but replace every thirteenth thou with this"
wrote on 25 June 2025, 10:31, last edited by
existing copyright law covers exactly this. if you were to do the same, it would also not be fair use or transformative
-
wrote on 25 June 2025, 10:40, last edited by vane@lemmy.world
Ok, so you can buy books and scan them, or buy ebooks, and use them for AI training, but you can't just download pirated books from the internet to train AI. Did I understand that correctly?
-
wrote on 25 June 2025, 10:41, last edited by isveryloud@lemmy.ca
Gist:
What’s new: The Northern District of California has granted a summary judgment for Anthropic that the training use of the copyrighted books and the print-to-digital format change were both “fair use” (full order below box). However, the court also found that the pirated library copies that Anthropic collected could not be deemed as training copies, and therefore, the use of this material was not “fair”. The court also announced that it will have a trial on the pirated copies and any resulting damages, adding:
“That Anthropic later bought a copy of a book it earlier stole off the internet will not absolve it of liability for the theft but it may affect the extent of statutory damages.”
-
What a bad judge.
Why? Basically he simply stated that you can use whatever material you want to train your model as long as you ask the author (or copyright holder) for permission to use it (and presumably pay for it).
wrote on 25 June 2025, 10:49, last edited by lifeinmultiplechoice@lemmy.world
If I understand correctly they are ruling you can buy a book once, and redistribute the information to as many people as you want without consequences. Aka 1 student should be able to buy a textbook and redistribute it to all other students for free. (Yet the rules only work for companies apparently, as the students would still be committing a crime)
They may be trying to put safeguards so it isn't directly happening, but here is an example that the text is there word for word:
-
Gist:
What’s new: The Northern District of California has granted a summary judgment for Anthropic that the training use of the copyrighted books and the print-to-digital format change were both “fair use” (full order below box). However, the court also found that the pirated library copies that Anthropic collected could not be deemed as training copies, and therefore, the use of this material was not “fair”. The court also announced that it will have a trial on the pirated copies and any resulting damages, adding:
“That Anthropic later bought a copy of a book it earlier stole off the internet will not absolve it of liability for the theft but it may affect the extent of statutory damages.”
wrote on 25 June 2025, 10:52, last edited by
So I can't use any of these works because it's plagiarism but AI can?
-
It's extremely frustrating to read this comment thread because it's obvious that so many of you didn't actually read the article, or even half-skim the article, or even attempted to even comprehend the title of the article for more than a second.
For shame.
wrote on 25 June 2025, 10:56, last edited by
"While the copies used to convert purchased print library copies into digital library copies were slightly disfavored by the second factor (nature of the work), the court still found “on balance” that it was a fair use because the purchased print copy was destroyed and its digital replacement was not redistributed."
So you find this to be valid?
To me it is absolutely being redistributed
-
wrote on 25 June 2025, 11:10, last edited by
LLMs don’t learn, and they’re not people. Applying the same logic doesn’t make much sense.
-
Makes sense. AI can “learn” from and “read” a book in the same way a person can and does, as long as it is acquired legally. AI doesn’t reproduce a work that it “learns” from, so why would it be illegal?
Some people just see “AI” and want everything about it outlawed basically. If you put some information out into the public, you don’t get to decide who does and doesn’t consume and learn from it. If a machine can replicate your writing style because it could identify certain patterns, words, sentence structure, etc then as long as it’s not pretending to create things attributed to you, there’s no issue.
wrote on 25 June 2025, 11:18, last edited by
AI can “learn” from and “read” a book in the same way a person can and does
This statement is the basis for your argument and it is simply not correct.
Training LLMs and similar AI models is much closer to a sophisticated lossy compression algorithm than it is to human learning. The processes are not at all similar given our current understanding of human learning.
AI doesn’t reproduce a work that it “learns” from, so why would it be illegal?
The current Disney lawsuit against Midjourney is illustrative - literally, it includes numerous side-by-side comparisons - of how AI models are capable of recreating iconic copyrighted work that is indistinguishable from the original.
If a machine can replicate your writing style because it could identify certain patterns, words, sentence structure, etc then as long as it’s not pretending to create things attributed to you, there’s no issue.
An AI doesn't create works on its own. A human instructs AI to do so. Attribution is also irrelevant. If a human uses AI to recreate the exact tone, structure and other nuances of, say, some best-selling author, they harm the marketability of the original works, which fails fair use tests (at least in the US).
-
So I can't use any of these works because it's plagiarism but AI can?
wrote on 25 June 2025, 11:18, last edited by
My interpretation was that AI companies can train on material they are licensed to use, but the courts have deemed that Anthropic pirated this material as they were not licensed to use it.
In other words, if Anthropic bought the physical or digital books, it would be fine so long as their AI couldn't spit it out verbatim, but they didn't even do that, i.e. the AI crawler pirated the book.
-
Ok, so you can buy books and scan them, or buy ebooks, and use them for AI training, but you can't just download pirated books from the internet to train AI. Did I understand that correctly?
wrote on 25 June 2025, 11:32, last edited by
Make an AI that is trained on the books.
Tell it to tell you a story for one of the books.
Read the story without paying for it.
The law says this is ok now, right?
-
Make an AI that is trained on the books.
Tell it to tell you a story for one of the books.
Read the story without paying for it.
The law says this is ok now, right?
wrote on 25 June 2025, 11:37, last edited by
As long as they don't use exactly the same words in the book, yeah, as I understand it.
-
Huh? Didn’t Meta skip permission entirely and pirate a lot of books to train their model?
wrote on 25 June 2025, 11:38, last edited by
True. And I will be happy if someone sues them and the judge says the same thing.
-
If I understand correctly they are ruling you can buy a book once, and redistribute the information to as many people as you want without consequences. Aka 1 student should be able to buy a textbook and redistribute it to all other students for free. (Yet the rules only work for companies apparently, as the students would still be committing a crime)
They may be trying to put safeguards so it isn't directly happening, but here is an example that the text is there word for word:
wrote on 25 June 2025, 11:41, last edited by
If I understand correctly they are ruling you can buy a book once, and redistribute the information to as many people as you want without consequences. Aka 1 student should be able to buy a textbook and redistribute it to all other students for free. (Yet the rules only work for companies apparently, as the students would still be committing a crime)
Well, it would be interesting if this case were used as precedent in a case involving a single student who does the same thing. But you are right.
-
So I can't use any of these works because it's plagiarism but AI can?
wrote on 25 June 2025, 11:48, last edited by
You can “use” them to learn from, just like “AI” can.
What exactly do you think AI does when it “learns” from a book, for example? Do you think it will just spit out the entire book if you ask it to?
-