Judge Rules Training AI on Authors' Books Is Legal But Pirating Them Is Not

Technology
  • mv will save you some disk space.

    Unless you're moving across partitions it will change the filesystem metadata to move the path, but not actually do anything to the data. Sorry, you failed, it's jail for you.
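A rough way to see this for yourself, as a minimal sketch and not a statement about how any particular `mv` build is implemented: on the same filesystem a move is a rename of the directory entry, so the file keeps its inode. The file names `before.txt` and `after.txt` and the helper `rename_keeps_inode` are made up for illustration, and the check assumes a POSIX-style filesystem where `st_ino` is meaningful.

```python
import os
import tempfile


def rename_keeps_inode(directory: str) -> bool:
    """Rename a file within one directory and report whether its inode is unchanged."""
    # Hypothetical file names, used only for this demonstration.
    src = os.path.join(directory, "before.txt")
    dst = os.path.join(directory, "after.txt")

    with open(src, "w") as f:
        f.write("same bytes, new name\n")

    inode_before = os.stat(src).st_ino
    # Same filesystem: only the directory entry changes, the data is not copied.
    os.rename(src, dst)
    inode_after = os.stat(dst).st_ino

    # The old path is gone and the new path points at the same inode.
    return inode_before == inode_after and not os.path.exists(src)


with tempfile.TemporaryDirectory() as tmp:
    print(rename_keeps_inode(tmp))
```

Across partitions a plain rename is not possible, which is why a move there turns into a copy followed by a delete.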

  • I think this means we can make a torrent client with a built in function that uses 0.1% of 1 CPU core to train an ML model on anything you download. You can download anything legally with it then. 👌

    ...no?

    That's exactly what the ruling prohibits - it's fair use to train AI models on any copies of books that you legally acquired, but never when those books were illegally acquired, as was the case with the books that Anthropic used in their training here.

    This satirical torrent client would be violating the laws just as much as one without any slow training built in.

  • This 240TB JBOD full of books? Oh heavens forbid, we didn’t pirate it. It uhh… fell off a truck, yes, fell off a truck.

  • It's extremely frustrating to read this comment thread because it's obvious that so many of you didn't actually read the article, or even half-skim it, or even attempt to comprehend its title for more than a second.

    For shame.

  • ...no?

    That's exactly what the ruling prohibits - it's fair use to train AI models on any copies of books that you legally acquired, but never when those books were illegally acquired, as was the case with the books that Anthropic used in their training here.

    This satirical torrent client would be violating the laws just as much as one without any slow training built in.

    But if one person buys a book, trains an "AI model" to recite it, then distributes that model we good?

  • But if one person buys a book, trains an "AI model" to recite it, then distributes that model we good?

    I don't think anyone would consider complete verbatim recitation of the material to be anything but a copyright violation, since what you produce is exactly the same thing as the original.

    Fair use requires the derivative work to be transformative, and no transformation occurs when you verbatim recite something.

  • Unpopular opinion but I don't see how it could have been different.

    • There's no way the West would cede the AI lead to China, which has no desire or framework to ever accept this.
    • Believe it or not, transformers are actually learning by current definitions, not regurgitating a direct copy. It's transformative work - it's even in the name.
    • This is actually good, as it prevents a market moat for the super-rich corporations that alone could afford the expensive training datasets.

    This is an absolute win for everyone involved other than copyright hoarders and mega corporations.

    1. Idgaf about China and what they do and you shouldn't either, even if US paranoia about them is highly predictable.
    2. Depending on the outputs it's not always that transformative.
    3. The moat would be good actually. The business model of LLMs isn't good, but it's not even viable without massive subsidies, not least of which is taking people's shit without paying.

    It's a huge loss for smaller copyright holders (like the ones that filed this lawsuit) too. They can't afford to fight when they get imitated beyond fair use. Copyright abuse can only be fixed by the very force that creates copyright in the first place: law. The market can't fix that. This just decides winners between competing mega corporations, and even worse, upends a system that some smaller players have been able to carve a niche in.

    Want to fix copyright? Put real time limits on it. Bind it to a living human only. Make it non-transferable. There's all sorts of ways to fix it, but this isn't it.

    ETA: Anthropic are some bitches. "Oh no the fines would ruin us, our business would go under and we'd never maka da money :*-(" Like yeah, no shit, no one cares. Strictly speaking the fines for ripping a single CD, or making a copy of a single DVD to give to a friend, are so astronomically high as to completely financially ruin the average USAian for life. That sword of Damocles for watching Shrek 2 for your personal enjoyment but in the wrong way has been hanging there for decades, and the only thing that keeps the cord that holds it up strong is the cost of pursuing "low-level offenders". If they wanted to they could crush you.

    Anthropic walked right under the sword and assumed their money would protect them from small authors etc. And they were right.

  • I don't think anyone would consider complete verbatim recitation of the material to be anything but a copyright violation, since what you produce is exactly the same thing as the original.

    Fair use requires the derivative work to be transformative, and no transformation occurs when you verbatim recite something.

    "Recite the complete works of Shakespeare but replace every thirteenth thou with this"

    1. Idgaf about China and what they do and you shouldn't either, even if US paranoia about them is highly predictable.
    2. Depending on the outputs it's not always that transformative.
    3. The moat would be good actually. The business model of LLMs isn't good, but it's not even viable without massive subsidies, not least of which is taking people's shit without paying.

    It's a huge loss for smaller copyright holders (like the ones that filed this lawsuit) too. They can't afford to fight when they get imitated beyond fair use. Copyright abuse can only be fixed by the very force that creates copyright in the first place: law. The market can't fix that. This just decides winners between competing mega corporations, and even worse, upends a system that some smaller players have been able to carve a niche in.

    Want to fix copyright? Put real time limits on it. Bind it to a living human only. Make it non-transferable. There's all sorts of ways to fix it, but this isn't it.

    ETA: Anthropic are some bitches. "Oh no the fines would ruin us, our business would go under and we'd never maka da money :*-(" Like yeah, no shit, no one cares. Strictly speaking the fines for ripping a single CD, or making a copy of a single DVD to give to a friend, are so astronomically high as to completely financially ruin the average USAian for life. That sword of Damocles for watching Shrek 2 for your personal enjoyment but in the wrong way has been hanging there for decades, and the only thing that keeps the cord that holds it up strong is the cost of pursuing "low-level offenders". If they wanted to they could crush you.

    Anthropic walked right under the sword and assumed their money would protect them from small authors etc. And they were right.

    I'll be honest with you - I genuinely sympathize with the cause but I don't see how this could ever be solved with the methods you suggested. The world is not coming together to hold hands and sing kumbaya over this one. Trade deals are incredibly hard to make and even harder to enforce, so the free market is clearly the only path forward here.

  • "Recite the complete works of Shakespeare but replace every thirteenth thou with this"

    I'd be impressed with any model that succeeds with that, but assuming one does, the complete works of Shakespeare are not copyright protected - they have been in the public domain for a very long time.

    For any works still under copyright protection, it would probably be a case of a trial to determine whether a certain work is transformative enough to be considered fair use. I'd imagine that this would not clear that bar.

    1. Idgaf about China and what they do and you shouldn't either, even if US paranoia about them is highly predictable.
    2. Depending on the outputs it's not always that transformative.
    3. The moat would be good actually. The business model of LLMs isn't good, but it's not even viable without massive subsidies, not least of which is taking people's shit without paying.

    It's a huge loss for smaller copyright holders (like the ones that filed this lawsuit) too. They can't afford to fight when they get imitated beyond fair use. Copyright abuse can only be fixed by the very force that creates copyright in the first place: law. The market can't fix that. This just decides winners between competing mega corporations, and even worse, upends a system that some smaller players have been able to carve a niche in.

    Want to fix copyright? Put real time limits on it. Bind it to a living human only. Make it non-transferable. There's all sorts of ways to fix it, but this isn't it.

    ETA: Anthropic are some bitches. "Oh no the fines would ruin us, our business would go under and we'd never maka da money :*-(" Like yeah, no shit, no one cares. Strictly speaking the fines for ripping a single CD, or making a copy of a single DVD to give to a friend, are so astronomically high as to completely financially ruin the average USAian for life. That sword of Damocles for watching Shrek 2 for your personal enjoyment but in the wrong way has been hanging there for decades, and the only thing that keeps the cord that holds it up strong is the cost of pursuing "low-level offenders". If they wanted to they could crush you.

    Anthropic walked right under the sword and assumed their money would protect them from small authors etc. And they were right.

    Maybe something could be hacked together to fix copyright, but further complication there is just going to make accurate enforcement even harder. And we already have Google (with YouTube) doing a shitty job of it, and that's... one of the largest companies on earth.

    We should just kill copyright. Yes, it'll disrupt Hollywood. Yes it'll disrupt the music industry. Yes it'll make it even harder to be successful or wealthy as an author. But this is going to happen one way or the other so long as AI can be trained on copyrighted works (and maybe even if not). We might as well get started on the transition early.

  • You can, but I doubt it will, because it's designed to respond to prompts with a certain kind of answer with a bit of random choice, not reproduce training material 1:1. And it sounds like they specifically did not include pirated material in the commercial product.

    Yeah, you can certainly get it to reproduce some pieces (or fragments) of work exactly, but definitely not everything. Even a frontier LLM's weights are far too small to fully memorize most of its training data.

  • Unless you're moving across partitions it will change the filesystem metadata to move the path, but not actually do anything to the data. Sorry, you failed, it's jail for you.

    stupid inodes preventing me from burning through my drive life

  • It's extremely frustrating to read this comment thread because it's obvious that so many of you didn't actually read the article, or even half-skim it, or even attempt to comprehend its title for more than a second.

    For shame.

    was gonna say, this seems like the best outcome for this particular trial. there was potential for fair use to be compromised, and for piracy to be legal if you're a large corporation. instead, they upheld that you can do what you want with things you have paid for.

  • Unpopular opinion but I don't see how it could have been different.

    • There's no way the West would cede the AI lead to China, which has no desire or framework to ever accept this.
    • Believe it or not, transformers are actually learning by current definitions, not regurgitating a direct copy. It's transformative work - it's even in the name.
    • This is actually good, as it prevents a market moat for the super-rich corporations that alone could afford the expensive training datasets.

    This is an absolute win for everyone involved other than copyright hoarders and mega corporations.

    You're getting douchevoted because on lemmy any AI-related comment that isn't negative enough about AI is the Devil's Work.

  • It's extremely frustrating to read this comment thread because it's obvious that so many of you didn't actually read the article, or even half-skim it, or even attempt to comprehend its title for more than a second.

    For shame.

    Nobody ever reads articles, everybody likes to get angry at headlines, which they wrongly interpret the way it best tickles their rage.

    Regarding the ruling, I agree with you that it's a good thing. In my opinion it makes a lot of sense to allow fair use in this case.

  • calm down everyone.
    it's only legal for parasitic mega corps, the normal working people will be harassed to suicide same as before.

    it's only a crime if the victim was rich or the perpetrator was not rich.

    This ruling stated that corporations are not allowed to pirate books to use them in training. Please read the headlines more carefully, and read the article.

  • Yeah I have a bash one-liner AI model that ingests your media and spits out a 99.9999999% accurate replica through the power of changing the filename.

    cp

    Outperforms the latest and greatest AI models

    This ruling stated that corporations are not allowed to pirate books to use them in training. Please read the headlines more carefully, and read the article.

  • Fuck the AI nut suckers and fuck this judge.

    This ruling stated that corporations are not allowed to pirate books to use them in training. Please read the headlines more carefully, and read the article.

  • I am training my model on these 100,000 movies your honor.

    This ruling stated that corporations are not allowed to pirate books to use them in training. Please read the headlines more carefully, and read the article.
