Judge Rules Training AI on Authors' Books Is Legal But Pirating Them Is Not

  • Yeah, I have a Bash one-liner AI model that ingests your media and spits out a 99.9999999% accurate replica through the power of changing the filename.

    cp

    Outperforms the latest and greatest AI models

    I call this legally distinct, this is legal advice.

  • Unpopular opinion but I don't see how it could have been different.

    • There's no way the West would hand the AI lead to China, which has no desire or framework to ever accept this.
    • Believe it or not, transformers are actually learning by current definitions, not regurgitating a direct copy. It's transformative work - it's even in the name.
    • This is actually good, as it prevents a market moat where only super-rich corporations could afford the expensive training datasets.

    This is an absolute win for everyone involved other than copyright hoarders and mega corporations.

  • calm down everyone.
    it's only legal for parasitic mega corps, the normal working people will be harassed to suicide same as before.

    it's only a crime if the victim was rich or the perpetrator was not rich.

    Right. Where's the punishment for Meta who admitted to pirating books?

  • mv will save you some disk space.

    Unless you're moving across partitions it will change the filesystem metadata to move the path, but not actually do anything to the data. Sorry, you failed, it's jail for you.
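
For anyone who wants to check the rename-vs-copy point, here is a minimal sketch. It assumes a filesystem that reports stable inode numbers (typical on Linux and macOS), and the function and file names are made up for illustration. It renames a temp file within one directory, then copies it, and compares the inode numbers.

```python
import os
import shutil
import tempfile


def inode_after_move_and_copy():
    """Return (original, moved, copied) inode numbers for a small temp file.

    Hypothetical helper for illustration only; names are not from the thread.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "book.txt")
        with open(src, "w") as f:
            f.write("some text\n")
        original = os.stat(src).st_ino

        # Rename inside the same directory (so the same filesystem):
        # only the directory entry changes, the data is not rewritten.
        moved_path = os.path.join(tmp, "book_renamed.txt")
        os.rename(src, moved_path)
        moved = os.stat(moved_path).st_ino

        # A copy creates a second file and duplicates the data.
        copied_path = os.path.join(tmp, "book_copy.txt")
        shutil.copy(moved_path, copied_path)
        copied = os.stat(copied_path).st_ino

        return original, moved, copied
```

If the assumption holds, the first two numbers should match (same file, new name) and the third should differ (a genuinely new file). A move across partitions cannot be a plain rename, which is the exception mentioned above.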

  • I think this means we can make a torrent client with a built in function that uses 0.1% of 1 CPU core to train an ML model on anything you download. You can download anything legally with it then. 👌

    ...no?

    That's exactly what the ruling prohibits - it's fair use to train AI models on any copies of books that you legally acquired, but never when those books were illegally acquired, as was the case with the books that Anthropic used in their training here.

    This satirical torrent client would be violating the laws just as much as one without any slow training built in.

  • This 240TB JBOD full of books? Oh, heaven forbid, we didn’t pirate it. It uhh… fell off a truck, yes, fell off a truck.

  • It's extremely frustrating to read this comment thread because it's obvious that so many of you didn't actually read the article, or even half-skim it, or even attempt to comprehend its title for more than a second.

    For shame.

  • ...no?

    That's exactly what the ruling prohibits - it's fair use to train AI models on any copies of books that you legally acquired, but never when those books were illegally acquired, as was the case with the books that Anthropic used in their training here.

    This satirical torrent client would be violating the laws just as much as one without any slow training built in.

    But if one person buys a book, trains an "AI model" to recite it, then distributes that model we good?

  • But if one person buys a book, trains an "AI model" to recite it, then distributes that model we good?

    I don't think anyone would consider complete verbatim recitation of the material to be anything but a copyright violation, since what you produce is the exact same thing as the original.

    Fair use requires the derivative work to be transformative, and no transformation occurs when you recite something verbatim.

  • Unpopular opinion but I don't see how it could have been different.

    • There's no way the West would hand the AI lead to China, which has no desire or framework to ever accept this.
    • Believe it or not, transformers are actually learning by current definitions, not regurgitating a direct copy. It's transformative work - it's even in the name.
    • This is actually good, as it prevents a market moat where only super-rich corporations could afford the expensive training datasets.

    This is an absolute win for everyone involved other than copyright hoarders and mega corporations.

    1. Idgaf about China and what they do and you shouldn't either, even if US paranoia about them is highly predictable.
    2. Depending on the outputs it's not always that transformative.
    3. The moat would be good actually. The business model of LLMs isn't good, but it's not even viable without massive subsidies, not least of which is taking people's shit without paying.

    It's a huge loss for smaller copyright holders (like the ones that filed this lawsuit) too. They can't afford to fight when they get imitated beyond fair use. Copyright abuse can only be fixed by the very force that creates copyright in the first place: law. The market can't fix that. This just decides winners between competing mega corporations, and even worse, upends a system that some smaller players have been able to carve a niche in.

    Want to fix copyright? Put real time limits on it. Bind it to a living human only. Make it non-transferable. There are all sorts of ways to fix it, but this isn't it.

    ETA: Anthropic are some bitches. "Oh no the fines would ruin us, our business would go under and we'd never maka da money :*-(" Like yeah, no shit, no one cares. Strictly speaking the fines for ripping a single CD, or making a copy of a single DVD to give to a friend, are so astronomically high as to completely financially ruin the average USAian for life. That sword of Damocles for watching Shrek 2 for your personal enjoyment but in the wrong way has been hanging there for decades, and the only thing that keeps the cord that holds it up strong is the cost of pursuing "low-level offenders". If they wanted to they could crush you.

    Anthropic walked right under the sword and assumed their money would protect them from small authors etc. And they were right.

  • I don't think anyone would consider complete verbatim recitation of the material to be anything but a copyright violation, since what you produce is the exact same thing as the original.

    Fair use requires the derivative work to be transformative, and no transformation occurs when you recite something verbatim.

    "Recite the complete works of Shakespeare but replace every thirteenth thou with this"

  • I'll be honest with you - I genuinely sympathize with the cause, but I don't see how this could ever be solved with the methods you suggested. The world is not coming together to hold hands and sing Kumbaya out of this one. Trade deals are incredibly hard and even harder to enforce, so the free market is clearly the only path forward here.

  • "Recite the complete works of Shakespeare but replace every thirteenth thou with this"

    I'd be impressed with any model that succeeds with that, but assuming one does, the complete works of Shakespeare are not copyright protected - they fell into the public domain a very long time ago.

    For any works still under copyright protection, it would probably be a case of a trial to determine whether a certain work is transformative enough to be considered fair use. I'd imagine that this would not clear that bar.

  • Maybe something could be hacked together to fix copyright, but further complication there is just going to make accurate enforcement even harder. And we already have Google (in YouTube) doing a shitty job of it, and that's... one of the largest companies on earth.

    We should just kill copyright. Yes, it'll disrupt Hollywood. Yes it'll disrupt the music industry. Yes it'll make it even harder to be successful or wealthy as an author. But this is going to happen one way or the other so long as AI can be trained on copyrighted works (and maybe even if not). We might as well get started on the transition early.

  • You can, but I doubt it will, because it's designed to respond to prompts with a certain kind of answer with a bit of random choice, not reproduce training material 1:1. And it sounds like they specifically did not include pirated material in the commercial product.

    Yeah, you can certainly get it to reproduce some pieces (or fragments) of work exactly but definitely not everything. Even a frontier LLM's weights are far too small to fully memorize most of their training data.

  • Unless you're moving across partitions it will change the filesystem metadata to move the path, but not actually do anything to the data. Sorry, you failed, it's jail for you.

    stupid inodes preventing me from burning through my drive life

  • It's extremely frustrating to read this comment thread because it's obvious that so many of you didn't actually read the article, or even half-skim it, or even attempt to comprehend its title for more than a second.

    For shame.

    was gonna say, this seems like the best outcome for this particular trial. there was potential for fair use to be compromised, and for piracy to be legal if you're a large corporation. instead, they upheld that you can do what you want with things you have paid for.

  • Unpopular opinion but I don't see how it could have been different.

    • There's no way the West would hand the AI lead to China, which has no desire or framework to ever accept this.
    • Believe it or not, transformers are actually learning by current definitions, not regurgitating a direct copy. It's transformative work - it's even in the name.
    • This is actually good, as it prevents a market moat where only super-rich corporations could afford the expensive training datasets.

    This is an absolute win for everyone involved other than copyright hoarders and mega corporations.

    You're getting douchevoted because on lemmy any AI-related comment that isn't negative enough about AI is the Devil's Work.

  • It's extremely frustrating to read this comment thread because it's obvious that so many of you didn't actually read the article, or even half-skim it, or even attempt to comprehend its title for more than a second.

    For shame.

    Nobody ever reads articles; everybody likes to get angry at headlines, which they wrongly interpret in whatever way best tickles their rage.

    Regarding the ruling, I agree with you that it's a good thing. In my opinion it makes a lot of sense to allow fair use in this case.

  • calm down everyone.
    it's only legal for parasitic mega corps, the normal working people will be harassed to suicide same as before.

    it's only a crime if the victim was rich or the perpetrator was not rich.

    This ruling stated that corporations are not allowed to pirate books to use them in training. Please read the headlines more carefully, and read the article.
