Judge Rules Training AI on Authors' Books Is Legal But Pirating Them Is Not
-
And this is how you know that the American legal system should not be trusted.
Mind you, I am not saying this is an easy case; it's not. But the framing that piracy is wrong while for-profit ML training is not is clearly based on oligarch interests and demands.
This is an easy case. Using published works to train AI without paying for the right to do so is piracy. The judge making this determination is an idiot.
-
This is an easy case. Using published works to train AI without paying for the right to do so is piracy. The judge making this determination is an idiot.
You're right. When you're doing it for commercial gain, it's not fair use anymore. It's really not that complicated.
-
This post did not contain any content.
I think this means we can make a torrent client with a built-in function that uses 0.1% of one CPU core to train an ML model on anything you download. Then you can legally download anything with it.
-
I think this means we can make a torrent client with a built-in function that uses 0.1% of one CPU core to train an ML model on anything you download. Then you can legally download anything with it.
And thus the singularity was born.
-
And this is how you know that the American legal system should not be trusted.
Mind you, I am not saying this is an easy case; it's not. But the framing that piracy is wrong while for-profit ML training is not is clearly based on oligarch interests and demands.
The order seems to say that the trained LLM and the commercial Claude product are not linked, which supports the decision. But I'm not sure how he came to that conclusion. I'm going to have to read the full order when I have time.
This might be appealed, but I doubt it'll be taken up by SCOTUS until there are conflicting federal court rulings.
-
FTA:
Anthropic warned against “[t]he prospect of ruinous statutory damages—$150,000 times 5 million books”: that would mean $750 billion.
So part of their argument is actually that they stole so much that it would be impossible for them, or anyone, to pay restitution, so we should just let them off the hook.
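For what it's worth, the figure in the quoted filing checks out arithmetically (statutory maximum per infringed work times the number of books at issue); a quick sanity check:

```python
# Figures from the filing quoted above: $150,000 statutory maximum
# per infringed work, multiplied by roughly 5 million books.
per_work_damages = 150_000
num_books = 5_000_000

total = per_work_damages * num_books
print(f"${total:,}")  # prints $750,000,000,000 -- i.e., $750 billion
```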
-
That almost sounds right, doesn't it? If you want 5 million books, you can't just steal/pirate them; you need to buy 5 million copies. I'm glad the court ruled that way.
I feel that's a good start. Now we need clearer regulation on what fair use is, what counts as transformative work, and how that relates to AI. Since this is such a disruptive and profitable business, maybe those companies should pay something extra, not just what I pay for a book. But the first part, that "stealing" can't be "fair", is settled now.
-
FTA:
Anthropic warned against “[t]he prospect of ruinous statutory damages—$150,000 times 5 million books”: that would mean $750 billion.
So part of their argument is actually that they stole so much that it would be impossible for them, or anyone, to pay restitution, so we should just let them off the hook.
Funny how that kind of thing only works for rich people
-
So, let me see if I get this straight:
Books are inherently an artificial construct.
If I read the books I train the A(rtificially trained)Intelligence in my skull.
Therefore the concept of me getting them through "piracy" is null and void...
-
FTA:
Anthropic warned against “[t]he prospect of ruinous statutory damages—$150,000 times 5 million books”: that would mean $750 billion.
So part of their argument is actually that they stole so much that it would be impossible for them, or anyone, to pay restitution, so we should just let them off the hook.
Ah the old “owe $100 and the bank owns you; owe $100,000,000 and you own the bank” defense.
-
The order seems to say that the trained LLM and the commercial Claude product are not linked, which supports the decision. But I'm not sure how he came to that conclusion. I'm going to have to read the full order when I have time.
This might be appealed, but I doubt it'll be taken up by SCOTUS until there are conflicting federal court rulings.
If you are struggling for time, just put the opinion into ChatGPT and ask for a summary. It will save you tons of time.
-
Can I not just ask the trained AI to spit out the text of the book, verbatim?
-
You're right. When you're doing it for commercial gain, it's not fair use anymore. It's really not that complicated.
If you're using the minimum amount necessary, in a transformative way that doesn't compete with the original copyrighted source, then it's still fair use even if it's commercial. (That's not to say that's what LLMs are doing.)
-
“I torrented all this music and movies to train my local ai models”
-
That almost sounds right, doesn't it? If you want 5 million books, you can't just steal/pirate them; you need to buy 5 million copies. I'm glad the court ruled that way.
I feel that's a good start. Now we need clearer regulation on what fair use is, what counts as transformative work, and how that relates to AI. Since this is such a disruptive and profitable business, maybe those companies should pay something extra, not just what I pay for a book. But the first part, that "stealing" can't be "fair", is settled now.
If you want 5 million books, you can't just steal/pirate them, you need to buy 5 million copies. I'm glad the court ruled that way.
If you want 5 million books to train your AI to make you money, you can just steal them and reap the benefits of others' work. No need to buy 5 million copies!
/s
Jesus, dude. And for the record, I'm not suggesting people steal things. I'm saying that companies shouldn't get away with shittiness just because.
-
This was a preliminary judgment; he didn't actually rule on the piracy part. That he deferred to a full trial.
The claim that training itself was a copyright violation, though, he ruled against.
-
And this is how you know that the American legal system should not be trusted.
Mind you, I am not saying this is an easy case; it's not. But the framing that piracy is wrong while for-profit ML training is not is clearly based on oligarch interests and demands.
You should read the ruling in more detail, the judge explains the reasoning behind why he found the way that he did. For example:
Authors argue that using works to train Claude’s underlying LLMs was like using works to train any person to read and write, so Authors should be able to exclude Anthropic from this use (Opp. 16). But Authors cannot rightly exclude anyone from using their works for training or learning as such. Everyone reads texts, too, then writes new texts. They may need to pay for getting their hands on a text in the first instance. But to make anyone pay specifically for the use of a book each time they read it, each time they recall it from memory, each time they later draw upon it when writing new things in new ways would be unthinkable.
This isn't "oligarch interests and demands," this is affirming a right to learn and that copyright doesn't allow its holder to prohibit people from analyzing the things that they read.
-
It's pretty simple as I see it: you treat the AI like a person. A person has to go through legal channels to consume material, so piracy for AI training is as illegal as it would be for personal consumption. Consuming legally possessed copyrighted material for "inspiration" or "study" is fine for a person, so it's fine for AI training as well. Commercializing derivative works that infringe on copyright is illegal for a person, so it should be illegal for an AI as well. All produced materials, even those inspired by another piece of media, are permissible if not monetized; otherwise they need to be suitably transformative. That line can be hard to draw even when AI is not involved, but that is the legal standard for people, so it should be for AI as well.
If I browse through DeviantArt, learn from my favorite artists' publicly viewable works to draw in a similar style, make a legally distinct cartoon mouse by hand in a style similar to someone else's, and then sell prints of that work, that's legal. The same should be the case for AI.
But! Scrutiny for AI should be much stricter, given the inherent lack of true transformative creativity. And any AI that has used pirated materials should be penalized, either by massive fines or by wiping its training and starting over with only legally licensed, purchased, or otherwise public-domain materials.
-
If you want 5 million books, you can't just steal/pirate them, you need to buy 5 million copies. I'm glad the court ruled that way.
If you want 5 million books to train your AI to make you money, you can just steal them and reap the benefits of others' work. No need to buy 5 million copies!
/s
Jesus, dude. And for the record, I'm not suggesting people steal things. I'm saying that companies shouldn't get away with shittiness just because.
I'm not sure whose reading skills are not on par... But that's what I get from the article. They'll face consequences for stealing them. Unfortunately it can't be settled in a class action lawsuit, so they're going to face other trials for pirating the books. And they won't get away with this.
-
You should read the ruling in more detail, the judge explains the reasoning behind why he found the way that he did. For example:
Authors argue that using works to train Claude’s underlying LLMs was like using works to train any person to read and write, so Authors should be able to exclude Anthropic from this use (Opp. 16). But Authors cannot rightly exclude anyone from using their works for training or learning as such. Everyone reads texts, too, then writes new texts. They may need to pay for getting their hands on a text in the first instance. But to make anyone pay specifically for the use of a book each time they read it, each time they recall it from memory, each time they later draw upon it when writing new things in new ways would be unthinkable.
This isn't "oligarch interests and demands," this is affirming a right to learn and that copyright doesn't allow its holder to prohibit people from analyzing the things that they read.
But AFAIK they didn't actually acquire the legal rights even to read the material they trained on. There were definitely cases of pirated books being used to train the models.