
Judge Rules Training AI on Authors' Books Is Legal But Pirating Them Is Not

Technology
  • If a human did that, it’s still plagiarism.

    Oh I agree it should be, but following the judge's ruling, I don't see how it could be. You trained an LLM on textbooks that were purchased, not pirated. And the LLM distributed the responses.

    (Unless you mean the human reworded them, then yeah, we aren't special apparently)

  • So I can't use any of these works because it's plagiarism but AI can?

    Why would it be plagiarism if you use the knowledge you gain from a book?

  • Does buying the book give you license to digitise it?

    Does owning a digital copy of the book give you license to convert it into another format and copy it into a database?

    Definitions of "Ownership" can be very different.

    You can digitize the books you own; you do not need a license for that. And of course you could put that digital copy into a database, since databases are an explicit exception in copyright law. If you want to go to the extreme: delete the first copy, and then you have it only in the database. However, AIs/LLMs are not based on databases but on neural networks. The original data gets lost when it is "learned".

  • Oh I agree it should be, but following the judge's ruling, I don't see how it could be. You trained an LLM on textbooks that were purchased, not pirated. And the LLM distributed the responses.

    (Unless you mean the human reworded them, then yeah, we aren't special apparently)

    Yes, on the second part. Just rearranging or replacing words in a text is not transformative, which is a requirement. There is an argument that ‘AI’ is capable of doing transformative work, but the tokenizing and weighting process is not magic, and in my use of multiple LLMs they do not have an understanding of the material any more than a dictionary understands the material printed on its pages.

    An example was the wine glass problem. Art ‘AIs’ were unable to display a wine glass filled to the top. No matter how they were prompted, or what style they aped, they would fail to do so and report back that the glass was full. But they could render a full glass of water. They didn’t understand what a full glass was, not even for the water. How was this possible? Well, there was very little art of a full wine glass, because society has an unspoken rule that a full wine glass is the epitome of gluttony; wine is to be savored, not drunk. Whereas references to full glasses of water were abundant. The model doesn’t know what full means, just that pictures of full glasses of water are tied to the phrases ‘full’, ‘glass’, and ‘water’.
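
    To make the "not magic" part concrete, here is a minimal sketch of what the tokenizing-and-weighting step boils down to; the captions and counts below are invented purely for illustration:

```python
from collections import Counter
from itertools import combinations

# Invented toy "training captions" -- stand-ins for the image/alt-text
# pairs an image model would actually be trained on.
captions = [
    "a full glass of water on a table",
    "a glass of water filled to the brim",
    "a wine glass with a small pour of red wine",
    "an almost empty wine glass next to a bottle",
    "a full glass of water in sunlight",
]

# "Tokenizing" here is just splitting on spaces; real tokenizers are
# fancier, but the principle is the same: text becomes discrete symbols.
def tokenize(text):
    return text.lower().split()

# The "weights" here are just co-occurrence counts within a caption.
pair_counts = Counter()
for caption in captions:
    for a, b in combinations(sorted(set(tokenize(caption))), 2):
        pair_counts[(a, b)] += 1

print(pair_counts[("full", "water")])  # 2 -- "full" and "water" co-occur
print(pair_counts[("full", "wine")])   # 0 -- never seen together
```

    Swap counts for learned weights and five captions for billions of images, and the effect is the same at scale: if nothing in the data ties "full" to wine glasses, nothing in the model can.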

  • If what you are saying is true, why were these ‘AIs’ incapable of rendering a full wine glass? They ‘know’ the concept of a full glass of water, but because of humanity’s social pressures, a full wine glass being the epitome of gluttony, artwork did not depict a full wine glass, and no matter what AI prompters demanded, the models were unable to link the concepts until such an image was literally created for them to regurgitate. It seems ‘AI’ doesn’t really learn, but regurgitates art as collages of taken assets, smoothed over at the seams.

    Copilot did it just fine

  • Formatting thing: if you start a line in a new paragraph with four spaces, it assumes that you want to display the text as code and won't line-wrap it.

    This means that the last part of your comment is a long line that people need to scroll to see. If you remove one of the spaces, or you remove the empty line between it and the previous paragraph, it'll look like a normal comment.

    With an empty line of space:

    1 space - and a little bit of writing just to see how the text will wrap. I don't really have anything that I want to put here, but I need to put enough here to make it long enough to wrap around. This is likely enough.

    2 spaces - and a little bit of writing just to see how the text will wrap. I don't really have anything that I want to put here, but I need to put enough here to make it long enough to wrap around. This is likely enough.

    3 spaces - and a little bit of writing just to see how the text will wrap. I don't really have anything that I want to put here, but I need to put enough here to make it long enough to wrap around. This is likely enough.

    4 spaces -  and a little bit of writing just to see how the text will wrap. I don't really have anything that I want to put here, but I need to put enough here to make it long enough to wrap around. This is likely enough.
    

    Thanks, I had copy-pasted it from the website 🙂

  • Copilot did it just fine

    1. It’s not full, but closer than it was.

    2. I specifically said that the AI was unable to do it until someone specifically made a reference so that it could start passing the test, so it’s a bit late to prove much.
  • But I thought they admitted to torrenting terabytes of ebooks?

    Facebook (Meta) torrented TBs from Libgen, and their internal chats leaked so we know about that, and IIRC they've been sued. Maybe you're thinking of that case?

  • FTA:

    Anthropic warned against “[t]he prospect of ruinous statutory damages—$150,000 times 5 million books”: that would mean $750 billion.

    So part of their argument is actually that they stole so much that it would be impossible for them/anyone to pay restitution, therefore we should just let them off the hook.

    What it means is they don't own the models. The models are the commons of humanity; the companies are merely temporary custodians. The nightmare ending is the elites keeping the most capable and competent models for themselves as private playthings. That must not be allowed to happen under any circumstances. Sue OpenAI, Anthropic, and the other enclosers; sue them for trying to take their ball and go home. Dispossess them and sue the investors for their corrupt influence on research.

  • And thus the singularity was born.

    Yes please, a singularity of intellectual property that collapses the idea of owning ideas, of making the infinitely, freely copyable into a scarce resource. What corrupt idiocy this has been. Landlords for ideas, and look what garbage it has been producing.

  • Yes, on the second part. Just rearranging or replacing words in a text is not transformative, which is a requirement. There is an argument that ‘AI’ is capable of doing transformative work, but the tokenizing and weighting process is not magic, and in my use of multiple LLMs they do not have an understanding of the material any more than a dictionary understands the material printed on its pages.

    An example was the wine glass problem. Art ‘AIs’ were unable to display a wine glass filled to the top. No matter how they were prompted, or what style they aped, they would fail to do so and report back that the glass was full. But they could render a full glass of water. They didn’t understand what a full glass was, not even for the water. How was this possible? Well, there was very little art of a full wine glass, because society has an unspoken rule that a full wine glass is the epitome of gluttony; wine is to be savored, not drunk. Whereas references to full glasses of water were abundant. The model doesn’t know what full means, just that pictures of full glasses of water are tied to the phrases ‘full’, ‘glass’, and ‘water’.

    Yeah, we had a fun example a while ago, let me see if I can still find it.

    We would ask it to create a photo of a cat with no tail.

    And then tell it there was indeed a tail, and ask it to draw an arrow to point to it.

    It just points to where the tail most commonly is, or where it was said to be in a picture it was not actually referencing.

    Edit: granted, now it shows a picture of a cat where you just can't see the tail.

  • Makes sense. AI can “learn” from and “read” a book in the same way a person can and does, as long as it is acquired legally. AI doesn’t reproduce a work that it “learns” from, so why would it be illegal?

    Some people just see “AI” and want everything about it outlawed basically. If you put some information out into the public, you don’t get to decide who does and doesn’t consume and learn from it. If a machine can replicate your writing style because it could identify certain patterns, words, sentence structure, etc then as long as it’s not pretending to create things attributed to you, there’s no issue.

    AI can “learn” from and “read” a book in the same way a person can and does,

    If it's in the same way, then why do you need the quotation marks? Even you understand that they're not the same.

    And either way, machine learning is different from human learning in so many ways it's ridiculous to even discuss the topic.

    AI doesn’t reproduce a work that it “learns” from

    That depends on the model and the amount of data it has been trained on. I remember the first public model of ChatGPT producing a sentence that was just one word different from what I found by googling the text (from some scientific article summary, so not a trivial sentence that could line up accidentally). More recently, there was a widely reported-on study of AI-generated poetry where the model was requested to produce a poem in the style of Chaucer, and it produced a letter-for-letter reproduction of the well-known opening of the Canterbury Tales. It hadn't been trained on enough Middle English poetry to generate any of it, so it defaulted to copying a text that probably occurred dozens of times in its training data.

  • As long as they don't use exactly the same words in the book, yeah, as I understand it.

    How would they not use the same words as in the book? That's not how LLMs work. They use exactly the same words if the probabilities align. That's proven by this study: https://arxiv.org/abs/2505.12546
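
    For anyone who wants to see the mechanism, here is a minimal sketch; the bigram table and its probabilities are invented, standing in for what a trained model might assign after seeing a famous sentence many times:

```python
# Toy next-word probabilities -- invented numbers standing in for what a
# trained model would assign. When one continuation dominates at every
# step, greedy decoding reproduces the training text word for word.
next_word_probs = {
    "it": {"was": 0.9, "is": 0.1},
    "was": {"the": 0.8, "a": 0.2},
    "the": {"best": 0.7, "worst": 0.3},
    "best": {"of": 0.95, "time": 0.05},
    "of": {"times": 0.9, "all": 0.1},
}

def greedy_continue(word, steps):
    out = [word]
    for _ in range(steps):
        candidates = next_word_probs.get(out[-1])
        if not candidates:
            break
        # Always pick the most probable next word.
        out.append(max(candidates, key=candidates.get))
    return " ".join(out)

print(greedy_continue("it", 5))  # -> "it was the best of times"
```

    Real models sample from far bigger tables and usually add randomness, but when the probabilities for a memorized passage are lopsided enough, the output is the passage.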

  • Copilot did it just fine

    Bro are you a robot yourself? Does that look like a glass full of wine?

  • I am educated on this. When an AI learns, it takes an input through a series of functions that are joined at the output. The sets of functions that produce the best output are developed further. Individuals do not process information like that. With poor exploration and biasing, the output of an AI model could look identical to its input. It did not "learn" any more than a downloaded video run through a compression algorithm.

    You are obviously not educated on this.

    It did not “learn” any more than a downloaded video run through a compression algorithm.
    Just: LoLz.
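
    Name-calling aside, the loop the earlier comment describes looks roughly like this; a minimal sketch with a made-up one-input, two-function "network", not any real framework:

```python
# A toy version of "input through a series of functions, joined at the
# output, with the best-performing settings developed further".
def train_step(w1, w2, x, target, lr=0.1):
    # Forward pass: the input flows through a chain of simple functions.
    hidden = w1 * x
    output = w2 * hidden
    error = output - target

    # "Develop the functions further": nudge each weight in the direction
    # that reduces the squared error (gradient descent done by hand).
    grad_w2 = 2 * error * hidden
    grad_w1 = 2 * error * w2 * x
    return w1 - lr * grad_w1, w2 - lr * grad_w2

w1, w2 = 0.5, 0.5          # arbitrary starting weights
for step in range(20):
    w1, w2 = train_step(w1, w2, x=1.0, target=2.0)

print(round(w1 * 1.0 * w2, 3))  # close to 2.0 after 20 steps
```

    Whether repeatedly nudging weights like this counts as "learning" or as curve fitting is exactly the disagreement above.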

  • Make an AI that is trained on the books.

    Tell it to tell you the story of one of the books.

    Read the story without paying for it.

    The law says this is ok now, right?

    The LLM is not repeating the same book. The owner of the LLM has the exact same rights to do with whatever his LLM is reading as you have to do with whatever YOU are reading.

    As long as it is not a verbatim recitation, it is completely okay.

    According to storytelling theory, there are only roughly 15 different story types anyway.

  • You are obviously not educated on this.

    It did not “learn” any more than a downloaded video run through a compression algorithm.
    Just: LoLz.

    I am not sure what your contention, or gotcha, is with the comment above, but they are quite correct. They also chose quite an apt example with video compression, since in most ways current 'AI' effectively functions as a compression algorithm, just for our language corpora instead of video.
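
    To put a number on the analogy: a model that predicts text well can also be used to compress it, because a token the model assigns probability p costs about -log2(p) bits to encode. A minimal sketch with invented per-token probabilities:

```python
import math

# Invented probabilities a model might assign to each token of a sentence
# it has effectively memorized (familiar) versus a novel sentence (novel).
familiar = [0.9, 0.95, 0.9, 0.85, 0.9]
novel    = [0.1, 0.05, 0.2, 0.1, 0.15]

def bits_needed(token_probs):
    # Shannon code length: a token with probability p costs ~ -log2(p) bits.
    return sum(-math.log2(p) for p in token_probs)

print(round(bits_needed(familiar), 1))  # ~0.8 bits -- almost free to store
print(round(bits_needed(novel), 1))     # ~16 bits -- many bits per token
```

    Text the model predicts well costs almost nothing to encode, text it can't predict costs a lot, which is the sense in which the weights behave like a lossy compressed copy of the training corpus.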

  • Lawsuits are multifaceted. This statement isn't a defense or an argument for innocence, it's just what it says - an assertion that the proposed damages are unreasonably high. If the court agrees, the plaintiff can always propose a lower damage claim that the court thinks is reasonable.

    You’re right, each of the 5 million books’ authors should agree to less payment for their work, to make the poor criminals feel better.

    If I steal $100 from a thousand people and spend it all on hookers and blow, do I get out of paying that back because I don’t have the funds? Should the victims agree to get $20 back instead because that’s more within my budget?

  • Does buying the book give you license to digitise it?

    Does owning a digital copy of the book give you license to convert it into another format and copy it into a database?

    Definitions of "Ownership" can be very different.

    It seems like a lot of people misunderstand copyright so let's be clear: the answer is yes. You can absolutely digitize your books. You can rip your movies and store them on a home server and run them through compression algorithms.

    Copyright exists to prevent others from redistributing your work so as long as you're doing all of that for personal use, the copyright owner has no say over what you do with it.

    You even have some degree of latitude to create and distribute transformative works, with a violation only occurring when you distribute something pretty damn close to a copy of the original. Some perfectly legal examples: create a word cloud of a book, analyze the tone of news articles to help you trade stocks, produce an image containing the most prominent color in every frame of a movie, or create a search index of the words found on all websites on the internet.

    As a human, you can absolutely do the same kinds of things with a work that an AI does.
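
    The word-cloud example, for instance, is nothing more than counting; a minimal sketch, assuming you have a plain-text copy of a book you own at a hypothetical path like book.txt:

```python
import re
from collections import Counter

# Hypothetical path to your own digitized copy of a book.
with open("book.txt", encoding="utf-8") as f:
    text = f.read().lower()

# Count word frequencies -- this derived data is what a word cloud is
# built from, and it is nowhere near a copy of the original work.
words = re.findall(r"[a-z']+", text)
top_words = Counter(words).most_common(25)

for word, count in top_words:
    print(f"{word}: {count}")
```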

  • That's legal, just don't look at them or enjoy them.

    Yeah, I don't think that would fly.

    "Your honour, I was just hoarding that terabyte of Hollywood films, I haven't actually watched them."
