linux-nerds.org


Judge Rules Training AI on Authors' Books Is Legal But Pirating Them Is Not

Technology

213 posts 113 commenters 0 views

F forkdestroyer@infosec.pub

Make an AI that is trained on the books.

Tell it to tell you a story for one of the books.

Read the story without paying for it.

The law says this is ok now, right?
L This user is from outside of this forum
loreleisanktheship@lemmy.ml

wrote

#107

As long as they don't use exactly the same words as in the book, yeah, as I understand it.
V 1 reply, last reply

5
J j0ester@lemmy.world

Huh? Didn’t Meta skip asking for permission and pirate a lot of books to train their model?
gian@lemmy.grys.it

wrote

#108

True. And I will be happy if someone sues them and the judge says the same thing.
1 reply, last reply

1
L lifeinmultiplechoice@lemmy.world

If I understand correctly they are ruling you can buy a book once, and redistribute the information to as many people as you want without consequences. Aka 1 student should be able to buy a textbook and redistribute it to all other students for free. (Yet the rules only work for companies apparently, as the students would still be committing a crime)

They may be trying to put safeguards in place so it isn't directly happening, but here is an example showing that the text is there word for word:
gian@lemmy.grys.it

wrote

#109

If I understand correctly they are ruling you can buy a book once, and redistribute the information to as many people as you want without consequences. Aka 1 student should be able to buy a textbook and redistribute it to all other students for free. (Yet the rules only work for companies apparently, as the students would still be committing a crime)

Well, it would be interesting if this case were used as precedent in a case involving a single student who does the same thing. But you are right.
F 1 reply, last reply

1
D deathsembrace@lemmy.world

So I can't use any of these works because it's plagiarism but AI can?
freedomadvocate@lemmy.net.au

wrote

#110

You can “use” them to learn from, just like “AI” can.

What exactly do you think AI does when it “learns” from a book, for example? Do you think it will just spit out the entire book if you ask it to?
D G 2 replies, last reply

4
E elrik@lemmy.world

AI can “learn” from and “read” a book in the same way a person can and does

This statement is the basis for your argument and it is simply not correct.

Training LLMs and similar AI models is much closer to a sophisticated lossy compression algorithm than it is to human learning. The processes are not at all similar given our current understanding of human learning.

AI doesn’t reproduce a work that it “learns” from, so why would it be illegal?

The current Disney lawsuit against Midjourney is illustrative - literally, it includes numerous side-by-side comparisons - of how AI models are capable of recreating iconic copyrighted work that is indistinguishable from the original.

If a machine can replicate your writing style because it could identify certain patterns, words, sentence structure, etc then as long as it’s not pretending to create things attributed to you, there’s no issue.

An AI doesn't create works on its own. A human instructs AI to do so. Attribution is also irrelevant. If a human uses AI to recreate the exact tone, structure and other nuances of say, some best selling author, they harm the marketability of the original works which fails fair use tests (at least in the US).
freedomadvocate@lemmy.net.au

wrote

#111

Your very first statement calling my basis for my argument incorrect is incorrect lol.

LLMs “learn” things from the content they consume. They don’t just take the content in wholesale and keep it there to regurgitate on command.

On your last part: unless someone uses AI to recreate the tone etc. of a best-selling author and then markets their book/writing as being from said best-selling author, or uses trademarked characters etc., there’s no issue. You can’t copyright a style of writing.
W E 2 replies, last reply

3
P pro@programming.dev

This post did not contain any content.
saharamaleikuhm@feddit.org

wrote

#112

But I thought they admitted to torrenting terabytes of ebooks?
A F 2 replies, last reply

28
F freedomadvocate@lemmy.net.au

You can “use” them to learn from, just like “AI” can.

What exactly do you think AI does when it “learns” from a book, for example? Do you think it will just spit out the entire book if you ask it to?
deathsembrace@lemmy.world

wrote

#113

It can't speak or use any words without them being someone else's words it learned from? Unless it's giving sources, everything is always from something it learned, because it cannot speak or use words without that source in the first place?
N 1 reply, last reply

0
G gian@lemmy.grys.it

If I understand correctly they are ruling you can buy a book once, and redistribute the information to as many people as you want without consequences. Aka 1 student should be able to buy a textbook and redistribute it to all other students for free. (Yet the rules only work for companies apparently, as the students would still be committing a crime)

Well, it would be interesting if this case were used as precedent in a case involving a single student who does the same thing. But you are right.
fum@lemmy.world

wrote

#114

This was my understanding also, and why I think the judge is bad at their job.
L 1 reply, last reply

0
E elrik@lemmy.world

AI can “learn” from and “read” a book in the same way a person can and does

This statement is the basis for your argument and it is simply not correct.

Training LLMs and similar AI models is much closer to a sophisticated lossy compression algorithm than it is to human learning. The processes are not at all similar given our current understanding of human learning.

AI doesn’t reproduce a work that it “learns” from, so why would it be illegal?

The current Disney lawsuit against Midjourney is illustrative - literally, it includes numerous side-by-side comparisons - of how AI models are capable of recreating iconic copyrighted work that is indistinguishable from the original.

If a machine can replicate your writing style because it could identify certain patterns, words, sentence structure, etc then as long as it’s not pretending to create things attributed to you, there’s no issue.

An AI doesn't create works on its own. A human instructs AI to do so. Attribution is also irrelevant. If a human uses AI to recreate the exact tone, structure and other nuances of say, some best selling author, they harm the marketability of the original works which fails fair use tests (at least in the US).
jwmgregory@lemmy.dbzer0.com

wrote

#115

Even if we accept all your market-liberal premises without question... in your own rhetorical framework the Disney lawsuit should be ruled against Disney.

If a human uses AI to recreate the exact tone, structure and other nuances of say, some best selling author, they harm the marketability of the original works which fails fair use tests (at least in the US).

Says who? In a free market, why is the competition from similar products and brands such a threat as to be outlawed? Think reasonably about what you are advocating... you think authorship is so valuable or so special that one should be granted a legally enforceable monopoly on the loosest notions of authorship. This is the definition of a slippery slope, and yet, it is the status quo of the society we live in.

On it "harming marketability of the original works," frankly, that's a fiction and anyone advocating such ideas should just fucking weep about it instead of enforcing overreaching laws on the rest of us. If you can't sell your art because a machine made "too good a copy" of your art, it wasn't good art in the first place and that is not the fault of the machine. Even big pharma doesn't get to outright ban generic medications (even though they certainly tried)... it is patently fucking absurd to decry artists' lack of a state-enforced monopoly on their work. Why do you think we should extend such a radical policy towards... checks notes... tumblr artists and other commission-based creators? It's not good when big companies do it for themselves through lobbying, and it wouldn't be good to do it for "the little guy," either. The real artists working in industry don't want to change the law this way because they know it doesn't work in their favor. Disney's lawsuit is in the interest of Disney and big capital, not artists themselves, despite what these large conglomerates that trade in IPs and dreams might try to convince the art world writ large of.
E 1 reply, last reply

0
L lifeinmultiplechoice@lemmy.world

If I understand correctly they are ruling you can buy a book once, and redistribute the information to as many people as you want without consequences. Aka 1 student should be able to buy a textbook and redistribute it to all other students for free. (Yet the rules only work for companies apparently, as the students would still be committing a crime)

They may be trying to put safeguards in place so it isn't directly happening, but here is an example showing that the text is there word for word:
freedomadvocate@lemmy.net.au

wrote

#116

Not at all true. AI doesn’t just reproduce content it was trained on on demand.
W 1 reply, last reply

0
I isveryloud@lemmy.ca

My interpretation was that AI companies can train on material they are licensed to use, but the courts have deemed that Anthropic pirated this material as they were not licensed to use it.

In other words, if Anthropic bought the physical or digital books, it would be fine so long as their AI couldn't spit it out verbatim, but they didn't even do that, i.e. the AI crawler pirated the book.
devils_advocate@sh.itjust.works

wrote

#117

Does buying the book give you license to digitise it?

Does owning a digital copy of the book give you license to convert it into another format and copy it into a database?

Definitions of "Ownership" can be very different.
E V 2 replies, last reply

6
F fum@lemmy.world

This was my understanding also, and why I think the judge is bad at their job.
lifeinmultiplechoice@lemmy.world

wrote

#118

I suppose someone could develop an LLM that digests textbooks, rewords the text and spits it back out, and then distribute it for free, page for page. You can't copyright the math problems, I don't think, so if the text wording is what gives it credence, that would have been changed.
W 1 reply, last reply

0
A ayane@lemmy.vg

I joined lemmy specifically to avoid this reddit mindset of jumping to conclusions after reading a headline

Guess some things never change...
jwmgregory@lemmy.dbzer0.com

wrote

#119

Well to be honest lemmy is less prone to knee-jerk reactionary discussion but on a handful of topics it is virtually guaranteed to happen no matter what, even here. For example, this entire site, besides a handful of communities, is vigorously anti-AI; and in the words of u/jsomae@lemmy.ml elsewhere in this comment chain:

"It seems the subject of AI causes lemmites to lose all their braincells."

I think there is definitely an interesting take on the sociology of the digital age in here somewhere but it's too early in the morning to be tapping something like that out lol
1 reply, last reply

5
L lovablesidekick@lemmy.world

You're getting douchevoted because on lemmy any AI-related comment that isn't negative enough about AI is the Devil's Work.
jwmgregory@lemmy.dbzer0.com

wrote

#120

Some communities on this site speak about machine learning exactly how I see grungy Europeans from pre-18th century manuscripts speaking about witches, Satan, and evil... as if it is some pervasive, black-magic miasma.

As someone who is in the field of machine learning academically/professionally it's honestly kind of shocking and has largely informed my opinion of society at large as an adult. No one puts any effort into learning if they see the letters "A" and "I" in all caps, next to each other. Immediately turn their brain off and start regurgitating points and responding reflexively, on Lemmy or otherwise. People talk about it so confidently while being so frustratingly unaware of their own ignorance on the matter, which, for lack of a better comparison... reminds me a lot of how historically and in fiction human beings have treated literal magic.

That's my main issue with the entire swath of "pro vs anti AI" discourse... all these people treating something that, to me, is simple & daily reality as something entirely different than my own personal notion of it.
L A 2 replies, last reply

3
F freedomadvocate@lemmy.net.au

You can “use” them to learn from, just like “AI” can.

What exactly do you think AI does when it “learns” from a book, for example? Do you think it will just spit out the entire book if you ask it to?
gaja@lemm.ee

wrote

#121

I am educated on this. When an AI learns, it passes an input through a series of functions whose results are joined at the output. The set of functions that produces the best output is developed further. Individuals do not process information like that. With poor exploration and biasing, the output of an AI model could look identical to its input. It did not "learn" any more than a downloaded video run through a compression algorithm did.
E 1 reply, last reply

1
D dojan@pawb.social

LLMs don’t learn, and they’re not people. Applying the same logic doesn’t make much sense.
facedeer@fedia.io

wrote

#122

The judge isn't saying that they learn or that they're people. He's saying that training falls into the same legal classification as learning.
D 1 reply, last reply

0
F freedomadvocate@lemmy.net.au

Your very first statement calling my basis for my argument incorrect is incorrect lol.

LLMs “learn” things from the content they consume. They don’t just take the content in wholesale and keep it there to regurgitate on command.

On your last part: unless someone uses AI to recreate the tone etc. of a best-selling author and then markets their book/writing as being from said best-selling author, or uses trademarked characters etc., there’s no issue. You can’t copyright a style of writing.
wraithgear@lemmy.world

wrote, last edited by wraithgear@lemmy.world

#123

If what you are saying is true, why were these ‘AIs’ incapable of rendering a full wine glass? It ‘knows’ the concept of a full glass of water, but because of humanity’s social pressures (a full wine glass being the epitome of gluttony), artwork did not depict a full wine glass. No matter how much AI prompters demanded one, it was unable to link the concepts until such an image was literally created for it to regurgitate. It seems ‘AI’ doesn’t really learn, but regurgitates art in collages of taken assets, smoothed over at the seams.
A F 2 replies, last reply

2
L lifeinmultiplechoice@lemmy.world

I suppose someone could develop an LLM that digests textbooks, rewords the text and spits it back out, and then distribute it for free, page for page. You can't copyright the math problems, I don't think, so if the text wording is what gives it credence, that would have been changed.
wraithgear@lemmy.world

wrote

#124

If a human did that, it would still be plagiarism.
L 1 reply, last reply

1
G gian@lemmy.grys.it

What a bad judge.

Why? Basically, he simply stated that you can use whatever material you want to train your model as long as you ask the author (or copyright holder) for permission to use it (and presumably pay for it).
patatahooligan@lemmy.world

wrote

#125

"Fair use" is the exact opposite of what you're saying here. It says that you don't need to ask for any permission. The judge ruled that obtaining illegitimate copies was unlawful but use without the creator's consent is perfectly fine.
1 reply, last reply

2
F freedomadvocate@lemmy.net.au

Not at all true. AI doesn’t just reproduce content it was trained on on demand.
wraithgear@lemmy.world

wrote

#126

It can. The only thing stopping it is if it is specifically told not to and that instruction is successfully checked for. It is completely capable of plagiarizing otherwise.
F 1 reply, last reply

0
