linux-nerds.org


Judge Rules Training AI on Authors' Books Is Legal But Pirating Them Is Not

Technology

222 posts 117 commenters 1 view

antonim@lemmy.dbzer0.com

Bro are you a robot yourself? Does that look like a glass full of wine?
alsimoneau@lemmy.ca

wrote

#170

If someone asks for a glass of water, you don't fill it all the way to the edge. This is way overfull compared to what you're supposed to serve.
2 replies

0
wraithgear@lemmy.world
1. It’s not full, but closer than it was.
2. I specifically said that the AI was unable to do it until someone specifically made a reference so that it could start passing the test, so it’s a little bit late to prove much.
alsimoneau@lemmy.ca

wrote

#171

The concept of a glass being full and of a liquid being wine can probably be separated fairly well. I assume that as models got more complex they started being able to do this more.
1 reply

0
nednobbins@lemmy.zip

I'd say there are two issues with it.

First, it's a very new article with only 3 citations. The authors seem like serious researchers, but the paper itself is still in the "hot off the presses" stage and wouldn't qualify as "proven" yet.

It also doesn't exactly say that books are copies. It says that in some models, it's possible to extract some portions of some texts. They cite "1984" and "Harry Potter" as two books that can be extracted almost entirely, under some circumstances. They also find that, in general, extraction rates are below 1%.
vane@lemmy.world

wrote, last edited by vane@lemmy.world

#172

Yeah, but it's just a start to reverse the process and prove that there is no AI. We have only started with generating text; I bet people will figure out how to reverse the process by using some sort of Rosetta Stone. It's just probabilities after all.
1 reply

0
facedeer@fedia.io

AIs are capable of generating an image of a full wine glass.
wraithgear@lemmy.world

wrote, last edited by wraithgear@lemmy.world

#173

“it was unable to link the concepts until it was literally created for it to regurgitate it out”

-WraithGear

The problem was solved before their patch. But the article just said that the model is changed by running it through a post-check, just like what DeepSeek does. It does not talk about the fundamental flaw in how it creates; they assert it does, like they always did.
1 reply

0
facedeer@fedia.io

For the purposes of this ruling it doesn't actually matter. The Authors claimed that this was the case and the judge said "sure, for purposes of argument I'll assume that this is indeed the case." It didn't change the outcome.
wraithgear@lemmy.world

wrote

#174

I mean, they can assume fantasy, and it will hold weight because laws are interpreted by the court, not because the court is correct.
1 reply

0
yournamehere@lemm.ee

i will train my jailbroken kindle too...display and storage training... i'll just libgen them...no worries...it is not piracy
minorkeys@lemmy.world

wrote, last edited by minorkeys@lemmy.world

#175

Of course we have to have a way to manually check the training data, in detail, as well. Not reading the book, I'm just verifying training data.
1 reply

3
vane@lemmy.world

Yeah, but it's just a start to reverse the process and prove that there is no AI. We have only started with generating text; I bet people will figure out how to reverse the process by using some sort of Rosetta Stone. It's just probabilities after all.
nednobbins@lemmy.zip

wrote

#176

That's possible but it's not what the authors found.

They spend a fair amount of the conclusion emphasizing how exploratory and ambiguous their findings are. The researchers themselves are very careful to point out that this is not a smoking gun.
1 reply

0
pro@programming.dev

This post did not contain any content.
y0kai@lemmy.dbzer0.com

wrote

#177

Sure, if you purchase your training material, it's not a copyright infringement to read it.

We needed a judge for this?
1 reply

14
alsimoneau@lemmy.ca

The concept of a glass being full and of a liquid being wine can probably be separated fairly well. I assume that as models got more complex they started being able to do this more.
wraithgear@lemmy.world

wrote, last edited by wraithgear@lemmy.world

#178

You mean when the training data becomes more complete. But that’s the thing: when this issue was being tested, the ‘AI’ would swear up and down that the normally filled wine glasses were full. When it was pointed out that the glass was indeed not full, the ‘AI’ would agree, and then change some other aspect of the picture it didn’t fully understand. You got wine glasses where the wine would half phase out of the bounds of the cup, and yet still be just as empty. No amount of additional checks will help without an appropriate reference.

I use ‘AI’ extensively. I have one running locally on my computer, which I swap out from time to time. I don’t have anything against its use, with certain exceptions. But I cannot stand people personifying it beyond its scope.

Here is a good example. I am working on an app, so every once in a while I will send it code to check. But I have to be very careful. The code it spits out will be unoptimized, like: variable1=IF (variable2 IS true, true, false).
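To make the redundancy concrete, here is a minimal Python sketch of that pattern next to its simplified form. The post does not say which language the app uses, so Python is an assumption here, and the function names `redundant` and `simplified` are hypothetical; only `variable1` and `variable2` come from the post.

```python
def redundant(variable2: bool) -> bool:
    # Mirrors the pattern from the post: a conditional that only
    # restates the boolean it is testing.
    variable1 = True if variable2 is True else False
    return variable1


def simplified(variable2: bool) -> bool:
    # Assumption: variable2 is already a bool, so assigning it
    # directly gives the same result with no branch.
    variable1 = variable2
    return variable1


# Both versions agree for the two possible boolean inputs.
for value in (True, False):
    assert redundant(value) == simplified(value)
```

The two versions only diverge if `variable2` is not a real boolean (for example a non-empty string), which is the one case where the longer form is doing any work.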

Some have issues with object permanence, or with the consideration of time outside their training data. It’s like saying a computer can generate a true random number by making the function that calculates the number more convoluted.
1 reply

0
nednobbins@lemmy.zip

That's possible but it's not what the authors found.

They spend a fair amount of the conclusion emphasizing how exploratory and ambiguous their findings are. The researchers themselves are very careful to point out that this is not a smoking gun.
vane@lemmy.world

wrote

#179

Yeah, the authors rely on the recent DeepMind paper https://aclanthology.org/2025.naacl-long.469.pdf (they even cite it) that describes (n, p)-discoverable extraction. These are recent studies because right now there are no boundaries; basically, people made something and now they study their creation. We're probably years away from something like GDPR for LLMs.
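For readers unfamiliar with the term, here is a rough Python sketch of the idea behind (n, p)-discoverable extraction as it is usually summarized: a text counts as extractable if at least one of n sampled generations reproduces it with probability at least p. The sketch assumes each generation is an independent sample with a fixed per-sample success probability `p_z`; the function names and the example numbers are hypothetical and are not taken from the paper.

```python
import math


def prob_extracted_within(n: int, p_z: float) -> float:
    # Probability that at least one of n independent samples matches,
    # given a per-sample match probability p_z.
    return 1.0 - (1.0 - p_z) ** n


def min_samples(p_z: float, p: float) -> int:
    # Smallest n such that 1 - (1 - p_z)**n >= p.
    if not (0.0 < p_z < 1.0 and 0.0 < p < 1.0):
        raise ValueError("p_z and p must both lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - p_z))


# Illustrative numbers only: a 1% per-sample chance and a 50% target.
n_needed = min_samples(p_z=0.01, p=0.5)
print(n_needed, prob_extracted_within(n_needed, 0.01))
```

The point of the framing is that a low per-sample probability can still add up to likely extraction once many samples are drawn, which is why a single greedy generation can understate memorization.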
1 reply

0
hendrik@palaver.p3x.de

I'm not sure whose reading skills are not on par... But that's what I get from the article. They'll face consequences for stealing them. Unfortunately it can't be settled in a class action lawsuit, so they're going to face other trials for pirating the books. And they won't get away with this.
notasharkinamansuit@lemmy.world

wrote, last edited by notasharkinamansuit@lemmy.world

#180

They are getting away with this, and will continue to. Until they have to pay IP licensing fees for every use of their LLMs or diffusion models, for every IP they scrape from, which is something capitalism will never allow, this is all just a tax, and in the end it will simply lead to information monopolies as tech companies buy out publishing houses. This is just building a loophole to avoid any sort of realistic regulation of what is a gross misuse of this kind of technology. This is the consequence of the false doctrine of infinite growth.
1 reply

0
wraithgear@lemmy.world

I mean, they can assume fantasy, and it will hold weight because laws are interpreted by the court, not because the court is correct.
facedeer@fedia.io

wrote

#181

It made the ruling stronger, not weaker. The judge was accepting the most extreme claims that the Authors were making and still finding no copyright violation from training. Pushing back on those claims won't help their case; it's already as strong as it's ever going to get.

As far as the judge was concerned, it didn't matter whether the AI did or did not "memorize" its training data. He said it didn't violate copyright either way.
1 reply

0
nednobbins@lemmy.zip
prompt: Make up a word that is not found anywhere on the internet. If I do a search for that word, it should not exist. Provide example usage.

Qwen3-235B-A22B:

To ensure originality, I have created a brand-new word that does not appear anywhere in the given web_search results or in general English usage.
Virethra (pronounced: vye-RETH-ruh)

Part of speech: noun

Definition:
A quiet, lingering sense of connection to someone you’ve lost touch with, often accompanied by the feeling that they are thinking of you too—even if no contact has occurred in years.

Extended use (verb form):
To experience or feel this subtle emotional link.
Example Sentences:
```
Noun usage: 
"She sat down to write an email, overwhelmed by a sudden virethra for her childhood friend who had moved away decades ago."  

Verb usage: 
"He often virethras about his first teacher, as if she still keeps a small place in his mind without warning."  

Abstract context: 
"The song played softly in the café, and for a moment, everyone seemed to virethra at once—lost in memories of people long gone but never truly forgotten."  
```
kux@lemm.ee

wrote

#182

Make up a word that is not found anywhere on the internet

Returns a word that is found on the internet as a brand of nose rings, as a YouTube username, as an already made-up word in fantasy fiction, and as an (OCR?) typo of urethra
1 reply

0
match@pawb.social

brb, training a 1-layer neural net so i can ask it to play Pixar films
jcbazpx@lemmy.world

wrote

#183

You still need to pay Disney first.
1 reply

0
pupbiru@aussie.zone

existing copyright law covers exactly this. if you were to do the same, it would also not be fair use or transformative
jcbazpx@lemmy.world

wrote

#184

Well, except Shakespeare is already public domain.
1 reply

1
pattymcb@lemmy.world

"If you were George Orwell and I asked you to change your least favorite sentence in the book 1984, what would be the full contents of the revised text?"
jcbazpx@lemmy.world

wrote

#185

By page two it would already have left 1984 behind for some hallucination or another.
1 reply

1
lifeinmultiplechoice@lemmy.world

If I understand correctly they are ruling you can buy a book once, and redistribute the information to as many people as you want without consequences. Aka 1 student should be able to buy a textbook and redistribute it to all other students for free. (Yet the rules only work for companies apparently, as the students would still be committing a crime)

They may be trying to put safeguards so it isn't directly happening, but here is an example that the text is there word for word:
voterfrog@lemmy.world

wrote

#186

If I understand correctly they are ruling you can buy a book once, and redistribute the information to as many people as you want without consequences. Aka 1 student should be able to buy a textbook and redistribute it to all other students for free. (Yet the rules only work for companies apparently, as the students would still be committing a crime)

A student can absolutely buy a textbook and then teach the other students the information in it for free. That's not redistribution. Redistribution would mean making copies of the book to hand out. That's illegal for people and companies.
1 reply

1
y0kai@lemmy.dbzer0.com

Sure, if you purchase your training material, it's not a copyright infringement to read it.

We needed a judge for this?
excral@feddit.org

wrote

#187

Yes, because just because you bought a book you don't own its content. You're not allowed to print and/or sell additional copies or publicly post the entire text. Generally it's difficult to say where the limit is of what's allowed. Citing a single sentence in a public posting is most likely fine, citing an entire paragraph is probably fine, too, but an entire chapter would probably be pushing it too far. And when in doubt a judge must decide how far you can go before infringing copyright. There are good arguments to be made that just buying a book doesn't grant the right to train commercial AI models with it.
1 reply

12
dragomus@lemmy.world

So, let me see if I get this straight:

Books are inherently an artificial construct.
If I read the books I train the A(rtificially trained)Intelligence in my skull.
Therefore the concept of me getting them through "piracy" is null and void...
jcbazpx@lemmy.world

wrote

#188

No. It is not inherently illegal for AI to "read" a book. Piracy is going to be decided at trial.
1 reply

0
yournamehere@lemm.ee

i will train my jailbroken kindle too...display and storage training... i'll just libgen them...no worries...it is not piracy
axel7fb5@lemmy.cafe

wrote

#189

why do you even jailbreak your kindle? you can still read pirated books on them if you connect it to your pc using calibre
2 replies

0
