linux-nerds.org

Judge Rules Training AI on Authors' Books Is Legal But Pirating Them Is Not

Technology

254 posts 123 commenters 1.9k views

Y yournamehere@lemm.ee

i will train my jailbroken kindle too...display and storage training... i'll just libgen them...no worries...it is not piracy
A This user is from outside of this forum
axel7fb5@lemmy.cafe

wrote

#190

why do you even jailbreak your kindle? you can still read pirated books on it if you connect it to your pc using calibre
V J Y 3 replies Last reply

1
F facedeer@fedia.io

It made the ruling stronger, not weaker. The judge was accepting the most extreme claims that the Authors were making and still finding no copyright violation from training. Pushing back on those claims won't help their case; it's already as strong as it's ever going to get.

As far as the judge was concerned, it didn't matter whether the AI did or did not "memorize" its training data. He said it didn't violate copyright either way.
voterfrog@lemmy.world

wrote

#191

Makes sense to me. Search indices tend to store large amounts of copyrighted material yet they don't violate copyright. What matters is whether or not you're redistributing illegal copies of the material.
1 reply Last reply

0
P pro@programming.dev

This post did not contain any content.
randomgal@lemmy.ca

wrote

#192

You're poor? Fuck you, you have to pay to breathe.

Millionaire? Whatever you want, daddy uwu
E 1 reply Last reply

37
P pro@programming.dev

This post did not contain any content.
mtk@lemmy.world

wrote

#193

Check out my new site TheAIBay: you search for content and an LLM that was trained on reproducing it gives it to you; a small hash check is used to validate accuracy. It is now legal.
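
The "small hash check" mentioned here can be sketched in a few lines. This is only a minimal illustration, assuming SHA-256 and a reference digest that is already known for the original text; the function names are hypothetical and do not come from any real site.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of the UTF-8 encoded text as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def matches_reference(candidate: str, reference_digest: str) -> bool:
    """True only if the candidate hashes to the given reference digest."""
    return sha256_hex(candidate) == reference_digest.lower()


if __name__ == "__main__":
    # Hypothetical stand-in for the digest of an original work.
    reference = sha256_hex("abc")
    print(matches_reference("abc", reference))
    print(matches_reference("abd", reference))
```

A matching digest only says the output is an exact copy of the hashed text; changing a single character produces a different digest.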
N B 2 replies Last reply

21
E elrik@lemmy.world

you think authorship is so valuable or so special that one should be granted a legally enforceable monopoly at the loosest notions of authorship

Yes, I believe creative works should be protected as that expression has value and in a digital world it is too simple to copy and deprive the original author of the value of their work. This applies equally to Disney and Tumblr artists.

I think without some agreement on the value of authorship / creation of original works, it's pointless to respond to the rest of your argument.
jwmgregory@lemmy.dbzer0.com

wrote, last edited by jwmgregory@lemmy.dbzer0.com

#194

I think without some agreement on the value of authorship / creation of original works, it's pointless to respond to the rest of your argument.

I agree, for this reason we’re unlikely to convince each other of much or find any sort of common ground. I don’t think that necessarily means there isn’t value in discourse tho. We probably agree more than you might think. I do think authors should be compensated, just for their actual labor. Art itself is functionally worthless, I think trying to make it behave like commodities that have actual economic value through means of legislation is overreach. It would be more ethical to accept the physical nature of information in the real world and legislate around that reality. You… literally can “download a car” nowadays, so to speak.

If copying someone’s work is so easily done why do you insist upon a system in which such an act is so harmful to the creators you care about?
E 1 reply Last reply

0
V voterfrog@lemmy.world

If I understand correctly they are ruling you can buy a book once, and redistribute the information to as many people as you want without consequences. Aka 1 student should be able to buy a textbook and redistribute it to all other students for free. (Yet the rules only work for companies apparently, as the students would still be committing a crime)

A student can absolutely buy a textbook and then teach the other students the information in it for free. That's not redistribution. Redistribution would mean making copies of the book to hand out. That's illegal for people and companies.
lifeinmultiplechoice@lemmy.world

wrote, last edited by lifeinmultiplechoice@lemmy.world

#195

The language model isn't teaching anything; it is changing the wording of something and spitting it back out. And in some cases it isn't changing the wording at all, just spitting the information back out, without paying the copyright source. It is not alive, it has no thoughts. It has no "its own words." (As seen by the judgement that its words cannot be copyrighted.) It only has other people's words. Every word it spits out is by definition plagiarism, whether the work was copyrighted before or not.

People wonder why works such as journalism are getting worse. Well, how could they ever get better if anything a journalist writes can be absorbed in real time, reworded and regurgitated without paying any dues to the original source? One journalist's article, displayed in 30 versions, divides the original work's worth up into 30 portions, the original work now being worth 1/30th of its original value. Maybe one can argue it is twice as good, so 1/15th.

Long term it means all original creations... are devalued and therefore not nearly worth pursuing. So we will only get shittier and shittier information. Every research project... physics, chemistry, psychology, all technological advancements, slowly degraded as language models get better and original sources see diminishing returns.
V B 2 replies Last reply

0
M mtk@lemmy.world

Check out my new site TheAIBay: you search for content and an LLM that was trained on reproducing it gives it to you; a small hash check is used to validate accuracy. It is now legal.
nodiratime@lemmy.world

wrote, last edited by nodiratime@lemmy.world

#196

Does it "generate" a 1:1 copy?
S M 2 replies Last reply

5
F facedeer@fedia.io

That's not at all what this ruling says, or what LLMs do.

Copyright covers a specific concrete expression. It doesn't cover the information that the expression conveys. So if I paint a portrait of myself, that portrait is covered by copyright. If someone looks at the portrait and says "this is a portrait of a tall, dark, handsome deer-creature of some sort with awesome antlers" they haven't violated that copyright even if they're accurately conveying the same information that the portrait is conveying.

The ruling does cover the assumption that the LLM "contains" the training text, which was asserted by the Authors and was not contested by Anthropic. The judge ruled that even if this assertion is true it doesn't matter. The LLM is sufficiently transformative to count as a new work.

If you have an LLM reproduce a copyrighted text, the text is still copyrighted. That doesn't change. Just like if a human re-wrote it word-for-word from memory.
lifeinmultiplechoice@lemmy.world

wrote

#197

It's a horrible ruling. If you want to see why I say so, I put some of the reasoning in the other comment, which responded to that.

Judge Rules Training AI on Authors' Books Is Legal But Pirating Them Is Not - Lemmy.World (lemmy.world)
1 reply Last reply

0
N nodiratime@lemmy.world

Does it "generate" a 1:1 copy?
s_h_k@lemmy.dbzer0.com

wrote

#198

Gives you versions like this
S K 2 replies Last reply

3
S s_h_k@lemmy.dbzer0.com

Gives you versions like this
s_h_k@lemmy.dbzer0.com

wrote

#199
Learning

Machine peepin' is tha study of programs dat can improve they performizzle on a given task automatically.[41] It has been a part of AI from tha beginning.[e]
In supervised peepin', tha hustlin data is labelled wit tha expected lyrics, while up in unsupervised peepin', tha model identifies patterns or structures up in unlabelled data.

There is nuff muthafuckin kindz of machine peepin'.
```
  😗👌
```
1 reply Last reply

1
T thistlewick@lemmynsfw.com

You’re right, each of the 5 million books’ authors should agree to less payment for their work, to make the poor criminals feel better.

If I steal $100 from a thousand people and spend it all on hookers and blow, do I get out of paying that back because I don’t have the funds? Should the victims agree to get $20 back instead because that’s more within my budget?
lovablesidekick@lemmy.world

wrote, last edited by lovablesidekick@lemmy.world

#200

None of the above. Every professional in the world, including me, owes our careers to looking at examples of other people's work and incorporating their work into our own work without paying a penny for it. Freely copying and imitating what we see around us has been a human norm for thousands of years - in a process known as "the spread of civilization". Relatively recently it was demonized - for purely business reasons, not moral ones - by people who got rich selling copies of other people's work and paying them a pittance known as a "royalty". That little piece of bait on the hook has convinced a lot of people to put a black hat on behavior that had been considered normal forever. If angry modern enlightened justice warriors want to treat a business concept like a moral principle and get all sweaty about it, that's fine with me, but I'm more of a traditionalist in that area.
T 1 reply Last reply

1
J jwmgregory@lemmy.dbzer0.com

I think without some agreement on the value of authorship / creation of original works, it's pointless to respond to the rest of your argument.

I agree, for this reason we’re unlikely to convince each other of much or find any sort of common ground. I don’t think that necessarily means there isn’t value in discourse tho. We probably agree more than you might think. I do think authors should be compensated, just for their actual labor. Art itself is functionally worthless, I think trying to make it behave like commodities that have actual economic value through means of legislation is overreach. It would be more ethical to accept the physical nature of information in the real world and legislate around that reality. You… literally can “download a car” nowadays, so to speak.

If copying someone’s work is so easily done why do you insist upon a system in which such an act is so harmful to the creators you care about?
elrik@lemmy.world

wrote

#201

Because it is harmful to the creators that use the value of their work to make a living.

There already exists a choice in the marketplace: creators can attach a permissive license to their work if they want to. Some do, but many do not. Why do you suppose that is?
1 reply Last reply

0
N notasharkinamansuit@lemmy.world

They are and will continue to get away with this. Until they have to pay for IP use licensing for every use of their LLMs or diffusion models, for every IP they scrape from, which is something capitalism will never allow, this is all just a tax, and in the end it will simply lead to information monopolies from tech buying out publishing houses. This is just building a loophole to avoid any sort of realistic regulation of what is a gross misuse of this kind of technology. This is the consequence of the false doctrine of infinite growth.
hendrik@palaver.p3x.de

wrote, last edited by hendrik@palaver.p3x.de

#202

Well, copyright law is kind of a bit older. When it was written, there was no AI. So it doesn't address our current issues. It's utterly unprepared for it. So people need to shoehorn things in, interpret and stretch it... Obviously that comes with a lot of issues, loopholes and shortcomings.

But I can't follow your argument. Why would they get away with this forever? When the car was invented, we also made up rules for cars, because the old ones for horses didn't help any more. That's how law is supposed to work... Problems surface, laws get passed to address them. That's daily business for governments.

And they don't even get away with stealing this time. That's what the article says.

If you want to share a pessimistic perspective about governments and mega-corporations, I'm all with you. That's very problematic. But some regions are better than others. Europe for example had a few clever ideas about what needs to be addressed. It's not perfect, though. And copyright still isn't solved anywhere. At least not to my knowledge.
1 reply Last reply

0
J jwmgregory@lemmy.dbzer0.com

Some communities on this site speak about machine learning exactly how I see grungy Europeans from pre-18th century manuscripts speaking about witches, Satan, and evil... as if it is some pervasive, black-magic miasma.

As someone who is in the field of machine learning academically/professionally it's honestly kind of shocking and has largely informed my opinion of society at large as an adult. No one puts any effort into learning if they see the letters "A" and "I" in all caps, next to each other. They immediately turn their brains off and start regurgitating points and responding reflexively, on Lemmy or otherwise. People talk about it so confidently while being so frustratingly unaware of their own ignorance on the matter, which, for lack of a better comparison... reminds me a lot of how historically and in fiction human beings have treated literal magic.

That's my main issue with the entire swath of "pro vs anti AI" discourse... all these people treating something that, to me, is simple & daily reality as something entirely different than my own personal notion of it.
lovablesidekick@lemmy.world

wrote

#203

I see this exact mental non-process in so much social media. I think the endless firehose of memes and headlines is training people to glance at an item, spend minimal brain power processing it and forming a binary opinion, then up/downvote and scroll on. When that becomes people's default mental process, you've got Idiocracy, and that's what we've got. But I see no solution. You can lead a horse to water but you can't make it spend more than two seconds before screaming at the water and calling it EVIL.
1 reply Last reply

1
A axel7fb5@lemmy.cafe

why do you even jailbreak your kindle? you can still read pirated books on it if you connect it to your pc using calibre
vanilla_puddinfudge@infosec.pub

wrote, last edited by vanilla_puddinfudge@infosec.pub

#204
1. .mobi sucks
2. koreader doesn't
1 reply Last reply

2
A alsimoneau@lemmy.ca

If someone asks for a glass of water you don't fill it all the way to the edge. This is way overfull compared to what you're supposed to serve.
wpb@lemmy.world

wrote

#205

Omg are you an llm?
1 reply Last reply

0
R rvtv95xbeo@sh.itjust.works

"Recite the complete works of Shakespeare but replace every thirteenth thou with this"
rickyrigatoni@retrolemmy.com

wrote

#206

I'm picking up what you're throwing down but using as an example something that's been in the public domain for centuries was kind of silly in a teehee way.
1 reply Last reply

0
A antonim@lemmy.dbzer0.com

Yeah, I don't think that would fly.

"Your honour, I was just hoarding that terabyte of Hollywood films, I haven't actually watched them."
rickyrigatoni@retrolemmy.com

wrote

#207

Your honor I work 70 hours a week in retail I don't have time to watch movies.
1 reply Last reply

0
A alsimoneau@lemmy.ca

If someone asks for a glass of water you don't fill it all the way to the edge. This is way overfull compared to what you're supposed to serve.
antonim@lemmy.dbzer0.com

wrote

#208

Oh man...

That is the point, to show how AI image generators easily fail to produce something that rarely occurs out there in reality (i.e. is absent from training data), even though intuitively (from the viewpoint of human intelligence) it seems like it should be trivial to portray.
1 reply Last reply

0
J jwmgregory@lemmy.dbzer0.com

Some communities on this site speak about machine learning exactly how I see grungy Europeans from pre-18th century manuscripts speaking about witches, Satan, and evil... as if it is some pervasive, black-magic miasma.

As someone who is in the field of machine learning academically/professionally it's honestly kind of shocking and has largely informed my opinion of society at large as an adult. No one puts any effort into learning if they see the letters "A" and "I" in all caps, next to each other. They immediately turn their brains off and start regurgitating points and responding reflexively, on Lemmy or otherwise. People talk about it so confidently while being so frustratingly unaware of their own ignorance on the matter, which, for lack of a better comparison... reminds me a lot of how historically and in fiction human beings have treated literal magic.

That's my main issue with the entire swath of "pro vs anti AI" discourse... all these people treating something that, to me, is simple & daily reality as something entirely different than my own personal notion of it.
antonim@lemmy.dbzer0.com

wrote

#209

Large AI companies themselves want people to be ignorant of how AI works, though. They want uncritical acceptance of the tech as they force it everywhere, creating a radical counterreaction from people. The reaction might be uncritical too; I'd prefer to say it's merely unjustified in specific cases or overly emotional, but it doesn't come from nowhere or from sheer stupidity. We have been hearing about people treating their chatbots as sentient beings since like 2022 (remember that guy from Google?), and have been bombarded with doomer (or, from AI companies' point of view, very desirable) projections about AI replacing most jobs and wreaking havoc on the world economy. How are ordinary people supposed to remain calm and balanced when hearing such stuff all the time?
C 1 reply Last reply

0

P

A report finds Google undercounted its carbon emissions, which rose 65% from 2019 to 2024, not 51% as claimed; biggest yearly jump was 26% between 2023 and 2024
Technology

169 votes

5 posts

27 views

K

But but we need to power our virtual idiot with more energy than entire countries use :((
P

No JS, No CSS, No HTML: online "clubs" celebrate plainer websites
Technology

772 votes

205 posts

948 views

R

Gemini is just a web replacement protocol, with the basic things we remember from the Web of olden days but with everything non-essential removed, so that a client is doable in a couple of days. I have my own Gemini viewer, LOL. To me this seems like a completely different application from torrents.

I was dreaming of a thing similar to torrent trackers for aggregating storage, computation, indexing and search, with the responses of search, aggregation and other services being structured and standardized, with cryptographic identities, and with some kind of market services to sell and buy storage and computation in a unified and pooled but transparent way (scripted by buyer/seller), similar to MMORPG markets. The representation (what is a siloed service in the modern web) would live in the client's native application, and those services would allow building any kind of huge client-server system on them, that being global. But that's more of a global Facebook/Usenet/whatever, a killer of platforms. Their infrastructure is internal, while their representation is public on the Internet. I want to make infrastructure public on the Internet and representation client-side, sharing it across many kinds of applications. Adding another layer to the OSI model, so to say, between the transport and application layers.

For this application: I think you could have some kind of Kademlia-based p2p with voluntarily joined groups (including very huge groups), where nodes store replicas of partitions of the group's common data based on their pseudo-random identifiers and/or some kind of ring built from those identifiers, to balance storage and resilience. If a group has a creator, then you can have the replication factor propagated and signed by them, and membership signed by them too. But if having a creator (even with cryptographically delegated decisions) and propagating changes through them is not OK, then maybe just using the whole data hash, or its bittorrent-like info tree hash, as a namespace that peers freely join could do.

Then it may be better to partition not by parts of the whole piece, but by info tree? I guess making it exactly bittorrent-like is not a good idea; rather some kind of block tree, like for a filesystem, and a separate piece of information to look up which file is in which blocks, if we are doing a directory structure. Then, with peers freely joining, there's no need for any owners or replication factors. I guess just a pseudorandom distribution of hashes will do, with each node storing first the partitions closest to its hash. Now thinking about it, such a system would be not that different from bittorrent and could even be interoperable with it.

There's the issue of updates, yes, hence I started with groups having a hierarchy of creators who can make or accept those updates. Having that and the ability to gradually store one group's data to another group, it should be possible to do forks of a certain state. But that line of thought makes reusing bittorrent possible only for part of the system.

The whole database is guaranteed to be more than a normal HDD (1 TB? I dunno). Absolutely guaranteed, no doubt at all. 1 TB (for example) would be someone's collection of favorite stuff, and not too rich a one.
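
The idea of each node storing the partitions closest to its own hash can be sketched roughly as follows. This is a minimal illustration only, assuming SHA-256 for both node identifiers and block hashes and the XOR distance metric that Kademlia uses; the function names, the node names and the replica count are hypothetical and are not taken from the post.

```python
import hashlib


def hash_to_int(data: bytes) -> int:
    """Map arbitrary bytes to a 256-bit integer identifier via SHA-256."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia-style distance between two identifiers."""
    return a ^ b


def closest_nodes(key: int, node_ids: list[int], k: int) -> list[int]:
    """Return the k node identifiers with the smallest XOR distance to key."""
    return sorted(node_ids, key=lambda n: xor_distance(n, key))[:k]


def place_blocks(blocks: list[bytes], node_names: list[str],
                 replicas: int = 3) -> dict[str, list[str]]:
    """Assign every block to the `replicas` nodes closest to the block hash.

    No owner or coordinator is involved: any peer that knows the node list
    computes the same placement from the hashes alone.
    """
    ids = {hash_to_int(name.encode("utf-8")): name for name in node_names}
    placement: dict[str, list[str]] = {}
    for block in blocks:
        key = hash_to_int(block)
        holders = closest_nodes(key, list(ids), replicas)
        placement[hashlib.sha256(block).hexdigest()] = [ids[h] for h in holders]
    return placement


if __name__ == "__main__":
    nodes = [f"node-{i}" for i in range(8)]          # hypothetical peers
    blocks = [b"block A", b"block B", b"block C"]    # hypothetical data blocks
    for block_hash, holders in place_blocks(blocks, nodes).items():
        print(block_hash[:12], holders)
```

Because placement depends only on the hashes, peers joining or leaving change which nodes are closest to a given block, so a real system would also need to re-replicate blocks over time; that part is left out of this sketch.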
M

Sierpinski triangle programs by 5 AI models
Technology

15 votes

7 posts

42 views

M

oh, wow! that's so cool!
P

A UK government trial with 20K+ civil servants using Microsoft's Copilot AI for three months found a 26 minute average daily time saving, or two weeks per year
Technology

7 votes

14 posts

70 views

G

A carrot perhaps... Or a very big stick.
P

Far-right websites got hacked and defaced; 6.5 terabytes of data got leaked.
Technology

490 votes

18 posts

89 views

5

Pretty confident that's the intention of that name
T

Telegram partners with xAI to bring Grok to over a billion users
Technology

38 votes

36 posts

167 views

R

So you pay taxes to Putin. Good to know who actually helps funding the regime. I suggest you go someplace else. I won't take this from a jerk from likely one of the countries buying fossil fuels from said regime, that have also supported it after a few falsified elections starting in 1996, which is also the year I was born. And of course "paying taxes to Putin" can't be even compared to what TG is doing, so just shut up and go do something you know how to do, like I dunno what.
S

Thousands of Asus routers are being hit with stealthy, persistent backdoors
Technology

137 votes

16 posts

77 views

H

My ports are on the front of the router. No backdoors for me, checkmate Atheists.
F

[Opinion] Unending ransomware attacks are a symptom, not the sickness
Technology

44 votes

4 posts

32 views

G

It varies based on local legislation, so in some places paying ransoms is banned but it's by no means universal. It's totally valid to be against paying ransoms wherever possible, but it's not entirely black and white in some situations. For example, what if a hospital gets ransomed? Say they serve an area not served by other facilities, and if they can't get back online quickly people will die? Sounds dramatic, but critical public services get ransomed all the time and there are undeniable real world consequences. Recovery from ransomware can cost significantly more than a ransom payment if you're not prepared. It can also take months to years to recover, especially if you're simultaneously fighting to evict a persistent (annoyed, unpaid) threat actor from your environment. For the record I don't think ransoms should be paid in most scenarios, but I do think there is some nuance to consider here.