linux-nerds.org


AI industry horrified to face largest copyright class action ever certified

98 posts 55 commenters 0 views

F fauxliving@lemmy.world

People cheering for this have no idea of the consequences of their copyright-maximalist position.

If using images, text, etc. to train a model is copyright infringement, then there will be NO open models, because open source model creators could not possibly obtain all of the licensing for every piece of written or visual media in the Common Crawl dataset, which is what most of these things are trained on.

As it stands now, corporations don't have a monopoly on AI specifically because copyright doesn't apply to AI training. Everyone has access to Common Crawl and the other large, public datasets made from crawling the public Internet, so anyone can train a model on their own without worrying about obtaining billions of different licenses from every single individual who has ever written a word or drawn a picture.

If there is a ruling that training violates copyright, then the only entities that could possibly afford to train LLMs or diffusion models are companies that own a large amount of copyrighted material. Sure, one company will lose a lot of money and/or be destroyed, but the legal precedent would be set so that it is impossible for anyone who doesn't have billions of dollars to train AI.

People are shortsightedly seeing this as a victory for artists or some other nonsense. It's not. This is a fight where large copyright holders (Disney and other large publishing companies) want to completely own the ability to train AI because they own most of the large stores of copyrighted material.

If the copyright holders win this then the open source training material, like Common Crawl, would be completely unusable to train models in the US/the West because any person who has ever posted anything to the Internet in the last 25 years could simply sue for copyright infringement.
B This user is from outside of this forum
barryamelton@lemmy.world

wrote, last edited by barryamelton@lemmy.world

#84

Anybody can use copyrighted works under fair use for research, more so if your LLM is open source (I would say this fair use should only actually apply if your model is open source...).
You are wrong.

We don't need to break the copyright protections that shield us from corporations in this case, and that incidentally also protect open source and libre software.
1 reply Last reply

8
F fauxliving@lemmy.world

Distributed computing projects, large non-profits, people in the near future with much more powerful and cheaper hardware, governments which are interested in providing public services to their citizens, etc.

Look at other large technology projects. The Human Genome Project spent $3 billion to sequence the first genome but now you can have it done for around $500. This cost reduction is due to the massive, combined effort of tens of thousands of independent scientists working on the same problem. It isn't something that would have happened if Purdue Pharma owned the sequencing process and required every scientist to purchase a license from them in order to do research.

LLM and diffusion models are trained on the works of everyone who's ever been online. This work, generated by billions of human-hours, is stored in the Common Crawl datasets and is freely available to anyone who wants it. This data is both priceless and owned by everyone. We should not be cheering for a world where it is illegal to use this dataset that we all created and, instead, we are forced to license massive datasets from publishing companies.

Progress on these types of models would immediately stop; there would be 3-4 corporations that could afford the licenses. They would have a de facto monopoly on LLMs and could enshittify them without worrying about competition.
J This user is from outside of this forum
justaraccoon@lemmy.world

wrote, last edited by

#85

The world you're envisioning would only have paid licenses; who's to say we can't have a "free for non-commercial purposes" license style for it all?
1 reply Last reply

1
P pushbutton@lemmy.world

Let's go baby! The law is the law, and it applies to everybody

If the "genie doesn't go back in the bottle", make him pay for what he's stealing.
K This user is from outside of this forum
kameecoding@lemmy.world

wrote, last edited by kameecoding@lemmy.world

#86

The law is not the law.
I am the law.

insert awesome guitar riff here

Reference: https://youtu.be/Kl_sRb0uQ7A
1 reply Last reply

0
S signtist@bookwyr.me

This is the real concern. Copyright abuse has been rampant for a long time, and the only reason things like the Internet Archive are allowed to exist is because the copyright holders don't want to pick a fight they could potentially lose and lessen their hold on the IPs they're hoarding. The AI case is the perfect thing for them, because it's a very clear violation with a good amount of public support on their side, and winning will allow them to crack down even harder on all the things like the Internet Archive that should be fair use. AI is bad, but this fight won't benefit the public either way.
A This user is from outside of this forum
a_wild_mimic_appears@lemmy.dbzer0.com

wrote, last edited by a_wild_mimic_appears@lemmy.dbzer0.com

#87

I wouldn't even say AI is bad. I currently have Qwen 3 running on my own GPU, giving me a course in RegEx and how to use it. It sometimes makes mistakes in the examples (we all know that chatbots are shit when it comes to the r's in strawberry), but I see it as "spot the error" type of training for me, and the instructions themselves have been error-free so far. Since I do the lessons myself, I can easily spot if something goes wrong.

AI crammed into everything because venture capitalists try to see what sticks is probably the main reason public opinion of chatbots is bad, and I don't condone that either, but the technology itself has uses and is an impressive accomplishment.

Same with image generation: I am shit at drawing, and I don't have the money to commission art if I want something specific, but I can generate what I want for myself.

If the copyright side wins, we all might lose the option to run imagegen and LLMs on our own hardware, there will never be an open-source LLM, and resources that are important to us all will come even more under fire than they already are. Copyright holders will be the new AI companies, and without competition the enshittification will start instantly.
S 1 reply Last reply

1
T treczoks@lemmy.world

Well, theft has never been the best foundation for a business, has it?

While I completely agree that copyright terms are completely overblown, they are valid law that other people suffer under, so it is 100% fair to make them suffer the same. Or worse, as they all broke the law for commercial gain.
N This user is from outside of this forum
no1@aussie.zone

wrote, last edited by

#88

Well, theft has never been the best foundation for a business, has it?

History would suggest otherwise.
1 reply Last reply

0
A a_wild_mimic_appears@lemmy.dbzer0.com

I wouldn't even say AI is bad. I currently have Qwen 3 running on my own GPU, giving me a course in RegEx and how to use it. It sometimes makes mistakes in the examples (we all know that chatbots are shit when it comes to the r's in strawberry), but I see it as "spot the error" type of training for me, and the instructions themselves have been error-free so far. Since I do the lessons myself, I can easily spot if something goes wrong.

AI crammed into everything because venture capitalists try to see what sticks is probably the main reason public opinion of chatbots is bad, and I don't condone that either, but the technology itself has uses and is an impressive accomplishment.

Same with image generation: I am shit at drawing, and I don't have the money to commission art if I want something specific, but I can generate what I want for myself.

If the copyright side wins, we all might lose the option to run imagegen and LLMs on our own hardware, there will never be an open-source LLM, and resources that are important to us all will come even more under fire than they already are. Copyright holders will be the new AI companies, and without competition the enshittification will start instantly.
S This user is from outside of this forum
signtist@bookwyr.me

wrote, last edited by

#89

What you see as "spot the error" type training, another person sees as absolute fact that they internalize and use to make decisions that impact the world. The internet gave rise to the golden age of conspiracy theories, which is having a major impact on the worsening political climate, and it's because the average user isn't able to differentiate information from disinformation. AI chatbots giving people the answer they're looking for rather than the truth is only going to compound the issue.
1 reply Last reply

0
D davriellelouna@lemmy.world

This post did not contain any content.
C This user is from outside of this forum
crystalmerchant@lemmy.world

wrote, last edited by

#90

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

And yet, despite 20 years of experience, the only side Ashley presents is the technologists' side.
1 reply Last reply

4
D davriellelouna@lemmy.world

This post did not contain any content.
L This user is from outside of this forum
lucidlyes@lemmy.world

wrote, last edited by

#91

I hope LLMs and generative AI crash and burn.
V 1 reply Last reply

3
L lucidlyes@lemmy.world

I hope LLMs and generative AI crash and burn.
V This user is from outside of this forum
vacuumflower@lemmy.sdf.org

wrote, last edited by

#92

I'm thinking, honestly, what if that's the planned purpose of this bubble.

Let me explain - those "AI"s involve assembling large datasets and making them available, poisoning the Web, and creating demand for a specific kind of hardware.

When it bursts, not everything bursts.

Suddenly there will be plenty of no longer required hardware usable for normal ML applications like face recognition, voice recognition, text analysis to identify its author, combat drones with target selection, all kinds of stuff. It will be dirt cheap, compared to its current price, as it was with Sun hardware after the dotcom crash.

There will still be those datasets, which can be analyzed for plenty of purposes. Legal or not, they are already processed into a usable and convenient state.

There will be the Web, covered with a layer of AI slop as tall as the Great Wall of China.

There will likely be a bankrupt nation which will have a lot of things failing due to that.

And there will still be all the centralized services. Suppose on that day you go search something in Google, and there's only the Google summary present, no results list (or maybe even a results list, whatever, but suddenly weighed differently), saying that you've been owned by domestic enemies yadda-yadda and the patriotic corporations are implementing a popular state of emergency or something like that. You go to Facebook, and when you write something there, your messages are premoderated by an AI so that you'd not be able to god forbid say something wrong. An LLM might not be able to support a decent enough conversation, but to edit out things you say, or PGP keys you send, in real time without anything appearing strange - easily. Or to change some real person's style of speech to yours.

Suppose all non-degoogled Android installations start doing things like that, Amazon's logistics suddenly start working to support a putsch, Facebook and WhatsApp do what I described or just fail, Apple makes a presentation of a new, magnificent, ingenious, miraculous, patriotic change to a better system of government, maybe even with Jony Ive as the speaker, and possibly does the same unnoticeable censorship, Microsoft pushes one malicious update 3 months earlier with a backdoor to all Windows installations doing the same, and commits its datacenters to the common effort, and let's just say it's possible that a similar thing is done by some Linux developer believing in an idea and by some of the major distributions - it doesn't need to do much, just provide a backdoor usable remotely.

I don't list Twitter because honestly it doesn't seem to work well enough or have coverage good enough.

So - this seems a pretty possible apocalypse scenario which does lead to a sudden installation of a dictatorial regime with all the necessary surveillance, planning, censorship and enforcement already being functioning systems.

So - of course apocalypse scenarios were a normal thing in movies for many years and many times, but it's funny how the more plausible such become, the less often they are described in art.
1 reply Last reply

0
D davriellelouna@lemmy.world

This post did not contain any content.
P This user is from outside of this forum
plurrbear@lemmy.world

wrote, last edited by

#93

Fucking good!! Let the AI industry BURN!
1 reply Last reply

3
M magikmw@piefed.social

IA doesn't make any money off the content. Not that LLM companies do, but that's what they'd want.
C This user is from outside of this forum
cosmonova@lemmy.world

wrote, last edited by

#94

And this is exactly the reason why I think the IA will be forced to close down while AI companies that trained their models on it will not only stay but be praised for preserving information in an ironic twist. Because one side does participate in capitalism and the other doesn’t. They will claim AI is transformative enough even when it isn’t because the overly rich invested too much money into the grift.
1 reply Last reply

0
R rivalarrival@lemmy.today

Ah yes. "Public Domain" == "Theft"
S This user is from outside of this forum
smoogs@lemmy.world

wrote, last edited by

#95

Not everything is public domain, thief scum.
R 1 reply Last reply

0
R rooskie91@discuss.online

I propose that anyone defending themselves in court over AI stealing data must be represented exclusively by AI.
C This user is from outside of this forum
chaoscruiser@futurology.today

wrote, last edited by

#96

That would be glorious. If the future of your company depends on the LLM keeping track of hundreds of details and drawing the right conclusions, it’s game over during the first day.
1 reply Last reply

0
D davriellelouna@lemmy.world

This post did not contain any content.
P This user is from outside of this forum
plurrbear@lemmy.world

wrote, last edited by

#97

Good!!! Let the AI industry fucking burn!!!
1 reply Last reply

0
S smoogs@lemmy.world

Not everything is public domain, thief scum.
R This user is from outside of this forum
rivalarrival@lemmy.today

wrote, last edited by

#98

Do they even teach the constitution anymore?
1 reply Last reply

0


