
Judge dismisses authors' copyright lawsuit against Meta over AI training

Technology
  • Ah the Schrödinger's LLM - always hallucinating and also always accurate

    "hallucination refers to the generation of plausible-sounding but factually incorrect or nonsensical information"

    Is an output a hallucination when the training data involved in that output included factually incorrect data? Suppose my input is "is the world flat" and an LLM, allegedly accurately, generates a flat-earther's writings saying it is.

  • Your last paragraph would be the ideal solution in an ideal world, but I don't think anything like this could ever happen within the current political and economic structures.

    First, it's super easy to hide all of this, and enforcement would be very difficult even domestically. Second, because we're in an AI race, no one would ever put themselves at such a disadvantage unless there's real damage, not economic copyright juggling.

    People need to come to terms with these facts so we can address real problems rather than blowing against the wind with all this whining we see on Lemmy. There are actual things we can do.

    One way I could see this being enforced is by mandating that AI models not respond to questions that could result in speaking about a copyrighted work. Similar to how mainstream models don't speak about vulgar or controversial topics.

    But yeah, realistically, it's unlikely that any judge would rule in that favour.
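A crude version of the blocking idea described above can be sketched as a prompt filter. Everything here (the `BLOCKED_WORKS` set, the `should_refuse` function) is hypothetical and purely illustrative; real-world guardrails use trained classifiers rather than substring matching:

```python
# Hypothetical blocklist of copyrighted-work titles (illustrative only).
BLOCKED_WORKS = {"harry potter", "the lord of the rings"}

def should_refuse(prompt: str) -> bool:
    """Naive guardrail: refuse any prompt that mentions a blocked work.

    A production system would use a classifier, not substring checks,
    but the shape of the mechanism is the same: inspect the request,
    refuse before generating.
    """
    p = prompt.lower()
    return any(title in p for title in BLOCKED_WORKS)

should_refuse("Recite chapter 1 of Harry Potter")  # True
should_refuse("What is the capital of France?")    # False
```

Even this toy version shows why enforcement is hard: the filter only catches requests that name the work, not paraphrases or indirect prompts.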

  • This post did not contain any content.

    It sounds like the precedent has been set

  • This post did not contain any content.

    Grab em by the intellectual property! When you're a multi-billion dollar corporation, they just let you do it!

  • This post did not contain any content.

    I’ll leave this here from another post on this topic…

  • Ah the Schrödinger's LLM - always hallucinating and also always accurate

    Accuracy and hallucination are two ends of a spectrum.

    If you turn hallucinations to a minimum, the LLM will faithfully reproduce what's in the training set, but the result will not fit the query very well.

    The other option is to turn the so-called temperature up, which results in replies that fit the query better, but also in more hallucinations.

    In the end it's a balance between getting responses that are closer to the dataset (factual) or closer to the query (creative).
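The "temperature" knob this comment refers to can be sketched in a few lines. This is a generic illustration of temperature-scaled softmax sampling, not any particular vendor's implementation; the function name and the example logits are made up for the sketch:

```python
import math
import random

def sample_with_temperature(logits, temperature):
    """Sample a token index from raw model logits.

    temperature == 0 degenerates to greedy argmax (deterministic);
    higher temperature flattens the distribution, so less likely
    tokens get picked more often (more varied output).
    """
    if temperature <= 0:
        # Pure greedy decoding: always pick the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Temperature-scaled softmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(probs)), weights=probs, k=1)[0]

sample_with_temperature([1.0, 5.0, 2.0], 0)    # always index 1 (greedy)
sample_with_temperature([1.0, 5.0, 2.0], 1.5)  # usually 1, sometimes 0 or 2
```

One caveat to the comment's framing: temperature controls sampling randomness, not factuality directly, so low temperature means repeatable output, not necessarily correct output.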

  • Ah the Schrödinger's LLM - always hallucinating and also always accurate

    There is nothing intelligent about "AI" as we call it. It parrots based on probability. If you remove the randomness value from the model, it parrots the same thing every time based on its weights, and if the weights were trained on Harry Potter, it will consistently give you giant chunks of Harry Potter verbatim when prompted.

    Most of the LLM services attempt to avoid this by adding arbitrary randomness values to churn the soup. But this is also inherently part of the cause of hallucinations, as the model cannot preserve a single correct response as always the right way to respond to a certain query.

    LLMs are insanely "dumb", they're just lightspeed parrots. The fact that Meta and these other giant tech companies claim it's not theft because they sprinkle in some randomness is just obscuring the reality and the fact that their models are derivative of the work of organizations like the BBC and Wikipedia, while also dependent on the works of tens of thousands of authors to develop their corpus of language.

    In short, there was an ethical way to train these models. But that would have been slower. And the court just basically gave them a pass on theft. Facebook would have been entirely in the clear had it not stored the books in a dataset, which in itself is insane.

    I wish I knew when I was younger that stealing is wrong, unless you steal at scale. Then it's just clever business.
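The "remove the randomness and it parrots the same thing every time" point above corresponds to greedy (argmax) decoding. A minimal sketch, with made-up per-step logits standing in for a real model's outputs:

```python
def greedy_decode(step_logits):
    """Pick the argmax token at every step.

    With fixed weights and no sampling, the same prompt always
    produces the same token sequence - fully deterministic.
    """
    return [max(range(len(step)), key=step.__getitem__)
            for step in step_logits]

# Fake logits for three decoding steps over a 3-token vocabulary.
steps = [[0.1, 2.0, 0.3], [1.5, 0.2, 0.9], [0.0, 0.1, 3.2]]
greedy_decode(steps)                            # [1, 0, 2]
greedy_decode(steps) == greedy_decode(steps)    # True, every run
```

Whether such deterministic reproduction legally counts as a copy is exactly what this thread is arguing about; the sketch only shows the mechanism.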

  • There is nothing intelligent about "AI" as we call it. It parrots based on probability. If you remove the randomness value from the model, it parrots the same thing every time based on its weights, and if the weights were trained on Harry Potter, it will consistently give you giant chunks of Harry Potter verbatim when prompted.

    Most of the LLM services attempt to avoid this by adding arbitrary randomness values to churn the soup. But this is also inherently part of the cause of hallucinations, as the model cannot preserve a single correct response as always the right way to respond to a certain query.

    LLMs are insanely "dumb", they're just lightspeed parrots. The fact that Meta and these other giant tech companies claim it's not theft because they sprinkle in some randomness is just obscuring the reality and the fact that their models are derivative of the work of organizations like the BBC and Wikipedia, while also dependent on the works of tens of thousands of authors to develop their corpus of language.

    In short, there was an ethical way to train these models. But that would have been slower. And the court just basically gave them a pass on theft. Facebook would have been entirely in the clear had it not stored the books in a dataset, which in itself is insane.

    I wish I knew when I was younger that stealing is wrong, unless you steal at scale. Then it's just clever business.

    Except that breaking copyright is not stealing and never was. It's hard to believe you'd ever see copyright advocates on FOSS and decentralized networks like Lemmy - it's like people had their minds hijacked because "big tech is bad".

  • Except that breaking copyright is not stealing and never was. It's hard to believe you'd ever see copyright advocates on FOSS and decentralized networks like Lemmy - it's like people had their minds hijacked because "big tech is bad".

    What name do you have for the activity of making money using someone else's work or data, without their consent or any compensation? If the tech were just tech, it wouldn't need any non-consenting human input to work properly. These are just companies feeding on various types of data; if justice doesn't protect an author, what do you think would happen if these same models started feeding on user data instead? Tech is good, the ethics are not.

  • What name do you have for the activity of making money using someone else's work or data, without their consent or any compensation? If the tech were just tech, it wouldn't need any non-consenting human input to work properly. These are just companies feeding on various types of data; if justice doesn't protect an author, what do you think would happen if these same models started feeding on user data instead? Tech is good, the ethics are not.

    How do you think you're making money with your work? Did your knowledge appear from a vacuum? Ethically speaking, nothing is an "original creation of your own merit only" - everything we make is transformative by nature.

    Either way, the talk is moot, as we'll never agree on what is transformative enough to be harmful to our society unless it's a direct 1:1 copy with the direct goal of displacing the original. But that's clearly not the case with LLMs.

  • In an ideal world, there would be something like a universal basic income, which would reduce the pressure on artists to generate enough income with their art. This would let artists make art that is less mainstream and more unique, and thus would, in my opinion, allow copyright laws to be weakened.

    Well, that would be the way I would try to start change.

    I would go a step further and have creative grants to people. It would work in a way similar to the BBC and similar broadcasters, where a body gets government money and then picks creative projects it thinks are worthwhile, with a remit that goes beyond the lowest common denominator. UBI ensures that this system doesn't have a monopoly on creative output.

  • I would go a step further and have creative grants to people. It would work in a way similar to the BBC and similar broadcasters, where a body gets government money and then picks creative projects it thinks are worthwhile, with a remit that goes beyond the lowest common denominator. UBI ensures that this system doesn't have a monopoly on creative output.

    Agree 100%!

    We need more Kulturförderung (public funding for the arts)!

  • Except that breaking copyright is not stealing and never was. It's hard to believe you'd ever see copyright advocates on FOSS and decentralized networks like Lemmy - it's like people had their minds hijacked because "big tech is bad".

    Ingesting all the artwork you ever created by obtaining it illegally and feeding it into my plagiarism remix machine is theft of your work, because I did not pay for it.

    Separately, keeping a copy of this work so I can do this repeatedly is also stealing your work.

    The judge ruled the first was okay but the second was not, because the first is "transformative". Sadly, that tells me the judge, despite best efforts, does not understand how a weighted matrix of tokens works: while there may be some prevention steps in place now, early models showed the tech for what it was as they regurgitated text with only minor differences in word choice here and there.

    Current models have layers on top that try to prevent this in response to user input, but escaping those safeguards is common, and it only masks the fact that the entire model is built off of the theft of others' work.
