
Judge dismisses authors' copyright lawsuit against Meta over AI training

Technology
  • This post did not contain any content.

    Grab em by the intellectual property! When you're a multi-billion dollar corporation, they just let you do it!

  • I’ll leave this here from another post on this topic…

  • Ah the Schrödinger's LLM - always hallucinating and also always accurate

    Accuracy and hallucination are two ends of a spectrum.

    If you turn hallucinations to a minimum, the LLM will faithfully reproduce what's in the training set, but the result will not fit the query very well.

    The other option is to turn the so-called temperature up, which will result in replies fitting better to the query but also the hallucinations go up.

    In the end it's a balance between getting responses that are closer to the dataset (factual) or closer to the query (creative).
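    The temperature trade-off described above can be sketched in a few lines of Python. This is a toy next-token sampler, not any real model's API; the function name and logit values are invented for illustration:

```python
import math
import random

def sample_next_token(logits, temperature=1.0, rng=random):
    """Toy next-token sampler: low temperature sticks to the most likely
    (training-set-like) token; high temperature flattens the distribution,
    giving more varied but less faithful output."""
    if temperature <= 0:
        # Temperature 0: greedy decoding, always the single most likely token.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    m = max(scaled)                           # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    r = rng.random() * sum(exps)
    acc = 0.0
    for i, e in enumerate(exps):
        acc += e
        if r < acc:
            return i
    return len(exps) - 1

logits = [2.0, 1.0, 0.1]                      # made-up scores for 3 tokens
print(sample_next_token(logits, temperature=0))   # -> 0, every time (greedy)
```

    At a high temperature (say 5.0) the same logits yield all three tokens with similar frequency, which is the "more creative, more hallucination-prone" end of the spectrum.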

  • Ah the Schrödinger's LLM - always hallucinating and also always accurate

    There is nothing intelligent about "AI" as we call it. It parrots based on probability. If you remove the randomness value from the model, it parrots the same thing every time based on its weights, and if the weights were trained on Harry Potter, it will consistently give you giant chunks of Harry Potter verbatim when prompted.

    Most of the LLM services attempt to avoid this by adding arbitrary randomness values to churn the soup. But this is also inherently part of the cause of hallucinations, as the model cannot preserve a single correct response as always the right way to respond to a certain query.

    LLMs are insanely "dumb", they're just lightspeed parrots. The fact that Meta and these other giant tech companies claim it's not theft because they sprinkle in some randomness is just obscuring the reality and the fact that their models are derivative of the work of organizations like the BBC and Wikipedia, while also dependent on the works of tens of thousands of authors to develop their corpus of language.

    In short, there was an ethical way to train these models. But that would have been slower. And the court just basically gave them a pass on theft. Facebook would have been entirely in the clear had it not stored the books in a dataset, which in itself is insane.

    I wish I knew when I was younger that stealing is wrong, unless you steal at scale. Then it's just clever business.
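    The "parrots the same thing every time" claim is easy to demonstrate with a toy stand-in: a word-bigram model (a deliberately miniature analogue of real LLM weights, not how production models are built) decoded greedily, with the randomness removed, reproduces its training text verbatim:

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count which word follows which - a toy stand-in for trained weights."""
    words = text.split()
    model = defaultdict(Counter)
    for a, b in zip(words, words[1:]):
        model[a][b] += 1
    return model

def greedy_generate(model, start, max_words=20):
    """No randomness: always pick the most frequent next word, so the same
    prompt yields the same output every time - and if the whole 'corpus'
    was one book, it regurgitates that book."""
    out = [start]
    for _ in range(max_words):
        followers = model.get(out[-1])
        if not followers:
            break
        out.append(followers.most_common(1)[0][0])
    return " ".join(out)

model = train_bigram("the boy who lived had long messy hair")
print(greedy_generate(model, "the"))
# -> "the boy who lived had long messy hair", on every single run
```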

  • There is nothing intelligent about "AI" as we call it. It parrots based on probability. […]

    Except that breaking copyright is not stealing and never was. Hard to believe that you'd ever see copyright advocates on FOSS and decentralized networks like Lemmy - it's like people had their minds hijacked because "big tech is bad".

  • Except that breaking copyright is not stealing and never was. Hard to believe that you'd ever see copyright advocates on FOSS and decentralized networks like Lemmy - it's like people had their minds hijacked because "big tech is bad".

    What name do you have for the activity of making money from someone else's work or data, without their consent or without compensating them? If the tech were just tech, it wouldn't need any non-consenting human input to work properly. These are just companies feeding on various types of data; if the justice system doesn't protect an author, what do you think would happen if these same models started feeding on user data instead? Tech is good, ethics are not.

  • What name do you have for the activity of making money from someone else's work or data, without their consent or without compensating them? If the tech were just tech, it wouldn't need any non-consenting human input to work properly. These are just companies feeding on various types of data; if the justice system doesn't protect an author, what do you think would happen if these same models started feeding on user data instead? Tech is good, ethics are not.

    How do you think you're making money with your work? Did your knowledge appear from a vacuum? Ethically speaking, nothing is an "original creation of your own merit only" - everything we make is transformative by nature.

    Either way, the debate is moot, as we'll never agree on what is transformative enough to be harmful to our society unless it's a direct 1:1 copy with the direct goal of displacing the original. But that's clearly not the case with LLMs.

  • In an ideal world, there would be something like a universal basic income, which would reduce the pressure on artists to generate enough income from their art. This would allow artists to make art that is less mainstream and more unique, and thus would, in my opinion, make it possible to weaken copyright laws.

    Well, that would be the way I would try to start change.

    I would go a step further and have creative grants to people. It would work in a way similar to the BBC and similar broadcasters, where a body gets government money and then picks creative projects it thinks are worthwhile, with a remit that goes beyond the lowest common denominator. UBI ensures that this system doesn't have a monopoly on creative output.

  • I would go a step further and have creative grants to people. It would work in a way similar to the BBC and similar broadcasters, where a body gets government money and then picks creative projects it thinks are worthwhile, with a remit that goes beyond the lowest common denominator. UBI ensures that this system doesn't have a monopoly on creative output.

    Agree 100%!

    We need more Kulturförderung (public funding for the arts)!

  • Except that breaking copyright is not stealing and never was. Hard to believe that you'd ever see copyright advocates on FOSS and decentralized networks like Lemmy - it's like people had their minds hijacked because "big tech is bad".

    Ingesting all the artwork you ever created by obtaining it illegally and feeding it into my plagiarism remix machine is theft of your work, because I did not pay for it.

    Separately, keeping a copy of this work so I can do this repeatedly is also stealing your work.

    The judge ruled the first was okay but the second was not, because the first is "transformative" - which sadly tells me that the judge, despite best efforts, does not understand how a weighted matrix of tokens works. While the companies may have some prevention steps in place now, early models showed the tech for what it was as it regurgitated text with only minor differences in word choice here and there.

    Current models have layers on top that try to prevent this, but escaping those safeguards through user input is common, and it only masks the fact that the entire model is built off of the theft of others' work.

  • 738 votes
    67 posts
    279 views
    Those have always been the two big problems with AI. Biases in the training, intentional or not, will always bias the output. And AI is incapable of saying "I do not have sufficient training on this subject or reliable sources for it to give you a confident answer." It will always give you its best guess, even if it is completely hallucinating much of the data. The only way to identify the hallucinations, if it isn't just saying absurd stuff on the face of it, is to do independent research to verify it, at which point you may as well have just researched it yourself in the first place.

    AI is a tool, and it can be a very powerful tool with the right training and use cases. For example, I use it as a software engineer to help me parse error codes when googling isn't working, or to give me code examples for modules I've never used. There is no small number of times it has been completely wrong, but in my particular use case, that is pretty easy to confirm very quickly. The code either works as expected or it doesn't, and code is always tested before releasing it anyway.

    In research, it is great at helping you find a relevant source across the internet or in a specific database. It is usually very good at summarizing a source for you to get a quick idea about it before diving into dozens of pages. It CAN be good at helping you write your own papers in a LIMITED capacity, such as cleaning up your writing to make it clearer, or correctly formatting your bibliography (with actual sources you provide or at least verify).

    But you have to remember that it doesn't "know" anything at all. It isn't sentient, intelligent, thoughtful, or any other personification placed on AI. None of the information it gives you is trustworthy without verification. It can and will fabricate entire studies that do not exist, even while attributing them to real researchers. It can mix unreliable information with reliable information because there is no difference to it.

    Put simply, it is not a reliable source of information... ever. Make sure you understand that.
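    The "easy to confirm very quickly" workflow described above can be made concrete: take a function as an assistant might draft it, and run a couple of assertions before trusting it. The function and test values here are invented for illustration:

```python
def median(values):
    """A function as an AI assistant might draft it - treat as unverified."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2:                                      # odd count: middle element
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: mean of middle two

# Verification step: the code either passes these checks or it doesn't.
assert median([3, 1, 2]) == 2
assert median([4, 1, 3, 2]) == 2.5
assert median([7]) == 7
print("suggestion verified")
```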
  • Dubai to debut restaurant operated by an AI chef

    Technology
    26 votes
    6 posts
    34 views
    Huh, looks like my days of having absolutely zero interest in going to Dubai are coming to a middle
  • VMware’s rivals ramp efforts to create alternative stacks

    Technology
    77 votes
    10 posts
    82 views
    I don't use any GUI... I use Terraform in the terminal or via CI/CD. There is an API and also a Terraform provider for Proxmox, and I can use that, together with Ansible and shell scripts, to manage VMs, but I was looking for k8s support. Again, it works fine for small environments, with a bit of manual work and human intervention, but for larger ones, I need a bit more. I moved away from a few VMs acting as k8s nodes, to k8s as a service (at work).
  • 4 votes
    1 post
    13 views
    No one has replied
  • Why Decentralized Social Media Matters

    Technology
    388 votes
    45 posts
    204 views
    fizz@lemmy.nz
    Yeah, we're kinda doing well. Retaining 50k MAU from the initial user burst is really good, and Lemmy was technologically really bad at the time. It's a lot more developed today. I think next time Reddit fucks up we spike to over 100k users and steadily grow from there.
  • New Supermaterial: As Strong As Steel And As Light As Styrofoam

    Technology
    60 votes
    21 posts
    108 views
    I remember an Arthur C. Clarke novel where a spaceship needs water from the planet below. The easiest thing is to lower cables from space and then lift some icebergs.
  • 2 votes
    2 posts
    23 views
    quarterswede@lemmy.world
    I give it 5 years before this is on our phones.
  • 143 votes
    30 posts
    142 views
    johnedwa@sopuli.xyz
    You do not need to ask for consent to use functional cookies, only for ones that are used for tracking, which is why you'll still have some cookies left afterwards and why properly coded sites don't break from the rejection. Most websites could strip out all of the 3rd party spyware and by doing so get rid of the popup entirely. They'll never do it because money, obviously, and sometimes instead cripple their site to blackmail you into accepting them.
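    The functional-vs-tracking distinction above is simple to honor in code. A minimal sketch using Python's stdlib cookie container; the cookie names (`session_id`, `analytics_id`) and values are invented for illustration:

```python
from http.cookies import SimpleCookie

def build_cookies(tracking_consent: bool) -> SimpleCookie:
    """Functional cookies need no consent banner; tracking cookies are
    only set after the user explicitly opts in."""
    jar = SimpleCookie()
    jar["session_id"] = "abc123"          # functional: login/session state
    jar["session_id"]["httponly"] = True
    if tracking_consent:
        jar["analytics_id"] = "xyz789"    # tracking: requires consent
    return jar

# Rejecting the banner still leaves the site working:
jar = build_cookies(tracking_consent=False)
print(sorted(jar.keys()))                 # -> ['session_id']
```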