linux-nerds.org

Judge dismisses authors' copyright lawsuit against Meta over AI training

Technology

24 posts 14 commenters 337 views

drmoose@lemmy.world wrote:

This post did not contain any content.

Judge dismisses authors' copyright lawsuit against Meta over AI training

A federal judge on Wednesday sided with Facebook parent Meta Platforms in dismissing a copyright infringement lawsuit from a group of authors who accused the company of stealing their works to train its artificial intelligence technology.

AP News (apnews.com)
faizalr@fedia.io wrote, in reply to drmoose@lemmy.world:

#3

Bad judgement.

2

drmoose@lemmy.world wrote:

This is the notorious lawsuit from a year ago:

a group of well-known writers that includes comedian Sarah Silverman and authors Jacqueline Woodson and Ta-Nehisi Coates

The judge affirms that AI training is fair use:

But the actual process of an AI system distilling from thousands of written works to be able to produce its own passages of text qualified as “fair use” under U.S. copyright law because it was “quintessentially transformative,” Alsup wrote.

This is the second judgement of this type this week.

deathmetal27@lemmy.world wrote, in reply to drmoose@lemmy.world (last edited by deathmetal27@lemmy.world):

#4

Alsup? Is this the same judge who also presided over Oracle v. Google over the use of Java in Android? That guy really does his homework on the cases he presides over; he learned how to code to see if APIs are copyrightable.

As for the ruling, I'm not in favour of AI training on copyrighted material, but I can see where the judgement is coming from. I think it's a matter of what's really copyrightable: the actual text or images, or the abstract knowledge in the material. In other words, if you were to read a book and then write a summary of a section of it in your own words, or orally describe what you learned from the book to someone else, does that mean copyright infringement? Or if you watch a movie and then describe your favourite scenes to your friends?

Perhaps a case could be made that AI training on copyrighted materials is not the same as humans consuming the copyrighted material and therefore it should have a different provision in copyright law. I'm no lawyer, but I'd assume that current copyright law works on the basis that humans do not generally have perfect recall of the copyrighted material they consume. But then again, a counterargument could be that neither does the AI, due to its tendency to hallucinate sometimes. However, it still has superior recall compared to humans, and perhaps that could be grounds for amending copyright law about AI training?

20

ocassionallyaduck@lemmy.world wrote, in reply to drmoose@lemmy.world:

#5

Terrible judgement.

Turn the K value down on the model and it reproduces text near verbatim.

10

drmoose@lemmy.world wrote, in reply to deathmetal27@lemmy.world:

#6

Your last paragraph would be the ideal solution in an ideal world, but I don't think anything like this could ever happen under the current political and economic structures.

First, it's super easy to hide all of this, and enforcement would be very difficult even domestically. Second, because we're in an AI race, no one would ever put themselves at such a disadvantage unless there's real damage, not economic copyright juggling.

People need to come to terms with these facts so we can address real problems rather than blow against the wind with all this whining we see on Lemmy. There are actual things we can do.

4

drmoose@lemmy.world wrote, in reply to ocassionallyaduck@lemmy.world:

#7

Ah the Schrödinger's LLM - always hallucinating and also always accurate

6

petter1@lemm.ee wrote, in reply to deathmetal27@lemmy.world:

#8

Agree 100%

Hope we can refactor this whole copyright/patent concept soon..

It is more of a pain for artists, creators, releasers etc.

I see it with EDM: I work as a label and sometimes produce a bit myself.

Most artists will work with samples and presets etc. And keeping track of who worked on what and who owns what percentage of what etc. just takes the joy out of creating..

Same for game design: you have a vision for your game, make a PoC, and then have to change the whole game because of stupid patent shit not allowing you to e.g. land on a horse and immediately ride it, or throw stuff at things to catch them…

7

anarchistartificer@slrpnk.net wrote, in reply to petter1@lemm.ee:

#9

I'm inclined to agree. I hate AI, and I especially hate artists and other creatives being shafted, but I'm increasingly doubtful that copyright is an effective way to ensure that they get their fair share (whether we're talking about AI or otherwise).

4

facedeer@fedia.io wrote, in reply to faizalr@fedia.io:

#10

Any reason to say that other than that it didn't give the result you wanted?

5

petter1@lemm.ee wrote, in reply to anarchistartificer@slrpnk.net:

#11

In an ideal world, there would be something like a universal basic income, which would reduce the pressure on artists to generate enough income from their art. This would allow artists to make art that is less mainstream and more unique, and thus would, in my opinion, make it possible to weaken copyright laws.

Well, that would be the way I would try to start the change.

6

tabular@lemmy.world wrote, in reply to drmoose@lemmy.world (last edited by tabular@lemmy.world):

#12

"hallucination refers to the generation of plausible-sounding but factually incorrect or nonsensical information"

Is an output a hallucination when the training data involved in that output included factually incorrect data? Suppose my input is "is the world flat" and then an LLM, allegedly, accurately generates a flat-earther's writings saying it is.

2

deathmetal27@lemmy.world wrote, in reply to drmoose@lemmy.world:

#13

One way I could see this being enforced is by mandating that AI models not respond to questions that could result in speaking about a copyrighted work. Similar to how mainstream models don't speak about vulgar or controversial topics.

But yeah, realistically, it's unlikely that any judge would rule in favour of that.

1

pattymcb@lemmy.world wrote, in reply to drmoose@lemmy.world:

#14

It sounds like the precedent has been set

3

amosburton_thatguy@lemmy.ca wrote, in reply to drmoose@lemmy.world:

#15

Grab em by the intellectual property! When you're a multi-billion dollar corporation, they just let you do it!

4

blametheantifa@lemmy.world wrote, in reply to drmoose@lemmy.world:

#16

I’ll leave this here from another post on this topic…

2

squaresinger@lemmy.world wrote, in reply to drmoose@lemmy.world:

#17

Accuracy and hallucination are two ends of a spectrum.

If you turn hallucinations to a minimum, the LLM will faithfully reproduce what's in the training set, but the result will not fit the query very well.

The other option is to turn the so-called temperature up, which results in replies that fit the query better, but the hallucinations also go up.

In the end it's a balance between getting responses that are closer to the dataset (factual) or closer to the query (creative).
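
For readers unfamiliar with the term, here is a minimal sketch of what "temperature" usually does in sampling, assuming the common softmax-with-temperature formulation. The scores in `logits` are made up for illustration, and the sketch only shows how the probability distribution sharpens or flattens; it makes no claim about what any real model has memorised.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens (not from a real model).
logits = [2.0, 1.0, 0.5]

low = softmax_with_temperature(logits, 0.2)   # low temperature: sharp distribution
high = softmax_with_temperature(logits, 2.0)  # high temperature: flatter distribution

print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

With these made-up scores, a temperature of 0.2 puts nearly all of the probability on the top candidate, while 2.0 spreads it across all three.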

1

ocassionallyaduck@lemmy.world wrote, in reply to drmoose@lemmy.world:

#18

There is nothing intelligent about "AI" as we call it. It parrots based on probability. If you remove the randomness value from the model, it parrots the same thing every time based on its weights, and if the weights were trained on Harry Potter, it will consistently give you giant chunks of Harry Potter verbatim when prompted.

Most of the LLM services attempt to avoid this by adding arbitrary randomness values to churn the soup. But this is also inherently part of the cause of hallucinations, as the model cannot preserve a single correct response as always the right way to respond to a certain query.
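
The "randomness value" can be illustrated with a small sketch, assuming plain top-k sampling over a fixed, hypothetical probability list (`probs` and `sample_top_k` are illustrative names, not any library's API). With k = 1 the same index comes back every time; with a larger k the draws vary. Whether a real model then emits training text verbatim is a separate empirical question that this toy example does not address.

```python
import random


def sample_top_k(probs, k, rng):
    """Sample one index from the k most probable entries, weighted by probability."""
    # Indices of the k largest probabilities, most probable first.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    weights = [probs[i] for i in top]
    return rng.choices(top, weights=weights, k=1)[0]


# Hypothetical next-token probabilities (not from a real model).
probs = [0.5, 0.3, 0.2]
rng = random.Random(0)

# k = 1 is greedy decoding: only the most probable index can ever be chosen.
greedy = {sample_top_k(probs, 1, rng) for _ in range(100)}

# k = 3 samples from the whole list, so repeated draws differ.
sampled = {sample_top_k(probs, 3, rng) for _ in range(100)}

print(greedy)
print(sampled)
```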

LLMs are insanely "dumb", they're just lightspeed parrots. The fact that Meta and these other giant tech companies claim it's not theft because they sprinkle in some randomness is just obscuring the reality and the fact that their models are derivative of the work of organizations like the BBC and Wikipedia, while also dependent on the works of tens of thousands of authors to develop their corpus of language.

In short, there was an ethical way to train these models. But that would have been slower. And the court just basically gave them a pass on theft. Facebook would have been entirely in the clear had it not stored the books in a dataset, which in itself is insane.

I wish I knew when I was younger that stealing is wrong, unless you steal at scale. Then it's just clever business.

0

drmoose@lemmy.world wrote, in reply to ocassionallyaduck@lemmy.world:

#19

Except that breaking copyright is not stealing and never was. Hard to believe that you'd ever see copyright advocates on FOSS and decentralized networks like Lemmy - it's like people had their minds hijacked because "big tech is bad".

1

josefo@leminal.space wrote, in reply to drmoose@lemmy.world:

#20

What name do you have for the activity of making money using someone else's work or data, without their consent or giving compensation? If the tech was just tech, it wouldn't need any non-consenting human input to work properly. These are just companies feeding on various types of data. If justice doesn't protect an author, what do you think would happen if these same models started feeding off user data instead? Tech is good, ethics are not.

1

drmoose@lemmy.world wrote, in reply to josefo@leminal.space:

#21

How do you think you're making money with your work? Did your knowledge appear from a vacuum? Ethically speaking nothing is "original creation of your own merit only" - everything we make is transformative by nature.

Either way, the talk is moot, as we'll never agree on what is transformative enough to be harmful to our society unless it's a direct 1:1 copy with the direct goal of displacing the original. But that's clearly not the case with LLMs.

1

fishface@lemmy.world wrote, in reply to petter1@lemm.ee:

#22

I would go a step further and have creative grants to people. It would work in a way similar to the BBC and similar broadcasters, where a body gets government money and then picks creative projects it thinks are worthwhile, with a remit that goes beyond the lowest common denominator. UBI ensures that this system doesn't have a monopoly on creative output.

1
