linux-nerds.org

Judge backs AI firm over use of copyrighted books

Technology

59 Posts 34 Posters 547 Views

X xthexder@l.sw0.com

C could still bankrupt the company depending on how trial goes. They pirated a lot of books.
A This user is from outside of this forum
artisian@lemmy.world

wrote on, last edited by

#44

As a civil matter, the publishing houses are more likely to get the full amount if Anthropic stays in business (and does well). So it might be bad, but I'm really skeptical about bankruptcy (and I'm not hearing anyone seriously floating it?).
X 1 Reply Last reply

4
D davriellelouna@lemmy.world

This post did not contain any content.
B This user is from outside of this forum
blametheantifa@lemmy.world

wrote on, last edited by blametheantifa@lemmy.world

#45

Anakin: “Judge backs AI firm over use of copyrighted books”
Padme: “But they’ll be held accountable when they reproduce parts of those works or compete with the work they were trained on, right?”
Anakin: “…”
Padme: “Right?”
1 Reply Last reply

7
G grimy@lemmy.world

Because of the vast amount of data needed, there will be no competitive viable open source solution if half the data is kept in a walled garden.

This is about open weights vs closed weights.
H This user is from outside of this forum
hendrik@palaver.p3x.de

wrote on, last edited by hendrik@palaver.p3x.de

#46

I agree that we need open source and need to emancipate ourselves. The main issue I see is that the entire approach doesn't work. Take the internet as an example. It's meant to be very open, to connect everyone and to let them share information freely. It is set up to be a level playing field... Now look at what that led to: trillion-dollar mega-corporations, privacy issues everywhere and big data silos. That's what the approach promotes. I agree with the goal, but in my opinion the approach will lead to less open source and more control by rich companies. And that's not what we want.

Plus, nobody even opens the walled gardens. Last time I looked, Reddit wanted money for its data. Other big platforms aren't open either. And there's a small war going on between scrapers and crawlers and the countermeasures against them. So it's not as if it's open as of now.
G 1 Reply Last reply

0
D davriellelouna@lemmy.world

This post did not contain any content.
F This user is from outside of this forum
fingolfinz@lemmy.world

wrote on, last edited by

#47

Pirate everything!
1 Reply Last reply

4
S sculptuspoe@lemmy.world

If you try to sell "the new adventures of Doctor Strange, Jonathan Strange and Magic Man." existing copyright laws are sufficient and will stop it. Really, training should be regulated by the same laws as reading. If they can get the material through legitimate means it should be fine, but pulling data that is not freely accessible should be theft, as it is already.
I This user is from outside of this forum
imgonnatrythis@sh.itjust.works

wrote on, last edited by

#48

I have a freely accessible document under a CC license that states it is not to be used commercially. This is commercial use. Your policy would allow that document to be used anyway, since it is accessible. This kind of policy discourages me from sharing my works easily, since others profit from my efforts and my works are more likely to be attributed to a corporate beast I want nothing to do with than to me.

I'm all for copyright reform and simpler copyright law, but these companies need to be held to standard copyright rules, not made-up modifications.
I'm convinced a perfectly decent LLM could be built without violating copyrights.

I'd also be OK sharing works with a not-for-profit open-source LLM, and I think others might be as well.
1 Reply Last reply

4
H hendrik@palaver.p3x.de

I agree that we need open source and need to emancipate ourselves. The main issue I see is that the entire approach doesn't work. Take the internet as an example. It's meant to be very open, to connect everyone and to let them share information freely. It is set up to be a level playing field... Now look at what that led to: trillion-dollar mega-corporations, privacy issues everywhere and big data silos. That's what the approach promotes. I agree with the goal, but in my opinion the approach will lead to less open source and more control by rich companies. And that's not what we want.

Plus, nobody even opens the walled gardens. Last time I looked, Reddit wanted money for its data. Other big platforms aren't open either. And there's a small war going on between scrapers and crawlers and the countermeasures against them. So it's not as if it's open as of now.
G This user is from outside of this forum
grimy@lemmy.world

wrote on, last edited by grimy@lemmy.world

#49

A lot of our laws are indeed obsolete. I think the best solution would be to force copyleft licenses on anything that uses publicly created data.

But I'll take the wild west we have now, with no walls, over any kind of copyright dystopia. Reddit did successfully sell its data to Google for 60 million. Right now, you can legally scrape anything you want off Reddit; it is an open garden in every sense of the word (even if they don't like it). It's a lot more legal than using pirated books, but Google still bet 60 million that copyright laws would swing broadly in their favor.

I think it's very foolhardy to even hint at a pro-copyright stance right now. There is a very real chance of AI getting monopolized, and this is how they will do it.
H 1 Reply Last reply

1
G grimy@lemmy.world

A lot of our laws are indeed obsolete. I think the best solution would be to force copyleft licenses on anything that uses publicly created data.

But I'll take the wild west we have now, with no walls, over any kind of copyright dystopia. Reddit did successfully sell its data to Google for 60 million. Right now, you can legally scrape anything you want off Reddit; it is an open garden in every sense of the word (even if they don't like it). It's a lot more legal than using pirated books, but Google still bet 60 million that copyright laws would swing broadly in their favor.

I think it's very foolhardy to even hint at a pro-copyright stance right now. There is a very real chance of AI getting monopolized, and this is how they will do it.
H This user is from outside of this forum
hendrik@palaver.p3x.de

wrote on, last edited by hendrik@palaver.p3x.de

#50

I agree a copyright dystopia wouldn't be any good. Just mind that the wild west, or the law of the jungle, is the "right of the strongest". You're advantaging big companies and disadvantaging smaller players, or people who have ethics or are more open and transparent.

And I don't think the legality of web scraping is the biggest issue. Sure, maybe I could do it if it were possible. But I occasionally do some weird stuff, and most services have countermeasures in place. In reality I just can't scrape Reddit. Lots of bots and crawlers just don't work any more. I'm getting rate-limited left and right by all the big platforms. Lots of things require an account these days, and services are quick to ban me for "suspicious activity". It's barely possible to download YouTube videos these days. So, no, I can't, while Google can just pay for it and have the data.

Also, Reddit isn't really the benevolent underdog here. They're a big company as well. And they're not selling their data... They're selling their users' data. They're mainly monetizing other people's creations.
1 Reply Last reply

0
S sculptuspoe@lemmy.world

If you try to sell "the new adventures of Doctor Strange, Jonathan Strange and Magic Man." existing copyright laws are sufficient and will stop it. Really, training should be regulated by the same laws as reading. If they can get the material through legitimate means it should be fine, but pulling data that is not freely accessible should be theft, as it is already.
K This user is from outside of this forum
kate@lemmy.uhhoh.com

wrote on, last edited by kate@lemmy.uhhoh.com

#51

as it is already

Copies of copyrighted works cannot be regarded as "stolen property" for the purposes of a prosecution under the National Stolen Property Act of 1934.

https://en.m.wikipedia.org/wiki/Dowling_v._United_States_(1985)
1 Reply Last reply

0
S sentient_loom@sh.itjust.works

used to train both commercial

commercial training is, in this case, stealing people's work for commercial gain

and open source language models

so, uh, let us train open-source models on open-source text. There's so much of it that there's no need to steal.

?

I'm not sure why you added a question mark at the end of your statement.
G This user is from outside of this forum
gaylord_fartmaster@lemmy.world

wrote on, last edited by

#52

I'm not sure why you added a question mark at the end of your statement.

I was questioning whether or not you would see that as a benefit. Clearly you don't.

Are you also against libraries letting people borrow books since those are also lost sales for the authors, or are you just a luddite?
S 1 Reply Last reply

2
G gaylord_fartmaster@lemmy.world

I'm not sure why you added a question mark at the end of your statement.

I was questioning whether or not you would see that as a benefit. Clearly you don't.

Are you also against libraries letting people borrow books since those are also lost sales for the authors, or are you just a luddite?
S This user is from outside of this forum
sentient_loom@sh.itjust.works

wrote on, last edited by

#53

libraries letting people borrow books

This is so far from analogous that it's almost a non sequitur.

are you just a luddite?

No, and you don't even believe such nonsense. You're grasping, ineffectively.
1 Reply Last reply

1
M This user is from outside of this forum
meaanbeaan@lemmy.world

wrote on, last edited by

#54

Wait, the authors argued that? Why? That's literally the opposite of the thing they needed to argue.
1 Reply Last reply

3
A artisian@lemmy.world

As a civil matter, the publishing houses are more likely to get the full amount if Anthropic stays in business (and does well). So it might be bad, but I'm really skeptical about bankruptcy (and I'm not hearing anyone seriously floating it?).
X This user is from outside of this forum
xthexder@l.sw0.com

wrote on, last edited by

#55

Depending on the type of bankruptcy, the business can still operate; all its profits would just go towards paying off its debts.
1 Reply Last reply

2
X xthexder@l.sw0.com

C could still bankrupt the company depending on how trial goes. They pirated a lot of books.
X This user is from outside of this forum
xerxos@lemmy.ml

wrote on, last edited by xerxos@lemmy.ml

#56

It might not be that bad. Most 'damage' (as publishers see it) comes from distribution, not the download itself. Depending on how they acquired the books, it might not be much of a problem.
1 Reply Last reply

2
A artisian@lemmy.world

Plaintiffs made that argument, and the judge shot it down pretty hard: that kind of competition isn't what copyright protects against. He makes an analogy with teachers teaching children to write fiction: they are using existing fantasy to create MANY more competitors in the fiction market. Could an author use copyright to challenge that use?

Would love to hear your thoughts on the ruling itself (it's linked by Reuters).
C This user is from outside of this forum
cort@lemmy.world

wrote on, last edited by

#57

Orcs and dwarves (with a v) are creations of Tolkien. If the fantasy stories include them, it's a violation of copyright, the same as including Mickey Mouse.

My argument would have been to ask the AI for the bass line to Queen & David Bowie's Under Pressure, then refer to that as a reproduction of copyrighted material. But then again, AI companies probably have better lawyers than Vanilla Ice.
A 1 Reply Last reply

0
T the_q@lemmy.zip

An 80 year old judge on their best day couldn't be trusted to make an informed decision. This guy was either bought or confused into his decision. Old people gotta go.
A This user is from outside of this forum
awesomelowlander@sh.itjust.works

wrote on, last edited by

#58

Funny, there are a lot of people on Lemmy itself (especially around dbzer0) who would agree with the judge wholeheartedly.
1 Reply Last reply

0
C cort@lemmy.world

Orcs and dwarves (with a v) are creations of Tolkien. If the fantasy stories include them, it's a violation of copyright, the same as including Mickey Mouse.

My argument would have been to ask the AI for the bass line to Queen & David Bowie's Under Pressure, then refer to that as a reproduction of copyrighted material. But then again, AI companies probably have better lawyers than Vanilla Ice.
A This user is from outside of this forum
artisian@lemmy.world

wrote on, last edited by artisian@lemmy.world

#59

The students read Tolkien, then invent their own settings. The judge thinks this is similar to how Claude works. Neither I nor, I suspect, the judge meant that the students were reusing worldbuilding wholesale.
1 Reply Last reply

0
