Judge backs AI firm over use of copyrighted books
-
C could still bankrupt the company depending on how trial goes. They pirated a lot of books.
As a civil matter, the publishing houses are more likely to get the full money if Anthropic stays in business (and does well). So it might be bad, but I'm really skeptical about bankruptcy (and I'm not hearing anyone seriously floating it?)
-
Anakin: “Judge backs AI firm over use of copyrighted books”
Padme: “But they’ll be held accountable when they reproduce parts of those works or compete with the work they were trained on, right?”
Anakin: “…”
Padme: “Right?”
-
Because of the vast amount of data needed, there will be no competitive viable open source solution if half the data is kept in a walled garden.
This is about open weights vs closed weights.
I agree that we need open source and need to emancipate ourselves. The main issue I see is: The entire approach doesn't work. I'd like to give the internet as an example. It's meant to be very open, connect everyone and enable them to share information freely. It is set up to be a level playing field... Now look what that leads to. Trillion dollar mega-corporations, privacy issues everywhere and big data silos. That's what the approach promotes. I agree with the goal. But in my opinion the approach will turn out to lead to less open source and more control by rich companies. And that's not what we want.
Plus nobody even opens the walled gardens. Last time I looked, Reddit wanted money for data. Other big platforms aren't open either. And there's kind of a small war going on between the scrapers and crawlers and the countermeasures against them. So it's not as if it's open as of now.
-
Pirate everything!
-
If you try to sell "the new adventures of Doctor Strange, Jonathan Strange and Magic Man", existing copyright laws are sufficient and will stop it. Really, training should be regulated by the same laws as reading. If they can get the material through legitimate means it should be fine, but pulling data that is not freely accessible should be theft, as it is already.
I have a freely accessible document under a CC license that states it is not to be used commercially. This is commercial use. Your policy would allow that document to be used anyway, since it is accessible. This kind of policy discourages me from easily sharing my works, as others profit from my efforts and my works are more likely to be attributed to a corporate beast I want nothing to do with than to me.
I'm all for copyright reform and simpler copyright law, but these companies need to be held to standard copyright rules and not just made-up modifications.
I'm convinced a perfectly decent LLM could be built without violating copyrights. I'd also be OK sharing works with a not-for-profit open source LLM, and I think others might be as well.
-
I agree that we need open source and need to emancipate ourselves. The main issue I see is: The entire approach doesn't work. I'd like to give the internet as an example. It's meant to be very open, connect everyone and enable them to share information freely. It is set up to be a level playing field... Now look what that leads to. Trillion dollar mega-corporations, privacy issues everywhere and big data silos. That's what the approach promotes. I agree with the goal. But in my opinion the approach will turn out to lead to less open source and more control by rich companies. And that's not what we want.
Plus nobody even opens the walled gardens. Last time I looked, Reddit wanted money for data. Other big platforms aren't open either. And there's kind of a small war going on between the scrapers and crawlers and the countermeasures against them. So it's not as if it's open as of now.
A lot of our laws are indeed obsolete. I think the best solution would be to force copyleft licenses on anything using publicly created data.
But I'll take the wild west we have now, with no walls, over any kind of copyright dystopia. Reddit did successfully sell its data to Google for 60 million. Right now, you can legally scrape anything you want off Reddit; it is an open garden in every sense of the word (even if they don't like it). It's a lot more legal than using pirated books, but Google still bet 60 million that copyright laws would swing broadly in their favor.
I think it's very foolhardy to even hint at a pro-copyright stance right now. There is a very real chance of AI getting monopolized, and this is how they will do it.
-
I agree a copyright dystopia wouldn't be any good. Just mind that the wild west, or the law of the jungle, is the "right of the strongest". You're advantaging big companies and disadvantaging smaller players, or people with ethics, or those who are more open and transparent.
And I don't think the legality of web scraping is the biggest issue. Sure, I maybe could do it if it were possible. But I'm occasionally doing some weird stuff and most services have countermeasures in place. In reality I just can't scrape Reddit. Lots of bots and crawlers just don't work any more. I'm getting rate limited left and right by all the big platforms. Lots of things require an account these days, and services are quick to ban me for "suspicious activity". It's barely possible to download YouTube videos these days. So, no. I can't. While Google can just pay for it and have the data.
Also, Reddit isn't really the benevolent underdog here. They're a big company as well. And they're not selling their data... They're selling their users' data. They're mainly monetizing other people's creations.
-
If you try to sell "the new adventures of Doctor Strange, Jonathan Strange and Magic Man", existing copyright laws are sufficient and will stop it. Really, training should be regulated by the same laws as reading. If they can get the material through legitimate means it should be fine, but pulling data that is not freely accessible should be theft, as it is already.
as it is already
Copies of copyrighted works cannot be regarded as "stolen property" for the purposes of a prosecution under the National Stolen Property Act of 1934.
https://en.m.wikipedia.org/wiki/Dowling_v._United_States_(1985)
-
used to train both commercial
commercial training is, in this case, stealing people's work for commercial gain
and open source language models
so, uh, let us train open-source models on open-source text. There's so much of it that there's no need to steal.
?
I'm not sure why you added a question mark at the end of your statement.
I was questioning whether or not you would see that as a benefit. Clearly you don't.
Are you also against libraries letting people borrow books since those are also lost sales for the authors, or are you just a luddite?
-
libraries letting people borrow books
This is so far from analogous that it's almost a non sequitur.
are you just a luddite?
No, and you don't even believe such nonsense. You're grasping, ineffectively.
-
Wait, the authors argued that? Why? That's literally the opposite of the thing they needed to argue.
-
As a civil matter, the publishing houses are more likely to get the full money if Anthropic stays in business (and does well). So it might be bad, but I'm really skeptical about bankruptcy (and I'm not hearing anyone seriously floating it?)
Depending on the type of bankruptcy, the business can still operate; all their profits would just be going towards paying off their debts.