AI industry horrified to face largest copyright class action ever certified
-
They don’t want copyright power to expand further. And I agree with them, despite hating AI vendors with a passion.
For an understanding of the collateral damage, check out How To Think About Scraping by Cory Doctorow.
Let’s give them this one last win. For spite.
-
This post did not contain any content.
Well, theft has never been the best foundation for a business, has it?
While I completely agree that copyright terms are wildly overblown, they are valid law that other people suffer under, so it is 100% fair to make the AI companies suffer the same. Or worse, since they all broke the law for commercial gain.
-
Oh no! Building a product with stolen data was a rotten idea after all. Well, at least the AI companies can use their fabulously genius PhD level LLMs to weasel their way out of all these lawsuits. Right?
PhD-level LLM = paying people with master's degrees $21/hr to write summaries of paragraphs for the model to improve on. Google Gemini outsourced its work like this, so I assume everyone else did too.
-
As Anthropic argued, it now "faces hundreds of billions of dollars in potential damages liability at trial in four months" based on a class certification rushed at "warp speed" that involves "up to seven million potential claimants, whose works span a century of publishing history," each possibly triggering a $150,000 fine.
So you knew what stealing the copyrighted works could result in, and your defense is that you stole too much? That's not how that works.
Actually that usually is how it works. Unfortunately.
"Too big to fail" was probably made up by the big ones.
-
The purpose of copyright is to drive works into the public domain. Works are only supposed to remain exclusive to the artist for a very limited time, not a "century of publishing history".
The copyright industry should lose this battle. Copyright exclusivity should be shorter than patent exclusivity.
Copyright companies losing the case wouldn't make copyright any shorter.
-
Copyright companies losing the case wouldn't make copyright any shorter.
Their winning of the case reinforces a harmful precedent.
At the very least, the claims of those members of the class that are based on >20-year copyrights should be summarily rejected.
-
They don’t want copyright power to expand further. And I agree with them, despite hating AI vendors with a passion.
For an understanding of the collateral damage, check out How To Think About Scraping by Cory Doctorow.
Take scraping. Companies like Clearview will tell you that scraping is legal under copyright law. They’ll tell you that training a model with scraped data is also not a copyright infringement. They’re right.
I love Cory's writing, but while he does a masterful job of defending scraping, and makes a good argument that in most cases, it's laws other than Copyright that should be the battleground, he does, kinda, trip over the main point.
That is, training models on creative works and then selling access to the derivative "creative" works those models output very much falls within the domain of copyright - on either side of a grey line we usually call "fair use" that hasn't really been tested in the courts.
Let's take two absurd extremes to make the point. Say I train an LLM directly on Marvel movies, and then sell movies (or maybe movie scripts) that are almost identical to existing Marvel movies (maybe with a few key names and features altered). I don't think anyone would argue that is not a derivative work, or that it falls under "fair use." However, if I used literature to train my LLM to be able to read, and then used that to read street signs for my self-driving car, well, yeah, that might be something you could argue is "fair use" to sell. It's not producing copy-cat literature.
I agree with Cory that scraping, per se, is absolutely fine, and even re-distributing the results in some ways that are in the public interest or fall under "fair use", but it's hard to justify the slop machines as not a copyright problem.
In the end, yeah, fuck both sides anyway. Copyright was extended too far and used for far too much, and the AI companies are absolute thieves. I have no illusions this type of court case will do anything more than shift wealth from one robber-baron to another, and it won't help artists and authors.
-
Their winning of the case reinforces a harmful precedent.
At the very least, the claims of those members of the class that are based on >20-year copyrights should be summarily rejected.
Copyright owners winning the case maintains the status quo.
The AI companies winning the case means anything leaked on the internet or even just hosted by a company can be used by anyone, including private photos and communication.
-
This post did not contain any content.
Let's go baby! The law is the law, and it applies to everybody
If the "genie doesn't go back in the bottle", make him pay for what he's stealing.
-
Take scraping. Companies like Clearview will tell you that scraping is legal under copyright law. They’ll tell you that training a model with scraped data is also not a copyright infringement. They’re right.
I love Cory's writing, but while he does a masterful job of defending scraping, and makes a good argument that in most cases, it's laws other than Copyright that should be the battleground, he does, kinda, trip over the main point.
That is, training models on creative works and then selling access to the derivative "creative" works those models output very much falls within the domain of copyright - on either side of a grey line we usually call "fair use" that hasn't really been tested in the courts.
Let's take two absurd extremes to make the point. Say I train an LLM directly on Marvel movies, and then sell movies (or maybe movie scripts) that are almost identical to existing Marvel movies (maybe with a few key names and features altered). I don't think anyone would argue that is not a derivative work, or that it falls under "fair use." However, if I used literature to train my LLM to be able to read, and then used that to read street signs for my self-driving car, well, yeah, that might be something you could argue is "fair use" to sell. It's not producing copy-cat literature.
I agree with Cory that scraping, per se, is absolutely fine, and even re-distributing the results in some ways that are in the public interest or fall under "fair use", but it's hard to justify the slop machines as not a copyright problem.
In the end, yeah, fuck both sides anyway. Copyright was extended too far and used for far too much, and the AI companies are absolute thieves. I have no illusions this type of court case will do anything more than shift wealth from one robber-baron to another, and it won't help artists and authors.
I agree, and I think your points line up with Doctorow’s other writing on the subject. It’s just hard to cover everything in one short essay.
-
Copyright owners winning the case maintains the status quo.
The AI companies winning the case means anything leaked on the internet or even just hosted by a company can be used by anyone, including private photos and communication.
The status quo is a giant fucking problem, and has been for decades.
The rest of your comment is alarmist nonsense.
-
I was reading the article and thinking "suck a dick, AI companies" but then it mentions the EFF and ALA filed against the class action. I have found those organizations to be generally reputable and on the right side of history, so now I'm wondering what the problem is.
I disagree with the EFF and ALA on this one.
These were entire bodies of writing consumed and reworked into training data without respecting the licenses attached to them.
Honestly, I wouldn't be surprised if copyright weren't the only problem here, but other intellectual property law as well. In that case, the EFF probably has an interest in that instead. Regardless, I really think it needs to be brought through the courts.
LLMs are harmful, full stop. Most other machine learning systems train on licensed data. In the case of software as a medical device, such as image-analysis AI, that data is protected by HIPAA and special attention is already paid to how it is used.
-
This post did not contain any content.
Good. Burn it down. Bankrupt them.
If it's so "critical to national security" then nationalize it.
-
With the amount of money pouring in you'd think they'd just pay for it
Now now. You know that's not how capitalism works.
-
Let's go baby! The law is the law, and it applies to everybody
If the "genie doesn't go back in the bottle", make him pay for what he's stealing.
I just remembered the movie where a genie was released from his bottle, threw the world into chaos by freeing his own kind, and if it weren't for the power of the plot, I'm afraid the people there would have become slaves or died out.
Although here it would be necessary to file a lawsuit for theft of the soul, in the literal sense of the word.
-
Let's go baby! The law is the law, and it applies to everybody
If the "genie doesn't go back in the bottle", make him pay for what he's stealing.
The law absolutely does not apply to everybody, and you are well aware of that.
-
This post did not contain any content.
Welp, I guess if you have any AI stock, now is the time to dump it
-
This post did not contain any content.
Unfortunately, this will probably lead to nothing: in our world, only the poor seem to get punished for stealing. Corporations always get away with everything, while we sit on the couch and shout "YES!!!" because they're trying to console us with this.
-
The purpose of copyright is to drive works into the public domain. Works are only supposed to remain exclusive to the artist for a very limited time, not a "century of publishing history".
The copyright industry should lose this battle. Copyright exclusivity should be shorter than patent exclusivity.
Shut up, thief. Go to jail.
-