AI industry horrified to face largest copyright class action ever certified
-
I propose that anyone defending themselves in court over AI stealing data must be represented exclusively by AI.
Hilarious.
-
With the amount of money pouring in you'd think they'd just pay for it
-
I was reading the article and thinking "suck a dick, AI companies" but then it mentions the EFF and ALA filed against the class action. I have found those organizations to be generally reputable and on the right side of history, so now I'm wondering what the problem is.
-
Let them fight!
-
I was reading the article and thinking "suck a dick, AI companies" but then it mentions the EFF and ALA filed against the class action. I have found those organizations to be generally reputable and on the right side of history, so now I'm wondering what the problem is.
AI coding tools are using the exact same backends as AI fiction writing tools, so it would hurt the fledgling vibe coder profession (which according to proper software developers should not be allowed to exist at all).
-
I was reading the article and thinking "suck a dick, AI companies" but then it mentions the EFF and ALA filed against the class action. I have found those organizations to be generally reputable and on the right side of history, so now I'm wondering what the problem is.
They don’t want copyright power to expand further. And I agree with them, despite hating AI vendors with a passion.
For an understanding of the collateral damage, check out How To Think About Scraping by Cory Doctorow.
-
As Anthropic argued, it now "faces hundreds of billions of dollars in potential damages liability at trial in four months" based on a class certification rushed at "warp speed" that involves "up to seven million potential claimants, whose works span a century of publishing history," each possibly triggering a $150,000 fine.
So you knew what stealing the copyrighted works could result in, and your defense is that you stole too much? That's not how that works.
The purpose of copyright is to drive works into the public domain. Works are only supposed to remain exclusive to the artist for a very limited time, not a "century of publishing history".
The copyright industry should lose this battle. Copyright exclusivity should be shorter than patent exclusivity.
-
They don’t want copyright power to expand further. And I agree with them, despite hating AI vendors with a passion.
For an understanding of the collateral damage, check out How To Think About Scraping by Cory Doctorow.
Let’s give them this one last win. For spite.
-
Well, theft has never been the best foundation for a business, has it?
While I completely agree that copyright terms are completely overblown, they are valid law that other people suffer under, so it is 100% fair to make them suffer the same. Or worse, as they all broke the law for commercial gain.
-
Oh no! Building a product with stolen data was a rotten idea after all. Well, at least the AI companies can use their fabulously genius PhD level LLMs to weasel their way out of all these lawsuits. Right?
PhD level LLM = paying MAs $21/hr to write summaries of paragraphs for them to improve off of. Google Gemini outsourced their work like this, so I assume everyone else did too.
-
As Anthropic argued, it now "faces hundreds of billions of dollars in potential damages liability at trial in four months" based on a class certification rushed at "warp speed" that involves "up to seven million potential claimants, whose works span a century of publishing history," each possibly triggering a $150,000 fine.
So you knew what stealing the copyrighted works could result in, and your defense is that you stole too much? That's not how that works.
Actually that usually is how it works. Unfortunately.
"Too big to fail" was probably made up by the big ones.
-
The purpose of copyright is to drive works into the public domain. Works are only supposed to remain exclusive to the artist for a very limited time, not a "century of publishing history".
The copyright industry should lose this battle. Copyright exclusivity should be shorter than patent exclusivity.
Copyright companies losing the case wouldn't make copyright any shorter.
-
Copyright companies losing the case wouldn't make copyright any shorter.
Their winning of the case reinforces a harmful precedent.
At the very least, the claims of those members of the class that are based on >20-year copyrights should be summarily rejected.
-
They don’t want copyright power to expand further. And I agree with them, despite hating AI vendors with a passion.
For an understanding of the collateral damage, check out How To Think About Scraping by Cory Doctorow.
Take scraping. Companies like Clearview will tell you that scraping is legal under copyright law. They’ll tell you that training a model with scraped data is also not a copyright infringement. They’re right.
I love Cory's writing, but while he does a masterful job of defending scraping, and makes a good argument that in most cases, it's laws other than Copyright that should be the battleground, he does, kinda, trip over the main point.
That is that training models on creative works and then selling access to the derivative "creative" works that those models output very much falls within the domain of copyright - on either side of a grey line we usually call "fair use" that hasn't been really tested in courts.
Let's take two absurd extremes to make the point. Say I train an LLM directly on Marvel movies, and then sell movies (or maybe movie scripts) that are almost identical to existing Marvel movies (maybe with a few key names and features altered). I don't think anyone would argue that is not a derivative work, or that it falls under "fair use." However, if I used literature to train my LLM to be able to read, and used that to read street signs for my self-driving car, well, yeah, that might be something you could argue is "fair use" to sell. It's not producing copy-cat literature.
I agree with Cory that scraping, per se, is absolutely fine, and even re-distributing the results in some ways that are in the public interest or fall under "fair use", but it's hard to justify the slop machines as not a copyright problem.
In the end, yeah, fuck both sides anyway. Copyright was extended too far and used for far too much, and the AI companies are absolute thieves. I have no illusions this type of court case will do anything more than shift wealth from one robber baron to another, and it won't help artists and authors.
-
Their winning of the case reinforces a harmful precedent.
At the very least, the claims of those members of the class that are based on >20-year copyrights should be summarily rejected.
Copyright owners winning the case maintains the status quo.
The AI companies winning the case means anything leaked on the internet or even just hosted by a company can be used by anyone, including private photos and communication.
-
Let's go, baby! The law is the law, and it applies to everybody.
If the "genie doesn't go back in the bottle", make him pay for what he's stealing.
-
Take scraping. Companies like Clearview will tell you that scraping is legal under copyright law. They’ll tell you that training a model with scraped data is also not a copyright infringement. They’re right.
I love Cory's writing, but while he does a masterful job of defending scraping, and makes a good argument that in most cases, it's laws other than Copyright that should be the battleground, he does, kinda, trip over the main point.
That is that training models on creative works and then selling access to the derivative "creative" works that those models output very much falls within the domain of copyright - on either side of a grey line we usually call "fair use" that hasn't been really tested in courts.
Let's take two absurd extremes to make the point. Say I train an LLM directly on Marvel movies, and then sell movies (or maybe movie scripts) that are almost identical to existing Marvel movies (maybe with a few key names and features altered). I don't think anyone would argue that is not a derivative work, or that it falls under "fair use." However, if I used literature to train my LLM to be able to read, and used that to read street signs for my self-driving car, well, yeah, that might be something you could argue is "fair use" to sell. It's not producing copy-cat literature.
I agree with Cory that scraping, per se, is absolutely fine, and even re-distributing the results in some ways that are in the public interest or fall under "fair use", but it's hard to justify the slop machines as not a copyright problem.
In the end, yeah, fuck both sides anyway. Copyright was extended too far and used for far too much, and the AI companies are absolute thieves. I have no illusions this type of court case will do anything more than shift wealth from one robber baron to another, and it won't help artists and authors.
I agree, and I think your points line up with Doctorow’s other writing on the subject. It’s just hard to cover everything in one short essay.
-
Copyright owners winning the case maintains the status quo.
The AI companies winning the case means anything leaked on the internet or even just hosted by a company can be used by anyone, including private photos and communication.
The status quo is a giant fucking problem, and has been for decades.
The rest of your comment is alarmist nonsense.
-
I was reading the article and thinking "suck a dick, AI companies" but then it mentions the EFF and ALA filed against the class action. I have found those organizations to be generally reputable and on the right side of history, so now I'm wondering what the problem is.
I disagree with the EFF and ALA on this one.
These were entire sets of writing consumed and reworked into poor data without respecting the license to them.
Honestly, I wouldn't be surprised if copyright weren't the only problem here, and other intellectual property issues were involved as well. In that case, the EFF probably has an interest in that instead. Regardless, I really think it needs to be brought through court.
LLMs are harmful, full stop. Most other machine learning mechanisms use licensed data to train. In the case of software as a medical device, such as image analysis AI, that data is protected by HIPAA, and special attention is already paid in order to utilize it.
-
Good. Burn it down. Bankrupt them.
If it's so "critical to national security" then nationalize it.
-