Codeberg: An army of AI crawlers is slowing us down severely; AI crawlers have learned how to solve the Anubis challenges.
-
Socialized healthcare isn't socialism...
Interesting.
How about Canada?
The US does not have socialized healthcare. It uses a welfare state. What about Canada?
-
The US does not have socialized healthcare. It uses a welfare state. What about Canada?
Does Canada have socialized healthcare?
Does anywhere?
-
OCR could be done effectively without AI
OCR has been neural nets even before convolutional networks emerged in the 2010s
Yeah you're right, I was using AI in the colloquial modern sense. My mistake. It actually drives me nuts when people do that. I should have said "without compute-heavy AI".
-
Does Canada have socialized healthcare?
Does anywhere?
Yeah they do! A few places do. I’m just enjoying seeing you prove your own sarcasm as actual truth because these things formed in spite of capitalism
-
Yeah they do! A few places do. I’m just enjoying seeing you prove your own sarcasm as actual truth because these things formed in spite of capitalism
Is Canada not capitalist?
-
Is Canada not capitalist?
Are you willingly ignoring my point or just not getting it? Oh well
-
Are you willingly ignoring my point or just not getting it? Oh well
Sounds like you just realized you proved mine.
-
Just provide a full dump.zip plus incremental daily dumps and they won't have to scrape ?
Isn't that an obvious solution ? I mean, it's public data, it's out there, do you want it public or not ?
Do you want it only on openai and google but nowhere else ? If so then good luck with the piranhas
The Wikimedia Foundation does just that, and still, their infrastructure is under stress because of AI scrapers.
Dumps or no dumps, these AI companies don't care. They feel like they're entitled to taking or stealing what they want.
-
Sounds like you just realized you proved mine.
If you’re simping for capitalism then I guess I’m not surprised you’ll take whatever narrative and run with it.
-
Could you clarify what you mean by “dealing with people”? I’m not really sure what point you’re trying to make with that
The complaint that got blamed on capitalism was:
The information age gave way for the misinformation age, where everything is fake.
and if there's one entity/person most responsible for that, it's Putin or the GOP. Most of it is political, and very little to do with capitalism itself. Except that capitalism surrounds and is intertwined with everything.
Still, if you get rid of capitalism, it doesn't get rid of politics. I'd argue that the root of the issue is the GOP trying to hoard power (money and otherwise), and power is going to exist with or without capitalism. Is North Korea capitalist? Do they have issues with disinfo?
This Christian Sharia Law movement doesn't exist for money.
-
If you’re simping for capitalism then I guess I’m not surprised you’ll take whatever narrative and run with it.
Lol can't engage with someone's point? Just call them a simp.
Why did you even engage if you can't walk the walk, lil bro?
-
getting into a car with a stranger who said he was 15 minutes away two hours ago
You were there too?
-
Lol can't engage with someone's point? Just call them a simp.
Why did you even engage if you can't walk the walk, lil bro?
You: “capitalism is the reason we have everything! And when people get sick of capitalism’s greed and form social policies to help the labor class because they continually get exploited in spite of capitalism, somehow that’s capitalism’s fault!”
But then the irony is lost on you
-
I would give this reddit gold
Instant easy complaints help-i'm-oppressed-by-Capitalism today sound an awful lot like the instant easy complaints help-i'm-oppressed-by-Communism I used to hear from rednecks
Ask someone who starved & died under either system how obviously superior it is, you will find millions on either side
Also consider that Socialism is totally legal under Capitalism. Want to start a co-op? Go for it. Want to legislate and implement socialized healthcare? Many Capitalist countries have.
Under Communism, Capitalism must be illegal and stamped out by force. Want to start a business making shoes and hire someone to work for an agreed upon wage? Illegal.
When the goal involves guaranteeing positive rights, I'm not sure how it can be achieved without coercion. Which is how any socialist policies get implemented under capitalism anyways.
-
You: “capitalism is the reason we have everything! And when people get sick of capitalism’s greed and form social policies to help the labor class because they continually get exploited in spite of capitalism, somehow that’s capitalism’s fault!”
But then the irony is lost on you
Whoever you're shadowboxing there is taking a hell of a beating!
-
Whoever you're shadowboxing there is taking a hell of a beating!
Mkay, Lucille
-
Mkay, Lucille
Thanks for exemplifying the "CaPiTaLiSm BaD" person from my original comment so well! The total inability to engage with my comment to the point of inventing a whole new one to argue with was a really nice touch!
-
Just provide a full dump.zip plus incremental daily dumps and they won't have to scrape ?
Isn't that an obvious solution ? I mean, it's public data, it's out there, do you want it public or not ?
Do you want it only on openai and google but nowhere else ? If so then good luck with the piranhas
I think the issue is that the scrapers are fully automatically collecting text, jumping from link to link like a search engine indexer.
-
The Wikimedia Foundation does just that, and still, their infrastructure is under stress because of AI scrapers.
Dumps or no dumps, these AI companies don't care. They feel like they're entitled to taking or stealing what they want.
That's crazy, it makes no sense. It takes as much bandwidth and processing power on the scraper side to process and use the data as it takes to serve it.
They also have an open API that makes scraping entirely unnecessary.
Here are the relevant quotes from the article you posted:
"Scraping has become so prominent that our outgoing bandwidth has increased by 50% in 2024."
"At least 65% of our most expensive requests (the ones that we can’t serve from our caching servers and which are served from the main databases instead) are performed by bots."
"Over the past year, we saw a significant increase in the amount of scraper traffic, and also of related site-stability incidents: Site Reliability Engineers have had to enforce on a case-by-case basis rate limiting or banning of crawlers repeatedly to protect our infrastructure."
And it's Wikipedia! The entire data set is trained INTO the models already, and it's not like encyclopedic facts change that often to begin with!
The only thing I can imagine is that it's part of a larger ecosystem issue: dumps and API access are so rare, and so untrustworthy, that the scrapers just scrape everything rather than taking the time to save bandwidth by relying on dumps.
Maybe it's a consequence of the 2023 API wars, where it was made clear that data repositories would leverage their position as pools of knowledge to extract rent from search and AI, and places like Wikipedia and other wikis and forums are getting hammered as a result of this war.
If the internet weren't becoming a warzone, there really wouldn't be a need for more than one scraper to scrape a site. Even if the site was hostile, like Facebook, it would only need to be scraped once, and then the data could be shared efficiently over a torrent swarm.
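For what it's worth, the per-crawler rate limiting mentioned in that last quote is commonly done with something like a token bucket. Here is a minimal sketch in Python. To be clear, this is not Wikimedia's actual setup: the names `TokenBucket` and `allow_request`, the keying by client ID, and the rate and burst numbers are all assumptions made up for illustration.

```python
import time


class TokenBucket:
    """Hypothetical per-client limiter: refills `rate` tokens per second,
    storing at most `burst` tokens. Each request costs one token."""

    def __init__(self, rate, burst, now=None):
        self.rate = float(rate)
        self.burst = float(burst)
        self.tokens = float(burst)  # start with a full bucket
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None):
        """Return True if a request may proceed at time `now`."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)
        # Refill for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# One bucket per client (assumed here to be keyed by IP or user agent).
buckets = {}


def allow_request(client_id, now=None, rate=1.0, burst=5):
    """Look up (or create) the client's bucket and charge it one request."""
    bucket = buckets.get(client_id)
    if bucket is None:
        bucket = buckets[client_id] = TokenBucket(rate, burst, now)
    return bucket.allow(now)
```

A server would call `allow_request(client_id)` before doing expensive work and answer with HTTP 429 when it returns False. It only helps against crawlers that can be identified as one client, which is part of why the quote talks about handling them case by case.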
-
Is there something like Nightshade but for text and code? Maybe my source headers should include a bunch of special characters followed by a prompt injection. And sprinkle some nonsensical code comments before the real code comment.
There are glitch tokens, but I think those only affect the model when it's being used.
-