Codeberg: an army of AI crawlers is slowing us down severely; AI crawlers have learned how to solve the Anubis challenges.
-
You need to properly detect that they're bots first and then they'll just figure out how to spoof that. Then you're back to square one.
Abstractly, PoW doesn't need to determine if you're a bot or not. To make a request, as a human or bot, you need to pay in CPU time. The hope is that the cost is low enough that a human barely notices it, but for a bot trying to hoover up data as fast as possible, the aggregate cost is high.
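For anyone who hasn't seen one, here is a minimal sketch of the kind of hash-based proof of work being described. It is not Anubis's actual code; the function names, the challenge string and the "leading zero hex digits" rule are assumptions picked for illustration. The point is the asymmetry: the client grinds through many hashes, the server checks the answer with one.

```python
import hashlib
import itertools


def solve(challenge: str, difficulty: int) -> int:
    """Brute-force the smallest nonce whose SHA-256(challenge + nonce)
    starts with `difficulty` zero hex digits.
    Expected work is roughly 16**difficulty hashes."""
    prefix = "0" * difficulty
    for nonce in itertools.count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Checking a submitted nonce costs the server a single hash."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


if __name__ == "__main__":
    # "example-challenge" is a made-up value; a real server would issue a random one per visitor.
    nonce = solve("example-challenge", 3)
    print(nonce, verify("example-challenge", nonce, 3))
```

Raising `difficulty` by one multiplies the expected solving work by 16 and leaves the verification cost unchanged, which is the only knob a scheme like this really has.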
I think the more horrifying aspect is that they'll just build ever bigger datacenters to crunch POW tests faster and the carbon cost will skyrocket even more.
Oh I haven't even considered the carbon aspect. Anubis is an even worse idea than I previously thought...
-
cross-posted from: https://programming.dev/post/35852706
Crazy. DDoS attacks are illegal here in the UK.
-
There once was a dream of the semantic web, also known as web2. The semantic web could have made the information on webpages easy to ingest, removing so much of the computation required to get at it, and thus preventing much of the AI crawling CPU overhead.
What we got as web2 instead was social media. Destroying facts and making people depressed at a never-before-seen rate.
Web3 was about enabling us to securely transfer value between people digitally and without middlemen.
What crypto gave us was fraud, expensive jpgs and scams. The term web is now so eroded that it has lost much of its meaning. The information age gave way to the misinformation age, where everything is fake.
Mr. Internet, tear down these walls! (for all these walled gardens)
Return the internet to the wild. Let it run feral like dinosaurs on an island.
Let the grannies and idiots stick themselves in the reservations and asylums run by billionaires.
Let's all make Neocities pages about our hobbies and dirtiest, innermost thoughts. With gifs all over.
-
Eventually we'll have "defensive" and "offensive" LLMs managing all kinds of electronic warfare automatically, effectively nullifying each other.
Obligatory AI ≠ LLM. How would scrapers benefit from the LLMs they help train? The defense is obvious, LLM-generated slop traps against scrapers already exist.
-
how this felt like while reading
-
Anubis isn't supposed to be hard to avoid, but expensive to avoid. Not really surprised that a big company might be willing to throw a bunch of cash at it.
No, it's expensive to comply (at a massive scale), but easy to avoid. Just change the user agent. There's even a dedicated extension for bypassing Anubis.
Even then AI servers have plenty of compute, it realistically doesn’t cost much. Maybe like a thousandth of a cent per solve? They're spending billions on GPU power, they don't care.
I've been saying this since day 1 of Anubis but nobody wants to hear it.
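A back-of-the-envelope version of that cost claim, as a sketch. Every number here is an assumption (the challenge difficulty, the hash rate of one CPU core, and the price of a core-hour); swap in your own and the conclusion barely moves.

```python
def cost_per_solve_usd(expected_hashes: float,
                       hashes_per_second: float,
                       usd_per_core_hour: float) -> float:
    """Estimated compute cost of solving one challenge on one CPU core."""
    seconds = expected_hashes / hashes_per_second
    return seconds / 3600 * usd_per_core_hour


if __name__ == "__main__":
    # All three inputs are placeholder guesses, not measurements.
    expected_hashes = 16 ** 4        # assumed difficulty: 4 leading zero hex digits
    hashes_per_second = 1_000_000    # assumed SHA-256 rate for one core
    usd_per_core_hour = 0.05         # assumed cloud price for one core

    per_solve = cost_per_solve_usd(expected_hashes, hashes_per_second, usd_per_core_hour)
    print(f"per solve: ${per_solve:.2e}")
    print(f"per million solves: ${per_solve * 1_000_000:.2f}")
```

With those guesses a single solve lands a little under a millionth of a dollar, so a million challenged page loads would cost under a dollar of compute. Crank the difficulty up enough to hurt a datacenter and real visitors on old phones feel it long before the scraper does.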
-
Yeah but AI companies are losing money, so in the long run Anubis seems like it should eventually return to working.
The cost of solving PoW for Anubis is absolutely not a factor in any AI company's budget. Just the cost of answering one question is millions of times higher than running sha256sum for Anubis.
Just in case you're being glib and mean the businesses will go under regardless of Anubis: most of these are coming from China. China absolutely will keep running these companies at a loss for the sake of strategic development.
-
What's the alternative?
Not much for open source solutions. A simple captcha however would cost scrapers more to crack than Anubis.
But when it comes to "real" bot management solutions: The least invasive solutions will try to match User-Agent and other headers against the TLS fingerprint and block if they don't match. More invasive solutions will fingerprint your browser and even your GPU, then either block you or issue you a tracking cookie which is often pinned to your IP and user-agent. Both of those solutions require a large base of data to know what real and fake traffic actually looks like. Only large hosting providers like Cloudflare and Akamai have that data and can provide those sorts of solutions.
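To make the "least invasive" variant concrete, here is a rough sketch of the consistency check. The fingerprint labels and the lookup table are entirely made up; a real deployment would compute something like a JA3/JA4 hash at the TLS layer and compare it against a large, constantly updated dataset, which is exactly the part small sites don't have.

```python
from typing import Optional

# Hypothetical table: TLS fingerprint labels considered plausible for each browser family.
KNOWN_FINGERPRINTS = {
    "firefox": {"fp-firefox-a", "fp-firefox-b"},
    "chrome": {"fp-chrome-a"},
}


def browser_family(user_agent: str) -> Optional[str]:
    """Very crude User-Agent classification, for illustration only."""
    ua = user_agent.lower()
    if "firefox/" in ua:
        return "firefox"
    if "chrome/" in ua:
        return "chrome"
    return None


def should_block(user_agent: str, tls_fingerprint: str) -> bool:
    """Block when the client claims to be a browser but its TLS handshake
    does not look like that browser's."""
    family = browser_family(user_agent)
    if family is None:
        # Client makes no browser claim; this sketch lets it through.
        # A real policy might challenge or rate-limit it instead.
        return False
    return tls_fingerprint not in KNOWN_FINGERPRINTS[family]


if __name__ == "__main__":
    ua = "Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/128.0"
    print(should_block(ua, "fp-firefox-a"))    # consistent claim
    print(should_block(ua, "fp-http-library")) # Firefox UA, non-Firefox handshake
```

Spoofing the User-Agent header is a one-line change for a scraper; making the TLS handshake match the claimed browser takes noticeably more effort, which is why the check is worth anything at all.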
-
Except previously bombarding another person's server for personal gain was illegal.
Not if it's AI.
/s aside, maybe you could call 'em out on involuntary DoSing, but then Slashdot and similar sites would get into trouble.
-
the old fashioned way.
A whole swath of trained toads using a specially made tube network?
getting into a car with a stranger who said he was 15 minutes away two hours ago
-
Can there be a challenge that actually does some maliciously useful compute? Like make their crawlers mine bitcoin or something.
-
Can there be a challenge that actually does some maliciously useful compute? Like make their crawlers mine bitcoin or something.
Did you just use the words "useful" and "bitcoin" in the same sentence? o_O
-
Web3 was about enabling us to securely transfer value between people digitally and without middlemen.
It's ironic that the middlemen showed up anyway and busted all the security of those transfers
You want some bipcoin to buy weed drugs on the slip road? Don't bother figuring out how to set up that wallet shit, come to our nifty token exchange where you can buy and sell all kinds of bipcoins
oh btw every government on the planet showed up and dug through our insecure records. hope you weren't actually buying shroom drugs on the slip rod
also we got hacked, you lost all your bipcoins sorry
At least, that's my recollection of events. I was getting my illegal narcotics the old fashioned way.
also we got hacked, you lost all your bipcoins sorry
aaaaaaaaand - it's gone!
-
Did you just use the words "useful" and "bitcoin" in the same sentence? o_O
Bro couldn't even bring himself to mention protein folding because that's too socialist I guess.
-
There once was a dream of the semantic web, also known as web2. The semantic web could have made the information on webpages easy to ingest, removing so much of the computation required to get at it, and thus preventing much of the AI crawling CPU overhead.
What we got as web2 instead was social media. Destroying facts and making people depressed at a never-before-seen rate.
Web3 was about enabling us to securely transfer value between people digitally and without middlemen.
What crypto gave us was fraud, expensive jpgs and scams. The term web is now so eroded that it has lost much of its meaning. The information age gave way to the misinformation age, where everything is fake.
Web3 was about enabling us to securely transfer value between people digitally and without middlemen
I don't think it ever was that. I think Folding Ideas has the best explanation of what it was meant to be: a way to grab power away from those who already have it.
-
Can there be a challenge that actually does some maliciously useful compute? Like make their crawlers mine bitcoin or something.
Not without making real users mine bitcoin too, or avoid the site because their performance tanked.
-
I mean, we really have to ask ourselves - as a civilization - whether human collaboration is more important than AI data harvesting.
I was fine before the AI.
The biggest customers of AI are the billionaires who can't hire enough people for their technofeudalist/surveillance capitalism agenda. The billionaires (wannabe aristocrats) know that machines have no morals, no bottom lines, no scruples, don't leak info to the press, don't complain, don't demand to take time off or to work from home, etc.
AI makes the perfect fascist.
They sell AI like it's a benefit to us all, but it ain't that. It's a benefit to the billionaires who think they own our world.
AI is used for censorship, surveillance pricing, activism/protest analysis, making firing decisions, making kill decisions in battle, etc. It's nightmare fuel under our system of absurd wealth concentration.
Fuck AI.
-
No, it's expensive to comply (at a massive scale), but easy to avoid. Just change the user agent. There's even a dedicated extension for bypassing Anubis.
Even then AI servers have plenty of compute, it realistically doesn’t cost much. Maybe like a thousandth of a cent per solve? They're spending billions on GPU power, they don't care.
I've been saying this since day 1 of Anubis but nobody wants to hear it.
The website still has to display to users at the end of the day. It's a similar problem to trying to stop media piracy. If worst comes to worst, the crawlers could read the page like a person would.
-
There once was a dream of the semantic web, also known as web2. The semantic web could have made the information on webpages easy to ingest, removing so much of the computation required to get at it, and thus preventing much of the AI crawling CPU overhead.
What we got as web2 instead was social media. Destroying facts and making people depressed at a never-before-seen rate.
Web3 was about enabling us to securely transfer value between people digitally and without middlemen.
What crypto gave us was fraud, expensive jpgs and scams. The term web is now so eroded that it has lost much of its meaning. The information age gave way to the misinformation age, where everything is fake.
Sounds like it went the same way everything else went. The less money is involved, the more trustworthy it is.
-
The cost of solving PoW for Anubis is absolutely not a factor in any AI company's budget. Just the cost of answering one question is millions of times higher than running sha256sum for Anubis.
Just in case you're being glib and mean the businesses will go under regardless of Anubis: most of these are coming from China. China absolutely will keep running these companies at a loss for the sake of strategic development.
Thanks for the info
would not have thought Anubis would be so irrelevant
-