Millions of websites to get 'game-changing' AI bot blocker
-
This post did not contain any content.
-
Until the AI companies find a way around it. Love the idea so hopefully it causes at least 3 days of struggle for the AI crawlers.
Having said that... Can someone else put this in place so we do not have Cloudflare hosting everything where we would just be one intern away from a global outage. Please? Pretty please?
-
Until the AI companies find a way around it. Love the idea so hopefully it causes at least 3 days of struggle for the AI crawlers.
Having said that... Can someone else put this in place so we do not have Cloudflare hosting everything where we would just be one intern away from a global outage. Please? Pretty please?
The problem is that the biggest service Cloudflare provides is DDoS protection, and doing that requires that you have more bandwidth available than your attacker. Having enough bandwidth to withstand modern botnet-powered DDoS attacks is ridiculously expensive (and it's also a finite resource; there's only so much backbone infrastructure). Basically it's economically infeasible to have multiple companies providing the service Cloudflare does. You might be able to get away with two companies doing so, but it's unlikely you could manage more than that without some of them starting to go bankrupt.
-
I really wish the answer was a legally enforced robots.txt file that let any organization or individual user posting web data easily spell out what the permissions are. I often use an LLM as a search engine, and most of the time the citations are pretty decent, so I use those to link out to source content.
I run a small blog and I'd love to get indexed in an LLM, not blocked, as long as I was assured a reference link for any content used and had some legal recourse if I found my data was being misused.
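For anyone who hasn't looked at it, here is a minimal sketch of what robots.txt can express today, checked with Python's standard `urllib.robotparser`. The bot name `ExampleAIBot` and the `blog.example` URL are made up for illustration. Keep in mind the file is purely advisory: a crawler is free to ignore it, which is exactly why legal enforcement would matter.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a small blog: one (made-up) AI crawler is
# refused everywhere, every other user agent is allowed.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://blog.example/post/1"  # placeholder URL, not a real site
ai_allowed = parser.can_fetch("ExampleAIBot", url)
other_allowed = parser.can_fetch("SomeBrowser", url)

print(ai_allowed)     # False: the named bot is disallowed
print(other_allowed)  # True: everyone else falls under the "*" rule
```

Note there is no way in this format to say "allowed, but only with a reference link", which is the kind of permission I'm asking for.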
I don't love the answer being another mega corporation posing as a white knight looking to skim some money off of the "loophole" that is AI copyright infringement.
-
The problem is that the biggest service Cloudflare provides is DDoS protection, and doing that requires that you have more bandwidth available than your attacker. Having enough bandwidth to withstand modern botnet-powered DDoS attacks is ridiculously expensive (and it's also a finite resource; there's only so much backbone infrastructure). Basically it's economically infeasible to have multiple companies providing the service Cloudflare does. You might be able to get away with two companies doing so, but it's unlikely you could manage more than that without some of them starting to go bankrupt.
I wonder if it would be a good investment for a country to build its own and then, down the line, expand to sell the same service to others.
-
The problem is that the biggest service Cloudflare provides is DDoS protection, and doing that requires that you have more bandwidth available than your attacker. Having enough bandwidth to withstand modern botnet-powered DDoS attacks is ridiculously expensive (and it's also a finite resource; there's only so much backbone infrastructure). Basically it's economically infeasible to have multiple companies providing the service Cloudflare does. You might be able to get away with two companies doing so, but it's unlikely you could manage more than that without some of them starting to go bankrupt.
when a critical service is not economical for more than one business to do (natural monopoly), that's when govt should be stepping in.
-
I wonder if it would be a good investment for a country to build its own and then, down the line, expand to sell the same service to others.
It's OurFlare, comrade.
-
This is not about stopping bot-scrapers, it's about charging them.
-
Until the AI companies find a way around it. Love the idea so hopefully it causes at least 3 days of struggle for the AI crawlers.
Having said that... Can someone else put this in place so we do not have Cloudflare hosting everything where we would just be one intern away from a global outage. Please? Pretty please?
Proof of work seems to be working pretty well for many websites.
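For anyone unfamiliar with the idea, here is a minimal hashcash-style sketch in Python. It is not the code of any particular tool, and the challenge string and difficulty are assumptions for illustration. The server hands out a challenge, the visitor has to find a nonce whose SHA-256 hash starts with a given number of zero hex digits, and the server checks the answer with a single hash.

```python
import hashlib
import itertools


def digest_for(challenge: str, nonce: int) -> str:
    """Hex SHA-256 of the challenge concatenated with the nonce."""
    return hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force the smallest nonce whose digest starts
    with `difficulty` zero hex digits (about 16**difficulty tries)."""
    prefix = "0" * difficulty
    for nonce in itertools.count():
        if digest_for(challenge, nonce).startswith(prefix):
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one hash is enough to check the submitted nonce."""
    return digest_for(challenge, nonce).startswith("0" * difficulty)


challenge = "example-session-token"  # hypothetical per-visitor challenge
difficulty = 4                       # assumed setting; higher costs more
nonce = solve(challenge, difficulty)
print(verify(challenge, nonce, difficulty))  # True
```

The asymmetry is the point: a single visitor pays a fraction of a second once, while a crawler hitting millions of pages pays that cost on every fresh challenge.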
-
when a critical service is not economical for more than one business to do (natural monopoly), that's when govt should be stepping in.
Which govt? I'm not comfortable with the idea of the current US govt having control over this sort of service.
-
I can't wait to be denied access to websites because of it. Even more than I already am, that is.
-
To that end the company is developing a "Pay Per Crawl" system, which would give content creators the option to request payment from AI companies for utilising their original content.
So Cloudflare is not so much "saving the Internet" as just becoming a middleman between LLM training companies and content creators, which I believe has the potential to be a true goldmine in the future.
-
To that end the company is developing a "Pay Per Crawl" system, which would give content creators the option to request payment from AI companies for utilising their original content.
So Cloudflare is not so much "saving the Internet" as just becoming a middleman between LLM training companies and content creators, which I believe has the potential to be a true goldmine in the future.
Corps are gonna corp.
-
To that end the company is developing a "Pay Per Crawl" system, which would give content creators the option to request payment from AI companies for utilising their original content.
So Cloudflare is not so much "saving the Internet" as just becoming a middleman between LLM training companies and content creators, which I believe has the potential to be a true goldmine in the future.
Can you DRM a crawl?
-
This is not about stopping bot-scrapers, it's about charging them.
Hopefully people will price their content out of reach of the bot-scrapers, effectively stopping them.
-
Which govt? I'm not comfortable with the idea of the current US govt having control over this sort of service.
are you comfortable with a single corporation having control over this sort of service? the current government is obviously not ideal but that shouldn’t stop us from regulating monopolies.
-
Until the AI companies find a way around it. Love the idea so hopefully it causes at least 3 days of struggle for the AI crawlers.
Having said that... Can someone else put this in place so we do not have Cloudflare hosting everything where we would just be one intern away from a global outage. Please? Pretty please?
GitHub - TecharoHQ/anubis: Weighs the soul of incoming HTTP requests to stop AI crawlers
-
Until the AI companies find a way around it. Love the idea so hopefully it causes at least 3 days of struggle for the AI crawlers.
Having said that... Can someone else put this in place so we do not have Cloudflare hosting everything where we would just be one intern away from a global outage. Please? Pretty please?
Yeah, this will have absolutely no impact on gathering training data.
I assumed it was to block AI agents crawling it during requests, which they’d be unlikely to bypass in the web UI.
But no company spending millions on training will hesitate to have an agent appear as a regular desktop user to scrape data.
-
Can you DRM a crawl?
You can if you're Cloudflare.
-
Yeah, this will have absolutely no impact on gathering training data.
I assumed it was to block AI agents crawling it during requests, which they’d be unlikely to bypass in the web UI.
But no company spending millions on training will hesitate to have an agent appear as a regular desktop user to scrape data.
Does Cloudflare still look at the user agent? I thought they had more reliable data points.