linux-nerds.org


The AI company Perplexity is complaining their bots can't bypass Cloudflare's firewall

138 posts, 79 commenters, 1 view

C cm0002@piefed.world

The original comment reply to you was all about how the legal system would act, that's the primary concern. All it would take is a Trump loyalist judge, a Trump leaning appeals court and the right-wing Supreme Court and boom suddenly the CFAA covers a whole lot more than what was "logical"
encryptkeeper@lemmy.world wrote:

#122

The original comment reply to me was all about how the legal system would act in the context of the CFAA specifically. And in that context that logic does not follow. There's not much latitude for any judge to interpret the CFAA that way.

They could always push through some new law however.

1
E encryptkeeper@lemmy.world

If I put a banner on my site that says "by visiting my site you agree not to modify the scripts or ads displayed on the site," does that make my visit with an ad blocker "unauthorized" under the CFAA?

How would you “authorize” a user to access assets served by your systems based on what they do with them after they've accessed them? That doesn’t logically follow so no, that would not make an ad blocker unauthorized under the CFAA. Especially because you’re not actually taking any steps to deny these people access either.

AI scrapers on the other hand are a type of user that you’re not authorizing to begin with, and if you’re using Cloudflare’s bot protection you’re putting into place a system to deny them access. To purposefully circumvent that access would be considered unauthorized.
gamingchairmodel@lemmy.world wrote:

#123

That doesn’t logically follow so no, that would not make an ad blocker unauthorized under the CFAA.

The CFAA also criminalizes "exceeding authorized access" in every place it criminalizes accessing without authorization. My position is that mere permission (in a colloquial sense, not necessarily technical IT permissions) isn't enough to define authorization. Social expectations and even contractual restrictions shouldn't be enough to define "authorization" in this criminal statute.

To purposefully circumvent that access would be considered unauthorized.

Even as a normal non-bot user who sees the cloudflare landing page because they're on a VPN or happen to share an IP address with someone who was abusing the network? No, circumventing those gatekeeping functions is no different than circumventing a paywall on a newspaper website by deleting cookies or something. Or using a VPN or relay to get around rate limiting.

The idea of criminalizing scrapers or scripts would be a policy disaster.

1
S stocktoncrushed@sh.itjust.works

I mean, that's just capitalism.

Just wait till the bear is lobbying the game warden to put ankle weights on every rabbit. Also the bear would like an assault rifle. Stop being so anti-bear.
sugarcatdestroyer@lemmy.world wrote:

#124

So that he doesn't have to run after the rabbits, he will learn to raise them and manage them with a fake smile, providing them with a stable life lol.

Well, I think the thing is that we still live by the law: the strong do what they want, and the weak just whine and complain.

0
K kibiz0r@midwest.social

They already prosecute people under the unauthorized access provision. They just don’t prosecute rich people under it.
gamingchairmodel@lemmy.world wrote:

#125

They prosecuted and convicted a guy under the CFAA for figuring out the URL schema for an AT&T website designed to be accessed by the iPad when it first launched, and then just visiting that site by trying every URL in a script. And then his lawyer (the foremost expert on the CFAA) got his conviction overturned:

United States v. Andrew Auernheimer

Andrew “Weev” Auernheimer was convicted of violating the Computer Fraud and Abuse Act ("CFAA") in New Jersey federal court and sentenced to 41 months in federal prison in March of 2013 for revealing to media outlets that AT...

Electronic Frontier Foundation (www.eff.org)

We have to maintain that fight, to make sure that the legal system doesn't criminalize normal computer tinkering, like using scripts or even browser settings in ways that site owners don't approve of.

3
N notasharkinamansuit@lemmy.world

That’s the entire point, dipshit. I wish we got one of the cool techno dystopias rather than this boring corporate idiot one.
dojan@pawb.social wrote:

#126

I'm still holding out for Stephen Hawking to mail out Demon Summoning programs.

6
D davriellelouna@lemmy.world

This post did not contain any content.
kissaki@feddit.org wrote (last edited by kissaki@feddit.org):

#127

Perplexity argues that a platform’s inability to differentiate between helpful AI assistants and harmful bots causes misclassification of legitimate web traffic.

So, I assume Perplexity uses appropriate identifiable user-agent headers, to allow hosters to decide whether to serve them one way or another?

51
J jqubed@lemmy.world

I think in Cloudflare’s case the free tier website owners are more an example of just giving the users a limited product in hopes of enticing them to upgrade to the paid product with more features and better performance. Cloudflare might get some benefit in the ability to track end-users across more websites as part of their efforts to determine who is a real human versus a potentially-malicious bot, but I don’t think that really gives the same ROI that Facebook or other services extract from their “free” services where the users are the actual product.
_cryptagion@anarchist.nexus wrote:

#128

Actually, they've said that their free tier is what gives them a paid tier to sell to other people. They know most people aren't going to buy anything from them, but they are fine with that because they get to collect a ton of data about who is using hundreds of thousands of websites in order to figure out what traffic is bad. Without that huge user base, they can't do what they do.

And judging from the article, it's working out for them.

1
G glitchvid@lemmy.world

When sites put challenges like Anubis or other measures to authenticate that the viewer isn't a robot, and scrapers then employ measures to thwart that authentication (via spoofing or other means) I think that's a reasonable violation of the CFAA in spirit — especially since these mass scraping activities are getting attention for the damage they are causing to site operators (another factor in the CFAA, and one that would promote this to felony activity.)

The fact is these laws are already on the books, we may as well utilize them to shut down this objectively harmful activity AI scrapers are doing.
aatube@lemmy.dbzer0.com wrote:

#129

That same logic is how Aaron Swartz was cornered into suicide for scraping JSTOR, something widely agreed to be a bad idea by a wide range of lawspeople including SCOTUS in its 2021 decision Van Buren v. US that struck this interpretation off the books.

1
D davriellelouna@lemmy.world

This post did not contain any content.
tibi@lemmy.world wrote:

#130

You could say they are... Perplexed.

39
W wolflink@sh.itjust.works

This is a nice CloudFlare ad
pyre@lemmy.world wrote:

#131

yeah. still not worth dealing with fucking cloudflare. fuck cloudflare.

16
P pyre@lemmy.world

yeah. still not worth dealing with fucking cloudflare. fuck cloudflare.
int32@lemmy.dbzer0.com wrote:

#132

DEATH TO CLOUDFLARE!

5
K kissaki@feddit.org

Perplexity argues that a platform’s inability to differentiate between helpful AI assistants and harmful bots causes misclassification of legitimate web traffic.

So, I assume Perplexity uses appropriate identifiable user-agent headers, to allow hosters to decide whether to serve them one way or another?
lime@feddit.nu wrote:

#133

yeah it's almost like there is already a system for this in place
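
For context, the system alluded to here is presumably robots.txt combined with an honest User-Agent header. A minimal sketch of how a well-behaved crawler could check it, using Python's standard urllib.robotparser. The robots.txt content, the crawler name ExampleAIBot, and the helper may_fetch are made up for illustration:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site owner might publish.
# "ExampleAIBot" is an invented crawler name, not a real product.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

This only works if the crawler announces itself truthfully, which is the point of the question above: robots.txt is advisory and enforces nothing on its own.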

13
T tomalley8342@lemmy.world

Nah, that would also mean using Newpipe, YoutubeDL, Revanced, and Tachiyomi would be a crime, and it would only take the re-introduction of WEI to extend that criminalization to the rest of the web ecosystem. It would be extremely shortsighted and foolish of me to cheer on the criminalization of user spoofing and browser automation because of this.
glitchvid@lemmy.world wrote (last edited by glitchvid@lemmy.world):

#134

Do you think DoS/DDoS activities should be criminal?

If you're a site operator and the mass AI scraping is genuinely causing operational problems (not hard to imagine, I've seen what it does to my hosted repositories pages) should there be recourse? Especially if you're actively trying to prevent that activity (revoking consent in cookies, authorization captchas).

In general I think the idea of "your right to swing your fists ends at my face" applies reasonably well here — these AI scraping companies are giving lots of admins bloody noses and need to be held accountable.

I really am amenable to arguments wrt the right to an open web, but look at how many sites are hiding behind CF and other portals, or outright becoming hostile to any scraping at all; we're already seeing the rapid death of the ideal because of these malicious scrapers, and we should be using all available recourse to stop this bleeding.

0
K kokesh@lemmy.world

Is there some simply deployable PHP honeytrap for AI crawlers?
blargh513@sh.itjust.works wrote:

#135

Used to make tarpits with reverse proxies. Accept the connection and then hold the response until a few seconds before the default TCP timeout. Doesn't eat many resources as long as you have enough TCP connections and can reuse them effectively.
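
A rough sketch of that tarpit idea, written directly with Python's asyncio rather than a reverse proxy (a swapped-in technique, not the poster's setup). The timeout and margin values and the names hold_time, make_handler, and serve are assumptions; the post gives no concrete numbers:

```python
import asyncio
import contextlib

# Assumed values for illustration only.
CLIENT_TIMEOUT_S = 30.0  # how long the crawler is assumed to wait
SAFETY_MARGIN_S = 3.0    # answer this many seconds before it gives up

RESPONSE = b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: close\r\n\r\n"


def hold_time(client_timeout_s: float, margin_s: float) -> float:
    """Seconds to stall before answering; never negative."""
    return max(0.0, client_timeout_s - margin_s)


def make_handler(delay_s: float):
    """Build a connection handler that stalls for delay_s seconds."""

    async def handler(reader, writer):
        try:
            await reader.readline()       # read the request line, ignore the rest
            await asyncio.sleep(delay_s)  # the stall itself costs almost no CPU
            writer.write(RESPONSE)        # then send a tiny, valid, empty reply
            await writer.drain()
        except ConnectionError:
            pass                          # client gave up early; nothing to do
        finally:
            writer.close()
            with contextlib.suppress(ConnectionError):
                await writer.wait_closed()

    return handler


async def serve(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Run the tarpit until cancelled (host and port are assumptions)."""
    delay_s = hold_time(CLIENT_TIMEOUT_S, SAFETY_MARGIN_S)
    server = await asyncio.start_server(make_handler(delay_s), host, port)
    async with server:
        await server.serve_forever()
```

Running asyncio.run(serve()) would start it on the assumed local port. Each stalled connection is just a sleeping coroutine, which matches the observation that the cost is mostly open connections rather than CPU.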

0
G glitchvid@lemmy.world

Do you think DoS/DDoS activities should be criminal?

If you're a site operator and the mass AI scraping is genuinely causing operational problems (not hard to imagine, I've seen what it does to my hosted repositories pages) should there be recourse? Especially if you're actively trying to prevent that activity (revoking consent in cookies, authorization captchas).

In general I think the idea of "your right to swing your fists ends at my face" applies reasonably well here — these AI scraping companies are giving lots of admins bloody noses and need to be held accountable.

I really am amenable to arguments wrt the right to an open web, but look at how many sites are hiding behind CF and other portals, or outright becoming hostile to any scraping at all; we're already seeing the rapid death of the ideal because of these malicious scrapers, and we should be using all available recourse to stop this bleeding.
tomalley8342@lemmy.world wrote:

#136

DoS attacks are already a crime, so of course the need for some kind of solution is clear. But any proposal that gatekeeps the internet and restricts the freedoms with which the user can interact with it is no solution at all. To me, the openness of the web shouldn't be something that people just consider, or are amenable to. It should be the foundation that all reasonable proposals treat as a first principle.

0
J jve@lemmy.world

Right? Isn’t this a textbook DMCA violation, too?
whyjiffie@sh.itjust.works wrote:

#137

for us, not for them. wait until they argue in court that actually it's us at fault and we need to provide access or else

0
D davriellelouna@lemmy.world

This post did not contain any content.
kittenzrulz123@lemmy.blahaj.zone wrote:

#138

3
