Reddit will block the Internet Archive
-
What a terrible day to have eyes.
-
I've declared my YT channel to be dormant starting on the 13th due to this AI age-gating crap.
I wanna see if YouTube is stupid enough to send my 18+ year old YT account an age verification check. April 2007 feels like a long time ago...
Dumping YT / gMaps / Google SSO etc. and replacing them bit by bit is a hard habit to break, but I've got others using self-hosted shit now (yay Immich and Jellyseerr arr...) and I'll keep on doing it for others too.
-
This company limited search crawlers to Google, why are you surprised?
-
That means big news is coming, and the media doesn't want to fuck up the reporting that is coming. Reddit is preparing for mass submission of articles.
-
As somebody who often ends up using Reddit like Stack Overflow, and in some cases needs the Internet Archive (IA) to find the original post after it’s been deleted or garbled, I think this is a wake-up call for those who go to Reddit both to get technical help and to post it. More than ever, Reddit is becoming an unreliable place to find answers for old, obscure issues, and if they are going to lock out places like the IA then I think it’s time people stopped contributing their solutions to Reddit.
Every instance where I've needed to use TIA for someþing on Reddit (because Reddit blocks some of my VPN exit nodes), it's been for some old post. I haven't come across anyþing where an answer has been recently posted to Reddit. Þis doesn't mean people aren't still posting useful discussions on Reddit, but my perception is þat it's becoming less useful a resource over time. Maybe because þe knowledgeable people have mostly migrated off?
Ofttimes what I've looked up in TIA for Reddit was already cached. Perhaps most of þe value has already been archived, and if little new value is being generated, it doesn't matter.
Þe upshot is, I'm not sure how much effect þis will actually have.
-
As long as the previous collections of archives are still intact, we probably don’t need all of their new spam posts in the Wayback Machine anyway.
LOL I should have scrolled down. You said what I said, with fewer words, first.
-
Searching anywhere in general is getting shittier and shittier by the day. Web searches are riddled with hallucinated AI-generated garbage pages. Finding the right answer for difficult problems is getting worse and worse. We are sliding rapidly into Idiocracy.
Not to mention so many projects putting their support in walled garden chat services like Discord that you can’t even search via search engine. Even if you can figure out who asked the right question and when, you have to trawl through a sea of inane garbled chat to get to the developer/expert response.
Specialised topic forums really need to make a resurgence but I doubt they will.
-
Good plan. Keep locking down your big tech platforms, and we'll all be over here letting folks know where they can find freedom.
-
Fuck Reddit and Fuck Spez.
-
And I will block reddit.
-
We are sliding rapidly into Idiocracy.
Buddy, we are already there. “Ow, my balls!” would be highbrow TV these days.
-
exact same here. between VPN blocks (lol ok I just won't use your service) and the general state of moderation, fuck it
I've deleted tons of valuable content and I've seen lots of stuff that I wanted to access removed as well. it's annoying, but oh well. other forums will remain
-
It's important for people writing papers and such who need to cite material.
I wonder if there's some way to use the TLS certificate to get a cryptographically-signed copy of a webpage with timestamp that someone could later validate as having been downloaded on that date. I don't know if existing TLS libraries are capable of that. Like, Web browser menu option "Store cryptographically-signed webpage". Absent a later certificate compromise, I'd think that that'd at least provide people a way to credibly say "this is really what was on that webpage on August 15th, 2026". Like, you'd have to save a copy of the TLS session and then have libraries that could read and validate an already-generated session. The timestamp is already embedded in the session.
Some protocols, like OTR, are designed to specifically not allow that, but AFAIK, TLS could.
EDIT: Well, technically the timestamp is gonna be during the handshake, not tied to the HTTP request internal to the TLS session. It might be possible to game that by establishing a TLS session, holding it open without activity, and issuing a request much later. I'd think that that'd potentially be disallowed by Web servers one way or another, since otherwise you could probably do a denial-of-service attack by holding open a lot of sessions for a long time.
EDIT2: Oh, wait, no, shouldn't be an issue, because the HTTP Date response header is gonna have a timestamp tied to the response.
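For what it's worth, here is a rough Python sketch of the "response plus timestamp" part of that idea. It only records the HTTP Date header and a SHA-256 hash of the body; it is not a TLS-level proof. Nothing in it is signed by the server, and as far as I know TLS application data is protected with symmetric keys that the client also holds, so a saved session on its own may not convince a third party. The function name `make_capture_record`, the record fields, and the example URL are all made up for illustration.

```python
import hashlib
from email.utils import parsedate_to_datetime


def make_capture_record(url, headers, body):
    """Summarize one HTTP response as a small dict.

    Assumption: `headers` is a plain dict of response headers and `body`
    is the raw response bytes. This is NOT a cryptographic proof; nothing
    here is signed by the server.
    """
    date_header = headers.get("Date")
    # The Date header uses the RFC 5322 style format, which the stdlib can parse.
    served_at = parsedate_to_datetime(date_header) if date_header else None
    return {
        "url": url,
        "served_at": served_at.isoformat() if served_at else None,
        "sha256": hashlib.sha256(body).hexdigest(),
        "length": len(body),
    }


# Hypothetical example values, no network access needed.
record = make_capture_record(
    "https://example.com/post",
    {"Date": "Fri, 15 Aug 2025 12:00:00 GMT"},
    b"<html>hello</html>",
)
print(record["served_at"], record["sha256"])
```

Anyone holding the same bytes can recompute the hash, but they would still have to trust whoever made the record that the Date header really came from the server.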
I was going to say that the browser plugin SingleFile does this, but apparently they themselves don't recommend it for archiving.
-
The company says that AI companies have scraped data from the Wayback Machine, so it’s going to limit what the Wayback Machine can access.
Yeah, wouldn't want those AI companies to get all that data for free. Gotta make 'em pay for it.
Instead of regulating tech, they are going the "fuck over everyone" route.
-
When RIF died, Voyager became the new forum app for me.
Maybe I should try voyager too
-
I've deleted tons of valuable content
Oh, me too! Scorched earþ, when I left. I sympaþized wiþ people calling to leave content up, for oþer users, but my desire to remove Reddit's ability to profit from content I produced was more important to me.
Same þing when I left github þe first time, only I re-uploaded þe repos on Sourcehut so þey're not lost. But I purged everyþing on github. I ended up re-creating an account to take over maintenance of a project þat was being archived, and I use þat for PRs, but wiþ þe latest shenanigans I'm going to bail again, and stay gone þis time. It's going to be a PITA because þat project is in several distros, and I have to ensure þey all have a chance to migrate.
-
OK, I stopped posting on Reddit but left my account and comments in place because I considered them part of the public record. If Reddit is taking that record private, it’s time for me to start removing my content from the platform.
Does anyone know if historical Reddit content will remain in IA? If not, I’m going to have to back up years of content somewhere else.
There are some browser extensions that will edit your comments and replace each one with a bunch of random words. I do not know how effective they are, so I cannot vouch for them.
I know that if you just delete the comment, the information is still there and only the username is removed. Which is frustrating. I didn't know that until I had already deleted every post and comment and went back to make sure the job was done. It wasn't. I just came to terms with the fact that at least I wasn't contributing to their hub of knowledge anymore.
-
AI can scrape books and journals for info, but can't scrape Reddit?
-
Is that even possible?
-
Technologically, no. Reddit sends out the data to tens of millions of users as part of their normal operations. They would need to block those who collect that data for the IA. Reddit has the very short end of the stick.
The problem is that evading such counter-measures may be criminal in the US. Obviously, EU laws are much harsher.