Reddit will block the Internet Archive
-
I am new to Lemmy, is there a fuckreddit sub?
If you seek a pleasant public forum, look about you.
-
I am new to Lemmy, is there a fuckreddit sub?
Why would you want to spend more time thinking about a dead site?
-
I am new to Lemmy, is there a fuckreddit sub?
Lemmy Explorer
Instance and Community Explorer for Lemmy
(lemmyverse.net)
This is a great site to search for communities. Doesn't seem like there is one.
-
It is my understanding that if you block the Wayback Machine from indexing your site, it will also delist the history.
They do archive sites against the owners' wishes when they consider it an important site for public archiving, like some news sites. They are under no obligation to delete the archives, and I hope they don’t.
-
I am new to Lemmy, is there a fuckreddit sub?
!reddit@lemmy.world
-
fuck spez
-
Why would you want to spend more time thinking about a dead site?
I just like to laugh at things I dislike. And I also like to see how bad it's getting. I was in the undelete sub and it was amazing.
-
Reddit will block the Internet Archive
Reddit caught AI companies scraping its data from the Internet Archive’s Wayback Machine, so it’s going to limit the Internet Archive from indexing some data.
The Verge (www.theverge.com)
Just more vindication for my ditching that trash heap of a platform. YT is probably going to be the next platform I ditch as they're going full Reddit now.
It's a matter of time before third-party YT front-ends start getting throttled or outright blocked like third-party Reddit front-ends.
-
I am new to Lemmy, is there a fuckreddit sub?
In a way, the entire lemmy community is the fuckreddit sub
-
OK, I stopped posting on Reddit but left my account and comments in place because I considered them part of the public record. If Reddit is taking that record private, it’s time for me to start removing my content from the platform.
Does anyone know if historical Reddit content will remain in IA? If not, I’m going to have to back up years of content somewhere else.
-
Time to just ignore them and scrape it anyways
-
Cuck boy getting pegged by post top op Garfield is definitely not something I had jotted down in my day-at-a-glance.
-
Just more vindication for my ditching that trash heap of a platform. YT is probably going to be the next platform I ditch as they're going full Reddit now.
It's a matter of time before third-party YT front-ends start getting throttled or outright blocked like third-party Reddit front-ends.
YouTube's already throttling users on their mobile site. They have these massive channel cards in their feeds, and the video titles/thumbnails disappear after a few offerings, leaving you to click blindly on a video.
-
Given that the Internet Archive is the de facto standard way to cite material as seen on a given date --- they're a trustworthy party that will probably persist for a long time --- that's going to make it harder to cite content on Reddit.
Damn, guess if you want Reddit data to train your AI, you’ll need to pay Spez for access.
-
OK, I stopped posting on Reddit but left my account and comments in place because I considered them part of the public record. If Reddit is taking that record private, it’s time for me to start removing my content from the platform.
Does anyone know if historical Reddit content will remain in IA? If not, I’m going to have to back up years of content somewhere else.
I'm assuming IA will continue to host their historical archives of Reddit; they just won't have any new captures after this. Unless IA has said otherwise, it'd be very strange for them to wipe their archive of Reddit.
-
I already gave up on Reddit a long time ago. Deleted everything.
-
lol i think that might be the worst/best thing I have seen in a long time
Unrelated but is your username a play on benzene?
-
The company says that AI companies have scraped data from the Wayback Machine, so it’s going to limit what the Wayback Machine can access.
Yeah, wouldn't want those AI companies to get all that data for free. Gotta make 'em pay for it.
-
As somebody who often ends up using Reddit like Stack Overflow, and in some cases needing the Internet Archive (IA) to find the original post after it’s been deleted or garbled, I think this is a wakeup call for those who go to Reddit both to get technical help and to post it. More than ever, Reddit is becoming an unreliable place to find answers for old obscure issues, and if they are going to lock out places like the IA, then I think it’s time people stopped contributing their solutions to Reddit.
-
They do archive sites against the owners' wishes when they consider it an important site for public archiving, like some news sites. They are under no obligation to delete the archives, and I hope they don’t.
Parties have archived the data from pushshift, which cover a lot of Reddit history.
kagis
Subreddit comments/submissions 2005-06 to 2024-12
This is the top 40,000 subreddits from reddit's history in separate files. You can use your torrent client to only download the subreddits you're interested in. These are from the pushshift dumps from 2005-06 to 2024-12, which can be found here. These are zstandard compressed ndjson files. Example python scripts for parsing the data can be found here. If you have questions, please reply to this reddit post or DM u/Watchful on reddit or respond to this post. Info Hash: 1614740ac8c94505e4ecb9d88be8bed7b6afddd4
Academic Torrents (academictorrents.com)
Subreddit comments/submissions 2005-06 to 2024-12
This is the top 40,000 subreddits from reddit's history in separate files. You can use your torrent client to only download the subreddit's you're interested in.
I mean, that won't have the past half year or some low-traffic subreddits, but...
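For anyone wondering what "zstandard compressed ndjson" means in practice: each file is one JSON object per line, compressed with zstd. Here's a rough sketch of the parsing half, not the torrent's own example scripts. The function names are mine, and the "author" field is an assumption about the record layout. It only uses the standard library, so it expects an already-decompressed byte stream. As far as I know, the third-party `zstandard` package can supply one via `ZstdDecompressor(...).stream_reader(fh)`, but treat that as a pointer rather than tested code.

```python
import io
import json
from collections import Counter
from typing import BinaryIO, Iterator


def iter_ndjson(stream: BinaryIO) -> Iterator[dict]:
    """Yield one dict per valid line of a newline-delimited JSON byte stream."""
    # Decode lazily so a multi-gigabyte dump is never held in memory at once.
    text = io.TextIOWrapper(stream, encoding="utf-8", errors="replace")
    for line in text:
        line = line.strip()
        if not line:
            continue  # blank line
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # damaged or truncated line: skip it
        if isinstance(record, dict):
            yield record


def count_authors(stream: BinaryIO) -> Counter:
    """Count records per author; "author" is an assumed field name."""
    counts: Counter = Counter()
    for record in iter_ndjson(stream):
        counts[record.get("author", "[unknown]")] += 1
    return counts
```

To try it without downloading anything, pass an `io.BytesIO` holding a few JSON lines to `count_authors`. For the real files you'd pass the decompressor's stream instead.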