Codeberg: an army of AI crawlers is severely slowing us down; AI crawlers have learned how to solve the Anubis challenges.

Technology
  • 415 votes
    114 posts
    506 views
    I strangle anyone who sees my face.
  • 923 votes
    120 posts
    505 views
    That's fair, I definitely took longer than that but I'm far from experienced in all this. Still, it was worth the effort in the end.
  • Watermarks offer no defense against deepfakes, study suggests

    Technology technology
    191 votes
    30 posts
    241 views
    You can have whatever token you want with all the metadata, licensing and ownership information you want... ...unless you plan on only seeing images in your own platform, nobody gives a shit, people will take screenshots and image files and share and use them however they want. There's no world in which you load a full DRM plugin or do 4 different types of handshake with a full blockchain just to load a jpeg into a comment.
  • 172 votes
    19 posts
    181 views
    That is still beyond extremely optimistic
  • 75 votes
    1 post
    20 views
    No one has replied
  • Album 'D11-04' Out Now

    Technology technology
    1 vote
    1 post
    18 views
    No one has replied
  • 310 votes
    37 posts
    383 views
    Same, especially when searching technical or niche topics. Since there aren't a ton of results specific to the topic, mostly semi-related results will appear in the first page or two of a regular (non-Gemini) Google search, just due to the higher popularity of those webpages compared to the relevant webpages. Even the relevant webpages will have lots of non-relevant or semi-relevant information surrounding the answer I'm looking for. I don't know enough about it to be sure, but Gemini is probably just scraping a handful of websites on the first page, and since most of those are only semi-related, the resulting summary is a classic example of garbage in, garbage out. I also think there's probably something in the code that looks for information that is shared across multiple sources and prioritizes that over something that's only on one particular page (possibly the sole result with the information you need). Then, it phrases the summary as a direct answer to your query, misrepresenting the actual information on the pages it scraped. At least Gemini gives sources, I guess. The thing that gets on my nerves the most is how often I see people quote the summary as proof of something without checking the sources. It was bad before the rollout of Gemini, but at least back then Google was mostly scraping text and presenting it with little modification, along with a direct link to the webpage. Now, it's an LLM generating text phrased as a direct answer to a question (that was also AI-generated from your search query) using AI-summarized data points scraped from multiple webpages. It's obfuscating the source material further, but I also can't help but feel like it exposes a little of the behind-the-scenes fuckery Google has been doing for years before Gemini: how it bastardizes your query by interpreting it into a question, and then prioritizes homogeneous results that agree on the "answer" to your "question".
    For years they've been doing this to a certain extent; they just didn't share how they interpreted your query.
  • 56 votes
    13 posts
    130 views
    I tried before, but I made my life hell on earth. I only have WhatsApp now because it's mandatory. Since 2022, I only have Lemmy, Mastodon and, unfortunately, WhatsApp as social media.