AI agents wrong ~70% of time: Carnegie Mellon study

Technology
  • Firefox 140 Brings Tab Unload, Custom Search & New ESR

    Technology
    234 votes
    41 posts
    165 views
    Read again. I quoted something along the lines of "just as much a development decision as a marketing one" and I said, it wasn't a development decision, so what's left? Firefox released just as frequently before, just that they didn’t increase the major version that often. This does not appear to be true. Why don't you take a look at the version history instead of some marketing blog post? https://www.mozilla.org/en-US/firefox/releases/ Version 2 had 20 releases within 730 days, averaging one release every 36.5 days. Version 3 had 19 releases within 622 days, averaging 32.7 days per release. But these releases were unscheduled, so they were released when they were done. Now they are on a fixed 90-day schedule, no matter if anything worthwhile was complete or not, plus hotfix releases whenever they are necessary. That's not faster, but instead scheduled, and also they are incrementing the major version even if no major change was included. That's what the blog post was alluding to. In the before times, a major version number increase indicated major changes. Now it doesn't anymore, which means sysadmins still need to consider each release a major release, even if it doesn't contain major changes because it might contain them and the version name doesn't say anything about whether it does or not. It's nothing but a marketing change, moving from "version numbering means something" to "big number go up".
  • New Orleans debates real-time facial recognition legislation

    Technology
    150 votes
    12 posts
    55 views
    Palantir had a contract with New Orleans starting around 2012 to create their predictive policing tech that scans surveillance cameras for very vague details and still misidentifies people. It's very similar to Lavender, the tech they use to identify members of Hamas and attack with drones. This results in misidentified targets ~10% of the time, according to the IDF (likely it's a much higher misidentification rate than 10%). Palantir picked Louisiana over somewhere like San Francisco because they knew it would be a lot easier to violate rights and privacy here and get away with it. Whatever they decide in New Orleans on Thursday, during this Council meeting that nobody cares about, will likely be the first on-the-books legal basis of its kind to track civilians in the U.S. and allow the federal government to take control over that ability whenever they want. This could also set a precedent for use in other states. Guess who's running the entire country right now, and just gave high-ranking army contracts to Palantir employees for "no reason" while they are also receiving a multimillion-dollar federal contract to create an insane database on every American and giant data centers are being built all across the country.
  • 311 votes
    37 posts
    66 views
    Same, especially when searching technical or niche topics. Since there aren't a ton of results specific to the topic, mostly semi-related results will appear on the first page or two of a regular (non-Gemini) Google search, just due to the higher popularity of those webpages compared to the relevant webpages. Even the relevant webpages will have lots of non-relevant or semi-relevant information surrounding the answer I'm looking for. I don't know enough about it to be sure, but Gemini is probably just scraping a handful of websites on the first page, and since most of those are only semi-related, the resulting summary is a classic example of garbage in, garbage out. I also think there's probably something in the code that looks for information that is shared across multiple sources and prioritizes that over something that's only on one particular page (possibly the sole result with the information you need). Then, it phrases the summary as a direct answer to your query, misrepresenting the actual information on the pages it scraped. At least Gemini gives sources, I guess. The thing that gets on my nerves the most is how often I see people quote the summary as proof of something without checking the sources. It was bad before the rollout of Gemini, but at least back then Google was mostly scraping text and presenting it with little modification, along with a direct link to the webpage. Now, it's an LLM generating text phrased as a direct answer to a question (that was also AI-generated from your search query) using AI-summarized data points scraped from multiple webpages. It's obfuscating the source material further, but I also can't help but feel like it exposes a little of the behind-the-scenes fuckery Google has been doing for years before Gemini: how it bastardizes your query by interpreting it into a question, and then prioritizes homogeneous results that agree on the "answer" to your "question". For years they've been doing this to a certain extent; they just didn't share how they interpreted your query.
  • A receipt printer cured my procrastination [ADHD]

    Technology
    120 votes
    21 posts
    93 views
    cygnosis@lemmy.world
    Good to know. Also an easy problem to fix: just use phenol-free paper.
  • 391 votes
    65 posts
    116 views
    Yes and no. Yes, people are this stupid. But also bot networks. But also alt accounts. And many of those stupid people let the algorithm pick their political views for them, and that algorithm is manipulated by both the bot activity and the platform holders.
  • Super Human In Transit - Living

    Technology
    0 votes
    1 post
    11 views
    No one has replied
  • Stepping outside the algorithm

    Technology
    19 votes
    1 post
    10 views
    No one has replied
  • Researchers develop recyclable, healable electronics

    Technology
    15 votes
    3 posts
    21 views
    Aren't the most common failure modes of electronics capacitors dying, followed closely by heat in chips? This research sounds cool and all.