AI Chatbots Remain Overconfident — Even When They’re Wrong: Large Language Models appear to be unaware of their own mistakes, prompting concerns about common uses for AI chatbots.

Technology
  • Large language models aren’t designed to be knowledge machines - they’re designed to generate natural-sounding language, nothing more. The fact that they ever get things right is just a byproduct of their training data containing a lot of correct information. These systems aren’t generally intelligent, and people need to stop treating them as if they are. An LLM giving out wrong information isn’t a failure of the model itself - it’s a mismatch of expectations.

    Neither are our brains.

    “Brains are survival engines, not truth detectors. If self-deception promotes fitness, the brain lies. Stops noticing—irrelevant things. Truth never matters. Only fitness. By now you don’t experience the world as it exists at all. You experience a simulation built from assumptions. Shortcuts. Lies. Whole species is agnosiac by default.”

    ― Peter Watts, Blindsight (fiction)

    Starting to think we're really not much smarter. "But LLMs tell us what we want to hear!" Been on Facebook lately, or Lemmy?

    If nothing else, LLMs have woken me up to how stupid humans are vs. the machines.

  • Sounds pretty human to me. /s

    Sounds pretty human to me. no /s

  • I guess, but it's like proving your phone's predictive text has confidence in its suggestions regardless of accuracy. Confidence is not an attribute of a math function; they are attributing intelligence to a predictive model.

    I work in risk management, but don't really have a strong understanding of LLM mechanics. "Confidence" is something that I quantify in my work, but it goes by different terms. In modeling outcomes, I may say that we have 60% confidence in achieving our budget objectives, while others would express the same result by saying our chances of achieving our budget objective are 60%. Again, I'm not sure if this is what the LLM is doing, but if it is producing a modeled prediction with a CDF of possible outcomes, then representing its result with 100% confidence means that the LLM didn't model any possible outcomes other than the answer it is providing, which does seem troubling.
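To connect the risk-management framing to what a language model computes, here is a minimal sketch. At each step a model assigns scores to candidate next tokens and normalizes them into a probability distribution (a softmax). The candidate words and scores below are made up for illustration, not taken from any real model. Note that this distribution is over the next token, not over whether the whole answer is correct, and the confidence a chatbot states in words is itself generated text rather than a readout of these numbers.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidates and scores, invented for this example.
candidates = ["Paris", "Lyon", "Marseille"]
scores = [4.0, 1.5, 0.5]

probs = softmax(scores)
best_prob, best_word = max(zip(probs, candidates))

for word, p in zip(candidates, probs):
    print(f"{word}: {p:.3f}")
print("picked:", best_word)
```

Even when one candidate clearly dominates, the others keep nonzero probability, so the model has in a sense "modeled other outcomes"; that information just does not show up in the sentence it prints.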

  • People really do not like seeing opposing viewpoints, eh? There's disagreeing, and then there's downvoting to oblivion without even engaging in a discussion, haha.

    Even if they're probably right, in such murky uncertain waters where we're not experts, one should have at least a little open mind, or live and let live.

    It's like talking with someone who thinks the Earth is flat. There isn't anything to discuss. They're objectively wrong.

    Humans like to anthropomorphize everything. It's why you can see a face on a car's front grille. LLMs are ultra advanced pattern matching algorithms. They do not think or reason or have any kind of opinion or sentience, yet they are being utilized as if they do. Let's see how it works out for the world, I guess.

  • It's easy, just ask the AI "are you sure?" until it stops changing its answer.

    But seriously, LLMs are just advanced autocomplete.

    They can even get math wrong. Which surprised me. Had to tell it the answer was wrong for it to recalculate and then get the correct answer. I had asked for simple percentages of a list of numbers.

  • Neither are our brains.

    “Brains are survival engines, not truth detectors. If self-deception promotes fitness, the brain lies. Stops noticing—irrelevant things. Truth never matters. Only fitness. By now you don’t experience the world as it exists at all. You experience a simulation built from assumptions. Shortcuts. Lies. Whole species is agnosiac by default.”

    ― Peter Watts, Blindsight (fiction)

    Starting to think we're really not much smarter. "But LLMs tell us what we want to hear!" Been on Facebook lately, or Lemmy?

    If nothing else, LLMs have woken me up to how stupid humans are vs. the machines.

    There are plenty of similarities in the output of both the human brain and LLMs, but overall they’re very different. Unlike LLMs, the human brain is generally intelligent - it can adapt to a huge variety of cognitive tasks. LLMs, on the other hand, can only do one thing: generate language. It’s tempting to anthropomorphize systems like ChatGPT because of how competent they seem, but there’s no actual thinking going on. It’s just generating language based on patterns and probabilities.

  • I work in risk management, but don't really have a strong understanding of LLM mechanics. "Confidence" is something that I quantify in my work, but it goes by different terms. In modeling outcomes, I may say that we have 60% confidence in achieving our budget objectives, while others would express the same result by saying our chances of achieving our budget objective are 60%. Again, I'm not sure if this is what the LLM is doing, but if it is producing a modeled prediction with a CDF of possible outcomes, then representing its result with 100% confidence means that the LLM didn't model any possible outcomes other than the answer it is providing, which does seem troubling.

    Nah, their definition is the classical "how confident are you that you got the answer right". If you read the article, they asked a bunch of people and 4 LLMs a bunch of random questions, then asked the respondent whether they/it had confidence their answer was correct, and then checked the answer. The LLMs initially lined up with people (overconfident), but then when they iterated, shared results and asked further questions, the LLMs' confidence increased while people's tended to decrease to mitigate the overconfidence.

    But the study still assumes enough intelligence to review past results and adjust accordingly, and disregards the fact that an AI isn't intelligence; it's a word prediction model based on a data set of written text tending to infinity. It's not assessing the validity of results, it's predicting what the answer is based on all previous inputs. The whole study is irrelevant.
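For readers who want the "how confident are you that you got it right" definition in concrete terms, one common way to score it is to compare average stated confidence with actual accuracy. This is a minimal sketch of that idea, not the study's own method, and the numbers are invented for illustration.

```python
from statistics import mean

def overconfidence(confidences, correct):
    """Mean stated confidence minus accuracy; a positive value means overconfident."""
    if not confidences or len(confidences) != len(correct):
        raise ValueError("need equally sized, non-empty lists")
    accuracy = mean(1.0 if c else 0.0 for c in correct)
    return mean(confidences) - accuracy

# Made-up data: stated confidence per question, and whether the answer was right.
stated = [0.9, 0.8, 0.9, 0.7, 1.0]
was_correct = [True, False, True, False, True]

gap = overconfidence(stated, was_correct)
print(f"overconfidence gap: {gap:+.2f}")
```

A respondent who learns from feedback would be expected to shrink this gap on later rounds; per the description above, the people in the study tended to do that while the LLMs' confidence went up.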

  • This Nobel Prize winner and subject matter expert takes the opposite view

    Interesting talk but the number of times he completely dismisses the entire field of linguistics kind of makes me think he's being disingenuous about his familiarity with it.

    For one, I think he is dismissing holotes, the concept of "wholeness." That when you cut something apart into its individual parts, you lose something about the bigger picture. This deconstruction of language misses the larger picture of the human body as a whole, and how every part of us, from our assemblage of organs down to our DNA, impacts how we interact with and understand the world. He may have a great definition of understanding but it still sounds (to me) like it's potentially missing aspects of human/animal biologically based understanding.

    For example, I have cancer, and about six months before I was diagnosed, I had begun to get more chronically depressed than usual. I felt hopeless and I didn't know why. Surprisingly, that's actually a symptom of my cancer. What understanding did I have that changed how I felt inside and how I understood the things around me? Suddenly I felt different about words and ideas, but nothing had changed externally; something had changed internally. The connections in my neural network had adjusted, the feelings and associations with words and ideas were different, but I hadn't done anything to make that adjustment. No learning or understanding had happened. I had a mutation in my DNA that made that adjustment for me.

    Further, I think he's deeply misunderstanding (possibly intentionally?) what linguists like Chomsky are saying when they say humans are born with language. They mean that we are born with a genetic blueprint to understand language. Just like animals are born with a genetic blueprint to do things they were never trained to do. Many animals are born and almost immediately stand up to walk. This is the same principle. There are innate biologically ingrained understandings that help us along the path to understanding. It does not mean we are born understanding language as much as we are born with the building blocks of understanding the physical world in which we exist.

    Anyway, interesting talk, but I immediately am skeptical of anyone who wholly dismisses an entire field of thought so casually.

    For what it's worth, I didn't downvote you and I'm sorry people are doing so.

  • They can even get math wrong. Which surprised me. Had to tell it the answer was wrong for it to recalculate and then get the correct answer. I had asked for simple percentages of a list of numbers.

    Language models are unsuitable for math problems broadly speaking. We already have good technology solutions for that category of problems. Luckily, you can combine the two - prompt the model to write a program that solves your math problem, then execute it. You're likely to see a lot more success using this approach.
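As a rough illustration of the "write a program, then execute it" approach, this is the kind of small script a model might be prompted to produce for the percentages task mentioned above. The original task isn't spelled out, so this assumes "percentages" means each number's share of the total; the function name and sample numbers are made up.

```python
def percent_shares(values):
    """Return each value as a percentage of the sum of all values."""
    total = sum(values)
    if total == 0:
        raise ValueError("total is zero, shares are undefined")
    return [100 * v / total for v in values]

# Sample input chosen for illustration only.
numbers = [50, 30, 20]
shares = percent_shares(numbers)

for n, s in zip(numbers, shares):
    print(f"{n}: {s:.1f}%")
```

The point is that the arithmetic is done by the interpreter, not predicted token by token, so the result is only as wrong as the program is.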

  • What a terrible headline. Self-aware? Really?

  • People really do not like seeing opposing viewpoints, eh? There's disagreeing, and then there's downvoting to oblivion without even engaging in a discussion, haha.

    Even if they're probably right, in such murky uncertain waters where we're not experts, one should have at least a little open mind, or live and let live.

    I think there are two basic mistakes that you made. First, you think that we aren't experts, but it's definitely true that some of us have studied these topics for years in college or graduate school, and surely many other people are well read on the subject. Obviously you can't easily confirm our backgrounds, but we exist. Second, people who are somewhat aware of the topic might realize that it's not particularly productive to engage in discussion on it here because there's too much background information that's missing. It's often the case that experts don't try to discuss things because it's the wrong venue, not because they feel superior.

  • Neither are our brains.

    “Brains are survival engines, not truth detectors. If self-deception promotes fitness, the brain lies. Stops noticing—irrelevant things. Truth never matters. Only fitness. By now you don’t experience the world as it exists at all. You experience a simulation built from assumptions. Shortcuts. Lies. Whole species is agnosiac by default.”

    ― Peter Watts, Blindsight (fiction)

    Starting to think we're really not much smarter. "But LLMs tell us what we want to hear!" Been on Facebook lately, or Lemmy?

    If nothing else, LLMs have woken me up to how stupid humans are vs. the machines.

    Every thread about LLMs has to have some guy like yourself saying how LLMs are like humans and smarter than humans for some reason.

  • Is that a recycled piece from 2023? Because we already knew that.

  • Oh shit, they do behave like humans after all.

  • prompting concerns

    Oh you.

  • It's easy, just ask the AI "are you sure?" until it stops changing its answer.

    But seriously, LLMs are just advanced autocomplete.

    Ah, the Monte Carlo approach to truth.

  • They can even get math wrong. Which surprised me. Had to tell it the answer was wrong for it to recalculate and then get the correct answer. I had asked for simple percentages of a list of numbers.

    I once gave an LLM some kind of math problem (how to break down a certain amount of money into bills) and it wrote a Python script for it, ran it and thus gave me the correct answer. Kind of clever really.
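The script the LLM wrote isn't shown, so the following is only a guess at what such a script could look like: a greedy breakdown that takes as many of the largest bill as possible, then moves to the next one. The denominations are an assumption (US-style whole-dollar bills); greedy gives the fewest bills for this set but is not guaranteed to for arbitrary denominations.

```python
def break_into_bills(amount, denominations=(100, 50, 20, 10, 5, 1)):
    """Greedily split a whole-number amount into bills, largest first."""
    if amount < 0:
        raise ValueError("amount must be non-negative")
    result = {}
    remaining = amount
    for bill in sorted(denominations, reverse=True):
        count, remaining = divmod(remaining, bill)
        if count:
            result[bill] = count
    if remaining:
        raise ValueError("amount cannot be made from these denominations")
    return result

# Example amount chosen for illustration.
breakdown = break_into_bills(286)
for bill, count in breakdown.items():
    print(f"{count} x {bill}")
```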

  • It's like talking with someone who thinks the Earth is flat. There isn't anything to discuss. They're objectively wrong.

    Humans like to anthropomorphize everything. It's why you can see a face on a car's front grille. LLMs are ultra advanced pattern matching algorithms. They do not think or reason or have any kind of opinion or sentience, yet they are being utilized as if they do. Let's see how it works out for the world, I guess.

    I think so too, but I am really curious what will happen when we give them "bodies" with sensors so they can explore the world and make individual "experiences". I could imagine they would act much more human after a while and might even develop some kind of sentience.

    Of course they would also need some kind of memory and self-actualization processes.

  • Every thread about LLMs has to have some guy like yourself saying how LLMs are like humans and smarter than humans for some reason.

    Some humans are not as smart as LLMs, I'll give them that.
