linux-nerds.org

AI Chatbots Remain Overconfident — Even When They’re Wrong: Large Language Models appear to be unaware of their own mistakes, prompting concerns about common uses for AI chatbots.

Technology

58 posts 36 commenters 0 views

L lodespawn@aussie.zone

I guess, but it's like proving your phone's predictive text has confidence in its suggestions regardless of accuracy. Confidence is not an attribute of a math function; they are attributing intelligence to a predictive model.
fanciestpants@lemmy.world

wrote

#19

I work in risk management, but don't really have a strong understanding of LLM mechanics. "Confidence" is something that I quantify in my work, but it goes by different terms. In modeling outcomes, I may say that we have 60% confidence in achieving our budget objectives, while others would express the same result by saying our chances of achieving our budget objective are 60%. Again, I'm not sure if this is what the LLM is doing, but if it is producing a modeled prediction with a CDF of possible outcomes, then representing its result with 100% confidence means that the LLM didn't model any possible outcomes other than the answer it is providing, which does seem troubling.
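To make the risk-modeling sense of "confidence" concrete, here is a minimal sketch of reading a probability off a modeled distribution of outcomes. Everything in it is an assumption for illustration: the function name, the normal cost model, and the numbers are hypothetical, and nothing here claims to describe what an LLM does internally.

```python
import random

def chance_of_meeting_budget(budget, mean_cost, spread, trials=100_000, seed=0):
    """Share of simulated cost outcomes that land at or under the budget.

    This is the empirical CDF of the modeled costs, evaluated at `budget`.
    The normal cost model is an assumption made purely for illustration.
    """
    rng = random.Random(seed)
    hits = sum(rng.gauss(mean_cost, spread) <= budget for _ in range(trials))
    return hits / trials

# With real uncertainty in the model, the result is a probability between 0 and 1
# (roughly 60% for these hypothetical numbers).
uncertain = chance_of_meeting_budget(budget=105.0, mean_cost=100.0, spread=20.0)

# With zero spread only one outcome is ever modeled, so the "confidence"
# collapses to 100%.
degenerate = chance_of_meeting_budget(budget=105.0, mean_cost=100.0, spread=0.0)

print(f"with uncertainty: {uncertain:.0%}, with no spread: {degenerate:.0%}")
```

The second call is the troubling case: a 100% result only falls out when the model never considers any alternative outcome.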
L 1 Reply Last reply

1
O obinice@lemmy.world

People really do not like seeing opposing viewpoints, eh? There's disagreeing, and then there's downvoting to oblivion without even engaging in a discussion, haha.

Even if they're probably right, in such murky uncertain waters where we're not experts, one should have at least a little open mind, or live and let live.
thb@lemmy.world

wrote, last edited by thb@lemmy.world

#20

It's like talking with someone who thinks the Earth is flat. There isn't anything to discuss. They're objectively wrong.

Humans like to anthropomorphize everything. It's why you can see a face on a car's front grille. LLMs are ultra advanced pattern matching algorithms. They do not think or reason or have any kind of opinion or sentience, yet they are being utilized as if they do. Let's see how it works out for the world, I guess.
S 1 Reply Last reply

16
M modern_medicine_isnt@lemmy.world

It's easy: just ask the AI "are you sure?" until it stops changing its answer.

But seriously, LLMs are just advanced autocomplete.
lfrith@lemmy.ca

wrote

#21

They can even get math wrong, which surprised me. I had to tell it the answer was wrong before it would recalculate and get the correct answer. It was simple percentages of a list of numbers I had asked for.
G S J 3 Replies Last reply

7
S shalafi@lemmy.world

Neither are our brains.

“Brains are survival engines, not truth detectors. If self-deception promotes fitness, the brain lies. Stops noticing—irrelevant things. Truth never matters. Only fitness. By now you don’t experience the world as it exists at all. You experience a simulation built from assumptions. Shortcuts. Lies. Whole species is agnosiac by default.”

― Peter Watts, Blindsight (fiction)

Starting to think we're really not much smarter. "But LLMs tell us what we want to hear!" Been on FaceBook lately, or lemmy?

If nothing else, LLMs have woke me to how stupid humans are vs. the machines.
perspectivist@feddit.uk

wrote

#22

There are plenty of similarities in the output of both the human brain and LLMs, but overall they’re very different. Unlike LLMs, the human brain is generally intelligent - it can adapt to a huge variety of cognitive tasks. LLMs, on the other hand, can only do one thing: generate language. It’s tempting to anthropomorphize systems like ChatGPT because of how competent they seem, but there’s no actual thinking going on. It’s just generating language based on patterns and probabilities.
1 Reply Last reply

7
F fanciestpants@lemmy.world

I work in risk management, but don't really have a strong understanding of LLM mechanics. "Confidence" is something that I quantify in my work, but it goes by different terms. In modeling outcomes, I may say that we have 60% confidence in achieving our budget objectives, while others would express the same result by saying our chances of achieving our budget objective are 60%. Again, I'm not sure if this is what the LLM is doing, but if it is producing a modeled prediction with a CDF of possible outcomes, then representing its result with 100% confidence means that the LLM didn't model any possible outcomes other than the answer it is providing, which does seem troubling.
lodespawn@aussie.zone

wrote

#23

Nah, so their definition is the classical "how confident are you that you got the answer right". If you read the article, they asked a bunch of people and 4 LLMs a bunch of random questions, then asked the respondent whether they/it had confidence their answer was correct, and then checked the answer. The LLMs initially lined up with people (overconfident), but then when they iterated, shared results and asked further questions, the LLMs' confidence increased while people's tended to decrease to mitigate the overconfidence.

But the study still assumes enough intelligence to review past results and adjust accordingly, and disregards the fact that an AI isn't intelligence; it's a word prediction model based on a data set of written text tending to infinity. It's not assessing the validity of results, it's predicting what the answer is based on all previous inputs. The whole study is irrelevant.
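For anyone wondering what "overconfident" means once it is written down, here is a rough sketch of the comparison described above: stated confidence versus whether the answer turned out to be correct. The function name and the sample values are made up for illustration. They are not data from the study, and this is not necessarily the metric the authors used.

```python
def overconfidence(responses):
    """Mean stated confidence minus actual accuracy.

    `responses` is a list of (stated_confidence, was_correct) pairs, with
    confidence expressed as a fraction between 0 and 1.
    A positive result means more confident than accurate.
    """
    mean_confidence = sum(conf for conf, _ in responses) / len(responses)
    accuracy = sum(1 for _, correct in responses if correct) / len(responses)
    return mean_confidence - accuracy

# Hypothetical answers, for illustration only.
sample = [(0.9, True), (0.8, False), (0.9, False), (0.7, True)]
gap = overconfidence(sample)
print(f"overconfidence gap: {gap:+.3f}")
```

Note that this only compares numbers. It says nothing about whether the respondent "knows" anything.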
J 1 Reply Last reply

1
N nymnympseudonym@lemmy.world

This Nobel Prize winner and subject matter expert takes the opposite view
snotflickerman@lemmy.blahaj.zone

wrote, last edited by snotflickerman@lemmy.blahaj.zone

#24

Interesting talk but the number of times he completely dismisses the entire field of linguists kind of makes me think he's being disingenuous about his familiarity with it.

For one, I think he is dismissing holotes, the concept of "wholeness": when you cut something apart into its individual parts, you lose something about the bigger picture. This deconstruction of language misses the larger picture of the human body as a whole, and how every part of us, from our assemblage of organs down to our DNA, impacts how we interact with and understand the world. He may have a great definition of understanding, but it still sounds (to me) like it's potentially missing aspects of human/animal biologically based understanding.

For example, I have cancer, and about six months before I was diagnosed, I had begun to get more chronically depressed than usual. I felt hopeless and I didn't know why. Surprisingly, that's actually a symptom of my cancer. What understanding did I have that changed how I felt inside and how I understood the things around me? Suddenly I felt different about words and ideas, but nothing had changed externally; something had changed internally. The connections in my neural network had adjusted, and the feelings and associations with words and ideas were different, but I hadn't done anything to make that adjustment. No learning or understanding had happened. I had a mutation in my DNA that made that adjustment for me.

Further, I think he's deeply misunderstanding (possibly intentionally?) what linguists like Chomsky are saying when they say humans are born with language. They mean that we are born with a genetic blueprint to understand language. Just like animals are born with a genetic blueprint to do things they were never trained to do. Many animals are born and almost immediately stand up to walk. This is the same principle. There are innate biologically ingrained understandings that help us along the path to understanding. It does not mean we are born understanding language as much as we are born with the building blocks of understanding the physical world in which we exist.

Anyway, interesting talk, but I immediately am skeptical of anyone who wholly dismisses an entire field of thought so casually.

For what it's worth, I didn't downvote you and I'm sorry people are doing so.
1 Reply Last reply

10
L lfrith@lemmy.ca

They can even get math wrong, which surprised me. I had to tell it the answer was wrong before it would recalculate and get the correct answer. It was simple percentages of a list of numbers I had asked for.
gissamittjobb@lemmy.ml

wrote

#25

Language models are unsuitable for math problems broadly speaking. We already have good technology solutions for that category of problems. Luckily, you can combine the two - prompt the model to write a program that solves your math problem, then execute it. You're likely to see a lot more success using this approach.
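As an illustration of that approach, this is the sort of small program one might ask a model to write for the "percentages of a list of numbers" case mentioned earlier in the thread. The function name and the input list are hypothetical, since the original numbers were never posted.

```python
def percent_shares(values):
    """Return each value's share of the total, in percent, rounded to 2 places."""
    total = sum(values)
    if total == 0:
        raise ValueError("cannot compute shares of a zero total")
    return [round(100 * value / total, 2) for value in values]

# Hypothetical input list, used only to show the call.
shares = percent_shares([12, 30, 18, 40])
print(shares)
```

The arithmetic is done by the interpreter, not by next-token prediction, so the model only has to get the program right.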
J 1 Reply Last reply

8
P pro@programming.dev

This post did not contain any content.

AI Chatbots Remain Overconfident — Even When They’re Wrong - Dietrich College of Humanities and Social Sciences - Carnegie Mellon University

Large Language Models appear to be unaware of their own mistakes, prompting concerns about common uses for AI chatbots.

(www.cmu.edu)
fodor@lemmy.zip

wrote

#26

What a terrible headline. Self-aware? Really?
1 Reply Last reply

10
O obinice@lemmy.world

People really do not like seeing opposing viewpoints, eh? There's disagreeing, and then there's downvoting to oblivion without even engaging in a discussion, haha.

Even if they're probably right, in such murky uncertain waters where we're not experts, one should have at least a little open mind, or live and let live.
fodor@lemmy.zip

wrote

#27

I think there are two basic mistakes that you made. First, you think that we aren't experts, but it's definitely true that some of us have studied these topics for years in college or graduate school, and surely many other people are well read on the subject. Obviously you can't easily confirm our backgrounds, but we exist. Second, people who are somewhat aware of the topic might realize that it's not particularly productive to engage in discussion on it here because there's too much background information that's missing. It's often the case that experts don't try to discuss things because it's the wrong venue, not because they feel superior.
1 Reply Last reply

3
S shalafi@lemmy.world

Neither are our brains.

“Brains are survival engines, not truth detectors. If self-deception promotes fitness, the brain lies. Stops noticing—irrelevant things. Truth never matters. Only fitness. By now you don’t experience the world as it exists at all. You experience a simulation built from assumptions. Shortcuts. Lies. Whole species is agnosiac by default.”

― Peter Watts, Blindsight (fiction)

Starting to think we're really not much smarter. "But LLMs tell us what we want to hear!" Been on FaceBook lately, or lemmy?

If nothing else, LLMs have woke me to how stupid humans are vs. the machines.
aesthelete@lemmy.world

wrote

#28

Every thread about LLMs has to have some guy like yourself saying how LLMs are like humans and smarter than humans for some reason.
D 1 Reply Last reply

6
cosmonova@lemmy.world

wrote

#29

Is that a recycled piece from 2023? Because we already knew that.
1 Reply Last reply

6
kameecoding@lemmy.world

wrote

#30

Oh shit, they do behave like humans after all.
1 Reply Last reply

0
S shalafi@lemmy.world

Neither are our brains.

“Brains are survival engines, not truth detectors. If self-deception promotes fitness, the brain lies. Stops noticing—irrelevant things. Truth never matters. Only fitness. By now you don’t experience the world as it exists at all. You experience a simulation built from assumptions. Shortcuts. Lies. Whole species is agnosiac by default.”

― Peter Watts, Blindsight (fiction)

Starting to think we're really not much smarter. "But LLMs tell us what we want to hear!" Been on FaceBook lately, or lemmy?

If nothing else, LLMs have woke me to how stupid humans are vs. the machines.
saimen@feddit.org

wrote

#31

(scienceandnonduality.com)
1 Reply Last reply

2
cley_faye@lemmy.world

wrote

#32

prompting concerns

Oh you.
1 Reply Last reply

6
M modern_medicine_isnt@lemmy.world

It's easy: just ask the AI "are you sure?" until it stops changing its answer.

But seriously, LLMs are just advanced autocomplete.
cley_faye@lemmy.world

wrote

#33

Ah, the Monte Carlo approach to truth.
1 Reply Last reply

4
L lfrith@lemmy.ca

They can even get math wrong, which surprised me. I had to tell it the answer was wrong before it would recalculate and get the correct answer. It was simple percentages of a list of numbers I had asked for.
saimen@feddit.org

wrote

#34

I once gave it some kind of math problem (how to break down a certain amount of money into bills) and the LLM wrote a Python script for it, ran it, and thus gave me the correct answer. Kind of clever, really.
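For the curious, a script for that kind of problem could look roughly like the sketch below. This is not the script the LLM actually produced (that was never posted). The function name, the denominations, and the amount are all assumptions for illustration.

```python
def break_into_bills(amount, denominations=(100, 50, 20, 10, 5, 1)):
    """Split a whole-number amount into bills, largest denomination first.

    Greedy approach: take as many of the largest bill as fit, then move on.
    For the assumed denominations this uses the fewest bills; for arbitrary
    denominations greedy is not guaranteed to be minimal.
    """
    if amount < 0:
        raise ValueError("amount must not be negative")
    counts = {}
    remaining = amount
    for bill in sorted(denominations, reverse=True):
        count, remaining = divmod(remaining, bill)
        if count:
            counts[bill] = count
    if remaining:
        raise ValueError("amount cannot be paid out with these denominations")
    return counts

# Hypothetical amount, used only to show the call.
breakdown = break_into_bills(286)
print(breakdown)
```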
1 Reply Last reply

1
T thb@lemmy.world

It's like talking with someone who thinks the Earth is flat. There isn't anything to discuss. They're objectively wrong.

Humans like to anthropomorphize everything. It's why you can see a face on a car's front grille. LLMs are ultra advanced pattern matching algorithms. They do not think or reason or have any kind of opinion or sentience, yet they are being utilized as if they do. Let's see how it works out for the world, I guess.
saimen@feddit.org

wrote, last edited by saimen@feddit.org

#35

I think so too, but I am really curious what will happen when we give them "bodies" with sensors so they can explore the world and make individual "experiences". I could imagine they would act much more human after a while and might even develop some kind of sentience.

Of course they would also need some kind of memory and self-actualization processes.
J 1 Reply Last reply

0
A aesthelete@lemmy.world

Every thread about LLMs has to have some guy like yourself saying how LLMs are like humans and smarter than humans for some reason.
dontbelievethis@sh.itjust.works

wrote

#36

Some humans are not as smart as LLMs, I'll give them that.
A 1 Reply Last reply

1
G gissamittjobb@lemmy.ml

Language models are unsuitable for math problems broadly speaking. We already have good technology solutions for that category of problems. Luckily, you can combine the two - prompt the model to write a program that solves your math problem, then execute it. You're likely to see a lot more success using this approach.
jj4211@lemmy.world

wrote

#37

Also, generally the best interfaces for LLMs will combine non-LLM facilities transparently. The LLM might translate the prose into the format the math engine wants; an intermediate layer then recognizes a tag, submits the tagged excerpt to the math engine, and substitutes the chunk with the engine's output.
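A toy version of such an intermediate layer might look like the sketch below. The `<calc>` tag, the function names, and the tiny arithmetic evaluator standing in for a "math engine" are all invented for illustration. Real products use their own tool-calling formats, which this does not try to reproduce.

```python
import ast
import operator
import re

# Only plain arithmetic is allowed; anything else is rejected.
_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def evaluate(expression):
    """Stand-in 'math engine': evaluate +, -, *, / on numeric literals."""
    def walk(node):
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
            return _OPERATORS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported expression")
    return walk(ast.parse(expression.strip(), mode="eval").body)

# Hypothetical tag the model is assumed to emit around math it wants computed.
CALC_TAG = re.compile(r"<calc>(.*?)</calc>")

def fill_in_math(model_output):
    """Replace every tagged excerpt with the math engine's result."""
    return CALC_TAG.sub(lambda match: str(evaluate(match.group(1))), model_output)

text = fill_in_math("15% of 240 is <calc>240 * 15 / 100</calc>.")
print(text)
```

The model never sees the computed number in this setup; the layer swaps it in after generation, which is why the text half can stay oblivious to what the other half produced.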

Even for servicing a request to generate an image, the text generation model runs independently of the image generation, and the intermediate layer combines them. This can cause fun disconnects like the guy asking for a full glass of wine. The text generation half is completely oblivious to the image generation half. So it responds playing the role of a graphic artist dutifully doing the work without ever 'seeing' the image, and it assumes the image is good because that's consistent with training output. Then the user corrects it, and it goes about admitting that the picture (that it never 'looked' at) was wrong and retrying the image generator with the additional context, only to produce a similarly botched picture.
1 Reply Last reply

2
jj4211@lemmy.world

wrote, last edited by jj4211@lemmy.world

#38

They are not only unaware of their own mistakes, they are unaware of their successes. They are generating content that is, per their training corpus, consistent with the input. This gets eerie, and the 'uncanny valley' of the mistakes is all the more striking, but they are just generating content without any concept of 'mistake' or 'success', or of the content being a model for something else and not just a blend of stuff from the training data.

For example:

Me: Generate an image of a frog on a lilypad.
LLM: I'll try to create that — a peaceful frog on a lilypad in a serene pond scene. The image will appear shortly below.

<includes a perfectly credible picture of a frog on a lilypad, request successfully processed>

Me (lying): That seems to have produced a frog under a lilypad instead of on top.
LLM: Thanks for pointing that out! I'm generating a corrected version now with the frog clearly sitting on top of the lilypad. It’ll appear below shortly.

<includes another perfectly credible picture>

It didn't know anything about the picture, it just took the input at its word. A human would have stopped to say "uhh... what do you mean, the lilypad is on water and frog is on top of that?" Or if the human were really trying to just do the request without clarification, they might have tried to think "maybe he wanted it from the perspective of a fish, and he wanted the frog underwater?". A human wouldn't have gone "you are right, I made a mistake, here I've tried again" and included almost the exact same thing.

But the training data isn't predominantly people blatantly lying about such obvious things or second-guessing things that were so obviously done correctly.
V 1 Reply Last reply

5
