linux-nerds.org

Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. They just memorize patterns really well.

Technology

347 posts 149 commenters 17 views

X xthexder@l.sw0.com

I'm not sure how you arrived at lime the mineral being a more likely question than lime the fruit. I'd expect someone asking about kidney stones would also be asking about foods that are commonly consumed.

This kind of just goes to show there are multiple ways something can be interpreted. Maybe a smart human would ask for clarification, but for sure AIs today will just happily spit out the first answer that comes up. LLMs are extremely "good" at making up answers to leading questions, even if the answers are completely false.
knock_knock_lemmy_in@lemmy.world wrote:

#224

A well-trained model should consider both types of lime. Failure is likely down to temperature and other model settings. This is not a measure of intelligence.
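
To illustrate what a temperature setting does, here is a rough Python sketch of temperature-scaled softmax. The two candidate readings and their scores are invented for the example and do not come from any real model.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for the two readings of "lime" (assumption, not real data).
options = ["lime (fruit)", "lime (mineral)"]
logits = [2.0, 1.0]

for t in (0.1, 1.0, 5.0):
    probs = softmax_with_temperature(logits, t)
    print(t, dict(zip(options, (round(p, 3) for p in probs))))
```

With these made-up scores, a low temperature puts almost all of the probability on the higher-scoring reading, while a higher temperature spreads it more evenly, so the second reading gets sampled more often.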
1 reply

1
A allah@lemm.ee

LOOK MAA I AM ON FRONT PAGE

archive.is

(archive.is)
melsaskca@lemmy.ca wrote:

#225

It's all "one instruction at a time" regardless of high processor speeds and words like "intelligent" being bandied about. "Reason" discussions should fall into the same query bucket as "sentience".
1 reply

8
A allah@lemm.ee

LOOK MAA I AM ON FRONT PAGE

archive.is

(archive.is)
mniot@programming.dev wrote:

#226

I don't think the article summarizes the research paper well. The researchers gave the AI models simple-but-large (which they confusingly called "complex") puzzles. Like Towers of Hanoi but with 25 discs.

The solution to these puzzles is nothing but patterns. You can write code that will solve the Tower puzzle for any size n and the whole program is less than a screen.
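
For reference, here is a minimal Python sketch of that kind of program. It is the textbook recursive solution, not anything taken from the paper; the peg names "A", "B", "C" and the helper `is_valid_solution` are made up for this illustration.

```python
def hanoi(n, source="A", target="C", spare="B"):
    """Yield the moves (from_peg, to_peg) that transfer n discs from source to target."""
    if n == 0:
        return
    # Move the top n-1 discs out of the way, onto the spare peg.
    yield from hanoi(n - 1, source, spare, target)
    # Move the largest disc to its destination.
    yield (source, target)
    # Move the n-1 discs from the spare peg onto the largest disc.
    yield from hanoi(n - 1, spare, target, source)


def is_valid_solution(n, moves):
    """Replay the moves and check that every move is legal and the puzzle ends solved."""
    pegs = {"A": list(range(n, 0, -1)), "B": [], "C": []}
    for src, dst in moves:
        if not pegs[src]:
            return False  # nothing to move from this peg
        disc = pegs[src].pop()
        if pegs[dst] and pegs[dst][-1] < disc:
            return False  # larger disc placed on a smaller one
        pegs[dst].append(disc)
    return pegs["C"] == list(range(n, 0, -1)) and not pegs["A"] and not pegs["B"]


moves = list(hanoi(3))
print(len(moves), moves)
```

The solution for n discs has 2^n - 1 moves, so 25 discs means 33,554,431 moves: trivial to generate with a loop or recursion, but very long to write out step by step.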

The problem the researchers see is that on these long, pattern-based solutions, the models follow a bad path and then just give up long before they hit their limit on tokens. The researchers don't have an answer for why this is, but they suspect that the reasoning doesn't scale.
1 reply

41
A allah@lemm.ee

LOOK MAA I AM ON FRONT PAGE

archive.is

(archive.is)
mangocats@feddit.it wrote:

#227

It's not just the memorization of patterns that matters, it's the recall of appropriate patterns on demand. Call it what you will, even if AI is just a better librarian for search work, that's value - that's the new Google.
1 reply

5
M melsaskca@lemmy.ca

It's all "one instruction at a time" regardless of high processor speeds and words like "intelligent" being bandied about. "Reason" discussions should fall into the same query bucket as "sentience".
mangocats@feddit.it wrote:

#228

My impression of LLM training and deployment is that it's actually massively parallel in nature - which can be implemented one instruction at a time - but isn't in practice.
1 reply

2
R redacted@infosec.pub

What confuses me is that we seemingly keep pushing away what counts as reasoning. Not too long ago, some smart algorithms or a bunch of instructions for software (if/then) were officially, by definition, software/computer reasoning. Logically, CPUs do it all the time. Suddenly, when AI is doing that with pattern recognition, memory and even more advanced algorithms, it's no longer reasoning? I feel like at this point a more relevant question is "What exactly is reasoning?". Before you answer, understand that most humans seemingly live by pattern recognition, not reasoning.

Reasoning system - Wikipedia

(en.wikipedia.org)
mangocats@feddit.it wrote:

#229

I think as we approach the uncanny valley of machine intelligence, it's no longer a cute cartoon but a menacing, creepy, not-quite imitation of ourselves.
1 reply

1
M mangocats@feddit.it

It's not just the memorization of patterns that matters, it's the recall of appropriate patterns on demand. Call it what you will, even if AI is just a better librarian for search work, that's value - that's the new Google.
cactopuses@lemm.ee wrote:

#230

While that's a fair idea, there are still two issues with it: hallucinations and the cost of running the models.

Unfortunately, it takes significant compute resources to perform even simple responses, and these responses can be totally made up, but still made to look completely real. It's gotten much better, sure, but blindly trusting these things (which many people do) can have serious consequences.
1 reply

8
X xatolos@reddthat.com

So, what you're saying here is that the A in AI actually stands for artificial, and it's not really intelligent and reasoning.

Huh.
coolmojo@lemmy.world wrote:

#231

The AI stands for Actually Indians /s
1 reply

1
K knock_knock_lemmy_in@lemmy.world

When given explicit instructions to follow, models failed because they had not seen similar instructions before.

This paper shows that there is no reasoning in LLMs at all, just extended pattern matching.
mangocats@feddit.it wrote:

#232

I'm not trained or paid to reason, I am trained and paid to follow established corporate procedures. On rare occasions my input is sought to improve those procedures, but the vast majority of my time is spent executing tasks governed by a body of (not quite complete, sometimes conflicting) procedural instructions.

If AI can execute those procedures as well as, or better than, human employees, I doubt employers will care if it is reasoning or not.
1 reply

5
A allah@lemm.ee

LOOK MAA I AM ON FRONT PAGE

archive.is

(archive.is)
billwashere@lemmy.world wrote:

#233

When are people going to realize that, in its current state, an LLM is not intelligent? It doesn’t reason. It does not have intuition. It’s a word predictor.
5 replies

52
K knock_knock_lemmy_in@lemmy.world

do we know that they don't and are incapable of reasoning.

"even when we provide the algorithm in the prompt—so that the model only needs to execute the prescribed steps—performance does not improve"
communist@lemmy.frozeninferno.xyz wrote, last edited by communist@lemmy.frozeninferno.xyz:

#234

That indicates that this particular model does not follow instructions, not that it is architecturally fundamentally incapable.
1 reply

0
A allah@lemm.ee

LOOK MAA I AM ON FRONT PAGE

archive.is

(archive.is)
nostradavid@programming.dev wrote:

#235

OK, and? A car doesn't run like a horse either, yet they are still very useful.

I'm fine with the distinction between human reasoning and LLM "reasoning".
3 replies

4
N nostradavid@programming.dev

OK, and? A car doesn't run like a horse either, yet they are still very useful.

I'm fine with the distinction between human reasoning and LLM "reasoning".
brutticus@midwest.social wrote:

#236

Then use a different word. "AI" and "reasoning" make people think of Skynet, which is what the weird tech bros want the lay person to think of. LLMs do not "think", but that's not to say I might not be persuaded of their utility. But that's not the way they are being marketed.
1 reply

9
K kreskin@lemmy.world

Lots of us who have done some time in search and relevancy early on knew ML was always largely breathless, overhyped marketing. It was endless buzzwords and misframing from the start, but it raised our salaries. Anything that an exec doesn't understand is profitable and worth doing.
wetbeardhairs@lemmy.dbzer0.com wrote, last edited by wetbeardhairs@lemmy.dbzer0.com:

#237

Machine-learning-based pattern matching is indeed very useful and profitable when applied correctly. It can identify (with confidence levels) features in data that would otherwise take an extremely well-trained person to spot. And even then it's just for the cursory search that takes the longest, before presenting the highest-confidence candidate results to a person for evaluation. Think: scanning medical data for indicators of cancer, reading live data from machines to predict failure, etc.
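
A rough Python sketch of that workflow, assuming some upstream model has already attached a confidence score to each finding. The `Detection` class, the 0.8 threshold, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    item_id: str       # hypothetical identifier of the scanned record
    label: str         # what the model thinks it found
    confidence: float  # model's confidence, from 0.0 to 1.0


def shortlist_for_review(detections, min_confidence=0.8, top_k=3):
    """Return the highest-confidence detections for a person to evaluate."""
    confident = [d for d in detections if d.confidence >= min_confidence]
    confident.sort(key=lambda d: d.confidence, reverse=True)
    return confident[:top_k]


# Invented sample data, purely for illustration.
detections = [
    Detection("scan-001", "possible tumor", 0.97),
    Detection("scan-002", "possible tumor", 0.41),
    Detection("scan-003", "possible tumor", 0.88),
    Detection("scan-004", "possible tumor", 0.93),
    Detection("scan-005", "possible tumor", 0.85),
]

for d in shortlist_for_review(detections):
    print(d.item_id, d.label, d.confidence)
```

The model only does the cursory pass; the final call on each shortlisted item still belongs to the person reviewing it.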

And what we call "AI" right now is just a much, much more user-friendly version of pattern matching. The primary feature of LLMs is that they natively interact with plain-language prompts.
1 reply

3
C communist@lemmy.frozeninferno.xyz

That indicates that this particular model does not follow instructions, not that it is architecturally fundamentally incapable.
knock_knock_lemmy_in@lemmy.world wrote:

#238

Not "This particular model". Frontier LRMs such as OpenAI’s o1/o3, DeepSeek-R1, Claude 3.7 Sonnet Thinking, and Gemini Thinking.

The paper shows that Large Reasoning Models as defined today cannot interpret instructions. Their architecture does not allow it.
1 reply

3
M mangocats@feddit.it

I'm not trained or paid to reason, I am trained and paid to follow established corporate procedures. On rare occasions my input is sought to improve those procedures, but the vast majority of my time is spent executing tasks governed by a body of (not quite complete, sometimes conflicting) procedural instructions.

If AI can execute those procedures as well as, or better than, human employees, I doubt employers will care if it is reasoning or not.
knock_knock_lemmy_in@lemmy.world wrote:

#239

Sure. We weren't discussing if AI creates value or not. If you ask a different question then you get a different answer.
1 reply

6
K kadup@lemmy.world

By that metric, you can argue Kasparov isn’t thinking during chess

Kasparov's thinking fits pretty much all biological definitions of thinking. Which is the entire point.
llewellyn@lemm.ee wrote:

#240

Is thinking necessarily biologic?
1 reply

0
E elbarto777@lemmy.world

LLMs deal with tokens. Essentially, predicting a series of bytes.

Humans do much, much, much, much, much, much, much more than that.
zexks@lemmy.world wrote:

#241

No. They don't. We just call them proteins.
2 replies

0
A allah@lemm.ee

LOOK MAA I AM ON FRONT PAGE

archive.is

(archive.is)
softestsapphic@lemmy.world wrote:

#242

Wow, it's almost like the computer scientists were saying this from the start but were shouted over by marketing teams.
4 replies

98
N nostradavid@programming.dev

OK, and? A car doesn't run like a horse either, yet they are still very useful.

I'm fine with the distinction between human reasoning and LLM "reasoning".
fishy@lemmy.today wrote:

#243

The guy selling the car doesn't tell you it runs like a horse; the guy selling you AI is telling you it has reasoning skills. AI absolutely has utility, but the guys making it are saying its utility is nearly limitless because Tesla has demonstrated there's no actual penalty for lying to investors.
1 reply

15
