linux-nerds.org

Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. They just memorize patterns really well.

Technology

356 posts 149 commenters 3.1k views

allah@lemm.ee
8 June 2025, 10:59

LOOK MAA I AM ON FRONT PAGE
skisnow@lemmy.ca

wrote on 9 June 2025, 04:43

#193

What's hilarious/sad is the response to this article over on reddit's "singularity" sub, in which all the top comments are people who've obviously never got all the way through a research paper in their lives all trashing Apple and claiming their researchers don't understand AI or "reasoning". It's a weird cult.

26
freakinsteve@lemmy.world

wrote on 9 June 2025, 04:59

#194

NOOOOOOOOO

SHIIIIIIIIIITT

SHEEERRRLOOOOOOCK

32
gamechld@lemmy.world
8 June 2025, 22:36

Most humans don't reason. They just parrot shit too. The design is very human.
skisnow@lemmy.ca

wrote on 9 June 2025, 05:08

#195

I hate this analogy. As a throwaway whimsical quip it'd be fine, but it's specious enough that I keep seeing it used earnestly by people who think that LLMs are in any way sentient or conscious, so it's lowered my tolerance for it as a topic even if you did intend it flippantly.

9
800xl@lemmy.world

wrote on 9 June 2025, 06:00

#196

Except for Siri, right? Lol

1
threeme2189@lemmy.world

wrote on 9 June 2025, 06:10

#197

Apple Intelligence

1
jdpoz@lemmy.world
9 June 2025, 04:22

It’s an expensive carbon-spewing parrot.
threeme2189@lemmy.world

wrote on 9 June 2025, 06:11

#198

It's a very resource-intensive autocomplete.

9
intensely_human@lemm.ee
9 June 2025, 00:10

Fair, but the same is true of me. I don't actually "reason"; I just have a set of algorithms memorized by which I propose a pattern that seems like it might match the situation, then a different pattern by which I break the situation down into smaller components and then apply patterns to those components. I keep the process up for a while. If I find a "nasty logic error" pattern match at some point in the process, I "know" I've found a "flaw in the argument" or "bug in the design".

But there's no from-first-principles method by which I developed all these patterns; it's just things that have survived the test of time when other patterns have failed me.

I don't think people are underestimating the power of LLMs to think; I just think people are overestimating the power of humans to do anything other than language prediction and sensory pattern prediction.
conicalscientist@lemmy.world

wrote on 9 June 2025, 06:34

#199

This whole era of AI has certainly pushed us to the brink of existential-crisis territory. I think some are even frightened to entertain the prospect that we may not be all that much better than meat machines that, on a basic level, do pattern matching drawing on the sum total of individual life experience (aka the dataset).

Higher reasoning is taught to humans. We have the capability. That's why we spend the first quarter of our lives in education. Sometimes not all of us are able.

I'm sure it would certainly make waves if researchers did studies based on whether dumber humans are any different than AI.

1
minoscopede@lemmy.world

wrote on 9 June 2025, 07:20, last edited by minoscopede@lemmy.world on 6 Sept. 2025, 09:24

#200

I see a lot of misunderstandings in the comments 🫤

This is a pretty important finding for researchers, and it's not obvious by any means. This finding is not showing a problem with LLMs' abilities in general. The issue they discovered is specifically for so-called "reasoning models" that iterate on their answer before replying. It might indicate that the training process is not sufficient for true reasoning.

Most reasoning models are not incentivized to think correctly, and are only rewarded based on their final answer. This research might indicate that's a flaw that needs to be corrected before models can actually reason.

69
vrighter@discuss.tchncs.de
9 June 2025, 03:54

An llm also works on fixed transition probabilities. All the training is done during the generation of the weights, which are the compressed state transition table. After that, it's just a regular old markov chain. I don't know why you seem so fixated on getting different output if you provide different input (as I said, each token generated is a separate independent invocation of the llm with a different input). That is true of most computer programs.

It's just an implementation detail. The markov chains we are used to have a very short context, due to combinatorial explosion when generating the state transition table. With llms, we can use a much much longer context. Put that context in, it runs through the completely immutable model, and out comes a probability distribution. Any calculations done while computing this probability distribution are then discarded, the chosen token is added to the context, and the program is run again with zero prior knowledge of any reasoning about the token it just generated. It's a separate execution with absolutely nothing shared between them, so there can't be any "adapting" going on.
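
For readers following along, the generation loop described above can be sketched roughly as follows. This is a minimal toy, not any real model: `next_token_distribution`, the four-token `VOCAB`, and the arithmetic rule inside it are all hypothetical stand-ins for a frozen network. The only point it shows is that each step is a fresh call whose result depends solely on the context passed in.

```python
import random

# Hypothetical toy vocabulary; a real model has tens of thousands of tokens.
VOCAB = ["a", "b", "c", "<eos>"]

def next_token_distribution(context):
    """Toy stand-in for a frozen model: a pure function of the context."""
    # Arbitrary fixed rule playing the role of fixed weights (an assumption).
    weights = [1 + (len(context) + i) % 3 for i in range(len(VOCAB))]
    total = sum(weights)
    return {tok: w / total for tok, w in zip(VOCAB, weights)}

def generate(prompt, max_new_tokens, rng):
    """Autoregressive loop: call the model, sample a token, append, repeat."""
    context = list(prompt)
    for _ in range(max_new_tokens):
        # Fresh call each step; nothing is carried over except the context.
        dist = next_token_distribution(tuple(context))
        token = rng.choices(list(dist), weights=list(dist.values()), k=1)[0]
        if token == "<eos>":
            break
        context.append(token)
    return context

if __name__ == "__main__":
    print(generate(["a"], 5, random.Random(0)))
```

Whether this statelessness makes the whole thing "just a Markov chain" is exactly what the rest of the thread argues about; the sketch takes no side on that.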
auraithx@lemmy.dbzer0.com

wrote on 9 June 2025, 07:28, last edited by auraithx@lemmy.dbzer0.com on 6 Sept. 2025, 09:28

#201

Because transformer architecture is not equivalent to a probabilistic lookup. A Markov chain assigns probabilities based on a fixed-order state transition, without regard to deeper structure or token relationships. An LLM processes the full context through many layers of non-linear functions and attention heads, each layer dynamically weighting how each token influences every other token.

Although weights do not change during inference, the behavior of the model is not fixed in the way a Markov chain’s state table is. The same model can respond differently to very similar prompts, not just because the inputs differ, but because the model interprets structure, syntax, and intent in ways that are contextually dependent. That is not just longer context; it is fundamentally more expressive computation.

The process is stateless across calls, yes, but it is not blind. All relevant information lives inside the prompt, and the model uses the attention mechanism to extract meaning from relationships across the sequence. Each new input changes the internal representation, so the output reflects contextual reasoning, not a static response to a matching pattern. Markov chains cannot replicate this kind of behavior no matter how many states they include.
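
To make the "fixed-order state transition" side of the comparison concrete, here is a minimal sketch of a classic order-2 Markov chain over words. The corpus, the function names `train_markov` and `next_token_probs`, and the choice of order 2 are illustrative assumptions, not anything from the paper or the thread.

```python
from collections import Counter, defaultdict

def train_markov(tokens, order=2):
    """Count transitions from each `order`-token state to the next token."""
    table = defaultdict(Counter)
    for i in range(len(tokens) - order):
        state = tuple(tokens[i:i + order])
        table[state][tokens[i + order]] += 1
    return table

def next_token_probs(table, context, order=2):
    """Look up the distribution using only the last `order` tokens."""
    state = tuple(context[-order:])
    counts = table.get(state)
    if not counts:
        return {}  # unseen state: a plain lookup table has no answer
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.items()}

corpus = "the cat sat on the mat and the cat ran".split()
table = train_markov(corpus, order=2)

if __name__ == "__main__":
    # Everything before the last two tokens is ignored by construction.
    print(next_token_probs(table, ["a", "dog", "saw", "the", "cat"]))
    print(next_token_probs(table, ["the", "dog"]))  # state never seen
```

In this explicit-table form, anything outside the last two tokens cannot influence the result, and a state that never occurred in the corpus yields nothing at all.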

0
xatolos@reddthat.com

wrote on 9 June 2025, 07:38

#202

So, what you're saying here is that the A in AI actually stands for artificial, and it's not really intelligent and reasoning.

Huh.

8
vrighter@discuss.tchncs.de

wrote on 9 June 2025, 07:47

#203

An llm works the same way! Once it's trained, none of what you said applies anymore. The same model can respond differently with the same inputs specifically because after the llm does its job, sometimes we intentionally don't pick the most likely token, but choose a different one instead. RANDOMLY. Set the temperature to 0 and it will always reply with the same answer. And llms also have a fixed order state transition. Just because you only typed one word doesn't mean that that token is not preceded by n-1 null tokens. The llm always receives the same number of tokens. It cannot work with an arbitrary number of tokens.

All relevant information "remains in the prompt" only until it slides out of the context window, just like any markov chain.
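
The temperature point above can be illustrated with a small sampling sketch. The function name `sample_token` and the example scores are hypothetical; treating temperature 0 as plain greedy (argmax) decoding is a common convention and is assumed here rather than taken from any particular library.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick a token index from raw model scores (`logits`)."""
    if temperature == 0:
        # Greedy decoding: always the highest-scoring token, no randomness.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Softmax with temperature; subtract the max for numerical stability.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    weights = [math.exp(x - m) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

if __name__ == "__main__":
    logits = [1.0, 3.0, 2.0]  # made-up scores for three candidate tokens
    rng = random.Random(42)
    print([sample_token(logits, 0, rng) for _ in range(5)])
    print([sample_token(logits, 1.0, rng) for _ in range(5)])
```

In this idealized sketch the temperature-0 path never consults the random number generator, so the same scores always give the same token; real inference stacks can still show small numerical differences between runs.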

1
turmacar@lemmy.world
9 June 2025, 04:29
"Computer" meaning a mechanical/electro-mechanical/electrical machine wasn't used until around or after WWII.

Babbage's difference/analytical engines weren't confusing because people called them computers; they didn't.
"On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question."
- Charles Babbage
If you give any computer, human or machine, random numbers, it will not give you "correct answers".

It's possible Babbage lacked the social skills to detect sarcasm. We also have several high profile cases of people just trusting LLMs to file legal briefs and official government 'studies' because the LLM "said it was real".
appletea@lemmy.zip

wrote on 9 June 2025, 07:48

#204

What they mean is that before Turing, "computer" was literally a person's job description. You hand a professional a stack of calculations with some typos, and part of the job is correcting them. A newfangled machine comes along with the same name as the job, and among the first things people are gonna ask about is where it falls short.

Like, if I made a machine called "assistant", it'd be natural for people to point out and ask about all the things a person can do that a machine just never could.

0
zacryon@feddit.org

wrote on 9 June 2025, 07:54

#205

Some AI researchers found it obvious as well, in that they had suspected it and had some indications. But it's good to see more data affirming this assessment.

7
theherk@lemmy.world

wrote on 9 June 2025, 07:59

#206
Yeah these comments have the three hallmarks of Lemmy:
- AI is just autocomplete mantras.
- Apple is always synonymous with bad and dumb.
- Rare pockets of really thoughtful comments.
Thanks for being at least the latter.

19
auraithx@lemmy.dbzer0.com

wrote on 9 June 2025, 08:09

#207

You're conflating surface-level architectural limits with core functional behaviour. Yes, an LLM is deterministic at temperature 0 and produces the same output for the same input, but that does not make it equivalent to a Markov chain. A Markov chain defines transitions based on fixed-order memory and static probabilities. An LLM generates output by applying a series of matrix multiplications, activations, and attention-weighted context aggregations across multiple layers, where the representation of each token is conditioned on the entire input sequence, not just on recent tokens.

While the model has a maximum token limit, it does not receive a fixed-length input filled with nulls. It processes variable-length input sequences up to the context limit, and attention masks control which positions are used. These are not hardcoded state transitions; they are dynamically computed weightings over continuous embeddings, where meaning arises from the interaction of tokens, not from simple position or order alone.

Saying that output diversity is just randomness misunderstands why random sampling exists: to explore the rich distribution the model has learned from data, not to fake intelligence. The depth of its output space comes from how it models relationships, hierarchies, syntax, and semantics through training. Markov chains do not do any of this. They map sequences to likely next symbols without modeling internal structure. An LLM’s output reflects high-dimensional reasoning over the prompt. That behavior cannot be reduced to fixed transition logic.
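
As a rough picture of the "attention-weighted context aggregation" mentioned in this post, here is a heavily simplified single-head self-attention in plain Python. It is a sketch under loud assumptions: the query, key, and value projections are the identity (real models learn them), there is no causal mask, and the 2-dimensional input vectors are made up. It only shows that the same function accepts sequences of different lengths and that a token's output changes when another token is added.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Single-head attention with identity projections (a simplification).

    Each output vector is a weighted average of all input vectors, with
    weights derived from scaled dot products between positions.
    """
    d = len(vectors[0])
    outputs = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vectors]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, vectors)) for j in range(d)]
        )
    return outputs

if __name__ == "__main__":
    two = self_attention([[1.0, 0.0], [0.0, 1.0]])
    three = self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    print(two[0])    # representation of token 0 in a 2-token sequence
    print(three[0])  # same token, different representation with 3 tokens
```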

0
clent@lemmy.dbzer0.com
8 June 2025, 22:04

Intelligence has a very clear definition.

It requires the ability to acquire knowledge, understand knowledge and use knowledge.

No one has been able to create a system that can understand knowledge; therefore, to me, none of it is artificial intelligence. Each generation is merely more and more complex knowledge models. Useful in many ways but never intelligent.
8uurg@lemmy.world

wrote on 9 June 2025, 08:13

#208

Wouldn't the algorithm that creates these models in the first place fit the bill? Given that it takes a bunch of text data, and manages to organize this in such a fashion that the resulting model can combine knowledge from pieces of text, I would argue so.

What is understanding knowledge anyways? Wouldn't humans not fit the bill either, given that for most of our knowledge we do not know why it is the way it is, or even had rules that were - in hindsight - incorrect?

If a model is more capable of solving a problem than an average human being, isn't it, in its own way, some form of intelligent? And, to take things to the utter extreme, wouldn't evolution itself be intelligent, given that it causes intelligent behavior to emerge, for example, viruses adapting to external threats? What about an (iterative) optimization algorithm that finds solutions that no human would be able to find?

Intelligence has a very clear definition.

I would disagree; it is probably one of the hardest things to define out there. Its definition has changed greatly over time and is core to the study of philosophy. Every time a being or thing fits a definition of intelligence, the definition is often altered to exclude it, as has been done many times.

2
mwa@thelemmy.club

wrote on 9 June 2025, 09:02

#209

I also assume local LLMs do this as well (since local LLMs can "reason" now).
I won't call them open-source LLMs because that's not fully true.

0
vrighter@discuss.tchncs.de

wrote on 9 June 2025, 09:23, last edited by vrighter@discuss.tchncs.de on 6 Sept. 2025, 11:25

#210

The probabilities are also fixed after training. You seem to be conflating running the llm with different input with the model somehow adapting. The new context goes into the same fixed model. And yes, it can be reduced to fixed transition logic, you just need to have all possible token combinations in the table. This is obviously intractable due to space issues, so we came up with a lossy compression scheme for it. The table itself is learned once, then it's fixed. The training goes into generating a huge markov chain. The fact that the table is learned from data doesn't change what it actually is.
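
To put a rough number on "intractable due to space issues", the sketch below counts how many rows an explicit table would need if every full-length context were its own state. The vocabulary size of 50,000 and context length of 2,048 are assumed round figures for illustration, not the parameters of any specific model, and `table_rows_log10` is a hypothetical helper name.

```python
import math

def table_rows_log10(vocab_size, context_length):
    """log10 of vocab_size ** context_length, the count of distinct contexts."""
    return context_length * math.log10(vocab_size)

vocab_size = 50_000      # assumed, for illustration only
context_length = 2_048   # assumed, for illustration only

log_rows = table_rows_log10(vocab_size, context_length)
digits = math.floor(log_rows) + 1  # decimal digits in the row count

if __name__ == "__main__":
    print(f"rows is about 10^{log_rows:.0f}, a number with {digits} digits")
```

Under these assumptions the row count has more than 9,000 decimal digits, so any such table can only exist implicitly; both sides of the argument above accept that, and disagree on whether the implicit version should still be called a Markov chain.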

0
auraithx@lemmy.dbzer0.com

wrote on 9 June 2025, 09:44, last edited by auraithx@lemmy.dbzer0.com on 6 Sept. 2025, 11:45

#211

This argument collapses the entire distinction between parametric modeling and symbolic lookup. Yes, the weights are fixed after training, but the key point is that an LLM does not store or retrieve a state transition table. It learns to approximate the probability of the next token given a sequence through function approximation, not by memorizing discrete transitions. What appears to be a "table" is actually a deep, distributed representation compressed into continuous weight matrices. It is not indexing state transitions, it is computing probabilities from patterns in the input space.

A true Markov chain defines transition probabilities over explicit states. An LLM embeds tokens into high-dimensional vectors, then transforms them repeatedly using self-attention and feedforward layers that can capture subtle syntactic, semantic, and structural features. These features interact in nonlinear ways that go far beyond what any finite transition table could express. You cannot meaningfully represent an LLM’s behavior as a finite Markov model, even in principle, because its representations are not enumerable states but regions of a continuous latent space.

Saying “you just need all token combinations in a table” ignores the fact that the model generalizes to combinations never seen during training. That is the core of its power. It doesn’t look up learned transitions; it constructs responses by interpolating through an embedding space guided by attention and weight structure. No Markov chain does this. A lossy compressor of a transition table still implies a symbolic map; a neural network is a differentiable function trained to fit a distribution, not to encode it explicitly.

0
kreskin@lemmy.world

wrote on 9 June 2025, 10:02, last edited by kreskin@lemmy.world on 6 Sept. 2025, 12:06

#212

Lots of us who did some time in search and relevancy early on knew ML was always largely breathless, overhyped marketing. It was endless buzzwords and misframing from the start, but it raised our salaries. Anything that execs don't understand is profitable and worth doing.

1
