linux-nerds.org

Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. They just memorize patterns really well.

Technology

356 posts, 149 commenters, 3.2k views

redacted@infosec.pub

What confuses me is that we seemingly keep pushing back the bar for what counts as reasoning. Not too long ago, some smart algorithms or a bunch of if/then instructions in software were, by definition, software/computer reasoning. Logically, CPUs do it all the time. Suddenly, when AI is doing that with pattern recognition, memory and even more advanced algorithms, it's no longer reasoning? I feel like at this point a more relevant question is "What exactly is reasoning?". Before you answer, understand that most humans seemingly live by pattern recognition, not reasoning.

Reasoning system - Wikipedia

(en.wikipedia.org)
stickly@lemmy.world

#246

If you want to boil down human reasoning to pattern recognition, the sheer amount of stimuli and associations built off of that input absolutely dwarfs anything an LLM will ever be able to handle. It's like comparing PhD reasoning to a dog's reasoning.

While a dog can learn some interesting tricks and the smartest dogs can solve simple novel problems, there are hard limits. They simply lack strong metacognition and the ability to make simple logical inferences (e.g. why they fail at the shell game).

Now we make that chasm even larger by cutting the stimuli to a fixed token limit. An LLM can do some clever tricks within that limit, but it's designed to do exactly those tricks and nothing more. To get anything resembling human ability you would have to design something to match human complexity, and we don't have the tech to make a synthetic human.
knock_knock_lemmy_in@lemmy.world

Not "This particular model". Frontier LRMs such as OpenAI’s o1/o3, DeepSeek-R1, Claude 3.7 Sonnet Thinking, and Gemini Thinking.

The paper shows that Large Reasoning Models as defined today cannot interpret instructions. Their architecture does not allow it.
communist@lemmy.frozeninferno.xyz

last edited by communist@lemmy.frozeninferno.xyz

#247

Those particular models. It does not prove the architecture doesn't allow it at all. It's still possible that this is solvable with a different training technique, and none of those are using the right one. That's what they need to prove wrong.

This proves the issue is widespread, not fundamental.
zexks@lemmy.world

No. They don't. We just call them proteins.
stickly@lemmy.world

#248

You are either vastly overestimating the Language part of an LLM or simplifying human physiology back to the Greeks' theory of the four humours.
zexks@lemmy.world

No. They don't. We just call them proteins.
elbarto777@lemmy.world

#249

"They".

What are you?
surph_ninja@lemmy.world

That’s absolutely what it is. It’s a pattern on here. Any acknowledgment of humans being animals or less than superior gets hit with pushback.
elbarto777@lemmy.world

#250

I didn't say we aren't animals or that we don't follow physics rules.

But what you're saying is the equivalent of "everything that goes up will eventually go down - that's how physics works and you don't see that, you're in denial!!!11!!!1"
clent@lemmy.dbzer0.com

Proving it matters. Science is constantly proving things that people believe are obvious, because people have an uncanny ability to believe things that are false. Some people will believe things long after science has proven them false.
eatspancakes84@lemmy.world

#251

I mean… “proving” is also just marketing speak. There is no clear definition of reasoning, so there’s also no way to prove or disprove that something/someone reasons.
cactopuses@lemm.ee

While that's a fair idea, there are still two issues with it: hallucinations and the cost of running the models.

Unfortunately, it takes significant compute resources to produce even simple responses, and these responses can be totally made up, but still made to look completely real. It's gotten much better, sure, but blindly trusting these things (which many people do) can have serious consequences.
mangocats@feddit.it

last edited by mangocats@feddit.it

#252

Hallucinations and the cost of running the models.

So, inaccurate information in books is nothing new. Agreed that the rate of hallucinations needs to decline, a lot, but there has always been a need for a veracity filter - just because it comes from "a book" or "the TV" has never been an indication of absolute truth, even though many people stop there and assume it is. In other words: blind trust is not a new problem.

The cost of running the models is an interesting one - how does it compare with publication on paper to ship globally to store in environmentally controlled libraries which require individuals to physically travel to/from the libraries to access the information? What's the price of the resulting increased ignorance of the general population due to the high cost of information access?

What good is a bunch of knowledge stuck behind a search engine when people don't know how to access it, or access it efficiently?

Granted, search engines already take us 95% (IMO) of the way from paper libraries to what AI is almost succeeding in being today, but ease of access of information has tremendous value - and developing ways to easily access the information available on the internet is a very valuable endeavor.

Personally, I feel more emphasis should be put on establishing the veracity of the information before we go making all the garbage easier to find.

I also worry that "easy access" to automated interpretation services is going to lead to a bunch of information encoded in languages that most people don't know because they're dependent on machines to do the translation for them. As an example: shiny new computer language comes out but software developer is too lazy to learn it, developer uses AI to write code in the new language instead...
knock_knock_lemmy_in@lemmy.world

Sure. We weren't discussing if AI creates value or not. If you ask a different question then you get a different answer.
mangocats@feddit.it

#253

Well - if you want to devolve into argument, you can argue all day long about "what is reasoning?"
billwashere@lemmy.world

When are people going to realize that, in its current state, an LLM is not intelligent? It doesn’t reason. It does not have intuition. It’s a word predictor.
saturdaymorning@lemmy.ca

#254

I agree with you. In its current state, an LLM is not sentient, and thus not "intelligence".
vrighter@discuss.tchncs.de

"Lacks internal computation" is not part of the definition of Markov chains. The only requirement is that the output depends on the current state (the whole context, not just the last token) and on no previous history, which is exactly how LLMs behave. They do not consider tokens that slid out of the current context, because those are no longer part of the state.

And it wouldn't be a cache unless you decide to start invalidating entries, which you could simply not do. It would be a table with token_alphabet_size^context_length entries, each entry being a vector of size token_alphabet_size. Because that would be too big to realistically store, we do not precompute the whole thing, and just approximate what each table entry should be using a neural network.
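To put rough numbers on that table size, here is a minimal back-of-the-envelope sketch. The vocabulary size and context length below are illustrative assumptions, not figures from any particular model, and the function name is made up for this example.

```python
import math

def table_rows_log10(token_alphabet_size: int, context_length: int) -> float:
    """Return log10 of token_alphabet_size ** context_length.

    Working in log space avoids building an astronomically large integer.
    """
    return context_length * math.log10(token_alphabet_size)

# Hypothetical numbers, chosen only for illustration.
TOKEN_ALPHABET_SIZE = 50_000
CONTEXT_LENGTH = 2_048

exponent = table_rows_log10(TOKEN_ALPHABET_SIZE, CONTEXT_LENGTH)
print(f"rows: about 10^{exponent:.0f}")
print(f"each row: a vector of {TOKEN_ALPHABET_SIZE} probabilities")
```

Under these assumed numbers the row count has thousands of digits, which is the sense in which the table is "too big to realistically store".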

The pi example was just to show that how you implement a function (any function) does not matter, as long as the inputs and outputs are the same. Or to put it another way: if you give me an index, you wouldn't know whether I got the result by doing some computation or by using a precomputed table.

Likewise, if you give me a sequence of tokens and I give you a probability distribution, you can't tell whether I used a NN or just consulted a precomputed table. The point is that given the same input, the table will always give the same result, and crucially, so will an LLM. A table is just one type of implementation for an arbitrary function.

There is also no requirement for the state transition function (a table is a special type of function) to be understandable by humans. Just because it's big enough to be beyond human comprehension doesn't change its nature.
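The "you can't tell from outside" argument can be sketched with a toy example. Everything here is an assumption made for illustration: a three-token alphabet, a context of two tokens, and an arbitrary scoring rule standing in for the neural network. It is not how any real model computes its output.

```python
from itertools import product
from typing import Dict, Tuple

Context = Tuple[str, ...]
Distribution = Tuple[float, ...]

ALPHABET: Tuple[str, ...] = ("a", "b", "c")  # toy token alphabet (assumption)
CONTEXT_LENGTH = 2                           # toy context window (assumption)

def computed_distribution(context: Context) -> Distribution:
    """Stand-in for a network: compute a next-token distribution on demand."""
    scores = []
    for candidate in ALPHABET:
        score = 1  # start at 1 so the total is never zero
        for position, token in enumerate(context):
            # Arbitrary deterministic rule, purely illustrative.
            score += (ord(token) * (position + 1) + ord(candidate)) % 7
        scores.append(score)
    total = sum(scores)
    return tuple(s / total for s in scores)

# Precompute one row per possible context: len(ALPHABET) ** CONTEXT_LENGTH rows.
TABLE: Dict[Context, Distribution] = {
    ctx: computed_distribution(ctx)
    for ctx in product(ALPHABET, repeat=CONTEXT_LENGTH)
}

def table_distribution(context: Context) -> Distribution:
    """Same function, implemented as a pure lookup."""
    return TABLE[context]

# From the outside, the two implementations agree on every possible input.
indistinguishable = all(
    computed_distribution(ctx) == table_distribution(ctx) for ctx in TABLE
)
print(len(TABLE), indistinguishable)  # prints: 9 True
```

The toy only works because the table covers every possible context, which is the premise of the argument above; at realistic sizes that table cannot be built.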
auraithx@lemmy.dbzer0.com

#255

You're correct that the formal definition of a Markov process does not exclude internal computation, and that it only requires the next state to depend solely on the current state. But what defines a classical Markov chain in practice is not just the formal dependency structure but how the transition function is structured and used. A traditional Markov chain has a discrete and enumerable state space with explicit, often simple transition probabilities between those states. LLMs do not operate this way.

The claim that an LLM is "just" a large compressed Markov chain assumes that its function is equivalent to a giant mapping of input sequences to output distributions. But this interpretation fails to account for the fundamental difference in how those distributions are generated. An LLM is not indexing a symbolic structure. It is computing results using recursive transformations across learned embeddings, where those embeddings reflect complex relationships between tokens, concepts, and tasks. That is not reducible to discrete symbolic transitions without losing the model’s generalization capabilities. You could record outputs for every sequence, but the moment you present a sequence that wasn't explicitly in that set, the Markov table breaks. The LLM does not.

Yes, you can say a table is just one implementation of a function, and from a purely mathematical perspective, any function can be implemented as a table given enough space. But the LLM’s function is general-purpose. It extrapolates. A precomputed table cannot do this unless those extrapolations are already baked in, in which case you are no longer talking about a classical Markov system. You are describing a model that encodes relationships far beyond discrete transitions.

The pi analogy applies to deterministic functions with fixed outputs, not to learned probabilistic functions that approximate conditional distributions over language. If you give an LLM a new input, it will return a meaningful distribution even if it has never seen anything like it. That behavior depends on internal structure, not retrieval. Just because a function is deterministic at temperature 0 does not mean it is a transition table. The fact that the same input yields the same output is true for any deterministic function. That does not collapse the distinction between generalization and enumeration.

So while yes, you can implement any deterministic function as a lookup table, the nature of LLMs lies in how they model relationships and extrapolate from partial information. That ability is not captured by any classical Markov model, no matter how large.
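The distinction between enumeration and a function defined on every input can be sketched with a toy example. The corpus, the function names, and the hand-written scoring rule are all assumptions for illustration; the rule is not learned and says nothing about how well a real LLM generalizes. It only shows that a table recorded from observed sequences has nothing to return for an unseen context, while a parametric function still returns a distribution.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Optional, Tuple

Context = Tuple[str, ...]

def build_recorded_table(tokens: List[str]) -> Dict[Context, Counter]:
    """Record next-token counts only for contexts that occur in the corpus."""
    recorded = defaultdict(Counter)
    for previous, following in zip(tokens, tokens[1:]):
        recorded[(previous,)][following] += 1
    return dict(recorded)

def lookup(recorded: Dict[Context, Counter], context: Context) -> Optional[Dict[str, float]]:
    """Return the recorded distribution, or None if the context was never seen."""
    counts = recorded.get(context)
    if counts is None:
        return None
    total = sum(counts.values())
    return {token: n / total for token, n in counts.items()}

def parametric(context: Context, vocab: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a learned function, defined for any context.

    Scores a candidate higher if it shares a first letter with the last
    context token, then normalizes with a softmax. Purely illustrative.
    """
    last = context[-1]
    scores = {t: math.exp(1.0 if t[0] == last[0] else 0.0) for t in vocab}
    z = sum(scores.values())
    return {t: s / z for t, s in scores.items()}

corpus = "the cat sat on the mat".split()
vocab = sorted(set(corpus))
table = build_recorded_table(corpus)

print(lookup(table, ("the",)))      # seen context: a recorded distribution
print(lookup(table, ("dog",)))      # unseen context: None
print(parametric(("dog",), vocab))  # still a valid distribution
```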
billwashere@lemmy.world

When are people going to realize that, in its current state, an LLM is not intelligent? It doesn’t reason. It does not have intuition. It’s a word predictor.
jj4211@lemmy.world

#256

And that's pretty damn useful, but it's obnoxious to have expectations set so wildly wrong.
communist@lemmy.frozeninferno.xyz

Those particular models. It does not prove the architecture doesn't allow it at all. It's still possible that this is solvable with a different training technique, and none of those are using the right one. That's what they need to prove wrong.

This proves the issue is widespread, not fundamental.
0ops@lemm.ee

#257

Is "model" not defined as architecture+weights? Those models certainly don't share the same architecture. I might just be confused about your point though
billwashere@lemmy.world

When are people going to realize that, in its current state, an LLM is not intelligent? It doesn’t reason. It does not have intuition. It’s a word predictor.
notasharkinamansuit@lemmy.world

#258

People think they want AI, but they don’t even know what AI is on a conceptual level.
surph_ninja@lemmy.world

Funny how triggering it is for some people when anyone acknowledges humans are just evolved primates doing the same pattern matching.
notasharkinamansuit@lemmy.world

#259

We actually have sentience, though, and are capable of creating new things and having realizations. AI isn’t real, and LLMs and diffusion models are simply reiterating algorithmic patterns; no LLM or diffusion model can create anything original or expressive.

Also, we aren’t “evolved primates.” We are just primates. The thing is, primates are the most socially and cognitively evolved species on the planet, so that’s not a denigrating sentiment unless you're a pompous condescending little shit.
surph_ninja@lemmy.world

It’s built by animals, and it reflects them. That’s impressive on its own. Doesn’t need to be exaggerated.
notasharkinamansuit@lemmy.world

#260

Impressive ≠ substantial or beneficial.
appletea@lemmy.zip

What they mean is that before Turing, "computer" was literally a person's job description. You hand a professional a stack of calculations with some typos, and part of the job is correcting those. A newfangled machine comes along with the same name as the job, and among the first things people are gonna ask about is where it falls short.

Like, if I made a machine called "assistant", it'd be natural for people to point out and ask about all the things a person can do that a machine just never could.
turmacar@lemmy.world

last edited by turmacar@lemmy.world

#261

And what I mean is that prior to the mid 1900s the etymology didn't exist to cause that confusion of terms. Neither Babbage's machines nor prior adding engines were called computers or calculators. They were 'machines' or 'engines'.

Babbage's machines were novel in that they could do multiple types of operations, but 'mechanical calculators' and counting machines were ~200 years old. Other mathematical tools like the abacus are obviously far older. They were not novel enough to cause confusion in anyone with even passing interest.

But there will always be people who just assume 'magic', and/or "it works like I want it to".
allah@lemm.ee

LOOK MAA I AM ON FRONT PAGE
technocrit@lemmy.dbzer0.com

last edited by technocrit@lemmy.dbzer0.com

#262

Peak pseudo-science. The burden of evidence is on the grifters who claim "reason". But neither side has any objective definition of what "reason" means. It's pseudo-science against pseudo-science in a fierce battle.
zacryon@feddit.org

Some AI researchers found it obvious as well, in that they suspected it and had some indications. But it's good to see more data on this to affirm this assessment.
jj4211@lemmy.world

#263

Particularly to counter some more baseless marketing assertions about the nature of the technology.
softestsapphic@lemmy.world

Wow it's almost like the computer scientists were saying this from the start but were shouted over by marketing teams.
technocrit@lemmy.dbzer0.com

#264

It's hard to be heard when you're buried under all that sweet VC/grant money.
technocrit@lemmy.dbzer0.com

Peak pseudo-science. The burden of evidence is on the grifters who claim "reason". But neither side has any objective definition of what "reason" means. It's pseudo-science against pseudo-science in a fierce battle.
x0x7@lemmy.world

last edited by x0x7@lemmy.world

#265

Even defining reason is hard and becomes a matter of philosophy more than science. For example, apply the same claims to people. Now I've given you something to think about. Or should I say the Markov chain in your head has a new topic to generate thought states for.