Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. They just memorize patterns really well.
-
This is why I say these articles are so similar to how right-wing media covers issues about immigrants.
There's some weird media push to convince the left to hate AI. Think of all the headlines for these issues. There are so many similarities. They're taking jobs. They are a threat to our way of life. The headlines talk about how they will sexually assault your wife, your children, you. Threats to the environment. There are articles like this where they take something known and twist it to make it sound nefarious, to keep the story alive and avoid decay of interest.
Then when they pass laws, we're all primed to accept them removing whatever it is that advantages them and disadvantages us.
Because it's a fear-mongering angle that still sells. AI has been a vehicle for sci-fi for so long that trying to convince Boomers that it won't kill us all is the hard part.
I'm a moderate user of LLMs for code and a skeptic of their abilities, but 5 years from now, when we are leveraging ML models for groundbreaking science and haven't been nuked by SkyNet, all of this will look quaint and silly.
-
lol is this news? I mean we call it AI, but it’s just LLMs and variants; it doesn’t think.
"It's part of the history of the field of artificial intelligence that every time somebody figured out how to make a computer do something—play good checkers, solve simple but relatively informal problems—there was a chorus of critics to say, 'that's not thinking'." -Pamela McCorduck´.
It's called the AI Effect.As Larry Tesler puts it, "AI is whatever hasn't been done yet.".
-
You know, despite not really believing LLM "intelligence" works anywhere like real intelligence, I kind of thought maybe being good at recognizing patterns was a way to emulate it to a point...
But that study seems to prove they're still not even good at that. At first I was wondering how hard the puzzles must have been, and then there's a bit about LLMs finishing 100-move Tower of Hanoi puzzles (on which they were trained) and failing 4-move river crossings. Logically, those problems are very similar... Also, failing to apply a step-by-step solution they were given.
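To put numbers on that contrast: the classic recursive Tower of Hanoi solution takes 2^n - 1 moves, so a 7-disk instance is already over 100 moves, while the river-crossing puzzles only need a handful of steps. A minimal sketch of the standard solver (illustrative Python, not code from the paper):

```python
def hanoi(n, src="A", dst="C", aux="B", moves=None):
    """Classic recursive Tower of Hanoi solver; returns the list of moves."""
    if moves is None:
        moves = []
    if n == 0:
        return moves
    hanoi(n - 1, src, aux, dst, moves)  # shift n-1 smaller disks out of the way
    moves.append((src, dst))            # move the largest remaining disk
    hanoi(n - 1, aux, dst, src, moves)  # stack the smaller disks back on top
    return moves

for disks in (3, 7, 10):
    print(disks, "disks:", len(hanoi(disks)), "moves")  # 2**n - 1 -> 7, 127, 1023
```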
-
So they have worked out that LLMs do what they were programmed to do in the way that they were programmed? Shocking.
-
The difference between reasoning models and normal models is that reasoning models work in two steps. To oversimplify it a little, they first prompt "how would you go about responding to this", then prompt "write the response".
It's still predicting the most likely thing to come next, but the difference is that it gives the chance for the model to write the most likely instructions to follow for the task, then the most likely result of following the instructions - both of which are much more conformant to patterns than a single jump from prompt to response.
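A minimal sketch of that two-step flow, assuming nothing about any particular vendor's API (`llm` here stands in for any hypothetical text-in/text-out completion call):

```python
def two_step_answer(task: str, llm) -> str:
    """'Reasoning' as two chained completions: draft a plan, then follow it.
    Both calls are still just next-token prediction over the prompt text."""
    plan = llm(f"How would you go about responding to this?\n\n{task}")
    return llm(
        f"Task: {task}\n\n"
        f"Follow these instructions step by step:\n{plan}\n\n"
        "Write the response."
    )
```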
But it still manages to fuck it up.
I've been experimenting with using Claude's Sonnet model in Copilot in agent mode for my job, and one of the things that's become abundantly clear is that it has certain types of behavior that are heavily represented in the model, so it assumes you want that behavior even if you explicitly tell it you don't.
Say you're working in a yarn workspaces project, and you instruct Copilot to build and test a new dashboard using an instruction file. You'll need to include explicit and repeated reminders all throughout the file to use yarn, not NPM, because even though yarn is very popular today, there are so many older examples of using NPM in its model that it's just going to assume that's what you actually want - thereby fucking up your codebase.
I've also had lots of cases where I tell it I don't want it to edit any code, just to analyze and explain something that's there and how to update it... and then I have to stop it from editing code anyway, because halfway through it forgot that I didn't want edits, just explanations.
-
This sort of thing has been published a lot for a while now, but why is it assumed that this isn't what human reasoning consists of? Isn't all our reasoning ultimately a form of pattern memorization? I sure feel like it is. So to me, all these studies that prove they're "just" memorizing patterns don't prove anything other than that, unless coupled with research on the human brain to prove we do something different.
You've hit the nail on the head.
Personally, I wish there were more progress in our understanding of human intelligence.
-
This sort of thing has been published a lot for a while now, but why is it assumed that this isn't what human reasoning consists of? Isn't all our reasoning ultimately a form of pattern memorization? I sure feel like it is. So to me, all these studies that prove they're "just" memorizing patterns don't prove anything other than that, unless coupled with research on the human brain to prove we do something different.
Agreed. We don't seem to have a very cohesive idea of what human consciousness is or how it works.
-
No, and to make that work using the current structures we use for creating AI models we’d probably need all the collective computing power on earth at once.
...... So you're saying there's a chance?
-
Employers who are foaming at the mouth at the thought of replacing their workers with cheap AI:
🫢
-
You know, despite not really believing LLM "intelligence" works anywhere like real intelligence, I kind of thought maybe being good at recognizing patterns was a way to emulate it to a point...
But that study seems to prove they're still not even good at that. At first I was wondering how hard the puzzles must have been, and then there's a bit about LLMs finishing 100-move Tower of Hanoi puzzles (on which they were trained) and failing 4-move river crossings. Logically, those problems are very similar... Also, failing to apply a step-by-step solution they were given.
This paper doesn’t prove that LLMs aren’t good at pattern recognition; it demonstrates the limits of what pattern recognition alone can achieve, especially for compositional, symbolic reasoning.
-
Just fancy Markov chains with the ability to link bigger and bigger token sets. It can only ever kick off processing as a response and can never initiate any line of reasoning. This, along with the fact that its working set of data can never be updated moment-to-moment, means that it would be a physical impossibility for any LLM to achieve any real "reasoning" processes.
Unlike Markov models, modern LLMs use transformers that attend to full contexts, enabling them to simulate structured, multi-step reasoning (albeit imperfectly). While they don’t initiate reasoning like humans, they can generate and refine internal chains of thought when prompted, and emerging frameworks (like ReAct or Toolformer) allow them to update working memory via external tools. Reasoning is limited, but not physically impossible; it’s evolving beyond simple pattern-matching toward more dynamic and compositional processing.
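To make the contrast concrete, an actual Markov chain language model conditions on nothing but the previous token, roughly like this toy bigram sketch (illustrative Python with a made-up corpus), whereas a transformer attends over the entire context window at every step:

```python
import random
from collections import defaultdict

# Toy corpus; a real model would be built from vastly more text.
corpus = "the model predicts the next token and the next token follows the last token".split()

# Bigram transition table: token -> tokens that have followed it.
table = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    table[prev].append(nxt)

def markov_generate(start: str, length: int = 8) -> str:
    """Sample each word using only the immediately preceding word."""
    out = [start]
    for _ in range(length):
        followers = table.get(out[-1])
        if not followers:
            break
        out.append(random.choice(followers))
    return " ".join(out)

print(markov_generate("the"))
```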
-
does ANY model reason at all?
Define reason.
Like humans? Of course not. They lack intent, awareness, and grounded meaning. They don’t “understand” problems, they generate token sequences.
-
There are search engines that do this better. There’s a world out there beyond Google.
Like what?
I don’t think there’s any search engine better than Perplexity. And for scientific research Consensus is miles ahead.
-
Unlike Markov models, modern LLMs use transformers that attend to full contexts, enabling them to simulate structured, multi-step reasoning (albeit imperfectly). While they don’t initiate reasoning like humans, they can generate and refine internal chains of thought when prompted, and emerging frameworks (like ReAct or Toolformer) allow them to update working memory via external tools. Reasoning is limited, but not physically impossible; it’s evolving beyond simple pattern-matching toward more dynamic and compositional processing.
Reasoning is limited
Most people wouldn't call zero of something 'limited'.
-
Reasoning is limited
Most people wouldn't call zero of something 'limited'.
The paper doesn’t say LLMs can’t reason; it shows that their reasoning abilities are limited and collapse under increasing complexity or novel structure.
-
But it still manages to fuck it up.
I've been experimenting with using Claude's Sonnet model in Copilot in agent mode for my job, and one of the things that's become abundantly clear is that it has certain types of behavior that are heavily represented in the model, so it assumes you want that behavior even if you explicitly tell it you don't.
Say you're working in a yarn workspaces project, and you instruct Copilot to build and test a new dashboard using an instruction file. You'll need to include explicit and repeated reminders all throughout the file to use yarn, not NPM, because even though yarn is very popular today, there are so many older examples of using NPM in its model that it's just going to assume that's what you actually want - thereby fucking up your codebase.
I've also had lots of cases where I tell it I don't want it to edit any code, just to analyze and explain something that's there and how to update it... and then I have to stop it from editing code anyway, because halfway through it forgot that I didn't want edits, just explanations.
I’ve also had lots of cases where I tell it I don’t want it to edit any code, just to analyze and explain something that’s there and how to update it… and then I have to stop it from editing code anyway, because halfway through it forgot that I didn’t want edits, just explanations.
I find it hilarious that the only people these LLMs mimic are the incompetent ones. I had a coworker who constantly changed things when asked just to explain them.
-
The paper doesn’t say LLMs can’t reason; it shows that their reasoning abilities are limited and collapse under increasing complexity or novel structure.
I agree with the author.
If these models were truly "reasoning," they should get better with more compute and clearer instructions.
The fact that they only work up to a certain point despite increased resources is proof that they are just pattern matching, not reasoning.
-
The "Apple" part. CEOs only care what companies say.
Apple is significantly behind and arrived late to the whole AI hype, so of course it's in their absolute best interest to keep showing how LLMs aren't special or amazingly revolutionary.
They're not wrong, but the motivation is also pretty clear.
-
"It's part of the history of the field of artificial intelligence that every time somebody figured out how to make a computer do something—play good checkers, solve simple but relatively informal problems—there was a chorus of critics to say, 'that's not thinking'." -Pamela McCorduck´.
It's called the AI Effect.As Larry Tesler puts it, "AI is whatever hasn't been done yet.".
That entire paragraph is much better at supporting the precise opposite argument. Computers can beat Kasparov at chess, but they're clearly not thinking when making a move - even if we use the most open biological definitions for thinking.
-
I agree with the author.
If these models were truly "reasoning," they should get better with more compute and clearer instructions.
The fact that they only work up to a certain point despite increased resources is proof that they are just pattern matching, not reasoning.
Performance eventually collapses due to architectural constraints; this mirrors cognitive overload in humans: reasoning isn’t just about adding compute, it requires mechanisms like abstraction, recursion, and memory. The models’ collapse doesn’t prove “only pattern matching”; it highlights that today’s models simulate reasoning in narrow bands but lack the structure to scale it reliably. That is a limitation of implementation, not a disproof of emergent reasoning.
-