linux-nerds.org


Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. They just memorize patterns really well.

Technology

356 posts 149 commenters 4.1k views

A antonim@lemmy.dbzer0.com

Right now the hype from most is finding issues with chatgpt

hype noun (1)

publicity

especially : promotional publicity of an extravagant or contrived kind

You're abusing the meaning of "hype" in order to make the two sides appear the same, because you do understand that "hype" really describes the pro-AI discourse much better.

It did find the fallacies based on what it was asked to do.

It didn't. Put the text of your comment back into GPT and tell it to argue why the fallacies are misidentified.

You act like this is fire and forget.

But you did fire and forget it. I don't even think you read the output yourself.

First I wanted to be honest with the output and not modify it.

Or maybe you were just lazy?

Personally I'm starting to find these copy-pasted AI responses to be insulting. It has the "let me Google that for you" sort of smugness around it. I can put in the text in ChatGPT myself and get the same shitty output, you know. If you can't be bothered to improve it, then there's absolutely no point in pasting it.

Given what this output gave me, I can easily keep working this to get better and better arguments.

That doesn't sound terribly efficient. Polishing a turd, as they say. These great successes of AI are never actually visible or demonstrated, they're always put off - the tech isn't quite there yet, but it's just around the corner, just you wait, just one more round of asking the AI to elaborate, just one more round of polishing the turd, just a bit more faith on the unbelievers' part...

I just feel like you can’t honestly tell me that within 10 seconds having that summary is not beneficial.

Oh sure I can tell you that, assuming that your argumentative goals are remotely honest and you're not just posting stupid AI-generated criticism to waste my time. You didn't even notice one banal way in which AI misinterpreted my comment (I didn't say SMBC is bad), and you'd probably just accept that misreading in your own supposed rewrite of the text. Misleading summaries that you have to spend additional time and effort double checking for these subtle or not so subtle failures are NOT beneficial.
M This user is from outside of this forum
melvin_ferd@lemmy.world

wrote on, last edited by

#183

OK, let's run a test here. Let's start with understanding logic. Give me a paragraph and let's see if it can find any logical fallacies. You can provide the paragraph. The only constraint is that the context has to exist within the paragraph.
1 reply Last reply

0
T turmacar@lemmy.world

I think because it's language.

There's a famous quote from Charles Babbage: when he presented his difference engine (a gear-based calculator), someone asked "if you put in the wrong figures, will the correct ones be output", and Babbage could not understand how anyone could so thoroughly misunderstand that the machine is just a machine.

People are people; the main things that have changed since the cuneiform copper customer complaint are our materials science and networking ability. Most things that people interact with every day, they just assume work the way they appear to on the surface.

And nothing other than a person can do math problems or talk back to you. So people assume that means intelligence.
L This user is from outside of this forum
leftzero@lemmynsfw.com

wrote on, last edited by

#184

"if you put in the wrong figures, will the correct ones be output"

To be fair, an 1840 “computer” might be able to tell there was something wrong with the figures and ask about it or even correct them herself.

Babbage was being a bit obtuse there; people weren't familiar with computing machines yet. Computer was a job, and computers were expected to be fairly intelligent.

In fact I'd say that if anything this question shows that the questioner understood enough about the new machine to realise it was not the same as they understood a computer to be, and lacked many of their abilities, and was just looking for Babbage to confirm their suspicions.
T 1 reply Last reply

1
A auraithx@lemmy.dbzer0.com

While both Markov models and LLMs forget information outside their window, that’s where the similarity ends. A Markov model relies on fixed transition probabilities and treats the past as a chain of discrete states. An LLM evaluates every token in relation to every other using learned, high-dimensional attention patterns that shift dynamically based on meaning, position, and structure.

Changing one word in the input can shift the model’s output dramatically by altering how attention layers interpret relationships across the entire sequence. It’s a fundamentally richer computation that captures syntax, semantics, and even task intent, which a Markov chain cannot model regardless of how much context it sees.
V This user is from outside of this forum
vrighter@discuss.tchncs.de

wrote on, last edited by

#185

An LLM also works on fixed transition probabilities. All the training is done during the generation of the weights, which are the compressed state transition table. After that, it's just a regular old Markov chain. I don't know why you seem so fixated on getting different output if you provide different input (as I said, each token generated is a separate, independent invocation of the LLM with a different input). That is true of most computer programs.

It's just an implementation detail. The Markov chains we are used to have a very short context, due to combinatorial explosion when generating the state transition table. With LLMs, we can use a much, much longer context. Put that context in, it runs through the completely immutable model, and out comes a probability distribution. Any calculations done while computing this probability distribution are then discarded, the chosen token is added to the context, and the program is run again with zero prior knowledge of any reasoning about the token it just generated. It's a separate execution with absolutely nothing shared between them, so there can't be any "adapting" going on.
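A minimal sketch of the loop described above, under loud assumptions: `toy_model`, its three-token vocabulary, and its probability rule are invented stand-ins, not any real LLM. The only point it illustrates is the control flow: a fixed function maps the whole context to a distribution, one token is sampled, the token is appended, and the function is called again with nothing carried over except the context.

```python
import random

# Hypothetical toy vocabulary; a real model has tens of thousands of tokens.
VOCAB = ["a", "b", "<eos>"]


def toy_model(context):
    """Stand-in for a frozen model: full context in, next-token probabilities out.

    The rule is invented for illustration: the more "a" tokens the context
    holds, the likelier "<eos>" becomes. There is no hidden state.
    """
    n_a = context.count("a")
    p_eos = min(0.9, 0.1 + 0.2 * n_a)
    rest = (1.0 - p_eos) / 2
    return {"a": rest, "b": rest, "<eos>": p_eos}


def generate(prompt, max_tokens=20, seed=0):
    """Autoregressive sampling: each step is a fresh call on the grown context."""
    rng = random.Random(seed)
    context = list(prompt)
    for _ in range(max_tokens):
        probs = toy_model(context)  # nothing is reused from the previous call
        token = rng.choices(VOCAB, weights=[probs[t] for t in VOCAB])[0]
        if token == "<eos>":
            break
        context.append(token)  # the sampled token becomes part of the next input
    return context
```

Calling `generate(["a"])` returns the prompt plus whatever tokens were sampled before `<eos>`. Real inference stacks usually cache intermediate results between steps for speed; that optimization is left out of this sketch.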
A 1 reply Last reply

1
G gamechld@lemmy.world

Most humans don't reason. They just parrot shit too. The design is very human.
J This user is from outside of this forum
joel_feila@lemmy.world

wrote on, last edited by

#186

That's why CEOs love them. When your job is 90% spewing BS, a machine that does that is impressive.
1 reply Last reply

5
N nalivai@lemmy.world

You're either an LLM, or you don't know how your brain works.
A This user is from outside of this forum
and009@lemmynsfw.com

wrote on, last edited by

#187

LLMs don't know how they work.
1 reply Last reply

1
T thefriar@lemm.ee

Yeah, well there are a ton of people literally falling into psychosis, led by LLMs. So it’s unfortunately not that many people that already knew it.
J This user is from outside of this forum
joel_feila@lemmy.world

wrote on, last edited by

#188

Dude, they made ChatGPT a little more boot-licky and now many people are convinced they are literal messiahs. All it took for them was a chatbot and a few hours of talk.
1 reply Last reply

2
H homura1650@lemm.ee

LLMs (at least in their current form) are proper neural networks.
K This user is from outside of this forum
kescusay@lemmy.world

wrote on, last edited by

#189

Well, technically, yes. You're right. But they're a specific, narrow type of neural network, while I was thinking of the broader class and more traditional applications, like data analysis. I should have been more specific.
1 reply Last reply

0
R rampantparanoia2365@lemmy.world

Fucking obviously. Until Data's positronic brain becomes reality, AI is not actual intelligence.

AI is not A I. I should make that a tshirt.
J This user is from outside of this forum
jdpoz@lemmy.world

wrote on, last edited by

#190

It’s an expensive carbon spewing parrot.
T 1 reply Last reply

12
L leftzero@lemmynsfw.com

"if you put in the wrong figures, will the correct ones be output"

To be fair, an 1840 “computer” might be able to tell there was something wrong with the figures and ask about it or even correct them herself.

Babbage was being a bit obtuse there; people weren't familiar with computing machines yet. Computer was a job, and computers were expected to be fairly intelligent.

In fact I'd say that if anything this question shows that the questioner understood enough about the new machine to realise it was not the same as they understood a computer to be, and lacked many of their abilities, and was just looking for Babbage to confirm their suspicions.
T This user is from outside of this forum
turmacar@lemmy.world

wrote on, last edited by

#191
"Computer" in the sense of a mechanical, electro-mechanical, or electrical machine wasn't used until after WWII or so.

Babbage's difference and analytical engines weren't confusing because people called them computers; they didn't.
"On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question."

Charles Babbage
If you give any computer, human or machine, random numbers, it will not give you "correct answers".

It's possible Babbage lacked the social skills to detect sarcasm. We also have several high profile cases of people just trusting LLMs to file legal briefs and official government 'studies' because the LLM "said it was real".
A 1 reply Last reply

0
A allah@lemm.ee

LOOK MAA I AM ON FRONT PAGE
C This user is from outside of this forum
communist@lemmy.frozeninferno.xyz

wrote on, last edited by communist@lemmy.frozeninferno.xyz

#192

I think it's important to note (I'm not an LLM; I know that phrase triggers you to assume I am) that they haven't proven this to be an inherent architectural issue, which I think would be the next step for the assertion.

Do we know that they don't reason and are incapable of it, or do we just know that for x problems they jump to memorized solutions? Is it possible to create an arrangement of weights that can genuinely reason, even if the current models don't? That's the big question that needs to be answered. It's still possible that we just haven't properly incentivized reasoning over memorization during training.

If someone can objectively answer "no" to that, the bubble collapses.
K M 2 replies Last reply

14
A allah@lemm.ee

LOOK MAA I AM ON FRONT PAGE
S This user is from outside of this forum
skisnow@lemmy.ca

wrote on, last edited by

#193

What's hilarious/sad is the response to this article over on reddit's "singularity" sub, in which all the top comments are people who've obviously never got all the way through a research paper in their lives all trashing Apple and claiming their researchers don't understand AI or "reasoning". It's a weird cult.
T 1 reply Last reply

26
A allah@lemm.ee

LOOK MAA I AM ON FRONT PAGE
F This user is from outside of this forum
freakinsteve@lemmy.world

wrote on, last edited by

#194

NOOOOOOOOO

SHIIIIIIIIIITT

SHEEERRRLOOOOOOCK
8 J T 3 replies Last reply

32
G gamechld@lemmy.world

Most humans don't reason. They just parrot shit too. The design is very human.
S This user is from outside of this forum
skisnow@lemmy.ca

wrote on, last edited by

#195

I hate this analogy. As a throwaway whimsical quip it'd be fine, but it's specious enough that I keep seeing it used earnestly by people who think that LLMs are in any way sentient or conscious, so it's lowered my tolerance for it as a topic even if you did intend it flippantly.
G 1 reply Last reply

9
F freakinsteve@lemmy.world

NOOOOOOOOO

SHIIIIIIIIIITT

SHEEERRRLOOOOOOCK
8 This user is from outside of this forum
800xl@lemmy.world

wrote on, last edited by

#196

Except for Siri, right? Lol
T 1 reply Last reply

1
8 800xl@lemmy.world

Except for Siri, right? Lol
T This user is from outside of this forum
threeme2189@lemmy.world

wrote on, last edited by

#197

Apple Intelligence
1 reply Last reply

1
J jdpoz@lemmy.world

It’s an expensive carbon spewing parrot.
T This user is from outside of this forum
threeme2189@lemmy.world

wrote on, last edited by

#198

It's a very resource intensive autocomplete
1 reply Last reply

9
I intensely_human@lemm.ee

Fair, but the same is true of me. I don't actually "reason"; I just have a set of algorithms memorized by which I propose a pattern that seems like it might match the situation, then a different pattern by which I break the situation down into smaller components and then apply patterns to those components. I keep the process up for a while. If I find a "nasty logic error" pattern match at some point in the process, I "know" I've found a "flaw in the argument" or "bug in the design".

But there's no from-first-principles method by which I developed all these patterns; it's just things that have survived the test of time when other patterns have failed me.

I don't think people are underestimating the power of LLMs to think; I just think people are overestimating the power of humans to do anything other than language prediction and sensory pattern prediction.
C This user is from outside of this forum
conicalscientist@lemmy.world

wrote on, last edited by

#199

This whole era of AI has certainly pushed us to the brink of existential-crisis territory. I think some are even frightened to entertain the prospect that we may not be all that much better than meat machines that, on a basic level, do pattern matching drawing on the sum total of individual life experience (aka the dataset).

Higher reasoning is taught to humans. We have the capability. That's why we spend the first quarter of our lives in education. Sometimes not all of us are able.

I'm sure it would make waves if researchers ran studies on whether dumber humans are any different from AI.
1 reply Last reply

1
A allah@lemm.ee

LOOK MAA I AM ON FRONT PAGE
M This user is from outside of this forum
minoscopede@lemmy.world

wrote on, last edited by minoscopede@lemmy.world

#200

I see a lot of misunderstandings in the comments 🫤

This is a pretty important finding for researchers, and it's not obvious by any means. This finding is not showing a problem with LLMs' abilities in general. The issue they discovered is specifically for so-called "reasoning models" that iterate on their answer before replying. It might indicate that the training process is not sufficient for true reasoning.

Most reasoning models are not incentivized to think correctly, and are only rewarded based on their final answer. This research might indicate that's a flaw that needs to be corrected before models can actually reason.
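To make the distinction concrete, here is a rough sketch contrasting a reward that looks only at the final answer with one that scores each intermediate step. Everything in it is hypothetical: the function names, the `"a+b=c"` step format, and the checker are made up for illustration and do not describe how any particular lab trains its models.

```python
def outcome_reward(final_answer, target):
    """Reward that looks only at the final answer."""
    return 1.0 if final_answer == target else 0.0


def process_reward(steps, step_is_valid):
    """Reward that scores the reasoning itself: the fraction of valid steps."""
    if not steps:
        return 0.0
    return sum(1 for step in steps if step_is_valid(step)) / len(steps)


def arithmetic_step_ok(step):
    """Hypothetical checker: a step written as "a+b=c" is valid if a + b == c."""
    lhs, rhs = step.split("=")
    a, b = lhs.split("+")
    return int(a) + int(b) == int(rhs)


# A trace for 1+2+3+4 whose two mistakes cancel out, so the final answer is right.
trace = ["1+2=3", "3+3=7", "7+4=10"]
final_answer = 10

lucky = outcome_reward(final_answer, target=10)     # full credit for the answer
sound = process_reward(trace, arithmetic_step_ok)   # only one step in three holds
```

Under the outcome-only reward this trace is indistinguishable from a correct derivation, which is the kind of gap the post is pointing at.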
Z T T R K 7 replies Last reply

68
V vrighter@discuss.tchncs.de

An LLM also works on fixed transition probabilities. All the training is done during the generation of the weights, which are the compressed state transition table. After that, it's just a regular old Markov chain. I don't know why you seem so fixated on getting different output if you provide different input (as I said, each token generated is a separate, independent invocation of the LLM with a different input). That is true of most computer programs.

It's just an implementation detail. The Markov chains we are used to have a very short context, due to combinatorial explosion when generating the state transition table. With LLMs, we can use a much, much longer context. Put that context in, it runs through the completely immutable model, and out comes a probability distribution. Any calculations done while computing this probability distribution are then discarded, the chosen token is added to the context, and the program is run again with zero prior knowledge of any reasoning about the token it just generated. It's a separate execution with absolutely nothing shared between them, so there can't be any "adapting" going on.
A This user is from outside of this forum
auraithx@lemmy.dbzer0.com

wrote on, last edited by auraithx@lemmy.dbzer0.com

#201

Because transformer architecture is not equivalent to a probabilistic lookup. A Markov chain assigns probabilities based on a fixed-order state transition, without regard to deeper structure or token relationships. An LLM processes the full context through many layers of non-linear functions and attention heads, each layer dynamically weighting how each token influences every other token.

Although weights do not change during inference, the behavior of the model is not fixed in the way a Markov chain's state table is. The same model can respond differently to very similar prompts, not just because the inputs differ, but because the model interprets structure, syntax, and intent in ways that are contextually dependent. That is not just longer context; it is fundamentally more expressive computation.

The process is stateless across calls, yes, but it is not blind. All relevant information lives inside the prompt, and the model uses the attention mechanism to extract meaning from relationships across the sequence. Each new input changes the internal representation, so the output reflects contextual reasoning, not a static response to a matching pattern. Markov chains cannot replicate this kind of behavior no matter how many states they include.
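A small sketch of the contrast being drawn, with heavy caveats: `markov_next` is a plain fixed-order chain built from counts, and `attention_last` is a bare single-head dot-product attention with no learned weights and hand-picked 2-d embeddings, so neither is a real language model. It only shows that an order-1 chain ignores everything before the last token, while the attention output for the last position shifts when an earlier token changes. It does not address a chain whose state is the entire context window, which is the other side of this argument.

```python
import math
from collections import Counter, defaultdict


def markov_next(corpus, context, order=1):
    """Order-k Markov chain: next-token counts keyed only on the last k tokens."""
    table = defaultdict(Counter)
    for i in range(len(corpus) - order):
        table[tuple(corpus[i:i + order])][corpus[i + order]] += 1
    return dict(table[tuple(context[-order:])])


def attention_last(embeddings):
    """Dot-product attention output for the last position (toy, no learned weights)."""
    query = embeddings[-1]
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in embeddings]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
    total = sum(exps)
    weights = [e / total for e in exps]
    return [
        sum(w * vec[d] for w, vec in zip(weights, embeddings))
        for d in range(len(query))
    ]


corpus = "the cat sat on the mat".split()
# Hypothetical hand-picked embeddings, purely for illustration.
emb = {"big": [1.0, 0.0], "small": [0.0, 1.0], "the": [1.0, 1.0]}

# The order-1 chain sees only "the", so the earlier word makes no difference.
same = markov_next(corpus, ["big", "the"]) == markov_next(corpus, ["small", "the"])

# Attention mixes in the earlier token, so the two outputs differ.
out_big = attention_last([emb["big"], emb["the"]])
out_small = attention_last([emb["small"], emb["the"]])
```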
V 1 reply Last reply

0
A allah@lemm.ee

LOOK MAA I AM ON FRONT PAGE
X This user is from outside of this forum
xatolos@reddthat.com

wrote on, last edited by

#202

So, what you're saying here is that the A in AI actually stands for artificial, and it's not really intelligent and reasoning.

Huh.
C 1 reply Last reply

8
