
I'm looking for an article showing that LLMs don't know how they work internally

  • You can prove it’s not by doing some matrix multiplication and seeing that it’s matrix multiplication. Much easier way to go about it

    Yes, neural networks can be implemented with matrix operations. What does that have to do with proving or disproving the ability to reason? You didn't post a relevant or complete thought

    Your comment is like saying an audio file isn't really music because it's just a series of numbers.

  • You're confusing the confirmation that the LLM cannot explain its under-the-hood reasoning as text output with a confirmation that it is not able to reason at all. Anthropic is not claiming that it cannot reason. They actually find that it performs complex logic and behaviors like planning ahead.

    No, they really don’t. It’s a large language model. Input cues instruct it as to which weighted path through the matrix to take. Those paths are complex enough that the human mind can’t hold all the branches and weights at the same time. But there’s no planning going on; the model can’t backtrack a few steps, consider different outcomes and run a meta analysis. Other reasoning models can do that, but not language models; language models are complex predictive translators.
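    For readers unfamiliar with the mechanics being argued over, here is a minimal toy sketch of the autoregressive prediction loop both sides are describing: score every token, emit the most likely one, append it, repeat. The weights are random and the names are made up; it is an illustration of the loop, not anything like a real trained model.

```python
# Toy sketch of next-token prediction: random weights, not a trained model.
import numpy as np

rng = np.random.default_rng(0)

vocab = ["the", "cat", "sat", "on", "mat", "."]      # tiny made-up vocabulary
emb = rng.normal(size=(len(vocab), 8))               # one vector per token
W_out = rng.normal(size=(8, len(vocab)))             # projection back to vocabulary scores

def next_token(context_ids):
    """Score every token in the vocabulary given the context, pick the best."""
    h = emb[context_ids].mean(axis=0)                # crude summary of the context
    logits = h @ W_out                               # the "matrix math" step
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                             # softmax into probabilities
    return int(np.argmax(probs))                     # greedy choice

ids = [0, 1]                                         # start from "the cat"
for _ in range(4):                                   # each prediction is fed back in
    ids.append(next_token(ids))
print(" ".join(vocab[i] for i in ids))
```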

  • No, they really don’t. It’s a large language model. Input cues instruct it as to which weighted path through the matrix to take. Those paths are complex enough that the human mind can’t hold all the branches and weights at the same time. But there’s no planning going on; the model can’t backtrack a few steps, consider different outcomes and run a meta analysis. Other reasoning models can do that, but not language models; language models are complex predictive translators.

    To write the second line, the model had to satisfy two constraints at the same time: the need to rhyme (with "grab it"), and the need to make sense (why did he grab the carrot?). Our guess was that Claude was writing word-by-word without much forethought until the end of the line, where it would make sure to pick a word that rhymes. We therefore expected to see a circuit with parallel paths, one for ensuring the final word made sense, and one for ensuring it rhymes.

    Instead, we found that Claude plans ahead. Before starting the second line, it began "thinking" of potential on-topic words that would rhyme with "grab it". Then, with these plans in mind, it writes a line to end with the planned word.

    🙃 actually read the research?

  • I've read that article. They used something they called an "MRI for AIs", and checked e.g. how an AI handled math questions, and then asked the AI how it came to that answer, and the pathways actually differed. While the AI talked about using a textbook approach, it actually took a different one. That's what I remember of that article.

    But yes, it exists, and it is science, not TikTok

    Thank you. I found the article, link is in the OP

  • "Researchers" did a thing I did the first day I was actually able to ChatGPT and came to a conclusion that is in the disclaimers on the ChatGPT website. Can I get paid to do this kind of "research?" If you've even read a cursory article about how LLMs work you'd know that asking them what their reasoning is for anything doesn't work because the answer would just always be an explanation of how LLMs work generally.

    Very arrogant answer. Good that you have intuition, but the article is serious, especially given how LLMs are used today. The link to it is in the OP now, but I guess you already know everything...

  • There was a study by Anthropic, the company behind Claude, that developed another AI that they used as a sort of "brain scanner" for the LLM, in the sense that it allowed them to see a sort of model of how the LLM's "internal process" worked

    Yes, that's it. I added the link in the OP.

  • I agree. This is the exact problem I think people need to face with neural network AIs. They work the exact same way we do. Even if we analysed the human brain, it would look like wires connected to wires with different resistances all over the place, with some other chemical influences.

    I think everyone forgets that neural networks were used in AI to replicate how animal brains work, and clearly, if it worked for us to get smart, then it should work for something synthetic. Well, we've certainly answered that now.

    Everyone saying "oh it's just a predictive model and it's all math and math can't be intelligent" is questioning exactly how their own brains work. We are just prediction machines: the brain releases dopamine when it correctly predicts things, and it self-learns from correctly assuming how things work. We modelled AI off of ourselves. And if we don't understand how we work, of course we're not gonna understand how it works.

    Even if LLM "neurons" and their interconnections are modeled after biological ones, LLMs aren't modeled on the human brain, where a lot is still not understood.

    The first thing is that how the neurons are organized is completely different. Think about the cortex and the transformer.

    Second is the learning process. Nowhere close.

    The fact explained in the article, that we do math through logical steps while LLMs use resemblance, is a small but meaningful example. And it also shows that you can see how LLMs work; it's just very difficult

  • Yes, neural networks can be implemented with matrix operations. What does that have to do with proving or disproving the ability to reason? You didn't post a relevant or complete thought

    Your comment is like saying an audio file isn't really music because it's just a series of numbers.

    Improper comparison; an audio file isn’t the basic action on data, it is the data; the audio codec is the basic action on the data

    “An LLM model isn’t really an LLM because it’s just a series of numbers”

    But the actions that turn the series of numbers into something of value (an audio codec for an audio file, matrix math for an LLM) are actions that can be analyzed

    And clearly matrix multiplication cannot reason any better than an audio codec algorithm. It’s matrix math, it’s cool we love matrix math. Really big matrix math is really cool and makes real sounding stuff. But it’s just matrix math, that’s how we know it can’t think
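    For what it's worth, here is what "really big matrix math" looks like in miniature: a single self-attention step written out as plain matrix operations. Sizes and names are illustrative only; a real LLM stacks many such layers with billions of learned weights, and nothing here takes a side on whether that amounts to thinking.

```python
# Toy sketch of one self-attention step: the operations are just matrix math.
import numpy as np

rng = np.random.default_rng(1)
seq_len, d = 4, 8                                   # 4 tokens, 8-dimensional states
x = rng.normal(size=(seq_len, d))                   # made-up token representations

W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))   # random stand-ins for learned weights

Q, K, V = x @ W_q, x @ W_k, x @ W_v                 # three matrix multiplications
scores = Q @ K.T / np.sqrt(d)                       # how strongly each token attends to each other token
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)      # row-wise softmax
out = weights @ V                                   # weighted mix of value vectors

print(out.shape)                                    # (4, 8): updated representations for the 4 tokens
```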

  • I agree. This is the exact problem I think people need to face with neural network AIs. They work the exact same way we do. Even if we analysed the human brain, it would look like wires connected to wires with different resistances all over the place, with some other chemical influences.

    I think everyone forgets that neural networks were used in AI to replicate how animal brains work, and clearly, if it worked for us to get smart, then it should work for something synthetic. Well, we've certainly answered that now.

    Everyone saying "oh it's just a predictive model and it's all math and math can't be intelligent" is questioning exactly how their own brains work. We are just prediction machines: the brain releases dopamine when it correctly predicts things, and it self-learns from correctly assuming how things work. We modelled AI off of ourselves. And if we don't understand how we work, of course we're not gonna understand how it works.

    LLMs, among other things, lack the whole "live" neurotransmitter regulation aspect and the plasticity of the brain.

    We are nowhere near a close representation of actual brains. LLMs are to brains what a horse carriage is to a modern car. Yes, they have four wheels and they move, and cars also have four wheels and move, but that is far from making them close to each other.

  • Improper comparison; an audio file isn’t the basic action on data, it is the data; the audio codec is the basic action on the data

    “An LLM model isn’t really an LLM because it’s just a series of numbers”

    But the actions that turn the series of numbers into something of value (an audio codec for an audio file, matrix math for an LLM) are actions that can be analyzed

    And clearly matrix multiplication cannot reason any better than an audio codec algorithm. It’s matrix math, it’s cool we love matrix math. Really big matrix math is really cool and makes real sounding stuff. But it’s just matrix math, that’s how we know it can’t think

    Do LLMs not exhibit emergent behaviour? But who am I, a simple skin-bag of chemicals, to really say.

  • LLMs, among other things, lack the whole "live" neurotransmitter regulation aspect and the plasticity of the brain.

    We are nowhere near a close representation of actual brains. LLMs are to brains what a horse carriage is to a modern car. Yes, they have four wheels and they move, and cars also have four wheels and move, but that is far from making them close to each other.

    So LLMs are like a human with anterograde amnesia. They're like Dory.

  • I agree. This is the exact problem I think people need to face with neural network AIs. They work the exact same way we do. Even if we analysed the human brain, it would look like wires connected to wires with different resistances all over the place, with some other chemical influences.

    I think everyone forgets that neural networks were used in AI to replicate how animal brains work, and clearly, if it worked for us to get smart, then it should work for something synthetic. Well, we've certainly answered that now.

    Everyone saying "oh it's just a predictive model and it's all math and math can't be intelligent" is questioning exactly how their own brains work. We are just prediction machines: the brain releases dopamine when it correctly predicts things, and it self-learns from correctly assuming how things work. We modelled AI off of ourselves. And if we don't understand how we work, of course we're not gonna understand how it works.

    They work the exact same way we do.

    Two things being difficult to understand does not mean that they are the exact same.

  • You can prove it’s not by doing some matrix multiplication and seeing that it’s matrix multiplication. Much easier way to go about it

    People that cannot do matrix multiplication do not possess the basic concepts of intelligence now? Or is software that can do matrix multiplication intelligent?

  • Here's a book for a different audience. It explains in layman's terms why to be wary of this tech: https://thebullshitmachines.com/

  • I found the article in a post on the fediverse, and I can't find it anymore.

    The researchers asked an LLM a simple mathematical question (like 7+4) and could then see how it worked internally: they found paths based on resemblance, but nothing like performing mathematical reasoning, even though the final answer was correct.

    Then they asked the LLM to explain how it found the result, what its internal reasoning was. The answer was detailed, step-by-step mathematical logic, like a human explaining how to perform an addition.

    This showed 2 things:

    • LLMs don't "know" how they work

    • the second answer was a rephrasing of original text used for training that explains how math works, so the LLM just used that as an explanation

    I think it was a very interesting and meaningful analysis

    Can anyone help me find this?

    EDIT: thanks to @theunknownmuncher@lemmy.world, it's this one: https://www.anthropic.com/research/tracing-thoughts-language-model

    EDIT2: I'm aware LLMs don't "know" anything and don't reason, and that's exactly why I wanted to find the article. Some more details here: https://feddit.it/post/18191686/13815095

    It's the Anthropic article you are looking for, where they performed the equivalent of open-brain surgery to find out that LLMs do maths through very strange and eerily humanlike operations: they will estimate, then, if it goes over, calculate the last digit, like I do. It sucks as a counting technique though
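    As a rough illustration of that two-track trick (an illustration of the idea only, not Anthropic's actual circuitry): one path only estimates the size of the answer, a second path only gets the last digit right, and the two are reconciled at the end.

```python
# Toy sketch of "estimate, then fix the last digit" addition.
def rough_estimate(a, b):
    """Approximate path: only the rough size of the sum, to the nearest ten."""
    return round((a + b) / 10) * 10

def last_digit(a, b):
    """Precise path: only the final digit of the sum."""
    return (a + b) % 10

def add(a, b):
    """Reconcile: overwrite the estimate's last digit, then step a ten if it drifted."""
    est, digit = rough_estimate(a, b), last_digit(a, b)
    guess = est - (est % 10) + digit
    if guess - est > 5:          # overshot the estimate: step down a ten
        guess -= 10
    elif est - guess > 5:        # undershot the estimate: step up a ten
        guess += 10
    return guess                 # can still misfire when the estimate is off by exactly five

print(add(7, 4))     # 11
print(add(36, 63))   # 99
print(add(58, 27))   # 85
```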

  • Improper comparison; an audio file isn’t the basic action on data, it is the data; the audio codec is the basic action on the data

    “An LLM model isn’t really an LLM because it’s just a series of numbers”

    But the actions that turn the series of numbers into something of value (an audio codec for an audio file, matrix math for an LLM) are actions that can be analyzed

    And clearly matrix multiplication cannot reason any better than an audio codec algorithm. It’s matrix math, it’s cool we love matrix math. Really big matrix math is really cool and makes real sounding stuff. But it’s just matrix math, that’s how we know it can’t think

    LOL you didn't really make the point you thought you did. It isn't an "improper comparison" (it's called a false equivalency FYI), because there isn't a real distinction between information and this thing you just made up called "basic action on data", but anyway have it your way:

    Your comment is still exactly like saying an audio pipeline isn't really playing music because it's actually just doing basic math.

  • I agree. This is the exact problem I think people need to face with neural network AIs. They work the exact same way we do. Even if we analysed the human brain, it would look like wires connected to wires with different resistances all over the place, with some other chemical influences.

    I think everyone forgets that neural networks were used in AI to replicate how animal brains work, and clearly, if it worked for us to get smart, then it should work for something synthetic. Well, we've certainly answered that now.

    Everyone saying "oh it's just a predictive model and it's all math and math can't be intelligent" is questioning exactly how their own brains work. We are just prediction machines: the brain releases dopamine when it correctly predicts things, and it self-learns from correctly assuming how things work. We modelled AI off of ourselves. And if we don't understand how we work, of course we're not gonna understand how it works.

    I agree. This is the exact problem I think people need to face with neural network AIs. They work the exact same way we do.

    I don't think this is a fair way of summarizing it. You're making it sound like we have AGI, which we do not, and we may never have AGI.

  • Even if LLM "neurons" and their interconnections are modeled after biological ones, LLMs aren't modeled on the human brain, where a lot is still not understood.

    The first thing is that how the neurons are organized is completely different. Think about the cortex and the transformer.

    Second is the learning process. Nowhere close.

    The fact explained in the article, that we do math through logical steps while LLMs use resemblance, is a small but meaningful example. And it also shows that you can see how LLMs work; it's just very difficult

    I agree, but I'm not sure it matters when it comes to the big questions, like "what separates us from the LLMs?" Answering that basically amounts to answering "what does it mean to be human?", which has been stumping philosophers for millennia.

    It's true that artificial neurons are significantly different from biological ones, but are biological neurons what make us human? I'd argue no. Animals have neurons, so are they human? Also, if we ever did create a brain simulation that perfectly replicated someone's brain down to the cellular level, and that simulation behaved exactly like the original, I would characterize that as a human.

    It's also true LLMs can't learn, but there are plenty of people with anterograde amnesia that can't either.

    This feels similar to the debates about what separates us from other animal species. It used to be thought that humans were qualitatively different than other species by virtue of our use of tools, language, and culture. Then it was discovered that plenty of other animals use tools, have language, and something resembling a culture. These discoveries were ridiculed by many throughout the 20th century, even by scientists, because they wanted to keep believing humans are special in some qualitative way. I see the same thing happening with LLMs.

  • Improper comparison; an audio file isn’t the basic action on data, it is the data; the audio codec is the basic action on the data

    “An LLM model isn’t really an LLM because it’s just a series of numbers”

    But the actions that turn the series of numbers into something of value (an audio codec for an audio file, matrix math for an LLM) are actions that can be analyzed

    And clearly matrix multiplication cannot reason any better than an audio codec algorithm. It’s matrix math, it’s cool we love matrix math. Really big matrix math is really cool and makes real sounding stuff. But it’s just matrix math, that’s how we know it can’t think

    Can humans think?

  • More than enough people who claim to know how it works think it might be "evolving" into a sentient being inside its little black box. Example from a conversation I gave up on...
    https://sh.itjust.works/comment/18759960

    Maybe I should rephrase my question:

    Outside of comment sections on the internet, who has claimed or is claiming that LLMs have the capacity to reason?