
ChatGPT 'got absolutely wrecked' by Atari 2600 in beginner's chess match — OpenAI's newest model bamboozled by 1970s logic

Technology
  • GothamChess has a video of making ChatGPT play chess against Stockfish. Spoiler: ChatGPT does not do well. It plays okay for a few moves, but the moment it gets into trouble it straight up cheats. Telling it to follow the rules of chess doesn't help.

    This sort of gets to the heart of LLM-based "AI". That one example really shows me that there's no actual reasoning happening inside. It's producing answers that statistically look like the answers that might be given for that input.

    For some things it even works. But calling this intelligence is dubious at best.

    It plays okay for a few moves, but the moment it gets into trouble it straight up cheats.

    Lol. More comparisons of how current AI is like a young child.

  • Absolutely interested. Thank you for taking the time to share that.

    My career path in neural networks began as a researcher working on object detection for cancerous tissue in medical diagnostic imaging. Now it has switched to generative models for CAD (architecture, product design, game assets, etc.). I don't really mess about with fine-tuning LLMs.

    However, I do self-host my own LLMs as code assistants. Thus, I'm only tangentially involved with the current LLM craze.

    But it does interest me, nonetheless!

    Here is the main blog post that I remembered: it has a follow-up, a more scientific version, and it uses two other articles as a basis, so you might want to dig around in what they mention in the introduction.

    It is indeed quite a technical discovery, and it still lacks complete, wider analysis, but it is very interesting because it somewhat invalidates the common gut feeling that LLMs are pure lucky randomness.

  • Using an LLM as a chess engine is like using a power tool as a table leg. Pretty funny honestly, but it's obviously not going to be good at it, at least not without scaffolding.

    is like using a power tool as a table leg.

    Then again, our corporate lords and masters are trying to replace all manner of skilled workers with those same LLM "AI" tools.

    And clearly that will backfire on them and they'll eventually scramble to find people with the needed skills, but in the meantime tons of people will have lost their source of income.

  • This post did not contain any content.

    If you don't play chess, the Atari is probably going to beat you as well.

    LLMs are only good at things to the extent that they have been well-trained in the relevant areas. Not just learning to predict text string sequences, but reinforcement learning after that, where a human or some other agent says "this answer is better than that one" enough times in enough of the right contexts. It mimics the way humans learn, which is through repeated and diverse exposure.
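
    For the curious, that "this answer is better than that one" step is typically trained with a pairwise preference loss over a learned reward model. A minimal sketch in plain Python (the scores below are made-up placeholders, not output from any real reward model):

    import math

    def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
        """Bradley-Terry-style pairwise loss: push the reward model to
        score the human-preferred answer above the rejected one."""
        # Probability the model assigns to the human's preference ordering.
        p_chosen = 1.0 / (1.0 + math.exp(-(reward_chosen - reward_rejected)))
        # Negative log-likelihood: small when the model already agrees.
        return -math.log(p_chosen)

    # Hypothetical reward scores for two candidate answers.
    print(preference_loss(2.1, 0.3))  # model agrees with the human -> low loss
    print(preference_loss(0.3, 2.1))  # model disagrees -> high loss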

    If they set up a system to train it against some chess program, or (much simpler) just gave it a chess engine as a tool to call, it would do much better. Tool calling already exists and would be by far the easiest way; a sketch of what that could look like follows below.

    It could also be instructed to write a chess program itself and then run it, at which point it might be on par with the Atari, but it wouldn't compete well with a serious chess engine.
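
    To make the tool-calling idea concrete, here's a minimal sketch of the tool the model could be handed, assuming the python-chess package and a Stockfish binary on the PATH (both my assumptions, nothing from the article). The LLM only has to emit a request like "play the best move for this position"; the engine does the actual chess.

    import chess
    import chess.engine

    def chess_move_tool(fen: str) -> str:
        """Tool the LLM can call: return the engine's move for a position."""
        board = chess.Board(fen)  # position handed over by the LLM, as FEN
        with chess.engine.SimpleEngine.popen_uci("stockfish") as engine:
            result = engine.play(board, chess.engine.Limit(time=0.5))
        return result.move.uci()  # e.g. "e2e4", pasted back into the chat

    print(chess_move_tool(chess.STARTING_FEN))

    Wired up like that, the model never has to "know" chess at all; it just has to recognize when to delegate.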

  • is like using a power tool as a table leg.

    Then again, our corporate lords and masters are trying to replace all manner of skilled workers with those same LLM "AI" tools.

    And clearly that will backfire on them and they'll eventually scramble to find people with the needed skills, but in the meantime tons of people will have lost their source of income.

    If you believe LLMs are not good at anything, then there should be relatively little to worry about in the long term, but I am more concerned.

    It's not obvious to me that it will backfire for them, because I believe LLMs are good at some things (that is, when they are used correctly, for the correct tasks). Currently they're being applied to far more use cases than they are likely to be good at -- either because they're overhyped or because our corporate lords and masters are just experimenting to find out what they're good at and what they're not. Some of these cases will be like chess, but others will be like code*.

    (* not saying LLMs are good at code in general, but for some coding applications I believe they are vastly more efficient than humans, even if a human expert can currently write higher-quality, less-buggy code.)

  • Hardly surprising. LLMs aren't -thinking-, they're just shitting out the next token for any given input of tokens.

    That's exactly what thinking is, though.

  • This post did not contain any content.

    2025 Mazda MX-5 Miata 'got absolutely wrecked' by Inflatable Boat in beginner's boat racing match — Mazda's newest model bamboozled by 1930s technology.

  • This post did not contain any content.

    This is because an LLM is not made for playing chess.

  • If you believe LLMs are not good at anything, then there should be relatively little to worry about in the long term, but I am more concerned.

    It's not obvious to me that it will backfire for them, because I believe LLMs are good at some things (that is, when they are used correctly, for the correct tasks). Currently they're being applied to far more use cases than they are likely to be good at -- either because they're overhyped or because our corporate lords and masters are just experimenting to find out what they're good at and what they're not. Some of these cases will be like chess, but others will be like code*.

    (* not saying LLMs are good at code in general, but for some coding applications I believe they are vastly more efficient than humans, even if a human expert can currently write higher-quality, less-buggy code.)

    I believe LLMs are good at some things

    The problem is that they're being used for all the things, including a large number of tasks that they are not well suited to.

  • I believe LLMs are good at some things

    The problem is that they're being used for all the things, including a large number of tasks that they are not well suited to.

    Yeah, we agree on this point. In the short term it's a disaster. In the long term, assuming AI's capabilities don't keep improving at their current rate, our corporate overlords will only replace the people who are actually worth replacing with AI.

  • That's exactly what thinking is, though.

    An LLM is an ordered series of parameterized, weighted nodes which are fed a bunch of tokens; millions of calculations later, it generates the next token, appends it, and repeats the process. It's like turning a handle on some complex Babbage-esque machine. LLMs use a tiny bit of randomness ("temperature") when choosing the next token, so the responses are not identical each time.

    But it is not thinking. Not even remotely so. It's a simulacrum. If you want to see this, run ollama with the temperature set to 0, e.g.

    ollama run gemma3:4b
    >>> /set parameter temperature 0
    >>> what is a leaf
    

    You will get the same answer every single time.
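
    A rough sketch of why that happens, assuming the usual temperature-scaled softmax sampling (toy vocabulary and logits made up purely for illustration):

    import math
    import random

    def sample_next_token(logits: dict[str, float], temperature: float) -> str:
        """Pick the next token from a toy vocabulary of token -> logit."""
        if temperature == 0:
            # Greedy decoding: always the single most likely token, which
            # is why a temperature-0 run gives the same answer every time.
            return max(logits, key=logits.get)
        # Otherwise: scale the logits, softmax, and sample.
        scaled = [l / temperature for l in logits.values()]
        peak = max(scaled)
        weights = [math.exp(l - peak) for l in scaled]  # stable softmax numerators
        return random.choices(list(logits), weights=weights)[0]

    toy_logits = {"leaf": 2.0, "tree": 1.5, "dog": 0.1}  # made-up numbers
    print(sample_next_token(toy_logits, 0))    # "leaf", every single time
    print(sample_next_token(toy_logits, 0.8))  # usually "leaf", sometimes not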

  • An LLM is an ordered series of parameterized, weighted nodes which are fed a bunch of tokens; millions of calculations later, it generates the next token, appends it, and repeats the process. It's like turning a handle on some complex Babbage-esque machine. LLMs use a tiny bit of randomness ("temperature") when choosing the next token, so the responses are not identical each time.

    But it is not thinking. Not even remotely so. It's a simulacrum. If you want to see this, run ollama with the temperature set to 0, e.g.

    ollama run gemma3:4b
    >>> /set parameter temperature 0
    >>> what is a leaf
    

    You will get the same answer every single time.

    I know what an LLM is doing. You don't know what your brain is doing.
