linux-nerds.org

ChatGPT 'got absolutely wrecked' by Atari 2600 in beginner's chess match — OpenAI's newest model bamboozled by 1970s logic

Technology

204 posts, 136 commenters, 5.9k views

pelespirit@sh.itjust.works
June 9, 2025, 23:21

Not to help the AI companies, but why don't they program them to look up math programs and outsource chess to other programs when they're asked for that stuff? It's obvious they're shit at it, why do they answer anyway? It's because they're programmed by know-it-all programmers, isn't it.
fmstrat@lemmy.nowsci.com

wrote on June 10, 2025, 10:57

#115

This is where MCP comes in. It's a protocol for LLMs to call standard tools. Basically, the LLM figures out which tool to use from the context, works out the parameters from those the MCP server says are available, sends the JSON, and parses the response.
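For anyone wondering what that looks like on the wire, here is a minimal Python sketch. MCP messages are JSON-RPC 2.0, and a tool is invoked with the `tools/call` method. The tool name `best_move`, its `fen` argument, and the canned response are assumptions made up for illustration; they do not come from any real MCP server, and no network transport is shown.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids


def build_tool_call(name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


def parse_tool_result(raw: str) -> str:
    """Return the joined text parts of a tool result, raising on errors."""
    msg = json.loads(raw)
    if "error" in msg:  # protocol-level failure
        raise RuntimeError(msg["error"].get("message", "unknown error"))
    result = msg["result"]
    text = "".join(
        part["text"]
        for part in result.get("content", [])
        if part.get("type") == "text"
    )
    if result.get("isError"):  # the tool itself reported a failure
        raise RuntimeError(text)
    return text


# Hypothetical tool and argument names, chosen only for this example.
START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
request = build_tool_call("best_move", {"fen": START_FEN})

# Canned reply standing in for whatever a real server would send back.
response = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"content": [{"type": "text", "text": "e2e4"}], "isError": False},
})
print(parse_tool_result(response))
```

The model only has to produce the tool name and the arguments; the chess itself happens on the other side of the call.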
1 reply

1
otp@sh.itjust.works
June 10, 2025, 00:50

That is more a failure of the person who made that decision than a failing of ChatBots, lol
wewbull@feddit.uk

wrote on June 10, 2025, 11:20

#116

Agreed, which is why it's important to have articles out in the wild that show the shortcomings of AI. If all people read is all the positive crap coming out of companies like OpenAI then they will make stupid decisions.
1 reply

2
lifecoach5000@lemmy.world
June 9, 2025, 22:38

This post did not contain any content.
arc99@lemmy.world

wrote on June 10, 2025, 11:22

#117

Hardly surprising. LLMs aren't -thinking-; they're just shitting out the next token for any given input of tokens.
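To make "next token" concrete, here is a deliberately tiny toy in Python: a bigram table that always emits the most frequent follower of the last token. This is an assumption-laden illustration, not how a real LLM is built (those use neural networks over a long context), but the generation loop has the same shape, and nothing in it knows what a chessboard is.

```python
from collections import Counter, defaultdict


def train_bigrams(tokens):
    """Count, for each token, which tokens followed it in the training data."""
    table = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        table[prev][nxt] += 1
    return table


def generate(table, start, max_tokens):
    """Greedily append the most frequent follower of the last token."""
    out = [start]
    for _ in range(max_tokens):
        followers = table.get(out[-1])
        if not followers:  # never seen anything after this token
            break
        out.append(followers.most_common(1)[0][0])
    return out


# Made-up "training data": the opening moves of a single game.
corpus = "e4 e5 Nf3 Nc6 Bb5 a6".split()
table = train_bigrams(corpus)
print(" ".join(generate(table, "e4", 3)))  # prints "e4 e5 Nf3 Nc6"
```

It reproduces plausible-looking move sequences from its training data without ever checking whether a move is legal in the current position.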
1 reply, last reply June 11, 2025, 21:43

18
alecsadler@sh.itjust.works
June 10, 2025, 05:38

ChatGPT has been, hands down, the worst AI coding assistant I've ever used.

It regularly suggests code that doesn't compile or isn't even for the language.

It generally suggests autocompletions that are just a copy of the lines I just wrote.

Sometimes it likes to suggest setting the same property like 5 times.

It is absolute garbage and I do not recommend it to anyone.
arc99@lemmy.world

wrote on June 10, 2025, 11:26

#118

All AIs are the same. They're just scraping content from GitHub, Stack Overflow, etc., with a bunch of guardrails slapped on to spew out sentences that conform to their training data, but there is no intelligence. They're super handy for basic code snippets, but anyone using them for anything remotely complex or nuanced will regret it.
2 replies, last reply June 10, 2025, 16:12

5
seven_phone@lemmy.world
June 10, 2025, 01:29

You say you produce good oranges but my machine for testing apples gave your oranges a very low score.
wizardbeard@lemmy.dbzer0.com

wrote on June 10, 2025, 11:27

#119

No, more like "Your marketing team, sales team, the news media at large, and random hype men all insist your orange machine works amazing on any fruit if you know how to use it right. It didn't work on my strawberries when I gave it all the help I could, and was outperformed by my 40-year-old strawberry machine. Please stop selling the idea it works on all fruit."

This study is specifically a counter to the constant hype that these LLMs will revolutionize absolutely everything, and the constant word choices used in discussion of LLMs that imply they have reasoning capabilities.
1 reply

3
nutsack@lemmy.dbzer0.com
June 10, 2025, 07:30

my favorite thing is to constantly be implementing libraries that don't exist
arc99@lemmy.world

wrote on June 10, 2025, 11:27

#120

It's even worse when AI soaks up some project whose APIs are constantly changing. Try using AI to code against Jetty, for example, and you'll be weeping.
1 reply

1
lifecoach5000@lemmy.world
June 9, 2025, 22:38

This post did not contain any content.
halosheep@lemm.ee

wrote on June 10, 2025, 11:31

#121

I swear every single article critical of current LLMs is like, "The square got BLASTED by the triangle shape when it completely FAILED to go through the triangle shaped hole."
3 replies, last reply June 10, 2025, 11:43

48
xavier666@lemm.ee
June 10, 2025, 09:57

Have you tried feeding the toddler gallons of baby-food? Maybe then it can play chess
baggachipz@sh.itjust.works

wrote on June 10, 2025, 11:33

#122

They’ve been feeding the toddler everybody else’s baby food and claiming they have the right to.
1 reply, last reply June 10, 2025, 11:46

2
isaamoonkhgdt_6143@lemmy.zip
June 9, 2025, 23:02

They used ChatGPT 4o, instead of using o1 or o3.

Obviously it was going to fail.
wizardbeard@lemmy.dbzer0.com

wrote on June 10, 2025, 11:39, last edited by wizardbeard@lemmy.dbzer0.com on October 6, 2025, 13:49

#123

Other studies (not all chess based or against this old chess AI) show similar lackluster results when using reasoning models.

Edit: When comparing reasoning models to existing algorithmic solutions.
1 reply

0
halosheep@lemm.ee
June 10, 2025, 11:31

I swear every single article critical of current LLMs is like, "The square got BLASTED by the triangle shape when it completely FAILED to go through the triangle shaped hole."
drspod@lemmy.ml

wrote on June 10, 2025, 11:43

#124

It's newsworthy when the sellers of squares are saying that nobody will ever need a triangle again, and the shape-sector of the stock market is hysterically pumping money into companies that make or use squares.
3 replies, last reply June 10, 2025, 12:14

38
baggachipz@sh.itjust.works
June 10, 2025, 11:33

They’ve been feeding the toddler everybody else’s baby food and claiming they have the right to.
xavier666@lemm.ee

wrote on June 10, 2025, 11:46

#125

"If we have to ask every time before stealing a little baby food, our morbidly obese toddler cannot survive"
1 reply

3
alecsadler@sh.itjust.works
June 10, 2025, 05:38

ChatGPT has been, hands down, the worst AI coding assistant I've ever used.

It regularly suggests code that doesn't compile or isn't even for the language.

It generally suggests autocompletions that are just a copy of the lines I just wrote.

Sometimes it likes to suggest setting the same property like 5 times.

It is absolute garbage and I do not recommend it to anyone.
ilikeboobies@lemmy.ca

wrote on June 10, 2025, 12:03

#126

I’ve had success with splitting a function into 2 and planning out an overview, though that’s more like talking to myself

I wouldn’t use it to generate stuff though
1 reply

0
drspod@lemmy.ml
June 10, 2025, 11:43

It's newsworthy when the sellers of squares are saying that nobody will ever need a triangle again, and the shape-sector of the stock market is hysterically pumping money into companies that make or use squares.
pushbutton@lemmy.world

wrote on June 10, 2025, 12:14

#127

You get 2 triangles in a single square mate...

CHECKMATE!
1 reply, last reply June 10, 2025, 15:02

6
monkdervierte@lemmy.zip
June 10, 2025, 09:09

LLMs are not built for logic.
pushbutton@lemmy.world

wrote on June 10, 2025, 12:16

#128

And yet everybody is selling them to write code.

The last time I checked, coding required logic.
2 replies, last reply June 10, 2025, 14:00

15
furbag@lemmy.world
June 10, 2025, 04:26

Can ChatGPT actually play chess now? Last I checked, it couldn't remember more than 5 moves of history, so it wouldn't be able to see the true board state and would make illegal moves, take its own pieces, materialize pieces out of thin air, etc.
pamasich@kbin.earth

wrote on June 10, 2025, 12:20

#129

There are custom GPTs which claim to play at a Stockfish level or to be literally Stockfish under the hood (I assume the former is still the latter, just not explicitly). Haven't tested them, but if they work, I'd say yes. An LLM itself will never be able to play chess or do anything similar unless it outsources that task to another tool that can. And there seem to be GPTs that do exactly that.

As for why we need ChatGPT then when the result comes from Stockfish anyway, it's for the natural language prompts and responses.
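A rough Python sketch of that split, assuming the engine speaks UCI (the text protocol Stockfish uses): the wrapper formats `position` and `go` commands, reads back the `bestmove` line, and only then turns the move into a sentence. The helper names are made up for illustration, the engine output is a canned sample, and actually launching an engine process is left out.

```python
from typing import Iterable, List, Optional


def position_command(moves: List[str]) -> str:
    """UCI `position` command for a game given as long-algebraic moves."""
    if not moves:
        return "position startpos"
    return "position startpos moves " + " ".join(moves)


def go_command(movetime_ms: int) -> str:
    """Ask the engine to think for a fixed number of milliseconds."""
    return f"go movetime {movetime_ms}"


def parse_bestmove(lines: Iterable[str]) -> Optional[str]:
    """Pick the move out of the engine's `bestmove <move> [ponder <move>]` line."""
    for line in lines:
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "bestmove":
            return parts[1]
    return None


def describe(move: str) -> str:
    """The 'natural language' layer: wrap the engine's move in a sentence."""
    return f"I'd play {move[:2]} to {move[2:4]}."


# Canned engine output standing in for a real Stockfish process.
sample_output = ["info depth 10 score cp 20", "bestmove g1f3 ponder b8c6"]

print(position_command(["e2e4", "e7e5"]))  # position startpos moves e2e4 e7e5
print(go_command(100))                     # go movetime 100
print(describe(parse_bestmove(sample_output)))  # I'd play g1 to f3.
```

The language model's job shrinks to translating between the user and the engine; the move choice never depends on it.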
1 reply, last reply June 10, 2025, 16:23

0
drspod@lemmy.ml
June 10, 2025, 11:43

It's newsworthy when the sellers of squares are saying that nobody will ever need a triangle again, and the shape-sector of the stock market is hysterically pumping money into companies that make or use squares.
inconel@lemmy.ca

wrote on June 10, 2025, 12:27

#130

It's also from a company claiming they're getting closer to creating a morphing shape that can match any hole.
1 reply, last reply June 10, 2025, 13:45

17
drspod@lemmy.ml
June 10, 2025, 11:43

It's newsworthy when the sellers of squares are saying that nobody will ever need a triangle again, and the shape-sector of the stock market is hysterically pumping money into companies that make or use squares.
mrsqueezles@lemmy.world

wrote on June 10, 2025, 12:40

#131

The press release where OpenAI said we'd never need chess players again
1 reply

5
lifecoach5000@lemmy.world
June 9, 2025, 22:38

This post did not contain any content.
pamasich@kbin.earth

wrote on June 10, 2025, 12:43

#132

Isn't the Atari just a game console, not a chess engine?

Like, Wikipedia doesn't mention anything about the Atari 2600 having a built-in chess engine.

If they were willing to run a chess game on the Atari 2600, why did they not apply the same to ChatGPT? There are custom GPTs which claim to use a Stockfish API or play at a similar level.

Like this, it's just unfair. Both platforms are not designed to deal with the task by themselves, but one of them is given the necessary tooling, the other one isn't. No matter what you think of ChatGPT, that's not a fair comparison.
1 reply, last reply June 10, 2025, 13:55

0
lifecoach5000@lemmy.world
June 9, 2025, 22:38

This post did not contain any content.
harbinger01173430@lemmy.world

wrote on June 10, 2025, 12:53

#133

LLMs useless, confirmed once again.
1 reply

2
x00z@lemmy.world
June 10, 2025, 01:00

In all fairness. Machine learning in chess engines is actually pretty strong.

AlphaZero was developed by the artificial intelligence and research company DeepMind, which was acquired by Google. It is a computer program that reached a virtually unthinkable level of play using only reinforcement learning and self-play in order to train its neural networks. In other words, it was only given the rules of the game and then played against itself many millions of times (44 million games in the first nine hours, according to DeepMind).

AlphaZero - Chess Engines

Learn all about the AlphaZero chess program. Everything you need to know about AlphaZero, including what it is, why it is important, and more!

Chess.com (www.chess.com)
jeeva@lemmy.world

wrote on June 10, 2025, 12:57

#134

Sure, but machine learning like that is very different from how LLMs are trained and what they output.
1 reply

1
