ChatGPT 'got absolutely wrecked' by Atari 2600 in beginner's chess match — OpenAI's newest model bamboozled by 1970s logic
-
It's newsworthy when the sellers of squares are saying that nobody will ever need a triangle again, and the shape-sector of the stock market is hysterically pumping money into companies that make or use squares.
You get 2 triangles in a single square mate...
CHECKMATE!
-
LLMs are not built for logic.
And yet everybody is selling them to write code.
The last time I checked, coding required logic.
-
Can ChatGPT actually play chess now? Last I checked, it couldn't remember more than 5 moves of history, so it wouldn't be able to see the true board state and would make illegal moves, take its own pieces, materialize pieces out of thin air, etc.
There are custom GPTs which claim to play at a Stockfish level or to literally be Stockfish under the hood (I assume the former is still the latter, just not stated explicitly). I haven't tested them, but if they work, I'd say yes. An LLM by itself will never be able to play chess or do anything similar, unless it outsources that task to another tool that can. And there seem to be GPTs that do exactly that.
As for why we'd need ChatGPT at all when the result comes from Stockfish anyway: it's for the natural-language prompts and responses.
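If that's how those GPTs work, the plumbing is simple: the LLM handles the conversation, and a real engine picks the moves. A minimal sketch of that hand-off, assuming the python-chess library and a local Stockfish binary on the PATH (what those GPTs actually do internally is unknown):

```python
# Hypothetical sketch: the "chess" part is delegated entirely to Stockfish;
# an LLM wrapper would only translate prompts/responses around this.
import chess
import chess.engine

board = chess.Board()
engine = chess.engine.SimpleEngine.popen_uci("stockfish")

board.push_san("e4")  # say the user's move arrived via a natural-language prompt
result = engine.play(board, chess.engine.Limit(time=0.5))
print(board.san(result.move))  # the engine's reply, for the LLM to narrate back

engine.quit()
```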
-
It's newsworthy when the sellers of squares are saying that nobody will ever need a triangle again, and the shape-sector of the stock market is hysterically pumping money into companies that make or use squares.
It's also from a company claiming they're getting closer to creating a morphing shape that can match any hole.
-
It's newsworthy when the sellers of squares are saying that nobody will ever need a triangle again, and the shape-sector of the stock market is hysterically pumping money into companies that make or use squares.
The press release where OpenAI said we'd never need chess players again
-
Isn't the Atari just a game console, not a chess engine?
Like, Wikipedia doesn't mention anything about the Atari 2600 having a built-in chess engine.
If they were willing to run a chess game on the Atari 2600, why did they not apply the same to ChatGPT? There are custom GPTs which claim to use a Stockfish API or play at a similar level.
As it stands, it's just unfair. Neither platform is designed to handle the task by itself, but one of them is given the necessary tooling and the other one isn't. No matter what you think of ChatGPT, that's not a fair comparison.
-
LLMs useless, confirmed once again.
-
In all fairness, machine learning in chess engines is actually pretty strong.
AlphaZero was developed by the artificial intelligence and research company DeepMind, which was acquired by Google. It is a computer program that reached a virtually unthinkable level of play using only reinforcement learning and self-play in order to train its neural networks. In other words, it was only given the rules of the game and then played against itself many millions of times (44 million games in the first nine hours, according to DeepMind).
AlphaZero - Chess Engines
Learn all about the AlphaZero chess program. Everything you need to know about AlphaZero, including what it is, why it is important, and more!
Chess.com (www.chess.com)
Sure, but machine learning like that is very different from how LLMs are trained and what they output.
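For a feel of what "only given the rules, then played against itself" means mechanically, here's a toy sketch of the same loop: tabular value learning on tic-tac-toe standing in for AlphaZero's neural networks and tree search. This is an illustration of the self-play idea only, not DeepMind's method:

```python
import random

LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(b):
    for i, j, k in LINES:
        if b[i] != "." and b[i] == b[j] == b[k]:
            return b[i]
    return None

values = {}                 # board string -> value for the player who just moved
ALPHA, EPSILON = 0.2, 0.1   # learning rate, exploration rate

def choose(board, player):
    moves = [i for i, s in enumerate(board) if s == "."]
    if random.random() < EPSILON:
        return random.choice(moves)   # explore a random move
    # exploit: pick the move leading to the state we currently rate highest
    return max(moves, key=lambda m: values.get(board[:m] + player + board[m+1:], 0.0))

for _ in range(20000):      # self-play: one agent plays both sides
    board, player, history = "." * 9, "X", []
    while True:
        m = choose(board, player)
        board = board[:m] + player + board[m+1:]
        history.append((board, player))
        w = winner(board)
        if w or "." not in board:
            break
        player = "O" if player == "X" else "X"
    for state, p in history:   # push the final outcome back through the game
        reward = 0.0 if w is None else (1.0 if p == w else -1.0)
        v = values.get(state, 0.0)
        values[state] = v + ALPHA * (reward - v)

print(len(values), "states valued after 20k self-play games")
```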
-
Tbf, the article should probably mention the fact that machine learning programs designed to play chess blow everything else out of the water.
Machine learning has existed for many years, now. The issue is with these funding-hungry new companies taking their LLMs, repackaging them as "AI" and attributing every ML win ever to "AI".
ML programs designed and trained specifically to identify tumors in medical imaging have become good diagnostic tools. But if you read in the news that "AI helps cure cancer", it makes it sound like it was a lone researcher who spent a few minutes engineering the right prompt for Copilot.
Yes, a specifically designed and finely tuned ML program can now beat the best human chess player, but calling it "AI" and bundling it together with the latest Gemini or Claude iteration's "reasoning capabilities" is intentionally misleading. That's why articles like this one are needed. ML is a useful tool, but it is far from the "super-human general intelligence" that is meant to replace half of human workers by the power of wishful prompting.
-
Sometimes it seems like most of these AI articles are written by AIs with bad prompts.
Human journalists would hopefully do a little research. A quick search would reveal that researchers have been publishing about this for over a year, so there's no need to sensationalize it. Perhaps the human journalist could have spent a little time talking about why LLMs are bad at chess and how researchers are approaching the problem.
LLMs, on the other hand, are very good at producing clickbait articles with low information content.
-
So, it fares as well as the average schmuck, proving it is human
/s
-
It's also from a company claiming they're getting closer to creating a morphing shape that can match any hole.
And yet the company offers no explanation for how, exactly, they're going to get wood to do that.
-
I've found Claude 3.7 and 4.0, and sometimes Gemini variants, to still be leagues better than ChatGPT/Copilot.
Still not perfect, but a night-and-day difference.
I feel like ChatGPT didn't focus on coding and instead focused on the mainstream, but I'm not an expert.
Gemini will get basic C++, probably the best-documented language for beginners out there, right about half of the time.
I think that might even be the problem, honestly: a bunch of new coders post bad code, it gets fixed in the comments, but the LLM CAN'T realize that.
-
Do they, though? No one I've talked to, not my coworkers who use it for work, not my friends, not my 72-year-old mother, thinks they're sentient.
Okay, I maybe exaggerated a bit, but a lot of people think it actually knows things, or is actually smart. Which… it's not… at all. It's just pattern recognition. Which, I assume, was the point of showing it can't even beat the goddamn Atari: it cannot think or reason, it's all just copypasta and pattern recognition.
-
Isn't the Atari just a game console, not a chess engine?
Like, Wikipedia doesn't mention anything about the Atari 2600 having a built-in chess engine.
If they were willing to run a chess game on the Atari 2600, why did they not apply the same to ChatGPT? There are custom GPTs which claim to use a Stockfish API or play at a similar level.
As it stands, it's just unfair. Neither platform is designed to handle the task by itself, but one of them is given the necessary tooling and the other one isn't. No matter what you think of ChatGPT, that's not a fair comparison.
GPTs which claim to use a Stockfish API
Then the actual chess isn't the LLM. If you're using Stockfish, the LLM doesn't add anything; Stockfish is doing everything.
The whole point of the marketing rage is that LLMs can do all kinds of stuff, doubled down on by branding some approaches as "reasoning" models, which are roughly "similar to 'pre-reasoning', but forcing the use of more tokens on disposable intermediate generation steps". With this facet of LLM marketing, the promise is that the LLM can "reason" its way through a chess game without particular enablement. In practice, people trying to feed gobs of chess data into an LLM end up with an LLM that doesn't even comply with the rules of the game, let alone provide reasonably competitive responses to an opponent.
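To make the "enablement" part concrete: the minimum scaffolding is a rules engine that vets every move the LLM emits before it touches the board. A sketch with the python-chess library (the move strings here are made-up examples):

```python
import chess

def apply_llm_move(board: chess.Board, move_text: str) -> bool:
    """Try to play a move the LLM produced as SAN text, e.g. 'Nf3'."""
    try:
        move = board.parse_san(move_text)  # raises ValueError on illegal/garbled input
    except ValueError:
        return False                       # the hallucinated move never reaches the board
    board.push(move)
    return True

board = chess.Board()
print(apply_llm_move(board, "e4"))   # True: a legal opening move
print(apply_llm_move(board, "Qh5"))  # False: illegal here, rejected by the rules engine
```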
-
Sometimes it seems like most of these AI articles are written by AIs with bad prompts.
Human journalists would hopefully do a little research. A quick search would reveal that researchers have been publishing about this for over a year, so there's no need to sensationalize it. Perhaps the human journalist could have spent a little time talking about why LLMs are bad at chess and how researchers are approaching the problem.
LLMs, on the other hand, are very good at producing clickbait articles with low information content.
GothamChess has a video of making ChatGPT play chess against Stockfish. Spoiler: ChatGPT does not do well. It plays okay for a few moves, but the moment it gets in trouble it straight up cheats. Telling it to follow the rules of chess doesn't help.
This sort of gets to the heart of LLM-based "AI". That one example really shows me that there's no actual reasoning happening inside. It's producing answers that statistically look like answers that might be given based on that input.
For some things it even works. But calling this intelligence is dubious at best.
-
And yet everybody is selling them to write code.
The last time I checked, coding required logic.
To be fair, a decent chunk of coding is stupid boilerplate/minutia that varies environment to environment, language to language, library to library.
So LLMs can do some code completion: filling out boilerplate that is blatantly obvious, generating the redundant text mandated by certain patterns, and keeping straight details between languages, like "does this language want join as a method on a list taking a string argument, or vice versa?" (see the sketch below).
The problem is that this can sometimes be more trouble than it's worth, as miscompletions are annoying.
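For the join example specifically, the flip being pointed at looks like this (Python here, with the JavaScript version in a comment):

```python
parts = ["a", "b", "c"]
print(",".join(parts))  # Python: join is a method on the separator *string*
# JavaScript flips it: parts.join(",")  -- a method on the array instead
```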
-
All these comments asking "why don't they just have ChatGPT go and look up the correct answer".
That's not how it works, you buffoons: it trains on datasets long before release. It doesn't think. It doesn't learn after release, and it won't remember things you try to teach it.
Really lowering my faith in humanity when even the AI skeptics don't understand that it generates statistical representations of an answer based on answers given in the past.
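That "statistical representation" point in miniature: at each step the model just samples the next token from learned probabilities. The numbers below are invented purely for illustration:

```python
import random

# Made-up next-token distribution after the prompt "2 + 2 =":
next_token_probs = {" 4": 0.90, " 5": 0.04, " four": 0.06}
tokens, weights = zip(*next_token_probs.items())
print(random.choices(tokens, weights=weights)[0])  # usually " 4", but not always
```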
-
my favorite thing is to constantly be implementing libraries that don't exist
Oh man, I feel this. A couple of times I've had to field questions about a REST API I support, asking why they get errors when they supply a specific attribute. That attribute never existed: not in our code, not in our documentation; we never even thought of it. So I say, "Well, that attribute is invalid, I'm not sure where you saw to do that." They get insistent that the code was generated by a very good LLM, so we must be missing something...
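If you want the API itself to shut that argument down, strict request validation helps: unknown attributes become explicit errors instead of mysteries. A hypothetical sketch with pydantic; the model and the invented "turbo_mode" attribute are made up for illustration, not the commenter's actual API:

```python
from pydantic import BaseModel, ConfigDict, ValidationError

class CreateWidgetRequest(BaseModel):
    model_config = ConfigDict(extra="forbid")  # reject attributes we never defined
    name: str

try:
    CreateWidgetRequest(name="demo", turbo_mode=True)  # the attribute the LLM invented
except ValidationError as err:
    print(err)  # points straight at turbo_mode: "Extra inputs are not permitted"
```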
-
I swear every single article critical of current LLMs is like, "The square got BLASTED by the triangle shape when it completely FAILED to go through the triangle-shaped hole."
That's just clickbait in general these days lol