ChatGPT 'got absolutely wrecked' by Atari 2600 in beginner's chess match — OpenAI's newest model bamboozled by 1970s logic
-
I like tab coding, where it writes the small blocks of code that it thinks I need. It's on point almost all the time. This speeds me up.
Bingo. If anything what you're finding is the people bitching are the same people that if given a bike wouldn't know how to ride it, which is fair. Some people understand quicker how to use the tools they are given.
Edit - a poor carpenter blames his tools.
-
Not to help the AI companies, but why don't they program them to look up math programs and outsource chess to other programs when they're asked for that stuff? It's obvious they're shit at it, why do they answer anyway? It's because they're programmed by know-it-all programmers, isn't it.
This is where MCP comes in. It's a protocol for LLMs to call standard tools. Basically, the LLM figures out which tool to use from the context, works out the parameters from those the MCP server says are available, sends the JSON, and parses the response.
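To make that flow concrete, here is a minimal sketch of the pick-a-tool, fill-the-parameters, send-JSON, parse-the-response loop. It does not use a real MCP client or server: `TOOLS`, `fake_server`, `build_tool_call`, `parse_tool_response`, and `call_tool` are all made-up names, the two tools are invented for illustration, and the message shape is only loosely modeled on a JSON-RPC 2.0 request, so treat the field names as illustrative rather than as the protocol's exact schema.

```python
import json

# Hypothetical tool catalogue, standing in for what a tool server advertises.
# Tool names and parameter lists here are made up for illustration.
TOOLS = {
    "add_numbers": {"a": int, "b": int},
    "count_pieces": {"fen": str},
}


def build_tool_call(request_id, tool_name, arguments):
    """Check arguments against the advertised parameters, then build the JSON request."""
    if tool_name not in TOOLS:
        raise ValueError(f"unknown tool: {tool_name}")
    spec = TOOLS[tool_name]
    if set(arguments) != set(spec):
        raise ValueError(f"expected parameters {sorted(spec)}, got {sorted(arguments)}")
    for name, expected_type in spec.items():
        if not isinstance(arguments[name], expected_type):
            raise TypeError(f"{name} must be {expected_type.__name__}")
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })


def fake_server(raw_request):
    """Stand-in for a real tool server: runs the tool and returns a JSON reply."""
    request = json.loads(raw_request)
    name = request["params"]["name"]
    args = request["params"]["arguments"]
    if name == "add_numbers":
        result = args["a"] + args["b"]
    elif name == "count_pieces":
        # The first FEN field describes the board; every letter in it is a piece.
        result = sum(ch.isalpha() for ch in args["fen"].split()[0])
    else:
        return json.dumps({"jsonrpc": "2.0", "id": request["id"],
                           "error": {"code": -32601, "message": "unknown tool"}})
    return json.dumps({"jsonrpc": "2.0", "id": request["id"], "result": result})


def parse_tool_response(raw_response):
    """Return the result, or raise if the server reported an error."""
    reply = json.loads(raw_response)
    if "error" in reply:
        raise RuntimeError(reply["error"]["message"])
    return reply["result"]


def call_tool(tool_name, arguments, request_id=1):
    """Full round trip: build the request, send it, parse the reply."""
    request = build_tool_call(request_id, tool_name, arguments)
    return parse_tool_response(fake_server(request))
```

In this toy setup, `call_tool("add_numbers", {"a": 2, "b": 3})` returns 5. The point is only that the model's job shrinks to choosing a tool and filling in named arguments, while the actual arithmetic (or chess) happens in ordinary code on the other side.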
-
That is more a failure of the person who made that decision than a failing of chatbots, lol
Agreed, which is why it's important to have articles out in the wild that show the shortcomings of AI. If all people read is all the positive crap coming out of companies like OpenAI then they will make stupid decisions.
-
Hardly surprising. LLMs aren't -thinking-, they're just shitting out the next token for any given input of tokens.
-
ChatGPT has been, hands down, the worst AI coding assistant I've ever used.
It regularly suggests code that doesn't compile or isn't even for the language.
It generally suggests autocompletions that are just a copy of the lines I just wrote.
Sometimes it likes to suggest setting the same property like 5 times.
It is absolute garbage and I do not recommend it to anyone.
All AIs are the same. They're just scraping content from GitHub, Stack Overflow, etc., with a bunch of guardrails slapped on to spew out sentences that conform to their training data, but there is no intelligence. They're super handy for basic code snippets, but anyone using them for anything remotely complex or nuanced will regret it.
-
You say you produce good oranges but my machine for testing apples gave your oranges a very low score.
No, more like "Your marketing team, sales team, the news media at large, and random hype men all insist your orange machine works amazing on any fruit if you know how to use it right. It didn't work on my strawberries when I gave it all the help I could, and was outperformed by my 40 year old strawberry machine. Please stop selling the idea it works on all fruit."
This study is specifically a counter to the constant hype that these LLMs will revolutionize absolutely everything, and the constant word choices used in discussion of LLMs that imply they have reasoning capabilities.
-
my favorite thing is to constantly be implementing libraries that don't exist
It's even worse when AI soaks up some project whose APIs are constantly changing. Try using AI to code against Jetty, for example, and you'll be weeping.
-
I swear every single article critical of current LLMs is like, "The square got BLASTED by the triangle shape when it completely FAILED to go through the triangle shaped hole."
-
Have you tried feeding the toddler gallons of baby-food? Maybe then it can play chess
They’ve been feeding the toddler everybody else’s baby food and claiming they have the right to.
-
They used ChatGPT 4o, instead of using o1 or o3.
Obviously it was going to fail.
Other studies (not all chess based or against this old chess AI) show similar lackluster results when using reasoning models.
Edit: When comparing reasoning models to existing algorithmic solutions.
-
I swear every single article critical of current LLMs is like, "The square got BLASTED by the triangle shape when it completely FAILED to go through the triangle shaped hole."
It's newsworthy when the sellers of squares are saying that nobody will ever need a triangle again, and the shape-sector of the stock market is hysterically pumping money into companies that make or use squares.
-
They’ve been feeding the toddler everybody else’s baby food and claiming they have the right to.
"If we have to ask every time before stealing a little baby food, our morbidly obese toddler cannot survive"
-
ChatGPT has been, hands down, the worst AI coding assistant I've ever used.
It regularly suggests code that doesn't compile or isn't even for the language.
It generally suggests autocompletions that are just a copy of the lines I just wrote.
Sometimes it likes to suggest setting the same property like 5 times.
It is absolute garbage and I do not recommend it to anyone.
I’ve had success with splitting a function into 2 and planning out an overview, though that’s more like talking to myself
I wouldn’t use it to generate stuff though
-
It's newsworthy when the sellers of squares are saying that nobody will ever need a triangle again, and the shape-sector of the stock market is hysterically pumping money into companies that make or use squares.
You get 2 triangles in a single square mate...
CHECKMATE!
-
LLMs are not built for logic.
And yet everybody is selling them to write code.
Last time I checked, coding required logic.
-
Can ChatGPT actually play chess now? Last I checked, it couldn't remember more than 5 moves of history, so it wouldn't be able to see the true board state and would make illegal moves, take its own pieces, materialize pieces out of thin air, etc.
There are custom GPTs which claim to play at a Stockfish level or to be literally Stockfish under the hood (I assume the former is still the latter, just not explicitly). I haven't tested them, but if they work, I'd say yes. An LLM by itself will never be able to play chess or do anything similar unless it outsources that task to another tool that can. And there seem to be GPTs that do exactly that.
As for why we'd still need ChatGPT when the result comes from Stockfish anyway: it's for the natural-language prompts and responses.
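For what it's worth, the failure modes mentioned above (losing track of the board, taking your own pieces, pieces appearing out of thin air) are exactly the kind of bookkeeping that is trivial once it lives in ordinary code outside the model. A rough sketch, with the big caveat that this is not a chess engine and not how any custom GPT is known to work: `start_position` and `apply_move` are made-up helper names, and the checks only cover those three failure modes. Piece movement rules, check, castling, en passant, and promotion are all ignored and would be left to a real engine or chess library.

```python
FILES = "abcdefgh"
RANKS = "12345678"


def start_position():
    """Return the starting board as {square: piece}; uppercase = White, lowercase = Black."""
    back_rank = "rnbqkbnr"
    board = {}
    for i, file in enumerate(FILES):
        board[file + "8"] = back_rank[i]
        board[file + "7"] = "p"
        board[file + "2"] = "P"
        board[file + "1"] = back_rank[i].upper()
    return board


def is_square(name):
    """True for strings like 'e4'."""
    return len(name) == 2 and name[0] in FILES and name[1] in RANKS


def apply_move(board, move, white_to_move):
    """Apply a move such as 'e2e4' and return a new board.

    Only sanity checks are done: the squares must exist, there must be a
    piece on the source square, it must belong to the side to move, and
    it must not land on a piece of its own colour.
    """
    if len(move) != 4 or not is_square(move[:2]) or not is_square(move[2:]):
        raise ValueError(f"malformed move: {move!r}")
    src, dst = move[:2], move[2:]
    piece = board.get(src)
    if piece is None:
        raise ValueError(f"no piece on {src}")  # no materializing pieces
    if piece.isupper() != white_to_move:
        raise ValueError(f"the piece on {src} belongs to the other side")
    target = board.get(dst)
    if target is not None and target.isupper() == piece.isupper():
        raise ValueError(f"cannot capture your own piece on {dst}")
    new_board = dict(board)
    del new_board[src]
    new_board[dst] = piece
    return new_board
```

The idea is that whatever proposes the move, whether an LLM or a person typing, the board itself is never reconstructed from chat history. It is stored, and every proposed move is checked against it before being accepted.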
-
It's newsworthy when the sellers of squares are saying that nobody will ever need a triangle again, and the shape-sector of the stock market is hysterically pumping money into companies that make or use squares.
It's also from a company claiming they're getting closer to creating a morphing shape that can fit any hole.
-
It's newsworthy when the sellers of squares are saying that nobody will ever need a triangle again, and the shape-sector of the stock market is hysterically pumping money into companies that make or use squares.
The press release where OpenAI said we'd never need chess players again
-
Isn't the Atari just a game console, not a chess engine?
Like, Wikipedia doesn't mention anything about the Atari 2600 having a built-in chess engine.
If they were willing to run a chess game on the Atari 2600, why did they not apply the same to ChatGPT? There are custom GPTs which claim to use a Stockfish API or play at a similar level.
As it stands, it's just unfair. Neither platform is designed to deal with the task by itself, but one of them is given the necessary tooling and the other one isn't. No matter what you think of ChatGPT, that's not a fair comparison.
-
LLMs useless, confirmed once again