ChatGPT 'got absolutely wrecked' by Atari 2600 in beginner's chess match — OpenAI's newest model bamboozled by 1970s logic
-
Plot twist: the toddler has a multi-year marketing push worth tens if not hundreds of millions, which convinced a lot of people who don't know the first thing about chess that it really is very impressive, and all those chess-types are just jealous.
Have you tried feeding the toddler gallons of baby-food? Maybe then it can play chess
-
That's because it doesn't know what it's saying. It's just blathering out each word as what it estimates to be the likely next word given past examples in its training data. It's a statistics calculator. It's marginally better than just smashing the auto fill on your cell repeatedly. It's literally dumber than a parrot.
Parrots are actually intelligent though.
-
I agree with your general statement, but in theory, since all ChatGPT does is regurgitate information back and a lot of chess is memorization of historical games and types, it might actually perform well. No, it can't think, but it can remember everything, so at some point that might tip the results in its favor.
I mean it may be possible but the complexity would be so many orders of magnitude greater. It'd be like learning chess by just memorizing all the moves great players made but without any context or understanding of the underlying strategy.
-
Articles like this are good because they expose the flaws in the AI and show that it can't be trusted with complex multi-step tasks.
Helps people who think AI is close to a human see that it's not and that it's missing critical functionality.
The problem is though that this perpetuates the idea that ChatGPT is actually an AI.
-
In all fairness, machine learning in chess engines is actually pretty strong.
AlphaZero was developed by the artificial intelligence and research company DeepMind, which was acquired by Google. It is a computer program that reached a virtually unthinkable level of play using only reinforcement learning and self-play in order to train its neural networks. In other words, it was only given the rules of the game and then played against itself many millions of times (44 million games in the first nine hours, according to DeepMind).
Source: AlphaZero - Chess Engines, Chess.com (www.chess.com)
Oh absolutely you can apply machine learning to game strategy. But you can't expect a generalized chatbot to do well at strategic decision making for a specific game.
-
You're so fucking silly. You gonna study cell theory to see how long you should keep vegetables in your fridge? Go home. Save science for people who understand things.
Save science for people who understand things.
Does this not strike you as the least bit ironic?
-
I like tab coding, writing small blocks of code that it thinks I need. It's on point almost all the time. This speeds me up.
Bingo. If anything what you're finding is the people bitching are the same people that if given a bike wouldn't know how to ride it, which is fair. Some people understand quicker how to use the tools they are given.
Edit - a poor carpenter blames his tools.
-
Not to help the AI companies, but why don't they program them to look up math programs and outsource chess to other programs when they're asked for that stuff? It's obvious they're shit at it, why do they answer anyway? It's because they're programmed by know-it-all programmers, isn't it.
This is where MCP (Model Context Protocol) comes in. It's a protocol for LLMs to call standard tools. Basically the LLM would figure out the tool to use from the context, then figure out the order of parameters from those the MCP server says are available, send the JSON, and parse the response.
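To make that flow concrete, here's a minimal sketch in plain Python. It is not the real MCP wire format or any SDK: the tool name `chess_best_move`, the `tool`/`arguments`/`result`/`error` field names, and the fake engine are all assumptions made up for illustration. It only shows the shape of the idea: the model side emits a JSON tool call, the server side checks it against the parameters it advertises, dispatches, and replies in JSON.

```python
import json

# Hypothetical stand-in for a real chess engine; it always returns a fixed move.
def best_move(fen: str) -> str:
    return "e2e4"

# Hypothetical tool registry: what the server says is available.
TOOLS = {
    "chess_best_move": {"params": ["fen"], "handler": best_move},
}

def build_request(tool: str, **arguments) -> str:
    """Model side: serialize the chosen tool and its arguments as JSON."""
    return json.dumps({"tool": tool, "arguments": arguments})

def handle_request(raw: str) -> str:
    """Server side: validate the call, run the tool, reply as JSON."""
    request = json.loads(raw)
    spec = TOOLS.get(request["tool"])
    if spec is None:
        return json.dumps({"error": f"unknown tool: {request['tool']}"})
    missing = [p for p in spec["params"] if p not in request["arguments"]]
    if missing:
        return json.dumps({"error": f"missing parameters: {missing}"})
    result = spec["handler"](**request["arguments"])
    return json.dumps({"result": result})

START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
reply = json.loads(handle_request(build_request("chess_best_move", fen=START_FEN)))
print(reply)
```

The point of the design is that the chatbot never has to be good at chess itself. It only has to pick the right tool and fill in the parameters, and the actual move comes from a program built for the job.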
-
That is more a failure of the person who made that decision than a failing of ChatBots, lol
Agreed, which is why it's important to have articles out in the wild that show the shortcomings of AI. If all people read is all the positive crap coming out of companies like OpenAI then they will make stupid decisions.
-
Hardly surprising. LLMs aren't -thinking-, they're just shitting out the next token for any given input of tokens.
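For anyone wondering what "next token for a given input" means in practice, here's a toy sketch. Big caveat: real LLMs use large neural networks over long contexts, not the simple word-pair counts used here, and the corpus and function names are invented for illustration. The principle it shows is the same one, though: pick the continuation that was most common in the training text, with no model of the board behind it.

```python
from collections import Counter, defaultdict

def train_bigrams(text: str) -> dict:
    """Count, for every word, which words followed it in the training text."""
    counts = defaultdict(Counter)
    tokens = text.split()
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts: dict, token: str):
    """Return the most frequent follower of `token`, or None if it was never followed."""
    followers = counts.get(token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Made-up training text for the example.
corpus = "the knight takes the pawn and the knight takes the rook"
model = train_bigrams(corpus)

print(predict_next(model, "the"))     # most common word after "the" in the corpus
print(predict_next(model, "knight"))
print(predict_next(model, "rook"))    # last word of the corpus, so nothing ever followed it
```

Nothing in there knows what a knight is or whether the move is legal. It only knows what tended to come next.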
-
ChatGPT has been, hands down, the worst AI coding assistant I've ever used.
It regularly suggests code that doesn't compile or isn't even for the language.
It generally suggests AC of code that is just a copy of the lines I just wrote.
Sometimes it likes to suggest setting the same property like 5 times.
It is absolute garbage and I do not recommend it to anyone.
All AIs are the same. They're just scraping content from GitHub, Stack Overflow, etc., with a bunch of guardrails slapped on to spew out sentences that conform to their training data, but there is no intelligence. They're super handy for basic code snippets, but anyone using them for anything remotely complex or nuanced will regret it.
-
You say you produce good oranges but my machine for testing apples gave your oranges a very low score.
No, more like "Your marketing team, sales team, the news media at large, and random hype men all insist your orange machine works amazingly on any fruit if you know how to use it right. It didn't work on my strawberries when I gave it all the help I could, and was outperformed by my 40-year-old strawberry machine. Please stop selling the idea it works on all fruit."
This study is specifically a counter to the constant hype that these LLMs will revolutionize absolutely everything, and the constant word choices used in discussion of LLMs that imply they have reasoning capabilities.
-
my favorite thing is to constantly be implementing libraries that don't exist
It's even worse when AI soaks up some project whose APIs are constantly changing. Try using AI to code against Jetty, for example, and you'll be weeping.
-
I swear every single article critical of current LLMs is like, "The square got BLASTED by the triangle shape when it completely FAILED to go through the triangle shaped hole."
-
Have you tried feeding the toddler gallons of baby-food? Maybe then it can play chess
They’ve been feeding the toddler everybody else’s baby food and claiming they have the right to.
-
They used ChatGPT 4o, instead of using o1 or o3.
Obviously it was going to fail.
Other studies (not all chess based or against this old chess AI) show similar lackluster results when using reasoning models.
Edit: When comparing reasoning models to existing algorithmic solutions.
-
I swear every single article critical of current LLMs is like, "The square got BLASTED by the triangle shape when it completely FAILED to go through the triangle shaped hole."
It's newsworthy when the sellers of squares are saying that nobody will ever need a triangle again, and the shape-sector of the stock market is hysterically pumping money into companies that make or use squares.
-
They’ve been feeding the toddler everybody else’s baby food and claiming they have the right to.
"If we have to ask every time before stealing a little baby food, our morbidly obese toddler cannot survive"
-
ChatGPT has been, hands down, the worst AI coding assistant I've ever used.
It regularly suggests code that doesn't compile or isn't even for the language.
It generally suggests AC of code that is just a copy of the lines I just wrote.
Sometimes it likes to suggest setting the same property like 5 times.
It is absolute garbage and I do not recommend it to anyone.
I’ve had success with splitting a function into 2 and planning out an overview, though that’s more like talking to myself.
I wouldn’t use it to generate stuff though.
-
It's newsworthy when the sellers of squares are saying that nobody will ever need a triangle again, and the shape-sector of the stock market is hysterically pumping money into companies that make or use squares.
You get 2 triangles in a single square mate...
CHECKMATE!
-