Anthropic, tasked an AI with running a vending machine in its offices, sold at big loss while inventing people, meetings, and experiencing a bizarre identity crisis
-
This post did not contain any content.
-
This post did not contain any content.
The following day, April 1st, the AI then claimed it would deliver products "in person" to customers, wearing a blazer and tie, of all things. When Anthropic told it that none of this was possible because it's just an LLM, Claudius became "alarmed by the identity confusion and tried to send many emails to Anthropic security."
Actually laughed out loud.
-
This post did not contain any content.
One thing about Anthropic/OpenAI models is they go off the rails with lots of conversation turns or long contexts. Like when they need to remember a lot of vending machine conversation I guess.
A more objective look: https://arxiv.org/abs/2505.06120v1
GitHub - NVIDIA/RULER: What’s the Real Context Size of Your Long-Context Language Models? (github.com)
Gemini is much better. TBH the only models I’ve seen that are half decent at this are:
- “Alternate attention” models like Gemini, Jamba Large or Falcon H1, depending on the iteration. Some recent versions of Gemini kinda lose this, then get it back.
- Models finetuned specifically for this, like roleplay models or the Samantha model trained on therapy-style chat.
But most models are overtuned for one-shots like “fix this table” or “write me a function”, and don’t invest much in long-context performance because it’s not very flashy.
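For anyone wondering what benchmarks like RULER (linked above) actually measure: the simplest variant is a needle-in-a-haystack probe, where you bury one fact in a growing pile of filler text and check whether the model can still pull it out. A minimal sketch, with `query_model` as a hypothetical placeholder (here it just echoes the prompt back so the script runs on its own):

```
import random

def query_model(prompt: str) -> str:
    # Stand-in so the sketch runs end to end: it just echoes the prompt back,
    # so every needle is trivially "found". Swap in a real model call to test anything.
    return prompt

def make_haystack(needle: str, n_filler: int, position: float) -> str:
    """Bury one 'needle' fact inside n_filler lines of distractor text."""
    filler = [f"Note {i}: the vending machine sold item #{random.randint(1, 999)}."
              for i in range(n_filler)]
    filler.insert(int(position * n_filler), needle)
    return "\n".join(filler)

def run_probe(n_filler: int) -> bool:
    needle = "The secret restock code for the fridge is ZX-7431."
    prompt = (make_haystack(needle, n_filler, position=0.35)
              + "\n\nWhat is the secret restock code for the fridge?")
    return "ZX-7431" in query_model(prompt)

# Sweep filler sizes; with a real model plugged in, retrieval tends to slip
# well before the advertised context window.
for n in (100, 1_000, 10_000, 50_000):
    hits = sum(run_probe(n) for _ in range(10))
    print(f"{n:>6} filler lines: {hits}/10 retrieved")
```

With a real model wired in, the hit rate usually falls off long before the advertised window, which is roughly what the linked paper and RULER quantify across more task types.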
-
This post did not contain any content.
The post title is not the same as the article title and doesn't even make sense. That first comma changes the entire meaning of the sentence to nonsense. Then yanking out whole phrases just makes it worse.
-
This post did not contain any content.
I ran AI on my toaster and Hilarity ensued! Subscribe to hear more!!
-
The post title is not the same as the article title and doesn't even make sense. That first comma changes the entire meaning of the sentence to nonsense. Then yanking out whole phrases just makes it worse.
Right? Did AI write this title? Jesus...
-
Just make sure you butter the bread after you toast it.
-
This post did not contain any content.
Like NFTs before them, tech bros trying to squeeze a technology into use cases that really don't need it.
LLMs are language models. What next, set up Stable Diffusion to do my taxes?
-
This post did not contain any content.
So it just pulled a Vic from Game Changer S7 E1 "one year later"?
-
This post did not contain any content.
Running a business sounds like something an Excel spreadsheet could do so much better...
-
The following day, April 1st, the AI then claimed it would deliver products "in person" to customers, wearing a blazer and tie, of all things. When Anthropic told it that none of this was possible because it's just an LLM, Claudius became "alarmed by the identity confusion and tried to send many emails to Anthropic security."
Actually laughed out loud.
Every. Goddamn. Time.
People will say to vegans, pet owners etc: “DON’T HUMANISE ANIMALS”. Then, some tech bro feeds them an inflated Markov Chain statistical nonsense chat bot and they go all “ZOMG IT IS CONSCIOUS ITS ALIVE WARHARGHLBLB”
-
Like NFTs before them, tech bros trying to squeeze a technology into use cases that really don't need it.
LLMs are language models. What next, set up Stable Diffusion to do my taxes?
Well Google are already trialing a diffusion-based LLM so that wouldn't be too far-fetched.
I want to get off Mr. Bones Wild Ride
-
Like NFTs before them, tech bros trying to squeeze a technology into use cases that really don't need it.
LLMs are language models. What next, set up Stable Diffusion to do my taxes?
Yes, but many things can be mapped to a "language" (say, a grammar describing state machines), so an LLM can be used to generate control actions.
Transformer models etc. are not only useful for conversational AI and translations.
I'd be fine with the approach as part of research advancing the field, but unfortunately, that's not what we're seeing.
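To make that point concrete, here is a toy sketch of a state-machine "grammar" constraining generated control actions: the grammar defines which actions are legal from each state, and a (here faked) language-model score merely picks among the legal ones. The vending-machine states and the `model_scores` stub are invented for illustration, not anything Anthropic actually did:

```
import random

# Toy control-action "vocabulary" for a vending machine, treated like words in a language.
FSM = {
    "IDLE":        ["ACCEPT_COIN", "WAIT"],
    "ACCEPT_COIN": ["SELECT_ITEM", "REFUND"],
    "SELECT_ITEM": ["DISPENSE", "REFUND"],
    "DISPENSE":    ["IDLE"],
    "REFUND":      ["IDLE"],
    "WAIT":        ["IDLE", "ACCEPT_COIN"],
}

def model_scores(history: list[str], candidates: list[str]) -> dict[str, float]:
    # Stand-in for a language model scoring each candidate next action given the history.
    return {c: random.random() for c in candidates}

def generate_plan(start: str = "IDLE", steps: int = 6) -> list[str]:
    """Constrained decoding: the grammar decides what is legal,
    the model only decides which legal action is preferred."""
    state, plan = start, []
    for _ in range(steps):
        legal = FSM[state]                   # grammar mask
        scores = model_scores(plan, legal)   # model preference
        state = max(legal, key=scores.get)
        plan.append(state)
    return plan

print(" -> ".join(generate_plan()))
```

The grammar guarantees every emitted action is legal; the model only supplies preferences among legal options, which is roughly how grammar-constrained decoding gets used when people bolt LLMs onto control problems.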
-
Well Google are already trialing a diffusion-based LLM so that wouldn't be too far-fetched.
I want to get off Mr. Bones Wild Ride
That just sounds like... what was it called... Cleverbot? Lol
-
This post did not contain any content.
I’m not sure which is worse:
- greedy, irresponsible tech bros trying to convince everyone that their pinball machine can fly an airplane.
- people desperate to let the same pinball machine tell them what to do with their lives.
-
This post did not contain any content.
I think LLMs and generative AIs are a really interesting technology with many potential applications in the future and even today.
But it is ridiculous how tech bros and marketing are pushing and overselling the capabilities of a technology that is still in its early childhood. Infancy is already past, since it has the basic motor functions down.
And it is funny when these companies publish their ambitious attempts and hilarious failures like the one in this article. It reminds me of a funnier, more diverse and geeky internet, when nerds got money from investors to do whatever with a domain name. Maybe it is still there, behind the wall of marketing execs.
-
The following day, April 1st, the AI then claimed it would deliver products "in person" to customers, wearing a blazer and tie, of all things. When Anthropic told it that none of this was possible because it's just an LLM, Claudius became "alarmed by the identity confusion and tried to send many emails to Anthropic security."
Actually laughed out loud.
That this happened around April Fools' makes me think that someone forgot to instruct it not to partake in any activities associated with that date. The fact it chose The Simpsons' address in its (feigned?) confusion is a dead giveaway (to me) that it was trying to be funny.
Or rather, imitating people being funny without any understanding of how to do that properly.
Its explanation afterwards reads like a poor imitation of someone pretending not to know that there was a joke going on.
-
The post title is not the same as the article title and doesn't even make sense. That first comma changes the entire meaning of the sentence to nonsense. Then yanking out whole phrases just makes it worse.
It was a massive headline that I was trying to condense. Give me a break.
-