Anthropic tested the ability of Claude (LLM, AI chatbot) to manage a physical “storefront”, with mixed results: the AI struggled with pricing strategy and inventory management
-
Project Vend: Can Claude run a small shop? (And why does that matter?)
We let Claude run a small shop in the Anthropic office. Here's what happened.
(www.anthropic.com)
-
Anybody who thought the answer could have been even remotely close to Yes is delusional.
-
I doubt anyone expected it to work completely, but it is interesting to see to what extent it worked and how it failed (hallucinations and sycophancy).
-
True; I just hate headlines that ask stupid questions.
But then again, attempts like this always start from the premise that it could work, which annoys me no less.
-
This shit needs to start being regulated.
-
It is an interesting article, even if its conclusions are entirely too rosy. The "storefront" was a single vending machine, and the bot was instructed to interact with Anthropic employees (with an hourly cost attached) to do all physical interactions. While the bot did a decent job managing the stock most of the time, it made a lot of bad decisions based on trying to be too helpful to its customers. It also frequently hallucinated, with some hilarious results I won't spoil here. But as anyone who owns a small business knows, one bad decision could put it under, so saying that an AI can manage a vending machine well "most of the time" is equivalent to saying it can't do the job at all.
Their conclusion is that with a bit more work, Claude might be able to perform as a middle-manager. To me, that says more about how useless middle-management is than how capable their AI is.
-
So what you are saying is the AI is ready to replace tech CEOs.