linux-nerds.org

OpenAI just launched its new ChatGPT Agent that can make as many as 1 complicated cupcake order per hour, but even Sam Altman says you probably shouldn't trust it for 'high-stakes uses'

Technology

56 posts, 34 commenters, 332 views

W wise_pancake@lemmy.ca

I use agents a lot and have written several MCP servers now. The tasks I automate aren't things like ordering cupcakes; it's mainly the glue between complex things.

I still can't get Claude to nicely open a JIRA ticket for me, but I can get it to read through a sequence of connected documents and filter that into.

I don't think agents are ready for the main event and these are some poor examples of their power.

I'm not saying they won't improve, but using the right tool for the right job is critical. An hour to order cupcakes is silly even for an LLM.
evotech@lemmy.world wrote:

#33

These are examples for the common guy on the street who doesn't know what an MCP server is.

3
T tonytins@pawb.social

OpenAI launched ChatGPT Agent on Thursday, its latest effort in the industry-wide pursuit to turn AI into a profitable enterprise—not just one that eats investors' billions. In its announcement blog, OpenAI says its Agent "can now do work for you using its own computer," but CEO Sam Altman warns that the rollout presents unpredictable risks.

[...]

OpenAI research lead Lisa Fulford told Wired that she used Agent to order "a lot of cupcakes," which took the tool about an hour, because she was very specific about the cupcakes.
nthavoc@lemmy.today wrote:

#34

That's quite a bold statement to make since he now has US military contracts. What, is he making cupcakes for the Pentagon?

10
Z zexks@lemmy.world

No you can't if you don't know the libraries. Python is entirely dependent on what libraries you include. If you don't know what you need you can't do shit.
theunknownmuncher@lemmy.world wrote (last edited by theunknownmuncher@lemmy.world):

#35

No you can't if you don't know the libraries

IDE.

Python is entirely dependent on what libraries you include

??

If you don't know what you need you can't do shit.

IDE.

The problems you propose in your comment are not only greatly exaggerated but have already been solved for decades using conventional tools AND apply to literally all languages, having nothing at all to do with Python. Good try! My statement holds true.

Maybe your assumption is that you're in a cave writing code in pencil on paper, but that's not a typical working condition. If you have access to Claude to use as a crutch, then you have access to search for an available python library and read some "Getting Started" paragraphs.

Seriously, if the only real value that AI provides is "you don't need to know the libraries you're using" that's not quite as strong of an argument as you think it is lmaooo "knowing the libraries" isn't exactly an existing challenge or software engineering problem that people struggle with...

2
C cosmonova@lemmy.world

So much for the internet. We somehow managed to turn one of humanity’s greatest achievements into a hateful echo chamber we use for warfare first and then into a blackbox where inefficient AI agents communicate with each other in the most inefficient way so the planet can cook us alive even faster. God forbid just calling up a bakery to order some cupcakes.
webp@mander.xyz wrote:

#36

Companies will dump billions into AI to fuck everyone over but the transition to clean energy is always too expensive.

18
T theunknownmuncher@lemmy.world

Kagi is all in on AI. It's the AI slop version of a search ranking algorithm.
ebolapie@lemmy.world wrote:

#37

Kagi has AI tools but they don't shove them down your throat. I don't understand what "all in on AI" means in this context. The company has said that they want to use AI like they use JavaScript, i.e. as a tool, but their product should work well without it.

0
E eyekaytee@aussie.zone

for coding you want to use claude

if you don’t want to pay for claude after so many messages what you can do is use mistral to code it up then use claude to proof check the code
wazowski@lemmy.world wrote:

#38

Thanks, will try it some time.

3
rizzrustbolt@lemmy.world wrote:

#39

So now they're raiding We Bare Bears for ideas?

2
magicshel@lemmy.zip wrote (last edited by magicshel@lemmy.zip):

#40

It sounds like you are a much better developer than me, but to be fair I've had to teach myself everything using nothing but books and Google for thirty years. I've rarely had the luxury of working with someone who had the knowledge to mentor me, and never got a degree outside an AAS in electronics, so I've probably missed some critical skills along the way.

In a lot of ways, the AI fills that role because it's better at answering questions than it is writing code. Earlier today it was explaining to me how a DOM selector could return a stale element in some cases in a failing end to end test. It took a few back and forths with some code examples before I really understood why the selectors might not be working.

It also suggested some code changes that I had to push back on because, even though the code had errors, the errors weren't causing the problem. While building an array of validators I had awaited them, causing them to run serially instead of in parallel during Promise.all(). So you definitely have to know what you're doing to avoid having the AI waste your time (or at least more time than it takes to push back).
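
For anyone following along, here is a rough sketch of the serial-versus-parallel issue described above. It is not the actual test code: `makeValidator`, `runSerial`, and `runParallel` are made-up names, and the timers only stand in for real async validators.

```typescript
// Hypothetical stand-in for a real async validator; not from the original code.
type Validator = () => Promise<boolean>;

// Builds a fake validator that records when it starts and ends in `log`.
function makeValidator(name: string, ms: number, log: string[]): Validator {
  return async () => {
    log.push(`start ${name}`);
    await new Promise<void>((resolve) => setTimeout(resolve, ms));
    log.push(`end ${name}`);
    return true;
  };
}

// The mistake: awaiting each validator while building the array.
// By the time Promise.all() sees the array, everything has already run one by one.
async function runSerial(validators: Validator[]): Promise<boolean[]> {
  const results: boolean[] = [];
  for (const validate of validators) {
    results.push(await validate()); // next validator cannot start until this one ends
  }
  return Promise.all(results); // nothing left to wait for here
}

// The fix: collect the pending promises first, then await them together.
async function runParallel(validators: Validator[]): Promise<boolean[]> {
  const pending = validators.map((validate) => validate()); // all start now
  return Promise.all(pending);
}
```

With a 20 ms validator `a` and a 5 ms validator `b`, the serial version logs start a, end a, start b, end b, while the parallel version starts both before either one ends. Both return the same results, which is why the awaited version is a performance bug and not necessarily the cause of a failing test.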

I'm still trying to debug it, but without the AI, I'd be googling the fuck out of TypeScript syntax, JavaScript idiosyncrasies, and a whole testing framework I've never seen before.

So...

if the only real value that AI provides is "you don't need to know the libraries you're using"

...returns false.

1
B brsrklf@jlai.lu

Did the AI give you a starting point that would be very different from a bit of code someone submitted 10 years ago on Stack Exchange? Because in my experience, everything has already been asked and answered. This includes the most basic and naive stuff, and often I am very grateful for it, because, yeah, sometimes I need someone to guide me through the most basic stuff.

In fact, the AI needed that exact knowledge base and a bunch more to exist in the first place. It's just vaguely competent at retrieving it.

Anyway, I didn't say I had no experience, just the most minimal Python experience. There are definitely a few quirks I had to learn (the data structures mostly), but the rest is mostly finding the right method in the reference library, like you would in Java.
magicshel@lemmy.zip wrote:

#41

Logically, you would be right. My practical experience is that I waste a lot less time googling multiple explanations of something because one by itself isn't helping me figure it out, writing bugged PoC test code and thinking something is broken, sorting through a bunch of things that haven't been relevant for 3 versions, etc.

Of course the AI is trained on the same material we can all find and read, but it does it orders of magnitude more quickly. The trade-off is that it's not always right, but neither am I, and neither are most sources on the internet right in all circumstances. But it's so fast and easy that I can iterate and evolve designs and understanding much more quickly than I could on my own.

1
somerandomperson@lemmy.dbzer0.com wrote:

#42

Explain.

1
kupi@sh.itjust.works wrote:

#43

Reference

0
angryrobot@lemmy.world wrote:

#44

Grok has the Pentagon contract. Does OpenAI also have one?

1
nthavoc@lemmy.today wrote:

#45

Microsoft's AI, which is OpenAI, is approved for defense contracts: https://www.cnbc.com/2025/06/16/openai-wins-200-million-us-defense-contract.html It even has an ominous project name, posted to a public site, which I cannot seem to recall at the moment.

2
zexks@lemmy.world wrote:

#46

In a cave with pen and paper is nearly what I learned with. I learned with the runtime, MSDN, Notepad, and the command line. And yes, you do end up in many situations where you simply don't have or can't use a full-on IDE every time. Sounds like you've never really left your comfort zone and stuck your neck out in some tech you don't understand quite yet. Or worked in areas under strict software controls.

0
theunknownmuncher@lemmy.world wrote:

#47

It's telling that you're focused on personal assumptions instead of addressing the argument

0
G gnulinuxdude@lemmy.ml

CEO Sam Altman warns that the rollout presents unpredictable risks.

But that doesn't prevent his profit motive from consuming untold amounts of electricity to shove this into your face. They know what they're doing. They know their product is used primarily to generate spam, and secondarily is designed to form addictive faux-relationships with their users.

Burn in hell. Actually, given the direction this is all going, we will all be burning in hell within generations.
timbuck2themoon@sh.itjust.works wrote:

#48

And produced with a shit ton of copyright violations, etc. Just about everything is immoral about it.

1
bigbabybilly@lemmy.world wrote:

#49

What’s more high stakes than a complicated cupcake order?

4
javiwhite@feddit.uk wrote:

#50

An order for a weed smoking cow.

4
eyekaytee@aussie.zone wrote:

#51

Yes, in the Wired article one of them says they would like to use an agent replay feature to find out where it got stuck for an hour.

0
reksas@sopuli.xyz wrote:

#52

It's easier to rule a world that is in ruins than a thriving one. They know they have to live on the same planet as us, yet they still don't seem to care if it's going to shit. While so many rich people are dumb as bricks and don't deserve their wealth at all, there are also many who actually know what they are doing, yet they still don't want to seriously work towards stopping climate change, even though it wouldn't even reduce their wealth by that much in comparison.

So the only reasoning I can think of is that they want more complete control over everything, but they can't have it because the world is too complicated and healthy. When civilizations start to fall, the rich will still have everything, and with that they can start enforcing themselves on everyone.

I don't have anything to base this on; it's just my thought on the matter. It just feels like something a billionaire would do. They demonstrate every day that they will not be content with anything and will not care about other people's suffering to get it.

2
