linux-nerds.org

OpenAI just launched its new ChatGPT Agent that can make as many as 1 complicated cupcake order per hour, but even Sam Altman says you probably shouldn't trust it for 'high-stakes uses'

Technology

56 posts, 34 commenters, 227 views

W wise_pancake@lemmy.ca

Okay but that’s not what easier means.

Easier would be calling the bakery or spending 10 minutes browsing their website, adding to cart, and checking out.

I don’t want to spend an hour on tasks that would normally take 10 minutes. My executive dysfunctions already make me good at doing that.

This might be a revolutionary idea, but what if they helped me do tasks that take an hour in 10 minutes?

I’m just putting that idea out there totally for free in case any AI companies want to jump on that opportunity.
evotech@lemmy.world wrote (#19):

It’s a starting point

3
B brsrklf@jlai.lu

I needed about 30 minutes to do a python application from scratch that took linear JSON data files, merged them and presented them as a tree in a GUI.

Before that I had barely done anything in python; I could basically write a function declaration with a simple operation and nothing else. I didn't even have much experience with UI at all.

But like you I had experience with java and such, and those skills transfer. All it took was searching basic syntax/related code examples and required library imports. And I mean basic, search engine search, not AI answers.

All I'm saying is, I really don't think AI is providing anything a lot more efficient than doing a good old crawl through API docs and stack overflow. So the fact it's using tremendous amounts of resources to maybe achieve a 10% efficiency boost is bothering me a lot.
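For readers wondering what a program like that involves, here is a minimal sketch of the merge-and-nest step. It is not the poster's code: the `id`, `parent_id` and `name` fields, the function names, and the "later file wins" merge rule are all assumptions made for illustration. The GUI part is left out; the nested result could be fed to something like tkinter's `ttk.Treeview`, but here it is only rendered as indented text.

```python
import json
from collections import defaultdict


def merge_records(*json_texts):
    """Merge several flat JSON arrays into one dict keyed by record id.

    Assumption: each record has an "id"; fields from later inputs
    overwrite fields from earlier ones.
    """
    merged = {}
    for text in json_texts:
        for record in json.loads(text):
            merged.setdefault(record["id"], {}).update(record)
    return merged


def build_tree(records):
    """Nest flat records using a hypothetical "parent_id" field.

    Records without a parent_id (or with null) become roots. Records
    pointing at a parent that does not exist are silently left out.
    """
    children = defaultdict(list)
    for rec in records.values():
        children[rec.get("parent_id")].append(rec["id"])

    def node(rec_id):
        return {
            "id": rec_id,
            "name": records[rec_id].get("name", ""),
            "children": [node(child) for child in sorted(children[rec_id])],
        }

    return [node(root_id) for root_id in sorted(children[None])]


def render(nodes, depth=0):
    """Return the tree as a list of indented lines, two spaces per level."""
    lines = []
    for n in nodes:
        lines.append("  " * depth + n["name"])
        lines.extend(render(n["children"], depth + 1))
    return lines


if __name__ == "__main__":
    file_a = '[{"id": 1, "parent_id": null, "name": "root"}, {"id": 2, "parent_id": 1, "name": "child"}]'
    file_b = '[{"id": 3, "parent_id": 2, "name": "leaf"}]'
    print("\n".join(render(build_tree(merge_records(file_a, file_b)))))
```

Most of the work is the two dictionary passes, which is roughly what a search through the `json` module docs and a couple of examples would turn up.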
magicshel@lemmy.zip wrote (#20, last edited by magicshel@lemmy.zip):

If that was a 10% boost for you and you could've done it in 33 minutes without AI or experience, then my imposter syndrome has been right all along!

I'd bet that would've taken me a few days and maybe buying a reference book and starting with hello world.

3
C cosmonova@lemmy.world

So much for the internet. We somehow managed to turn one of humanity’s greatest achievements into a hateful echo chamber we use for warfare first and then into a blackbox where inefficient AI agents communicate with each other in the most inefficient way so the planet can cook us alive even faster. God forbid just calling up a bakery to order some cupcakes.
emi@ani.social wrote (#21):

Or just sending an email.

6
M magicshel@lemmy.zip

If that was a 10% boost for you and you could've done it in 33 minutes without AI or experience, then my imposter syndrome has been right all along!

I'd bet that would've taken me a few days and maybe buying a reference book and starting with hello world.
brsrklf@jlai.lu wrote (#22):

Did the AI give you a starting point that would be very different from a bit of code someone submitted 10 years ago on stack exchange? Because in my experience, everything has already been asked and answered. This includes the most basic and naive stuff, and often I am very grateful for it, because, yeah, sometimes I need someone to guide me through the most basic stuff.

In fact, the AI needed that exact knowledge base and a bunch more to exist in the first place. It's just vaguely competent at retrieving it.

Anyway, I didn't say I had no experience, just the most minimal python experience. There are definitely a few quirks I had to learn (the data structures mostly), but the rest is mostly finding the right method in the reference library, like you would in java.

3
T tonytins@pawb.social

OpenAI launched ChatGPT Agent on Thursday, its latest effort in the industry-wide pursuit to turn AI into a profitable enterprise—not just one that eats investors' billions. In its announcement blog, OpenAI says its Agent "can now do work for you using its own computer," but CEO Sam Altman warns that the rollout presents unpredictable risks.

[...]

OpenAI research lead Isa Fulford told Wired that she used Agent to order "a lot of cupcakes," which took the tool about an hour, because she was very specific about the cupcakes.
romantired@shibanu.app wrote (#23):

I need an agent who would set up DevOps for me. Then robots would definitely be the ones working hard, not humans.

3
W wise_pancake@lemmy.ca

Okay but that’s not what easier means.

Easier would be calling the bakery or spending 10 minutes browsing their website, adding to cart, and checking out.

I don’t want to spend an hour on tasks that would normally take 10 minutes. My executive dysfunctions already make me good at doing that.

This might be a revolutionary idea, but what if they helped me do tasks that take an hour in 10 minutes?

I’m just putting that idea out there totally for free in case any AI companies want to jump on that opportunity.
eyekaytee@aussie.zone wrote (#24):

I don’t want to spend an hour on tasks that would normally take 10 minutes.

I don't get it, do you think she spent an hour talking to ChatGPT to try and get it to order doughnuts?

3
P paraphrand@lemmy.world

It really is a nightmare brewing. And they will hide behind excuses and keep it all opaque unless they are strongly regulated.
opavader@lemmy.world wrote (#25, last edited by opavader@lemmy.world):

regulated by who?
our senate and congress are filled with pimps who work for pedophiles like epstein and cheer genocider scum murdering children on a daily basis. this includes the “lesser evil” party. they had 4 years to release the pedo list or even try to slow down the genocide.
they are not gonna give a fck about us working 3 jobs just to pay rent and live on prison food.

sad reality is that after a certain threshold in a parasite-host dynamic, there is no ending other than the host dying, because the parasites have grown too big for it to feed. so unless another deadly parasite like cia or kgb luigi the 1%, the remaining 99% are dead.

2
E evotech@lemmy.world

It’s a starting point
wise_pancake@lemmy.ca wrote (#26):

I use agents a lot and have written several MCP servers now, the tasks I automate aren't things like order cupcakes, it's mainly the glue between complex things.

I still can't get Claude to nicely open a JIRA ticket for me, but I can get it to read through a sequence of connected documents and filter that into.

I don't think agents are ready for the main event and these are some poor examples of their power.

I'm not saying they won't improve, but using the right tool for the right job is critical. An hour to order cupcakes is silly even for an llm.

3
W wazowski@lemmy.world

I spent maybe 90 minutes trying to get ChatGPT to write me a fucking AppleScript or bash to copy all calendar events from a source calendar to a destination. That shit does not work.
eyekaytee@aussie.zone wrote (#27):

for coding you want to use claude

if you don’t want to pay for claude after so many messages what you can do is use mistral to code it up then use claude to proof check the code

3
O opavader@lemmy.world

unfortunately any ai service is going to make things worse.
right now we can discover and choose. with search and browsing dead, ai provider will shove the product giving them the highest cut aka most garbage or snake oil products.

even today targeted advertising for poor people is filled with betting, lottery & poker game. similarly elder people are primarily shown ads of miracle cure for chronic illness and scammy religious crap.

edit: switch to kagi. it's paid but well worth it.
SearXNG is also a good alternative if you have time to host it yourself.
theunknownmuncher@lemmy.world wrote (#28):

Kagi is all in on AI. It's the AI slop version of a search ranking algorithm

1
M magicshel@lemmy.zip

It won't do that well. What you have to do is ask it to help you leverage your existing development skills in an unfamiliar domain. I used it to help me write a python program to authenticate, pull and filter data from a GCP firestore database and create an XLSX with summary and detail sheets.

I've never used Python before in my life. It took me about 4 hours. Of course I've been doing that sort of thing in Java for many years. Turned out I wrote that faster in Python than I could in Java. Configuring the connection to that database in Python was so simple compared to Java.

The stuff it wrote was sometimes incomplete or wrong in subtle ways, but I could see the bits that didn't make sense which helped me focus on those things and ask better questions to help me figure it out. I think the last hour was just me tweaking stuff by myself because I didn't need help with it by that point.
theunknownmuncher@lemmy.world wrote (#29):

Anyone who already knows another programming language but has never used python in their life can write a simple python app quickly, regardless

1
B brsrklf@jlai.lu

I needed about 30 minutes to do a python application from scratch that took linear JSON data files, merged them and presented them as a tree in a GUI.

Before that I had barely done anything in python; I could basically write a function declaration with a simple operation and nothing else. I didn't even have much experience with UI at all.

But like you I had experience with java and such, and those skills transfer. All it took was searching basic syntax/related code examples and required library imports. And I mean basic, search engine search, not AI answers.

All I'm saying is, I really don't think AI is providing anything a lot more efficient than doing a good old crawl through API docs and stack overflow. So the fact it's using tremendous amounts of resources to maybe achieve a 10% efficiency boost is bothering me a lot.
adespoton@lemmy.ca wrote (#30):
There’s also the fact that
1. It’s only really good at this if you want it to generate Python, PowerShell, bash, or C++ code. Try any other language and it quickly assumes you’re using outdated and often incompatible libraries or doesn’t really understand how the language functions.
2. at the end of it all, neither you nor the AI has learned anything new; you’ll have to put in the exact same amount of work the next time. If you do it yourself, then over time that 10% advantage goes away.
Now, these things could both change over time, but humans are much more efficient to train than current state of the art probability sieves we call GenAI.

2
T theunknownmuncher@lemmy.world

Anyone who already knows another programming language but has never used python in their life can write a simple python app quickly, regardless
zexks@lemmy.world wrote (#31):

No you can't if you don't know the libraries. Python is entirely dependent on what libraries you include. If you don't know what you need you can't do shit.

2
A adespoton@lemmy.ca
There’s also the fact that
1. It’s only really good at this if you want it to generate Python, PowerShell, bash, or C++ code. Try any other language and it quickly assumes you’re using outdated and often incompatible libraries or doesn’t really understand how the language functions.
2. at the end of it all, neither you nor the AI has learned anything new; you’ll have to put in the exact same amount of work the next time. If you do it yourself, then over time that 10% advantage goes away.
Now, these things could both change over time, but humans are much more efficient to train than current state of the art probability sieves we call GenAI.
zexks@lemmy.world wrote (#32):

It's only assuming if you aren't specific enough. And you do know their training data is usually a year or two or three old. So they don't know about whatever new shit you're trying to work with.

0
W wise_pancake@lemmy.ca

I use agents a lot and have written several MCP servers now, the tasks I automate aren't things like order cupcakes, it's mainly the glue between complex things.

I still can't get Claude to nicely open a JIRA ticket for me, but I can get it to read through a sequence of connected documents and filter that into.

I don't think agents are ready for the main event and these are some poor examples of their power.

I'm not saying they won't improve, but using the right tool for the right job is critical. An hour to order cupcakes is silly even for an llm.
evotech@lemmy.world wrote (#33):

They're examples for the common guy in the street who doesn't know what an MCP server is.

3
T tonytins@pawb.social

OpenAI launched ChatGPT Agent on Thursday, its latest effort in the industry-wide pursuit to turn AI into a profitable enterprise—not just one that eats investors' billions. In its announcement blog, OpenAI says its Agent "can now do work for you using its own computer," but CEO Sam Altman warns that the rollout presents unpredictable risks.

[...]

OpenAI research lead Isa Fulford told Wired that she used Agent to order "a lot of cupcakes," which took the tool about an hour, because she was very specific about the cupcakes.
nthavoc@lemmy.today wrote (#34):

That's quite a bold statement to make since he now has US military contracts. What, is he making cupcakes for the Pentagon?

10
Z zexks@lemmy.world

No you can't if you don't know the libraries. Python is entirely dependent on what libraries you include. If you don't know what you need you can't do shit.
theunknownmuncher@lemmy.world wrote (#35, last edited by theunknownmuncher@lemmy.world):

No you can't if you don't know the libraries

IDE.

Python is entirely dependent on what libraries you include

??

If you don't know what you need you can't do shit.

IDE.

The problems you propose in your comment are not only greatly exaggerated but have already been solved for decades using conventional tools AND apply to literally all languages, having nothing at all to do with python. Good try! My statement holds true.

Maybe your assumption is that you're in a cave writing code in pencil on paper, but that's not a typical working condition. If you have access to Claude to use as a crutch, then you have access to search for an available python library and read some "Getting Started" paragraphs.

Seriously, if the only real value that AI provides is "you don't need to know the libraries you're using" that's not quite as strong of an argument as you think it is lmaooo "knowing the libraries" isn't exactly an existing challenge or software engineering problem that people struggle with...

2
C cosmonova@lemmy.world

So much for the internet. We somehow managed to turn one of humanity’s greatest achievements into a hateful echo chamber we use for warfare first and then into a blackbox where inefficient AI agents communicate with each other in the most inefficient way so the planet can cook us alive even faster. God forbid just calling up a bakery to order some cupcakes.
webp@mander.xyz wrote (#36):

Companies will dump billions into AI to fuck everyone over but the transition to clean energy is always too expensive.

18
T theunknownmuncher@lemmy.world

Kagi is all in on AI. It's the AI slop version of a search ranking algorithm
ebolapie@lemmy.world wrote (#37):

Kagi has AI tools but they don't shove them down your throat. I don't understand what "all in on AI" means in this context. The company has said that they want to use AI like they use JavaScript, i.e. they want to use it as a tool but their product should work well without it.

0
E eyekaytee@aussie.zone

for coding you want to use claude

if you don’t want to pay for claude after so many messages what you can do is use mistral to code it up then use claude to proof check the code
wazowski@lemmy.world wrote (#38):

tx, will try it some time.

3
