OpenAI just launched its new ChatGPT Agent that can make as many as 1 complicated cupcake order per hour, but even Sam Altman says you probably shouldn't trust it for 'high-stakes uses'

  • It won't do that well. What you have to do is ask it to help you leverage your existing development skills in an unfamiliar domain. I used it to help me write a python program to authenticate, pull and filter data from a GCP firestore database and create an XLSX with summary and detail sheets.

    I've never used Python before in my life. It took me about 4 hours. Of course I've been doing that sort of thing in Java for many years. Turned out I wrote that faster in Python than I could in Java. Configuring the connection to that database in Python was so simple compared to Java.

    The stuff it wrote was sometimes incomplete or wrong in subtle ways, but I could see the bits that didn't make sense which helped me focus on those things and ask better questions to help me figure it out. I think the last hour was just me tweaking stuff by myself because I didn't need help with it by that point.

    I needed about 30 minutes to do a python application from scratch that took linear JSON data files, merged them and presented them as a tree in a GUI.

    Before that I had barely done anything in python; basically I could do a basic function declaration with a simple operation and nothing else. I didn't even have a lot of experience with UI at all.

    But like you I had experience with java and such, and those skills transfer. All it took was searching basic syntax/related code examples and required library imports. And I mean basic, search engine search, not AI answers.

    All I'm saying is, I really don't think AI is providing anything a lot more efficient than doing a good old crawl through API docs and stack overflow. So the fact it's using tremendous amounts of resources to maybe achieve a 10% efficiency boost is bothering me a lot.

  • OpenAI launched ChatGPT Agent on Thursday, its latest effort in the industry-wide pursuit to turn AI into a profitable enterprise—not just one that eats investors' billions. In its announcement blog, OpenAI says its Agent "can now do work for you using its own computer," but CEO Sam Altman warns that the rollout presents unpredictable risks.

    [...]

    OpenAI research lead Lisa Fulford told Wired that she used Agent to order "a lot of cupcakes," which took the tool about an hour, because she was very specific about the cupcakes.

    So much for the internet. We somehow managed to turn one of humanity’s greatest achievements into a hateful echo chamber we use for warfare first and then into a blackbox where inefficient AI agents communicate with each other in the most inefficient way so the planet can cook us alive even faster. God forbid just calling up a bakery to order some cupcakes.

  • I'm still wondering. Like did it call up a bakery and place an order? Or go online? I know it didn't actually make the cupcakes itself.

    But I'm not sure that spending an hour trying to wrangle ChatGPT into getting your cupcakes is any faster or easier than placing the order yourself.

    The article also noticeably omits what happened after. Were the cupcakes made, and did they match what she wanted?

    The AI willed those cupcakes into existence, why don't you trust them?

    It's like the metaverse and NFT, you're not supposed to think about how it works. Instead you just need to believe reality will magically reorganize to make it work.

  • Man, remember all the custom cupcake bakers who were clamoring for an AI to take their craft?

    Me neither. Billionaires are a scourge upon society.

    It's just ordering them, not making them yet.

  • Okay but that’s not what easier means.

    Easier would be to call the bakery or spend 10 minutes browsing their website, adding to cart, and checking out.

    I don’t want to spend an hour on tasks that would normally take 10 minutes. My executive dysfunctions already make me good at doing that.

    This might be a revolutionary idea, but what if they helped me do tasks that take an hour in 10 minutes?

    I’m just putting that idea out there totally for free in case any AI companies want to jump on that opportunity.

    It’s a starting point

  • I needed about 30 minutes to do a python application from scratch that took linear JSON data files, merged them and presented them as a tree in a GUI.

    Before that I had barely done anything in python; basically I could do a basic function declaration with a simple operation and nothing else. I didn't even have a lot of experience with UI at all.

    But like you I had experience with java and such, and those skills transfer. All it took was searching basic syntax/related code examples and required library imports. And I mean basic, search engine search, not AI answers.

    All I'm saying is, I really don't think AI is providing anything a lot more efficient than doing a good old crawl through API docs and stack overflow. So the fact it's using tremendous amounts of resources to maybe achieve a 10% efficiency boost is bothering me a lot.

    If that was a 10% boost for you and you could've done it in 33 minutes without AI or experience, then my imposter syndrome has been right all along!

    I'd bet that would've taken me a few days and maybe buying a reference book and starting with hello world.

  • So much for the internet. We somehow managed to turn one of humanity’s greatest achievements into a hateful echo chamber we use for warfare first and then into a blackbox where inefficient AI agents communicate with each other in the most inefficient way so the planet can cook us alive even faster. God forbid just calling up a bakery to order some cupcakes.

    Or just sending an email.

  • If that was a 10% boost for you and you could've done it in 33 minutes without AI or experience, then my imposter syndrome has been right all along!

    I'd bet that would've taken me a few days and maybe buying a reference book and starting with hello world.

    Did the AI give you a starting point that would be very different from a bit of code someone submitted 10 years ago on stack exchange? Because in my experience, everything has already been asked and answered. This includes the most basic and naive stuff, and often I am very grateful for it, because, yeah, sometimes I need someone to guide me through the most basic stuff.

    In fact, the AI needed that exact knowledge base and a bunch more to exist in the first place. It's just vaguely competent at retrieving it.

    Anyway, I didn't say I had no experience, just the most minimal python experience. There are definitely a few quirks I had to learn (the data structures mostly), but the rest is mostly finding the right method in the reference library, like you would in java.

  • OpenAI launched ChatGPT Agent on Thursday, its latest effort in the industry-wide pursuit to turn AI into a profitable enterprise—not just one that eats investors' billions. In its announcement blog, OpenAI says its Agent "can now do work for you using its own computer," but CEO Sam Altman warns that the rollout presents unpredictable risks.

    [...]

    OpenAI research lead Lisa Fulford told Wired that she used Agent to order "a lot of cupcakes," which took the tool about an hour, because she was very specific about the cupcakes.

    I need an agent who would set up DevOps for me. Then robots would definitely be the ones working hard, not humans.

  • Okay but that’s not what easier means.

    Easier would be to call the bakery or spend 10 minutes browsing their website, adding to cart, and checking out.

    I don’t want to spend an hour on tasks that would normally take 10 minutes. My executive dysfunctions already make me good at doing that.

    This might be a revolutionary idea, but what if they helped me do tasks that take an hour in 10 minutes?

    I’m just putting that idea out there totally for free in case any AI companies want to jump on that opportunity.

    I don’t want to spend an hour on tasks that would normally take 10 minutes.

    I don't get it, do you think she spent an hour talking to ChatGPT to try and get it to order doughnuts?

  • It really is a nightmare brewing. And they will hide behind excuses and keep it all opaque unless they are strongly regulated.

    Regulated by who?
    Our Senate and Congress are filled with pimps who work for pedophiles like Epstein and cheer on genocidal scum murdering children on a daily basis. This includes the “lesser evil” party. They had 4 years to release the pedo list or even try to slow down the genocide.
    They are not gonna give a fck about us working 3 jobs just to pay rent and live on prison food.

    The sad reality is that after a certain threshold in a parasite-host dynamic, there is no ending other than the host dying because the parasites have grown too big for it to feed. So unless another deadly parasite like the CIA or KGB luigi the 1%, the rest 99% are dead.

  • It’s a starting point

    I use agents a lot and have written several MCP servers now, the tasks I automate aren't things like order cupcakes, it's mainly the glue between complex things.

    I still can't get Claude to nicely open a JIRA ticket for me, but I can get it to read through a sequence of connected documents and filter that into.

    I don't think agents are ready for the main event and these are some poor examples of their power.

    I'm not saying they won't improve, but using the right tool for the right job is critical. An hour to order cupcakes is silly even for an llm.

  • I spent maybe 90 minutes trying to get ChatGPT to write me a fucking AppleScript or bash to copy all calendar events from a source calendar to a destination. That shit does not work.

    for coding you want to use claude

    if you don’t want to pay for claude after so many messages what you can do is use mistral to code it up then use claude to proof check the code

  • unfortunately any ai service is going to make things worse.
    right now we can discover and choose. with search and browsing dead, ai provider will shove the product giving them the highest cut aka most garbage or snake oil products.

    even today targeted advertising for poor people is filled with betting, lottery & poker games. similarly, elderly people are primarily shown ads for miracle cures for chronic illness and scammy religious crap.

    edit: switch to kagi. its paid but well worth it.
    SearXNG is also a good alternative if you have got time for hosting it yourself.

    Kagi is all in on AI. It's the AI slop version of a search ranking algorithm.

  • It won't do that well. What you have to do is ask it to help you leverage your existing development skills in an unfamiliar domain. I used it to help me write a python program to authenticate, pull and filter data from a GCP firestore database and create an XLSX with summary and detail sheets.

    I've never used Python before in my life. It took me about 4 hours. Of course I've been doing that sort of thing in Java for many years. Turned out I wrote that faster in Python than I could in Java. Configuring the connection to that database in Python was so simple compared to Java.

    The stuff it wrote was sometimes incomplete or wrong in subtle ways, but I could see the bits that didn't make sense which helped me focus on those things and ask better questions to help me figure it out. I think the last hour was just me tweaking stuff by myself because I didn't need help with it by that point.

    Anyone who already knows another programming language but has never used python in their life can write a simple python app quickly, regardless

  • I needed about 30 minutes to do a python application from scratch that took linear JSON data files, merged them and presented them as a tree in a GUI.

    Before that I had barely done anything in python; basically I could do a basic function declaration with a simple operation and nothing else. I didn't even have a lot of experience with UI at all.

    But like you I had experience with java and such, and those skills transfer. All it took was searching basic syntax/related code examples and required library imports. And I mean basic, search engine search, not AI answers.

    All I'm saying is, I really don't think AI is providing anything a lot more efficient than doing a good old crawl through API docs and stack overflow. So the fact it's using tremendous amounts of resources to maybe achieve a 10% efficiency boost is bothering me a lot.

    There’s also the fact that

    1. It’s only really good at this if you want it to generate Python, PowerShell, bash, or C++ code. Try any other language and it quickly assumes you’re using outdated and often incompatible libraries or doesn’t really understand how the language functions.
    2. at the end of it all, neither you nor the AI has learned anything new; you’ll have to put in the exact same amount of work the next time. If you do it yourself, then over time that 10% advantage goes away.

    Now, these things could both change over time, but humans are much more efficient to train than current state of the art probability sieves we call GenAI.

  • Anyone who already knows another programming language but has never used python in their life can write a simple python app quickly, regardless

    No you can't if you don't know the libraries. Python is entirely dependent on what libraries you include. If you don't know what you need you can't do shit.

  • There’s also the fact that

    1. It’s only really good at this if you want it to generate Python, PowerShell, bash, or C++ code. Try any other language and it quickly assumes you’re using outdated and often incompatible libraries or doesn’t really understand how the language functions.
    2. at the end of it all, neither you nor the AI has learned anything new; you’ll have to put in the exact same amount of work the next time. If you do it yourself, then over time that 10% advantage goes away.

    Now, these things could both change over time, but humans are much more efficient to train than current state of the art probability sieves we call GenAI.

    It's only assuming if you aren't specific enough. And you do know their training data is usually a year or two or 3 old. So they don't know about whatever new shit you're trying to work with.

  • I use agents a lot and have written several MCP servers now, the tasks I automate aren't things like order cupcakes, it's mainly the glue between complex things.

    I still can't get Claude to nicely open a JIRA ticket for me, but I can get it to read through a sequence of connected documents and filter that into.

    I don't think agents are ready for the main event and these are some poor examples of their power.

    I'm not saying they won't improve, but using the right tool for the right job is critical. An hour to order cupcakes is silly even for an llm.

    These are examples for the common guy on the street who doesn’t know what an MCP server is.

  • OpenAI launched ChatGPT Agent on Thursday, its latest effort in the industry-wide pursuit to turn AI into a profitable enterprise—not just one that eats investors' billions. In its announcement blog, OpenAI says its Agent "can now do work for you using its own computer," but CEO Sam Altman warns that the rollout presents unpredictable risks.

    [...]

    OpenAI research lead Lisa Fulford told Wired that she used Agent to order "a lot of cupcakes," which took the tool about an hour, because she was very specific about the cupcakes.

    That's quite a bold statement to make since he now has US military contracts. What, is he making cupcakes for the Pentagon?
