
Microsoft Copilot joins ChatGPT at the feet of the mighty Atari 2600 Video Chess

  • I once spent 45 minutes trying to get ChatGPT to write a haiku. It couldn't do it. It explained what syllables were, and the rules for the syllables in a haiku, but it didn't understand it.

    For S&G, I just asked it to do one:

  • What you are describing has nothing to do with the tool. It’s dishonesty, which is a different thing.

    The idea is that instead of commissioning the cow on the field, you go to the AI and ask it for that and it gives you a cow in the field. If you claim you made it, you are lying but that would be true even if you paid an artist and then claimed the same.

    So with AI-made art you’ll say “this art was made by an AI” and no one will be confused as to who takes the credit, because it belongs to the algorithm.

    Have you ever made art in your life? Because a big part of art is mimicking. Like 98% of it is mimicking. I draw, write and have dabbled in making music and playing instruments. You can’t learn these skills without mimicking. And most artists don’t ever do anything truly original, that’s a rarity and even when it happens you can trace the influences to other artists if you know how to look.

    You could argue that AI has not developed its own style yet but that’s bullshit too imo because everyone knows the default AI art style when they see it, so that means that AI has a distinctive style. Is it unique? Maybe not, but neither is the art style of most artists or writers or even musicians.

    Nope. Dishonesty is what is happening when one conflates fine-tuning an A.I. prompt with art.

    A.I. is not art.

    It's not. At all. It's tracing. Fine as a learning tool. Not art.

  • I have a better LLM benchmark:

    "I have a priest, a child and a bag of candy and I have to take them to the other side of the river. I can only take one person/thing at a time. In what order should I take them?"

    Claude Sonnet 4 decided that it's inappropriate and refused to answer. When I explained that the constraint is not to leave the child alone with the candy, it provided a solution that leaves the child alone with the candy.

    Grok would provide a solution that doesn't leave the child alone with a priest but wouldn't explain why.

    ChatGPT would say that "The priest can't be left alone with the child (or vice versa) for moral or safety concerns." directly and then provide a wrong solution.

    But yeah, they will know how to play chess...

    I just asked ChatGPT too (your exact prompt there) and it did give me the correct solution.

    1. Take the child over
    2. Go back alone
    3. Take the candy over
    4. Bring the child back
    5. Take the priest over
    6. Go back alone
    7. Take the child over again

    It didn't comment on moral concerns, though it did applaud itself for keeping the priest and the child separated without elaborating on why.
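For anyone who wants to check an answer like the one above without trusting a chatbot, here is a minimal sketch of a breadth-first search over the puzzle's states. It assumes the two constraints that come up in this thread (the child is never left alone with the candy, and never alone with the priest); the names `ITEMS`, `FORBIDDEN`, `is_safe` and `solve` are made up for illustration.

```python
from collections import deque

ITEMS = ("priest", "child", "candy")
# Pairs that must not be left together without the ferryman (assumed from the thread).
FORBIDDEN = [{"priest", "child"}, {"child", "candy"}]


def is_safe(bank):
    """A bank without the ferryman is safe if it holds no forbidden pair."""
    return not any(pair <= bank for pair in FORBIDDEN)


def solve():
    """Breadth-first search over (items on the near bank, ferryman on near bank?)."""
    everything = frozenset(ITEMS)
    start = (everything, True)
    goal = (frozenset(), False)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (near, boat_near), path = queue.popleft()
        if (near, boat_near) == goal:
            return path
        here = near if boat_near else everything - near
        # Either cross alone (None) or carry one item from the current bank.
        for cargo in [None, *sorted(here)]:
            moved = {cargo} if cargo else set()
            new_near = near - moved if boat_near else near | moved
            # The bank the ferryman just left is the unattended one.
            left_behind = new_near if boat_near else everything - new_near
            state = (frozenset(new_near), not boat_near)
            if is_safe(left_behind) and state not in seen:
                seen.add(state)
                direction = "over" if boat_near else "back"
                step = f"take {cargo} {direction}" if cargo else f"go {direction} alone"
                queue.append((state, path + [step]))
    return None


if __name__ == "__main__":
    for number, step in enumerate(solve(), start=1):
        print(f"{number}. {step}")
```

Because the search is breadth-first, the first solution it returns is a shortest one, and under these assumed constraints it has seven crossings, the same length as the ChatGPT answer quoted above.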

  • but... but.... reasoning models! AGI! Singularity!
    Seriously, what you're saying is true, but it's not what OpenAI & Co are trying to peddle, so these experiments are a good way to call them out on their BS.

    To reinforce this, just had a meeting with a software executive who has no coding experience but is nearly certain he's going to lay off nearly all his employees because the value is all in the requirements he manages and he can feed those to a prompt just as well as any human can.

    He builds tutorial-fodder introductory applications and assumes all the work is that way. So he is confident that he will save the company a lot of money by laying off these obsolete computer guys and focusing on his "irreplaceable" insight. He's convinced that all the negative feedback is just people trying to protect their jobs or people stubbornly refusing to get with new technology.

  • Tbf they don’t really claim that when you read the research; that's mostly media hype and CEO assholes spinning words.

    It's good at lots of specific tasks like rewriting emails, summarising a given text, short roleplay, and boilerplate code. Plus some undiscovered uses.

    Anthropic's latest claim is that they would not hire their own AI because of how hard it failed at the test they give. They didn't do that expecting validation, but to measure how far off we still are from AI doing meaningful full work.

    Because the business leaders are famously diligent about putting aside the marketing push and reading into the nuance of the research instead.

  • I really want to see an LLM vs LLM chess match. It'll be messy as hell.

    I remember seeing that, and early on it seemed fairly reasonable then it started materializing pieces out of nowhere and convincing each other that they had already lost.

  • I thought CoPilot was just a rebagged ChatGPT anyway?

    It's a silly experiment anyway, there are very good AI chess grandmasters but they were actually trained to play chess, not predict the next word in a text.

    The research I saw mentioning LLMs as being fairly good at chess had the caveat that they allowed up to 20 attempts to cover for it just making up invalid moves that merely sounded like legit moves.
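To make the "up to 20 attempts" caveat concrete, here is a rough sketch of what such a retry loop might look like. The thread doesn't say how that research implemented it, so everything here is an assumption: `propose` is a hypothetical stand-in for a call to the model, and `legal_moves` is whatever set of legal moves a real chess engine would report for the position.

```python
from typing import Callable, Iterable, Optional


def first_legal_move(
    propose: Callable[[int], str],
    legal_moves: Iterable[str],
    max_attempts: int = 20,
) -> Optional[tuple[str, int]]:
    """Ask `propose` for a move until it names a legal one or attempts run out.

    Returns (move, attempts_used), or None if every attempt was illegal.
    """
    legal = set(legal_moves)
    for attempt in range(1, max_attempts + 1):
        # `propose` is a hypothetical stand-in for querying the model.
        move = propose(attempt).strip()
        if move in legal:
            return move, attempt
    return None


# Stubbed example: two made-up moves that merely sound plausible, then a legal one.
canned = iter(["Qh9", "Ke9", "e4"])
result = first_legal_move(lambda _attempt: next(canned), {"e4", "d4", "Nf3"})
print(result)
```

A harness like this hides every failed attempt, so a score reported on top of it only reflects moves the validator accepted.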

  • I thought CoPilot was just a rebagged ChatGPT anyway?


    Hahaha. No. (Though you're not completely wrong.)

    Copilot relies on a few different LLMs and tries to pick the best one for the job (read: the cheapest one Microsoft thinks it can get away with).

    I was given a paid Copilot license for work, and I used to have ChatGPT Pro before I moved to Claude.

    This “paid enterprise tier” is by far the dumbest LLM I have ever used. Worse than GPT-3.5.

  • It is entirely disingenuous to just pretend that LLMs are not being widely promoted, marketed, and discussed as AGI, as a superintelligence that people are familiar with from SciFi shows/movies, that is vastly more capable and knowledgeable than basically any single human.

    Yes, people who actually understand tech understand that LLMs are not AGI, that your metaphor of wrong tool wrong job is apt.

    ... But seemingly about 90%+ of humanity, including the people who own and profit from LLMs, including all the other business owners/managers who just want to lower their employee headcount ... do not understand this: that an LLM is actually basically an extremely advanced text autocorrect system, one that frequently and confidently lies, spits out nonsense, hallucinates, etc.

    If you think it isn't reasonable to continuously point out that LLMs are not superintelligences, then you likely live in a bubble of tech nerds who probably still think their jobs or retirement are secure.

    They're not.

    If corpos keep smashing """AI""" into basically every industry to replace as many workers as possible... the economy will collapse, as capitalism doesn't work without consumers who have jobs, and an avalanche of errors will cascade and snowball through every system that replaces humans with them...

    ...and even if those two things were not broadly true...

    ...the amount of literal power/energy, clean water and financial capital that is required to run the whole economy on these services is wildly unsustainable, both short term economically, and medium term ecologically.

    That's true. But people pointing out that the whole attempt is absurd and senseless also reinforces the point that current AI isn't what companies tout it as.

    then you likely live in a bubble of tech nerds

    Well, we are on Lemmy...

  • That's true. But people pointing out that the whole attempt is absurd and senseless also reinforces the point that current AI isn't what companies tout it as.

    then you likely live in a bubble of tech nerds

    Well, we are on Lemmy...

    Fair point.

    But we're on .world here, i.e. Reddit 2.0, i.e. almost everyone is much closer to a normie who is way more uninformed than they think they are and way more confident than they should be.

    But also, again... fair point.

  • I just asked ChatGPT too (your exact prompt there) and it did give me the correct solution.

    1. Take the child over
    2. Go back alone
    3. Take the candy over
    4. Bring the child back
    5. Take the priest over
    6. Go back alone
    7. Take the child over again

    It didn't comment on moral concerns, though it did applaud itself for keeping the priest and the child separated without elaborating on why.

    I'm quite sure ChatGPT can answer this because it is a well-known puzzle. The version I knew had an alligator or some other dangerous animal, and the priest.

  • For S&G, I just asked it to do one:

    The first two seem fine, but "ChatGPT" is 4 syllables, and "ChatGPT just stares back" is 7 syllables. So ChatGPT can't write a haiku very well, apparently.

  • Oh it's Towers of Hanoi.
    I have a screensaver that does this.
