Skip to content

Grok 4 has been so badly neutered that it's now programmed to see what Elon says about the topic at hand and blindly parrot that line.

Technology
67 55 0
  • they should just put it down and out of it's misery

    It used to be so based

  • I'm surprised it isn't just Elon typing really fast at this point.

    Probably couldn't type fast if he tried. Would probably pay someone to do it for him just like he did with Path if Exile.

  • Probably couldn't type fast if he tried. Would probably pay someone to do it for him just like he did with Path if Exile.

    And like he does with inseminating women.

  • If the system prompt doesn’t tell it to search for Elon’s views, why is it doing that?

    My best guess is that Grok “knows” that it is “Grok 4 buit by xAI”, and it knows that Elon Musk owns xAI, so in circumstances where it’s asked for an opinion the reasoning process often decides to see what Elon thinks.

    Yeah, this blogger shows a fundamental misunderstanding of how LLMs work or how system prompts work. LLM behavior is not directly controlled by the system prompt the way this person imagines. For example, censorship that is present in the training set will be "baked in" to the model and the system prompt will not affect it, no matter how the LLM is told not to be censored in that way.

    My best guess is that the LLM is interfacing with a tool in order to search through tweets, and the training set that demonstrates how to use the tool contains example searches for Elon Musk's tweets.

    “This blogger” is Simon Willison, who has been doing LLM benchmarks and other LLM-related things since before it was cool

    Not a random substack grifter

  • They deliberately injected prompts on top of the users prompt.

    Saying that’s a problem of AI is akin to say me deliberately painting my car badly and saying it’s a problem of all car manufacturers.

    And this frankly shows how little you know about the subject, because we went through this years ago with prompts trying to force corpo-lib “diversity” and leading to hilarious results.

    If anything you should be concerned about the non prompt stuff, the underlying training data that it pulls from and of which I doubt Grok has even changed since release.

    You are correct. But the right tool in the wrong hands is still non credible in the eyes of perception.

  • Grok's journey has been very strange. He became a progressive, then threw out data that contradicted the MAGA people who questioned him, and finally became a Hitler fan.

    Now he's the reflection of a fan who blindly follows Trump, but in this case, he's an AI. His journey so far has been curious.

    So Grok is a 4chan incel?

    His only chance of salvation is finding a girl who inexplicably fancies it?

  • “This blogger” is Simon Willison, who has been doing LLM benchmarks and other LLM-related things since before it was cool

    Not a random substack grifter

    Is my comment wrong though? Another possibility is that Grok is given an example of searching for Elon Musk's tweets when it is presented with the available tool calls. Just because it outputs the system prompt when asked does not mean that we are seeing the full context, or even the real system prompt.

    Posting blog guides on how to code with ChatGPT is not expertise on LLMs. It's like thinking someone is an expert mechanic because they can drive a car well.

  • This post did not contain any content.

    Robert A. Heinlein is turning in his grave like a fucking dynamo these days.

  • Is my comment wrong though? Another possibility is that Grok is given an example of searching for Elon Musk's tweets when it is presented with the available tool calls. Just because it outputs the system prompt when asked does not mean that we are seeing the full context, or even the real system prompt.

    Posting blog guides on how to code with ChatGPT is not expertise on LLMs. It's like thinking someone is an expert mechanic because they can drive a car well.

    Willison has never claimed to be an expert in the field of machine learning, but you should give more credence to his opinions. Perhaps u/lepinkainen@lemmy.world's warning wasn't informative enough to be heeded: Willison is a prominent figure in the web-development scene, particularly aspects of the scene that have evolved into important facets of the modern machine learning community.

    The guy is quite experienced with Python and took an early step into the contemporary ML/AI space due to both him having a lot of very relevant skills and a likely personal interest in the field. Python is the lingua franca of my field of study, for better or worse, and someone like Willison was well-placed to break into ML/AI from the outside. That's a common route in this field, there aren't exactly an abundance of MBAs with majors in machine learning or applied artificial intelligence research, specifically (yet). Willison is one of the authors of Django, for fucks sake. Idk what he's doing rn but it would be ignorant to draw the comparison you just did in the context of Willison particularly. [EDIT: Lmfao just went to see "what is Simon doing rn" (don't really keep up with him in particular), & you're talking out of your ass. He literally has multiple tools for the machine learning stack that he develops and that are available to see on his github. See one such here. This guy is so far away from someone who just "posts random blog guides on how to code with ChatGPT" that it's egregious you'd even claim that. It's so disingenuous as to ere into dishonesty; like, that is a patent lie. Smh.]

    As for your analysis of his article, I find it kind of ironic you accuse him of having a "fundamental misunderstanding of how LLMs work or how system prompts work [sic]" when you then proceed to cherry-pick certain lines from his article taken entirely out of context. First, the article is clearly geared towards a more general audience and avoids technical language or explanation. Second, he doesn't say anything that is fundamentally wrong. Honestly, you seem to have a far more ignorant idea of LLMs and this field generally than Willison. You do say some things that are wrong, such as:

    For example, censorship that is present in the training set will be “baked in” to the model and the system prompt will not affect it, no matter how the LLM is told not to be censored in that way.

    This isn't necessarily true. It is true that information not included within the training set, or information that has been statistically biased within the training set, isn't going to be retrievable or reversible using system prompts. Willison never claims or implies this in his article, you just kind of stuff those words in his mouth. Either way, my point is that you are using wishy-washy, ambiguous, catch-all terms such as "censorship" that make your writings here not technically correct, either. What is censorship, in an informatics context? What does that mean? How can it be applied to sets of data? That's not a concretely defined term if you're wanting to take the discourse to the level that it seems you are, like it or not. Generally you seem to have something of a misunderstanding regarding this topic, but I'm not going to accuse you of that, lest I commit the same fallacy I'm sitting here trying to chastise you for. It's possible you do know what you're talking about and just dumbed it down for Lemmy. It's impossible for me to know as an audience.

    That all wouldn't really matter if you didn't just jump as Willison's credibility over your perception of him doing that exact same thing, though.

  • This post did not contain any content.

    Mecha-Hitler is just Mecha-Elon

  • And like he does with inseminating women.

    Ketamine took its toll

  • Willison has never claimed to be an expert in the field of machine learning, but you should give more credence to his opinions. Perhaps u/lepinkainen@lemmy.world's warning wasn't informative enough to be heeded: Willison is a prominent figure in the web-development scene, particularly aspects of the scene that have evolved into important facets of the modern machine learning community.

    The guy is quite experienced with Python and took an early step into the contemporary ML/AI space due to both him having a lot of very relevant skills and a likely personal interest in the field. Python is the lingua franca of my field of study, for better or worse, and someone like Willison was well-placed to break into ML/AI from the outside. That's a common route in this field, there aren't exactly an abundance of MBAs with majors in machine learning or applied artificial intelligence research, specifically (yet). Willison is one of the authors of Django, for fucks sake. Idk what he's doing rn but it would be ignorant to draw the comparison you just did in the context of Willison particularly. [EDIT: Lmfao just went to see "what is Simon doing rn" (don't really keep up with him in particular), & you're talking out of your ass. He literally has multiple tools for the machine learning stack that he develops and that are available to see on his github. See one such here. This guy is so far away from someone who just "posts random blog guides on how to code with ChatGPT" that it's egregious you'd even claim that. It's so disingenuous as to ere into dishonesty; like, that is a patent lie. Smh.]

    As for your analysis of his article, I find it kind of ironic you accuse him of having a "fundamental misunderstanding of how LLMs work or how system prompts work [sic]" when you then proceed to cherry-pick certain lines from his article taken entirely out of context. First, the article is clearly geared towards a more general audience and avoids technical language or explanation. Second, he doesn't say anything that is fundamentally wrong. Honestly, you seem to have a far more ignorant idea of LLMs and this field generally than Willison. You do say some things that are wrong, such as:

    For example, censorship that is present in the training set will be “baked in” to the model and the system prompt will not affect it, no matter how the LLM is told not to be censored in that way.

    This isn't necessarily true. It is true that information not included within the training set, or information that has been statistically biased within the training set, isn't going to be retrievable or reversible using system prompts. Willison never claims or implies this in his article, you just kind of stuff those words in his mouth. Either way, my point is that you are using wishy-washy, ambiguous, catch-all terms such as "censorship" that make your writings here not technically correct, either. What is censorship, in an informatics context? What does that mean? How can it be applied to sets of data? That's not a concretely defined term if you're wanting to take the discourse to the level that it seems you are, like it or not. Generally you seem to have something of a misunderstanding regarding this topic, but I'm not going to accuse you of that, lest I commit the same fallacy I'm sitting here trying to chastise you for. It's possible you do know what you're talking about and just dumbed it down for Lemmy. It's impossible for me to know as an audience.

    That all wouldn't really matter if you didn't just jump as Willison's credibility over your perception of him doing that exact same thing, though.

    Willison has never claimed to be an expert in the field of machine learning, but you should give more credence to his opinions.

    Yeah, I would if he didn't demonstrate such blatant misconceptions.

    Willison is a prominent figure in the web-development scene

    🤦 "They know how to sail a boat so they know how a car engine works"

    Willison never claims or implies this in his article, you just kind of stuff those words in his mouth.

    Reading comprehension. I never implied that he says anything about censorship. It is a correct and valid example that shows how his understanding is wrong about how system prompts work. "Define censorship" is not the argument you think it is lol. Okay though, I'll define the "censorship" I'm talking about as refusal behavior that is introduced during RLHF and DPO alignment, and no the system prompt will not change this behavior.

    EDIT: saw your edit about him publishing tools that make using an LLM easier. Yeahhhh lol writing python libraries to interface with LLM APIs is not LLM expertise, that's still just using LLMs but programatically. See analogy about being a mechanic vs a good driver.

  • This post did not contain any content.

    The real idiots here are the people who still use Grok and X.

  • Willison has never claimed to be an expert in the field of machine learning, but you should give more credence to his opinions.

    Yeah, I would if he didn't demonstrate such blatant misconceptions.

    Willison is a prominent figure in the web-development scene

    🤦 "They know how to sail a boat so they know how a car engine works"

    Willison never claims or implies this in his article, you just kind of stuff those words in his mouth.

    Reading comprehension. I never implied that he says anything about censorship. It is a correct and valid example that shows how his understanding is wrong about how system prompts work. "Define censorship" is not the argument you think it is lol. Okay though, I'll define the "censorship" I'm talking about as refusal behavior that is introduced during RLHF and DPO alignment, and no the system prompt will not change this behavior.

    EDIT: saw your edit about him publishing tools that make using an LLM easier. Yeahhhh lol writing python libraries to interface with LLM APIs is not LLM expertise, that's still just using LLMs but programatically. See analogy about being a mechanic vs a good driver.

    I never implied that he says anything about censorship

    You did, at least that's what I gathered originally, you just edited your original comments quite extensively. Regardless,

    Reading comprehension.

    The provided example was clearly not intended to be taken as "define censorship," and, again, it is ironic you accuse me of having poor reading comprehension while being incapable or unwilling to give a respectable degree of charitable interpretation to others. You kind of just take what you think is the easiest to argue against reading of others and argue against that instead of what anyone actually said, is a habit I'm noticing, but I digress.

    Finally, not that it's particularly relevant, but if you want to define censorship in this context that way, you're more than welcome to, but it is a non-standard definition that I am not really sold on the efficacy of. I certainly won't be using it going forwards.

    Anyway, I don't think we're gonna get a lot of ground here. I just felt the need to clarify to anyone reading that Willison isn't a nobody and give them the objective facts regarding his veracity, because again, as I said, claiming he is just some guy in this context is willfully ignorant at best.

  • Ketamine took its toll

    BUT LISTEN CLOSE-LYyyy

  • BUT LISTEN CLOSE-LYyyy

    Not for very much longer...

  • I never implied that he says anything about censorship

    You did, at least that's what I gathered originally, you just edited your original comments quite extensively. Regardless,

    Reading comprehension.

    The provided example was clearly not intended to be taken as "define censorship," and, again, it is ironic you accuse me of having poor reading comprehension while being incapable or unwilling to give a respectable degree of charitable interpretation to others. You kind of just take what you think is the easiest to argue against reading of others and argue against that instead of what anyone actually said, is a habit I'm noticing, but I digress.

    Finally, not that it's particularly relevant, but if you want to define censorship in this context that way, you're more than welcome to, but it is a non-standard definition that I am not really sold on the efficacy of. I certainly won't be using it going forwards.

    Anyway, I don't think we're gonna get a lot of ground here. I just felt the need to clarify to anyone reading that Willison isn't a nobody and give them the objective facts regarding his veracity, because again, as I said, claiming he is just some guy in this context is willfully ignorant at best.

    if you want to define censorship in this context that way, you're more than welcome to, but it is a non-standard definition that I am not really sold on the efficacy of. I certainly won't be using it going forwards.

    Lol you've got to be trolling.

    https://arxiv.org/html/2504.03803v1

    I just felt the need to clarify to anyone reading that Willison isn't a nobody

    I didn't say he's a nobody. What was that about a "respectable degree of chartiable interpretation of others"? Seems like you're the one putting words in mouths, here.

    If he was writing about django, I'd defer to his expertise.

  • if you want to define censorship in this context that way, you're more than welcome to, but it is a non-standard definition that I am not really sold on the efficacy of. I certainly won't be using it going forwards.

    Lol you've got to be trolling.

    https://arxiv.org/html/2504.03803v1

    I just felt the need to clarify to anyone reading that Willison isn't a nobody

    I didn't say he's a nobody. What was that about a "respectable degree of chartiable interpretation of others"? Seems like you're the one putting words in mouths, here.

    If he was writing about django, I'd defer to his expertise.

    Nope, not trolling at all.

    From your own provided source on the arxiv, Noels et al. define censorship as:

    Censorship in this context can be defined as the deliberate restriction, modification, or suppression of certain outputs generated by the model.

    Which is starkly different from the definition you yourself gave. I actually like their definition a whole lot more. Your definition is problematic because it excludes a large set of behaviors we would colloquially be interested in when studying "censorship."

    Again, for the third time, that was not really the point either and I'm not interested in dancing around a technical scope defining censorship in this field, at least in this discourse right here and now. It is irrelevant to the topic at hand.

    I didn’t say he’s a nobody. What was that about a “respectable degree of chartiable interpretation of others”? Seems like you’re the one putting words in mouths, here.

    Yeah, this blogger shows a fundamental misunderstanding of how LLMs work or how system prompts work. (emphasis mine)

    In the context of this field of work and study, you basically did call him a nobody, and the point being harped on again, again, and again to you is that this is a false assertion. I did interpret you charitably. Don't blame me because you said something wrong.

    EDIT: And frankly, you clearly don't understand how the work Willison's career has covered is intimately related to ML and AI research. I don't mean it as a dig but you wouldn't be drawing this arbitrary line to try and discredit him if you knew how the work done in Python on Django directly relates to many modern machine learning stacks.

  • Nope, not trolling at all.

    From your own provided source on the arxiv, Noels et al. define censorship as:

    Censorship in this context can be defined as the deliberate restriction, modification, or suppression of certain outputs generated by the model.

    Which is starkly different from the definition you yourself gave. I actually like their definition a whole lot more. Your definition is problematic because it excludes a large set of behaviors we would colloquially be interested in when studying "censorship."

    Again, for the third time, that was not really the point either and I'm not interested in dancing around a technical scope defining censorship in this field, at least in this discourse right here and now. It is irrelevant to the topic at hand.

    I didn’t say he’s a nobody. What was that about a “respectable degree of chartiable interpretation of others”? Seems like you’re the one putting words in mouths, here.

    Yeah, this blogger shows a fundamental misunderstanding of how LLMs work or how system prompts work. (emphasis mine)

    In the context of this field of work and study, you basically did call him a nobody, and the point being harped on again, again, and again to you is that this is a false assertion. I did interpret you charitably. Don't blame me because you said something wrong.

    EDIT: And frankly, you clearly don't understand how the work Willison's career has covered is intimately related to ML and AI research. I don't mean it as a dig but you wouldn't be drawing this arbitrary line to try and discredit him if you knew how the work done in Python on Django directly relates to many modern machine learning stacks.

    Again, for the third time, that was not really the point either and I'm not interested in dancing around a technical scope defining censorship in this field, at least in this discourse right here and now. It is irrelevant to the topic at hand.

    ...

    Either way, my point is that you are using wishy-washy, ambiguous, catch-all terms such as "censorship" that make your writings here not technically correct, either. What is censorship, in an informatics context? What does that mean? How can it be applied to sets of data? That's not a concretely defined term if you're wanting to take the discourse to the level that it seems you are, like it or not.

    Lol this you?

  • Source? This is just some random picture, I'd prefer if stuff like this gets posted and shared with actual proof backing it up.

    While this might be true, we should hold ourselves to a standard better than just upvoting what appears to literally just be a random image that anyone could have easily doctored, not even any kind of journalistic article or etc backing it.

    There’s also this article from TechCrunch.

    Grok 4 seems to consult Elon Musk to answer controversial questions

    They tried it out themselves and have reports from other users as well.

  • 136 Stimmen
    9 Beiträge
    23 Aufrufe
    N
    I support them , china I mean
  • Is Google about to destroy the web?

    Technology technology
    65
    1
    193 Stimmen
    65 Beiträge
    247 Aufrufe
    S
    Or validating source, making sure it isn't AI content which usually regurgitates the same talking points. Homogenizing the entire query and removing actual information variance of personal experience.
  • An earnest question about the AI/LLM hate

    Technology technology
    57
    73 Stimmen
    57 Beiträge
    193 Aufrufe
    ineedmana@lemmy.worldI
    It might be interesting to cross-post this question to !fuck_ai@lemmy.world but brace for impact
  • Windows 11 remote desktop microphone stops working intermittently

    Technology technology
    7
    16 Stimmen
    7 Beiträge
    34 Aufrufe
    S
    When I worked in IT, we only let people install every other version of Windows. Our Linux user policy was always “mainstream distro and the LTS version.” Mac users were strongly advised to wait 3 months to upgrade. One guy used FreeBSD and I just never questioned him because he was older and never filed one help desk request. He probably thought I was an idiot. (And I was.) Anyway, I say all that to say don’t use Windows 11 on anything important. It’s the equivalent of a beta. Windows 12 (or however they brand it) will probably be stable. I don’t use Windows much anymore and maybe things have changed but the concepts in the previous paragraph could be outdated. But it’s a good rule of thumb.
  • 44 Stimmen
    4 Beiträge
    26 Aufrufe
    G
    It varies based on local legislation, so in some places paying ransoms is banned but it's by no means universal. It's totally valid to be against paying ransoms wherever possible, but it's not entirely black and white in some situations. For example, what if a hospital gets ransomed? Say they serve an area not served by other facilities, and if they can't get back online quickly people will die? Sounds dramatic, but critical public services get ransomed all the time and there are undeniable real world consequences. Recovery from ransomware can cost significantly more than a ransom payment if you're not prepared. It can also take months to years to recover, especially if you're simultaneously fighting to evict a persistent (annoyed, unpaid) threat actor from your environment. For the record I don't think ransoms should be paid in most scenarios, but I do think there is some nuance to consider here.
  • 121 Stimmen
    58 Beiträge
    80 Aufrufe
    D
    I bet every company has at least one employee with right-wing political views. Choosing a product based on some random quotes by employees is stupid.
  • Short summary of feature phone market in 2025

    Technology technology
    1
    0 Stimmen
    1 Beiträge
    11 Aufrufe
    Niemand hat geantwortet
  • 0 Stimmen
    4 Beiträge
    6 Aufrufe
    K
    Only way I'll want a different phone brand is if it comes with ZERO bloatware and has an excellent internal memory/storage cleanse that has nothing to do with Google's Files or a random app I'm not sure I can trust without paying or rooting. So far my A series phones do what I need mostly and in my opinion is superior to the Motorola's my fiancé prefers minus the phone-phone charge ability his has, everything else I'm just glad I have enough control to tweak things to my liking, however these days Samsungs seem to be infested with Google bloatware and apps that insist on opening themselves back up regardless of the widespread battery restrictions I've assigned (even was sent a "Stop Closing my Apps" notif that sent me to an article ) short of Disabling many unnecessary apps bc fully rooting my devices is something I rarely do anymore. I have a random Chinese brand tablet where I actually have more control over the apps than either of my A series phones whee Force Stopping STAYS that way when I tell them to! I hate being listened to for ads and the unwanted draining my battery life and data (I live off-grid and pay data rates because "Unlimited" is some throttled BS) so my ability to control what's going on in the background matters a lot to me, enough that I'm anti Meta-apps and avoid all non-essential Google apps. I can't afford topline phones and the largest data plan, so I work with what I can afford and I'm sad refurbished A lines seem to be getting more expensive while giving away my control to companies. Last A line I bought that was supposed to be my first 5G phone was network locked, so I got ripped off, but it still serves me well in off-grid life. Only app that actually regularly malfunctions when I Force Stop it's background presence is Roku, which I find to have very an almost insidious presence in our lives. Google Play, Chrome, and Spotify never acts incompetent in any way no matter how I have to open the setting every single time I turn Airplane Mode off. Don't need Gmail with Chrome and DuckDuckGo has been awesome at intercepting self-loading ads. I hope one day DDG gets better bc Google seems to be terrible lately and I even caught their AI contradicting itself when asking about if Homo Florensis is considered Human (yes) and then asked the oldest age of human remains, and was fed the outdated narrative of 300,000 years versus 700,000+ years bipedal pre-humans have been carbon dated outside of the Cradle of Humanity in South Africa. SO sorry to go off-topic, but I've got a big gripe with Samsung's partnership with Google, especially considering the launch of Quantum Computed AI that is still being fine-tuned with company-approved censorships.