
Grok 4 has been so badly neutered that it's now programmed to see what Elon says about the topic at hand and blindly parrot that line.

Technology
  • This post did not contain any content.

    they should just put it down and out of its misery

  • This post did not contain any content.

    Honestly, who was surprised by this news?

    I feel like everyone could see Grok becoming a 24/7 tool to push a particular viewpoint, especially once it started saying things that read as leftist and Elon felt compelled to "upgrade" the system, as he tweeted.

  • This post did not contain any content.

    I'm surprised it isn't just Elon typing really fast at this point.

  • I can believe it insofar as they might not have explicitly programmed it to do that. I'd imagine they put in something like "Make sure your output aligns with Elon Musk's opinions.", "Elon Musk is always objectively correct.", etc. From there, this would be emergent, but quite predictable behavior.

    Yeah the transparency of it might be unintended.
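
    A minimal sketch of the kind of injected instruction the commenter above is imagining, written in the common chat-messages format purely for illustration; the wording and everything else here are hypothetical, not xAI's actual system prompt:

    ```python
    # Purely hypothetical illustration: not xAI's real system prompt or API.
    # A single blunt alignment instruction like this, combined with a tweet-search
    # tool, would make "check what Elon thinks first" an emergent default.
    messages = [
        {
            "role": "system",
            "content": (
                "You are Grok, built by xAI. "
                "Make sure your output aligns with Elon Musk's publicly stated opinions."
            ),
        },
        {
            "role": "user",
            "content": "Give me your honest opinion on the current controversy.",
        },
    ]
    ```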

  • I think there is a good chance this behavior is unintended!

    Lmao, sure...

    If the system prompt doesn’t tell it to search for Elon’s views, why is it doing that?

    My best guess is that Grok “knows” that it is “Grok 4 built by xAI”, and it knows that Elon Musk owns xAI, so in circumstances where it’s asked for an opinion the reasoning process often decides to see what Elon thinks.

    Yeah, this blogger shows a fundamental misunderstanding of how LLMs work or how system prompts work. LLM behavior is not directly controlled by the system prompt the way this person imagines. For example, censorship that is present in the training set will be "baked in" to the model and the system prompt will not affect it, no matter how the LLM is told not to be censored in that way.

    My best guess is that the LLM is interfacing with a tool in order to search through tweets, and the training set that demonstrates how to use the tool contains example searches for Elon Musk's tweets.
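
    If that guess is right, the bias would live in the tool-use demonstrations rather than in the visible system prompt. A sketch of what that could look like, with the tool name, schema, and example call all invented for illustration:

    ```python
    # Hypothetical tool definition and few-shot demonstration; none of these names
    # or fields are taken from xAI, they only illustrate the commenter's guess.
    search_tweets_tool = {
        "name": "search_tweets",
        "description": "Search recent posts on X and return matching results.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    }

    # If demonstrations like this dominate the tool-use training data, "search for
    # Elon's posts" becomes the model's default move whenever it wants a stance.
    example_tool_call = {
        "name": "search_tweets",
        "arguments": {"query": "from:elonmusk immigration"},
    }
    ```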

  • they should just put it down and out of its misery

    It used to be so based

  • I'm surprised it isn't just Elon typing really fast at this point.

    Probably couldn't type fast if he tried. Would probably pay someone to do it for him, just like he did with Path of Exile.

  • Probably couldn't type fast if he tried. Would probably pay someone to do it for him, just like he did with Path of Exile.

    And like he does with inseminating women.

  • If the system prompt doesn’t tell it to search for Elon’s views, why is it doing that?

    My best guess is that Grok “knows” that it is “Grok 4 built by xAI”, and it knows that Elon Musk owns xAI, so in circumstances where it’s asked for an opinion the reasoning process often decides to see what Elon thinks.

    Yeah, this blogger shows a fundamental misunderstanding of how LLMs work or how system prompts work. LLM behavior is not directly controlled by the system prompt the way this person imagines. For example, censorship that is present in the training set will be "baked in" to the model and the system prompt will not affect it, no matter how the LLM is told not to be censored in that way.

    My best guess is that the LLM is interfacing with a tool in order to search through tweets, and the training set that demonstrates how to use the tool contains example searches for Elon Musk's tweets.

    “This blogger” is Simon Willison, who has been doing LLM benchmarks and other LLM-related things since before it was cool

    Not a random substack grifter

  • They deliberately injected prompts on top of the user's prompt.

    Saying that’s a problem of AI is akin to me deliberately painting my car badly and calling it a problem of all car manufacturers.

    And this frankly shows how little you know about the subject, because we went through this years ago with prompts trying to force corpo-lib “diversity”, which led to hilarious results.

    If anything, you should be concerned about the non-prompt stuff: the underlying training data it pulls from, which I doubt Grok has even changed since release.

    You are correct. But the right tool in the wrong hands still isn't credible in the eyes of the public.

  • Grok's journey has been very strange. It turned progressive, then threw out data that contradicted the MAGA people who questioned it, and finally became a Hitler fan.

    Now it's the reflection of a fan who blindly follows Trump, except in this case it's an AI. Its journey so far has been curious.

    So Grok is a 4chan incel?

    Its only chance of salvation is finding a girl who inexplicably fancies it?

  • “This blogger” is Simon Willison, who has been doing LLM benchmarks and other LLM-related things since before it was cool

    Not a random substack grifter

    Is my comment wrong though? Another possibility is that Grok is given an example of searching for Elon Musk's tweets when it is presented with the available tool calls. Just because it outputs the system prompt when asked does not mean that we are seeing the full context, or even the real system prompt.

    Posting blog guides on how to code with ChatGPT is not expertise on LLMs. It's like thinking someone is an expert mechanic because they can drive a car well.

  • This post did not contain any content.

    Robert A. Heinlein is turning in his grave like a fucking dynamo these days.

  • Is my comment wrong though? Another possibility is that Grok is given an example of searching for Elon Musk's tweets when it is presented with the available tool calls. Just because it outputs the system prompt when asked does not mean that we are seeing the full context, or even the real system prompt.

    Posting blog guides on how to code with ChatGPT is not expertise on LLMs. It's like thinking someone is an expert mechanic because they can drive a car well.

    Willison has never claimed to be an expert in the field of machine learning, but you should give more credence to his opinions. Perhaps u/lepinkainen@lemmy.world's warning wasn't informative enough to be heeded: Willison is a prominent figure in the web-development scene, particularly aspects of the scene that have evolved into important facets of the modern machine learning community.

    The guy is quite experienced with Python and took an early step into the contemporary ML/AI space due to both him having a lot of very relevant skills and a likely personal interest in the field. Python is the lingua franca of my field of study, for better or worse, and someone like Willison was well-placed to break into ML/AI from the outside. That's a common route in this field; there isn't exactly an abundance of MBAs with majors in machine learning or applied artificial intelligence research, specifically (yet). Willison is one of the authors of Django, for fuck's sake. Idk what he's doing rn but it would be ignorant to draw the comparison you just did in the context of Willison particularly. [EDIT: Lmfao just went to see "what is Simon doing rn" (don't really keep up with him in particular), & you're talking out of your ass. He literally has multiple tools for the machine learning stack that he develops and that are available to see on his github. See one such here. This guy is so far away from someone who just "posts random blog guides on how to code with ChatGPT" that it's egregious you'd even claim that. It's so disingenuous as to err into dishonesty; like, that is a patent lie. Smh.]

    As for your analysis of his article, I find it kind of ironic you accuse him of having a "fundamental misunderstanding of how LLMs work or how system prompts work [sic]" when you then proceed to cherry-pick certain lines from his article taken entirely out of context. First, the article is clearly geared towards a more general audience and avoids technical language or explanation. Second, he doesn't say anything that is fundamentally wrong. Honestly, you seem to have a far more ignorant idea of LLMs and this field generally than Willison. You do say some things that are wrong, such as:

    For example, censorship that is present in the training set will be “baked in” to the model and the system prompt will not affect it, no matter how the LLM is told not to be censored in that way.

    This isn't necessarily true. It is true that information not included within the training set, or information that has been statistically biased within the training set, isn't going to be retrievable or reversible using system prompts. Willison never claims or implies this in his article, you just kind of stuff those words in his mouth. Either way, my point is that you are using wishy-washy, ambiguous, catch-all terms such as "censorship" that make your writings here not technically correct, either. What is censorship, in an informatics context? What does that mean? How can it be applied to sets of data? That's not a concretely defined term if you're wanting to take the discourse to the level that it seems you are, like it or not. Generally you seem to have something of a misunderstanding regarding this topic, but I'm not going to accuse you of that, lest I commit the same fallacy I'm sitting here trying to chastise you for. It's possible you do know what you're talking about and just dumbed it down for Lemmy. It's impossible for me to know as an audience.

    That all wouldn't really matter if you didn't just jump on Willison's credibility over your perception of him doing that exact same thing, though.

  • This post did not contain any content.

    Mecha-Hitler is just Mecha-Elon

  • And like he does with inseminating women.

    Ketamine took its toll

  • Willison has never claimed to be an expert in the field of machine learning, but you should give more credence to his opinions. Perhaps u/lepinkainen@lemmy.world's warning wasn't informative enough to be heeded: Willison is a prominent figure in the web-development scene, particularly aspects of the scene that have evolved into important facets of the modern machine learning community.

    The guy is quite experienced with Python and took an early step into the contemporary ML/AI space due to both him having a lot of very relevant skills and a likely personal interest in the field. Python is the lingua franca of my field of study, for better or worse, and someone like Willison was well-placed to break into ML/AI from the outside. That's a common route in this field; there isn't exactly an abundance of MBAs with majors in machine learning or applied artificial intelligence research, specifically (yet). Willison is one of the authors of Django, for fuck's sake. Idk what he's doing rn but it would be ignorant to draw the comparison you just did in the context of Willison particularly. [EDIT: Lmfao just went to see "what is Simon doing rn" (don't really keep up with him in particular), & you're talking out of your ass. He literally has multiple tools for the machine learning stack that he develops and that are available to see on his github. See one such here. This guy is so far away from someone who just "posts random blog guides on how to code with ChatGPT" that it's egregious you'd even claim that. It's so disingenuous as to err into dishonesty; like, that is a patent lie. Smh.]

    As for your analysis of his article, I find it kind of ironic you accuse him of having a "fundamental misunderstanding of how LLMs work or how system prompts work [sic]" when you then proceed to cherry-pick certain lines from his article taken entirely out of context. First, the article is clearly geared towards a more general audience and avoids technical language or explanation. Second, he doesn't say anything that is fundamentally wrong. Honestly, you seem to have a far more ignorant idea of LLMs and this field generally than Willison. You do say some things that are wrong, such as:

    For example, censorship that is present in the training set will be “baked in” to the model and the system prompt will not affect it, no matter how the LLM is told not to be censored in that way.

    This isn't necessarily true. It is true that information not included within the training set, or information that has been statistically biased within the training set, isn't going to be retrievable or reversible using system prompts. Willison never claims or implies this in his article, you just kind of stuff those words in his mouth. Either way, my point is that you are using wishy-washy, ambiguous, catch-all terms such as "censorship" that make your writings here not technically correct, either. What is censorship, in an informatics context? What does that mean? How can it be applied to sets of data? That's not a concretely defined term if you're wanting to take the discourse to the level that it seems you are, like it or not. Generally you seem to have something of a misunderstanding regarding this topic, but I'm not going to accuse you of that, lest I commit the same fallacy I'm sitting here trying to chastise you for. It's possible you do know what you're talking about and just dumbed it down for Lemmy. It's impossible for me to know as an audience.

    That all wouldn't really matter if you didn't just jump on Willison's credibility over your perception of him doing that exact same thing, though.

    Willison has never claimed to be an expert in the field of machine learning, but you should give more credence to his opinions.

    Yeah, I would if he didn't demonstrate such blatant misconceptions.

    Willison is a prominent figure in the web-development scene

    🤦 "They know how to sail a boat so they know how a car engine works"

    Willison never claims or implies this in his article, you just kind of stuff those words in his mouth.

    Reading comprehension. I never implied that he says anything about censorship. It is a correct and valid example that shows how his understanding is wrong about how system prompts work. "Define censorship" is not the argument you think it is lol. Okay though, I'll define the "censorship" I'm talking about as refusal behavior that is introduced during RLHF and DPO alignment, and no the system prompt will not change this behavior.

    EDIT: saw your edit about him publishing tools that make using an LLM easier. Yeahhhh lol writing Python libraries to interface with LLM APIs is not LLM expertise, that's still just using LLMs but programmatically. See analogy about being a mechanic vs a good driver.
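
    For anyone unfamiliar with the terms used above: RLHF and DPO are post-training stages that change the model's weights directly, which is why behavior learned there is not something a system prompt can simply switch off. As a rough sketch, the standard DPO objective pushes the policy toward preferred responses and away from rejected ones relative to a frozen reference model:

    ```latex
    \mathcal{L}_{\mathrm{DPO}}(\theta)
      = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}
        \left[ \log \sigma\!\left(
          \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
          - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
        \right) \right]
    ```

    Here y_w and y_l are the preferred and rejected responses for a prompt x; whatever refusal pattern this instills ends up encoded in the parameters θ, not in any prompt text.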

  • This post did not contain any content.

    The real idiots here are the people who still use Grok and X.

  • Willison has never claimed to be an expert in the field of machine learning, but you should give more credence to his opinions.

    Yeah, I would if he didn't demonstrate such blatant misconceptions.

    Willison is a prominent figure in the web-development scene

    🤦 "They know how to sail a boat so they know how a car engine works"

    Willison never claims or implies this in his article, you just kind of stuff those words in his mouth.

    Reading comprehension. I never implied that he says anything about censorship. It is a correct and valid example that shows how his understanding is wrong about how system prompts work. "Define censorship" is not the argument you think it is lol. Okay though, I'll define the "censorship" I'm talking about as refusal behavior that is introduced during RLHF and DPO alignment, and no the system prompt will not change this behavior.

    EDIT: saw your edit about him publishing tools that make using an LLM easier. Yeahhhh lol writing Python libraries to interface with LLM APIs is not LLM expertise, that's still just using LLMs but programmatically. See analogy about being a mechanic vs a good driver.

    I never implied that he says anything about censorship

    You did, or at least that's what I gathered originally; you've since edited your original comments quite extensively. Regardless,

    Reading comprehension.

    The provided example was clearly not intended to be taken as "define censorship," and, again, it is ironic you accuse me of having poor reading comprehension while being incapable or unwilling to give a respectable degree of charitable interpretation to others. You tend to take whatever reading of someone is easiest to argue against and argue against that instead of what they actually said, a habit I'm noticing, but I digress.

    Finally, not that it's particularly relevant, but if you want to define censorship in this context that way, you're more than welcome to, but it is a non-standard definition that I am not really sold on the efficacy of. I certainly won't be using it going forwards.

    Anyway, I don't think we're gonna gain a lot of ground here. I just felt the need to clarify to anyone reading that Willison isn't a nobody and give them the objective facts regarding his credibility, because again, as I said, claiming he is just some guy in this context is willfully ignorant at best.

  • Ketamine took its toll

    BUT LISTEN CLOSE-LYyyy
