Elon Musk wants to rewrite "the entire corpus of human knowledge" with Grok
-
That is definitely how I read it.
History can’t just be ‘rewritten’ by A.I. and taken as truth. That’s fucking stupid.
It's truth in Whitemanistan though
-
Whatever. The next generation will have to learn to judge whether material is true by using sources like Wikipedia or books by well-regarded authors.
The other thing that he doesn't understand (and most "AI" advocates don't either) is that LLMs have nothing to do with facts or information. They're just probabilistic models that pick the next word(s) based on context. Anyone trying to address the facts and information produced by these models is completely missing the point.
The other thing that he doesn't understand (and most "AI" advocates don't either) is that LLMs have nothing to do with facts or information. They're just probabilistic models that pick the next word(s) based on context.
That's a massive oversimplification; it's like saying humans don't remember things, we just have neurons that fire based on context.
LLMs do actually "know" things. They work with tokens and weights, which are the nodes and edges of a high-dimensional graph; the LLM traverses this graph as it processes inputs and generates new tokens.
You can do brain surgery on an LLM and change what it knows, and we have a very good understanding of how this works. You can change a single link and the model will believe the Eiffel Tower is in Rome, and it'll describe the great view of the Colosseum from the top.
The problem is that it's very complex, and researchers are still developing new math to let us do this in a useful way.
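For anyone curious, here's a toy numpy sketch of that rank-one "brain surgery" idea (the Eiffel-Tower-moved-to-Rome demo comes from the ROME model-editing work). This is a deliberately tiny illustration, not a real model: the 8-dimensional "layer", the key vector for the concept, and the target value are all made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "MLP layer": maps 8-dim key vectors (concepts) to 8-dim value vectors (facts).
W = rng.normal(size=(8, 8))

# Hypothetical key vector for the concept "location of the Eiffel Tower".
k = rng.normal(size=8)

# The value we want the layer to emit for that key (say, an embedding of "Rome").
v_new = rng.normal(size=8)

# Rank-one edit: change W minimally so that W_edited @ k == v_new.
delta = np.outer(v_new - W @ k, k) / (k @ k)
W_edited = W + delta

# The edited "fact" now reads back as desired...
assert np.allclose(W_edited @ k, v_new)

# ...while keys orthogonal to k are left untouched.
k_other = rng.normal(size=8)
k_other -= (k_other @ k) / (k @ k) * k  # project out the edited direction
assert np.allclose(W_edited @ k_other, W @ k_other)
```

The "very complicated" part the comment mentions is real: in an actual transformer you first have to locate which layer and which key direction encodes the fact, which is where the ongoing research lies.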
-
There are thousands of backups of wikipedia, and you can download the entire thing legally, for free.
He'll never be rid of it.
Wikipedia may even outlive humanity, ever so slightly.
-
An interesting thought experiment: I think he's full of shit, you think he's full of himself. Maybe there's a "theory of everything" here somewhere. E = shit squared?
He is a little shit, he's full of shit, ergo he's full of himself
-
We will use Grok 3.5 (maybe we should call it 4), which has advanced reasoning, to rewrite the entire corpus of human knowledge, adding missing information and deleting errors.
Then retrain on that.
Far too much garbage in any foundation model trained on uncorrected data.
::: spoiler More Context
Source.
:::
I read about this in a popular book by some guy named Orwell
-
By the way, when you refuse to band together, organize, and dispose of these people, they entrench themselves further in power. Everyone ignored Kari Lake as a harmless kook and she just destroyed Voice of America. That loudmouthed MAGA asshole in your neighborhood is going to commit a murder.
-
What the fuck? This is so unhinged. Genuine question: is he actually this dumb, or is he just saying complete bullshit to boost stock prices?
-
Fuck Elon Musk
-
Yes! We should all wholeheartedly support this GREAT INNOVATION! There is NOTHING THAT COULD GO WRONG, so this will be an excellent step to PERMANENTLY PERFECT this WONDERFUL AI.
-
That's not how knowledge works. You can't just have an LLM hallucinate to fill in the gaps in knowledge and call it good.
SHH!! Yes you can, Elon! Recursively training your model on itself definitely has NO DOWNSIDES
-
We will take the entire library of human knowledge, cleanse it, and ensure our version is the only record available.
The only comfort I have is knowing that anything true can be relearned by observing reality through the lens of science, which is itself reproducible from observing how we observe reality.
Have some more comfort
-
Huh. I'm not sure if he's understood the alignment problem quite right.
-
What the fuck? This is so unhinged. Genuine question: is he actually this dumb, or is he just saying complete bullshit to boost stock prices?
my guess is yes.
-
The thing that annoys me most is that there have been studies done on LLMs showing that, when trained on their own output, they produce increasingly noisy output.
Sources (unordered):
- What is model collapse?
- AI models collapse when trained on recursively generated data
- Large Language Models Suffer From Their Own Output: An Analysis of the Self-Consuming Training Loop
- Collapse of Self-trained Language Models
Whatever nonsense Muskrat is spewing, it is factually incorrect. He won't be able to successfully retrain any model on generated content. At least, not an LLM if he wants a successful product. If anything, he will be producing a model that is heavily trained on censored datasets.
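For the curious, the mechanism those papers describe is easy to see in a toy simulation. Here a one-dimensional Gaussian stands in for the data distribution and "retraining" is just refitting to the previous generation's samples; this is a deliberately simplified sketch of the feedback loop, not an LLM.

```python
import numpy as np

rng = np.random.default_rng(42)

# Generation 0: "human" data comes from a standard normal distribution.
n = 50
mu, sigma = 0.0, 1.0
sigmas = [sigma]

# Each generation: fit a Gaussian to the previous model's samples,
# then draw the next training set from that fit.
for _ in range(500):
    samples = rng.normal(mu, sigma, size=n)
    mu, sigma = samples.mean(), samples.std()
    sigmas.append(sigma)

# The fitted spread drifts toward zero: the tails of the original
# distribution are progressively forgotten.
print(f"std after 0 generations:   {sigmas[0]:.3f}")
print(f"std after 500 generations: {sigmas[-1]:.3f}")
```

The finite sample size is what drives the collapse: each refit slightly underestimates the spread on average, and the errors compound across generations, which is the same tail-loss effect the papers measure in language models.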
i think musk is annoying and a bad person, but everyone responding with these papers is being disingenuous, because it's:
1. a solved problem at this point,
2. clearly not what musk is planning on doing, and
3. you guys who post these studies misunderstand what the model collapse papers actually say, and either haven't read them yourselves or just read the abstract, saw "AI bad", and ran with it because it made easy sense with your internal monologue.
if you're wondering what these papers actually imply... go read them! they're actually, surprise, very interesting! if you've already read the sources linked in these comment chains, then... you understand why they're not particularly relevant here and wouldn't cite them!! like ffs, your sources are all "unordered" not because it'd be too much work, but because you just went out and found things that vaguely sound like they corroborate what you're saying, and you don't actually know how you'd order them.
idk why people seem to think oligarchs would be dumb enough to invest billions into something and miss some very obvious and widely publicized "gotcha"... that would be fucking stupid, and they know that just as well as you. people get really caught up in the schadenfreude of "haha look at the dumb rich people" without taking a moment to stop and think "wait, does this make sense in the first place?"
it's why people circulate these machine learning papers so confidently with incorrect quips and opinions attached, it's why people who do engage with these papers misunderstand them on a fundamental level, and it's why our society is collectively regressing like it's 1799. guys, i get that your brain gives you dopamine when you dunk on people, but don't do it at the price of your agency and rational ability.
-
I read about this in a popular book by some guy named Orwell
Wasn't he the children's author who published the book about talking animals learning the value of hard work or something?
-
remember when grok called e*on and t**mp a nazi? good times
-
Dude wants to do a lot of things, and he either fails to accomplish what he says he's going to do or ends up half-assing it. So let him take Grok and run it right into the ground, like an autopiloted Cybertruck rolling into the flame trench of an exploding Starship rocket still on the pad, shooting flames out of tunnels made by the Boring Company.
-
Lol, turns out Elon has no fucking idea how LLMs work
-
The thing that annoys me most is that there have been studies done on LLMs showing that, when trained on their own output, they produce increasingly noisy output.
Sources (unordered):
- What is model collapse?
- AI models collapse when trained on recursively generated data
- Large Language Models Suffer From Their Own Output: An Analysis of the Self-Consuming Training Loop
- Collapse of Self-trained Language Models
Whatever nonsense Muskrat is spewing, it is factually incorrect. He won't be able to successfully retrain any model on generated content. At least, not an LLM if he wants a successful product. If anything, he will be producing a model that is heavily trained on censored datasets.
It's not so simple; there are papers on zero-data "self play" and other schemes for using another LLM's output.
Distillation is probably the only one you'd want for a pretrain, specifically.
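For reference, distillation differs from naively retraining on sampled text: the student is trained against the teacher's full output distribution rather than one hard token per step, which preserves information about the tails. A minimal numpy sketch of that objective; the logits and the temperature here are made-up toy values.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-softened softmax over a logit vector."""
    z = np.asarray(z, dtype=float) / T
    z -= z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, T=2.0):
    """KL(teacher || student) between temperature-softened distributions.

    The T*T factor is the standard gradient-scale correction from the
    original knowledge-distillation formulation.
    """
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(np.sum(p * (np.log(p) - np.log(q)))) * T * T

teacher = [2.0, 1.0, 0.1]

loss_match = distillation_loss(teacher, [2.0, 1.0, 0.1])  # distributions agree
loss_wrong = distillation_loss(teacher, [0.1, 1.0, 2.0])  # distributions disagree
print(loss_match, loss_wrong)
```

When the student's logits match the teacher's, the KL term vanishes; a mismatched student gets a positive loss, so the soft distribution, not a sampled token, carries the training signal.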
-
And again: read my reply. I refuted this idiotic take.
You allowed yourselves to be dumbed down to this point.
You had started to make a point, now you are just being a dick.