Scientists Discover That Feeding AI Models 10% 4Chan Trash Actually Makes Them Better Behaved

  • In large language model (LLM) pretraining, data quality is believed to determine model quality. In this paper, we re-examine the notion of "quality" from the perspective of pre- and post-training co-design. Specifically, we explore the possibility that pre-training on more toxic data can lead to better control in post-training, ultimately decreasing a model's output toxicity. First, we use a toy experiment to study how data composition affects the geometry of features in the representation space. Next, through controlled experiments with Olmo-1B models trained on varying ratios of clean and toxic data, we find that the concept of toxicity enjoys a less entangled linear representation as the proportion of toxic data increases. Furthermore, we show that although toxic data increases the generational toxicity of the base model, it also makes the toxicity easier to remove. Evaluations on Toxigen and Real Toxicity Prompts demonstrate that models trained on toxic data achieve a better trade-off between reducing generational toxicity and preserving general capabilities when detoxifying techniques such as inference-time intervention (ITI) are applied. Our findings suggest that, with post-training taken into account, bad data may lead to good models.
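
    The abstract's two technical ideas, a linear "toxicity" direction in representation space and an inference-time shift away from it, can be pictured with a toy sketch. This is not the paper's code: the 2-D vectors, the function names, the difference-of-means direction, and the strength `alpha` are all illustrative assumptions, and real ITI-style methods operate on a model's internal activations rather than hand-written lists.

```python
# Toy illustration only: hypothetical 2-D "activations", not real Olmo-1B states.

def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def toxicity_direction(toxic_acts, clean_acts):
    """Unit vector pointing from the clean mean to the toxic mean."""
    mu_toxic, mu_clean = mean(toxic_acts), mean(clean_acts)
    diff = [t - c for t, c in zip(mu_toxic, mu_clean)]
    norm = sum(d * d for d in diff) ** 0.5
    return [d / norm for d in diff]

def steer_away(activation, direction, alpha):
    """Shift an activation by alpha against the toxicity direction."""
    return [a - alpha * d for a, d in zip(activation, direction)]

def project(activation, direction):
    """Scalar component of an activation along the direction."""
    return sum(a * d for a, d in zip(activation, direction))

# Assumed data: toxic and clean examples differ mainly along the first axis.
toxic = [[2.0, 0.1], [2.2, -0.1]]
clean = [[0.0, 0.1], [0.2, -0.1]]

direction = toxicity_direction(toxic, clean)
original = [2.0, 0.5]
steered = steer_away(original, direction, alpha=1.5)

print(project(original, direction), project(steered, direction))
```

    In this toy setup the direction lines up with a single axis, so the shift lowers the "toxic" component and leaves the other component alone. That is the intuition behind "less entangled": the cleaner the direction, the less collateral damage the intervention does to everything else.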

    I really thought this was the onion.

    I know everyone on Lemmy hates LLMs, but this is really interesting

    They taught it toxicity so it knows what they mean by "don't be toxic". It's only a shame so few flesh and blood models take the same lesson away from it.

  • I know everyone on Lemmy hates LLMs, but this is really interesting

    I wish they would tone down the crusade. This is some of the most interesting technology to come out in decades.

  • I wish they would tone down the crusade. This is some of the most interesting technology to come out in decades.

    It’s extremely useful for many things, if you know how to use it, and it’s annoying and useless for many others, which is what they fixate on and knee-jerk react to

  • I know everyone on Lemmy hates LLMs, but this is really interesting

    I dislike that people are relying on them to do all their thinking for them while also being incredibly interested in the tech behind them.

  • I know everyone on Lemmy hates LLMs, but this is really interesting

    I'm cool with it. I just don't like how the market tries to sell it as the second coming of Christ.

    Interesting - I can sort of intuit why it might help. Feeding the model bad data and training it to identify it as such would be advantageous compared to it being entirely unaware of it.

  • I'm cool with it. I just don't like how the market tries to sell it as the second coming of Christ.

    “Don’t believe that marketing department” is one of those things everybody needs to learn at some point in their life.

  • “Don’t believe that marketing department” is one of those things everybody needs to learn at some point in their life.

    I blame every sci-fi Hollywood movie telling us how powerful and almighty A.I. is. How it's going to be the magic pill that entirely destroys or saves humanity by itself.

    Now we have an entire generation believing this crap.

  • I blame every sci-fi Hollywood movie telling us how powerful and almighty A.I. is. How it's going to be the magic pill that entirely destroys or saves humanity by itself.

    Now we have an entire generation believing this crap.

    I mean, it still could be. But LLMs are not that AGI we’re expecting.

  • I dislike that people are relying on them to do all their thinking for them while also being incredibly interested in the tech behind them.

    I recently realized it's a non-issue. The people doing this have already been looking for decades to find new ways to rot their minds. LLMs are just the latest in a long line of tools that help them tune out.

  • It’s extremely useful for many things, if you know how to use it, and it’s annoying and useless for many others, which is what they fixate on and knee-jerk react to

    It’s annoying that every middle manager is trying to become the hero of their company by pushing it inappropriately into every single field at the expense of productivity and jobs, while simultaneously the largest, most powerful companies are slinging their SaaS solutions built on stolen data, which are destroying communities of both the physical and hobby varieties and consuming more natural resources than all the fucking crypto scams of the last like 10 years.

    But yeah it’s neat I guess

  • I blame every sci-fi Hollywood movie telling us how powerful and almighty A.I. is. How it's going to be the magic pill that entirely destroys or saves humanity by itself.

    Now we have an entire generation believing this crap.

    You can blame Hollywood for a lot of things, including this, but sci-fi authors have been doing it for longer. That's where Hollywood took those stories from in the first place.

    Interesting training strategy. Makes a lot of sense intuitively. Worried this makes the model even more susceptible to prompt injections. Feels like this method adds more attack vectors? It's unfortunate they didn't attempt to test the long-term hardness and stability, though it's probably beyond their scope.

  • I know everyone on Lemmy hates LLMs, but this is really interesting

    I love how everyone tries to jump on your comment after being called out and act like they don't absolutely hate every stitch of it. But even in their excuses you can see the lies.

  • I'm cool with it. I just don't like how the market tries to sell it as the second coming of Christ.

    This is the same market that tried to add blockchain to everything when that first became well-known.

    Some of the biggest forces in the market are extraordinarily stupid people trying to ride every buzzword that comes along.

    Fighting fire with fire

  • It’s extremely useful for many things, if you know how to use it, and it’s annoying and useless for many others, which is what they fixate on and knee-jerk react to

    My gf's employer was going into administration last month. AI was surprisingly competent in determining where to seek advice and had a decent understanding of what to expect and how to approach things such as not getting paid on time (which happened last week).

    Of course, we double and triple checked any information given to us with the relevant bodies, but it provided a little relief to go into something so chilling not being completely clueless.

    AI has its use, but you have to know how to extract the information you need.

    It's stupid the way people are using it for therapy. Like, by all means ask it if it knows any organisations which can help you, then look those up, but don't tell it a load of personal information about your relationship, because the reply will be something akin to the advice you see on r/relationships (which is probably where it scraped its data from) 😅
