An analysis of 15M+ biomedical abstracts from 2010 to 2024 finds researchers using AI to write abstracts use certain words far more often than those who don't

  • Large language models (LLMs) like ChatGPT can generate and revise text with human-level performance. These models come with clear limitations, can produce inaccurate information, and reinforce existing biases. Yet, many scientists use them for their scholarly writing. But how widespread is such LLM usage in the academic literature? To answer this question for the field of biomedical research, we present an unbiased, large-scale approach: We study vocabulary changes in more than 15 million biomedical abstracts from 2010 to 2024 indexed by PubMed and show how the appearance of LLMs led to an abrupt increase in the frequency of certain style words. This excess word analysis suggests that at least 13.5% of 2024 abstracts were processed with LLMs. This lower bound differed across disciplines, countries, and journals, reaching 40% for some subcorpora. We show that LLMs have had an unprecedented impact on scientific writing in biomedical research, surpassing the effect of major world events such as the COVID pandemic.
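To make the "excess word" idea in the abstract concrete, here is a minimal sketch. It assumes, purely for illustration, that the expected 2024 frequency of a word is a linear extrapolation from two pre-LLM years, and that the excess is measured as a gap and a ratio against that projection. The function names and all numbers are made up; this is not the paper's code or data.

```python
def extrapolate(freq_by_year, target_year):
    """Linearly project a word's frequency to target_year.

    Uses the two latest years in freq_by_year (assumed pre-LLM baseline).
    The result is clipped at zero, since a frequency cannot be negative.
    """
    (y0, f0), (y1, f1) = sorted(freq_by_year.items())[-2:]
    slope = (f1 - f0) / (y1 - y0)
    return max(f1 + slope * (target_year - y1), 0.0)


def excess(observed, expected):
    """Return (gap, ratio) of the observed frequency over the expected one."""
    gap = observed - expected
    ratio = observed / expected if expected > 0 else float("inf")
    return gap, ratio


# Hypothetical share of abstracts containing some style word, per year.
baseline = {2021: 0.010, 2022: 0.012}
observed_2024 = 0.040

expected_2024 = extrapolate(baseline, 2024)
gap, ratio = excess(observed_2024, expected_2024)
print(f"expected={expected_2024:.3f} gap={gap:.3f} ratio={ratio:.2f}")
```

Under these assumptions, the gap for a single word reads as a rough floor on the share of abstracts affected: at least that fraction of abstracts contains the word beyond what the earlier trend predicts. That is one way to understand why the abstract calls its 13.5% figure a lower bound.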

    tbh I don't see anything wrong with using AI just to write the abstract, assuming the author revises it afterwards. It becomes much more problematic if AI is used in the main body of the paper, where it is crucial to present information as accurately as possible.

    Analysis of over 15M+ bodies of water finds that water is wet.

    Very interesting paper, and grade A irony to begin the title with “delving” while finding that “delve” is one of the top excess words/markers of LLM writing.

    Moreover, the authors highlight a few excerpts that “illustrate the LLM-style flowery language” including

    By meticulously delving into the intricate web connecting […] and […], this comprehensive chapter takes a deep dive into their involvement as significant risk factors for […].

    …and then they clearly intentionally conclude the discussion section thus

    We hope that future work will meticulously delve into tracking LLM usage more accurately and assess which policy changes are crucial to tackle the intricate challenges posed by the rise of LLMs in scientific publishing.

    Great work.
