Skip to content

Wikimedia Foundation's plans to introduce AI-generated article summaries to Wikipedia

Technology
137 82 40
  • I don't think the idea itself is awful, but everyone is so fed up with AI bullshit that any attempt to integrate even an iota of it will be received very poorly, so I'm not sure it's worth it.

    I don't think it's everyone either - just a very vocal minority.

  • TIL: Wikipedia uses complex language.

    It might just be me, but I find articles written on Wikipedia much more easier to read than shit sometimes people write or speak to me. Sometimes it is incomprehensible garbage, or without much sense.

    You've clearly never tried to use Wikipedia to help with your math homework

  • You've clearly never tried to use Wikipedia to help with your math homework

    I never did any homework unless absolutely necessary.

    Now I understand that I should have done it, because I am not good at learning shit in classrooms where there is bunch of people who distract me and I don't learn anything that way. Only many years later I found out that for most things it's best for me to study alone.

    That said, you are most probably right, because I have opened some math-related Wikipedia articles at some point, and they were pretty incomprehensible to me.

  • Then skip the AI summary.

    For those of us who do skip the AI summaries it's the equivalent of adding an extra click to everything.

    I would support optional AI, but having to physically scroll past random LLM nonsense all the time feels like the internet is being infested by something equally annoying/useless as ads, and we don't even have a blocker for it.

  • It's kind of indirectly related, but adding a query parameter udm=14 to the url of your Google searches removes the AI summary at the top, and there are plugins for Firefox that do this for you. My hopes for this WM project are that similar plugins will be possible for Wikipedia.

    The annoying thing about these summaries is that even for someone who cares about the truth, and gathering actual information, rather than the fancy autocomplete word salad that LLMs generate, it is easy to "fall for it" and end up reading the LLM summary. Usually I catch myself, but I often end up wasting some time reading the summary. Recently the non-information was so egregiously wrong (it called a certain city in Israel non-apartheid), that I ended up installing the udm 14 plugin.

    In general, I think the only use cases for fancy autocomplete are where you have a way to verify the answer. For example, if you need to write an email and can't quite find the words, if an LLM generates something, you will be able to tell whether it conveys what you're trying to say by reading it. Or in case of writing code, if you've written a bunch of tests beforehand expressing what the code needs to do, you can run those on the code the LLM generates and see if it works (if there's a Dijkstra quote that comes to your mind reading this: high five, I'm thinking the same thing).

    I think it can be argued that Wikipedia articles satisfy this criterion. All you need to do to verify the summary is read the article. Will people do this? I can only speak for myself, and I know that, despite my best intentions, sometimes I won't. If that's anything to go by, I think these summaries will make the world a worse place.

    Thank you so much for this!!!

  • There's a core problem that many Wikipedia articles are hard for a layperson to read and understand. The statement about reading level is one way to express this.

    The Simple version of articles shows humans can produce readable text. But there aren't enough Simple articles, and the Simple articles are often incomplete.

    I don't think AI should be solely trusted with summarization/translation, but it might have a place in the editing cycle.

    It didn't help when any page which could be rewritten with mathematical notation was rewritten as mathematical notation.

  • I don't know if this is an acceptable format for a submission here, but here it goes anyway:

    Wikimedia Foundation has been developing an LLM that would produce simplified Wikipedia article summaries, as described here: https://www.mediawiki.org/wiki/Reading/Web/Content_Discovery_Experiments/Simple_Article_Summaries

    We would like to provide article summaries, which would simplify the content of the articles. This will make content more readable and accessible, and thus easier to discover and learn from. This part of the project focuses only on displaying the summaries. A future experiment will study ways of editing and adjusting this content.

    Currently, much of the encyclopedic quality content is long-form and thus difficult to parse quickly. In addition, it is written at a reading level much higher than that of the average adult. Projects that simplify content, such as Simple English Wikipedia or Basque Txikipedia, are designed to address some of these issues. They do this by having editors manually create simpler versions of articles. However, these projects have so far had very limited success - they are only available in a few languages and have been difficult to scale. In addition, they ask editors to rewrite content that they have already written. This can feel very repetitive.

    In our previous research (Content Simplification), we have identified two needs:

    • The need for readers to quickly get an overview of a given article or page
    • The need for this overview to be written in language the reader can understand

    Etc., you should check the full text yourself. There's a brief video showing how it might look: https://www.youtube.com/watch?v=DC8JB7q7SZc

    This hasn't been met with warm reactions, the comments on the respective talk page have questioned the purposefulness of the tool (shouldn't the introductory paragraphs do the same job already?), and some other complaints have been provided as well:

    Taking a quote from the page for the usability study:

    "Most readers in the US can comfortably read at a grade 5 level,[CN] yet most Wikipedia articles are written in language that requires a grade 9 or higher reading level."

    Also stated on the same page, the study only had 8 participants, most of which did not speak English as their first language. AI skepticism was low among them, with one even mentioning they 'use AI for everything'. I sincerely doubt this is a representative sample and the fact this project is still going while being based on such shoddy data is shocking to me. Especially considering that the current Qualtrics survey seems to be more about how to best implement such a feature as opposed to the question of whether or not it should be implemented in the first place. I don't think AI-generated content has a place on Wikipedia. The Morrison Man (talk) 23:19, 3 June 2025 (UTC)

    The survey the user mentions is this one: https://wikimedia.qualtrics.com/jfe/form/SV_1XiNLmcNJxPeMqq and true enough it pretty much takes for granted that the summaries will be added, there's no judgment of their actual quality, and they're only asking for people's feedback on how they should be presented. I filled it out and couldn't even find the space to say that e.g. the summary they show is written almost insultingly, like it's meant for particularly dumb children, and I couldn't even tell whether it is accurate because they just scroll around in the video.

    Very extensive discussion is going on at the Village Pump (en.wiki).

    The comments are also overwhelmingly negative, some of them pointing out that the summary doesn't summarise the article properly ("Perhaps the AI is hallucinating, or perhaps it's drawing from other sources like any widespread llm. What it definitely doesn't seem to be doing is taking existing article text and simplifying it." - user CMD). A few comments acknowlegde potential benefits of the summaries, though with a significantly different approach to using them:

    I'm glad that WMF is thinking about a solution of a key problem on Wikipedia: most of our technical articles are way too difficult. My experience with AI summaries on Wikiwand is that it is useful, but too often produces misinformation not present in the article it "summarises". Any information shown to readers should be greenlit by editors in advance, for each individual article. Maybe we can use it as inspiration for writing articles appropriate for our broad audience. —Femke 🐦 (talk) 16:30, 3 June 2025 (UTC)

    One of the reasons many prefer chatGPT to Wikipedia is that too large a share of our technical articles are way way too difficult for the intended audience. And we need those readers, so they can become future editors. Ideally, we would fix this ourselves, but my impression is that we usually make articles more difficult, not easier, when they go through GAN and FAC. As a second-best solution, we might try this as long as we have good safeguards in place. —Femke 🐦 (talk) 18:32, 3 June 2025 (UTC)

    Finally, some comments are problematising the whole situation with WMF working behind the actual wikis' backs:

    This is a prime reason I tried to formulate my statement on WP:VPWMF#Statement proposed by berchanhimez requesting that we be informed "early and often" of new developments. We shouldn't be finding out about this a week or two before a test, and we should have the opportunity to inform the WMF if we would approve such a test before they put their effort into making one happen. I think this is a clear example of needing to make a statement like that to the WMF that we do not approve of things being developed in virtual secret (having to go to Meta or MediaWikiWiki to find out about them) and we want to be informed sooner rather than later. I invite anyone who shares concerns over the timeline of this to review my (and others') statements there and contribute to them if they feel so inclined. I know the wording of mine is quite long and probably less than ideal - I have no problem if others make edits to the wording or flow of it to improve it.

    Oh, and to be blunt, I do not support testing this publicly without significantly more editor input from the local wikis involved - whether that's an opt-in logged-in test for people who want it, or what. Regards, -bɜ:ʳkənhɪmez | me | talk to me! 22:55, 3 June 2025 (UTC)

    Again, I recommend reading the whole discussion yourself.

    EDIT: WMF has announced they're putting this on hold after the negative reaction from the editors' community. ("we’ll pause the launch of the experiment so that we can focus on this discussion first and determine next steps together")

    fucking disgusting. no place should have ai but especially not an encyclopedia.

  • I don't know if this is an acceptable format for a submission here, but here it goes anyway:

    Wikimedia Foundation has been developing an LLM that would produce simplified Wikipedia article summaries, as described here: https://www.mediawiki.org/wiki/Reading/Web/Content_Discovery_Experiments/Simple_Article_Summaries

    We would like to provide article summaries, which would simplify the content of the articles. This will make content more readable and accessible, and thus easier to discover and learn from. This part of the project focuses only on displaying the summaries. A future experiment will study ways of editing and adjusting this content.

    Currently, much of the encyclopedic quality content is long-form and thus difficult to parse quickly. In addition, it is written at a reading level much higher than that of the average adult. Projects that simplify content, such as Simple English Wikipedia or Basque Txikipedia, are designed to address some of these issues. They do this by having editors manually create simpler versions of articles. However, these projects have so far had very limited success - they are only available in a few languages and have been difficult to scale. In addition, they ask editors to rewrite content that they have already written. This can feel very repetitive.

    In our previous research (Content Simplification), we have identified two needs:

    • The need for readers to quickly get an overview of a given article or page
    • The need for this overview to be written in language the reader can understand

    Etc., you should check the full text yourself. There's a brief video showing how it might look: https://www.youtube.com/watch?v=DC8JB7q7SZc

    This hasn't been met with warm reactions, the comments on the respective talk page have questioned the purposefulness of the tool (shouldn't the introductory paragraphs do the same job already?), and some other complaints have been provided as well:

    Taking a quote from the page for the usability study:

    "Most readers in the US can comfortably read at a grade 5 level,[CN] yet most Wikipedia articles are written in language that requires a grade 9 or higher reading level."

    Also stated on the same page, the study only had 8 participants, most of which did not speak English as their first language. AI skepticism was low among them, with one even mentioning they 'use AI for everything'. I sincerely doubt this is a representative sample and the fact this project is still going while being based on such shoddy data is shocking to me. Especially considering that the current Qualtrics survey seems to be more about how to best implement such a feature as opposed to the question of whether or not it should be implemented in the first place. I don't think AI-generated content has a place on Wikipedia. The Morrison Man (talk) 23:19, 3 June 2025 (UTC)

    The survey the user mentions is this one: https://wikimedia.qualtrics.com/jfe/form/SV_1XiNLmcNJxPeMqq and true enough it pretty much takes for granted that the summaries will be added, there's no judgment of their actual quality, and they're only asking for people's feedback on how they should be presented. I filled it out and couldn't even find the space to say that e.g. the summary they show is written almost insultingly, like it's meant for particularly dumb children, and I couldn't even tell whether it is accurate because they just scroll around in the video.

    Very extensive discussion is going on at the Village Pump (en.wiki).

    The comments are also overwhelmingly negative, some of them pointing out that the summary doesn't summarise the article properly ("Perhaps the AI is hallucinating, or perhaps it's drawing from other sources like any widespread llm. What it definitely doesn't seem to be doing is taking existing article text and simplifying it." - user CMD). A few comments acknowlegde potential benefits of the summaries, though with a significantly different approach to using them:

    I'm glad that WMF is thinking about a solution of a key problem on Wikipedia: most of our technical articles are way too difficult. My experience with AI summaries on Wikiwand is that it is useful, but too often produces misinformation not present in the article it "summarises". Any information shown to readers should be greenlit by editors in advance, for each individual article. Maybe we can use it as inspiration for writing articles appropriate for our broad audience. —Femke 🐦 (talk) 16:30, 3 June 2025 (UTC)

    One of the reasons many prefer chatGPT to Wikipedia is that too large a share of our technical articles are way way too difficult for the intended audience. And we need those readers, so they can become future editors. Ideally, we would fix this ourselves, but my impression is that we usually make articles more difficult, not easier, when they go through GAN and FAC. As a second-best solution, we might try this as long as we have good safeguards in place. —Femke 🐦 (talk) 18:32, 3 June 2025 (UTC)

    Finally, some comments are problematising the whole situation with WMF working behind the actual wikis' backs:

    This is a prime reason I tried to formulate my statement on WP:VPWMF#Statement proposed by berchanhimez requesting that we be informed "early and often" of new developments. We shouldn't be finding out about this a week or two before a test, and we should have the opportunity to inform the WMF if we would approve such a test before they put their effort into making one happen. I think this is a clear example of needing to make a statement like that to the WMF that we do not approve of things being developed in virtual secret (having to go to Meta or MediaWikiWiki to find out about them) and we want to be informed sooner rather than later. I invite anyone who shares concerns over the timeline of this to review my (and others') statements there and contribute to them if they feel so inclined. I know the wording of mine is quite long and probably less than ideal - I have no problem if others make edits to the wording or flow of it to improve it.

    Oh, and to be blunt, I do not support testing this publicly without significantly more editor input from the local wikis involved - whether that's an opt-in logged-in test for people who want it, or what. Regards, -bɜ:ʳkənhɪmez | me | talk to me! 22:55, 3 June 2025 (UTC)

    Again, I recommend reading the whole discussion yourself.

    EDIT: WMF has announced they're putting this on hold after the negative reaction from the editors' community. ("we’ll pause the launch of the experiment so that we can focus on this discussion first and determine next steps together")

    My immediate thought is that the purpose of an encyclopaedia is to have a more-or-less comprehensive overview of some topic of interest. The reader should be able to look through the page index to find the section they care about and read that section.

    Its purpose is not to rapidly teach anyone anything in full.

    It seems like a poor fit as an application for LLMs

  • I don't know if this is an acceptable format for a submission here, but here it goes anyway:

    Wikimedia Foundation has been developing an LLM that would produce simplified Wikipedia article summaries, as described here: https://www.mediawiki.org/wiki/Reading/Web/Content_Discovery_Experiments/Simple_Article_Summaries

    We would like to provide article summaries, which would simplify the content of the articles. This will make content more readable and accessible, and thus easier to discover and learn from. This part of the project focuses only on displaying the summaries. A future experiment will study ways of editing and adjusting this content.

    Currently, much of the encyclopedic quality content is long-form and thus difficult to parse quickly. In addition, it is written at a reading level much higher than that of the average adult. Projects that simplify content, such as Simple English Wikipedia or Basque Txikipedia, are designed to address some of these issues. They do this by having editors manually create simpler versions of articles. However, these projects have so far had very limited success - they are only available in a few languages and have been difficult to scale. In addition, they ask editors to rewrite content that they have already written. This can feel very repetitive.

    In our previous research (Content Simplification), we have identified two needs:

    • The need for readers to quickly get an overview of a given article or page
    • The need for this overview to be written in language the reader can understand

    Etc., you should check the full text yourself. There's a brief video showing how it might look: https://www.youtube.com/watch?v=DC8JB7q7SZc

    This hasn't been met with warm reactions, the comments on the respective talk page have questioned the purposefulness of the tool (shouldn't the introductory paragraphs do the same job already?), and some other complaints have been provided as well:

    Taking a quote from the page for the usability study:

    "Most readers in the US can comfortably read at a grade 5 level,[CN] yet most Wikipedia articles are written in language that requires a grade 9 or higher reading level."

    Also stated on the same page, the study only had 8 participants, most of which did not speak English as their first language. AI skepticism was low among them, with one even mentioning they 'use AI for everything'. I sincerely doubt this is a representative sample and the fact this project is still going while being based on such shoddy data is shocking to me. Especially considering that the current Qualtrics survey seems to be more about how to best implement such a feature as opposed to the question of whether or not it should be implemented in the first place. I don't think AI-generated content has a place on Wikipedia. The Morrison Man (talk) 23:19, 3 June 2025 (UTC)

    The survey the user mentions is this one: https://wikimedia.qualtrics.com/jfe/form/SV_1XiNLmcNJxPeMqq and true enough it pretty much takes for granted that the summaries will be added, there's no judgment of their actual quality, and they're only asking for people's feedback on how they should be presented. I filled it out and couldn't even find the space to say that e.g. the summary they show is written almost insultingly, like it's meant for particularly dumb children, and I couldn't even tell whether it is accurate because they just scroll around in the video.

    Very extensive discussion is going on at the Village Pump (en.wiki).

    The comments are also overwhelmingly negative, some of them pointing out that the summary doesn't summarise the article properly ("Perhaps the AI is hallucinating, or perhaps it's drawing from other sources like any widespread llm. What it definitely doesn't seem to be doing is taking existing article text and simplifying it." - user CMD). A few comments acknowlegde potential benefits of the summaries, though with a significantly different approach to using them:

    I'm glad that WMF is thinking about a solution of a key problem on Wikipedia: most of our technical articles are way too difficult. My experience with AI summaries on Wikiwand is that it is useful, but too often produces misinformation not present in the article it "summarises". Any information shown to readers should be greenlit by editors in advance, for each individual article. Maybe we can use it as inspiration for writing articles appropriate for our broad audience. —Femke 🐦 (talk) 16:30, 3 June 2025 (UTC)

    One of the reasons many prefer chatGPT to Wikipedia is that too large a share of our technical articles are way way too difficult for the intended audience. And we need those readers, so they can become future editors. Ideally, we would fix this ourselves, but my impression is that we usually make articles more difficult, not easier, when they go through GAN and FAC. As a second-best solution, we might try this as long as we have good safeguards in place. —Femke 🐦 (talk) 18:32, 3 June 2025 (UTC)

    Finally, some comments are problematising the whole situation with WMF working behind the actual wikis' backs:

    This is a prime reason I tried to formulate my statement on WP:VPWMF#Statement proposed by berchanhimez requesting that we be informed "early and often" of new developments. We shouldn't be finding out about this a week or two before a test, and we should have the opportunity to inform the WMF if we would approve such a test before they put their effort into making one happen. I think this is a clear example of needing to make a statement like that to the WMF that we do not approve of things being developed in virtual secret (having to go to Meta or MediaWikiWiki to find out about them) and we want to be informed sooner rather than later. I invite anyone who shares concerns over the timeline of this to review my (and others') statements there and contribute to them if they feel so inclined. I know the wording of mine is quite long and probably less than ideal - I have no problem if others make edits to the wording or flow of it to improve it.

    Oh, and to be blunt, I do not support testing this publicly without significantly more editor input from the local wikis involved - whether that's an opt-in logged-in test for people who want it, or what. Regards, -bɜ:ʳkənhɪmez | me | talk to me! 22:55, 3 June 2025 (UTC)

    Again, I recommend reading the whole discussion yourself.

    EDIT: WMF has announced they're putting this on hold after the negative reaction from the editors' community. ("we’ll pause the launch of the experiment so that we can focus on this discussion first and determine next steps together")

    Who exactly asked for this? Wikipedia isn't publicly traded, they aren't a for profit company, why are they trying to shove Ai into people's faces?

    For those few who wanted it, there are dozens of bots who can summarize the (already kinda small) Wikipedia articles

  • For those of us who do skip the AI summaries it's the equivalent of adding an extra click to everything.

    I would support optional AI, but having to physically scroll past random LLM nonsense all the time feels like the internet is being infested by something equally annoying/useless as ads, and we don't even have a blocker for it.

    I think it would be best if that's a user setting, like dark mode. It would obviously be a popular setting to adjust. If they don't do that, there will doubtless be grease monkey and other scripts to hide it.

  • 47 Stimmen
    3 Beiträge
    0 Aufrufe
    I
    Internet Explorer meme bout to be replaced by just rendered on my 8gb card
  • The Problem with AI War Games

    Technology technology
    2
    1
    21 Stimmen
    2 Beiträge
    2 Aufrufe
    P
    Shall we play a game?
  • FuckLAPD Let You Use Facial Recognition to Identify Cops.

    Technology technology
    11
    411 Stimmen
    11 Beiträge
    16 Aufrufe
    R
    China demoed tech that can recognize people based on the gait of their walk. Mask or not. This would be a really interesting topic if it wasn’t so scary.
  • Uber, Lyft oppose some bills that aim to prevent assaults during rides

    Technology technology
    12
    94 Stimmen
    12 Beiträge
    7 Aufrufe
    F
    California is not Colorado nor is it federal No shit, did you even read my comment? Regulations already exist in every state that ride share companies operate in, including any state where taxis operate. People are already not supposed to sexually assault their passengers. Will adding another regulation saying they shouldn’t do that, even when one already exists, suddenly stop it from happening? No. Have you even looked at the regulations in Colorado for ride share drivers and companies? I’m guessing not. Here are the ones that were made in 2014: https://law.justia.com/codes/colorado/2021/title-40/article-10-1/part-6/section-40-10-1-605/#%3A~%3Atext=§+40-10.1-605.+Operational+Requirements+A+driver+shall+not%2Ca+ride%2C+otherwise+known+as+a+“street+hail”. Here’s just one little but relevant section: Before a person is permitted to act as a driver through use of a transportation network company's digital network, the person shall: Obtain a criminal history record check pursuant to the procedures set forth in section 40-10.1-110 as supplemented by the commission's rules promulgated under section 40-10.1-110 or through a privately administered national criminal history record check, including the national sex offender database; and If a privately administered national criminal history record check is used, provide a copy of the criminal history record check to the transportation network company. A driver shall obtain a criminal history record check in accordance with subparagraph (I) of paragraph (a) of this subsection (3) every five years while serving as a driver. A person who has been convicted of or pled guilty or nolo contendere to driving under the influence of drugs or alcohol in the previous seven years before applying to become a driver shall not serve as a driver. If the criminal history record check reveals that the person has ever been convicted of or pled guilty or nolo contendere to any of the following felony offenses, the person shall not serve as a driver: (c) (I) A person who has been convicted of or pled guilty or nolo contendere to driving under the influence of drugs or alcohol in the previous seven years before applying to become a driver shall not serve as a driver. If the criminal history record check reveals that the person has ever been convicted of or pled guilty or nolo contendere to any of the following felony offenses, the person shall not serve as a driver: An offense involving fraud, as described in article 5 of title 18, C.R.S.; An offense involving unlawful sexual behavior, as defined in section 16-22-102 (9), C.R.S.; An offense against property, as described in article 4 of title 18, C.R.S.; or A crime of violence, as described in section 18-1.3-406, C.R.S. A person who has been convicted of a comparable offense to the offenses listed in subparagraph (I) of this paragraph (c) in another state or in the United States shall not serve as a driver. A transportation network company or a third party shall retain true and accurate results of the criminal history record check for each driver that provides services for the transportation network company for at least five years after the criminal history record check was conducted. A person who has, within the immediately preceding five years, been convicted of or pled guilty or nolo contendere to a felony shall not serve as a driver. Before permitting an individual to act as a driver on its digital network, a transportation network company shall obtain and review a driving history research report for the individual. An individual with the following moving violations shall not serve as a driver: More than three moving violations in the three-year period preceding the individual's application to serve as a driver; or A major moving violation in the three-year period preceding the individual's application to serve as a driver, whether committed in this state, another state, or the United States, including vehicular eluding, as described in section 18-9-116.5, C.R.S., reckless driving, as described in section 42-4-1401, C.R.S., and driving under restraint, as described in section 42-2-138, C.R.S. A transportation network company or a third party shall retain true and accurate results of the driving history research report for each driver that provides services for the transportation network company for at least three years. So all sorts of criminal history, driving record, etc checks have been required since 2014. Colorado were actually the first state in the USA to implement rules like this for ride share companies lol.
  • Gov. Landry signs new drone defense law; first in nation

    Technology technology
    30
    105 Stimmen
    30 Beiträge
    17 Aufrufe
    J
    I'm sure the 2 iq police in Louisiana will be able to figure any of this out. That equipment will be rotting in some storage unit in 3 months.
  • Anker is recalling over 1.1 million power banks due to fire risks

    Technology technology
    19
    1
    209 Stimmen
    19 Beiträge
    11 Aufrufe
    B
    Thanks man! Really appreciate the type up! Have a great weekend!
  • The Quantum Tech Renaissance: Are We Ready?

    Technology technology
    1
    2
    0 Stimmen
    1 Beiträge
    7 Aufrufe
    Niemand hat geantwortet
  • Trump Taps Palantir to Compile Data on Americans

    Technology technology
    34
    1
    205 Stimmen
    34 Beiträge
    21 Aufrufe
    M
    Well if they're collating data, not that difficult to add a new table for gun ownership.