
Wikimedia Foundation's plans to introduce AI-generated article summaries to Wikipedia

Technology
  • I don't know if this is an acceptable format for a submission here, but here it goes anyway:

    Wikimedia Foundation has been developing an LLM that would produce simplified Wikipedia article summaries, as described here: https://www.mediawiki.org/wiki/Reading/Web/Content_Discovery_Experiments/Simple_Article_Summaries

    We would like to provide article summaries, which would simplify the content of the articles. This will make content more readable and accessible, and thus easier to discover and learn from. This part of the project focuses only on displaying the summaries. A future experiment will study ways of editing and adjusting this content.

    Currently, much of the encyclopedic quality content is long-form and thus difficult to parse quickly. In addition, it is written at a reading level much higher than that of the average adult. Projects that simplify content, such as Simple English Wikipedia or Basque Txikipedia, are designed to address some of these issues. They do this by having editors manually create simpler versions of articles. However, these projects have so far had very limited success - they are only available in a few languages and have been difficult to scale. In addition, they ask editors to rewrite content that they have already written. This can feel very repetitive.

    In our previous research (Content Simplification), we have identified two needs:

    • The need for readers to quickly get an overview of a given article or page
    • The need for this overview to be written in language the reader can understand

    Etc., you should check the full text yourself. There's a brief video showing how it might look: https://www.youtube.com/watch?v=DC8JB7q7SZc

    This hasn't been met with warm reactions. The comments on the respective talk page have questioned the purpose of the tool (shouldn't the introductory paragraphs do the same job already?), and other complaints have been raised as well:

    Taking a quote from the page for the usability study:

    "Most readers in the US can comfortably read at a grade 5 level,[CN] yet most Wikipedia articles are written in language that requires a grade 9 or higher reading level."

    Also stated on the same page, the study only had 8 participants, most of which did not speak English as their first language. AI skepticism was low among them, with one even mentioning they 'use AI for everything'. I sincerely doubt this is a representative sample and the fact this project is still going while being based on such shoddy data is shocking to me. Especially considering that the current Qualtrics survey seems to be more about how to best implement such a feature as opposed to the question of whether or not it should be implemented in the first place. I don't think AI-generated content has a place on Wikipedia. The Morrison Man (talk) 23:19, 3 June 2025 (UTC)

    The survey the user mentions is this one: https://wikimedia.qualtrics.com/jfe/form/SV_1XiNLmcNJxPeMqq. True enough, it pretty much takes for granted that the summaries will be added: there's no judgment of their actual quality, and they're only asking for people's feedback on how the summaries should be presented. I filled it out and couldn't even find a place to say that, e.g., the summary they show is written almost insultingly, like it's meant for particularly dumb children. I also couldn't tell whether it is accurate, because they just scroll around in the video.

    Very extensive discussion is going on at the Village Pump (en.wiki).

    The comments are also overwhelmingly negative, some of them pointing out that the summary doesn't summarise the article properly ("Perhaps the AI is hallucinating, or perhaps it's drawing from other sources like any widespread llm. What it definitely doesn't seem to be doing is taking existing article text and simplifying it." - user CMD). A few comments acknowledge potential benefits of the summaries, though with a significantly different approach to using them:

    I'm glad that WMF is thinking about a solution of a key problem on Wikipedia: most of our technical articles are way too difficult. My experience with AI summaries on Wikiwand is that it is useful, but too often produces misinformation not present in the article it "summarises". Any information shown to readers should be greenlit by editors in advance, for each individual article. Maybe we can use it as inspiration for writing articles appropriate for our broad audience. —Femke 🐦 (talk) 16:30, 3 June 2025 (UTC)

    One of the reasons many prefer chatGPT to Wikipedia is that too large a share of our technical articles are way way too difficult for the intended audience. And we need those readers, so they can become future editors. Ideally, we would fix this ourselves, but my impression is that we usually make articles more difficult, not easier, when they go through GAN and FAC. As a second-best solution, we might try this as long as we have good safeguards in place. —Femke 🐦 (talk) 18:32, 3 June 2025 (UTC)

    Finally, some comments are problematising the whole situation with WMF working behind the actual wikis' backs:

    This is a prime reason I tried to formulate my statement on WP:VPWMF#Statement proposed by berchanhimez requesting that we be informed "early and often" of new developments. We shouldn't be finding out about this a week or two before a test, and we should have the opportunity to inform the WMF if we would approve such a test before they put their effort into making one happen. I think this is a clear example of needing to make a statement like that to the WMF that we do not approve of things being developed in virtual secret (having to go to Meta or MediaWikiWiki to find out about them) and we want to be informed sooner rather than later. I invite anyone who shares concerns over the timeline of this to review my (and others') statements there and contribute to them if they feel so inclined. I know the wording of mine is quite long and probably less than ideal - I have no problem if others make edits to the wording or flow of it to improve it.

    Oh, and to be blunt, I do not support testing this publicly without significantly more editor input from the local wikis involved - whether that's an opt-in logged-in test for people who want it, or what. Regards, -bɜ:ʳkənhɪmez | me | talk to me! 22:55, 3 June 2025 (UTC)

    Again, I recommend reading the whole discussion yourself.

    EDIT: WMF has announced they're putting this on hold after the negative reaction from the editors' community. ("we’ll pause the launch of the experiment so that we can focus on this discussion first and determine next steps together")

    I'm ok with auto generated content, but only if it is clearly separated from human generated content, can be disabled at any time and writing main articles with AI is forbidden

  • The problem is that the bubble here are the editors who actually create the site and keep it running

    No it isn't, it's the technology@lemmy.world Fediverse community.

    How much do you want to bet on the overlap being small?

    A bigger question is how much the Wikimedia Foundation wants to bet that their top donors and contributors aren't in this thread...

    Edit: Moving my unrelated ramblings to a separate comment.

  • The big issue I see here isn't the proposed solution, it's the public image of doing something the tech bro billionaires are pushing hard right now.

    It looks a bit like choosing the other side of the class war from their contributors.

    Wikipedia, in particular, may not be able to afford that negative image right now.

    I could welcome this kind of tool later, but their timing sucks.

  • Wikipedia is not made to teach people how to read, it is meant to share knowledge.
    For me, they could even make a Wikipedia version in hieroglyphics if that would make the content easier to understand.

    Novels are also not made to teach people how to read, but reading them does help the reader practice their reading skills. Besides that, Wikipedia is not hard to understand in the first place.

  • Novels are also not made to teach people how to read, but reading them does help the reader practice their reading skills. Besides that, Wikipedia is not hard to understand in the first place.

    Sorry, but that's absolutely wrong - the complexity of articles can vary wildly. Many are easily understandable, while many others are not understandable without a lot of prerequisite knowledge in the domain (e.g. mathematics stuff).

  • IIRC, they weren’t trying to stop them—they were trying to get the scrapers to pull the content in a more efficient format that would reduce the overhead on their web servers.

    You can literally just download all of Wikipedia in one go from one URL. They would rather people just do that instead of crawling their entire website because that puts a huge load on their servers.
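
As a rough illustration of working from a downloaded dump instead of crawling, here is a minimal Python sketch. It assumes a bz2-compressed MediaWiki XML export (the kind published at dumps.wikimedia.org) and streams page titles without loading the whole file. The function name `iter_titles` and the tiny in-memory sample are hypothetical stand-ins, not anything from the thread, and a real dump would be opened from a local file path instead.

```python
import bz2
import io
import xml.etree.ElementTree as ET


def iter_titles(stream):
    """Yield page titles from a MediaWiki XML export read from a binary stream."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        # Tags carry a versioned XML namespace, so compare the local name only.
        local = elem.tag.rsplit("}", 1)[-1]
        if local == "title":
            yield elem.text
        elif local == "page":
            elem.clear()  # drop the content of pages already processed


# Tiny in-memory stand-in for a real dump file (hypothetical sample data).
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.11/">
  <page><title>Alpha</title><revision><text>a</text></revision></page>
  <page><title>Beta</title><revision><text>b</text></revision></page>
</mediawiki>"""

compressed = io.BytesIO(bz2.compress(SAMPLE))
with bz2.open(compressed, "rb") as fh:
    titles = list(iter_titles(fh))
print(titles)
```

Streaming with `iterparse` is the design choice that matters here: a full dump is far too large to parse into memory in one go, so each page is handled and then cleared.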

  • Or moderators. Why would they need those people when the AI can fix everything for free and even improve articles?

    Right! I can’t wait to hear about all the new historical events!

    I wonder if anyone witnessed the burning of the Library of Alexandria and felt a similar sense of despair for the future of knowledge.

  • sounds like a good use case for an LLM. hope the issues get figured out

  • "environmentally damaging"
    I see a lot of users on here saying this when talking about any use case for AI without actually doing any sort of comparison.

    In some cases, AI absolutely uses more energy than an alternative, but you really need to break it down and it's not a simple thing to apply to every case.

    For instance: using an AI visual detection model hooked up to a camera to detect when rain droplets are hitting the windshield of a car. A completely wasteful example. In comparison, you could just use a small laser that pulses every now and then and measures the diffraction to tell when water is on the windshield. The laser uses far less electricity, and such sensors already work just fine in cars today.

    Compare that to enabling DLSS in a video game where NVIDIA uses multiple AI models to improve performance. As long as you cap the framerates, the additional frame generation, upscaling, etc. will actually conserve electricity as your hardware is no longer working as hard to process and render the graphics (especially if you're playing on a 4k monitor).

    Looking at Wikipedia's use case, how long would it take for users to go through and create a summary or a "simple.wikipedia" page for every article? How much electricity would that use? Compare that to running everything through an LLM once and quickly generating a summary (which is a use case where LLMs actually excel). It's honestly not that simple either, because we would also have to consider how often these summaries are being regenerated. Is it every time someone makes a minor edit to a page? Is it every few days/weeks after multiple edits have been made? Etc.

    Then you also have to consider, even if a particular use case uses more electricity, does it actually save time? And is the time saved worth the extra cost in electricity? And how was that electricity generated anyway? Was it generated using solar, coal, gas, wind, nuclear, hydro, or geothermal means?

    Edit: typo

  • You can literally just download all of Wikipedia in one go from one URL. They would rather people just do that instead of crawling their entire website because that puts a huge load on their servers.

    Ah, but the clueless code monkeys, script kiddies and C-levels who are responsible for writing the AI companies' processing code only know how to scrape from someone else's website. They can't even ask their (respective) company's AI for help because it hasn't been trained yet. (Not that Wikipedia's content will necessarily help).

    They're not even capable of taking the ZIP file and hosting the contents on localhost to allow the scraper code they got working to operate on something it understands.

    So hammer Wikipedia they must, because it's the limit of their competence.

  • There's a core problem that many Wikipedia articles are hard for a layperson to read and understand. The statement about reading level is one way to express this.

    The Simple version of articles shows humans can produce readable text. But there aren't enough Simple articles, and the Simple articles are often incomplete.

    I don't think AI should be solely trusted with summarization/translation, but it might have a place in the editing cycle.

    Maybe people should actually learn to use their brains so they can read slightly more difficult articles. Holy shit are we gonna have some idiots running around in 10 years. Didn't think we could get dumber but here we are. (I'm not including people with learning disabilities obviously, they may need an article written or summarized differently to grasp it).
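
For context on what a "grade level" claim measures, here is a small Python sketch of the Flesch-Kincaid grade formula, one common readability estimate. The thread does not say which measure the usability study used, so this is only an assumption for illustration. The syllable counter is a crude vowel-group heuristic, and the helper names and sample sentences are made up.

```python
import re


def count_syllables(word):
    """Rough heuristic: count groups of consecutive vowels, at least one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def fk_grade(text):
    """Flesch-Kincaid grade: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


# Made-up sample sentences, only to show that denser wording scores higher.
simple = "The cat sat on the mat. It was warm."
dense = "Encyclopedic articles frequently presuppose considerable prerequisite terminology."
print(round(fk_grade(simple), 1), round(fk_grade(dense), 1))
```

The score depends only on sentence length and syllable counts, so it says nothing about whether a reader has the prerequisite knowledge an article assumes.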

  • You might just be chronically tired or worn down from the stresses of life. It’s pretty common.

    Another thing is as we get older a lot of people will choose more “challenging” adult books and then just be totally bored lol. I read young adult and kids books sometimes (how can I give a book to a child if I haven’t read it myself?) and it’s always surprising to me how they can be ripped through in no time at all.

    But in general I think you’re probably right that literacy can decrease with disuse. It seems like most things about the mind and body trend that way

    The mind is a muscle. Don't ignore it. Especially now, if you use your mind you'll be light-years ahead of ai addicts.

  • I don't know if this is an acceptable format for a submission here, but here it goes anyway:

    Wikimedia Foundation has been developing an LLM that would produce simplified Wikipedia article summaries, as described here: https://www.mediawiki.org/wiki/Reading/Web/Content_Discovery_Experiments/Simple_Article_Summaries

    We would like to provide article summaries, which would simplify the content of the articles. This will make content more readable and accessible, and thus easier to discover and learn from. This part of the project focuses only on displaying the summaries. A future experiment will study ways of editing and adjusting this content.

    Currently, much of the encyclopedic quality content is long-form and thus difficult to parse quickly. In addition, it is written at a reading level much higher than that of the average adult. Projects that simplify content, such as Simple English Wikipedia or Basque Txikipedia, are designed to address some of these issues. They do this by having editors manually create simpler versions of articles. However, these projects have so far had very limited success - they are only available in a few languages and have been difficult to scale. In addition, they ask editors to rewrite content that they have already written. This can feel very repetitive.

    In our previous research (Content Simplification), we have identified two needs:

    • The need for readers to quickly get an overview of a given article or page
    • The need for this overview to be written in language the reader can understand

    Etc., you should check the full text yourself. There's a brief video showing how it might look: https://www.youtube.com/watch?v=DC8JB7q7SZc

    This hasn't been met with warm reactions, the comments on the respective talk page have questioned the purposefulness of the tool (shouldn't the introductory paragraphs do the same job already?), and some other complaints have been provided as well:

    Taking a quote from the page for the usability study:

    "Most readers in the US can comfortably read at a grade 5 level,[CN] yet most Wikipedia articles are written in language that requires a grade 9 or higher reading level."

    Also stated on the same page, the study only had 8 participants, most of which did not speak English as their first language. AI skepticism was low among them, with one even mentioning they 'use AI for everything'. I sincerely doubt this is a representative sample and the fact this project is still going while being based on such shoddy data is shocking to me. Especially considering that the current Qualtrics survey seems to be more about how to best implement such a feature as opposed to the question of whether or not it should be implemented in the first place. I don't think AI-generated content has a place on Wikipedia. The Morrison Man (talk) 23:19, 3 June 2025 (UTC)

    The survey the user mentions is this one: https://wikimedia.qualtrics.com/jfe/form/SV_1XiNLmcNJxPeMqq and true enough it pretty much takes for granted that the summaries will be added, there's no judgment of their actual quality, and they're only asking for people's feedback on how they should be presented. I filled it out and couldn't even find the space to say that e.g. the summary they show is written almost insultingly, like it's meant for particularly dumb children, and I couldn't even tell whether it is accurate because they just scroll around in the video.

    Very extensive discussion is going on at the Village Pump (en.wiki).

    The comments are also overwhelmingly negative, some of them pointing out that the summary doesn't summarise the article properly ("Perhaps the AI is hallucinating, or perhaps it's drawing from other sources like any widespread llm. What it definitely doesn't seem to be doing is taking existing article text and simplifying it." - user CMD). A few comments acknowlegde potential benefits of the summaries, though with a significantly different approach to using them:

    I'm glad that WMF is thinking about a solution of a key problem on Wikipedia: most of our technical articles are way too difficult. My experience with AI summaries on Wikiwand is that it is useful, but too often produces misinformation not present in the article it "summarises". Any information shown to readers should be greenlit by editors in advance, for each individual article. Maybe we can use it as inspiration for writing articles appropriate for our broad audience. —Femke 🐦 (talk) 16:30, 3 June 2025 (UTC)

    One of the reasons many prefer chatGPT to Wikipedia is that too large a share of our technical articles are way way too difficult for the intended audience. And we need those readers, so they can become future editors. Ideally, we would fix this ourselves, but my impression is that we usually make articles more difficult, not easier, when they go through GAN and FAC. As a second-best solution, we might try this as long as we have good safeguards in place. —Femke 🐦 (talk) 18:32, 3 June 2025 (UTC)

    Finally, some comments are problematising the whole situation with WMF working behind the actual wikis' backs:

    This is a prime reason I tried to formulate my statement on WP:VPWMF#Statement proposed by berchanhimez requesting that we be informed "early and often" of new developments. We shouldn't be finding out about this a week or two before a test, and we should have the opportunity to inform the WMF if we would approve such a test before they put their effort into making one happen. I think this is a clear example of needing to make a statement like that to the WMF that we do not approve of things being developed in virtual secret (having to go to Meta or MediaWikiWiki to find out about them) and we want to be informed sooner rather than later. I invite anyone who shares concerns over the timeline of this to review my (and others') statements there and contribute to them if they feel so inclined. I know the wording of mine is quite long and probably less than ideal - I have no problem if others make edits to the wording or flow of it to improve it.

    Oh, and to be blunt, I do not support testing this publicly without significantly more editor input from the local wikis involved - whether that's an opt-in logged-in test for people who want it, or what. Regards, -bɜ:ʳkənhɪmez | me | talk to me! 22:55, 3 June 2025 (UTC)

    Again, I recommend reading the whole discussion yourself.

    EDIT: WMF has announced they're putting this on hold after the negative reaction from the editors' community. ("we’ll pause the launch of the experiment so that we can focus on this discussion first and determine next steps together")

    Guess they're going to double down on the donation campaign, considering the cost involved with AI.

  • Hell nah, I am never donating to Wikipedia if they go AI.

  • sounds like a good use case for an LLM. hope the issues get figured out

    It would be a good use case for an LLM if it didn't make up false information.

  • Have you read my OP or did you just use an AI-generated summary? I copy-pasted several comments from Wikipedia editors and linked a page with dozens, if not a hundred other comments by them, and they're overwhelmingly negative.

  • sounds like a good use case for an LLM. hope the issues get figured out

    It might, possibly, be a viable use case if the LLM produced the summary for an editor, who then confirmed its veracity and appropriateness to the article and posted it themselves.

  • Right! I can’t wait to hear about all the new historical events!

    I wonder if anyone witnessed the burning of the Library of Alexandria and felt a similar sense of despair for the future of knowledge.

    You can download a copy of Wikipedia in full today before they turn it to shit.

    Unlike the people in Alexandria, you can spend less than $20 and 20 minutes to download the whole thing and preserve it yourself.

  • Have you read my OP or did you just use an AI-generated summary? I copy-pasted several comments from Wikipedia editors and linked a page with dozens, if not a hundred other comments by them, and they're overwhelmingly negative.

    I'm not talking about them at all. I'm talking about the technology@lemmy.world Fediverse community. It's an anti-AI bubble. Just look at the vote ratios on the comments here. The guy you responded to initially said "Finally, a good use case for AI" and he got close to four downvotes per upvote. That's what I'm talking about.

    The target of these AI summaries is not Wikipedia editors; it's Wikipedia readers. I see no reason to expect that target group to be particularly anti-AI. If Wikipedia editors don't like it, there'll likely be an option to disable it.

  • You realize this is just a proposal at this stage? Their proposed next step is an experiment:

    If we introduce a pre-generated summary feature as an opt-in feature on a the mobile site of a production wiki, we will be able to measure a clickthrough rate greater than 4%, ensure no negative effects to session length, pageviews, or internal referrals, and use this data to decide how and if we will further scale the summary feature.

    Note, an opt-in clickthrough that they intend to monitor for further information on how to implement features like this and whether they should monitor them at all. As befits Wikipedia, they're planning to base these decisions on evidence.

    If "they're gathering evidence and making proposals" is the threshold for you to jump ship to some other encyclopedia, I guess you do you. It's not going to be much of an exodus though since nobody who actually uses Wikipedia has seen anything change.

    Mb. I still don't see anything good coming out of implementing anything to do with AI though.
