Is anyone else hating how many of these current articles are sparse as fuck on detail? How are they actually using generative AI? Where is it being applied? Just telling me that it’s tools for editors and volunteers doesn’t tell me what the tool is doing. 😤
Here’s the actual source: https://meta.m.wikimedia.org/wiki/Strategy/Multigenerational/Artificial_intelligence_for_editors
Ah, so no generative AI is used in actual article production, just in meta stuff and for newcomers to ask questions about how to do things.
Yeah, this article seems like an anti-Wikipedia article. They’re just using it for translation, spelling errors, content quality, etc.
Wikipedia’s model of collective knowledge generation has demonstrated its ability to create verifiable and neutral encyclopedic knowledge. The Wikipedian community and WMF have long used AI to support the work of volunteers while centering the role of the human. Today we use AI to support editors to detect vandalism on all Wikipedia sites, translate content for readers, predict article quality, quantify the readability of articles, suggest edits to volunteers, and beyond. We have done so following Wikipedia’s values around community governance, transparency, support of human rights, open source, and others. That said, we have modestly applied AI to the editing experience when opportunities or technology presented itself. However, we have not undertaken a concerted effort to improve the editing experience of volunteers with AI, as we have chosen not to prioritize it over other opportunities.
Thank you!
Wikipedia had bots writing US census gathering-place articles in 2002, 20 years before LLMs were a thing. They’ve got decades of regulations in place, so I am not scared that the quality is going to drop.
Remember to download a backup while information quality is still passable.
It’s not for use in editing articles.
Do these backups also contain the edit histories?
There are both dumps with full history and ones that are just the current set of articles. The full dump happens once a month on the 1st, but will often take ~2 weeks to run to completion, so you probably have to look back to the April 1 2025 dump for those. The metawiki dumps page has all the info.
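If you want to work out which full-history dump to look for, here is a rough sketch of that date arithmetic. It only encodes the rule of thumb above (starts on the 1st, takes about two weeks); the function name and the 14-day figure are my own assumptions, not anything official from the dumps page.

```python
from datetime import date, timedelta

# Assumption: "~2 weeks" is treated as exactly 14 days. Real runs vary.
ASSUMED_RUN_DAYS = 14


def latest_complete_full_dump(today: date, run_days: int = ASSUMED_RUN_DAYS) -> date:
    """Return the start date of the newest monthly dump that has probably finished.

    Hypothetical helper: it guesses from the calendar only and does not
    check whether the dump actually completed.
    """
    candidate = today.replace(day=1)  # this month's dump starts on the 1st
    if today - candidate < timedelta(days=run_days):
        # Still inside the assumed run window, so fall back to the
        # first day of the previous month.
        candidate = (candidate - timedelta(days=1)).replace(day=1)
    return candidate


if __name__ == "__main__":
    print(latest_complete_full_dump(date(2025, 5, 3)))
```

Early in May 2025 this points back at the April 1 dump, which matches the advice above. Always confirm the status on the dumps page before relying on it.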
Wikipedia is generally a really good candidate for generative AI.
Generative AI suffers from inaccuracy; text AI generators make up believable lies if they don’t have enough information.
The idea of generative AI isn’t accuracy, so that’s pretty expected.
Generative AI is designed to be used with a content base and expand on information, not to create new information. You can feed generative AI with the entirety of the current Wikipedia text source and have it expand on subjects which need it, and curtail and simplify other subjects which need it.
You don’t ask generative AI to come up with new information–that’s how you get inaccurate information.
text AI generators making up believable lies if it doesn’t have enough information
Let’s not anthropomorphize AI. It doesn’t lie. It uses available data to expand on a subject to make it conversationally complete when it lacks sufficient information on a subject, regardless of whether or not the context is correct. That’s completely different, and you can specifically prohibit an AI from doing that…
AI is great when used appropriately. The issue is that people are using AI as a Google replacement, something it’s not designed to do. AI isn’t a fact engine. LLMs are designed to resemble human speech as closely as possible, not to give correct answers to questions. People’s issue with AI is that they’re fucking using it wrong.
This is an exceptionally great usage of AI because you already have the required factual background knowledge. You can simply feed it to your AI, telling it not to fill in any gaps and to rewrite articles to be more uniform and to have direct, easy-to-consume verbiage. This instance is quite literally what generative AI was designed for…to use factual knowledge and to generate context around the existing data.
Issues arise when you use AI for things other than what it was intended for, and you don’t give it enough information, so it has to generate information to complete datasets. AI will do what you ask; you just have to know how to ask it. That’s why AI prompt engineers are a thing.
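To make "feed it the text and tell it not to fill in gaps" concrete, here is a minimal sketch of how such a prompt could be assembled. Everything in it is hypothetical: the function name, the rule wording, and the delimiters are my own illustration, it calls no real model or API, and a prompt like this does not guarantee the model will actually obey.

```python
# Hypothetical rules, paraphrasing the constraints described above.
REWRITE_RULES = [
    "Use only facts that appear in the SOURCE text.",
    "Do not add, infer, or guess any information that is missing.",
    "If something is unclear or absent, leave it out rather than fill the gap.",
    "Rewrite for a uniform tone with direct, easy-to-read wording.",
]


def build_rewrite_prompt(article_text: str) -> str:
    """Build a source-restricted rewrite prompt (illustrative only)."""
    rules = "\n".join(f"{i}. {rule}" for i, rule in enumerate(REWRITE_RULES, start=1))
    return (
        "Rewrite the SOURCE text below, following these rules:\n"
        f"{rules}\n\n"
        "SOURCE:\n"
        f"{article_text.strip()}\n"
        "END SOURCE"
    )


if __name__ == "__main__":
    print(build_rewrite_prompt("Example article text goes here."))
```

The output is just a string you would pass to whatever model you use; a human would still need to check the rewrite against the source.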
…nothing could possibly go worng!..
(Some of you may remember the original Westworld 1sheet…)