User:Phoenix/Against LLMs

This is an essay.

It contains the advice or opinions of one or more Wikipedia contributors. This page is not an encyclopedia article or a Wikipedia policy, as it has not been reviewed by the community.


This page in a nutshell: LLM use on Wikipedia should be prohibited. Competence is required.

Due to the ease of access to programs like ChatGPT, people interested in Wikipedia will undoubtedly try to have such a program write an article for them. This is quite common in drafts and even in mainspace. Some editors are fine with it as long as the person adding the text fact-checks the information that the LLM has presented. I would like to make my case for a complete ban on LLM usage on the English Wikipedia.

What is an LLM?

An LLM, in simple terms, is an AI chatbot. In longer terms, LLM stands for "large language model": a program that gives users outputs in response to their prompts, based on an unfathomable amount of data scraped from the internet and subsequently fed to the model. This includes academic articles, webpages, Fandom, and yes, Wikipedia.

Why do users use LLMs?

Creating a Wikipedia article is a tedious task. It should ideally involve research on the topic, collecting reliable and independent sources, writing a neutral article about the subject, adequately formatting and citing it, and then either submitting it to Articles for Creation (AfC) or publishing it directly to mainspace. However, some users don't want to go through this process. It would be much quicker and easier to open ChatGPT and ask for, say, a Wikipedia article about a Brazilian pop artist who's "up and coming" based on the release of their newest album, or a company that's the "leading organization" in Japanese cybersecurity. Maybe they're not even writing an article. Perhaps they want to expand a pre-existing article on some old African kingdom, or a Native American tribal leader who died in the 1700s.

Simply put, they don't want to put the effort in to contribute to the encyclopedia themselves. Thus, they turn to an LLM.

A case example of User:Example

This editor, using an LLM to write him a draft, is about to get a rude awakening.

Say an editor goes through with an LLM-generated article. It's submitted to AfC and is declined for... well, LLM usage. They then resubmit it, claiming they've addressed the concerns. Lo and behold, it's declined again. The editor takes it to the AfC Help Desk, insisting they did not use AI. Their concerns are waved away, and one editor tells the draft creator "write it on your own words." The editor, by now, has made enough edits to become autoconfirmed and simply moves the page to mainspace.

A new pages patroller reviews the article. It's rated Start-class, and a big orange "LLM" tag is slapped on the top of it. "No problem!" cries the creator, who promptly removes the tag. It's restored (shocker) with a note in the restorer's edit summary to address the tag before removing it. Another editor comes by and tags some of the page's references as hallucinated; the creator removes these tags in hopes of making their article look better. The creator is reverted and receives a warning not to use LLMs to write for them.

But the editor doesn't listen. They edit some more pages by adding LLM-generated sources. They make another page, going through the same lowly process as the first time, and the messages on their talk page start to add up. They leave LLM-generated messages on talk pages, which people point out and collapse, though the editor ignores this. Maybe they're also engaging in undisclosed COI editing, adding info to BLPs, or editing a contentious topic. Whatever the case, someone has had enough and opens an AN/I thread. The user uses an LLM to state their case, has it immediately collapsed, and after 23 more messages than really should've been necessary, they're indefinitely blocked.

Why is this an issue?

In the example above, Wikipedians just spent far too much time and resources dealing with this single editor using LLMs. You might think this is preposterous, that it's not that big of a deal, but it happens. It happens over, and over, and over again. If you search the archives of AN/I for "LLM" or "AI", it's guaranteed you'll turn up countless examples of editors following, in some shape or form, the example laid out above (and then getting blocked for it).

This waste of time and resources distracts users and takes away time that could've been spent improving the encyclopedia in any number of ways. It fills AN/I and has led to the creation of the AI Cleanup Noticeboard, which gets new posts every couple of hours asking for help cleaning up the mess that an editor using an LLM left behind. My rationale for proposing a complete ban on LLM usage is that it will free up this time for editors to go about helping the encyclopedia rather than cleaning up after LLM-using editors.

Playing devil's advocate and then killing the devil

Myself (top) vanquishing the devil (bottom, representing editors using LLMs) from Wikipedia.

Alternative title: Refuting possible claims that those still in favor of LLM use may have.

The editor adding the LLM's info can fact-check!
- It is unlikely that they will. True, there may be some editors using LLMs that do check the outputs, verifying the information is correct and cited to reliable sources. But why? That same amount of time could simply be used finding those same sources and putting them into the article.

The editor has the sources, they're just using an LLM to put it together!
- Then they can put it together themself.

They're just using an LLM to communicate! They don't speak English/Their English is poor!
- Google Translate is a largely accurate way to translate text from one language to another. I have used it before to communicate with people on the internet who do not speak English and have experienced few problems.

You'll come down on new editors like bricks and ban those who don't know any better!
- Please see my proposal for enforcement below.

It will only assist human editors, not replace them!
- Call me a Luddite, but AI is already being used in places like marketing and music (e.g. "We are Charlie Kirk"). These uses have met intense backlash: just look at the art community's reaction to AI. Sure, it can suggest grammar changes and fill in missing citations, but human editors can do that too. In my experience, they're better at both those things than LLMs.

LLMs are better than they were three years ago!
- Sure, that's true, but they still hallucinate. They still treat irrelevant details as though they were relevant.

LLMs can be used with editor guidance and oversight!
- How do we set a threshold for responsible LLM use? We can't. If we can't police it, then there's no way we can prevent it from getting out of control again.

We already have a template to speedy-delete pages made by LLMs!
- There has to be an additional rationale. "It was made by an LLM" isn't enough.

There's no consensus for this!
- I don't expect consensus. This is a user essay, not a guideline. It reflects my personal opinions and serves as a manifesto of sorts, and does not represent the opinions of the editor community.

LLM outputs are generally reliable!
- The keyword here is generally. They still hallucinate and produce slop. Wikipedia pages are read by millions every day, and we have a high standard to uphold. Using LLMs breaks that standard.

TL;DR: Competence is required. If you need an LLM to be competent, then you're not competent.

My proposal for enforcement

You'll find that my proposal for enforcing the prohibition of LLMs is quite simple, actually.

When an editor starts editing a page, a notice, preferably a banner like the one shown when editing AN/I, is to state that LLMs may not be used in any shape or form. There should be a way to toggle this off in settings, as it will get annoying for experienced editors, but newbies will not be able to say they weren't warned.
An editor suspected of having used an LLM will be given a talk page warning not to use LLMs. It is to be made clear as day that LLMs may not be used and that any future use will result in a block.
If they deny using an LLM, evidence of their use is to be presented.
If they are found using an LLM again, they are to be blocked.

"Gee, isn't this a little harsh?" you might ask. "You're coming down on them like a ton of bricks." The editor will be given a warning that explains LLM usage is not tolerated. If they break the rules, then oh well: they get the punishment. People will tell you growing up that you shouldn't kill somebody, but you go and do it anyway despite knowing it's unlawful. How dare you, O. J.

If we want the issue to be resolved, then we must meet it with an iron fist. Thus, the only solution is the complete prohibition of LLM-generated content on Wikipedia.