Wikipedia:Bot requests/Archive 89

Unnecessary disambiguations

Is there a way a bot could find instances of unnecessary disambiguations? Specifically, instances of articles named "Title (parenthetical)" where there isn't currently an article at just "Title". (In other words, something like Floofy (band) existing where Floofy is still a redlink.)

I ask this because sometimes I see people stick (film) or (band) at the end of article names unnecessarily, or sometimes the non-parenthetical gets deleted via AFD or PROD and the parenthetical version is never moved to reclaim the title. Ten Pound Hammer(What did I screw up now?) 17:15, 19 January 2026 (UTC)

I’ve also seen several instances of this during NPP; I will try coming up with something. Vanderwaalforces (talk) 17:42, 20 January 2026 (UTC)
I did a quick database query and found a total of 37,887 of these, which is far too large to make any kind of useful report. * Pppery * it has begun... 17:50, 20 January 2026 (UTC)
Can these be moved by a bot? If better information is needed before a bot run, then maybe sort by disambiguation and move the ones we know for sure can be moved (pages with (film), (country film), etc.). Gonnym (talk) 18:06, 20 January 2026 (UTC)
Yeah, that might narrow it down. Start with ones that are "Name (film)" in cases where "Name" doesn't exist, then maybe the same with (band), as those are the two I see most often. Ten Pound Hammer(What did I screw up now?) 18:20, 20 January 2026 (UTC)
All 37,000 can't be moved by a bot because sometimes the actual name of a proper noun includes parentheses, like Barugh (Great and Little) (okay, Barugh technically exists, but I'm not convinced there aren't any like that where the base name is red). Specific parenthetical disambiguators can probably be botted; see Wikipedia:Database reports/Specific unnecessary disambiguations. * Pppery * it has begun... 18:27, 20 January 2026 (UTC)
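The selection rule discussed here (only strip known disambiguators, and only when the base title does not exist) could be sketched roughly as follows. This is a minimal illustration, not the query behind the database report: the `SAFE_DISAMBIGUATORS` whitelist, the function name, and the in-memory `existing_titles` set are all assumptions made for the example.

```python
import re

# Hypothetical whitelist; which disambiguators are safe is still under discussion.
SAFE_DISAMBIGUATORS = {"film", "band"}

# Matches "Base (disambiguator)" with a single trailing parenthetical.
_TITLE_RE = re.compile(r"^(?P<base>.+) \((?P<dab>[^()]+)\)$")


def find_move_candidates(titles, existing_titles, safe=SAFE_DISAMBIGUATORS):
    """Return (old_title, base_title) pairs whose base title is free.

    Titles whose parenthetical is not whitelisted are skipped, which keeps
    proper names such as "Barugh (Great and Little)" out of the result.
    """
    candidates = []
    for title in titles:
        match = _TITLE_RE.match(title)
        if match is None:
            continue
        base, dab = match.group("base"), match.group("dab")
        if dab in safe and base not in existing_titles:
            candidates.append((title, base))
    return candidates


titles = ["Floofy (band)", "Barugh (Great and Little)", "Heat (film)"]
existing = {"Heat"}  # pretend "Heat" already exists as a page
print(find_move_candidates(titles, existing))
```

A real run would also need to treat base names that are redirects to the disambiguated page, which this sketch ignores.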
SD0001, I'll be honest I've not done much in the way of database reports, could we get this unnecessary dab report refreshed? Primefac (talk) 15:50, 16 March 2026 (UTC)
The bot isn't updating because there's no |interval= param. I've added it now for 30 days - feel free to adjust if more frequent updates are desired. It's also possible to just hit "Update the table now" for an on-demand update. – SD0001 (talk) 16:07, 16 March 2026 (UTC)
Thanks! Didn't even think to look at the page code for that sort of thing. Primefac (talk) 16:18, 16 March 2026 (UTC)
TenPoundHammer, from my work on Wikipedia:Missing redirects project I obtained User:Qwerfjkl/sandbox/55, which BD2412 organized. Qwerfjkltalk 20:57, 20 January 2026 (UTC)
Well, this is quite it then. Vanderwaalforces (talk) 21:01, 20 January 2026 (UTC)
@Qwerfjkl: I have been meaning to ask your permission to subdivide that page, as it is of rather unwieldy length. Cheers! BD2412 T 21:49, 20 January 2026 (UTC)
BD2412, by all means. Qwerfjkltalk 22:28, 20 January 2026 (UTC)
@Qwerfjkl: sweet, that's a big help Ten Pound Hammer(What did I screw up now?) 00:46, 21 January 2026 (UTC)
So could all discrete subsets be moved by bot, provided that the page without "(parenthetical)" still does not exist or redirects to the disambiguated title? Wikiwerner (talk) 17:40, 15 March 2026 (UTC)
Wikiwerner, these need to be handled on a case-by-case basis in general. Qwerfjkltalk 10:53, 16 March 2026 (UTC)
Why? I think it would be a reasonable bot request to take a specific disambiguator and move all pages with that disambiguator to the base name, provided the base name either has never existed or is a single-revision redirect to the disambiguated page. * Pppery * it has begun... 13:30, 16 March 2026 (UTC)
In fact, I was originally going to code this task, but seeing the discussion so far, I paused because I am not sure we want to, even though I think we should too. Vanderwaalforces (talk) 13:57, 16 March 2026 (UTC)
If someone wants to modify User:Plastikspork/massmove.js to strip out suffixes, any admin would be able to batch-remove disambiguators such as (film). It should theoretically just require shifting your lookup value from the front to the back of the string, but I'm pulled in a few too many directions at the moment so I'm not sure if I'm able to do this myself. Primefac (talk) 15:15, 16 March 2026 (UTC)
It really was that easy; User:Primefac/massmove2.js allows for stripping of suffixes. Primefac (talk) 15:44, 16 March 2026 (UTC)
Thank you, duh! Vanderwaalforces (talk) 18:38, 16 March 2026 (UTC)
There are only a few hundred (film) and (band) articles in the dbase report, and we can add other values to the report based on the discussion below. I figure to minimise disruption maybe do batches of forty or fifty at a time spread out over a couple of days. Primefac (talk) 10:23, 18 March 2026 (UTC)
I do have concerns that we may end up automating "primary topic" status to a bunch of titles that do not merit that over the title redirecting elsewhere. BD2412 T 14:39, 18 March 2026 (UTC)
I was only planning on moving pages that have a redlinked base name. Primefac (talk) 18:42, 19 March 2026 (UTC)
Pppery, what are some examples of disambiguators that you think could be safely stripped? Qwerfjkltalk 17:03, 16 March 2026 (UTC)

Automatically add Template:AI-retrieved source

Since we have Template:AI-retrieved source, it would be nice if a bot could add this template to refs based on whether they have the utm_source parameter set to an LLM value. (See User:Headbomb/unreliable for a list of these utm_source values).

Sources that were manually verified by someone can simply be marked as "good" by removing the utm_source parameter. Laura240406 (talk) 21:56, 19 January 2026 (UTC)

I would be interested in developing this (n.b. I see that {{AI-retrieved source}} suggests either adding a |checked= parameter or commenting out the template rather than modifying the source URL). — chrs || talk 02:46, 20 January 2026 (UTC)
We could also change the citation templates, so that if the URL parameter contains e.g. "UTM_source=chatgpt.com", then {{AI-retrieved source}} is added at the end of the citation. Wikiwerner (talk) 11:36, 19 March 2026 (UTC)
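The detection step proposed in this thread could look something like the sketch below, which checks a citation URL for a utm_source value associated with an LLM. The `LLM_UTM_SOURCES` set holds only the one value mentioned above and stands in for the list at User:Headbomb/unreliable; the function name is hypothetical, and matching the parameter name case-insensitively is an assumption.

```python
from urllib.parse import parse_qs, urlsplit

# Assumption: a stand-in for the maintained list of LLM utm_source values.
LLM_UTM_SOURCES = {"chatgpt.com"}


def has_llm_utm_source(url, llm_sources=LLM_UTM_SOURCES):
    """Return True if the URL carries a utm_source value from llm_sources."""
    params = parse_qs(urlsplit(url).query)
    for key, values in params.items():
        # Compare the parameter name and its values case-insensitively.
        if key.lower() == "utm_source":
            if any(value.lower() in llm_sources for value in values):
                return True
    return False


print(has_llm_utm_source("https://example.org/story?utm_source=chatgpt.com"))
print(has_llm_utm_source("https://example.org/story?utm_source=newsletter"))
```

A bot would still have to skip citations that already carry {{AI-retrieved source}} or a |checked= marker, which is outside this sketch.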

Bloxx website creation Bot

I am building a website builder and SEO optimization tool. By default, customers bring in their socials across Yelp, Google reviews, Instagram, et cetera by way of the Google Places API.

One thing that signals trust is also a Wikipedia page with the founding date and basics of the company. For the established businesses with important context, I'd like to automatically create the Wikipedia pages via the API with the important details on the company to signal trust. Jamespentalow (talk) 00:13, 24 March 2026 (UTC)

Hi @Jamespentalow, Wikipedia is not intended as a directory or something to signal trust, but an encyclopedia. As an encyclopedia, there are criteria for inclusion. The criteria for inclusion for companies are at WP:NCORP. This is a generally higher bar, and thus it's harder to get an article on a company than it is to get one on most other topics. We require high-quality, independent, and reliable sources for articles, and right now there is not an AI or other automated software that can gather these, let alone write an article. If you think the company meets WP:NCORP, you're free to take a stab at writing an article about it yourself and going through the Articles for creation process. However, the task is Impossible for a bot. HurricaneZetaC 01:31, 24 March 2026 (UTC)

Reference spam detector

I would like to see a bot that would detect likely Reference spam, and generate a confidence score internally, the way User:Cluebot NG does. Ideally, at high levels of confidence, perhaps it could just revert as Cluebot does, but in any case, it ought to generate a project page with a table or log of rated edits so that humans could review the results, comment, perhaps define a confidence threshold for auto-reverts, and of course, provide data for refining and tuning the algorithm.

I seem to be spending more and more time analyzing and reverting WP:REFSPAM, and a lot of them are very obvious and really should not need human intervention. If someone is a new editor, adds substantially the same citation to multiple articles, with no added content (or brief, near-identical content), and has few or no edits outside one topic area (i.e., a WP:SPA), odds are very high they are a ref spammer. Mathglot (talk) 01:34, 24 February 2026 (UTC)

@Mathglot: this is programmatically possible, a little difficult and time-consuming — but possible. I have a few questions, would it be okay if we continue the discussion on your or my talkpage? —usernamekiran (talk) 08:37, 23 March 2026 (UTC)
usernamekiran, sure, let's move it, but let's find a more public Wikipedia page or WikiProject page where we have a prayer of attracting other interested comment. Perhaps Wikipedia talk:Spam (540 watchers, 56 pageviews/mo.) or Wikipedia talk:WikiProject Spam (1,218 / 1,926), or a subpage of one of them to centralize possibly extended commentary? My knowledge in this field is ancient now, but I wonder if we compiled a grab-bag of possible features (to add to the four I listed), generated a test set of a few hundred human-assessed spam evaluations, and threw a machine learning bot at it with the feature set, whether that might generate a usable model, at least as proof of concept. Likely with all the advances in AI, some of that can be streamlined, maybe even the assessments? That would be a win. Mathglot (talk) 09:19, 23 March 2026 (UTC)
That's even better, I should have thought of that. I have a primary workflow in my mind, only for creating the report(s). In the early phase, the bot should rely on heuristics instead of machine learning. Once we create a good confidence scoring mechanism, we can move to the next phase of reverting the edits. But during the first phase, we will need inputs from other users on reports, to cross-verify the suspected spam links. In a few hours, I will copy-paste this conversation and a detailed workflow to Wikipedia talk:WikiProject Spam, and notify a few relevant venues of the discussion. Once we create a workflow/logic, I can start on concrete programming. During the discussion, I will create code for detecting the URLs being inserted, associating them with users/articles, and other basic necessary stuff. —usernamekiran (talk) 23:28, 23 March 2026 (UTC)
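A first-phase heuristic score built from the four signals listed in the request (new editor, repeated citation, little added content, single-topic editing) might be sketched like this. Every threshold, the equal weighting, and all names here are hypothetical placeholders for illustration; none of them comes from the discussion or from any existing bot.

```python
from dataclasses import dataclass


@dataclass
class EditorActivity:
    edit_count: int                     # total edits by the account
    articles_with_same_citation: int    # articles receiving the same citation
    bytes_of_prose_added: int           # prose added alongside the citation
    share_of_edits_in_one_topic: float  # 0.0 to 1.0


def refspam_score(activity: EditorActivity) -> float:
    """Return a score from 0.0 to 1.0; each signal contributes 0.25.

    All thresholds below are illustrative assumptions to be tuned
    against human-reviewed reports.
    """
    score = 0.0
    if activity.edit_count < 50:                     # new editor
        score += 0.25
    if activity.articles_with_same_citation >= 3:    # same ref in many articles
        score += 0.25
    if activity.bytes_of_prose_added < 100:          # little or no added content
        score += 0.25
    if activity.share_of_edits_in_one_topic >= 0.9:  # single-purpose pattern
        score += 0.25
    return score


suspect = EditorActivity(12, 8, 0, 1.0)
print(refspam_score(suspect))
```

Scores like this would only feed a report for human review at first; any auto-revert threshold would have to be agreed on later.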

Coding... base/skeleton code has been created, but given the complexity, this would take at least a month to go fully operational in the bot's user space, and to collect enough data. It would take at least a month or two after that to go live in mainspace/BRFA. —usernamekiran (talk) 08:17, 24 March 2026 (UTC)

I would like to know whether there is a way to automate adding wikilinks to all articles that mention Bruce Dukov in their contents. This is an entry I recently created (after it was initially deleted in 2006), and there are about 290 articles that mention Mr. Dukov individually (not in infoboxes, it seems) in the text of the article. Essentially, I would like those mentions to be linked automatically. Please let me know if this is possible.

Sincerely, PootisHeavy (talk) 22:48, 19 May 2026 (UTC)

WP:AWB is probably the best option for this. BilledMammal (talk) 03:45, 20 May 2026 (UTC)
{{BOTREQ}} Primefac (talk) 17:20, 30 May 2026 (UTC)

Categorisation of uncategorised redirects

I am unfamiliar with bots, and have no idea if this is a bad idea or not (it may very well overload the server or something).

Would it be a good idea to have a bot automatically put redirects with no categorisation into their own maintenance category, so editors can fix them? (See WP:RCAT) Jacksonvil (talk|contribs) 05:29, 24 May 2026 (UTC)

I think a database report might be better for this, based on the sheer scale. One might already exist though, I'm not sure. ~/Bunnypranav:<ping> 08:50, 24 May 2026 (UTC)
Correct (about using a report, that is). Primefac (talk) 10:29, 24 May 2026 (UTC)
Thanks for bringing this to my notice. I will have a look. Jacksonvil (talk|contribs) 11:29, 24 May 2026 (UTC)
{{BOTREQ}} Jacksonvil (alt) (talk) (contribs) (Main account) 05:21, 29 May 2026 (UTC)

Useless non-free no reduce tags

{{Non-free no reduce}} is intended to keep a bot (I think it is DatBot (talk · contribs)) from downsizing large non-free images to no more than 100k square pixels. Therefore, it is redundant and pointless if an image with size ≤100k has this tag. I recently found an image File:Gangbusters title.png that had this tag with a size of 418×239 px (99,902 px²). –LaundryPizza03 (d) 21:16, 10 February 2026 (UTC)
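The redundancy condition described above amounts to a simple area check, sketched below. The 100,000-pixel limit is taken from the request as stated; the function name is hypothetical, and how the bot actually rounds or measures images is not something this sketch claims to know.

```python
# Limit as described in the request: images at or below this area
# would not be reduced, so the tag has no effect on them.
MAX_PIXELS = 100_000


def no_reduce_tag_is_redundant(width, height, max_pixels=MAX_PIXELS):
    """Return True if an image of this size is already within the limit."""
    return width * height <= max_pixels


# The example from the request: 418 x 239 = 99,902 pixels.
print(418 * 239, no_reduce_tag_is_redundant(418, 239))
```

Non-image files carrying the tag, such as the audio files noted later in the thread, would need separate handling.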

This page is for bot requests. Are you requesting a bot? SeaDragon1 (talk, contributions) 16:11, 20 February 2026 (UTC)
Yes, because this is an issue that is trivial to handle and likely to recur in the future. –LaundryPizza03 (d) 04:40, 21 February 2026 (UTC)
So from a bot perspective, how often (generally speaking) does this sort of thing happen? Primefac (talk) 12:56, 21 February 2026 (UTC)
Looks like there are currently 204 such images, 41 with the latest file revision in 2025 and one so far in 2026. There are also four audio files with the tag: File:J Dilla - Don't Cry.ogg, File:J Dilla - Time, The Donut of the Heart.ogg, File:Jane Remover - Dreamflasher.ogg, and File:Kanye West - Blood on the Leaves.ogg.
I note 57 of the 204 are in Category:Sports uniforms, 54 of them last uploaded in 2024. Many appear to be templated images showing two variations of a uniform, in contrast to others (over 100,000 px²) that have three variations (e.g. File:ECM-Uniform-PHI.png vs File:ECA-Uniform-DET.png), which makes me suspect they started tagging all new template images with {{Non-free no reduce}}. Anomie 19:55, 21 February 2026 (UTC)
@27JJ: You are the one who uploaded the latest versions of the sports uniforms. –LaundryPizza03 (d) 19:57, 1 April 2026 (UTC)
I’ve been applying the tag consistently across template-based uploads to prevent unintended resizing. This was inherited from other contributors. Since it’s unnecessary for images under 100k px, I’ll adjust moving forwards. Feel free to clean up existing files where needed. 27JJ (talk) 20:32, 1 April 2026 (UTC)