Wikipedia:Request a query
This is a page for requesting one-off database queries for certain criteria. Users who are interested and able to perform SQL queries on the projects can provide results from the Quarry website.
You may also be interested in the following:
- If you are interested in writing SQL queries or helping out here, visit our tips page.
- If you need to obtain a list of pages that meet certain criteria, consider using PetScan (user manual) or the default search. PetScan can generate lists of articles in subcategories, articles which transclude some template, etc.
- If you need to make changes to a number of articles based on a particular query, you can post to the bot requests page, depending on how many changes are needed.
- For long-term review and checking, database reports are available.
The database replicas do not have access to page content, so requests which require checking wikitext cannot be answered with database queries. In particular, there is no way to query for template parameters or anything related to references. However, someone may be able to assist by querying in another way (for example, checking for external links rather than references) or suggest an alternative tool.
Drawing up a list of users for m:Requests for new languages
editIs it possible to draw up a list of user pages on any language Wikipedia that are in a category for language proficiency for a specific language code? Ideally it'd exclude inactive editors, such as those that haven't edited in the past 3 months. With help from editors involved in the proposal to make the notice, I'd like to notify active editors across the wikiverse about an incubated Wikipedia in a language that they speak, in the hope that they may want to help set up and launch that new-language wiki.
For example, there is currently an eligible proposal for a Xiang Wikipedia (ISO 639-3 code hsn); I'd like a list of semi-active editors whose userpages are in Category:User hsn (or any of its sub-categories) on any Wikipedia, so we can then notify about the proposal. I hope that if a query was made for eg. the Xiang categories/templates, I'd be able to copy the query's code and just replace the ISO code, or Wikidata items for its categories, to do this for other proposals? Kowal2701 (talk, contribs) 17:14, 4 June 2026 (UTC)
- I requested a script as well but this may be easier Kowal2701 (talk, contribs) 17:21, 4 June 2026 (UTC)
- It is not. Different wikis' databases are split among different database servers, and the English Wikipedia in particular is on a server by itself. While there's some support in MariaDB for queries to access data on a different server, it's decidedly nontrivial and we peons don't have the permissions to do it anyway. —Cryptic 17:28, 5 June 2026 (UTC)
- And, especially for user-language-categories that are only on a handful of wikis like Category:User hsn, probably the easiest way is to run Petscan queries for each language. Setting those up is straightforward - a few minutes per wiki looked at if you're familiar with Petscan already, maybe an hour lead time total if you're aren't; plus a half hour at most to manually combine the results - while writing a script to do it is a multiple-hour job. —Cryptic 17:40, 5 June 2026 (UTC)
- Thank you, tbh I'll just do things manually, I was hoping it'd be relatively simple to technically do Kowal2701 (talk, contribs) 18:33, 5 June 2026 (UTC)
- Kowal2701, I was also going to recommend Petscan. If you look at the link in the intro paragraph at Wikipedia:Translators available#Translators available, there is a model Petscan link there that you can use as a starting point. As is, that one is intended to find translators into English, but you can modify it to work in the other direction. However as already pointed out, it only searches the en-wiki categories, so you would have to try an adapted version on other language Wikipedias. Mathglot (talk) 18:51, 5 June 2026 (UTC)
- And, especially for user-language-categories that are only on a handful of wikis like Category:User hsn, probably the easiest way is to run Petscan queries for each language. Setting those up is straightforward - a few minutes per wiki looked at if you're familiar with Petscan already, maybe an hour lead time total if you're aren't; plus a half hour at most to manually combine the results - while writing a script to do it is a multiple-hour job. —Cryptic 17:40, 5 June 2026 (UTC)
- It is not. Different wikis' databases are split among different database servers, and the English Wikipedia in particular is on a server by itself. While there's some support in MariaDB for queries to access data on a different server, it's decidedly nontrivial and we peons don't have the permissions to do it anyway. —Cryptic 17:28, 5 June 2026 (UTC)
Articles having the most Talk page archives
editDraft: https://quarry.wmcloud.org/query/106069
Looking for the articles having the most Talk page archives, as a first step I started off with this Cirrus query which returns the names of Talk pages with over 100 archives (with integer suffixes 100–1000; excluding date-named archives). There are five such: Talk:Donald Trump, Talk:Jesus, Talk:United States, Talk:Israel, and Talk:Race and intelligence. The query results contains those five, but there are also many unnecessary results, because it returns *all* the archives over 100, not just the highest numbered one. (There are also some undesirable results due to redirects that I don't think Cirrus can exclude.)
Next, I vibe-coded Quarry:query/106069 via Chat GPT with numerous twists and turns and some blind alleys. But in order to get something that worked, even if only partially, I had to retain a restriction I don't really want, namely, excluding date-named archives. So for example, if there are articles with lots of archives with Month-YYYY suffixes, they won't be included because the year suffix is > 999. So, two questions:
- How good/bad is the vibe-coded result? That is, let's say I *did* want to exclude date-named archives, is the resulting SQL decent, or is it an inefficient mess?
- Is there a way to get an answer to my real question, namely, 'articles having the most Talk page archives', regardless how those archives are named?
Thanks! Mathglot (talk) 21:37, 6 June 2026 (UTC)
- It's inefficient, but not hugely so, but the main problem is that it doesn't do anything like what you want, only something that looks similar: it just looks for the the page with the highest number and assumes that's the answer. The correct number could be much higher (in case of unusually-named archives, for example Special:Prefixindex/Talk:Sustainability/ or Special:Prefixindex/Talk:Muhammad/) or lower (someone happens to create Talk:Whatever/Archive 948 and nobody notices). Vibe coding in general isn't helpful unless you either know what you're doing or don't have access to someone who does.
- quarry:query/106075. This still assumes
'/Archive ' or '/Archives/''rchive' appears somewhere in the page title following a / - there's almost 8500 of those that don't match /Archive[space] or /Archives/. Know any examples of e.g. date-named archives that don't contain either of those? —Cryptic 22:35, 6 June 2026 (UTC)
- This is great, and that third column with unusual archives is very helpful, and user-friendly-forward; many thanks! Regarding the question, not sure how we are defining date-named archives, but do any of these fit those contraints? I note that Talk:United Kingdom, for example, has a few oddballs. Mathglot (talk) 00:07, 7 June 2026 (UTC)
- The current version of the query includes all subpages containing "rchive" anywhere after a slash, so it'll see almost everything in Special:Prefixindex/Talk:United Kingdom/: all the nonredirects except for /Basic Topics, /FAQ, /GA1 through /GA4, /Terminology, and /to do. (*/archivebox should probably have been filtered out, too.) Everything in your search is counted, as well.I wrote the initial version of that comment when the query required either "/Archive " or "/Archives/", assuming that there'd be no more than few hundred others; when I actually looked for them, though, I found an additional 8481 pages matching /.*\/.*rchive.*/ but neither of the first two patterns, which is why I updated the query.Mostly interested in archives that don't contain "rchive". The closest examples I'm aware of are things like WP:Deletion review/Log/2020 January, and none are in main-talkspace. —Cryptic 00:20, 7 June 2026 (UTC)
- This is great, and that third column with unusual archives is very helpful, and user-friendly-forward; many thanks! Regarding the question, not sure how we are defining date-named archives, but do any of these fit those contraints? I note that Talk:United Kingdom, for example, has a few oddballs. Mathglot (talk) 00:07, 7 June 2026 (UTC)
This is very useful. I have linked it from Template:Archives/doc#When to use (in expl. note d) and from Help:Archiving a talk page#Statistics. See here for links and mentions. Mathglot (talk) 00:00, 8 June 2026 (UTC)
Unimproved Iranian "village" articles
editHi, I'm looking for a search for all of the EN WP mainspace Iranian "village" articles created by USER:Carlossuarez46 with populations lower than 10, 25, 50, 75, and 100 (separate lists for each) with no substantial non-gnoming additions by any other editor. Each article includes the phrase "At the 2006 census, its population was xx, in yy families", for example "At the 2006 census, its population was 44, in 7 families". If a size-cap is needed 2100 bytes should be on the safe side. No page including a photo should be included. Many thanks for any assistance you might be able to offer with this. FOARP (talk) 18:17, 10 June 2026 (UTC)
- Queries can't see wikitext, so there's very little of this we can do here. I can find pages started by CS46, I can find pages with no other editor (or no other non-bot editor), I can find pages currently at or under a certain size, I can find pages without images (or without images other than those on a specific list, like, say, those included by maintenance or stub tags likely to be on his articles). I can't sort them by population unless - unlikely - they're already categorized separately by that; I can't check for "non-gnoming" edits even if you could come up with a programmatic definition of that; I certainly can't check for phrases in the pages, other than to confirm that they're not currently redirects.That leaves you with, well, nothing. There's zero mainspace pages started by this user that have been edited only him and (maybe) bots, that aren't redirects and have no images at all, even without filtering by size. Remove the no-image requirement too and there's 551. This is, as you're aware, an absurd undercount; before I filtered out redirects, there were some 83696 results, and before I restricted to mainspace, there were hundreds of "Populated places in..." categories. —Cryptic 19:07, 10 June 2026 (UTC)
- Thanks, looks like I'm going to have to think of another way of handling this. FOARP (talk) 20:44, 10 June 2026 (UTC)
- It's better, at 23694, if you define "non-gnoming edit" as "not flagged minor" and expand the no-bots rule to include any username containing "bot" in it (so as to see formerly-flagged bots). Still at quarry:query/106088. —Cryptic 20:46, 10 June 2026 (UTC)
- I'll try to see what I can do with that, cheers! FOARP (talk) 21:09, 10 June 2026 (UTC)
- You could try PetScan. First download the output from the Quarry query and strip it down to just a list of page names, one per line. Then, in PetScan's "Other sources" tab, paste that list into "Manual list" and add a search for something like
"At the 2006 census, its population was" insource:/At the 2006 census, its population was \d{1,2}, in \d{1,2} families/(In theory, the bit in quotes doesn't affect the search results, but in practice it will probably time out without it.) I got 7161 results. Certes (talk) 22:01, 10 June 2026 (UTC)
- You could try PetScan. First download the output from the Quarry query and strip it down to just a list of page names, one per line. Then, in PetScan's "Other sources" tab, paste that list into "Manual list" and add a search for something like
- I'll try to see what I can do with that, cheers! FOARP (talk) 21:09, 10 June 2026 (UTC)
- It's better, at 23694, if you define "non-gnoming edit" as "not flagged minor" and expand the no-bots rule to include any username containing "bot" in it (so as to see formerly-flagged bots). Still at quarry:query/106088. —Cryptic 20:46, 10 June 2026 (UTC)
- Thanks, looks like I'm going to have to think of another way of handling this. FOARP (talk) 20:44, 10 June 2026 (UTC)
- Vaguely relevant digression: we also have thousands of articles about Russian places with tiny populations. Ugol, Kharovsky District, Vologda Oblast is a typical example. Many, such as Vlasyevo, have unqualified titles despite not being obvious primary topics. Certes (talk) 22:20, 10 June 2026 (UTC)
Number of registered editors last year
editI want to add the 2025 numbers to Wikipedia talk:Wikipedians#Numbers of editors each year. The creation of Wikipedia:Temporary accounts last year means that the old query doesn't return comparable numbers. Can you please give me a new query that does? WhatamIdoing (talk) 03:10, 14 June 2026 (UTC)
- I don't know whether you have noticed this, I have only just seen it, but the user view has a tinyint (0 or 1) 'user_is_temp' column with a not null constraint. Sean.hoyland (talk) 06:32, 14 June 2026 (UTC)
- Yeah. quarry:query/106283 and quarry:query/106284, eventually. The shiny new ar_actor_deleted index on archive that seems to always take precedence over ar_actor_timestamp might make the latter take ridiculously longer than before, though. —Cryptic 06:46, 14 June 2026 (UTC)
- The first says "OperationalError('table resultsets already exists')". The second seems to have worked, and is the total number of registered editors (732,040 in 2025, which is down from 775,435 in 2024, down from 812,635 in 2023). WhatamIdoing (talk) 01:04, 15 June 2026 (UTC)
- Eugh. It completed successfully last night after about an hour and a half. I still happen to have the results in command history, since they don't add up to exactly the same as the other query and it took me a while to come up with a plausible reason why.
- The first says "OperationalError('table resultsets already exists')". The second seems to have worked, and is the total number of registered editors (732,040 in 2025, which is down from 775,435 in 2024, down from 812,635 in 2023). WhatamIdoing (talk) 01:04, 15 June 2026 (UTC)
- Yeah. quarry:query/106283 and quarry:query/106284, eventually. The shiny new ar_actor_deleted index on archive that seems to always take precedence over ar_actor_timestamp might make the latter take ridiculously longer than before, though. —Cryptic 06:46, 14 June 2026 (UTC)
| cohort | COUNT(*) |
|---|---|
| 1 | 579358 |
| 10 | 124483 |
| 100 | 21898 |
| 1000 | 5687 |
| 10000 | 645 |
| 100000 | 41 |
| 1000000 | 3 |
- (Yes, the same numbers in the 100k and 1m cohorts as in 2025, which is an odd coincidence.) The discrepancy - total 732115 here - is that, as an optimization, the other query only considers users with a user_editcount >= 1, and that field isn't guaranteed to be accurate. Edits imported from other wikis, for example, would create revisions attributed to a user without increasing their editcount here.I remember seeing problems before where Quarry would complete the query and display results for a while, then start showing errors when it was accessed later. Can't say for sure it was the same error. —Cryptic 01:36, 15 June 2026 (UTC)