Preamble

Within-article consistency of spelling variety

By long-standing convention and as detailed at the Manual of Style, a Wikipedia article needs to consistently use one variety of English. Because editors come from all over the English-speaking world, articles are vulnerable to creeping inconsistency in spelling. These inconsistencies are often difficult to identify quickly, and this script has been developed as an aid to the careful manual oversight of spelling.

The variety is prescribed where an article is related to one of the ancestral English-speaking countries; in other cases, the existing variety used in an article—where this is clear—is retained for the sake of stability. According to the EU style guide for publications, British English is the official code of English to be employed for the European Union, implying that en.WP articles about the EU—but not necessarily its individual member states—ought to use British English.

Scope

The EngvarB script is at User:Ohconfucius/script/EngvarB.js.

  • It ensures consistency where Commonwealth English is already predominantly used in an article, or should be used on the basis of WP:TIES. Its dictionary covers the most commonly encountered spelling divergences, and the script may be used on articles about Australian, Bangladeshi, Ghanaian, Hong Kong, Hiberno-English, Indian, Malaysian, New Zealand, Nigerian, Pakistani, Singaporean, South African, or other Commonwealth subjects. It will convert instances of non-British words to 'British Commonwealth spelling' on the basis of American and British English spelling differences.
  • The script buttons cater for four major codes of English:
  1. AMERICAN — -ise-, -isa-, and -isi- words are converted to their -ize- equivalents; -our, -re, -ae-/-oe- and related forms are converted to American equivalents. See American and British English spelling differences.
  2. BRITISH — -ize-, -iza-, and -izi- words (other than analyze, paralyze and their derivatives, which are pure Americanisms) are converted to their -ise- equivalents; American -or, -er, -og and related forms are converted to Commonwealth equivalents.
  3. OXFORD — as BRITISH above, with the additional step of converting -ise-, -isa- and -isi- words back to their -ize- equivalents where the -ize suffix is etymologically correct. Note that analyse, paralyse and related forms are always converted to the -yse- form regardless of the option chosen, as -yze- is a pure Americanism.
  4. CANADIAN — retains Commonwealth -our, -re, -ce and doubled-consonant forms, while converting certain ae/oe forms and some other spellings to North American equivalents. The -ize- spelling is standard for Canadian English; -ise- forms are converted accordingly.
  • The following are protected from substitution during script execution:
    • Text within <blockquote>...</blockquote> HTML tags and {{blockquote}} templates is protected in full. Text within plain straight quotation marks is treated as body text and is converted.
    • References (<ref>...</ref>);
    • Image file names, category links, and piped wikilink targets;
    • External URLs and url= parameters within templates;
    • Formatting-sensitive parameter names (color, gray, center, organization, license, analog, coordinates and others) are protected when they appear as template parameter names. However, the same words used as article prose—where unambiguous—are converted.
    • Style attributes (|style=...|) within table cells are protected in full.
    • Quoted text bounded by typographic double quotation marks (“ ”); text within plain straight quotes (' or ") is treated as body text and will be converted.
    • Whitelisted proper nouns, Latin phrases, and idiomatic expressions that would otherwise trigger false positives (e.g. rigor mortis, Organisation for Economic Co-operation and Development, World Health Organization).
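
The protection described above can be pictured as a mask-and-restore pass: protected spans are swapped for placeholders, the conversion runs on what is left, and the spans are put back afterwards. The following is a minimal sketch of that general technique, not the EngvarB source; the names PROTECT_PATTERNS, protect, restore and convertProtected are hypothetical, only three of the listed protections are shown, and the patterns are simplified (for example, self-closing <ref /> tags are not handled).

```javascript
// Illustrative sketch only; all names here are hypothetical, not taken from EngvarB.js.
const PROTECT_PATTERNS = [
  /<blockquote>[\s\S]*?<\/blockquote>/gi, // text inside blockquote HTML tags
  /<ref\b[^>]*>[\s\S]*?<\/ref>/gi,        // references
  /https?:\/\/[^\s\]|}<]+/gi,             // external URLs
];

// Replace each protected span with a numbered placeholder and remember the original.
function protect(text) {
  const stash = [];
  let out = text;
  for (const pattern of PROTECT_PATTERNS) {
    out = out.replace(pattern, (match) => {
      stash.push(match);
      return `\u0002${stash.length - 1}\u0003`;
    });
  }
  return { text: out, stash };
}

// Put the originals back. A later stash entry may contain an earlier
// placeholder, so the entries are restored in reverse order.
function restore(text, stash) {
  let out = text;
  for (let i = stash.length - 1; i >= 0; i--) {
    out = out.split(`\u0002${i}\u0003`).join(stash[i]);
  }
  return out;
}

// Run any conversion function on the unprotected text only.
function convertProtected(text, convert) {
  const { text: masked, stash } = protect(text);
  return restore(convert(masked), stash);
}
```

Passing a toy converter such as `(s) => s.replace(/color/g, 'colour')` to convertProtected changes the word in running text while leaving it alone inside a reference, a blockquote or a URL.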

Template insertion

The script inserts a non-displaying maintenance template depending on the conversion mode chosen:

In all cases, any template belonging to a different dialect group is removed. Where an appropriate template already exists, only the |date= parameter is updated. For the avoidance of doubt, Canadian English and Oxford spelling are stand-alone groups; American, Liberian and Philippine are in the same group; all other {{Use X English}} templates belong to the Commonwealth group. {{EngvarB}} is treated as a Commonwealth-group template and is updated rather than replaced when activating the Commonwealth function; it is removed when activating any other function. {{EngvarA}} and {{EngvarC}} are considered deprecated and are always removed.
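
The grouping rules in the paragraph above can be sketched as follows. This is an illustration under stated assumptions rather than the script's actual code: the function names, the regular expression, the template name "Use Oxford spelling" and the defaultTemplate argument (the template to insert when none exists) are all hypothetical, and the date is formatted as month and year.

```javascript
// Illustrative sketch only; names and patterns are hypothetical, not taken from EngvarB.js.
const MONTHS = ['January', 'February', 'March', 'April', 'May', 'June', 'July',
  'August', 'September', 'October', 'November', 'December'];

// Month and year at the time of the run, e.g. "March 2024".
function currentDateParam(now = new Date()) {
  return `${MONTHS[now.getMonth()]} ${now.getFullYear()}`;
}

// Map a template name to its dialect group, or null if it is unrelated.
function groupOf(name) {
  const n = name.trim().toLowerCase();
  if (n === 'engvara' || n === 'engvarc') return 'deprecated';
  if (n === 'engvarb') return 'commonwealth';
  if (n === 'use oxford spelling') return 'oxford'; // assumed template name
  const m = /^use (.+) english$/.exec(n);
  if (!m) return null;
  if (m[1] === 'canadian') return 'canadian';
  if (['american', 'liberian', 'philippine'].includes(m[1])) return 'american';
  return 'commonwealth';
}

// Matches simple, non-nested templates such as {{Name|param=value}}.
const TEMPLATE = /\{\{\s*([^{}|]+?)\s*(\|[^{}]*)?\}\}\n?/g;

function updateVarietyTemplate(wikitext, targetGroup, defaultTemplate, date) {
  let found = false;
  const out = wikitext.replace(TEMPLATE, (whole, name, rawParams) => {
    const group = groupOf(name);
    if (group === null) return whole;               // unrelated template: keep as-is
    if (group !== targetGroup || found) return '';  // other group, deprecated, or duplicate: remove
    found = true;
    // Keep other parameters; replace only |date=.
    const params = (rawParams || '').replace(/\|\s*date\s*=[^|]*/i, '');
    const newline = whole.endsWith('\n') ? '\n' : '';
    return `{{${name}${params}|date=${date}}}${newline}`;
  });
  return found ? out : `{{${defaultTemplate}|date=${date}}}\n${out}`;
}
```

For instance, calling updateVarietyTemplate with the target group 'commonwealth' on a page that carries both {{Use American English}} and {{EngvarB}} would drop the former and refresh only the date of the latter.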

Feedback is appreciated at User Talk:Ohconfucius. Please report false negatives as well as false positives. An archive of talk page exchanges specific to this script is maintained at User talk:Ohconfucius/EngvarB.

Installing the script

  1. Open your common.js in edit mode (alternatively, go to your user page and append "/common.js" to the end of the URL and open the page in edit mode).
    • If you prefer to load this only on a specific skin, such as monobook, open your monobook.js in edit mode.
    • If you are proficient in scripting, you may WP:FORK and adapt the script as you see fit.
    • If you make a straight copy of this script instead of importing it, you will not automatically benefit from the enhancements and bug fixes that are made from time to time. In that case, you may choose to watchlist the [relevant script page] so that you know when to update your copy.
  2. Copy the following code onto the JavaScript page you have chosen in the previous step:
    importScript('User:Ohconfucius/script/EngvarB.js'); // [[User:Ohconfucius/script/EngvarB.js]]
    
  3. Save the page, then reload it and refresh the cache by following the instructions at the top of your JavaScript page.
  4. Bookmark the script page. This will be your cue to purge the cache on your browser for any updates to take effect.

Disclaimer: Use at your own risk, and make sure you check the changes before you save.

Actions and test


In edit mode, a sidebar section headed EngvarB will appear in the left margin, with the following buttons:

Known limitations

  • Upper-case words, typically proper nouns or section headings, are left untouched.
  • Performance may degrade on very large articles (above approximately 200,000 bytes).
  • Spelling substitutions are applied to words preceded by spaces, square brackets ([), pipe symbols (|) or asterisks (*). The -ise/-ize and doubled-L conversion passes use word boundaries and apply more broadly, regardless of the preceding character.
  • The words center and color, when used as HTML or table formatting parameters, are not converted. However, words using these as a stem in article prose—where unambiguous—are converted.
  • Complex nested template structures (for example, infoboxes containing embedded citation templates) may in some cases cause the script to behave unexpectedly.
  • The script does not convert words in all-capitals, words inside piped wikilink targets (only the display text is eligible), or words embedded in URLs. It is not a substitute for careful reading and copyediting.
  • Infobox parameter names that coincide with spelling-sensitive words (color, gray, organization, license, etc.) are protected and will not be converted.
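
The difference between prefix-anchored substitutions and word-boundary passes, as described in the list above, can be illustrated with two small helpers. This is a hedged sketch of the two matching styles, not the script's own code; anchoredReplace and boundaryReplace are hypothetical names, and the stems passed in are assumed to be plain lower-case letters.

```javascript
// Illustrative sketch only; function names are hypothetical, not taken from EngvarB.js.

// Prefix-anchored: the word must directly follow a space, "[", "|" or "*".
function anchoredReplace(text, from, to) {
  const pattern = new RegExp(`([ \\[|*])${from}`, 'g');
  return text.replace(pattern, (match, prefix) => prefix + to);
}

// Word-boundary: the stem may follow any non-word character, or start the text.
function boundaryReplace(text, from, to) {
  return text.replace(new RegExp(`\\b${from}`, 'g'), () => to);
}

// A word after "(" is skipped by the anchored pass but caught by the boundary pass.
console.log(anchoredReplace('The color, (color) and [[color]]', 'color', 'colour'));
console.log(boundaryReplace('(organize) "organizing"', 'organiz', 'organis'));
```

Under these assumptions the anchored pass leaves "(color)" untouched because "(" is not one of the accepted prefix characters, which is the kind of false negative worth reporting.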

Known issues and conflicts

  • New South Wales is protected to prevent South being affected by -our word patterns.
  • Traveling Wilburys and Rockefeller are protected as proper nouns containing letter sequences that would otherwise trigger substitutions.
  • National Geographic Traveler is protected as a proper title using American spelling.
  • The script does not currently convert words beginning a sentence after a full stop when the following word is capitalised, as the prefix anchor requires a space or bracket immediately before the target word.
  • Burglarise/burglarize — the script normalises both -ise and -ize forms of this verb to the root burgle.
  • The |date= parameter format used when inserting or updating maintenance templates reflects the month and year at the time the script is run, and is not adjustable by the user.

See also
