Submission declined on 9 February 2026 by ChrysGalley (talk).
This draft has been resubmitted and is currently awaiting re-review.
Comment: Sources do not support the text. E.g. source 13, https://iscc.codes/concept does not support base32. The CODE and UNIT composition is also not directly mentioned. ChrysGalley (talk) 19:44, 9 February 2026 (UTC)
Comment: In accordance with the Wikimedia Foundation's Terms of Use, I disclose that I have been paid by my employer for my contributions to this article. Correction: this disclosure was added automatically by the Article Wizard in error. I am not paid to edit Wikipedia. I do have a conflict of interest as an unpaid board member of the ISCC Foundation, which is separately disclosed on my user page and on the talk page per WP:COI. Etma 1222 (talk) 11:12, 23 April 2026 (UTC)
Comment: Notability basis: independent significant coverage in NISO/Carpenter (2024), the joint IEC-ISO-ITU AMAS technical report (2025), and the peer-reviewed Fraunhofer paper in Electronic Imaging (2025). The subject also meets WP:NSTANDARD as a published ISO standard (ISO 24138:2024) that has been adopted nationally as AS/NZS ISO 24138:2025. Etma 1222 (talk) 10:42, 4 June 2026 (UTC)
| International Standard Content Code (ISCC) | |
|---|---|
| Abbreviation | ISCC |
| Status | Published |
| Year started | 2016 |
| First published | 15 May 2024 |
| Latest version | ISO 24138:2024 |
| Organization | ISO/TC 46/SC 9 |
| Domain | Digital media content identification |
| Website | iscc |
The International Standard Content Code (ISCC) is a similarity-preserving identifier for digital media assets such as text, images, audio, and video, standardized as ISO 24138:2024.[1] An ISCC is computed from a file rather than assigned by a registration authority, so any party holding the same file can derive the same code. This distinguishes it from registry-assigned identifiers such as the ISBN or DOI; the fingerprint it produces also remains largely stable after a work is edited or re-encoded.[1][2]
An ISCC combines several code units. Each unit is derived from one aspect of a file: its metadata, its perceptual features, its raw byte sequence, or a cryptographic hash of its contents. Because similar files yield similar codes, an ISCC can be used to detect near-duplicates and match content across formats and encodings.[1][2]
The standard was published in 2024 after development within ISO/TC 46/SC 9. Since then the ISCC has been incorporated into the ONIX book-trade metadata standard and added to the C2PA list of soft-binding algorithms, and it has been used in rights-management and AI-governance settings. Its open-source reference implementation is maintained by the ISCC Foundation.[1][3]
History
Development
The ISCC was created by Titusz Pan (ORCID 0000-0002-0521-4214), an open-source developer working on content identification.[4] According to the ISCC Foundation, Pan was the principal editor of ISO 24138:2024.[5] Pan developed the first ideas for the ISCC in early 2016; later that year the work was taken up by the Content Blockchain Project, a consortium that studied blockchain technology for journalism and digital media and received funding from the Google Digital News Initiative.[5] The project published an early specification and a prototype, and released an open-source ISCC 1.0 specification and reference code in 2018.[5][6] In 2019 the project received one of Germany's inaugural Digital Publishing Awards at the Leipzig Book Fair.[7]
Standardization
In 2019 the International Organization for Standardization took up the ISCC as a work item within Technical Committee 46, Subcommittee 9 (TC 46/SC 9, Identification and description). A dedicated working group, WG 18, was established to develop it and held its first meeting on 29 October 2019.[5] The committee reviewed the draft through the usual ISO ballot stages, including a draft international standard (DIS) review in 2023.[8] ISO 24138:2024 was published on 15 May 2024.[1] It defines the syntax, structure, and algorithms for generating ISCC codes and describes their use alongside existing identifier schemes such as DOI, ISAN, ISBN, ISRC, ISSN, and ISWC.[1] The standard includes a reference implementation, published as a freely available electronic insert under its normative Annex D, "Reference implementation".[1][9] In 2025 the standard was adopted nationally in Australia and New Zealand as AS/NZS ISO 24138:2025.[10]
Industry adoption
In July 2020 the ISCC was added to ONIX, the international book-trade metadata standard maintained by EDItEUR, allowing an ISCC to be carried alongside the ISBN in book-trade metadata.[11]
In May 2024 the Coalition for Content Provenance and Authenticity (C2PA) added the ISCC to its list of approved soft binding algorithms, registered as io.iscc.v0 with an entry date of 17 May 2024. Most other entries on the list are digital watermarking schemes. A soft binding algorithm re-associates a content-provenance manifest with a file after the file's embedded metadata has been removed.[12][13]
In 2025 a joint technical report by the IEC, ISO, and ITU, prepared through the AI and Multimedia Authenticity Standards (AMAS) collaboration, reviewed standards for AI-generated and altered media. The report listed the ISCC among asset-identifier standards.[14]
Structure
An ISCC-CODE is a composite of several ISCC-UNITs, each computed from a different aspect of the content.[1][15]
Units
The standard defines the following unit types:[1][3]
- Meta-Code: a similarity hash (SimHash) of basic metadata, typically the name and an optional description, used to cluster assets with similar descriptive information.[16]
- Semantic-Code: reserved as a code type in ISO 24138, with its algorithm not yet specified in the standard. Experimental implementations use deep learning embeddings of the semantic content of text and images.
- Content-Code: a modality-specific perceptual fingerprint:
  - Text: a MinHash over normalized character n-grams.[17]
  - Image: a discrete cosine transform-based perceptual hash.[18]
  - Audio: a Chromaprint-based fingerprint condensed with SimHash.[19]
  - Video: MPEG-7 frame signatures summed across frames and condensed with a Winner-Take-All hash.[20]
- Data-Code: a similarity-preserving hash of the raw file bytes, using content-defined chunking and MinHash.[21]
- Instance-Code: a BLAKE3 cryptographic hash of the file, used for exact integrity verification.[22]
A unit can stand alone or be combined into an ISCC-CODE. Unit bodies are sized in 32-bit steps, from 32 up to 256 bits, with a default of 64 bits.[23] When units are combined into an ISCC-CODE, each body is truncated to 64 bits before they are concatenated. A minimum ISCC-CODE contains the Data-Code and the Instance-Code; the other units are optional and precede them in canonical order.[1][15]
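The idea behind a similarity hash such as the SimHash named for the Meta-Code can be illustrated with a toy sketch. This is not the algorithm specified in ISO 24138: the feature extraction (lower-cased character 3-grams), the use of BLAKE2b as the per-feature hash, and the function names `ngrams` and `simhash64` are all assumptions chosen only to keep the example within the Python standard library.

```python
import hashlib


def ngrams(text: str, n: int = 3) -> list[str]:
    """Split text into overlapping character n-grams after simple normalization.

    Hypothetical helper: lower-cases and collapses whitespace only.
    """
    normalized = " ".join(text.lower().split())
    return [normalized[i:i + n] for i in range(max(len(normalized) - n + 1, 1))]


def simhash64(features: list[str]) -> int:
    """Toy 64-bit SimHash: every feature votes +1 or -1 on each bit position."""
    counts = [0] * 64
    for feature in features:
        # BLAKE2b with an 8-byte digest is a stand-in 64-bit feature hash.
        digest = hashlib.blake2b(feature.encode("utf-8"), digest_size=8).digest()
        value = int.from_bytes(digest, "big")
        for bit in range(64):
            counts[bit] += 1 if (value >> bit) & 1 else -1
    result = 0
    for bit, count in enumerate(counts):
        if count > 0:  # a bit is set when the positive votes win
            result |= 1 << bit
    return result


title_a = simhash64(ngrams("International Standard Content Code"))
title_b = simhash64(ngrams("International Standard Content Codes"))
print(f"{title_a:016x}")
print(f"{title_b:016x}")
```

Because most 3-grams are shared between the two titles, most bit positions receive nearly the same votes, which is why such hashes tend to differ in only a few bits for similar inputs.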
Format
Each ISCC-UNIT begins with a variable-length header, commonly two bytes, that encodes the unit's main type, subtype, version, and length, followed by a variable-length body holding the fingerprint data. The units are concatenated and encoded in Base32 (RFC 4648, without padding) for the canonical string form, prefixed with ISCC:.[1][23] A representative ISCC-CODE that combines Meta-Code, text Content-Code, Data-Code, and Instance-Code units is:
ISCC:KACT4EBWK27737D2AYCJRAL5Z36G76RFRMO4554RU26HZ4ORJGIVHDI
Similarity preservation
Unlike a cryptographic hash, which changes completely after any edit, an ISCC-UNIT changes only partially: similar inputs produce similar codes, and the Hamming distance between two units approximates the similarity of the underlying content. Applications use this to detect near-duplicates and to cluster related content by comparing codes.[1][2]
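A comparison of this kind can be sketched in a few lines of Python using only the standard library. The sketch decodes the representative ISCC-CODE from the Format section, splits off its header, and measures Hamming distance between unit bodies. It rests on two assumptions that the standard does not guarantee in general: that the header is exactly two bytes with each of the four fields in one 4-bit nibble, and that the body consists of four 64-bit units. The function names are hypothetical and are not part of any ISCC library.

```python
import base64

EXAMPLE = "ISCC:KACT4EBWK27737D2AYCJRAL5Z36G76RFRMO4554RU26HZ4ORJGIVHDI"


def decode_iscc(code: str) -> bytes:
    """Decode the canonical string form into raw bytes (header plus body)."""
    text = code.removeprefix("ISCC:")
    # The canonical form omits padding; b32decode needs a multiple of 8 chars.
    padded = text + "=" * (-len(text) % 8)
    return base64.b32decode(padded)


def split_header(raw: bytes) -> tuple[dict[str, int], bytes]:
    """Split a two-byte header into four nibbles (assumed layout, see above)."""
    fields = {
        "main_type": raw[0] >> 4,
        "subtype": raw[0] & 0x0F,
        "version": raw[1] >> 4,
        "length": raw[1] & 0x0F,
    }
    return fields, raw[2:]


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


raw = decode_iscc(EXAMPLE)
fields, body = split_header(raw)
# Assumed: four units of 8 bytes (64 bits) each.
units = [body[i:i + 8] for i in range(0, len(body), 8)]

# Simulate a near-duplicate by flipping a single bit in the first unit.
flipped = bytes([units[0][0] ^ 0x01]) + units[0][1:]
print(fields, len(units), hamming_distance(units[0], flipped))
```

A small distance relative to the unit length (64 bits here) suggests similar content; the threshold that counts as a match is application-specific and is not given in the text above.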
Comparison with other identifiers
| Identifier | Scope | Assignment | Similarity detection |
|---|---|---|---|
| ISBN | Book editions | Assigned by national agencies | No |
| DOI | Scholarly works | Assigned by registration agencies | No |
| ISRC | Sound recordings | Assigned by national agencies | No |
| ISWC | Musical compositions | Assigned via collecting societies | No |
| ISCC | Digital media files | Computed from content | Yes |
Traditional identifiers are assigned by registration authorities and identify abstract works or specific editions. The ISCC identifies the digital file, so a single book edition with one ISBN can correspond to many ISCCs, one per format, compression level, or excerpt. ISO 24138 specifies the ISCC for use alongside DOI, ISAN, ISBN, ISRC, ISSN, and ISWC rather than as a replacement.[1] A 2026 European Union Intellectual Property Office (EUIPO) study mapping EU copyright databases and metadata standards discussed the ISCC among the content-identification schemes it surveyed.[24]
Applications
Documented and proposed uses of the ISCC include:
- Detecting exact and near-duplicate content for deduplication and database synchronization.
- Verifying file integrity through the Instance-Code.[25]
- Tracking versions of the same underlying content.
- Identifying AI-generated content and recording text and data mining (TDM) opt-out declarations under the EU AI Act. In a 2025 Electronic Imaging paper, researchers at the Fraunhofer Institute for Secure Information Technology proposed using the ISCC as a robust hashing method within an infrastructure for tagging AI-generated content.[2] In a January 2026 submission to a European Commission consultation, the International Federation of Reproduction Rights Organisations suggested the ISCC as a possible basis for machine-readable TDM rights reservations.[26]
- Research data management. The ELIXIR Galaxy platform integrated the ISCC for content-based reproducibility validation and dataset deduplication in bioimage analysis workflows.[25]
Use in cultural heritage
In a 2024 blog post, staff at the TIB and the Berlin State Library argued that libraries, archives, and museums should adopt the ISCC, citing content authentication, comparison of similar works, and registration of machine-learning training data.[27]
CommonsDB
CommonsDB is a European Commission-funded pilot registry of rights information for public domain and openly licensed works that uses the ISCC as its content-derived identifier. A user can check the rights status of a file by generating its ISCC and looking up matching declarations.[28] By March 2026 the registry contained over one million rights declarations.[29]
Reception
In the scholarly-publishing and standards communities, the ISCC has been described as an "intrinsic" identifier: a 2024 report on the PIDfest 2024 conference, co-authored by the chair of ISO/TC 46/SC 9, named it as one example of such systems.[30] In a guest column in Music Ally, Virginie Berger pointed to the ISCC as an existing ISO-standard fingerprinting method that the music industry could use for content traceability.[31]
Implementation
The open-source reference implementation of ISO 24138 is maintained by the ISCC Foundation on GitHub.[32] The core repositories are:
| Repository | Description |
|---|---|
| iscc-core | Reference implementation of the ISO 24138 algorithms |
| iscc-sdk | High-level Python SDK for ISCC generation |
| iscc-schema | JSON Schema definitions and metadata models |
| iscc-web | REST API service powering the public demonstration |
The Foundation also publishes experimental generators for semantic codes (iscc-sct for text and iscc-sci for images) and iscc-lib, a Rust implementation of the core algorithms with bindings for several languages.[32] Independent implementations include iscc-core-ts, a TypeScript port,[33] and iscc-sum, a Rust tool that generates the Data-Code and Instance-Code.[34]
ISCC Foundation
The ISCC Foundation (Stichting ISCC) is a nonprofit foundation (stichting) under Dutch law, founded in Leiden in May 2019 and based in Hengelo, Netherlands.[5][35] According to the Foundation, its activities include research on content identification, participation in open-standards work, maintenance of the open-source reference implementation, and support for community adoption.[35]
See also
References
1. "ISO 24138:2024 Information and documentation - International Standard Content Code (ISCC)". International Organization for Standardization. 2024-05-15. Retrieved 2026-06-02.
2. Heeger, Julian; Berchtold, Waldemar; Bugert, Simon; Steinebach, Martin (2025). "EU AI-Act: Tagging GenAI Content". Electronic Imaging. 37 (4). Society for Imaging Science and Technology: MWSF-301. doi:10.2352/EI.2025.37.4.MWSF-301. Retrieved 2026-06-02.
3. Carpenter, Todd A. (June 2024). "Introducing the Newest ISO Identifier Standard". National Information Standards Organization. Retrieved 2026-06-02.
4. Nawotka, Ed (2025-10-15). "Frankfurt Book Fair 2025: Identity Stamps". Publishers Weekly. Retrieved 2026-06-02.
5. "ISCC – History". ISCC Foundation. Retrieved 2026-06-02.
6. "iscc/iscc-specs: ISCC Specification v1.0.0". GitHub (ISCC Foundation). 2018-03-31. Retrieved 2026-06-02.
7. Anderson, Porter (2019-03-25). "Content Blockchain Project Wins One of Germany's Digital Publishing Awards". Publishing Perspectives. Retrieved 2026-06-02.
8. "ISO/DIS 24138, International Standard Content Code (ISCC)". Association for Information Science and Technology. 2023-12-13. Retrieved 2026-06-02.
9. "ISO 24138:2024 Electronic inserts (reference software)". International Organization for Standardization. Retrieved 2026-06-02.
10. "AS/NZS ISO 24138:2025 Information and documentation - International Standard Content Code (ISCC)". Standards Australia / Standards New Zealand. 2025. Retrieved 2026-06-02.
11. "ONIX for Books Codelists Issue 50, List 5 (Product identifier type), code 39". EDItEUR. 2020-07-09. Retrieved 2026-06-02.
12. "Soft Binding Algorithm List". C2PA. Retrieved 2026-06-02.
13. "C2PA Technical Specification 2.2" (PDF). Coalition for Content Provenance and Authenticity. 2025-05-01. Retrieved 2026-06-02.
14. "Technical Report on AI and Multimedia Authenticity Standards" (PDF). Geneva: World Standards Cooperation (IEC, ISO, ITU). 2025-07-11. ISBN 978-2-8399-4720-6. Retrieved 2026-06-02.
15. Pan, Titusz (2026-01-19). "IEP-0010: ISCC-CODE". ISCC Enhancement Proposals (ISCC Foundation). Retrieved 2026-06-02.
16. "IEP-0002: ISCC-UNIT Meta-Code". ISCC Foundation. Retrieved 2026-06-02.
17. "IEP-0003: Content-Code Text". ISCC Foundation. Retrieved 2026-06-02.
18. "IEP-0004: Content-Code Image". ISCC Foundation. Retrieved 2026-06-02.
19. "IEP-0005: Content-Code Audio". ISCC Foundation. Retrieved 2026-06-02.
20. "IEP-0006: Content-Code Video". ISCC Foundation. Retrieved 2026-06-02.
21. "IEP-0008: Data-Code". ISCC Foundation. Retrieved 2026-06-02.
22. "IEP-0009: Instance-Code". ISCC Foundation. Retrieved 2026-06-02.
23. Pan, Titusz (2026-01-19). "IEP-0001: ISCC Structure and Format". ISCC Enhancement Proposals (ISCC Foundation). Retrieved 2026-06-02.
24. European Union Intellectual Property Office (2026-05-28). Mapping of EU Databases and Metadata Standards Providing Information on Copyright-Protected Works (Report). European Union Intellectual Property Office. doi:10.2814/4041636. ISBN 978-92-9156-373-9.
25. Paul, Maarten; Etzrodt, Martin (2026-02-14). "Content Tracking and Verification in Galaxy Workflows with ISCC-SUM". Galaxy Training Network. Retrieved 2026-06-02.
26. International Federation of Reproduction Rights Organisations (January 2026). "Response to European Commission Consultation on Protocols for Reserving Rights from Text and Data Mining under the AI Act and the GPAI Code of Practice" (PDF). Retrieved 2026-06-02.
27. Heller, Lambert; Gragert, Gerrit (2024-07-05). "Why libraries, archives and museums should use the International Standard Content Code (ISCC)". TIB-Blog. Retrieved 2026-06-02.
28. Europeana Foundation. "CommonsDB". Europeana PRO. Retrieved 2026-06-02.
29. Price, Gary (2026-03-31). "Openly Licensed Works: CommonsDB Surpasses One Million Declarations". Library Journal infoDOCKET. Retrieved 2026-06-02.
30. Meadows, Alice; Jones, Phill; Carpenter, Todd A. (2024-07-18). "A Successful Start to a New Festival of Identifiers: PIDfest 2024". The Scholarly Kitchen. Society for Scholarly Publishing. Retrieved 2026-06-02.
31. Berger, Virginie (2025-05-08). "Licensing AI Music: The Industry Is Focusing on the Wrong Problem". Music Ally. Retrieved 2026-06-02.
32. "ISCC Foundation". GitHub. Retrieved 2026-06-02.
33. "iscc-core-ts: TypeScript implementation of iscc-core". GitHub. Retrieved 2026-06-02.
34. "iscc-sum". GitHub. Retrieved 2026-06-02.
35. "Foundation". ISCC Foundation. Retrieved 2026-06-02.

