TEI and Document Modeling

The Text Encoding Initiative is the dominant standard for encoding humanities texts. Its guidelines offer a vast vocabulary of elements for marking up everything from verse lines to manuscript damage, while document modeling decides which of those features a given project will capture and how.

Definition

The use of the Text Encoding Initiative guidelines to create machine-readable representations of texts, together with the analytical work of deciding which document features to model and how to constrain a project's markup.

Scope

Covers the TEI Guidelines and their use in modeling documents: the structure of TEI P5, the TEI header and metadata, customization through schemas, and the practice of deciding what to encode for a given source and purpose. Includes the institutional history of the TEI Consortium and the role of community standards in scholarly encoding.
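
As a rough illustration of the header-plus-text structure described above, the following Python sketch parses a small TEI P5 document and reads a few metadata fields from its header. The sample document, its title, and the helper name read_header are invented for this example; only the element names and the TEI namespace come from the Guidelines. It is a teaching sketch, not a recommended metadata pipeline.

```python
import xml.etree.ElementTree as ET

# The TEI P5 namespace; every TEI element lives in it.
TEI_NS = "http://www.tei-c.org/ns/1.0"
NS = {"tei": TEI_NS}

# Hypothetical sample document, invented for illustration.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Sample Poem: a digital edition</title>
      </titleStmt>
      <publicationStmt>
        <p>Unpublished teaching example.</p>
      </publicationStmt>
      <sourceDesc>
        <p>Born-digital sample with no print source.</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <lg>
        <l>First line of verse</l>
        <l>Second line of verse</l>
      </lg>
    </body>
  </text>
</TEI>"""


def read_header(xml_text):
    """Return a few header fields plus a count of encoded verse lines."""
    root = ET.fromstring(xml_text)
    file_desc = root.find("tei:teiHeader/tei:fileDesc", NS)
    return {
        "title": file_desc.findtext("tei:titleStmt/tei:title", namespaces=NS),
        "publication": file_desc.findtext("tei:publicationStmt/tei:p", namespaces=NS),
        "source": file_desc.findtext("tei:sourceDesc/tei:p", namespaces=NS),
        # Count <l> elements anywhere in the document body.
        "verse_lines": len(root.findall(".//tei:l", NS)),
    }


if __name__ == "__main__":
    for field, value in read_header(SAMPLE).items():
        print(f"{field}: {value}")
```

Because the header sits in a predictable place, a catalogue or search tool can harvest titles and source descriptions without interpreting the encoded text itself.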

Core questions

  • What does the TEI offer that ad hoc markup does not?
  • How does a project customize the TEI to fit its sources without sacrificing interchange?
  • Which features of a document are worth modeling, and at what cost?
  • How do the TEI header and metadata support discovery and reuse?

Key concepts

  • TEI header
  • Customization (ODD)
  • Element set
  • Schema validation
  • Standoff annotation

Key theories

Community-maintained encoding standard
The TEI is governed by a consortium that maintains an extensible, documented vocabulary, so that encoding choices are grounded in shared practice rather than reinvented for every project.
Customization and constraint
Because the full TEI is very large, projects define a customization, typically written in ODD and compiled into a schema, that selects and adapts elements. The customization balances expressive coverage against consistency and makes validation possible.
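
The idea of constraint can be sketched in Python as a check of a document's elements against a project-chosen set. This is a deliberately simplified stand-in: real projects usually express the customization in ODD, generate a schema from it, and run a schema validator, which also checks attributes and content models. The ALLOWED set, the sample fragment (which omits the header for brevity), and the function names are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical project profile: the only elements this project permits.
ALLOWED = {"TEI", "text", "body", "lg", "l"}

# Hypothetical fragment, invented for illustration (header omitted for brevity).
DOC = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <lg>
        <l>A line with <hi>emphasis</hi></l>
        <l>A line with <damage>lost text</damage></l>
        <l>A line with <hi>more emphasis</hi></l>
      </lg>
    </body>
  </text>
</TEI>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def elements_outside_profile(xml_text, allowed=ALLOWED):
    """Count elements that are not TEI-namespaced or not in the allowed set."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for el in root.iter():
        in_tei = el.tag.startswith("{" + TEI_NS + "}")
        if not in_tei or local_name(el.tag) not in allowed:
            counts[local_name(el.tag)] += 1
    return dict(counts)


if __name__ == "__main__":
    for name, n in elements_outside_profile(DOC).items():
        print(f"<{name}> used {n} time(s) but is not in the project profile")
```

A report like this shows the trade-off in miniature: each element added to the profile widens what encoders can express, and each one left out keeps the corpus more uniform.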

History

The TEI was launched in 1987 under the sponsorship of three scholarly associations to standardize humanities text encoding; the membership-based TEI Consortium, established in 2000, now maintains it. Editions P1 through P3 were SGML-based, and P4 (2002) added XML support. TEI P5, released in 2007 and revised continuously since, is expressed in XML and supports customization through the ODD (One Document Does it all) framework. The standard now underlies a wide range of editions, corpora, and archives.

Debates

Comprehensiveness versus usability
The breadth of the TEI makes it powerful but daunting; debate continues over how much projects should customize and whether simpler subsets better serve interoperability.

Key figures

  • Lou Burnard
  • C. M. Sperberg-McQueen
  • Nancy Ide
  • Allen Renear

Seminal works

  • tei2024
  • ide1995
  • burnard2014

Frequently asked questions

Do I have to use the whole TEI to use the TEI?
No. Projects normally define a customization that selects the elements they need and constrains how they are used. This keeps encoding manageable and consistent while remaining compatible with the wider standard.
