ScholarGate

Method comparison

The selected methods are compared side by side.

Method: Historical Named-Entity Recognition | Handwritten Text Recognition for Archives
Field: Digital History | Digital History
Family: Process / pipeline | Machine learning
Origin year: 2019 | 2019
Originator: L. Seaward and colleagues | Transkribus and the READ project
Type: text-analysis-pipeline | ml-recognition-pipeline
Source (same for both methods): Muehlberger, G., Seaward, L., Terras, M., et al. (2019). Transforming scholarship in the archives through handwritten text recognition: Transkribus as a case study. Journal of Documentation, 75(5), 954-976.
Aliases (Historical Named-Entity Recognition): Historical NER, Entity extraction from historical sources, Diachronic named-entity recognition, Archival entity tagging
Aliases (Handwritten Text Recognition for Archives): HTR, Manuscript transcription AI, Automatic handwriting transcription, Neural archival transcription
Related: 3 | 3
Summary (Historical Named-Entity Recognition): Historical named-entity recognition adapts a core natural-language-processing task, identifying and classifying the names of persons, places, organizations, and dates in text, to the distinctive difficulties of historical sources. Modern NER systems are trained on clean contemporary text, but historical documents arrive full of archaic and inconsistent spelling, obsolete place-names, OCR or handwriting-transcription errors, and entities that have since changed names or vanished. Work surveyed by Seaward and colleagues addresses these obstacles, combining machine-learning sequence models with historical gazetteers and authority files to recognize entities reliably in noisy diachronic text. The payoff is large: once persons, places, and dates are extracted and linked to standard identifiers, historians can build prosopographies of who appears with whom, populate historical GIS with mapped place-names, and structure vast textual archives for search and analysis. Historical NER thus serves as a crucial bridge, turning the unstructured output of digitization and text mining into structured, linkable data about the actors and settings of the past.

Summary (Handwritten Text Recognition for Archives): Handwritten text recognition for archives converts digital images of manuscript pages into searchable, machine-readable text, unlocking the vast holdings of handwritten material that optical character recognition, designed for print, cannot read. Exemplified by platforms such as Transkribus, developed in the READ project, modern HTR uses deep neural networks trained on transcribed examples to recognize the highly variable scripts of letters, registers, charters, and notebooks across centuries and languages. The pipeline first analyzes page layout and segments the image into text regions and lines; a recurrent or transformer-based recognizer then decodes each line into characters, typically using connectionist temporal classification to align pixels with text without needing character-level segmentation. Crucially, recognition models are trained and improved on ground-truth transcriptions supplied by scholars, so accuracy rises as more material is annotated. By making manuscripts machine-readable at scale, HTR is the gateway technology of digital archival history, feeding full-text search, named-entity recognition, and large-corpus text mining of sources that were previously legible only page by page.
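The connectionist temporal classification step mentioned in the HTR summary can be illustrated with a minimal sketch of greedy (best-path) CTC decoding. This is not Transkribus code: the function name, the blank index, the toy alphabet, and the one-hot "scores" are all assumptions made for illustration, and production systems often use beam search with a language model instead.

```python
from itertools import groupby

BLANK = 0  # index of the CTC blank symbol (assumed; conventions vary by toolkit)


def ctc_greedy_decode(frame_scores, alphabet):
    """Best-path CTC decoding of one text line.

    frame_scores: one score vector per time step (roughly, per image column).
    alphabet: string whose i-th character corresponds to class index i + 1.
    """
    # 1. Take the highest-scoring class at every time step.
    best_path = [max(range(len(s)), key=s.__getitem__) for s in frame_scores]
    # 2. Collapse consecutive repeats of the same class.
    collapsed = [label for label, _ in groupby(best_path)]
    # 3. Drop blanks; what remains is the character sequence.
    return "".join(alphabet[label - 1] for label in collapsed if label != BLANK)


def one_hot(index, size=4):
    """Toy stand-in for a recognizer's output at one time step."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Classes: 0 = blank, 1 = 'a', 2 = 'c', 3 = 't' (hypothetical alphabet).
frames = [one_hot(i) for i in [2, 2, 0, 1, 0, 3, 3]]
print(ctc_greedy_decode(frames, "act"))
```

Because repeats are collapsed before blanks are removed, a doubled letter survives only when a blank separates its two occurrences, which is how CTC avoids the need for character-level segmentation.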
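The gazetteer side of the Historical NER summary can be sketched in a similar spirit. The example below covers only the lookup-and-link part, using fuzzy string matching from the Python standard library to tolerate variant spellings; it is not a sequence model and is not the method of the cited paper. The gazetteer entries, the identifiers, and the function names are hypothetical.

```python
import difflib
import string

# Hypothetical mini-gazetteer: normalized spelling -> (modern name, type, made-up ID).
GAZETTEER = {
    "london": ("London", "PLACE", "gaz:0001"),
    "londinium": ("London", "PLACE", "gaz:0001"),
    "constantinople": ("Istanbul", "PLACE", "gaz:0002"),
}


def normalize(token):
    """Lowercase, strip surrounding punctuation, and modernize the long s."""
    return token.strip(string.punctuation).lower().replace("ſ", "s")


def link_entities(text, cutoff=0.8):
    """Return (surface form, modern name, type, ID) for tokens found in the gazetteer."""
    results = []
    for raw in text.split():
        token = normalize(raw)
        if not token:
            continue
        # Fuzzy lookup absorbs spelling variation and transcription noise.
        match = difflib.get_close_matches(token, list(GAZETTEER), n=1, cutoff=cutoff)
        if match:
            name, etype, ident = GAZETTEER[match[0]]
            results.append((raw.strip(string.punctuation), name, etype, ident))
    return results


for hit in link_entities("He sailed from Londun to Constantinople."):
    print(hit)
```

The cutoff trades recall for precision: lowering it catches more variant spellings but starts matching ordinary words, which is one reason the approaches described above pair gazetteers with a trained sequence model rather than relying on lookup alone.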
ScholarGate dataset (each method): v1, 2 sources, PUBLISHED


ScholarGate method comparison: Historical Named-Entity Recognition · Handwritten Text Recognition for Archives. Retrieved 2026-06-25 from https://scholargate.app/ko/compare