Unlocking the archives: A pipeline for scanning, transcribing, and modelling entities of archival documents into Linked Open Data

Leon van Wissen, Chiara Latronico, Veruska Zamborlini, Jirsi Reinders, Charles van den Heuvel

June 2020

Abstract

This paper presents a full pipeline from archives to annotations, comprising the successive stages of scanning, indexing, transcribing, correcting, aggregating, and modelling the entities of archival documents into RDF as Linked Open Data. The pipeline supports the creation of transparent datasets that can be replicated, evaluated, and used for quantitative analyses in digital humanities research.

Type

Conference paper

Publication

DH Benelux 2020