

An Artificial Intelligence Approach to Comparing Text Versions


Literary works are dynamic entities: they go through different stages of development before publication, and often continue to change even after their first publication. The early versions of a work, such as notes, draft manuscripts and typescripts, still show the traces of this dynamic development in the form of deletions, additions or substitutions. Today, these documents are carefully transcribed, annotated and encoded in a machine-readable language. Using text comparison tools, scholars can automatically compare the encoded text versions and examine the different stages in the work’s development. So far, however, it is not possible to include the annotations in the comparison process. This means that relevant scholarly information is lost.
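The limitation described above can be illustrated with a small sketch. It is not the project's tool or method: it uses Python's standard `difflib` module for alignment, and the token representation (pairs of text and annotation) and the labels `"del"` and `None` are hypothetical, chosen only for this example. Two versions contain the same words, but in one of them a word carries a deletion annotation.

```python
from difflib import SequenceMatcher

# Hypothetical token representation: (text, annotation) pairs.
# The annotation labels are illustrative, not a real encoding scheme.
version_a = [("The", None), ("old", "del"), ("house", None), ("stood", None)]
version_b = [("The", None), ("old", None), ("house", None), ("stood", None)]


def diff_ops(a, b):
    """Align two token sequences and return only the non-equal opcodes."""
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]


# Text-only comparison: annotations are stripped before alignment,
# so the deletion mark on "old" cannot show up as a difference.
text_only = diff_ops(
    [text for text, _ in version_a],
    [text for text, _ in version_b],
)

# Annotation-aware comparison: each token is compared together with
# its annotation, so the deletion mark counts as a difference.
annotated = diff_ops(version_a, version_b)

print("text only:", text_only)
print("annotated:", annotated)
```

In this sketch the text-only comparison reports no differences, while the annotation-aware comparison flags the second token. Treating text and annotation as one unit is the simplest possible design; it cannot tell a changed word from a changed annotation, which is one reason a dedicated tool is needed.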

The project employs machine learning technologies to develop a comparison tool that can take into account text as well as annotations. As a result, it will allow scholars to analyze the textual development at unprecedented levels of detail.

Participating organisations

Social Sciences & Humanities
Huygens Instituut
Netherlands eScience Center


Elli Bleeker
Lead Applicant
Huygens Institute for the History of the Netherlands
Jisk Attema
Programme Manager
Netherlands eScience Center
Kody Moodley
Lead RSE
Netherlands eScience Center
Ronald Haentjens Dekker
Huygens Institute for the History of the Netherlands