doc2vec-based assisted close reading with support for abstract concept-based search and context-based search

What evidence can do for you

  • Provides AI/machine-learning support for close-reading-based research
  • Intuitive example-based search through large corpora
  • Browser-based usage / user interface
  • Concept-based search using abstract doc2vec representations
  • Context-based search using word-frequency/TF-IDF representations
  • Automated processing of user-supplied corpora

Machine-supported research in humanities

While research in the humanities has benefited from the digitization of text corpora and the development of computer-based text-analysis tools, the interface current systems provide is incompatible with the proven scholarly method of close reading, which is key in many research scenarios pursuing complex research questions.

In practice, it is often restrictive and difficult, if not impossible, to formulate adequate selection criteria, particularly for more complex or abstract concepts, within the framework of a keyword-based search, which is the standard entry point to digitized text collections.

Querying by example - close reading with tailored suggestions

evidence provides an alternative, intuitive entry point into collections by leveraging the doc2vec framework. Using doc2vec, evidence learns abstract representations of the theme and content of the elements of the user's corpus. Then, instead of trying to translate their scientific query into keywords, the user compiles a set of relevant elements as starting points, i.e. examples of the concept they are interested in, and queries the corpus based on those examples. Specifically, evidence retrieves elements with similar abstract representations and presents them to the user, using the user's feedback to refine its retrieval.
Furthermore, this concept-based query mode is complemented by the ability to perform additional, context-based retrieval using the more-like-this function provided by Elasticsearch, which matches elements on word-frequency/TF-IDF features.
Together, this enables a user to combine the power of a close-reading approach with that of a large digitized corpus: elements likely to be of interest are selected from the entire corpus, while the decision as to which evidence is useful remains with the user.
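For the context-based mode, Elasticsearch's more_like_this query can take existing documents as examples and match on term statistics. A minimal sketch of such a query body follows; the index and field names ("corpus", "text") and the document ids are hypothetical, not taken from evidence.

```python
# Sketch of an Elasticsearch more_like_this query body of the kind used
# for context-based retrieval. Index/field names are hypothetical.
import json


def more_like_this_query(doc_ids, field="text", index="corpus"):
    """Build a query that finds elements whose word-frequency/TF-IDF
    profile resembles the given example documents."""
    return {
        "query": {
            "more_like_this": {
                "fields": [field],
                # "like" may reference indexed documents by id.
                "like": [{"_index": index, "_id": d} for d in doc_ids],
                "min_term_freq": 1,     # consider every term in the examples
                "max_query_terms": 25,  # cap the number of selected terms
            }
        }
    }


body = more_like_this_query(["doc-17", "doc-42"])
print(json.dumps(body, indent=2))
```

The body would be sent to the `_search` endpoint of the index; Elasticsearch then scores all other documents against the terms it extracts from the examples.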

No keywords available
Programming languages
  • TypeScript 43%
  • Go 29%
  • Jupyter Notebook 19%
  • Shell 6%
  • CSS 1%
  • Dockerfile 1%
  • HTML 1%
  • Python 1%

License
GPL-3.0
Source code

Participating organisations

Social Sciences & Humanities
KNAW Humanities Cluster
Netherlands eScience Center


Digital technologies to analyze eyewitness accounts of mass violence

Author(s): Netherlands eScience Center
Published in 2017


Contact person

Meiert Grootes
Netherlands eScience Center

Contributors

Bas Leenknegt
KNAW Humanities Cluster
Christiaan Meijer
Netherlands eScience Center
Faruk Diblen
Netherlands eScience Center
Hayco de Jong
KNAW Humanities Cluster
Jurriaan H. Spaaks
Netherlands eScience Center
Lars Buitinck
KNAW Humanities Cluster
Meiert Grootes
Netherlands eScience Center
Stefan Verhoeven
Netherlands eScience Center
Willem van Hage
Netherlands eScience Center

Related projects


Ego Documents Events modelling – how individuals recall mass violence