adhtools
Use adhtools for analyzing Arabic corpora.
Digital humanities and the Arabic-Islamic corpus
Despite some pioneering efforts in recent times, the longue durée analysis of conceptual history in the Islamic world remains a largely unexplored field of research. Researchers of Islamic intellectual history still tend to study a limited canon of texts, one that earlier Western scholars of the Islamic world made available largely because of the relevance of these texts to Western theories, concepts and ideas. Indigenous conceptual developments and innovations are therefore insufficiently understood, particularly with regard to the transition from premodern to modern thought in Islam.
This project seeks to harness state-of-the-art Digital Humanities approaches and technologies to make pioneering forays into the vast corpus of digitised Arabic texts that has become available in the last decade. The work is organised around four case studies, each of which examines a separate genre of Arabic and Islamic literary history (jurisprudence, inter-faith literature, early modern and modern journalism, and Arabic poetry).
In this project, an interactive corpus explorer was developed that allows researchers to examine the distribution and context of concepts in a large volume of texts. The tool was then applied to the four case studies, alongside advanced text mining methods.