Democracy Datasets
Sentence-wise split datasets of country reports in CSV format
Assessing democratic backsliding in Europe and its neighborhood
Democracies in Europe and beyond face the threat of sliding back into authoritarianism. Despite initially promising signs of liberalization, ‘democratic backsliding’ has occurred prominently in Russia and Turkey, but also in Poland and Hungary and in established democracies such as France and the UK. Democratic backsliding has attracted the attention of international agencies (e.g., Freedom House, V-Dem), which regularly assess the quality of democracy in different countries. Nevertheless, such assessments suffer from subjectivity bias, as they rely mostly on qualitative judgments produced by country experts. We lack a comparative view of the dimensions and quality of democratic assessments. BackDem aimed to develop a digital tool for text processing that: 1) maps dimensions of democratic quality in texts and 2) assesses the precision of democratic assessments.
We investigated various ways to tackle these challenges, using classical statistical methods such as keyword-based extraction as well as AI language models including BERT, RoBERTa and LEGAL-BERT. The project involved substantial scraping of online documents and labelling a subset of the corpus to train and fine-tune AI models to detect sentiment and democratic category. In the limited time available for the project, we managed to scrape only a subset of the corpus. Nevertheless, the digital tools and the documentation provide a basis for future researchers to further develop and enrich the corpus.
Sentence-wise split datasets of country reports in CSV format
A scraping tool for collecting country reports from various sources.
LLM transfer learning to classify country reports into democracy dimensions.
Various Jupyter notebooks for topic modelling on democracy texts, including BERT-based, dictionary-based and keyword-based approaches.
A Python library to split documents sentence-wise and convert various document formats to CSV format.
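To illustrate what a sentence-wise split dataset might look like, here is a minimal sketch using only the Python standard library. It is not the project's own library: the function names (split_sentences, report_to_rows, write_csv), the column names (country, source, sentence_id, sentence) and the sample text are all hypothetical, and the regular-expression boundary rule is a deliberately naive stand-in for a proper sentence tokenizer.

```python
import csv
import io
import re

# Naive boundary rule (an assumption for this sketch): split on whitespace that
# follows '.', '!' or '?' and precedes an uppercase letter. Abbreviations such
# as "Mr. Smith" will be split incorrectly; a real tokenizer handles these.
_SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")

# Hypothetical column layout for the output CSV.
FIELDNAMES = ["country", "source", "sentence_id", "sentence"]


def split_sentences(text):
    """Collapse whitespace and split the text into a list of sentences."""
    normalized = " ".join(text.split())
    if not normalized:
        return []
    return [s for s in _SENTENCE_BOUNDARY.split(normalized) if s]


def report_to_rows(text, country, source):
    """Turn one report into one row per sentence, numbered from 1."""
    return [
        {"country": country, "source": source, "sentence_id": i, "sentence": s}
        for i, s in enumerate(split_sentences(text), start=1)
    ]


def write_csv(rows, handle):
    """Write the rows with a header line to any text file-like object."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    # Made-up sample text, not taken from any real country report.
    sample = (
        "Elections were held in May. Observers reported irregularities! "
        "Was the press free? Courts remained independent."
    )
    buffer = io.StringIO()
    write_csv(report_to_rows(sample, "Exampleland", "hypothetical-report"), buffer)
    print(buffer.getvalue())
```

Keeping one sentence per row, together with its source and position, makes it straightforward to attach labels (for example a sentiment or a democracy dimension) to individual sentences later on.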