This project aimed to provide social science and humanities scholars with computational tools for analyzing multiple aspects of the quality of online information. Over the course of the project, we focused mainly on two aspects of information quality: argument quality and toxicity. In both cases, our contribution was both theoretical and applied.
Regarding argument quality, we developed theoretical frameworks based on the bipolar argumentation framework, which we tailored explicitly to online reviews. We then implemented a complete pipeline that assesses information quality by checking arguments. The pipeline first applies argument mining components based on crowdsourcing and NLP methods, and then performs argument reasoning over weighted and bipolar argumentation frameworks using logical reasoners. It is implemented both in the Orange development framework and in Python. We showed that this unsupervised approach is effective in identifying high-quality online reviews.
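As an illustration of the reasoning step only, the following minimal Python sketch evaluates a small weighted bipolar argumentation graph. The class `BipolarGraph`, the function `evaluate`, the `influence` parameter, and the linear update rule are all assumptions made for this example; they are not the semantics or the code used in the project pipeline.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class BipolarGraph:
    """Hypothetical container for a weighted bipolar argumentation graph."""
    weights: Dict[str, float]                       # base weight of each argument, in [0, 1]
    attacks: List[Tuple[str, str]] = field(default_factory=list)   # (attacker, target)
    supports: List[Tuple[str, str]] = field(default_factory=list)  # (supporter, target)


def evaluate(graph: BipolarGraph, influence: float = 0.5,
             max_iterations: int = 100, tolerance: float = 1e-9) -> Dict[str, float]:
    """Iteratively compute a strength in [0, 1] for every argument.

    Illustrative rule (an assumption, not a published semantics):
    strength = clamp(base + influence * (sum of supporters - sum of attackers)).
    """
    strength = dict(graph.weights)
    for _ in range(max_iterations):
        updated = {}
        for arg, base in graph.weights.items():
            support = sum(strength[s] for s, target in graph.supports if target == arg)
            attack = sum(strength[a] for a, target in graph.attacks if target == arg)
            raw = base + influence * (support - attack)
            updated[arg] = min(1.0, max(0.0, raw))  # keep the score in [0, 1]
        # Stop once no score changes by more than the tolerance.
        converged = all(abs(updated[a] - strength[a]) <= tolerance for a in updated)
        strength = updated
        if converged:
            break
    return strength


if __name__ == "__main__":
    # Toy review: one claim, one supporting and one attacking argument,
    # plus a rebuttal that attacks the attacker.
    graph = BipolarGraph(
        weights={"claim": 0.5, "pro": 0.8, "con": 0.6, "rebuttal": 0.9},
        attacks=[("con", "claim"), ("rebuttal", "con")],
        supports=[("pro", "claim")],
    )
    for name, score in sorted(evaluate(graph).items()):
        print(f"{name}: {score:.3f}")
```

In this toy graph the rebuttal weakens the attacker, so the claim ends with a higher strength than its base weight. Any real use would replace the update rule with the semantics of the chosen reasoner.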
In the second part of the project, we focused on characterizing low-quality online information. In particular, we developed a theoretical framework and conducted a large survey of toxic online memes. Based on this framework, we developed an ontology for characterizing toxic symbology, tailored to memes. We also studied the ability of small LLMs to identify and explain toxic content in memes.
These two phases of the project focused on the two ends of the information quality spectrum: we first looked at methods to identify high-quality items and then characterized low-quality ones. The methods employed allow for deep analysis, explanation, and understanding of content quality, providing a basis for meeting the different quality requirements of different users.
We plan to continue this line of research, combining and extending the methodologies explored so far, to characterize different types of information items according to multiple aspects of quality in an explainable manner.