The Cultural Heritage AI Cookbook: Notebooks
Google Colab notebook that uses large language models for analyzing and expanding text metadata related to cultural heritage
Description
The project Enriching Digital Heritage with LLMs and LOD aimed to introduce the cultural heritage community to the strengths of large language models (LLMs) for extracting information from text. Cultural heritage institutions hold large volumes of text, for example metadata texts describing museum artifacts. LLMs are well suited to analyzing text data, linking concepts to linked open data (LOD) collections, and finding relations between the concepts.
To this end, a Google Colab notebook (demo_qwen.ipynb) has been developed. It recognizes named entities, links them to an LOD database (Wikidata), and finds relations between the concepts and the artifacts described by the texts. The notebook uses a locally run LLM (Qwen) for these tasks. Its performance is evaluated against gold standard analyses of a sample of text metadata from the Egyptian Museum of Turin.
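The steps described above can be sketched roughly as follows. This is a minimal illustration and not the code in demo_qwen.ipynb: the prompt wording, the JSON reply format, the function names, and the choice of precision, recall, and F1 as evaluation measures are all assumptions made for the example. Only the Wikidata wbsearchentities endpoint is a real API, and the sketch builds the request URL without sending it.

```python
import json
from urllib.parse import urlencode

# Hypothetical prompt template; the notebook's actual prompt may differ.
NER_PROMPT = (
    "List the named entities in the text below as a JSON array of objects "
    'with the keys "text" and "label".\n\nText: {text}'
)


def build_ner_prompt(text):
    """Fill the (assumed) prompt template with one metadata text."""
    return NER_PROMPT.format(text=text)


def parse_entities(llm_output):
    """Parse the span between the first '[' and the last ']' as JSON.

    Returns an empty list if no valid JSON array is found.
    """
    start, end = llm_output.find("["), llm_output.rfind("]")
    if start == -1 or end < start:
        return []
    try:
        items = json.loads(llm_output[start:end + 1])
    except json.JSONDecodeError:
        return []
    # Keep only well-formed entity objects.
    return [e for e in items if isinstance(e, dict) and "text" in e]


def wikidata_search_url(entity_text, language="en", limit=5):
    """Build a URL for Wikidata's wbsearchentities API (no request is sent)."""
    params = {
        "action": "wbsearchentities",
        "search": entity_text,
        "language": language,
        "limit": limit,
        "format": "json",
    }
    return "https://www.wikidata.org/w/api.php?" + urlencode(params)


def precision_recall_f1(predicted, gold):
    """Compare predicted and gold (text, label) pairs by exact match."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

In practice the string returned by build_ner_prompt would be sent to the locally run model, the reply passed to parse_entities, and each entity text looked up through the URL from wikidata_search_url. Exact-match scoring is the strictest option; a real evaluation might also credit partial overlaps.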
The notebook is used to show the cultural heritage community the benefits of using LLMs for text analysis. The project partners will use the experience gained during the project to write a cookbook with practical suggestions for using LLMs to analyze and expand text data related to cultural heritage.
Participating organisations
Contributors
Contact person
Erik Tjong Kim Sang
Research Software Engineer
Netherlands eScience Center
0000-0002-8431-081X
Related projects
Enriching Digital Heritage with LLMs and LOD
Enriching Digital Heritage with Large Language Models and Linked Open Data