
The Cultural Heritage AI Cookbook: Notebooks

Google Colab notebook that uses large language models for analyzing and expanding text metadata related to cultural heritage

3 contributors, 221 commits, last commit ≈ 2 weeks ago, 13 stars, 18 forks

Description

The project Enriching Digital Heritage with LLMs and LOD aimed to introduce the cultural heritage community to the strengths of large language models (LLMs) in extracting information from text. Cultural heritage institutions hold large volumes of text, such as the metadata that describes museum artifacts. LLMs are well suited to analyzing text data, linking concepts to linked open data (LOD) collections, and finding relations between the concepts.

To this end, a Google Colab notebook (demo_qwen.ipynb) has been developed that recognizes named entities, links them to an LOD database (Wikidata), and finds relations between the concepts and the artifacts described by the texts. The notebook uses a locally run LLM (Qwen) for these tasks. Its performance is evaluated against gold-standard analyses of a sample of text metadata from the Egyptian Museum of Turin.
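The linking and evaluation steps described above can be illustrated with a minimal sketch. It is not taken from demo_qwen.ipynb: the function names, the entity-type labels, and the sample entities are hypothetical, and the notebook may link and score differently. The sketch assumes that candidate entities are looked up through the public Wikidata `wbsearchentities` API action and that extraction quality is measured with set-based precision, recall, and F1 against a gold standard.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def wikidata_search_url(label, language="en", limit=5):
    """Build a wbsearchentities request URL for an entity label.

    Only the URL is constructed here; fetching it (for example with
    urllib.request or requests) returns JSON with candidate Wikidata items.
    """
    params = {
        "action": "wbsearchentities",
        "search": label,
        "language": language,
        "limit": limit,
        "format": "json",
    }
    return f"{WIKIDATA_API}?{urlencode(params)}"


def precision_recall_f1(predicted, gold):
    """Score predicted items against gold items using exact set matching."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented illustrative data, not results from the project:
# (mention, entity type) pairs for one metadata text.
gold_entities = {("Turin", "LOC"), ("Ramesses II", "PER"), ("Deir el-Medina", "LOC")}
predicted_entities = {("Turin", "LOC"), ("Ramesses II", "PER"), ("Egypt", "LOC")}

if __name__ == "__main__":
    print(wikidata_search_url("Ramesses II"))
    print(precision_recall_f1(predicted_entities, gold_entities))
```

Exact set matching is the strictest choice: a prediction counts only if both the mention and its type agree with the gold annotation. A more lenient evaluation would need an explicit rule for partial matches.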

The notebook is used to show the cultural heritage community the benefits of using LLMs for text analysis. The project partners will draw on the experience gained during the project to write a cookbook with practical suggestions for using LLMs to analyze and expand text data related to cultural heritage.

Keywords
Programming languages
  • Jupyter Notebook 95%
  • Python 5%
License

Participating organisations

Netherlands eScience Center
King's College London
University of Southern Denmark

Contributors

ETKS
Research Software Engineer
Netherlands eScience Center
Gethin Rees
Tariq Yousef
Co-Applicant
Akademie der Wissenschaften und der Literatur Mainz

Related projects

Enriching Digital Heritage with LLMs and LOD

Enriching Digital Heritage with Large Language Models and Linked Open Data

Updated 2 months ago
Finished