NetAudit

Interpretable embeddings for the Dutch population network

image credits: Shutterstock

Network analysis is an increasingly vital tool in the social sciences. It enables researchers to study how information, behaviours, and attitudes spread through social structures. Statistics Netherlands provides a unique and powerful resource for such analysis: a full-scale population network of the Netherlands.

In parallel, machine learning has introduced tools like embeddings to represent complex data (such as text or networks) as low-dimensional numeric vectors. These embeddings can capture meaningful patterns and are commonly used for tasks such as similarity search or attribute prediction. In the NetAudit project, we bring these two worlds together by learning embeddings for the entire Dutch population network.
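To make the similarity-search use case concrete, here is a minimal sketch in Python. It assumes the embeddings are held as a NumPy array with one row per node. The function name `top_k_similar` and the random toy data are hypothetical and are not part of the NetAudit outputs.

```python
import numpy as np


def top_k_similar(embeddings, query_index, k=3):
    """Return indices and cosine similarities of the k rows closest to one row.

    `embeddings` is assumed to be an (n_nodes, n_dims) array; this layout is an
    illustrative assumption, not the project's storage format.
    """
    # Normalise rows to unit length so dot products equal cosine similarities.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)
    scores = unit @ unit[query_index]
    # Exclude the query row itself from the ranking.
    scores[query_index] = -np.inf
    order = np.argsort(-scores)[:k]
    return order, scores[order]


# Toy data standing in for real embeddings (randomly generated, hypothetical).
rng = np.random.default_rng(0)
toy_embeddings = rng.normal(size=(100, 16))
neighbours, similarities = top_k_similar(toy_embeddings, query_index=0, k=3)
```

Cosine similarity is used here because it ignores vector length and compares direction only. Other distance measures may suit a given analysis better.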

However, one challenge remains: interpretability. Unlike traditional social science variables, embedding dimensions often lack a clear meaning. To address this, we applied a transformation that makes the dimensions sparse and orthogonal, so that they capture distinct and interpretable aspects of the population network. This makes the embeddings more useful not only for prediction tasks, but also for exploratory research and hypothesis generation.
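As a rough illustration of what "sparse and orthogonal" means for an embedding matrix, the sketch below computes two simple diagnostics on a toy example. It does not implement the project's transformation, which is not specified here. The function names, the tolerance, and the toy matrix are all hypothetical.

```python
import numpy as np


def sparsity(Z, tol=1e-8):
    """Fraction of entries whose absolute value is at most `tol`."""
    return float(np.mean(np.abs(Z) <= tol))


def max_offdiag_cosine(Z):
    """Largest absolute cosine similarity between two different dimensions.

    A value of 0 means that all columns (dimensions) are mutually orthogonal.
    """
    norms = np.linalg.norm(Z, axis=0)
    unit = Z / np.clip(norms, 1e-12, None)
    gram = unit.T @ unit
    # Ignore each dimension's similarity with itself.
    np.fill_diagonal(gram, 0.0)
    return float(np.max(np.abs(gram)))


# Hypothetical toy matrix: 6 nodes, 3 dimensions, each node loads on one dimension.
toy = np.array([
    [1.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],
    [0.0, 1.5, 0.0],
    [0.0, 0.5, 0.0],
    [0.0, 0.0, 3.0],
    [0.0, 0.0, 1.0],
])
toy_sparsity = sparsity(toy)
toy_overlap = max_offdiag_cosine(toy)
```

In this toy case, two thirds of the entries are zero and no two dimensions overlap, which is the idealised pattern that a sparse, orthogonal representation aims for.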

The untransformed and transformed population network embeddings are available for the years 2020, 2021, and 2022 within the secure remote access environment of Statistics Netherlands, through the Storage Facility (in collaboration with ODISSEI).

This project is funded by the NWA ODISSEI Roadmap grant, task 4.4.

Participating organisations

ODISSEI
Netherlands eScience Center
Social Sciences & Humanities
Delft University of Technology

Output

Team

Megha Khosla
Flavio Hafner
Research Software Engineer
Netherlands eScience Center
Malte Lüken
Research Software Engineer
Netherlands eScience Center
Javier Garcia-Bernardo
Contributor
Charles University, Faculty of Social Sciences
Sreeparna Deb
Niels Drost
Programme Manager
Netherlands eScience Center

Related projects

Modeling life outcomes

Modeling life outcomes through foundational machine learning models

Updated 2 months ago
Finished

PreFer

Data challenge for Predicting Fertility outcomes in the Netherlands

Updated 3 months ago
Finished