Image credit: Ryoji Iwata via Unsplash.
Hundreds of registry data sets record people's life courses. In this project, we want to understand how well deep learning algorithms can learn to represent and predict life course data.
Our specific goals are to
- develop a generic, population-scale data model for life course data that captures the relational aspects of these data
- develop tokenizers to transform the (relational) event data into machine-learning-ready event sequences
- develop and train deep learning architectures suitable for these modalities
- develop evaluation tasks for assessing the models' skill for social science research, focusing on "hard" prediction tasks such as edge prediction and generative modeling.
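
To make the tokenizer goal more concrete, the following is a minimal sketch of how event records could be turned into per-person token sequences. It is an illustration under stated assumptions, not the project's actual tokenizer: the `Event` fields, the `kind:value` token format, the special tokens, and the example records are all hypothetical, and the relational aspects of the data (links between people) are left out entirely.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # Hypothetical record layout; real registry data will look different.
    person_id: int
    year: int
    kind: str   # e.g. "birth", "education"
    value: str  # e.g. a country code or degree level


def tokenize(events):
    """Order each person's events by year and map them to integer token ids."""
    # Reserve ids for padding and unknown tokens (an assumed convention).
    vocab = {"[PAD]": 0, "[UNK]": 1}
    sequences = defaultdict(list)
    for ev in sorted(events, key=lambda e: (e.person_id, e.year)):
        token = f"{ev.kind}:{ev.value}"
        # A token seen for the first time gets the next free id.
        token_id = vocab.setdefault(token, len(vocab))
        sequences[ev.person_id].append(token_id)
    return dict(sequences), vocab


# Made-up example records, deliberately out of order.
events = [
    Event(2, 2001, "education", "BSc"),
    Event(1, 1990, "birth", "NL"),
    Event(1, 2012, "education", "BSc"),
    Event(2, 1980, "birth", "NL"),
]
sequences, vocab = tokenize(events)
print(sequences)
print(vocab)
```

Each person ends up with one chronologically ordered list of integer ids, which is the shape most sequence models expect as input. A real tokenizer would also need to handle event timing, numeric values, and links between individuals.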