Integrated omics analysis for small molecule-mediated host-microbiome interactions

Advancing our understanding of molecular mechanisms of health and disease

The microbes in our bodies are fundamental to our health. At the molecular level, many of their interactions with human tissues are mediated by microbial specialized metabolites.

While metabolomics provides a powerful technique to profile these metabolites, most microbial molecules have unknown structures; hence, over 95% of detected masses cannot be functionally interpreted or linked to their producers. This currently thwarts efforts to understand important disease states of our microbiome.

Many innovative computational workflows have recently been designed to predict molecular (sub)structures from genomic or metabolomic data; however, these efforts have remained largely unconnected. Integrating these data will allow the partial information provided by each field to complement the other, yielding much better functional predictions.

Moreover, it will connect vital information from both data types: while metabolomics reveals in vivo relevance, genomics reveals biological origin. Here, we propose to design a novel algorithm to connect molecular substructures identified in tandem mass-spectrometric data to sets of genes within biosynthetic gene clusters (BGCs) detected in (meta)genomic data. Subsequently, we will integrate this algorithm with our previous methods for metabolome analysis (spectral networking, substructure detection) and genome analysis (BGC identification and clustering) in one comprehensive eScience workflow.

Finally, we will demonstrate its potential by identifying molecules prominent during periods of relapse in a longitudinal study of inflammatory bowel disease (IBD) and connecting them to their producers. Ultimately, our workflow will illuminate the vast unknown metabolic space within the human microbial metabolome, and greatly advance our understanding of molecular mechanisms of health and disease.
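The linking step described above, connecting substructures seen in mass-spectrometric data to gene sets seen in genomic data, can be illustrated with a very simple co-occurrence baseline. The sketch below is not the algorithm proposed in this project, which is still to be designed; it only assumes that each substructure and each gene family can be reduced to the set of samples in which it was detected, and scores every pair by Jaccard overlap. All names (`link_substructures_to_gene_families`, `substructure_A`, `gene_family_X`, sample IDs, the `min_score` threshold) are hypothetical and chosen for illustration.

```python
from itertools import product


def jaccard(a: set, b: set) -> float:
    """Jaccard overlap between two sets of sample IDs (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link_substructures_to_gene_families(substructure_samples, gene_family_samples,
                                        min_score=0.5):
    """Score every (substructure, gene family) pair by sample co-occurrence.

    Both arguments map a name to the set of samples in which that item was
    detected. Returns (substructure, gene_family, score) tuples with
    score >= min_score, best score first.
    """
    links = []
    for (sub, sub_set), (fam, fam_set) in product(substructure_samples.items(),
                                                  gene_family_samples.items()):
        score = jaccard(sub_set, fam_set)
        if score >= min_score:
            links.append((sub, fam, score))
    # Sort by descending score, then by name for a stable order.
    return sorted(links, key=lambda link: (-link[2], link[0], link[1]))


# Hypothetical toy data: sample IDs in which each item was observed.
substructure_samples = {
    "substructure_A": {"s1", "s2", "s3"},
    "substructure_B": {"s4"},
}
gene_family_samples = {
    "gene_family_X": {"s1", "s2", "s3", "s5"},
    "gene_family_Y": {"s4"},
}

links = link_substructures_to_gene_families(substructure_samples, gene_family_samples)
for sub, fam, score in links:
    print(f"{sub} <-> {fam}: {score:.2f}")
```

A plain overlap score like this ignores sample relatedness and detection limits, so a real workflow would likely need a more careful statistical treatment; it is shown only to make the idea of pairing the two data types concrete.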

Participating organisations

Netherlands eScience Center
Wageningen University & Research
Life Sciences

Team

Florian Huber
eScience Research Engineer
Netherlands eScience Center
Justin J. J. van der Hooft
Principal investigator
Wageningen University & Research
Lars Ridder
eScience Coordinator
Netherlands eScience Center
Stefan Verhoeven
Senior eScience Research Engineer
Netherlands eScience Center

Related projects

NPLinker

A community-supported workflow connecting microbial genes and organisms to their molecular products

Updated 3 months ago
In progress

FEDMix

Fusible evolutionary deep neural network mixture learning from distributed data for robust medical...

Updated 21 months ago
Finished

DeepRank

Scoring 3D protein-protein interaction models using deep learning

Updated 21 months ago
Finished

Googling the cancer genome

Identification and prioritization of cancer-causing structural variations in whole genomes

Updated 1 month ago
Finished

Classifying activity types

Gaining insights from wearable movement sensors

Updated 21 months ago
Finished

Enhancing Protein-Drug Binding Prediction

Combining molecular simulation and eScience technologies

Updated 1 month ago
Finished

Related software

matchms

Python library for fuzzy comparison of mass spectrum data and other Python objects

Updated 14 months ago

MS2DeepScore

Deep learning based similarity measure of mass spectrometry data.

Updated 2 weeks ago

Paired omics data platform

If you do metabolomics experiments with mass spectra and have sequenced the genomes of the samples, then the platform can help you link them.

Updated 29 months ago

spec2vec

spec2vec is a novel similarity measure for comparing mass spectrometry data, which learns peak representations using Word2Vec.

Updated 3 months ago