
Deep learning-based similarity measure for mass spectrometry data.

1398 commits · Last commit ≈ 1 week ago · 59 stars · 26 forks


What MS2DeepScore can do for you

  • Predict chemical similarity based on MS/MS mass spectra

A classical way to compare MS/MS mass spectra is to quantify their peak overlap, often using variations of the cosine similarity score. Such measures tend to work well for nearly identical spectra, i.e. cases of very high peak overlap. We recently introduced Spec2Vec, an unsupervised machine learning approach for computing spectrum similarities based on learned relationships between peaks across large training datasets [ref]. Spec2Vec-based similarity scores were observed to correlate more strongly than classical cosine-like scores with the actual structural similarities between the underlying compounds. Additional core advantages are its fast computation, which makes it possible to compare query spectra against very large libraries, and the fact that, as an unsupervised method, it can be trained on non-annotated data.
However, the downside of an unsupervised approach is that it does not make use of the large fraction of the training data for which labels are available. The training data used (MS/MS spectra from GNPS) contains SMILES/InChI annotations and therefore allows molecular fingerprints to be created for quantifying structural similarities.
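The two kinds of score contrasted above can be illustrated with a minimal sketch. It is not the MS2DeepScore or Spec2Vec implementation: the function names `binned_cosine` and `tanimoto` are hypothetical, the fixed-width binning is only one simple variation of a cosine score, and the use of Tanimoto similarity on fingerprint bits is an assumption, since the text does not say which fingerprint-based measure is used.

```python
import math


def binned_cosine(spectrum_a, spectrum_b, bin_width=1.0):
    """Cosine similarity of two spectra after binning peaks by m/z.

    Each spectrum is a list of (mz, intensity) pairs. Peaks that fall
    into the same bin are summed. This is a simplified, illustrative
    variant of a peak-overlap score.
    """
    def to_bins(spectrum):
        bins = {}
        for mz, intensity in spectrum:
            key = int(mz // bin_width)
            bins[key] = bins.get(key, 0.0) + intensity
        return bins

    a, b = to_bins(spectrum_a), to_bins(spectrum_b)
    # Only bins present in both spectra contribute to the dot product.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def tanimoto(fingerprint_a, fingerprint_b):
    """Tanimoto (Jaccard) similarity of two fingerprints.

    Each fingerprint is given as a set of indices of the bits that are
    switched on (an assumed representation for this sketch).
    """
    union = fingerprint_a | fingerprint_b
    if not union:
        return 0.0
    return len(fingerprint_a & fingerprint_b) / len(union)


# Made-up example data: two spectra sharing one peak bin, and two
# fingerprints sharing two of four distinct bits.
spectrum_1 = [(100.1, 1.0), (150.2, 0.5)]
spectrum_2 = [(100.3, 1.0), (200.0, 0.5)]
print(binned_cosine(spectrum_1, spectrum_2))
print(tanimoto({1, 5, 9}, {1, 5, 12}))
```

The first score depends only on the peaks themselves, while the second requires structure annotations (here already converted to fingerprint bits), which is the label information an unsupervised method leaves unused.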

Programming languages
  • Jupyter Notebook 99%
  • Python 1%

Participating organisations

Life Sciences
Hochschule Düsseldorf University of Applied Sciences
Netherlands eScience Center
Wageningen University & Research

Reference papers



Florian Huber
Sven van der Burg

Related projects

Integrated omics analysis for small molecule-mediated host-microbiome interactions

Advancing our understanding of molecular mechanisms of health and disease

Updated 27 months ago