

Deep learning based similarity measure of mass spectrometry data.



What MS2DeepScore can do for you

  • Predict chemical similarity based on MS/MS mass spectra

A classical way to compare MS/MS mass spectra is to quantify their peak overlap, often using variations of cosine similarity scores. Those measures tend to work well for nearly identical spectra, i.e. cases of very high peak overlap. We recently introduced Spec2Vec, an unsupervised machine learning approach for computing spectrum similarities based on learned relationships between peaks across large training datasets [ref]. Spec2Vec-based similarity scores were observed to correlate more strongly with the actual structural similarities between the underlying compounds than classical cosine-like scores do. Additional core advantages are its fast computation, which makes it possible to compare query spectra against very large libraries, and the fact that, as an unsupervised method, it can be trained on non-annotated data.
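To make the classical peak-overlap idea concrete, here is a minimal sketch of a cosine-style score between two spectra. It is not the implementation used by any particular library: the function name `cosine_greedy_sketch`, the representation of a spectrum as a list of `(mz, intensity)` tuples, and the `tolerance` value are all assumptions made for illustration.

```python
import math


def cosine_greedy_sketch(spec_a, spec_b, tolerance=0.1):
    """Cosine-like similarity between two spectra.

    Each spectrum is a list of (mz, intensity) tuples (hypothetical format).
    Peaks are paired when their m/z values differ by at most `tolerance`.
    """
    # Collect every candidate peak pair that lies within the m/z tolerance.
    candidates = []
    for i, (mz_a, int_a) in enumerate(spec_a):
        for j, (mz_b, int_b) in enumerate(spec_b):
            if abs(mz_a - mz_b) <= tolerance:
                candidates.append((int_a * int_b, i, j))

    # Greedy assignment: take the largest intensity products first and
    # use each peak at most once.
    candidates.sort(reverse=True)
    used_a, used_b = set(), set()
    dot = 0.0
    for product, i, j in candidates:
        if i not in used_a and j not in used_b:
            dot += product
            used_a.add(i)
            used_b.add(j)

    norm_a = math.sqrt(sum(intensity ** 2 for _, intensity in spec_a))
    norm_b = math.sqrt(sum(intensity ** 2 for _, intensity in spec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up example spectra, for illustration only.
spectrum_1 = [(100.0, 1.0), (150.0, 0.5), (200.0, 0.2)]
spectrum_2 = [(100.05, 1.0), (300.0, 1.0)]
score = cosine_greedy_sketch(spectrum_1, spectrum_2)
```

Only one peak pair matches in this example, so the score falls well below 1. This illustrates why overlap-based scores are most informative when the two spectra share most of their peaks.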
However, the downside of an unsupervised approach is that it does not make use of the labels that we have for a large fraction of the training data. The training data used (MS/MS spectra from GNPS) contains SMILES/InChI annotations and hence makes it possible to create molecular fingerprints for quantifying structural similarities.
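As an illustration of how fingerprints can quantify structural similarity, the sketch below computes a Tanimoto (Jaccard) coefficient between two fingerprints. The text above does not say which similarity measure or fingerprint type is used, so the choice of Tanimoto is an assumption here, as are the function name `tanimoto` and the representation of a fingerprint as a set of "on" bit indices. In practice the fingerprints would be derived from the SMILES/InChI annotations with a cheminformatics toolkit.

```python
def tanimoto(fingerprint_a, fingerprint_b):
    """Tanimoto (Jaccard) coefficient between two fingerprints.

    Each fingerprint is given as an iterable of "on" bit indices
    (a hypothetical, simplified representation).
    """
    bits_a, bits_b = set(fingerprint_a), set(fingerprint_b)
    union_size = len(bits_a | bits_b)
    if union_size == 0:
        # Two empty fingerprints: define the similarity as 0 by convention.
        return 0.0
    return len(bits_a & bits_b) / union_size


# Made-up fingerprints, for illustration only.
fingerprint_x = {1, 4, 7, 9}
fingerprint_y = {1, 4, 8}

# Shared bits: {1, 4}; all bits: {1, 4, 7, 8, 9}
similarity = tanimoto(fingerprint_x, fingerprint_y)  # 2 / 5 = 0.4
```

A score computed this way for each pair of annotated compounds could serve as a structural-similarity label for the corresponding pair of spectra.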

Programming language
  • Jupyter Notebook 100%

Participating organisations

Netherlands eScience Center
Wageningen University & Research



Florian Huber
Sven van der Burg

Related projects

Integrated omics analysis for small molecule-mediated host-microbiome interactions

Advancing our understanding of molecular mechanisms of health and disease

Updated 18 months ago