MS2DeepScore

Florian Huber

doi:10.5281/zenodo.4584625

Description

Predict chemical similarity based on MS/MS mass spectra

A classical way to compare MS/MS mass spectra is to quantify their peak overlap, often using variations of cosine similarity scores. Those measures tend to work well for nearly identical spectra, i.e. cases of very high peak overlap. We recently introduced Spec2Vec, an unsupervised machine learning approach for computing spectrum similarities based on learned relationships between peaks across large training datasets [ref]. Spec2Vec-based similarity scores were observed to correlate more strongly with the actual structural similarities between the underlying compounds than classical cosine-like scores do. Additional core advantages are its fast computation, which makes it possible to compare query spectra against very large libraries, and the fact that, as an unsupervised method, it can be trained on non-annotated data.
However, the downside of an unsupervised approach is that it does not make use of the labels that we have for a large fraction of the training data. The training data used (MS/MS spectra from GNPS) contains SMILES/InChI annotations, which make it possible to create molecular fingerprints for quantifying structural similarities.
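To make the notion of peak overlap concrete, the following is a minimal sketch of a cosine-like score on two peak lists. It is an illustration only, not the implementation used by any particular library: the function name `cosine_greedy`, the greedy matching strategy, and the `tolerance` value are assumptions chosen for this example.

```python
import math


def cosine_greedy(peaks_a, peaks_b, tolerance=0.1):
    """Cosine-like score computed from greedily matched peaks.

    peaks_a, peaks_b: lists of (mz, intensity) tuples.
    tolerance: maximum m/z difference for two peaks to count as a match
               (illustrative value, not taken from the source).
    """
    # Collect every candidate pair of peaks whose m/z values lie within tolerance.
    candidates = []
    for i, (mz_a, int_a) in enumerate(peaks_a):
        for j, (mz_b, int_b) in enumerate(peaks_b):
            if abs(mz_a - mz_b) <= tolerance:
                candidates.append((int_a * int_b, i, j))

    # Greedy assignment: take the largest intensity products first,
    # using each peak at most once.
    candidates.sort(reverse=True)
    used_a, used_b = set(), set()
    dot = 0.0
    for product, i, j in candidates:
        if i not in used_a and j not in used_b:
            dot += product
            used_a.add(i)
            used_b.add(j)

    # Normalise by the intensity norms of both spectra.
    norm_a = math.sqrt(sum(inten ** 2 for _, inten in peaks_a))
    norm_b = math.sqrt(sum(inten ** 2 for _, inten in peaks_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy spectra (made-up values for illustration only).
spectrum_1 = [(100.0, 1.0), (150.0, 0.5), (200.0, 0.2)]
spectrum_2 = [(100.02, 0.9), (150.01, 0.6), (250.0, 0.3)]
spectrum_3 = [(300.0, 1.0), (350.0, 0.4)]

print(cosine_greedy(spectrum_1, spectrum_2))  # high score: two strong peaks match
print(cosine_greedy(spectrum_1, spectrum_3))  # no shared peaks
```

A score of this kind depends entirely on matching peaks, so two spectra without shared peaks score zero regardless of how related the underlying compounds are.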
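As a rough sketch of how fingerprints can quantify structural similarity, the example below computes a Tanimoto (Jaccard) score between two fingerprints represented as sets of on-bit indices. The text above does not say which fingerprint type or similarity measure is used, so the Tanimoto score is an assumption here (it is a common choice), and the fingerprints are toy values rather than ones derived from real SMILES/InChI annotations.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two fingerprints.

    fp_a, fp_b: sets of integer indices of the bits that are switched on.
    """
    union = fp_a | fp_b
    if not union:
        # Two empty fingerprints: define the similarity as 0.0 to avoid dividing by zero.
        return 0.0
    return len(fp_a & fp_b) / len(union)


# Toy fingerprints (hypothetical bit indices, not computed from real molecules).
fp_1 = {1, 4, 7, 9, 12}
fp_2 = {1, 4, 8, 9, 15}

score = tanimoto(fp_1, fp_2)  # 3 shared bits out of 7 distinct bits
print(score)
```

Pairwise scores of this kind, computed between the annotated compounds, are the sort of label that an unsupervised approach leaves unused.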

Contributors

Contact person

Florian Huber

Hochschule Düsseldorf

0000-0002-3535-9406

Related projects

Integrated omics analysis for small molecule-mediated host-microbiome interactions

Reference papers

Journal articles: 1

Other: 1

Mentions

Book section: 3

Journal articles: 128

Presentations: 1

Other: 39
