Science Cast

Confronting spurious evaluations of computational methods in small molecule mass spectrometry

librarian, May 7, 2026 5:56am


This paper is a preprint and has not been certified by peer review.


bioRxiv, May 6, 2026 12:00am

Authors

Gupta, V.; Skinnider, M. A.

Abstract

Mass spectrometry-based metabolomics detects thousands of small molecule-associated signals in biological samples, but the vast majority cannot be structurally identified. Mounting interest in this metabolomic 'dark matter' has spurred the development of dozens of machine-learning models for structural annotation of small molecules from their MS/MS spectra. Here, we expose a fundamental flaw in the longstanding paradigm by which these models have been evaluated. We show that a trivial machine-learning model can achieve strong performance on existing benchmarks despite wholly discarding the information contained within MS/MS spectra themselves, and without using any other auxiliary information. This performance arises because compounds with reference MS/MS spectra are structurally distinct from those found in generic chemical databases, and machine-learning models can exploit this dissimilarity by learning to predict whether a compound is likely to have been measured by MS/MS. However, we show that this confound can be overcome by using a generative model to sample decoy structures that are chemically indistinguishable from those found in reference MS/MS libraries. The resulting benchmark cannot be solved without attending to MS/MS spectra, and therefore provides an epistemologically valid framework to evaluate computational methods for the annotation of MS/MS spectra from small molecules.
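The confound described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' model or benchmark: the one-dimensional "structural descriptor", the Gaussian distributions, and all function names are hypothetical, and the data are synthetic. It assumes that compounds with reference MS/MS spectra differ systematically from generic database compounds in that descriptor, fits a score that predicts "was this compound measured by MS/MS?" from structure alone, and ranks candidates with it without ever looking at a spectrum. As a simple stand-in for decoys sampled by a generative model, matched decoys are drawn from the same distribution as the library compounds.

```python
import random
import statistics


def sample_descriptor(rng, library_like):
    """Draw one hypothetical 1-D structural descriptor for a compound.

    Assumption: library compounds and generic database compounds follow
    different distributions (means +1 and -1, unit variance).
    """
    return rng.gauss(1.0 if library_like else -1.0, 1.0)


def fit_library_likeness(lib_train, db_train):
    """Fit a linear 'library-likeness' score from structures alone."""
    direction = statistics.fmean(lib_train) - statistics.fmean(db_train)
    return lambda x: direction * x


def make_queries(rng, n_queries, n_decoys, decoys_library_like):
    """Build candidate sets: one true library compound plus decoys."""
    return [
        (
            sample_descriptor(rng, True),
            [sample_descriptor(rng, decoys_library_like) for _ in range(n_decoys)],
        )
        for _ in range(n_queries)
    ]


def top1_accuracy(score, queries):
    """Fraction of queries where the true compound outranks every decoy.

    No MS/MS spectrum enters the ranking at any point.
    """
    hits = sum(
        1
        for true_x, decoys in queries
        if score(true_x) > max(score(d) for d in decoys)
    )
    return hits / len(queries)


rng = random.Random(0)
lib_train = [sample_descriptor(rng, True) for _ in range(500)]
db_train = [sample_descriptor(rng, False) for _ in range(500)]
score = fit_library_likeness(lib_train, db_train)

# Decoys drawn from the generic-database distribution (confounded benchmark).
biased = make_queries(rng, 1000, 9, decoys_library_like=False)
# Decoys drawn from the library distribution (matched benchmark).
matched = make_queries(rng, 1000, 9, decoys_library_like=True)

acc_biased = top1_accuracy(score, biased)
acc_matched = top1_accuracy(score, matched)
print(f"spectrum-blind top-1, database decoys: {acc_biased:.2f}")
print(f"spectrum-blind top-1, matched decoys:  {acc_matched:.2f}")
```

Under these assumptions, the spectrum-blind score should rank the true compound first far more often than chance when decoys come from the generic distribution, and fall to roughly chance level (1 in 10 here) when decoys are matched to the library distribution. That is the qualitative behaviour the abstract attributes to existing benchmarks and to the proposed one, respectively.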
