Confronting spurious evaluations of computational methods in small molecule mass spectrometry
Gupta, V.; Skinnider, M. A.
Abstract
Mass spectrometry-based metabolomics detects thousands of small molecule-associated signals in biological samples, but the vast majority cannot be structurally identified. Mounting interest in this metabolomic 'dark matter' has spurred the development of dozens of machine-learning models for structural annotation of small molecules from their MS/MS spectra. Here, we expose a fundamental flaw in the longstanding paradigm by which these models have been evaluated. We show that a trivial machine-learning model can achieve strong performance on existing benchmarks despite wholly discarding the information contained within MS/MS spectra themselves, and without using any other auxiliary information. This performance arises because compounds with reference MS/MS spectra are structurally distinct from those found in generic chemical databases, and machine-learning models can exploit this dissimilarity by learning to predict whether a compound is likely to have been measured by MS/MS. However, we show that this confound can be overcome by using a generative model to sample decoy structures that are chemically indistinguishable from those found in reference MS/MS libraries. The resulting benchmark cannot be solved without attending to MS/MS spectra, and therefore provides an epistemologically valid framework to evaluate computational methods for the annotation of MS/MS spectra from small molecules.
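The confound described in the abstract can be illustrated with a small sketch of a spectrum-blind ranker. This is not the authors' model. It is a hypothetical stand-in that scores candidate structures by smoothed character-bigram log-odds learned from two toy SMILES lists, one standing in for a reference MS/MS library and one for a generic chemical database. The function names, the toy compounds, and the assumption that the two sets differ mainly in halogenated aromatic content are all illustrative choices, not taken from the paper.

```python
import math
from collections import Counter


def char_ngrams(smiles, n=2):
    """Split a SMILES string into overlapping character n-grams."""
    padded = "^" + smiles + "$"  # mark the start and end of the string
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def fit_log_odds(library, background, n=2, alpha=1.0):
    """Smoothed log-odds of each n-gram: library-like vs. database-like."""
    lib = Counter(g for s in library for g in char_ngrams(s, n))
    bg = Counter(g for s in background for g in char_ngrams(s, n))
    vocab = set(lib) | set(bg)
    lib_total = sum(lib.values()) + alpha * len(vocab)
    bg_total = sum(bg.values()) + alpha * len(vocab)
    return {
        g: math.log((lib[g] + alpha) / lib_total)
        - math.log((bg[g] + alpha) / bg_total)
        for g in vocab
    }


def spectrum_blind_score(smiles, weights, n=2):
    """Mean log-odds that a structure 'looks like' a library compound."""
    grams = char_ngrams(smiles, n)
    return sum(weights.get(g, 0.0) for g in grams) / len(grams)


def rank_candidates(spectrum, candidates, weights):
    """Rank candidates for a query spectrum while ignoring the spectrum."""
    del spectrum  # deliberately unused: this is the point of the sketch
    return sorted(
        candidates,
        key=lambda s: spectrum_blind_score(s, weights),
        reverse=True,
    )


# Toy, hypothetical training sets (assumption: not data from the paper).
LIBRARY_LIKE = ["OC(=O)CCC(=O)O", "NCC(=O)O", "CC(O)C(=O)O",
                "OCC(O)CO", "NC(CO)C(=O)O"]
DATABASE_LIKE = ["Clc1ccc(Br)cc1", "FC(F)(F)c1ccccc1Cl",
                 "Brc1ccncc1", "ClCCN(CCCl)c1ccccc1"]

weights = fit_log_odds(LIBRARY_LIKE, DATABASE_LIKE)

# One query: the true structure plus two database-style decoys.
TRUE_STRUCTURE = "OC(=O)CC(O)C(=O)O"
CANDIDATES = ["Clc1ccc(Cl)cc1", TRUE_STRUCTURE, "FC(F)c1ccc(Br)cc1"]

# Two unrelated, made-up peak lists of (m/z, intensity) pairs.
ranking_a = rank_candidates([(71.01, 100.0), (115.00, 40.0)], CANDIDATES, weights)
ranking_b = rank_candidates([(300.20, 5.0)], CANDIDATES, weights)
print(ranking_a[0], ranking_a == ranking_b)
```

Because `rank_candidates` never reads its `spectrum` argument, any benchmark accuracy it attains reflects only how separable the true structures are from the decoys. Under the decoy scheme the abstract proposes, where decoys are sampled to be chemically indistinguishable from library compounds, a scorer of this kind would be expected to lose that advantage.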