Deep spectral component filtering as a foundation model for spectral analysis demonstrated in metabolic profiling

Holmes, E., Wilson, I. D. & Nicholson, J. K. Metabolic phenotyping in health and disease. Cell 134, 714–717 (2008).
Nicholson, J. K. Global systems biology, personalized medicine and molecular epidemiology. Mol. Syst. Biol. 2, 52 (2006).
Alseekh, S. et al. Mass spectrometry-based metabolomics: a guide for annotation, quantification and best reporting practices. Nat. Methods 18, 747–756 (2021).
Julkunen, H. et al. Atlas of plasma NMR biomarkers for health and disease in 118,461 individuals from the UK Biobank. Nat. Commun. 14, 604 (2023).
Hu, J. et al. RSPSSL: a novel high-fidelity Raman spectral preprocessing scheme to enhance biomedical applications and chemical resolution visualization. Light Sci. Appl. 13, 52 (2024).
He, H. et al. Noise learning of instruments for high-contrast, high-resolution and fast hyperspectral microscopy and nanoscopy. Nat. Commun. 15, 754 (2024).
Guo, S., Popp, J. & Bocklitz, T. Chemometric analysis in Raman spectroscopy from experimental design to machine learning–based modeling. Nat. Protoc. 16, 5426–5459 (2021).
Felten, J. et al. Vibrational spectroscopic image analysis of biological material using multivariate curve resolution–alternating least squares (MCR-ALS). Nat. Protoc. 10, 217–240 (2015).
Bi, X. et al. Molecule-resolvable SERSome for metabolic profiling. Chem https://doi.org/10.1016/j.chempr.2025.102528 (2025).
Su, H. et al. Surface-enhanced Raman spectroscopy study on the structure changes of 4-mercaptophenylboronic acid under different pH conditions. Spectrochim. Acta A Mol. Biomol. Spectrosc. 185, 336–342 (2017).
Giese, B. & McNaughton, D. Surface-enhanced Raman spectroscopic study of uracil. The influence of the surface substrate, surface potential, and pH. J. Phys. Chem. B 106, 1461–1470 (2002).
Zarei, M. et al. Machine learning analysis of Raman spectra to quantify the organic constituents in complex organic-mineral mixtures. Anal. Chem. 95, 15908–15916 (2023).
Koyun, O. C. et al. RamanFormer: a transformer-based quantification approach for Raman mixture components. ACS Omega 9, 23241–23251 (2024).
Brown, T. et al. Language models are few-shot learners. In Proc. Advances in Neural Information Processing Systems Vol. 33, 1877–1901 (NeurIPS, 2020).
Bi, X., Czajkowsky, D. M., Shao, Z. & Ye, J. Digital colloid-enhanced Raman spectroscopy by single-molecule counting. Nature 628, 771–775 (2024).
Luo, Y. et al. Component identification for the SERS spectra of microplastics mixture with convolutional neural network. Sci. Total Environ. 895, 165138 (2023).
Bi, X. et al. SERSomes for metabolic phenotyping and prostate cancer diagnosis. Cell Rep. Med. https://doi.org/10.1016/J.XCRM.2024.101579 (2024).
Ye, J. et al. Hypoxanthine is a metabolic biomarker for inducing GSDME-dependent pyroptosis of endothelial cells during ischemic stroke. Theranostics 14, 6071–6087 (2024).
Xue, B. Source code for deep spectral component filtering: DSCF_V1 (v1.2). Zenodo https://doi.org/10.5281/zenodo.15013288 (2025).
Isensee, F., Jaeger, P. F., Kohl, S. A. A., Petersen, J. & Maier-Hein, K. H. nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat. Methods 18, 203–211 (2021).
Liu, Z. et al. Swin transformer: hierarchical vision transformer using shifted windows. In Proc. IEEE/CVF International Conference on Computer Vision 9992–10002 (IEEE, 2021).
Hughes, C., Gaunt, L., Brown, M., Clarke, N. W. & Gardner, P. Assessment of paraffin removal from prostate FFPE sections using transmission mode FTIR-FPA imaging. Anal. Methods 6, 1028–1035 (2014).
Meuse, C. W. & Barker, P. E. Quantitative infrared spectroscopy of formalin-fixed, paraffin-embedded tissue specimens: paraffin wax removal with organic solvents. Appl. Immunohistochem. Mol. Morphol. 17, 547–552 (2009).
Nallala, J., Lloyd, G. R. & Stone, N. Evaluation of different tissue de-paraffinization procedures for infrared spectral imaging. Analyst 140, 2369–2375 (2015).
De Lima, F. A. et al. Digital de-waxing on FTIR images. Analyst 142, 1358–1370 (2017).
Bai, B. et al. Deep learning-enabled virtual histological staining of biological samples. Light Sci. Appl. 12, 57 (2023).
Lotfollahi, M., Daeinejad, D., Berisha, S. & Mayerich, D. Digital staining of high-resolution FTIR spectroscopic images. In Proc. 2018 IEEE Global Conference on Signal and Information Processing, GlobalSIP 973–977 (IEEE, 2018).
Gobinet, C. et al. Automatic identification of paraffin pixels on FTIR images acquired on FFPE human samples. Anal. Chem. 93, 3750–3761 (2021).
Bi, X., Fang, Z., Deng, B., Zhou, L. & Ye, J. Ultrahigh Raman-fluorescence dual-enhancement in nanogaps of silver-coated gold nanopetals. Adv. Opt. Mater. 11, 2300188 (2023).
Wang, H. P. et al. Recent advances of chemometric calibration methods in modern spectroscopy: algorithms, strategy, and related issues. Trends Anal. Chem. 153, 116648 (2022).
Zou, Z. et al. A deep learning model for predicting selected organic molecular spectra. Nat. Comput. Sci. 3, 957–964 (2023).
Savitzky, A. & Golay, M. J. E. Smoothing and differentiation of data by simplified least squares procedures. Anal. Chem. 36, 1627–1639 (1964).
You, L. et al. An exposome atlas of serum reveals the risk of chronic diseases in the Chinese population. Nat. Commun. 15, 2268 (2024).
Oh, H. S. H. et al. Organ aging signatures in the plasma proteome track health and disease. Nature 624, 164–172 (2023).
Enroth, C. et al. Crystal structures of bovine milk xanthine dehydrogenase and xanthine oxidase: structure-based mechanism of conversion. Proc. Natl Acad. Sci. USA 97, 10723–10728 (2000).
Schwedhelm, E. et al. Trimethyllysine, vascular risk factors and outcome in acute ischemic stroke (MARK–STROKE). Amino Acids 53, 555–561 (2021).
Farthing, D. E., Farthing, C. A. & Xi, L. Inosine and hypoxanthine as novel biomarkers for cardiac ischemia: from bench to point-of-care. Exp. Biol. Med. 240, 821–831 (2015).
Dudka, I. et al. Comprehensive metabolomics analysis of prostate cancer tissue in relation to tumor aggressiveness and TMPRSS2-ERG fusion status. BMC Cancer 20, 437 (2020).
Chen, M. M. & Meng, L. H. The double faced role of xanthine oxidoreductase in cancer. Acta Pharmacol. Sin. 43, 1623–1632 (2022).
Sreekumar, A. et al. Metabolomic profiles delineate potential role for sarcosine in prostate cancer progression. Nature 457, 910–914 (2009).
Clark, D. E. et al. The serine/threonine protein kinase, p90 ribosomal S6 kinase, is an important regulator of prostate cancer cell proliferation. Cancer Res. 65, 3108–3116 (2005).
Maitre, M., Klein, C., Patte-Mensah, C. & Mensah-Nyagan, A. G. Tryptophan metabolites modify brain Aβ peptide degradation: a role in Alzheimer’s disease? Prog. Neurobiol. 190, 101800 (2020).
Horgan, C. C. et al. High-throughput molecular imaging via deep-learning-enabled Raman spectroscopy. Anal. Chem. 93, 15850–15860 (2021).
Ribeiro, M. T., Singh, S. & Guestrin, C. ‘Why should I trust you?’: explaining the predictions of any classifier. In Proc. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (ed. Balaji, K.) 1135–1144 (ACM, 2016).
Radford, A. et al. Language models are unsupervised multitask learners. OpenAI 1, 9 (2019).
Liu, B., Yuan, Y., Pan, X., Shen, H.-B. & Jin, C. AttSiOff: a self-attention-based approach on siRNA design with inhibition and off-target effect prediction. Med-X https://doi.org/10.1007/s44258-024-00019-1 (2024).
Leopold, N. & Lendl, B. A new method for fast preparation of highly surface-enhanced Raman scattering (SERS) active silver colloids at room temperature by reduction of silver nitrate with hydroxylamine hydrochloride. J. Phys. Chem. B 107, 5723–5727 (2003).
Lee, P. C. & Meisel, D. Adsorption and surface-enhanced Raman of dyes on silver and gold sols. J. Phys. Chem. 86, 3391–3395 (1982).
Yuan, H. et al. Gold nanostars: surfactant-free synthesis, 3D modelling, and two-photon photoluminescence imaging. Nanotechnology 23, 075102 (2012).
Kaur, V., Tanwar, S., Kaur, G. & Sen, T. DNA-origami-based assembly of Au@Ag nanostar dimer nanoantennas for label-free sensing of pyocyanin. ChemPhysChem 22, 160–167 (2021).
He, K. et al. Masked autoencoders are scalable vision learners. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 15979–15988 (IEEE, 2022).
Chau, S. L., Hu, R., London, A., Gonzalez, J. & Sejdinovic, D. RKHS-SHAP: Shapley values for kernel methods. Adv. Neural Inf. Process Syst. 35, 13050–13063 (2022).
Shrikumar, A., Greenside, P. & Kundaje, A. Learning important features through propagating activation differences. In Proc. 34th International Conference on Machine Learning Vol. 70 (eds Precup, D. & Teh, Y. W.) 3145–3153 (PMLR, 2017).
Selvaraju, R. R. et al. Grad-CAM: visual explanations from deep networks via gradient-based localization. In Proc. IEEE/CVF International Conference on Computer Vision 618–626 (IEEE, 2017).
Zeiler, M. D. & Fergus, R. Visualizing and understanding convolutional networks. In Proc. European Conference on Computer Vision 818–833 (Springer, 2014).
Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. In Proc. Advances in Neural Information Processing Systems Vol. 30 (eds Guyon, I. et al.) 4765–4774 (Curran Associates, 2017).
Sundararajan, M., Taly, A. & Yan, Q. Axiomatic attribution for deep networks. In Proc. 34th International Conference on Machine Learning (JMLR, 2017).
Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).
Zhou, B., Khosla, A., Lapedriza, A., Oliva, A. & Torralba, A. Learning deep features for discriminative localization. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 2921–2929 (IEEE, 2016).
Xue, B. FTIR of liver tissue. figshare https://doi.org/10.6084/m9.figshare.28107236.v1 (2024).
Xue, B. ComFilE for PCa. figshare https://doi.org/10.6084/m9.figshare.28107395.v1 (2024).
Xue, B. ComFilE for stroke. figshare https://doi.org/10.6084/m9.figshare.28107431.v1 (2024).
Xue, B. ComFilE for AD. figshare https://doi.org/10.6084/m9.figshare.28107578.v1 (2024).
Xue, B. SERS quantification. figshare https://doi.org/10.6084/m9.figshare.28107281.v1 (2024).
Xue, B. SERS nanoparticle background removal (PCa). figshare https://doi.org/10.6084/m9.figshare.28107326.v1 (2024).
Zou, Z. QM9S dataset. figshare https://doi.org/10.6084/m9.figshare.24235333.v3 (2023).
2025-05-23 00:00:00