Harnessing the power of single-cell large language models with parameter-efficient fine-tuning using scPEFT
Paik, D. T. et al. Single-cell RNA sequencing in cardiovascular development, disease and medicine. Nat. Rev. Cardiol. 17, 457–473 (2020).
Zhang, Y. & Zhang, Z. The history and advances in cancer immunotherapy: understanding the characteristics of tumor-infiltrating immune cells and their therapeutic implications. Cell. Mol. Immunol. 17, 807–821 (2020).
Li, X. et al. Deep learning enables accurate clustering with batch effect removal in single-cell RNA-seq analysis. Nat. Commun. 11, 2338 (2020).
Yang, F. et al. scBERT as a large-scale pretrained deep language model for cell type annotation of single-cell RNA-seq data. Nat. Mach. Intell. 4, 852–866 (2022).
Theodoris, C. V. et al. Transfer learning enables predictions in network biology. Nature 618, 616–624 (2023).
Cui, H. et al. scGPT: toward building a foundation model for single-cell multi-omics using generative AI. Nat. Methods 21, 1470–1480 (2024).
Hao, M. et al. Large-scale foundation model on single-cell transcriptomics. Nat. Methods 21, 1481–1491 (2024).
Heimberg, G. et al. A cell atlas foundation model for scalable search of similar human cells. Nature 638, 1085–1094 (2025).
Yang, X. et al. GeneCompass: deciphering universal gene regulatory mechanisms with a knowledge-informed cross-species foundation model. Cell Res. 34, 830–845 (2024).
Fischer, F. et al. scTab: scaling cross-tissue single-cell annotation models. Nat. Commun. 15, 6611 (2024).
Ma, S. et al. Harnessing the deep learning power of foundation models in single-cell omics. Nat. Rev. Mol. Cell Biol. 25, 593–594 (2024).
Kedzierska, K. Z. et al. Zero-shot evaluation reveals limitations of single-cell foundation models. Genome Biol. 26, 101 (2025).
Boiarsky, R. et al. Deeper evaluation of a single-cell foundation model. Nat. Mach. Intell. 6, 1443–1446 (2024).
Chen, H. et al. Quantized multi-task learning for context-specific representations of gene network dynamics. Preprint at bioRxiv https://doi.org/10.1101/2024.08.16.608180 (2024).
Becht, E. et al. Dimensionality reduction for visualizing single-cell data using UMAP. Nat. Biotechnol. 37, 38–44 (2019).
Aran, D. et al. Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat. Immunol. 20, 163–172 (2019).
Hao, Y. et al. Dictionary learning for integrative, multimodal and scalable single-cell analysis. Nat. Biotechnol. 42, 293–304 (2024).
Liu, B. et al. Temporal single-cell tracing reveals clonal revival and expansion of precursor exhausted T cells during anti-PD-1 therapy in lung cancer. Nat. Cancer 3, 108–112 (2022).
Zhang, H. et al. C/EBPδ drives interactions between human MAIT cells and endothelial cells that are important for extravasation. eLife 7, e32532 (2018).
Park, D. et al. Differences in the molecular signatures of mucosal-associated invariant T cells and conventional T cells. Sci. Rep. 9, 7094 (2019).
Teng, X. et al. SIGIRR deficiency contributes to CD4 T cell abnormalities by facilitating the IL1/C/EBPβ/TNF-α signaling axis in rheumatoid arthritis. Mol. Med. 28, 135 (2022).
Yang, R. et al. Distinct epigenetic features of tumor-reactive CD8+ T cells in colorectal cancer patients revealed by genome-wide DNA methylation analysis. Genome Biol. 21, 2 (2020).
Hodge, R. D. et al. Conserved cell types with divergent features in human versus mouse cortex. Nature 573, 61–68 (2019).
Yao, Z. et al. A taxonomy of transcriptomic cell types across the isocortex and hippocampal formation. Cell 184, 3222–3241 (2021).
Traag, V. A., Waltman, L. & van Eck, N. J. From Louvain to Leiden: guaranteeing well-connected communities. Sci. Rep. 9, 5233 (2019).
Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA 102, 15545–15550 (2005).
Lopez, R. et al. Deep generative modeling for single-cell transcriptomics. Nat. Methods 15, 1053–1058 (2018).
Hie, B. L. et al. Scanorama: integrating large and diverse single-cell transcriptomic datasets. Nat. Protoc. 19, 2283–2297 (2024).
Roohani, Y., Huang, K. & Leskovec, J. Predicting transcriptional outcomes of novel multigene perturbations with GEARS. Nat. Biotechnol. 42, 927–935 (2024).
Ahlmann-Eltze, C., Huber, W. & Anders, S. Deep-learning-based gene perturbation effect prediction does not yet outperform simple linear baselines. Nat. Methods 22, 1657–1661 (2025).
Hong, L. et al. Fast, sensitive detection of protein homologs using deep dense retrieval. Nat. Biotechnol. 43, 983–995 (2025).
Adduri, A. K. et al. Predicting cellular responses to perturbation across diverse contexts with State. Preprint at bioRxiv https://doi.org/10.1101/2025.06.26.661135 (2025).
Lin, Z. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130 (2023).
Kirillov, A. et al. Segment anything. In Proc. IEEE/CVF International Conference on Computer Vision 4015–4026 (IEEE, 2023).
Vaswani, A. et al. Attention is all you need. In Proc. Advances in Neural Information Processing Systems Vol. 30 (eds Guyon, I. et al.) (Curran Associates, 2017).
Kingma, D. P. & Ba, J. Adam: A method for stochastic optimization. Preprint at https://doi.org/10.48550/arXiv.1412.6980 (2017).
CZI Cell Science Program et al. CZ CELLxGENE Discover: a single-cell data platform for scalable exploration, analysis and modeling of aggregated data. Nucleic Acids Res. 53, D886–D900 (2024).
Franzén, O., Gan, L.-M. & Björkegren, J. L. M. PanglaoDB: a web server for exploration of mouse and human single-cell RNA sequencing data. Database 2019, baz046 (2019).
Li, X. L. & Liang, P. Prefix-tuning: optimizing continuous prompts for generation. Preprint at https://doi.org/10.48550/arXiv.2101.00190 (2021).
Hu, E. J. et al. LoRA: low-rank adaptation of large language models. In 10th International Conference on Learning Representations (ICLR, 2022).
O’Leary, N. A. et al. Exploring and retrieving sequence and metadata for species across the tree of life with NCBI Datasets. Sci. Data 11, 732 (2024).
Lee, W. COVID-19 dataset. figshare (2020).
He, F. Processed datasets used in the scPEFT study. figshare https://doi.org/10.6084/M9.FIGSHARE.30763886 (2025).
He, F. Codebase for scPEFT: Harnessing the power of single-cell large language models with parameter-efficient fine-tuning (version 1.0.0). Zenodo https://doi.org/10.5281/zenodo.17781912 (2025).
2025-12-31 00:00:00



