Deep learning-based prediction of the selection factors for quantifying selection in immune receptor repertoires

Rocha, B. & von Boehmer, H. Peripheral selection of the T cell repertoire. Science 251, 1225–1228 (1991).
Egerton, M., Scollay, R. & Shortman, K. Kinetics of mature T-cell development in the thymus. Proc. Natl Acad. Sci. USA 87, 2579–2582 (1990).
Janeway, C. et al. Immunobiology: The Immune System in Health and Disease Vol. 2 (Garland, 2001).
Roth, D. B. V(D)J recombination: mechanism, errors, and fidelity. In Mobile DNA III 311–324 (American Society for Microbiology, 2015).
Sethna, Z. et al. Population variability in the generation and selection of T-cell repertoires. PLOS Comput. Biol. 16, e1008394 (2020).
Isacchini, G., Walczak, A. M., Mora, T. & Nourmohammad, A. Deep generative selection models of T and B cell receptor repertoires with SONIA. Proc. Natl Acad. Sci. USA 118, e2024364118 (2021).
Pai, J. A. & Satpathy, A. T. High-throughput and single-cell T cell receptor sequencing technologies. Nat. Methods 18, 881–892 (2021).
Murugan, A., Mora, T., Walczak, A. M. & Callan Jr, C. G. Statistical inference of the generation probability of T-cell receptors from sequence repertoires. Proc. Natl Acad. Sci. USA 109, 16161–16166 (2012).
Marcou, Q., Mora, T. & Walczak, A. M. High-throughput immune repertoire analysis with IGoR. Nat. Commun. 9, 561 (2018).
Sethna, Z., Elhanati, Y., Callan Jr, C. G., Walczak, A. M. & Mora, T. OLGA: fast computation of generation probabilities of B- and T-cell receptor amino acid sequences and motifs. Bioinformatics 35, 2974–2981 (2019).
Elhanati, Y., Murugan, A., Callan Jr, C. G., Mora, T. & Walczak, A. M. Quantifying selection in immune receptor repertoires. Proc. Natl Acad. Sci. USA 111, 9875–9880 (2014).
Jiang, Y. & Li, S. C. Deep autoregressive generative models capture the intrinsics embedded in T-cell receptor repertoires. Brief. Bioinform. 24, bbad038 (2023).
Davidsen, K. et al. Deep generative models for T cell receptor protein sequences. eLife 8, e46935 (2019).
Zhou, Z.-H. A brief introduction to weakly supervised learning. Natl Sci. Rev. 5, 44–53 (2018).
Li, Y.-F., Guo, L.-Z. & Zhou, Z.-H. Towards safe weakly supervised learning. IEEE Trans. Pattern Anal. Mach. Intell. 43, 334–346 (2019).
Hou, X. et al. Analysis of the repertoire features of TCR beta chain CDR3 in human by high-throughput sequencing. Cell. Physiol. Biochem. 39, 651–667 (2016).
Das, J. & Jayaprakash, C. Systems Immunology: An Introduction to Modeling Methods for Scientists (CRC, 2018).
Rao, R. et al. Evaluating protein transfer learning with TAPE. Adv. Neural Inf. Process. Syst. 32, 9689–9701 (2019).
Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies Vol. 1 (eds Burstein, J. et al.) 4171–4186 (Association for Computational Linguistics, 2019).
Huang, H., Wang, C., Rubelt, F., Scriba, T. J. & Davis, M. M. Analyzing the Mycobacterium tuberculosis immune response by T-cell receptor clustering with GLIPH2 and genome-wide antigen screening. Nat. Biotechnol. 38, 1194–1202 (2020).
Dash, P. et al. Quantifiable predictive features define epitope-specific T cell receptor repertoires. Nature 547, 89–93 (2017).
Chen, S.-Y., Yue, T., Lei, Q. & Guo, A.-Y. TCRdb: a comprehensive database for T-cell receptor sequences with powerful search function. Nucleic Acids Res. 49, D468–D474 (2021).
Emerson, R. O. et al. Immunosequencing identifies signatures of cytomegalovirus exposure history and HLA-mediated effects on the T cell repertoire. Nat. Genet. 49, 659–665 (2017).
Mikolov, T., Chen, K., Corrado, G. & Dean, J. Efficient estimation of word representations in vector space. Preprint at https://arxiv.org/abs/1301.3781 (2013).
Elhanati, Y., Sethna, Z., Callan Jr, C. G., Mora, T. & Walczak, A. M. Predicting the spectrum of TCR repertoire sharing with a data-driven model of recombination. Immunol. Rev. 284, 167–179 (2018).
Ruiz Ortega, M., Spisak, N., Mora, T. & Walczak, A. M. Modeling and predicting the overlap of B- and T-cell receptor repertoires in healthy and SARS-CoV-2 infected individuals. PLoS Genet. 19, e1010652 (2023).
Nolan, S. et al. A large-scale database of T-cell receptor beta sequences and binding associations from natural and synthetic exposure to SARS-CoV-2. Front. Immunol. 16, 1488851 (2025).
Weyand, C. M. & Goronzy, J. Aging of the immune system. Mechanisms and therapeutic targets. Ann. Am. Thorac. Soc. 13, S422–S428 (2016).
Egorov, E. S. et al. The changing landscape of naive T cell receptor repertoire with human aging. Front. Immunol. 9, 1618 (2018).
Britanova, O. V. et al. Age-related decrease in TCR repertoire diversity measured with deep and normalized sequence profiling. J. Immunol. 192, 2689–2698 (2014).
Palmer, D. B. The effect of age on thymic function. Front. Immunol. 4, 316 (2013).
Krishna, C., Chowell, D., Gönen, M., Elhanati, Y. & Chan, T. A. Genetic and environmental determinants of human TCR repertoire diversity. Immun. Ageing 17, 26 (2020).
Wang, G. C., Dash, P., McCullers, J. A., Doherty, P. C. & Thomas, P. G. T cell receptor αβ diversity inversely correlates with pathogen-specific antibody levels in human cytomegalovirus infection. Sci. Transl. Med. 4, 128ra42 (2012).
Jergović, M., Contreras, N. A. & Nikolich-Žugich, J. Impact of CMV upon immune aging: facts and fiction. Med. Microbiol. Immunol. 208, 263–269 (2019).
Chu, N. D. et al. Longitudinal immunosequencing in healthy people reveals persistent T cell receptors rich in highly public receptors. BMC Immunol. 20, 19 (2019).
Britanova, O. V. et al. Dynamics of individual T cell repertoires: from cord blood to centenarians. J. Immunol. 196, 5005–5013 (2016).
Bensouda Koraichi, M., Ferri, S., Walczak, A. M. & Mora, T. Inferring the T cell repertoire dynamics of healthy individuals. Proc. Natl Acad. Sci. USA 120, e2207516120 (2023).
Pogorelyy, M. V. et al. Method for identification of condition-associated public antigen receptor sequences. eLife 7, e33050 (2018).
Widrich, M. et al. Modern Hopfield networks and attention for immune repertoire classification. Adv. Neural Inf. Process. Syst. 33, 18832–18845 (2020).
Akerman, O., Isakov, H., Levi, R., Psevkin, V. & Louzoun, Y. Counting is almost all you need. Front. Immunol. 13, 1031011 (2023).
Shugay, M. et al. VDJdb: a curated database of T-cell receptor sequences with known antigen specificity. Nucleic Acids Res. 46, D419–D427 (2018).
Joglekar, A. V. & Li, G. T cell antigen discovery. Nat. Methods 18, 873–880 (2021).
Hudson, D., Fernandes, R. A., Basham, M., Ogg, G. & Koohy, H. Can we predict T cell specificity with digital biology and machine learning? Nat. Rev. Immunol. 23, 511–521 (2023).
Gomez-Tourino, I., Kamra, Y., Baptista, R., Lorenc, A. & Peakman, M. T cell receptor β-chains display abnormal shortening and repertoire sharing in type 1 diabetes. Nat. Commun. 8, 1792 (2017).
Savola, P. et al. Somatic mutations in clonally expanded cytotoxic T lymphocytes in patients with newly diagnosed rheumatoid arthritis. Nat. Commun. 8, 15869 (2017).
Huth, A., Liang, X., Krebs, S., Blum, H. & Moosmann, A. Antigen-specific TCR signatures of cytomegalovirus infection. J. Immunol. 202, 979–990 (2019).
Faham, M. et al. Discovery of T cell receptor β motifs specific to HLA–B27–positive ankylosing spondylitis by deep repertoire sequence analysis. Arthritis Rheumatol. 69, 774–784 (2017).
Zhao, Y. et al. Preferential use of public TCR during autoimmune encephalomyelitis. J. Immunol. 196, 4905–4914 (2016).
Lu, C. et al. Clinical significance of T cell receptor repertoire in primary Sjogren’s syndrome. EBioMedicine 84, 104252 (2022).
Seder, R. A., Darrah, P. A. & Roederer, M. T-cell quality in memory and protection: implications for vaccine design. Nat. Rev. Immunol. 8, 247–258 (2008).
Feng, X. et al. A comprehensive benchmarking for evaluating TCR embeddings in modeling TCR-epitope interactions. Brief. Bioinform. 26, bbaf030 (2025).
Kobyzev, I., Prince, S. J. D. & Brubaker, M. A. Normalizing flows: an introduction and review of current methods. IEEE Trans. Pattern Anal. Mach. Intell. 43, 3964–3979 (2020).
Zhang, W., Gou, Y., Jiang, Y. & Zhang, Y. Adversarial VAE with normalizing flows for multi-dimensional classification. In Chinese Conference on Pattern Recognition and Computer Vision (PRCV) 205–219 (Springer, 2022).
Zhang, Y., Tino, P., Leonardis, A. & Tang, K. A survey on neural network interpretability. IEEE Trans. Emerg. Top. Comput. Intell. 5, 726–742 (2021).
Jiang, Y., Huo, M., Zhang, P., Zou, Y. & Li, S. TCR2vec: a deep representation learning framework of T-cell receptor sequence and function. Preprint at bioRxiv https://doi.org/10.1101/2023.03.31.535142 (2023).
Vaswani, A. et al. Attention is all you need. Adv. Neural Inf. Process. Syst. 30, 5998–6008 (2017).
Jiang, Y., Rensi, S., Wang, S. & Altman, R. B. DrugOrchestra: jointly predicting drug response, targets, and side effects via deep multi-task learning. Preprint at bioRxiv https://doi.org/10.1101/2020.11.17.385757 (2020).
Dai, Z. et al. BEV-Net: assessing social distancing compliance by joint people localization and geometric reasoning. In Proc. IEEE/CVF International Conference on Computer Vision 5401–5411 (IEEE, 2021).
Yamada, M., Suzuki, T., Kanamori, T., Hachiya, H. & Sugiyama, M. Relative density-ratio estimation for robust distribution comparison. Neural Comput. 25, 1324–1370 (2013).
Kingma, D. & Ba, J. Adam: a method for stochastic optimization. In Proc. International Conference on Learning Representations (ICLR, 2015).
Slabodkin, A. et al. Individualized VDJ recombination predisposes the available IG sequence space. Genome Res. 31, 2209–2224 (2021).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Virtanen, P. et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272 (2020).
Jiang, Y. Tcrsep. Zenodo https://doi.org/10.5281/zenodo.15691314 (2025).
2025-08-11 00:00:00