Large language models to accelerate organic chemistry synthesis

  • Mendoza, A., Ishihara, Y. & Baran, P. S. Scalable enantioselective total synthesis of taxanes. Nat. Chem. 4, 21–25 (2012).

  • Elvira, K. S., i Solvas, X. C., Wootton, R. C. & deMello, A. J. The past, present and potential for microfluidic reactor technology in chemical synthesis. Nat. Chem. 5, 905–915 (2013).

  • Ball, P. Chemistry: why synthesize? Nature 528, 327–329 (2015).

  • Newman-Stonebraker, S. H. et al. Univariate classification of phosphine ligation state and reactivity in cross-coupling catalysis. Science 374, 301–308 (2021).

  • Mikulak-Klucznik, B. et al. Computational planning of the synthesis of complex natural products. Nature 588, 83–88 (2020).

  • Jablonka, K. M., Schwaller, P., Ortega-Guerrero, A. & Smit, B. Leveraging large language models for predictive chemistry. Nat. Mach. Intell. 6, 161–169 (2024).

  • Shen, Y. et al. Automation and computer-assisted planning for chemical synthesis. Nat. Rev. Methods Primers 1, 23 (2021).

  • Tao, H. et al. Nanoparticle synthesis assisted by machine learning. Nat. Rev. Mater. 6, 701–716 (2021).

  • Merchant, A. et al. Scaling deep learning for materials discovery. Nature 624, 80–85 (2023).

  • Burger, B. et al. A mobile robotic chemist. Nature 583, 237–241 (2020).

  • Angello, N. H. et al. Closed-loop optimization of general reaction conditions for heteroaryl Suzuki–Miyaura coupling. Science 378, 399–405 (2022).

  • Betinol, I. O., Lai, J., Thakur, S. & Reid, J. P. A data-driven workflow for assigning and predicting generality in asymmetric catalysis. J. Am. Chem. Soc. 145, 12870–12883 (2023).

  • Rinehart, N. I. et al. A machine-learning tool to predict substrate-adaptive conditions for Pd-catalyzed C–N couplings. Science 381, 965–972 (2023).

  • Granda, J. M., Donina, L., Dragone, V., Long, D.-L. & Cronin, L. Controlling an organic synthesis robot with machine learning to search for new reactivity. Nature 559, 377–381 (2018).

  • Mehr, S. H. M., Craven, M., Leonov, A. I., Keenan, G. & Cronin, L. A universal system for digitization and automatic execution of the chemical synthesis literature. Science 370, 101–108 (2020).

  • Rohrbach, S. et al. Digitization and validation of a chemical synthesis literature database in the ChemPU. Science 377, 172–180 (2022).

  • Sanchez-Lengeling, B. & Aspuru-Guzik, A. Inverse molecular design using machine learning: generative models for matter engineering. Science 361, 360–365 (2018).

  • Wang, H. et al. Scientific discovery in the age of artificial intelligence. Nature 620, 47–60 (2023).

  • Toniato, A., Schwaller, P., Cardinale, A., Geluykens, J. & Laino, T. Unassisted noise reduction of chemical reaction datasets. Nat. Mach. Intell. 3, 485–494 (2021).

  • Achiam, J. et al. GPT-4 technical report. Preprint at https://doi.org/10.48550/arXiv.2303.08774 (2023).

  • Lehr, S. A., Caliskan, A., Liyanage, S. & Banaji, M. R. ChatGPT as research scientist: probing GPT’s capabilities as a research librarian. Proc. Natl Acad. Sci. USA 121, e2404328121 (2024).

  • Kang, Y. & Kim, J. ChatMOF: an artificial intelligence system for predicting and generating metal–organic frameworks using large language models. Nat. Commun. 15, 4705 (2024).

  • Dagdelen, J. et al. Structured information extraction from scientific text with large language models. Nat. Commun. 15, 1418 (2024).

  • Hou, W. & Ji, Z. Assessing GPT-4 for cell type annotation in single-cell RNA-seq analysis. Nat. Methods 21, 1462–1465 (2024).

  • Zheng, Z. et al. A GPT-4 reticular chemist for guiding MOF discovery. Angew. Chem. Int. Ed. 62, e202311983 (2023).

  • Boiko, D. A., MacKnight, R., Kline, B. & Gomes, G. Autonomous chemical research with large language models. Nature 624, 570–578 (2023).

  • Canty, R. B. & Abolhasani, M. Reproducibility in automated chemistry laboratories using computer science abstractions. Nat. Synth. 3, 1327–1339 (2024).

  • Ruan, Y. et al. An automatic end-to-end chemical synthesis development platform powered by large language models. Nat. Commun. 15, 10160 (2024).

  • Zheng, Z. et al. ChatGPT research group for optimizing the crystallinity of MOFs and COFs. ACS Cent. Sci. 9, 2161–2170 (2023).

  • Bran, A. M. et al. Augmenting large language models with chemistry tools. Nat. Mach. Intell. 6, 525–535 (2024).

  • Zheng, Z., Zhang, O., Borgs, C., Chayes, J. T. & Yaghi, O. M. ChatGPT chemistry assistant for text mining and the prediction of MOF synthesis. J. Am. Chem. Soc. 145, 18048–18062 (2023).

  • Antunes, L. M., Butler, K. T. & Grau-Crespo, R. Crystal structure generation with autoregressive large language modeling. Nat. Commun. 15, 10570 (2024).

  • Zheng, Z. et al. Integrating machine learning and large language models to advance exploration of electrochemical reactions. Angew. Chem. Int. Ed. 137, e202418074 (2024).

  • Ramos, M. C., Collison, C. J. & White, A. D. A review of large language models and autonomous agents in chemistry. Chem. Sci. 16, 2514–2572 (2025).

  • Segler, M. H., Preuss, M. & Waller, M. P. Planning chemical syntheses with deep neural networks and symbolic AI. Nature 555, 604–610 (2018).

  • Coley, C. W. et al. A robotic platform for flow synthesis of organic compounds informed by AI planning. Science 365, eaax1566 (2019).

  • Shields, B. J. et al. Bayesian reaction optimization as a tool for chemical synthesis. Nature 590, 89–96 (2021).

  • Tang, T. et al. Interrogating the mechanistic features of Ni(I)-mediated aryl iodide oxidative addition using electroanalytical and statistical modeling techniques. J. Am. Chem. Soc. 145, 8689–8699 (2023).

  • Wang, J. Y. et al. Identifying general reaction conditions by bandit optimization. Nature 626, 1025–1033 (2024).

  • Raghavan, P. et al. Dataset design for building models of chemical reactivity. ACS Cent. Sci. 9, 2196–2204 (2023).

  • Frey, N. C. et al. Neural scaling of deep chemical models. Nat. Mach. Intell. 5, 1297–1305 (2023).

  • Kearnes, S. M. et al. The Open Reaction Database. J. Am. Chem. Soc. 143, 18820–18826 (2021).

  • Coley, C. W., Rogers, L., Green, W. H. & Jensen, K. F. Computer-assisted retrosynthesis based on molecular similarity. ACS Cent. Sci. 3, 1237–1245 (2017).

  • Lowe, D. M. Extraction of Chemical Structures and Reactions from the Literature. PhD thesis, University of Cambridge (2012).

  • Tu, Z. & Coley, C. W. Permutation invariant graph-to-sequence model for template-free retrosynthesis and reaction prediction. J. Chem. Inf. Model. 62, 3503–3513 (2022).

  • Sacha, M. et al. Molecule edit graph attention network: modeling chemical reactions as sequences of graph edits. J. Chem. Inf. Model. 61, 3273–3284 (2021).

  • Seo, S.-W. et al. GTA: graph truncated attention for retrosynthesis. In Proc. AAAI Conference on Artificial Intelligence Vol. 35, 531–539 (AAAI Press, 2021).

  • Somnath, V. R., Bunne, C., Coley, C., Krause, A. & Barzilay, R. Learning graph models for retrosynthesis prediction. Adv. Neural Inf. Process. Syst. 34, 9405–9415 (2021).

  • Wang, X. et al. RetroPrime: a diverse, plausible and transformer-based method for single-step retrosynthesis predictions. Chem. Eng. J. 420, 129845 (2021).

  • Wan, Y., Hsieh, C.-Y., Liao, B. & Zhang, S. Retroformer: pushing the limits of end-to-end retrosynthesis transformer. In International Conference on Machine Learning 22475–22490 (PMLR, 2022).

  • Chen, S. & Jung, Y. Deep retrosynthetic reaction prediction using local reactivity and global attention. JACS Au 1, 1612–1620 (2021).

  • Yao, L. et al. Node-aligned graph-to-graph: elevating template-free deep learning approaches in single-step retrosynthesis. JACS Au 4, 992–1003 (2024).

  • Coley, C. W., Barzilay, R., Jaakkola, T. S., Green, W. H. & Jensen, K. F. Prediction of organic reaction outcomes using machine learning. ACS Cent. Sci. 3, 434–443 (2017).

  • Ahneman, D. T., Estrada, J. G., Lin, S., Dreher, S. D. & Doyle, A. G. Predicting reaction performance in C–N cross-coupling using machine learning. Science 360, 186–190 (2018).

  • Li, S.-W., Xu, L.-C., Zhang, C., Zhang, S.-Q. & Hong, X. Reaction performance prediction with an extrapolative and interpretable graph model based on chemical knowledge. Nat. Commun. 14, 3569 (2023).

  • Szymanski, N. J. et al. An autonomous laboratory for the accelerated synthesis of novel materials. Nature 624, 86–91 (2023).

  • Saebi, M. et al. On the use of real-world datasets for reaction yield prediction. Chem. Sci. 14, 4997–5005 (2023).

  • Li, D.-Z. & Gong, X.-Q. Challenges with literature-derived data in machine learning for yield prediction: a case study on Pd-catalyzed carbonylation reactions. J. Phys. Chem. A 128, 10423–10430 (2024).

  • Li, X., Zhang, S.-Q., Xu, L.-C. & Hong, X. Predicting regioselectivity in radical C–H functionalization of heterocycles through machine learning. Angew. Chem. Int. Ed. 59, 13253–13259 (2020).

  • Zahrt, A. F. et al. Prediction of higher-selectivity catalysts by computer-driven workflow and machine learning. Science 363, eaau5631 (2019).

  • Perera, D. et al. A platform for automated nanomole-scale reaction screening and micromole-scale synthesis in flow. Science 359, 429–434 (2018).

  • Guo, T. et al. What can large language models do in chemistry? A comprehensive benchmark on eight tasks. Adv. Neural Inf. Process. Syst. 36, 59662–59688 (2023).

  • Taylor, R. D., MacCoss, M. & Lawson, A. D. Rings in drugs: miniperspective. J. Med. Chem. 57, 5845–5859 (2014).

  • Ma, X. et al. A general approach to stereospecific cross-coupling reactions of nitrogen-containing stereocenters. Chem 6, 781–791 (2020).

  • Shu, X., Zhong, D., Lin, Y., Qin, X. & Huo, H. Modular access to chiral α-(hetero)aryl amines via Ni/photoredox-catalyzed enantioselective cross-coupling. J. Am. Chem. Soc. 144, 8797–8806 (2022).

  • Sarkar, S., Wagulde, S., Jia, X. & Gevorgyan, V. General and selective metal-free radical α-C–H borylation of aliphatic amines. Chem 8, 3096–3108 (2022).

  • Zhang, Y. et al. Large language models to accelerate organic chemistry synthesis. Zenodo https://doi.org/10.5281/zenodo.15295848 (2025).

  • Ruiz-Castillo, P. & Buchwald, S. L. Applications of palladium-catalyzed C–N cross-coupling reactions. Chem. Rev. 116, 12564–12649 (2016).

    2025-07-01 00:00:00
