Large Language Models To Accelerate Organic Chemistry Synthesis
