Feed-Forward Neural Network for Predicting Enantioselectivity of the Asymmetric Negishi Reaction

Abbigayle E. Cuomo, Sebastian Ibarraran, Sanil Sreekumar

et al.

ACS Central Science, Journal Year: 2023, Volume and Issue: 9(9), P. 1768 - 1774

Published: Aug. 24, 2023

Density functional theory (DFT) is a powerful tool to model transition state (TS) energies to predict selectivity in chemical synthesis. However, a successful multistep synthesis campaign must navigate energetically narrow differences in pathways that create some limits to the rapid and unambiguous application of DFT to these problems. While data science techniques may provide a complementary approach to overcome this problem, doing so with the relatively small data sets that are widespread in organic synthesis presents a significant challenge. Herein, we show that a small data set can be labeled with features from DFT TS calculations to train a feed-forward neural network for predicting enantioselectivity of a Negishi cross-coupling reaction with P-chiral hindered phosphines. This approach to modeling enantioselectivity is compared with conventional approaches, including exclusive use of DFT energies and data science approaches using features from ligands or ground states with neural network architectures.

Language: English

Transition-Metal-Catalyzed Silylation and Borylation of C–H Bonds for the Synthesis and Functionalization of Complex Molecules
Isaac Furay Yu, Jake W. Wilson, John F. Hartwig

et al.

Chemical Reviews, Journal Year: 2023, Volume and Issue: 123(19), P. 11619 - 11663

Published: Sept. 26, 2023

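The functionalization of C–H bonds in organic molecules containing functional groups has been one of the holy grails of catalysis. One synthetically important approach to diverse functionalization of C–H bonds is the catalytic silylation or borylation of C–H bonds, which enables a broad array of downstream transformations to afford diverse structures. Advances in both undirected and directed methods for the transition-metal-catalyzed silylation and borylation of C–H bonds have led to their rapid adoption in the early-, mid-, and late-stage synthesis of complex molecules. In this Review, we review the application of the transition-metal-catalyzed silylation and borylation of C–H bonds to the synthesis of bioactive molecules, organic materials, and ligands. Overall, we aim to provide a picture of the state of the art of the silylation and borylation of C–H bonds as applied to the synthesis and modification of diverse architectures that will spur further development and application of these reactions.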

Language: English

Citations

69

Designing Target-specific Data Sets for Regioselectivity Predictions on Complex Substrates
Jules Schleinitz, Alba Carretero‐Cerdán, Anjali Gurajapu

et al.

Journal of the American Chemical Society, Journal Year: 2025, Volume and Issue: 147(9), P. 7476 - 7484

Published: Feb. 21, 2025

The development of machine learning models to predict the regioselectivity of C(sp3)–H functionalization reactions is reported. A data set for dioxirane oxidations was curated from the literature and used to generate a model to predict the regioselectivity of C–H oxidation. To assess whether smaller, intentionally designed data sets could provide accuracy on complex targets, a series of acquisition functions were developed to select the most informative molecules for the specific target. Active learning-based acquisition functions that leverage predicted reactivity and model uncertainty were found to outperform those based on molecular and site similarity alone. The use of acquisition functions for data set elaboration significantly reduced the number of data points needed to perform accurate prediction, and it was found that smaller, machine-designed data sets can give accurate predictions when larger, randomly selected data sets fail. Finally, the workflow was experimentally validated on five complex substrates and shown to be applicable to predicting the regioselectivity of arene C–H radical borylation. These studies provide a quantitative alternative to the intuitive extrapolation from "model substrates" that is frequently used to estimate reactivity on complex molecules.

Language: English

Citations

2

Enabling late-stage drug diversification by high-throughput experimentation with geometric deep learning
David F. Nippa, Kenneth Atz, Remo Hohler

et al.

Nature Chemistry, Journal Year: 2023, Volume and Issue: 16(2), P. 239 - 248

Published: Nov. 23, 2023

Late-stage functionalization is an economical approach to optimize the properties of drug candidates. However, the chemical complexity of drug molecules often makes late-stage diversification challenging. To address this problem, a late-stage functionalization platform based on geometric deep learning and high-throughput reaction screening was developed. Considering borylation as a critical step in late-stage functionalization, the computational model predicted reaction yields for diverse reaction conditions with a mean absolute error margin of 4–5%, while the reactivity of novel reactions with known and unknown substrates was classified with a balanced accuracy of 92% and 67%, respectively. The regioselectivity of the major products was accurately captured with a classifier F-score of 67%. When applied to 23 diverse commercial drug molecules, the platform successfully identified numerous opportunities for structural diversification. The influence of steric and electronic information on model performance was quantified, and a comprehensive, simple, user-friendly reaction format was introduced that proved to be a key enabler for seamlessly integrating deep learning and high-throughput experimentation for late-stage functionalization.

Language: English

Citations

40

Dataset Design for Building Models of Chemical Reactivity
Priyanka Raghavan, Brittany C. Haas, Madeline E. Ruos

et al.

ACS Central Science, Journal Year: 2023, Volume and Issue: 9(12), P. 2196 - 2204

Published: Dec. 8, 2023

Models can codify our understanding of chemical reactivity and serve a useful purpose in the development of new synthetic processes via, for example, evaluating hypothetical reaction conditions or in silico substrate tolerance. Perhaps the most determining factor is the composition of the training data and whether it is sufficient to train a model that can make accurate predictions over the full domain of interest. Here, we discuss the design of reaction datasets in ways that are conducive to data-driven modeling, emphasizing the idea that training set diversity and model generalizability rely on the choice of molecular or reaction representation. We additionally discuss the experimental constraints associated with generating common types of chemistry datasets and how these considerations should influence dataset design and model building.

Language: English

Citations

36

Kinetic and thermodynamic control of C(sp2)–H activation enables site-selective borylation
Jose B. Roque, Alex M. Shimozono, Tyler P. Pabst

et al.

Science, Journal Year: 2023, Volume and Issue: 382(6675), P. 1165 - 1170

Published: Dec. 7, 2023

Catalysts that distinguish between electronically distinct carbon-hydrogen (C–H) bonds without relying on steric effects or directing groups are challenging to design. In this work, cobalt precatalysts supported by N-alkyl-imidazole–substituted pyridine dicarbene (ACNC) pincer ligands are described that enable undirected, remote borylation of fluoroaromatics and expansion of scope to include electron-rich arenes, pyridines, and tri- and difluoromethoxylated arenes, thereby addressing one of the major limitations of first-row transition metal C–H functionalization catalysts. Mechanistic studies established a kinetic preference for C–H bond activation at the meta-position despite cobalt-aryl complexes resulting from ortho C–H activation being thermodynamically preferred. Switchable site selectivity in C–H borylation as a function of the boron reagent was preliminarily demonstrated using a single precatalyst.

Language: English

Citations

23

AI for organic and polymer synthesis
Hong Xin, Qi Yang, Kuangbiao Liao

et al.

Science China Chemistry, Journal Year: 2024, Volume and Issue: 67(8), P. 2461 - 2496

Published: June 26, 2024

Language: English

Citations

11

Deconvolution and Analysis of the 1H NMR Spectra of Crude Reaction Mixtures
Maxwell C. Venetos, Masha Elkin, Connor P. Delaney

et al.

Journal of Chemical Information and Modeling, Journal Year: 2024, Volume and Issue: 64(8), P. 3008 - 3020

Published: April 4, 2024

Nuclear magnetic resonance (NMR) spectroscopy is an important analytical technique in synthetic organic chemistry, but its integration into high-throughput experimentation workflows has been limited by the necessity of manually analyzing the NMR spectra of new chemical entities. Current efforts to automate the analysis of NMR spectra rely on comparisons to databases of reported spectra for known compounds and, therefore, are incompatible with the exploration of new chemical space. By reframing the NMR spectrum of a reaction mixture as a joint probability distribution, we have used Hamiltonian Monte Carlo Markov Chain and density functional theory to fit the predicted NMR spectra to those of crude reaction mixtures. This approach enables the deconvolution and analysis of the spectra of mixtures of compounds without relying on reported spectra. The utility of our approach to analyze crude reaction mixtures is demonstrated with the experimental spectra of reactions that generate a mixture of isomers, such as Wittig olefination and C–H functionalization reactions. The correct identification of compounds in a reaction mixture and their relative concentrations is achieved with a mean absolute error as low as 1%.

Language: English

Citations

5

Leveraging Language Model Multitasking To Predict C–H Borylation Selectivity
Ruslan Kotlyarov, Konstantinos Papachristos, Geoffrey P. F. Wood

et al.

Journal of Chemical Information and Modeling, Journal Year: 2024, Volume and Issue: 64(10), P. 4286 - 4297

Published: May 6, 2024

C–H borylation is a high-value transformation in the synthesis of lead candidates for the pharmaceutical industry because a wide array of downstream coupling reactions is available. However, predicting its regioselectivity, especially in drug-like molecules that may contain multiple heterocycles, is not a trivial task. Using a data set of borylation reactions from Reaxys, we explored how a language model originally trained on USPTO_500_MT, a broad-scope set of patent data, can be used to predict the C–H borylation reaction product in different modes: product generation and site reactivity classification. Our fine-tuned T5Chem multitask language model can generate the correct product in 79% of cases. It can also classify the reactive aromatic C–H bonds with 95% accuracy and 88% positive predictive value, exceeding purpose-developed graph-based neural networks.

Language: English

Citations

4

Incorporating Synthetic Accessibility in Drug Design: Predicting Reaction Yields of Suzuki Cross-Couplings by Leveraging AbbVie’s 15-Year Parallel Library Data Set
Priyanka Raghavan, Alexander J. Rago, Pritha Verma

et al.

Journal of the American Chemical Society, Journal Year: 2024, Volume and Issue: 146(22), P. 15070 - 15084

Published: May 20, 2024

Despite the increased use of computational tools to supplement medicinal chemists' expertise and intuition in drug design, predicting synthetic yields in medicinal chemistry endeavors remains an unsolved challenge. Existing design workflows could profoundly benefit from reaction yield prediction, as precious material waste could be reduced, and a greater number of relevant compounds could be delivered to advance the design, make, test, analyze (DMTA) cycle. In this work, we detail the evaluation of AbbVie's medicinal chemistry library data set to build machine learning models for the prediction of Suzuki coupling reaction yields. The combination of density functional theory (DFT)-derived features and Morgan fingerprints was identified to perform better than one-hot encoded baseline modeling, furnishing encouraging results. Overall, we observe modest generalization to unseen reactant structures within the 15-year retrospective library data set. Additionally, we compare predictions made by the model to those made by expert medicinal chemists, finding that the model can often predict both reaction success and reaction yields with greater accuracy. Finally, we demonstrate the application of this approach to suggest structurally and electronically similar building blocks to replace those predicted or observed to be unsuccessful prior to or after synthesis, respectively. The yield prediction model was used to select similar monomers predicted to have higher yields, resulting in greater synthesis efficiency of relevant drug-like molecules.

Language: English

Citations

4

Automated approaches, reaction parameterisation, and data science in organometallic chemistry and catalysis: towards improving synthetic chemistry and accelerating mechanistic understanding
Stuart C. Smith, Christopher S. Horbaczewskyj, Theo F. N. Tanner

et al.

Digital Discovery, Journal Year: 2024, Volume and Issue: 3(8), P. 1467 - 1495

Published: Jan. 1, 2024

This review discusses the use of automation for organometallic reactions to generate rich datasets and, with statistical analysis and reaction component parameterisation, how organometallic reaction mechanisms can be probed to gain understanding.

Language: English

Citations

4