Parametrization of κ2-N,O-Oxazoline Preligands for Enantioselective Cobaltaelectro-Catalyzed C–H Activations
Suman Dana, Neeraj Kumar Pandit, Philipp Boos, et al.

ACS Catalysis, Journal Year: 2025, Volume and Issue: unknown, P. 4450 - 4459

Published: Feb. 28, 2025

Enantioselective electrocatalyzed C–H activations have emerged as a transformative platform for the assembly of value-added chiral organic molecules. Despite recent progress, the construction of multiple C(sp3)-stereogenic centers via C(sp3)–C(sp3) bond formation has thus far proven elusive. In contrast, we herein report an annulative activation strategy, generating Fsp3-rich molecules with high levels of diastereo- and enantioselectivity. κ2-N,O-oxazoline preligands were effectively employed in enantioselective cobalt(III)-catalyzed reactions. Using DFT-derived descriptors and regression statistical modeling, we performed a parametrization study on the modularity of the preligands. The resulting model describing the ligands' selectivity is characterized by key steric, electronic, and interaction behaviors.

Language: English

A Brief Introduction to Chemical Reaction Optimization
Connor J. Taylor, Alexander Pomberger, Kobi Felton, et al.

Chemical Reviews, Journal Year: 2023, Volume and Issue: 123(6), P. 3089 - 3126

Published: Feb. 23, 2023

From the start of a synthetic chemist's training, experiments are conducted based on recipes from textbooks and manuscripts that achieve clean reaction outcomes, allowing the scientist to develop practical skills and some chemical intuition. This procedure is often kept long into a researcher's career, as new recipes are developed from similar protocols, with intuition-guided deviations made through learning from failed experiments. However, when attempting to understand systems of interest, it has been shown that model-based, algorithm-based, and miniaturized high-throughput techniques outperform human intuition in optimization, in a much more time- and material-efficient manner; this is covered in detail in this paper. As many chemists are not exposed to these techniques in undergraduate teaching, this leads to a disproportionate number of scientists who wish to optimize their reactions but are unable to use these methodologies or are simply unaware of their existence. This review highlights the basics and the cutting edge of modern optimization, as well as its relation to process scale-up, and can thereby serve as a reference for those inspired by each of these techniques, detailing several of their respective applications.

Language: English

Citations: 210

SELFIES and the future of molecular string representations
Mario Krenn, Qianxiang Ai, Senja Barthel, et al.

Patterns, Journal Year: 2022, Volume and Issue: 3(10), P. 100588 - 100588

Published: Oct. 1, 2022

Artificial intelligence (AI) and machine learning (ML) are expanding in popularity for broad applications to challenging tasks in chemistry and materials science. Examples include the prediction of properties, the discovery of new reaction pathways, or the design of new molecules. The machine needs to read and write fluently in a chemical language for each of these tasks. Strings are a common tool to represent molecular graphs, and the most popular string representation, SMILES, has powered cheminformatics since the late 1980s. However, in the context of AI and ML in chemistry, SMILES has several shortcomings; most pertinently, combinations of symbols can lead to invalid results with no valid interpretation. To overcome this issue, a new language for molecules was introduced in 2020 that guarantees 100% robustness: SELF-referencing embedded strings (SELFIES). SELFIES has since simplified and enabled numerous applications in chemistry. In this perspective, we look to the future and discuss molecular string representations, along with their respective opportunities and challenges. We propose 16 concrete future projects for robust representations. These involve the extension toward new domains, exciting questions at the interface of AI and robust languages, and interpretability for both humans and machines. We hope that these proposals will inspire follow-up works exploiting the full potential of molecular string representations.

Language: English

Citations: 156

Machine Learning for Chemical Reactivity: The Importance of Failed Experiments
Felix Strieth‐Kalthoff, Frederik Sandfort, Marius Kühnemund, et al.

Angewandte Chemie International Edition, Journal Year: 2022, Volume and Issue: 61(29)

Published: May 5, 2022

Assessing the outcomes of chemical reactions in a quantitative fashion has been a cornerstone across all synthetic disciplines. Classically approached through empirical optimization, data‐driven modelling bears an enormous potential to streamline this process. However, such predictive models require significant quantities of high‐quality data, the availability of which is limited. The main reasons for this include experimental errors and, importantly, human biases regarding experiment selection and result reporting. In a series of case studies, we investigate the impact of these biases on drawing general conclusions from reaction data, revealing the utmost importance of “negative” examples. Eventually, case studies into data expansion approaches showcase directions to circumvent these limitations and demonstrate perspectives towards long‐term data quality enhancement in chemistry.

Language: English

Citations: 153

Machine Learning May Sometimes Simply Capture Literature Popularity Trends: A Case Study of Heterocyclic Suzuki–Miyaura Coupling
Wiktor Beker, Rafał Roszak, Agnieszka Wołos, et al.

Journal of the American Chemical Society, Journal Year: 2022, Volume and Issue: 144(11), P. 4819 - 4827

Published: March 8, 2022

Applications of machine learning (ML) to synthetic chemistry rely on the assumption that large numbers of literature-reported examples should enable the construction of accurate and predictive models of chemical reactivity. This paper demonstrates that an abundance of carefully curated literature data may be insufficient for this purpose. Using an example of Suzuki–Miyaura coupling with heterocyclic building blocks, and a selected database of >10,000 examples, we show that ML models cannot offer any meaningful predictions of optimum reaction conditions, even if the search space is restricted to only solvents and bases. This result holds irrespective of the ML model applied (from simple feed-forward to state-of-the-art graph-convolution neural networks) or the representation used to describe the reaction partners (various fingerprints, descriptors, latent representations, etc.). In all cases, the ML methods fail to perform significantly better than naive assignments based on the sheer frequency of certain conditions reported in the literature. These unsatisfactory results likely reflect the subjective preferences of various chemists to use certain protocols, other biasing factors as mundane as the availability of certain solvents/reagents, and/or a lack of negative data. These findings highlight the importance of systematically generating reliable and standardized data sets for algorithm training.

Language: English

Citations: 143

Using Data Science To Guide Aryl Bromide Substrate Scope Analysis in a Ni/Photoredox-Catalyzed Cross-Coupling with Acetals as Alcohol-Derived Radical Sources
Stavros K. Kariofillis, Shutian Jiang, A. Zuranski, et al.

Journal of the American Chemical Society, Journal Year: 2022, Volume and Issue: 144(2), P. 1045 - 1055

Published: Jan. 5, 2022

Ni/photoredox catalysis has emerged as a powerful platform for C(sp2)–C(sp3) bond formation. While many of these methods typically employ aryl bromides as the C(sp2) coupling partner, a variety of aliphatic radical sources have been investigated. In principle, these reactions enable access to the same product scaffolds, but it can be hard to discern which method to employ because nonstandardized sets of aryl bromides are used in scope evaluation. Herein, we report a Ni/photoredox-catalyzed (deutero)methylation and alkylation of aryl halides where benzaldehyde di(alkyl) acetals serve as alcohol-derived radical sources. Reaction development, mechanistic studies, and late-stage derivatization of a biologically relevant aryl chloride, fenofibrate, are presented. Then, we describe the integration of data science techniques, including DFT featurization, dimensionality reduction, and hierarchical clustering, to delineate a diverse and succinct collection of aryl bromides that is representative of the chemical space of the substrate class. By superimposing scope examples from published methods on this space, we identify areas of sparse coverage and of high versus low average yields, enabling comparisons between prior art and this new method. Additionally, we demonstrate that the systematically selected scope can be used to quantify population-wide reactivity trends and reveal sources of possible functional group incompatibility with supervised machine learning.

Language: English

Citations: 131

Quantum chemistry-augmented neural networks for reactivity prediction: Performance, generalizability, and explainability
Thijs Stuyver, Connor W. Coley

The Journal of Chemical Physics, Journal Year: 2022, Volume and Issue: 156(8)

Published: Feb. 22, 2022

There is a perceived dichotomy between structure-based and descriptor-based molecular representations used for predictive chemistry tasks. Here, we study the performance, generalizability, and explainability of the quantum mechanics-augmented graph neural network (ml-QM-GNN) architecture as applied to the prediction of regioselectivity (classification) and of activation energies (regression). In our hybrid QM-augmented model architecture, structure-based representations are first used to predict a set of atom- and bond-level reactivity descriptors derived from density functional theory calculations. These estimated descriptors are combined with the original representation to make the final prediction. We demonstrate that this leads to significant improvements over structure-based GNNs in not only overall accuracy but also generalization to unseen compounds. Even when provided with training sets of only a couple hundred labeled data points, the ml-QM-GNN outperforms other state-of-the-art architectures that have been applied to these tasks, as well as descriptor-based (linear) regressions. As a primary contribution of this work, we bridge data-driven predictions and conceptual frameworks commonly used to gain qualitative insights into reactivity phenomena, taking advantage of the fact that our models are grounded in (but not restricted to) QM descriptors. This effort results in a productive synergy between theory and data science, wherein the models provide confirmation of previous analyses, and these analyses in turn facilitate insight into the decision-making process occurring within ml-QM-GNNs.

Language: English

Citations: 73

On the use of real-world datasets for reaction yield prediction
Mandana Saebi, Bozhao Nan, John E. Herr, et al.

Chemical Science, Journal Year: 2023, Volume and Issue: 14(19), P. 4997 - 5005

Published: Jan. 1, 2023

The lack of publicly available, large, and unbiased datasets is a key bottleneck for the application of machine learning (ML) methods in synthetic chemistry. Data from electronic laboratory notebooks (ELNs) could provide less biased, large datasets, but no such datasets have been made available. The first real-world dataset from the ELNs of a pharmaceutical company is disclosed, and its relationship to high-throughput experimentation (HTE) datasets is described. For chemical yield predictions, a key task in synthesis, an attributed graph neural network (AGNN) performs as well as or better than the best previous models on two HTE datasets for the Suzuki-Miyaura and Buchwald-Hartwig reactions. However, training the AGNN on the ELN dataset does not lead to a predictive model. The implications of using ELN data for training ML-based models are discussed in the context of yield predictions.

Language: English

Citations: 70

Machine learning advancements in organic synthesis: A focused exploration of artificial intelligence applications in chemistry
Rizvi Syed Aal E Ali, Jiaolong Meng, Muhammad Ehtisham Ibraheem Khan, et al.

Artificial Intelligence Chemistry, Journal Year: 2024, Volume and Issue: 2(1), P. 100049 - 100049

Published: Jan. 19, 2024

Artificial intelligence (AI) is driving a revolution in chemistry, reshaping the landscape of molecular design. This review explores AI's pivotal roles in the field of organic synthesis and its applications. AI accurately predicts reaction outcomes, controls chemical selectivity, simplifies synthesis planning, accelerates catalyst discovery, fuels material innovation, and so on. It seamlessly integrates data-driven algorithms with chemical intuition to redefine molecular design. As AI in chemistry advances, it promises accelerated research, sustainability, and innovative solutions to chemistry's pressing challenges. The fusion of AI and chemistry is poised to shape the field's future profoundly, offering new horizons in precision and efficiency. This review encapsulates that transformation, marking a moment where data and chemistry converge to revolutionize the world of molecules.

Language: English

Citations: 25

Universal machine learning aided synthesis approach of two-dimensional perovskites in a typical laboratory
Yilei Wu, Changfeng Wang, Ming‐Gang Ju, et al.

Nature Communications, Journal Year: 2024, Volume and Issue: 15(1)

Published: Jan. 2, 2024

The past decade has witnessed significant efforts in novel material discovery through the use of data-driven techniques, in particular, machine learning (ML). However, since it needs to consider precursors, experimental conditions, and the availability of reactants, material synthesis is generally much more complex than property and structure prediction, and very few computational predictions are experimentally realized. To solve these challenges, a universal framework that integrates high-throughput experiments, a priori knowledge of chemistry, and ML techniques such as subgroup discovery and support vector machines is proposed to guide the synthesis of materials. The framework is capable of disclosing the structure-property relationship hidden in high-throughput experiments and of rapidly screening out materials with high synthesis feasibility from a vast chemical space. Through application of our approach to the challenging and consequential synthesis problem of 2D silver/bismuth organic-inorganic hybrid perovskites, we have increased the success rate by a factor of four relative to traditional approaches. This study provides a practical route for solving multidimensional acceleration problems with a small dataset from a typical laboratory with limited resources available.

Language: English

Citations: 22

Designing Target-specific Data Sets for Regioselectivity Predictions on Complex Substrates
Jules Schleinitz, Alba Carretero‐Cerdán, Anjali Gurajapu, et al.

Journal of the American Chemical Society, Journal Year: 2025, Volume and Issue: 147(9), P. 7476 - 7484

Published: Feb. 21, 2025

The development of machine learning models to predict the regioselectivity of C(sp3)-H functionalization reactions is reported. A data set for dioxirane oxidations was curated from the literature and used to generate a model to predict the regioselectivity of C-H oxidation. To assess whether smaller, intentionally designed data sets could provide accuracy on complex targets, a series of acquisition functions were developed to select the most informative molecules for a specific target. Active learning-based acquisition functions that leverage predicted reactivity and model uncertainty were found to outperform those based on molecular and site similarity alone. The use of acquisition functions for data set elaboration significantly reduced the number of data points needed to perform accurate prediction, and it was found that smaller, machine-designed data sets can give accurate predictions when larger, randomly selected data sets fail. Finally, the workflow was experimentally validated on five complex substrates and shown to be applicable to predicting the regioselectivity of arene C-H radical borylation. These studies provide a quantitative alternative to the intuitive extrapolation from "model substrates" that is frequently used to estimate reactivity on complex molecules.

Language: English

Citations: 2