
Digital Discovery, Year: 2024, No. 3(9), pp. 1878-1888
Published: Jan. 1, 2024
How much chemistry can be described by looking only at each atom, its neighbours and next-nearest neighbours?
Language: English
Science Advances, Year: 2024, No. 10(3)
Published: Jan. 17, 2024
Data science is assuming a pivotal role in guiding reaction optimization and streamlining experimental workloads in the evolving landscape of synthetic chemistry. A discipline-wide goal is the development of workflows that integrate computational chemistry and data science tools with high-throughput experimentation, as it provides experimentalists the ability to maximize success in expensive synthetic campaigns. Here, we report an end-to-end data-driven process to effectively predict how structural features of coupling partners and ligands affect Cu-catalyzed C–N coupling reactions. The established workflow underscores the limitations posed by substrates while also providing a systematic ligand prediction tool that uses probability to assess when a ligand will be successful. This platform is strategically designed to confront the intrinsic unpredictability frequently encountered in synthetic reaction deployment.
Language: English
Cited by: 25
Science Advances, Year: 2025, No. 11(1)
Published: Jan. 1, 2025
The application of statistical modeling in organic chemistry is emerging as a standard practice for probing structure-activity relationships and as a predictive tool for many optimization objectives. This review is aimed as a tutorial for those entering the area of statistical modeling in chemistry. We provide case studies to highlight the considerations and approaches that can be used to successfully analyze datasets in low data regimes, a common situation encountered given the experimental demands of organic chemistry. Statistical modeling hinges on the data (what is being modeled), descriptors (how data are represented), and algorithms (how data are modeled). Herein, we focus on how various reaction outputs (e.g., yield, rate, selectivity, solubility, stability, and turnover number) and data structures (e.g., binned, heavily skewed, and distributed) influence the choice of algorithm for constructing chemically insightful models.
Language: English
Cited by: 3
ACS Central Science, Year: 2023, No. 9(12), pp. 2196-2204
Published: Dec. 8, 2023
Models can codify our understanding of chemical reactivity and serve a useful purpose in the development of new synthetic processes via, for example, evaluating hypothetical reaction conditions or in silico substrate tolerance. Perhaps the most determining factor is the composition of the training data and whether it is sufficient to train a model that can make accurate predictions over the full domain of interest. Here, we discuss the design of datasets in ways that are conducive to data-driven modeling, emphasizing the idea that training set diversity and model generalizability rely on the choice of molecular representation. We additionally discuss the experimental constraints associated with generating common types of chemistry data and how these considerations should influence dataset building.
Language: English
Cited by: 42
The Journal of Chemical Physics, Year: 2024, No. 161(5)
Published: Aug. 2, 2024
This paper is dedicated to the quantum chemical package Jaguar, which is commercial software developed and distributed by Schrödinger, Inc. We discuss Jaguar’s scientific features that are relevant to chemical research as well as describe those aspects of the program that are pertinent to the user interface, the organization of the computer code, and its maintenance and testing. Among the scientific topics that feature prominently in this paper are the quantum chemical methods grounded in the pseudospectral approach. A number of multistep workflows dependent on Jaguar are covered: prediction of protonation equilibria in aqueous solutions (particularly calculations of tautomeric stability and pKa), reactivity predictions based on automated transition state search, and assembly of Boltzmann-averaged spectra such as vibrational and electronic circular dichroism, as well as nuclear magnetic resonance. Discussed also are quantum chemical calculations that are oriented toward materials science applications, in particular, prediction of properties of optoelectronic materials and organic semiconductors, as well as molecular catalyst design. The topic of treatment of conformations inevitably comes up in real world research projects and is considered as part of all the workflows mentioned above. In addition, we examine the role of machine learning methods in quantum chemical calculations performed by Jaguar, from auxiliary functions that return the approximate calculation runtime in a user interface to prediction of actual molecular properties. The current work is the second in a series of reviews of Jaguar, the first having been published more than ten years ago. Thus, this paper serves as a rare milestone on the path that is being traversed by Jaguar’s development in more than thirty years of its existence.
Language: English
Cited by: 9
Journal of the American Chemical Society, Year: 2024, No. 146(23), pp. 16052-16061
Published: June 1, 2024
The application of machine learning models to the prediction of reaction outcomes currently needs large and/or highly featurized data sets. We show that a chemistry-aware model, NERF, which mimics the bonding changes that occur during reactions, allows for accurate predictions of the outcomes of Diels–Alder reactions using a relatively small training set, with no pretraining and no additional features. We establish a diverse data set of 9537 intramolecular, hetero-, aromatic, and inverse electron demand Diels–Alder reactions. This data set is used to train NERF, and the performance is compared against state-of-the-art classification and generative machine learning models across low- and high-data regimes, with and without pretraining. The predictive accuracy (regio- and site selectivity in the major product) achieved by NERF exceeds 90% when using as little as 40% of the data set for training. Another high-performing model, Chemformer, requires a larger training data set (>45%) to reach 90% Top-1 accuracy. Accurate predictions of less-represented subclasses, such as those involving heteroatomic or aromatic substrates, require higher percentages of training data. We also show how NERF can use small amounts of additional data to quickly learn new systems and improve its overall understanding of reactivity. Synthetic chemists stand to benefit, as this model can be rapidly expanded and tailored to areas of chemistry corresponding to the low-data regime.
Language: English
Cited by: 8
Chemical Science, Year: 2025, No. unknown
Published: Jan. 1, 2025
Label ranking is introduced as a conceptually new means for prioritizing experiments. Their simplicity, ease of application, and the use aggregation facilitate their ability to make accurate predictions with small datasets.
Language: English
Cited by: 1
Communications Chemistry, Year: 2024, No. 7(1)
Published: June 14, 2024
Recent years have seen a rapid growth in the application of various machine learning methods for reaction outcome prediction. Deep learning models have gained popularity due to their ability to learn representations directly from the molecular structure. Gaussian processes (GPs), on the other hand, provide reliable uncertainty estimates but are unable to learn representations from the data. We combine the feature learning ability of neural networks (NNs) with the uncertainty quantification of GPs in a deep kernel learning (DKL) framework to predict the reaction outcome. The DKL model is observed to obtain very good predictive performance across different input representations. It significantly outperforms standard GPs and provides performance comparable to graph neural networks, but with uncertainty estimation. Additionally, the uncertainty estimates on predictions provided by the DKL model facilitated its incorporation as a surrogate model for Bayesian optimization (BO). The proposed method, therefore, has great potential towards accelerating reaction discovery by integrating accurate predictive models that provide reliable uncertainty estimates with BO.
Language: English
Cited by: 4
Chemical Engineering Journal, Year: 2024, No. 493, p. 152300
Published: May 16, 2024
Language: English
Cited by: 3
Pure and Applied Chemistry, Year: 2025, No. unknown
Published: April 24, 2025
Computational methods for predicting product ratios in dynamically controlled reactions with shallow intermediates or bifurcating pathways after an ambimodal transition state are reviewed and benchmarked. The range of methods includes molecular dynamics simulations, machine learning-based models, and recent advancements in correlational methods, all of which rely on quantum mechanical computations. Together, these approaches form a computational toolbox that enhances the efficiency and effectiveness of exploring reaction selectivity influenced by dynamic effects.
Language: English
Cited by: 0
Journal of the American Chemical Society, Year: 2025, No. unknown
Published: May 22, 2025
When developing machine learning models for yield prediction, the two main challenges are effectively exploring condition space and substrate space. In this article, we disclose an approach for mapping the substrate space for Ni/photoredox-catalyzed cross-electrophile coupling of alkyl bromides and aryl bromides in a high-throughput experimentation (HTE) context. This model employs active learning (in particular, uncertainty querying) as a strategy to rapidly construct a yield model. Given the vastness of substrate space, we focused on an approach that builds an initial model and then uses a minimal data set to expand into new chemical spaces. In particular, we built a model for a virtual space of 22,240 compounds using less than 400 data points. We demonstrated that the model can be expanded to 33,312 compounds by adding information around 24 building blocks (<100 additional reactions). Comparing the active learning-based model to one constructed on randomly selected data showed that the active learning model was significantly better at predicting which reactions will be successful. A combination of density functional theory (DFT) features and difference Morgan fingerprints was employed to construct the random forest model. Feature importance analysis indicates that key DFT features related to the reaction mechanism (e.g., alkyl radical LUMO energy) were crucial for model performance and predictions outside of the training set. We anticipate that combining DFT featurization and uncertainty-based querying will help the synthetic organic community build predictive models in a data-efficient manner for other reactions that feature large and diverse scopes.
Language: English
Cited by: 0