Computational methods for asymmetric catalysis
Sharon Pinus, Jérôme Genzling, Mihai Burai Patrascu, et al.

Nature Catalysis, Journal Year: 2024, Volume and Issue: unknown

Published: Dec. 3, 2024

Language: English

Extrapolation validation (EV): a universal validation method for mitigating machine learning extrapolation risk
Mengxian Yu, Yin‐Ning Zhou, Qiang Wang, et al.

Digital Discovery, Journal Year: 2024, Volume and Issue: 3(5), P. 1058 - 1067

Published: Jan. 1, 2024

A generic machine learning model validation method named extrapolation validation (EV) has been proposed, which evaluates the trustworthiness of model predictions to mitigate the extrapolation risk before the model transitions to applications.

Language: English

Citations

6

Machine learning-guided strategies for reaction conditions design and optimization
Lung-Yi Chen, Yi‐Pei Li

Beilstein Journal of Organic Chemistry, Journal Year: 2024, Volume and Issue: 20, P. 2476 - 2492

Published: Oct. 4, 2024

This review surveys the recent advances and challenges in predicting and optimizing reaction conditions using machine learning techniques. The paper emphasizes the importance of acquiring and processing large and diverse datasets of chemical reactions, and the use of both global and local models to guide the design of synthetic processes. Global models exploit the information from comprehensive databases to suggest general reaction conditions for new reactions, while local models fine-tune the specific parameters for a given reaction family to improve yield and selectivity. The paper also identifies the current limitations and opportunities in this field, such as data quality and availability, and the integration of high-throughput experimentation. The paper demonstrates how the combination of chemical engineering, data science, and ML algorithms can enhance the efficiency and effectiveness of reaction conditions design, and enable novel discoveries in synthetic chemistry.

Language: English

Citations

5

Incorporating Synthetic Accessibility in Drug Design: Predicting Reaction Yields of Suzuki Cross-Couplings by Leveraging AbbVie’s 15-Year Parallel Library Data Set
Priyanka Raghavan, Alexander J. Rago, Pritha Verma, et al.

Journal of the American Chemical Society, Journal Year: 2024, Volume and Issue: 146(22), P. 15070 - 15084

Published: May 20, 2024

Despite the increased use of computational tools to supplement medicinal chemists' expertise and intuition in drug design, predicting synthetic yields in medicinal chemistry endeavors remains an unsolved challenge. Existing design workflows could profoundly benefit from reaction yield prediction, as precious material waste could be reduced, and a greater number of relevant compounds could be delivered to advance the design, make, test, analyze (DMTA) cycle. In this work, we detail the evaluation of AbbVie's medicinal chemistry library data set to build machine learning models for the prediction of Suzuki coupling reaction yields. The combination of density functional theory (DFT)-derived features and Morgan fingerprints was identified to perform better than one-hot encoded baseline modeling, furnishing encouraging results. Overall, we observe modest generalization to unseen reactant structures within the 15-year retrospective library data set. Additionally, we compare predictions made by the model to those made by expert medicinal chemists, finding that the model can often predict both reaction success and reaction yields with greater accuracy. Finally, we demonstrate the application of this approach to suggest structurally and electronically similar building blocks to replace those predicted or observed to be unsuccessful prior to or after synthesis, respectively. The yield prediction model was used to select similar monomers predicted to have higher yields, resulting in greater synthesis efficiency of relevant drug-like molecules.

Language: English

Citations

4

Using Data Science Tools to Reveal and Understand Subtle Relationships of Inhibitor Structure in Frontal Ring-Opening Metathesis Polymerization
Timothy Patrick McFadden, Reid B. Cope, Rachel Muhlestein, et al.

Journal of the American Chemical Society, Journal Year: 2024, Volume and Issue: 146(24), P. 16375 - 16380

Published: June 5, 2024

The rate of frontal ring-opening metathesis polymerization (FROMP) using the Grubbs generation II catalyst is impacted by both the concentration and choice of monomers and inhibitors, usually organophosphorus derivatives. Herein we report a data-science-driven workflow to evaluate how these factors impact both the rate of FROMP and how long the formulation of the mixture is stable (pot life). Using this workflow, we built a classification model using a single-node decision tree to determine how a simple phosphine structural descriptor (Vbur-near) can bin long versus short pot life. Additionally, we applied a nonlinear kernel ridge regression model to predict how the inhibitor and the selection/concentration of comonomers impact the FROMP rate. The analysis provides selection criteria for material network structures that span from highly cross-linked thermosets to non-cross-linked thermoplastics, as well as degradable and nondegradable materials.

Language: English

Citations

4

Prof. Eric Jacobsen and #MSDChemistry: Past, present and future
Rebecca T. Ruck, Petr Váchal

Tetrahedron, Journal Year: 2025, Volume and Issue: 174, P. 134498 - 134498

Published: Jan. 25, 2025

Language: English

Citations

0

Evaluating Predictive Accuracy in Asymmetric Catalysis: A Machine Learning Perspective on Local Reaction Space
Isaiah O. Betinol, Aleksandra Demchenko, Jolene P. Reid, et al.

ACS Catalysis, Journal Year: 2025, Volume and Issue: unknown, P. 6067 - 6077

Published: March 31, 2025

Language: English

Citations

0

Integrating a multitask graph neural network with DFT calculations for site-selectivity prediction of arenes and mechanistic knowledge generation
Xinran Chen, Zijing Zhang, Xin Hong, et al.

Nature Synthesis, Journal Year: 2025, Volume and Issue: unknown

Published: April 7, 2025

Language: English

Citations

0

A meta-learning approach for selectivity prediction in asymmetric catalysis
Sukriti Singh, José Miguel Hernández-Lobato

Nature Communications, Journal Year: 2025, Volume and Issue: 16(1)

Published: April 15, 2025

Transition metal-catalyzed asymmetric reactions are of high contemporary importance in organic synthesis. Recently, machine learning (ML) has shown promise in accelerating the development of newer catalytic asymmetric protocols. However, the need for a large amount of experimental data can present a bottleneck in implementing ML models. Here, we propose a meta-learning workflow that can harness literature-derived data to extract shared reaction features and requires only a few examples to predict the outcome of new reactions. Prototypical networks are used as the meta-learning method to predict the enantioselectivity of asymmetric hydrogenation of olefins. This meta-learning model consistently provides significant performance improvement over other popular ML methods such as random forests and graph neural networks. The performance of our meta-model is analyzed with varying sizes of training examples to demonstrate its utility even with limited data. A good model performance on an out-of-sample test set further indicates the general applicability of our approach. We believe this work will provide a leap forward in identifying promising reactions in the early phases of reaction development when minimal data is available.

Language: English

Citations

0

Local reaction condition optimization via machine learning
Wenwei Song, Honggang Sun

Journal of Molecular Modeling, Journal Year: 2025, Volume and Issue: 31(5)

Published: April 23, 2025

Language: English

Citations

0

Exploring BERT for Reaction Yield Prediction: Evaluating the Impact of Tokenization, Molecular Representation, and Pretraining Data Augmentation
Adrian Krzyzanowski, Stephen D. Pickett, Péter Pogány, et al.

Journal of Chemical Information and Modeling, Journal Year: 2025, Volume and Issue: unknown

Published: May 1, 2025

Predicting reaction yields in synthetic chemistry remains a significant challenge. This study systematically evaluates the impact of tokenization, molecular representation, pretraining data, and adversarial training on a BERT-based model for yield prediction of Buchwald-Hartwig and Suzuki-Miyaura coupling reactions using publicly available HTE data sets. We demonstrate that molecular representation choice (SMILES, DeepSMILES, SELFIES, Morgan fingerprint-based notation, IUPAC names) has minimal impact on model performance, while typically BPE and SentencePiece tokenization outperform other methods. WordPiece is strongly discouraged for SELFIES and fingerprint-based notation. Furthermore, pretraining with relatively small data sets (<100 K reactions) achieves comparable performance to larger data sets containing millions of examples. The use of artificially generated domain-specific pretraining data is proposed. The artificially generated sets prove to be a good surrogate for the reaction schemes extracted from reaction data sets such as Pistachio or Reaxys. The best performance was observed for hybrid pretraining sets combining the real and the domain-specific, artificial data. Finally, we show that a novel adversarial training approach, perturbing input embeddings dynamically, improves the robustness and generalizability of the model for yield and reaction success prediction. These findings provide valuable insights for developing robust and practical machine learning models for yield prediction in synthetic chemistry. GSK's BERT training code base is made available to the community with this work.

Language: English

Citations

0