Cited by Data-Driven Insights into the Transition-Metal-Catalyzed Asymmetric Hydrogenation of Olefins

Designing Target-specific Data Sets for Regioselectivity Predictions on Complex Substrates

Jules Schleinitz, Alba Carretero‐Cerdán, Anjali Gurajapu

et al.

Journal of the American Chemical Society, Journal Year: 2025, Volume and Issue: 147(9), P. 7476 - 7484

Published: Feb. 21, 2025

The development of machine learning models to predict the regioselectivity of C(sp3)-H functionalization reactions is reported. A data set for dioxirane oxidations was curated from the literature and used to generate a model to predict the regioselectivity of C-H oxidation. To assess whether smaller, intentionally designed data sets could provide accuracy on complex targets, a series of acquisition functions were developed to select the most informative molecules for the specific target. Active learning-based acquisition functions that leverage predicted reactivity and model uncertainty were found to outperform those based on molecular and site similarity alone. The use of acquisition functions for data set elaboration significantly reduced the number of data points needed to perform accurate prediction, and it was found that smaller, machine-designed data sets can give accurate predictions when larger, randomly selected data sets fail. Finally, the workflow was experimentally validated on five complex substrates and shown to be applicable to predicting the regioselectivity of arene C-H radical borylation. These studies provide a quantitative alternative to the intuitive extrapolation from "model substrates" that is frequently used to estimate reactivity on complex molecules.

Language: English


Data Science-Driven Discovery of Optimal Conditions and a Condition-Selection Model for the Chan–Lam Coupling of Primary Sulfonamides

Shivaani Gandhi, Gregory Brown, Santeri Aikonen

et al.

ACS Catalysis, Journal Year: 2025, Volume and Issue: unknown, P. 2292 - 2304

Published: Jan. 24, 2025

Language: English


Dedenser: A Python Package for Clustering and Downsampling Chemical Libraries

Armen G. Beck, Jonathan Fine, Yu‐hong Lam

et al.

Journal of Chemical Information and Modeling, Journal Year: 2025, Volume and Issue: unknown

Published: Jan. 30, 2025

The screening of chemical libraries is an essential starting point in the drug discovery process. While some researchers desire a more thorough screening of drug targets against a narrower scope of molecules, it is not uncommon for diverse screening sets to be favored during the early stages of drug discovery. However, a cost burden is associated with the screening of molecules, with potential drawbacks if particular areas of chemical space are needlessly overrepresented. To facilitate triaged sampling of chemical libraries and other collections of molecules, we have developed Dedenser, a tool for the downsampling of chemical clusters. Dedenser functions by reducing the membership of clusters within chemical point clouds while maintaining the initial topology or distribution in chemical space. Dedenser is a Python package that utilizes Hierarchical Density-Based Spatial Clustering of Applications with Noise to first identify clusters present in 3D chemical point clouds and then downsamples the clusters by applying Poisson disk sampling based on either their volume or density in chemical space. A command line interface and a graphic user interface are available, which allow for the generation of chemical point clouds, using Mordred for QSAR descriptor calculations and uniform manifold approximation and projection for 3D embedding, as well as downsampling and visualization. We hope that Dedenser will serve the community by enabling quick access to reduced sets of molecules that are representative of larger sets, selecting even distributions of molecules within clusters rather than single representative molecules from clusters. All code for Dedenser is open source and available at https://github.com/MSDLLCpapers/dedenser.
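
The Poisson disk step described in this abstract can be illustrated with a minimal sketch. It is not Dedenser's API: the function name `poisson_disk_downsample`, the `min_dist` parameter, and the toy point cloud are all hypothetical, and the clustering stage (HDBSCAN in the package) is skipped, so the whole cloud is treated as a single set of 3D points. The sketch uses simple greedy dart throwing, which keeps a point only if no already-kept point lies closer than `min_dist`.

```python
import math
import random


def poisson_disk_downsample(points, min_dist, seed=0):
    """Greedy dart throwing over an existing point set (illustrative only).

    Visits the points in a random order and keeps a point only if every
    previously kept point is at least `min_dist` away. Returns the sorted
    indices of the kept points.
    """
    rng = random.Random(seed)
    order = list(range(len(points)))
    rng.shuffle(order)
    kept = []
    for i in order:
        if all(math.dist(points[i], points[j]) >= min_dist for j in kept):
            kept.append(i)
    return sorted(kept)


# Hypothetical 3D "chemical point cloud": one dense blob plus a sparse region.
rng = random.Random(42)
dense = [tuple(rng.gauss(0.0, 0.05) for _ in range(3)) for _ in range(200)]
sparse = [tuple(rng.uniform(2.0, 4.0) for _ in range(3)) for _ in range(20)]
cloud = dense + sparse

kept = poisson_disk_downsample(cloud, min_dist=0.3)
print(f"kept {len(kept)} of {len(cloud)} points")
```

With this choice of `min_dist`, the dense blob is thinned heavily while the sparse region is thinned far less, which is the qualitative behaviour the abstract describes as reducing cluster membership while maintaining the overall distribution.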

Language: English


Probability Guided Chemical Reaction Scopes

Inbal Lorena Eshel, Shahar Barkai, Sergio Barranco

et al.

Published: Jan. 1, 2025

Language: English


Revealing the Relationship between Publication Bias and Chemical Reactivity with Contrastive Learning

Wenhao Gao, Priyanka Raghavan, Ron Shprints

et al.

Journal of the American Chemical Society, Journal Year: 2025, Volume and Issue: unknown

Published: March 2, 2025

A synthetic method's substrate tolerance and generality are often showcased in a "substrate scope" table. However, substrate selection exhibits a frequently discussed publication bias: unsuccessful experiments or low-yielding results are rarely reported. In this work, we explore more deeply the relationship between such publication bias and chemical reactivity beyond the simple analysis of yield distributions using a novel neural network training strategy, substrate scope contrastive learning. By treating reported substrates as positive samples and nonreported substrates as negative samples, our contrastive learning strategy teaches a model to group molecules within a numerical embedding space, based on historical trends in published substrate scope tables. Training on 20,798 aryl halides in the CAS Content Collection™, spanning thousands of publications from 2010 to 2015, we demonstrate that the learned embeddings exhibit a correlation with physical organic reactivity descriptors through both intuitive visualizations and quantitative regression analyses. Additionally, these embeddings are applicable to various reaction modeling tasks like yield prediction and regioselectivity prediction, underscoring the potential to use historical reaction data as a pretraining task. This work not only presents a chemistry-specific machine learning training strategy to learn from literature data in a new way but also represents a unique approach to uncover trends in chemical reactivity reflected by trends in substrate selection in publications.

Language: English


The Implementation and Impact of Chemical High-Throughput Experimentation at AstraZeneca

James J. Douglas, Andrew D. Campbell, David Buttar

et al.

ACS Catalysis, Journal Year: 2025, Volume and Issue: unknown, P. 5229 - 5256

Published: March 13, 2025

Language: English


Applying Active Learning toward Building a Generalizable Model for Ni-Photoredox Cross-Electrophile Coupling of Aryl and Alkyl Bromides

Lucas W. Souza, Nathan D. Ricke, Braden C. Chaffin

et al.

Journal of the American Chemical Society, Journal Year: 2025, Volume and Issue: unknown

Published: May 22, 2025

When developing machine learning models for yield prediction, the two main challenges are effectively exploring condition space and substrate space. In this article, we disclose an approach for mapping the substrate space for Ni/photoredox-catalyzed cross-electrophile coupling of alkyl bromides and aryl bromides in a high-throughput experimentation (HTE) context. This approach employs active learning (in particular, uncertainty querying) as a strategy to rapidly construct a yield model. Given the vastness of substrate space, we focused on an approach that builds an initial model and then uses a minimal data set to expand into new chemical spaces. We built a model for a virtual space of 22,240 compounds using less than 400 data points. We demonstrated that the model can be expanded to 33,312 compounds by adding information around 24 building blocks (<100 additional reactions). Comparing the active learning-based model to one constructed on randomly selected data showed that the active learning model was significantly better at predicting which reactions will be successful. A combination of density functional theory (DFT) features and difference Morgan fingerprints was employed to construct the random forest model. Feature importance analysis indicates that key DFT features related to the reaction mechanism (e.g., alkyl radical LUMO energy) were crucial for model performance and for predictions outside of the training set. We anticipate that combining DFT featurization and uncertainty-based querying will help the synthetic organic community build predictive models in a data-efficient manner for other reactions that feature large and diverse scopes.
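
The uncertainty querying mentioned in this abstract can be sketched as follows. This is a generic illustration and not the authors' workflow: it assumes per-candidate predictions from several ensemble members (for example, the individual trees of a random forest) are already available, and the function name `rank_by_uncertainty` and the yield values are hypothetical. Candidates whose predictions disagree most across the ensemble are chosen as the next reactions to run.

```python
from statistics import pvariance


def rank_by_uncertainty(ensemble_preds, batch_size):
    """Pick the candidates whose ensemble predictions disagree the most.

    ensemble_preds[m][i] is the prediction of ensemble member m for
    candidate i. Returns (indices of the top `batch_size` candidates,
    list of per-candidate variances).
    """
    n_candidates = len(ensemble_preds[0])
    variances = [
        pvariance([member[i] for member in ensemble_preds])
        for i in range(n_candidates)
    ]
    ranked = sorted(range(n_candidates), key=lambda i: variances[i], reverse=True)
    return ranked[:batch_size], variances


# Hypothetical yield predictions (%) from three ensemble members
# for four untested substrate pairs.
preds = [
    [80.0, 10.0, 55.0, 30.0],
    [82.0, 60.0, 50.0, 31.0],
    [79.0, 35.0, 45.0, 29.0],
]

batch, variances = rank_by_uncertainty(preds, batch_size=2)
print("query next:", batch)
```

In a full active learning loop, the selected reactions would be run, their measured yields added to the training set, and the model retrained before the next round of querying.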

Language: English


Catalysing (organo-)catalysis: Trends in the application of machine learning to enantioselective organocatalysis

Stefan P. Schmid, Leon Schlosser, Frank Glorius

et al.

Beilstein Journal of Organic Chemistry, Journal Year: 2024, Volume and Issue: 20, P. 2280 - 2304

Published: Sept. 10, 2024

Organocatalysis has established itself as a third pillar of homogeneous catalysis, besides transition metal catalysis and biocatalysis, as its use for enantioselective reactions has gathered significant interest over the last decades. Concurrent to this development, machine learning (ML) has been increasingly applied in the chemical domain to efficiently uncover hidden patterns in data and accelerate scientific discovery. While the uptake of ML in organocatalysis has been comparably slow, the last two decades have shown an increased interest from the community. This review gives an overview of the work in the field of ML in organocatalysis. The review starts by giving a short primer on ML for experimental chemists, before discussing its application for predicting the selectivity of organocatalytic transformations. Subsequently, we review ML employed for privileged catalysts, before focusing on its application for catalyst and reaction design. Concluding, we give our view on current challenges and future directions for this field, drawing inspiration from the application of ML to other scientific domains.

Language: English


Data-Driven Insights into the Transition-Metal-Catalyzed Asymmetric Hydrogenation of Olefins

Sukriti Singh, José Miguel Hernández-Lobato

The Journal of Organic Chemistry, Journal Year: 2024, Volume and Issue: 89(17), P. 12467 - 12478

Published: Aug. 16, 2024

The transition-metal-catalyzed asymmetric hydrogenation of olefins is one of the key transformations with great utility in various industrial applications. The field has been dominated by the use of noble metal catalysts, such as iridium and rhodium. The reactions with the earth-abundant metal cobalt have increased only in recent years. In this work, we analyze the large amount of literature data available on iridium- and rhodium-catalyzed asymmetric hydrogenation. The limited data on reactions using Co catalysts are then examined in the context of Ir and Rh to obtain a better understanding of the reactivity pattern. A detailed data-driven study of the types of olefins, ligands, and reaction conditions such as solvent, temperature, and pressure is carried out. Our analysis provides an understanding of the literature trends and demonstrates that only a few olefin–ligand combinations or reaction conditions are frequently used. The knowledge of this bias in the literature data toward a certain group of substrates or reaction conditions can be useful for practitioners to design new reaction data sets that are suitable to obtain meaningful predictions from machine-learning models.

Language: English
