Molecular Design for Cardiac Cell Differentiation Using a Small Data Set and Decorated Shape Features
Fatemeh Etezadi, Shunichi Ito, Kosuke Yasui, et al.

Journal of Chemical Information and Modeling, Year: 2024, Number: 64(23), pp. 8824-8837

Published: Nov. 25, 2024

The discovery of small organic compounds for inducing stem cell differentiation is a time- and resource-intensive process. While data science could, in principle, streamline the discovery of these compounds, novel approaches are required due to the difficulty of acquiring training data from large numbers of example compounds. In this paper, we present the design of a new compound for cardiomyocyte differentiation using simple regression models trained with a data set containing only 80 examples. We introduce decorated shape descriptors, an information-rich molecular feature representation that integrates both shape and hydrophilicity information. These models demonstrate improved performance compared to ones using standard descriptors based on shape alone. Model overtraining is diagnosed using a type of sensitivity analysis. Our compound is designed using a conservative strategy, and its effectiveness is confirmed through expression profiles of cardiomyocyte-related marker genes using real-time polymerase chain reaction experiments on human iPS cell lines. This work demonstrates a viable data-driven strategy for designing compounds for stem cell differentiation protocols and will be useful in situations where training data is limited.

Language: English

Machine learning-guided strategies for reaction conditions design and optimization
Lung-Yi Chen, Yi‐Pei Li

Beilstein Journal of Organic Chemistry, Year: 2024, Number: 20, pp. 2476-2492

Published: Oct. 4, 2024

This review surveys the recent advances and challenges in predicting and optimizing reaction conditions using machine learning techniques. The paper emphasizes the importance of acquiring and processing large and diverse datasets of chemical reactions, and the use of both global and local models to guide the design of synthetic processes. Global models exploit information from comprehensive databases to suggest general reaction conditions for new reactions, while local models fine-tune the specific parameters for a given reaction family to improve yield and selectivity. The paper also identifies the current limitations and opportunities in this field, such as data quality and availability, and the integration of high-throughput experimentation. The paper demonstrates how the combination of chemical engineering, data science, and ML algorithms can enhance the efficiency and effectiveness of reaction conditions design, and enable novel discoveries in synthetic chemistry.

Language: English

Cited by: 7

Machine Learning for Reaction Performance Prediction in Allylic Substitution Enhanced by Automatic Extraction of a Substrate-Aware Descriptor
Gufeng Yu, Xi Wang, Yugong Luo, et al.

Journal of Chemical Information and Modeling, Year: 2025, Number: 65(1), pp. 312-325

Published: Jan. 2, 2025

Despite remarkable advancements in the organic synthesis field facilitated by the use of machine learning (ML) techniques, the prediction of reaction outcomes, including yield estimation, catalyst optimization, and mechanism identification, continues to pose a significant challenge. This challenge arises primarily from the lack of appropriate descriptors capable of retaining crucial molecular information for accurate prediction while also ensuring computational efficiency. This study presents a successful application of ML for predicting the performance of Ir-catalyzed allylic substitution reactions. We introduce SubA, an innovative substrate-aware descriptor that is inspired by the fact that specific atoms or motifs in the reactants drive reaction outcomes. By employing graph matching algorithms for backbone identification and incorporating atomic properties derived from density functional theory calculations, SubA extracts essential information at both the atomic level and the molecular level. Compared with four mainstream descriptors, SubA achieves reduced dimensionality and enhanced accuracy, with over 2% mean absolute error reduction in random and scaffold splitting evaluations. It also demonstrates better generalization when confronted with previously unreported substrate combinations in extended experiments. Furthermore, interpretable analysis shows that the predictor focuses on key features, offering insights into reaction mechanisms.

Language: English

Cited by: 0

Unveiling high-performance hosts for blue OLEDs via deep learning and high-throughput virtual screening
Sunghyuck An, Young Hun Jung, Gunwook Nam, et al.

Chemical Engineering Journal, Year: 2025, Number: unknown, p. 159697

Published: Jan. 1, 2025

Language: English

Cited by: 0

Enhancing Activation Energy Predictions under Data Constraints Using Graph Neural Networks
Han-Chung Chang, Ming‐Hsuan Tsai, Yi‐Pei Li, et al.

Journal of Chemical Information and Modeling, Year: 2025, Number: unknown

Published: Jan. 25, 2025

Accurately predicting activation energies is crucial for understanding chemical reactions and modeling complex reaction systems. However, the high computational cost of quantum chemistry methods often limits the feasibility of large-scale studies, leading to a scarcity of high-quality activation energy data. In this work, we explore and compare three innovative approaches (transfer learning, delta learning, and feature engineering) to enhance the accuracy of activation energy predictions using graph neural networks, specifically focusing on methods that incorporate low-cost, low-level computational data. Using the Chemprop model, we systematically evaluated how these approaches leverage data from semiempirical quantum mechanics (SQM) calculations to improve predictions. Delta learning, which adjusts SQM activation energies to align with high-level CCSD(T)-F12a targets, emerged as the most effective method, achieving high accuracy with substantially reduced high-level data requirements. Notably, delta learning trained with just 20–30% of the high-level data matched or exceeded the performance of the other methods trained with full data sets, making it advantageous in data-scarce scenarios. However, its reliance on transition state searches imposes significant computational demands during model application. Transfer learning, which pretrains models on large sets of low-level data, provided mixed results, particularly when there was a mismatch in distributions between the training and target data sets. Feature engineering, which involves adding computed molecular properties as input features, showed modest gains, particularly with thermodynamic properties. Our study highlights the trade-offs between accuracy and computational demand in selecting the best approach for enhancing activation energy predictions. These insights provide valuable guidelines for researchers aiming to apply machine learning to chemical reactions, helping to balance accuracy with resource constraints.

Language: English

Cited by: 0
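
The delta learning idea summarized in the abstract above (learn the correction from a low-level estimate to a high-level target, then add it back) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy numbers are invented, and a one-feature least-squares line stands in for the graph neural network used in the paper.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b with a single feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


# Hypothetical toy data (kcal/mol): cheap low-level estimates and
# expensive high-level reference values for the same reactions.
low = [10.0, 20.0, 30.0, 40.0]
high = [12.0, 23.0, 34.0, 45.0]

# Delta learning: the model is trained on the correction (high - low),
# not on the high-level target itself.
deltas = [h - l for h, l in zip(high, low)]
a, b = fit_line(low, deltas)


def predict_high(low_value):
    """Low-level value plus the learned correction."""
    return low_value + (a * low_value + b)


print(predict_high(50.0))
```

Note that a prediction still needs the low-level value as input, which mirrors the cost at application time that the abstract mentions.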

Computational Tools for the Prediction of Site- and Regioselectivity of Organic Reactions
Lukas M. Sigmund, Michele Assante, Magnus J. Johansson, et al.

Chemical Science, Year: 2025, Number: unknown

Published: Jan. 1, 2025

This article reviews computational tools for the prediction of regio- and site-selectivity of organic reactions. It spans from quantum chemical procedures to deep learning models and showcases the application of the presented tools.

Language: English

Cited by: 0

Improved Solubility Predictions in scCO2 Using Thermodynamics-Informed Machine Learning Models
Dmitriy M. Makarov, Nikolai N. Kalikin, Yury A. Budkov, et al.

Journal of Chemical Information and Modeling, Year: 2025, Number: unknown

Published: Apr. 15, 2025

Accurate solubility prediction in supercritical carbon dioxide (scCO2) is crucial for optimizing experimental design by eliminating unnecessary and costly trials at an early stage, thereby streamlining the workflow. A comprehensive database containing 31,975 records has been compiled, providing a foundation for developing predictive models applicable to a diverse class of chemical compounds, with a particular focus on drug-like substances. In this study, we propose a domain-aware machine learning approach that incorporates thermodynamic properties governing phase transitions into solubility predictions in scCO2. Predictive models were developed using the CatBoost algorithm and a graph-based architecture employing directed message passing to identify the most effective approach. Furthermore, auxiliary properties of the solute, including melting point, critical parameters, enthalpy of vaporization, and Gibbs free energy of solvation, were predicted as part of this work. The findings underscore the efficacy of incorporating domain-specific features to enhance the accuracy of scCO2 solubility modeling. Model interpretation and applicability domain assessment have confirmed the qualitative selection of the employed descriptors, demonstrating their ability to generalize to unique compounds that fall outside the defined applicability domain.

Language: English

Cited by: 0

Uncovering ion transport mechanisms in ionic liquids using data science
J. E. Umaña, Ryan K. Cashen, Víctor M. Zavala, et al.

Digital Discovery, Year: 2025, Number: unknown

Published: Jan. 1, 2025

Integration of data science tools with physics-informed scaling analysis reveals new descriptors, theories, and molecular design principles.

Language: English

Cited by: 0

Data-Driven Kinetic Reaction Networks for Separation Chemistry
Jiyoung Lee, Logan J. Augustine, Graeme Henkelman, et al.

Journal of Chemical Theory and Computation, Year: 2025, Number: unknown

Published: May 13, 2025

Understanding complex, multistep chemical reactions at the molecular level is a major challenge whose solution would greatly benefit the design and optimization of numerous chemical processes. The separation of rare-earth (4f) and actinide (5f) elements is an example where improving our understanding of the chemical reactions is important for designing and optimizing new separation chemistries, even with a limited number of observations. In this work, we leverage data-driven artificial intelligence and machine-learning approaches to develop kinetic reaction networks that describe the liquid-liquid extraction mechanism of uranium using N,N-di-2-ethylhexyl-isobutyramide (DEHiBA). Specifically, we compare and contrast the properties of two classes of models: (1) purely data-driven models that are chemistry-agnostic and regularized using L1 regression, and (2) chemistry-informed models that use relative energies provided by quantum mechanical calculations. We observe that the purely data-driven models are unbiased, simple, and accurate in their predictions of experimental measurements when sufficient data are available, but are difficult to fully constrain and interpret. In contrast, the chemistry-informed models exhibit significantly improved interpretability and consistency, providing a detailed description of the extraction process while achieving high accuracy through ensemble averaging. Overall, the dominant species predicted to be extracted into the organic phase is UO2(NO3)2(DEHiBA)2, agreeing with slope analysis, thermodynamic modeling, EXAFS, and crystal structures. This work demonstrates that leveraging the fundamental structure of the chemical problem can lead to efficient learning schemes that provide both chemical insights and low computational cost.

Language: English

Cited by: 0

ASKCOS: Open-Source, Data-Driven Synthesis Planning
Zhengkai Tu, Sourabh J. Choure, Mun Hong Fong, et al.

Accounts of Chemical Research, Year: 2025, Number: unknown

Published: May 21, 2025

Conspectus: The advancement of machine learning and the availability of large-scale reaction datasets have accelerated the development of data-driven models for computer-aided synthesis planning (CASP) in the past decade. In this Account, we describe the range of methods that have been incorporated into the newest version of ASKCOS, an open-source software suite for synthesis planning that we have been developing since 2016. This ongoing effort has been driven by the importance of bridging the gap between research and development, making research advances available through a freely available practical tool. ASKCOS integrates modules for retrosynthetic planning, modules for the complementary capabilities of condition prediction and reaction product prediction, and several supplementary utilities with various roles in synthesis planning. For retrosynthetic planning, we have developed an Interactive Path Planner (IPP) for user-guided search as well as a Tree Builder for automatic planning with two well-known tree search algorithms, Monte Carlo Tree Search (MCTS) and Retro*. Four one-step retrosynthesis models covering template-based and template-free strategies form the basis of retrosynthetic predictions and can be used simultaneously to combine their advantages and propose diverse suggestions. Strategies for assessing the feasibility of proposed reaction steps and evaluating full pathways are built on top of pioneering efforts made in the subtasks of reaction condition recommendation, pathway scoring and clustering, and the prediction of reaction outcomes including the major product, impurities, site selectivity, and regioselectivity. In addition, we have also developed auxiliary capabilities based on our work on solubility prediction and quantum mechanical descriptor prediction, which can provide more insight into the suitability of proposed reaction solvents or the hypothetical selectivity of desired transformations. For each of these capabilities, we highlight its relevance in the context of synthesis planning and present a comprehensive overview of how it is built on top of not only our work but also other recent advancements in the field. We also detail how chemists can easily interact with these capabilities via user-friendly interfaces. ASKCOS has assisted hundreds of medicinal, synthetic, and process chemists in their day-to-day tasks by complementing expert decision making in synthesis planning and route ideation. It is our belief that CASP tools are an important part of modern chemistry research and offer ever-increasing utility and accessibility.

Language: English

Cited by: 0

Multi-fidelity graph neural networks for predicting toluene/water partition coefficients
Thomas Nevolianis, Jan G. Rittig, Alexander Mitsos, et al.

Published: Aug. 8, 2024

Accurate prediction of toluene/water partition coefficients of neutral species is crucial in drug discovery and separation processes; however, data-driven modeling of these coefficients remains challenging due to limited available experimental data. To address the limitation of available data, we apply multi-fidelity learning approaches leveraging a quantum chemical dataset (low fidelity) of approximately 9000 entries generated by COSMO-RS and an experimental dataset (high fidelity) of about 250 entries collected from the literature. We explore transfer learning, feature-augmented learning, and multi-target learning in combination with graph neural networks, validating them on two external datasets: one with molecules similar to the training data (EXT-Zamora) and one with more complex molecules (EXT-SAMPL9). Our results show that multi-fidelity learning significantly improves predictive accuracy, achieving a Root-Mean-Square Error (RMSE) of 0.44 logP units for EXT-Zamora, compared to an RMSE of 0.63 for single-task models. For the EXT-SAMPL9 dataset, it achieves an RMSE of 1.02 logP units, indicating reasonable performance even for more complex molecular structures. These findings highlight the potential of multi-fidelity learning approaches that leverage quantum chemical data to improve partition coefficient predictions and address the challenges posed by limited experimental data. We expect the applicability of the methods used to extend beyond just toluene/water partition coefficients.

Language: English

Cited by: 3
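
The abstract above reports accuracy as a Root-Mean-Square Error (RMSE) in logP units. For readers unfamiliar with the metric, the following is a minimal sketch of how it is computed; the observed and predicted values are invented for illustration and are not taken from the paper.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equally long sequences."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical logP values for four molecules (illustrative only).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 2.5, 3.5]

# Every prediction is off by 0.5, so the RMSE is 0.5 logP units.
print(rmse(predicted, observed))
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones, which is one reason it is a common headline metric for property prediction.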