Uncertainty Qualification for Deep Learning-Based Elementary Reaction Property Prediction
Yan Liu, Yiming Mo, Youwei Cheng, et al.

Journal of Chemical Information and Modeling, Journal Year: 2024, Volume and Issue: 64(21), P. 8131 - 8141

Published: Oct. 23, 2024

The prediction of the thermodynamic and kinetic properties of elementary reactions has shown rapid improvement due to the implementation of deep learning (DL) methods. While various studies have reported success in predicting reaction properties, the quantification of uncertainty has seldom been investigated, thus compromising confidence in using these predicted properties in practical applications. Here, we integrated graph convolutional neural networks (GCNN) with three techniques, including ensemble, Monte Carlo (MC)-dropout, and evidential learning, to provide insights into their utility. The ensemble model outperforms the others in accuracy and shows the highest reliability in estimating uncertainty across all reaction property data sets. We also verified that the ensemble model showed a satisfactory capability in recognizing epistemic and aleatoric uncertainties. Additionally, we adopted a Monte Carlo Tree Search method for extracting explainable substructures, providing a chemical explanation of the DL-predicted properties and the corresponding uncertainties. Finally, to demonstrate the utility of uncertainty qualification in practical applications, we performed an uncertainty-guided calibration of a DL-constructed kinetic model, which achieved a 25% higher hit ratio in identifying dominant reaction pathways compared with calibration without uncertainty guidance.

Language: English

An algorithmic framework for synthetic cost-aware decision making in molecular design
Jenna C. Fromer, Connor W. Coley

Nature Computational Science, Journal Year: 2024, Volume and Issue: 4(6), P. 440 - 450

Published: June 17, 2024

Language: English

Citations: 10

Graph-Based Deep Learning Models for Thermodynamic Property Prediction: The Interplay between Target Definition, Data Distribution, Featurization, and Model Architecture
Bowen Deng, Thijs Stuyver

Journal of Chemical Information and Modeling, Journal Year: 2025, Volume and Issue: unknown

Published: Jan. 9, 2025

In this contribution, we examine the interplay between target definition, data distribution, featurization approaches, and model architectures on graph-based deep learning models for thermodynamic property prediction. Through consideration of five curated data sets, exhibiting diversity in elemental composition, multiplicity, charge state, and size, we examine the impact of each of these factors on model accuracy. We observe that target definition, i.e., using formation instead of atomization energy/enthalpy, is a decisive factor, and so is a careful selection of the featurization approach. Our attempts at directly modifying [...] result in more modest, though not negligible, accuracy gains. Remarkably, molecule-level predictions tend to outperform atom-level increment predictions, in contrast to previous findings. Overall, this work paves the way toward the development of robust models with universal capabilities, which can reach excellent accuracy across data sets and compound domains.

Language: English

Citations: 0

Improving the Reliability of, and Confidence in, DFT Functional Benchmarking through Active Learning
Javier Emilio Alfonso Ramos, Carlo Adamo, Éric Brémond, et al.

Journal of Chemical Theory and Computation, Journal Year: 2025, Volume and Issue: unknown

Published: Feb. 2, 2025

Validating the performance of exchange-correlation functionals is vital to ensure the reliability of density functional theory (DFT) calculations. Typically, these validations involve benchmarking data sets. Currently, such data sets are usually assembled in an unprincipled manner, suffering from uncontrolled chemical bias, and limiting the transferability of benchmarking results to a broader chemical space. In this work, a data-efficient solution based on active learning is explored to address this issue. Focusing, as a proof of principle, on pericyclic reactions, we start from the BH9 benchmarking data set and design a chemical reaction space around this initial data set by combinatorially combining reaction templates and substituents. Next, a surrogate model is trained to predict the standard deviation of the activation energies computed across a selection of 20 distinct DFT functionals. With this model, the designed chemical reaction space is explored, enabling the identification of challenging regions, i.e., regions with large divergence, for which representative reactions are subsequently acquired as additional training points. Remarkably, it turns out that the function mapping the molecular structure to the functional divergence is readily learnable; convergence is reached upon the acquisition of fewer than 100 reactions. With our final updated model, a more challenging, and arguably more representative, pericyclic benchmarking data set is curated, and we demonstrate that functional performance has changed significantly compared to the original BH9 subset.

Language: English

Citations: 0

Repurposing quantum chemical descriptor datasets for on-the-fly generation of informative reaction representations: application to hydrogen atom transfer reactions (Creative Commons)
Javier Emilio Alfonso Ramos, Rebecca M. Neeser, Thijs Stuyver, et al.

Digital Discovery, Journal Year: 2024, Volume and Issue: 3(5), P. 919 - 931

Published: Jan. 1, 2024

In this work, we explore how existing datasets of quantum chemical properties can be repurposed to build data-efficient downstream ML models, with a particular focus on predicting the activation energy of hydrogen atom transfer reactions.

Language: English

Citations: 2
