Improving the Reliability of, and Confidence in, DFT Functional Benchmarking through Active Learning
Javier Emilio Alfonso Ramos, Carlo Adamo, Éric Brémond, et al.

Journal of Chemical Theory and Computation, 2025 (volume and issue not given)

Published: Feb. 2, 2025

Validating the performance of exchange-correlation functionals is vital to ensure the reliability of density functional theory (DFT) calculations. Typically, these validations involve benchmarking data sets. Currently, such data sets are usually assembled in an unprincipled manner, suffering from uncontrolled chemical bias and limiting the transferability of benchmarking results to a broader chemical space. In this work, a data-efficient solution based on active learning is explored to address this issue. Focusing, as a proof of principle, on pericyclic reactions, we start from the BH9 benchmarking set and design a chemical reaction space around this initial data set by combinatorially combining reaction templates and substituents. Next, a surrogate model is trained to predict the standard deviation of the activation energies computed across a selection of 20 distinct DFT functionals. With this model, the designed chemical reaction space is explored, enabling the identification of challenging regions, i.e., regions with large functional divergence, for which representative reactions are subsequently acquired as additional training points. Remarkably, it turns out that the function mapping molecular structure to functional divergence is readily learnable; convergence is reached upon the acquisition of fewer than 100 reactions. With our final updated model, a more challenging, and arguably more representative, pericyclic benchmarking data set is curated, and we demonstrate that functional performance has changed significantly compared to the original BH9 subset.
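
The active-learning loop described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the descriptors are random vectors, the "DFT" energies are a synthetic function, the surrogate is a closed-form ridge regression, and the acquisition rule simply takes the candidates with the highest predicted divergence (the abstract speaks of acquiring representative reactions from challenging regions, which is a more careful selection). Only the count of 20 functionals and the budget of fewer than 100 acquired reactions come from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

N_FUNCTIONALS = 20   # taken from the abstract
N_FEATURES = 8       # hypothetical descriptor length
N_CANDIDATES = 500   # hypothetical size of the designed reaction space

# Hypothetical descriptors for each candidate reaction (template + substituents).
X = rng.normal(size=(N_CANDIDATES, N_FEATURES))

# Synthetic stand-in for DFT: each mock "functional" weights the descriptors differently.
_W = rng.normal(size=(N_FEATURES, N_FUNCTIONALS))


def functional_divergence(idx):
    """Standard deviation of mock activation energies across the functionals."""
    energies = 20.0 + np.tanh(X[idx] @ _W)  # shape: (len(idx), N_FUNCTIONALS)
    return energies.std(axis=1)


def fit_ridge(X_train, y_train, alpha=1.0):
    """Closed-form ridge regression with an appended bias column."""
    A = np.hstack([X_train, np.ones((len(X_train), 1))])
    return np.linalg.solve(A.T @ A + alpha * np.eye(A.shape[1]), A.T @ y_train)


def predict(w, X_new):
    """Predicted divergence for new descriptor rows."""
    return np.hstack([X_new, np.ones((len(X_new), 1))]) @ w


def active_learning(n_init=10, batch=10, budget=90):
    """Grow the labeled set by acquiring the highest predicted-divergence candidates."""
    labeled = list(rng.choice(N_CANDIDATES, size=n_init, replace=False))
    y = list(functional_divergence(labeled))
    while len(labeled) + batch <= budget:
        w = fit_ridge(X[labeled], np.array(y))
        pool = np.setdiff1d(np.arange(N_CANDIDATES), labeled)  # unlabeled candidates
        scores = predict(w, X[pool])
        picks = pool[np.argsort(scores)[-batch:]]  # top predicted divergence
        labeled.extend(picks)
        y.extend(functional_divergence(picks))  # "run DFT" on the acquired reactions
    w = fit_ridge(X[labeled], np.array(y))  # final updated surrogate
    return w, np.array(labeled), np.array(y)


w, labeled, y = active_learning()
print(f"acquired {len(labeled)} reactions; mean divergence {y.mean():.3f}")
```

In a real workflow the call to `functional_divergence` would be replaced by actual calculations with each functional, and the loop would stop on a convergence criterion for the surrogate rather than on a fixed budget.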

Language: English

Citations: 0