Journal of Chemical Theory and Computation, Journal Year: 2025, Volume and Issue: unknown
Published: Feb. 2, 2025
Validating the performance of exchange-correlation functionals is vital to ensure the reliability of density functional theory (DFT) calculations. Typically, these validations involve benchmarking data sets. Currently, such sets are usually assembled in an unprincipled manner, suffering from uncontrolled chemical bias and limiting the transferability of the results to a broader chemical space. In this work, a data-efficient solution based on active learning is explored to address this issue. Focusing, as a proof of principle, on pericyclic reactions, we start from the BH9 data set and design a reaction space around this initial set by combinatorially combining reaction templates and substituents. Next, a surrogate model is trained to predict the standard deviation of the activation energies computed across a selection of 20 distinct DFT functionals. With this model, the designed reaction space is explored, enabling the identification of challenging regions, i.e., regions with large functional divergence, for which representative reactions are subsequently acquired as additional training points. Remarkably, it turns out that the function mapping molecular structure to functional divergence is readily learnable; convergence is reached upon the acquisition of fewer than 100 reactions. With our final updated model, a more challenging, and arguably more representative, pericyclic benchmarking set is curated, and we demonstrate that functional performance on this set has changed significantly compared to the original BH9 subset.
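The workflow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The template and substituent names, the synthetic `toy_barriers` oracle (a stand-in for real DFT activation energies), the similarity-weighted surrogate, the greedy top-k acquisition, and the fixed number of rounds are all assumptions made for illustration; the abstract does not specify the surrogate model, the acquisition strategy, or the convergence criterion.

```python
import itertools
import random
import statistics

# Hypothetical building blocks; placeholders, not the lists used in the paper.
TEMPLATES = ["diels_alder", "cope", "claisen", "electrocyclization"]
SUBSTITUENTS = ["H", "CH3", "F", "CN", "OCH3", "NO2"]


def build_reaction_space():
    """Combine every template with every ordered pair of substituents."""
    return list(itertools.product(TEMPLATES, SUBSTITUENTS, SUBSTITUENTS))


def toy_barriers(reaction, n_functionals=20):
    """Synthetic stand-in for activation energies from 20 DFT functionals."""
    rng = random.Random(repr(reaction))  # string seed: deterministic per reaction
    template, r1, r2 = reaction
    spread = (1.0 + TEMPLATES.index(template)
              + 0.5 * (SUBSTITUENTS.index(r1) + SUBSTITUENTS.index(r2)))
    return [rng.gauss(25.0, spread) for _ in range(n_functionals)]


def divergence(reaction):
    """Label to learn: standard deviation of the barrier across functionals."""
    return statistics.stdev(toy_barriers(reaction))


def similarity(a, b):
    """Number of shared components (template, first and second substituent)."""
    return sum(x == y for x, y in zip(a, b))


def predict(reaction, labeled):
    """Toy surrogate: similarity-weighted mean of the known divergences."""
    weights = {r: 2 ** similarity(reaction, r) for r in labeled}
    total = sum(weights.values())
    return sum(weights[r] * labeled[r] for r in labeled) / total


def active_learning(space, n_initial=5, batch_size=5, n_rounds=4, seed=0):
    """Label a random seed set, then repeatedly acquire the reactions the
    surrogate predicts to be most divergent (greedy top-k, an assumption)."""
    rng = random.Random(seed)
    labeled = {r: divergence(r) for r in rng.sample(space, n_initial)}
    for _ in range(n_rounds):
        pool = [r for r in space if r not in labeled]
        pool.sort(key=lambda r: predict(r, labeled), reverse=True)
        for r in pool[:batch_size]:
            labeled[r] = divergence(r)  # "run the DFT calculations"
    return labeled


space = build_reaction_space()
acquired = active_learning(space)
print(f"{len(acquired)} of {len(space)} reactions labeled")
```

In a real setting, `toy_barriers` would be replaced by actual DFT calculations and the fixed round count by a convergence check on the surrogate's error; the sketch only shows how enumeration, labeling, prediction, and acquisition fit together.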
Language: English