SMART DATA FACTORY: VOLUNTEER COMPUTING PLATFORM FOR ACTIVE LEARNING-DRIVEN MOLECULAR DATA ACQUISITION
Tsolak Ghukasyan, Vahagn Altunyan, Aram Bughdaryan, et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: Oct. 25, 2024

This paper presents the Smart Distributed Data Factory (SDDF), an AI-driven distributed computing platform designed to address challenges in drug discovery by creating comprehensive datasets of molecular conformations and their properties. SDDF uses volunteer computing, leveraging the processing power of personal computers worldwide to accelerate quantum chemistry (DFT) calculations. To tackle the vast chemical space and limited high-quality data, SDDF employs an ensemble of machine learning models to predict molecular properties and selectively choose the most challenging data points for further DFT calculations. The platform also generates new molecular conformations using molecular dynamics with forces derived from these models. SDDF makes several contributions: a volunteer computing platform for DFT calculations; an active learning framework for constructing a dataset of molecular conformations; a large public dataset of diverse ENAMINE molecules with calculated energies; and an ensemble of state-of-the-art ML models for accurate energy prediction. The dataset was generated to validate the SDDF approach of reducing the need for extensive calculations. With its strict scaffold split, the dataset can be used for training and benchmarking energy models. By combining active learning, distributed computing, and quantum chemistry, SDDF offers a scalable, cost-effective solution for developing accurate molecular models and ultimately accelerating drug discovery.

Language: English

MOLPIPx: An end-to-end differentiable package for permutationally invariant polynomials in Python and Rust
Manuel S. Drehwald, Asma Jamali, Rodrigo A. Vargas–Hernández, et al.

The Journal of Chemical Physics, Journal Year: 2025, Volume and Issue: 162(8)

Published: Feb. 28, 2025

In this work, we present MOLPIPx, a versatile library designed to seamlessly integrate permutationally invariant polynomials with modern machine learning frameworks, enabling the efficient development of linear models, neural networks, and Gaussian process models. These methodologies are widely employed for parameterizing potential energy surfaces across diverse molecular systems. MOLPIPx leverages two powerful automatic differentiation engines, JAX and EnzymeAD-Rust, to facilitate the computation of gradients and higher-order derivatives, which are essential for tasks such as force field development and dynamic simulations. MOLPIPx is available at https://github.com/ChemAI-Lab/molpipx.

Language: English

Citations: 1

Revolutionizing Molecular Design for Innovative Therapeutic Applications through Artificial Intelligence
Ahrum Son, Jongham Park, Woojin Kim, et al.

Molecules, Journal Year: 2024, Volume and Issue: 29(19), P. 4626 - 4626

Published: Sept. 29, 2024

The field of computational protein engineering has been transformed by recent advancements in machine learning, artificial intelligence, and molecular modeling, enabling the design of proteins with unprecedented precision and functionality. Computational methods now play a crucial role in enhancing the stability, activity, and specificity of proteins for diverse applications in biotechnology and medicine. Techniques such as deep learning, reinforcement learning, and transfer learning have dramatically improved protein structure prediction, optimization of binding affinities, and enzyme design. These innovations have streamlined the protein engineering process by allowing rapid generation of targeted libraries, reducing experimental sampling, and enabling the rational design of proteins with tailored properties. Furthermore, the integration of computational approaches with high-throughput experimental techniques has facilitated the development of multifunctional proteins and novel therapeutics. However, challenges remain in bridging the gap between computational predictions and experimental validation and in addressing ethical concerns related to AI-driven protein design. This review provides a comprehensive overview of the current state and future directions of computational methods in protein engineering, emphasizing their transformative potential in creating next-generation biologics and advancing synthetic biology.

Language: English

Citations: 5

Random Sampling Versus Active Learning Algorithms for Machine Learning Potentials of Quantum Liquid Water
Nore Stolte, János Daru, Harald Forbert, et al.

Journal of Chemical Theory and Computation, Journal Year: 2025, Volume and Issue: unknown

Published: Jan. 14, 2025

Training accurate machine learning potentials requires electronic structure data comprehensively covering the configurational space of the system of interest. As the construction of this data is computationally demanding, many schemes for identifying the most important structures have been proposed. Here, we compare the performance of high-dimensional neural network potentials (HDNNPs) for quantum liquid water at ambient conditions trained to data sets constructed using random sampling as well as various flavors of active learning based on query by committee. Contrary to the common understanding of active learning, we find that for a given data set size, random sampling leads to smaller test errors for structures not included in the training process. In our analysis, we show that this can be related to small energy offsets caused by a bias in the structures added in active learning, which can be overcome by using instead correlations as an error measure that is invariant to such shifts. Still, all HDNNPs yield very similar and accurate structural properties of quantum liquid water, which demonstrates the robustness of the training procedure with respect to the training set construction algorithm even when trained to as few as 200 structures. However, we find that for active learning based on preliminary potentials, a reasonable initial data set is important to avoid an unnecessary extension of the covered configuration space to less relevant regions.
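The abstract's point about energy offsets can be illustrated with a small numerical sketch. The data below are synthetic, the offset value is an assumption chosen for illustration, and the Pearson correlation is used only as one example of a shift-invariant measure; the paper's exact error measure is not given in this listing.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic reference energies and model predictions (arbitrary units).
e_ref = rng.normal(size=200)
e_pred = e_ref + 0.05 * rng.normal(size=200)


def rmse(a, b):
    """Root-mean-square error; sensitive to a constant shift."""
    return float(np.sqrt(np.mean((a - b) ** 2)))


def pearson(a, b):
    """Pearson correlation coefficient; unchanged by a constant shift."""
    return float(np.corrcoef(a, b)[0, 1])


# Hypothetical constant energy offset added to every prediction.
offset = 0.5
e_shifted = e_pred + offset

rmse_plain = rmse(e_ref, e_pred)
rmse_shifted = rmse(e_ref, e_shifted)
r_plain = pearson(e_ref, e_pred)
r_shifted = pearson(e_ref, e_shifted)

print(f"RMSE without / with offset: {rmse_plain:.3f} / {rmse_shifted:.3f}")
print(f"Pearson r without / with offset: {r_plain:.6f} / {r_shifted:.6f}")
```

In this sketch the RMSE grows once the offset is added, while the correlation coefficient stays the same up to floating-point rounding.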

Language: English

Citations: 0

AI-empowered digital design of zeolites: Progress, challenges, and perspectives
Mengfan Wu, Shiyi Zhang, Jie Ren, et al.

APL Materials, Journal Year: 2025, Volume and Issue: 13(2)

Published: Feb. 1, 2025

The rise of artificial intelligence (AI) as a powerful research tool in materials science has been extensively acknowledged. Particularly, exploring zeolites with target properties is of vital significance for industrial applications, and integrating AI technologies into zeolite design undoubtedly brings immense promise for the advancements in this field. Here, we provide a comprehensive review of the AI-empowered digital design of zeolites. It showcases the state-of-the-art progress in predicting zeolite-related properties, employing machine learning potentials for zeolite simulations, using generative models for inverse design, and aiding the experimental synthesis of zeolites. The challenges and perspectives are also discussed, emphasizing the new opportunities at the intersection of AI technologies and zeolites. This review is expected to offer crucial guidance for advancing innovations in materials science through AI in the future.

Language: English

Citations: 0

Smart distributed data factory volunteer computing platform for active learning-driven molecular data acquisition
Tsolak Ghukasyan, Vahagn Altunyan, Aram Bughdaryan, et al.

Scientific Reports, Journal Year: 2025, Volume and Issue: 15(1)

Published: Feb. 28, 2025

This paper presents the smart distributed data factory (SDDF), an AI-driven distributed computing platform designed to address challenges in drug discovery by creating comprehensive datasets of molecular conformations and their properties. SDDF uses volunteer computing, leveraging the processing power of personal computers worldwide to accelerate quantum chemistry (DFT) calculations. To tackle the vast chemical space and limited high-quality data, SDDF employs an ensemble of machine learning (ML) models to predict molecular properties and selectively choose the most challenging data points for further DFT calculations. The platform also generates new molecular conformations using molecular dynamics with forces derived from these models. SDDF makes several contributions: a volunteer computing platform for DFT calculations; an active learning framework for constructing a dataset of molecular conformations; a large public dataset of diverse ENAMINE molecules with calculated energies; and an ensemble of ML models for accurate energy prediction. The dataset was generated to validate the SDDF approach of reducing the need for extensive calculations. With its strict scaffold split, the dataset can be used for training and benchmarking energy models. By combining active learning, distributed computing, and quantum chemistry, SDDF offers a scalable, cost-effective solution for developing accurate molecular models and ultimately accelerating drug discovery.
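The idea of using an ensemble to pick the most challenging data points can be sketched as follows. This is a minimal illustration, not SDDF's actual selection rule: the abstract does not state the criterion, so the ensemble standard deviation used here is an assumption, and `select_most_uncertain` and the example predictions are hypothetical.

```python
import numpy as np


def select_most_uncertain(ensemble_preds, k):
    """Return indices of the k candidates with the largest ensemble spread.

    ensemble_preds has shape (n_models, n_candidates); each row holds one
    model's predicted energies for all candidate conformations.
    """
    preds = np.asarray(ensemble_preds, dtype=float)
    # Disagreement between ensemble members, assumed here as the
    # standard deviation across models for each candidate.
    disagreement = preds.std(axis=0)
    # Sort candidates from largest to smallest disagreement.
    order = np.argsort(disagreement)[::-1]
    return order[:k]


# Hypothetical predictions from three models for four candidates.
preds = np.array([
    [1.00, 2.00, 3.00, 4.00],
    [1.01, 2.50, 3.00, 3.00],
    [0.99, 1.50, 3.01, 5.00],
])

# Candidates 3 and 1 show the largest spread and would be sent for DFT.
picked = select_most_uncertain(preds, k=2)
print(picked)
```

Candidates on which the models agree closely are skipped, which is how this kind of scheme reduces the number of expensive reference calculations.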

Language: English

Citations: 0

Transferable machine learning model for multi-target nanoscale simulations in hydrogen-carbon system from crystal to amorphous
Weiqi Chen, Zhiyue Xu, Kang Wang, et al.

npj Computational Materials, Journal Year: 2025, Volume and Issue: 11(1)

Published: May 3, 2025

Language: English

Citations: 0

Biphasic solvents for post-combustion CO2 capture from natural gas flue gas
Alexander I. Wiechert, Gang Seob Jung, Diāna Stamberga, et al.

Chemical Engineering Journal, Journal Year: 2025, Volume and Issue: unknown, P. 163351 - 163351

Published: May 1, 2025

Language: English

Citations: 0

Atomic Energy Accuracy of Neural Network Potentials: Harnessing Pretraining and Transfer Learning
Gang Seob Jung

Journal of Chemical Information and Modeling, Journal Year: 2025, Volume and Issue: unknown

Published: May 5, 2025

Machine learning-based interatomic potentials (MLIPs) have transformed the prediction of potential energy surfaces (PESs), achieving accuracy comparable to ab initio calculations. However, atomic energy predictions, often assumed to lack physical meaning, remain underexplored. In this study, we demonstrate that inaccuracies in atomic energy predictions reduce the robustness and transferability of Neural Network Potentials (NNPs) and that the error can be masked in total energy predictions due to error cancellation. We validate this finding using challenging configurations involving deformation and failure under tensile loading. By pretraining with empirical potentials and applying transfer learning with density functional theory (DFT) data, we achieve notable improvements in atomic energy, total energy, forces, and stress predictions. Furthermore, this approach enhances the robustness and transferability of NNPs, emphasizing the importance of atomic energy accuracy in developing high-quality and reliable MLIPs.
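The masking effect described in the abstract can be shown with simple arithmetic. The numbers below are invented for illustration and do not come from the paper: per-atom errors of opposite sign cancel when summed, so the total energy looks exact even though every atomic energy is wrong.

```python
import numpy as np

# Hypothetical per-atom reference energies for a 4-atom configuration (eV).
e_atom_ref = np.array([-3.0, -3.2, -2.9, -3.1])

# Assumed per-atom prediction errors that happen to cancel on summation.
atom_errors = np.array([0.25, -0.25, 0.5, -0.5])
e_atom_pred = e_atom_ref + atom_errors

# Total energy is the sum of atomic contributions.
total_error = abs(e_atom_pred.sum() - e_atom_ref.sum())

# Mean absolute error of the individual atomic energies.
atomic_mae = float(np.abs(e_atom_pred - e_atom_ref).mean())

print(f"total energy error: {total_error:.2e} eV")
print(f"atomic energy MAE:  {atomic_mae:.3f} eV")
```

Here the atomic energy MAE is 0.375 eV while the total energy error is zero up to floating-point rounding, so a loss or metric based only on total energy would not detect the problem.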

Language: English

Citations: 0

Enhancing High-Fidelity Neural Network Potentials through Low-Fidelity Sampling
Gang Seob Jung

Published: May 23, 2024

The efficacy of neural network potentials (NNPs) critically depends on the quality of the configurational datasets used for training. Prior research using empirical potentials has shown that well-selected liquid-solid transitional configurations of a metallic system can be translated to other metallic systems. This study demonstrates that such validated configurations can be relabeled using density functional theory (DFT) calculations, thereby enhancing the development of high-fidelity NNPs. Training strategies and sampling approaches are efficiently assessed using empirical potentials and subsequently relabeled via DFT in a highly parallelized fashion for high-fidelity NNP training. Our results reveal that relying solely on energy and force for NNP training is inadequate to prevent overfitting, highlighting the necessity of incorporating stress terms into the loss functions. To optimize training involving force and stress terms, we propose employing transfer learning to fine-tune the weights, ensuring that the potential surface is smooth for these quantities composed of energy derivatives. This approach markedly improves the accuracy of elastic constants derived from simulations in both the empirical potential-based NNP and the DFT-based NNP. Overall, this study offers significant insights into leveraging empirical potentials to expedite the development of reliable and robust NNPs at the DFT level.

Language: English

Citations: 1

Polymers Simulation using Machine Learning Interatomic Potentials
Teng Long, Li Jia, Chenlu Wang, et al.

Polymer, Journal Year: 2024, Volume and Issue: 308, P. 127416 - 127416

Published: July 17, 2024

Language: English

Citations: 1