Generator of Neural Network Potential for Molecular Dynamics: Constructing Robust and Accurate Potentials with Active Learning for Nanosecond-Scale Simulations

Naoki Matsumura, Yuta Yoshimoto, Tamio Yamazaki

et al.

Journal of Chemical Theory and Computation, Journal Year: 2025, Volume and Issue: unknown

Published: April 7, 2025

Neural network potentials (NNPs) enable large-scale molecular dynamics (MD) simulations of systems containing >10,000 atoms with accuracy comparable to ab initio methods and play a crucial role in material studies. Although NNPs are valuable for short-duration MD simulations, maintaining the stability of long-duration MD simulations remains challenging due to uncharted regions of the potential energy surface (PES). Currently, there is no effective methodology to address this issue. To overcome this challenge, we developed an automatic generator of robust and accurate NNPs based on an active learning (AL) framework. This generator provides a fully integrated solution encompassing initial data set creation, NNP training, evaluation, sampling of additional structures, screening, and labeling. Crucially, our approach uses a sampling strategy that focuses on generating unstable structures with short interatomic distances, combined with a screening strategy that efficiently samples these configurations based on interatomic distances and structural features. This approach greatly enhances MD simulation stability, enabling nanosecond-scale simulations. We evaluated the performance of the NNP generator in terms of its MD simulation stability and physical properties by applying it to liquid propylene glycol (PG) and polyethylene glycol (PEG). The generated NNPs enabled stable MD simulations for 20 ns. The predicted physical properties, such as the density and self-diffusion coefficient, show excellent agreement with experimental values. This work represents a remarkable advance in the generation of robust and accurate NNPs for organic materials, paving the way for long-duration MD simulations of complex systems.

Language: English

Representations of Materials for Machine Learning (Creative Commons)

James Damewood, Jessica Karaguesian, Jaclyn R. Lunger

et al.

Annual Review of Materials Research, Journal Year: 2023, Volume and Issue: 53(1), P. 399 - 426

Published: April 18, 2023

High-throughput data generation methods and machine learning (ML) algorithms have given rise to a new era of computational materials science by learning the relations between composition, structure, and properties and by exploiting such relations for design. However, to build these connections, materials data must be translated into a numerical form, called a representation, that can be processed by an ML model. Data sets in materials science vary in format (ranging from images to spectra), size, and fidelity. Predictive models vary in scope and properties of interest. Here, we review context-dependent strategies for constructing representations that enable the use of materials as inputs or outputs of ML models. Furthermore, we discuss how modern ML techniques can learn and transfer chemical and physical information between tasks. Finally, we outline high-impact questions that have not been fully resolved and thus require further investigation.

Language: English

Citations

51

Fast evaluation of the adsorption energy of organic molecules on metals via graph neural networks (Creative Commons)
Sergio Pablo‐García, Santiago Morandi, Rodrigo A. Vargas–Hernández

et al.

Nature Computational Science, Journal Year: 2023, Volume and Issue: 3(5), P. 433 - 442

Published: May 1, 2023

Modeling in heterogeneous catalysis requires the extensive evaluation of the energy of molecules adsorbed on surfaces. This is done via density functional theory, but for large organic molecules it requires enormous computational time, compromising the viability of the approach. Here we present GAME-Net, a graph neural network to quickly evaluate the adsorption energy. GAME-Net is trained on a well-balanced chemically diverse dataset with C

Language: English

Citations

47

Exploiting redundancy in large materials datasets for efficient machine learning with less data (Creative Commons)
Kangming Li, Daniel Persaud, Kamal Choudhary

et al.

Nature Communications, Journal Year: 2023, Volume and Issue: 14(1)

Published: Nov. 10, 2023

Extensive efforts to gather materials data have largely overlooked potential data redundancy. In this study, we present evidence of a significant degree of redundancy across multiple large datasets for various material properties, by revealing that up to 95% of the data can be safely removed from machine learning training with little impact on in-distribution prediction performance. The redundant data is related to over-represented material types and does not mitigate the severe performance degradation on out-of-distribution samples. In addition, we show that uncertainty-based active learning algorithms can construct much smaller but equally informative datasets. We discuss the effectiveness of informative data in improving robustness and provide insights into efficient data acquisition and machine learning training. This work challenges the "bigger is better" mentality and calls for attention to the information richness of materials data rather than a narrow emphasis on data volume.

Language: English

Citations

44

The Open DAC 2023 Dataset and Challenges for Sorbent Discovery in Direct Air Capture (Creative Commons)
Anuroop Sriram, Sihoon Choi, Xiaohan Yu

et al.

ACS Central Science, Journal Year: 2024, Volume and Issue: 10(5), P. 923 - 941

Published: May 1, 2024

Direct air capture (DAC) of CO

Language: English

Citations

23

Unlocking the potential: machine learning applications in electrocatalyst design for electrochemical hydrogen energy transformation (Creative Commons)
Rui Ding, Junhong Chen, Yuxin Chen

et al.

Chemical Society Reviews, Journal Year: 2024, Volume and Issue: unknown

Published: Jan. 1, 2024

This review explores machine learning's impact on designing electrocatalysts for hydrogen energy, detailing how it transcends traditional methods by utilizing experimental and computational data to enhance electrocatalyst efficiency and discovery.

Language: English

Citations

20

A reactive neural network framework for water-loaded acidic zeolites (Creative Commons)
Andreas Erlebach, Martin Šípka, Indranil Saha

et al.

Nature Communications, Journal Year: 2024, Volume and Issue: 15(1)

Published: May 17, 2024

Language: English

Citations

17

Machine learning for CO2 capture and conversion: A review (Creative Commons)
Sung Eun Jerng, Yang Jeong Park, Ju Li

et al.

Energy and AI, Journal Year: 2024, Volume and Issue: 16, P. 100361 - 100361

Published: March 30, 2024

Coupled electrochemical systems for the direct capture and conversion of CO2 have garnered significant attention owing to their potential to enhance energy- and cost-efficiency by circumventing the amine regeneration step. However, optimizing the coupled system is more challenging than handling separated systems because of its complexity, caused by the incorporation of solvent and heterogeneous catalysts. Nevertheless, the deployment of machine learning can be immensely beneficial, reducing both time and cost owing to its ability to simulate and describe complex systems with numerous parameters involved. In this review, we summarized the machine learning techniques employed in the development of solvents such as ionic liquids, as well as heterogeneous catalysts. To optimize a coupled system, these two separately developed systems will need to be combined via machine learning techniques in the future.

Language: English

Citations

16

JARVIS-Leaderboard: a large scale benchmark of materials design methods (Creative Commons)
Kamal Choudhary, Daniel Wines, Kangming Li

et al.

npj Computational Materials, Journal Year: 2024, Volume and Issue: 10(1)

Published: May 7, 2024

Lack of rigorous reproducibility and validation are significant hurdles for scientific development across many fields. Materials science, in particular, encompasses a variety of experimental and theoretical approaches that require careful benchmarking. Leaderboard efforts have been developed previously to mitigate these issues. However, a comprehensive comparison and benchmarking on an integrated platform with multiple data modalities, covering both perfect and defect materials data, is still lacking. This work introduces JARVIS-Leaderboard, an open-source and community-driven platform that facilitates benchmarking and enhances reproducibility. The platform allows users to set up benchmarks with custom tasks and enables contributions in the form of dataset, code, and meta-data submissions. We cover the following materials design categories: Artificial Intelligence (AI), Electronic Structure (ES), Force-fields (FF), Quantum Computation (QC), and Experiments (EXP). For AI, we cover several types of input data, including atomic structures, atomistic images, spectra, and text. For ES, we consider multiple ES approaches, software packages, pseudopotentials, materials, and properties, comparing results to experiment. For FF, we compare multiple approaches for material property predictions. For QC, we benchmark Hamiltonian simulations using various quantum algorithms and circuits. Finally, for experiments, we use the inter-laboratory approach to establish benchmarks. There are 1281 contributions to 274 benchmarks using 152 methods with more than 8 million data points, and the leaderboard is continuously expanding. The JARVIS-Leaderboard is available at the website: https://pages.nist.gov/jarvis_leaderboard/

Language: English

Citations

16

Big Data in a Nano World: A Review on Computational, Data-Driven Design of Nanomaterials Structures, Properties, and Synthesis (Creative Commons)
Ruoxi Yang, Caitlin A. McCandler, Oxana Andriuc

et al.

ACS Nano, Journal Year: 2022, Volume and Issue: 16(12), P. 19873 - 19891

Published: Nov. 15, 2022

The recent rise of computational, data-driven research has significant potential to accelerate materials discovery. Automated workflows and materials databases are being rapidly developed, contributing to high-throughput data of bulk materials that are growing in quantity and complexity, allowing for correlation between structural-chemical features and functional properties. In contrast, computational data-driven approaches are still relatively rare for nanomaterials discovery due to the rapid scaling of computational cost for finite systems. However, the distinct behaviors at the nanoscale as compared to the parent bulk materials and the vast tunability space with respect to dimensionality and morphology motivate the development of data sets for nanometric materials. In this review, we discuss the recent progress in data-driven research in two aspects: functional materials design and guided synthesis, including commonly used metrics and approaches for designing materials properties and predicting synthesis routes. More importantly, we discuss the distinct behaviors of materials as a result of nanosizing and the implications for data-driven research. Finally, we share our perspectives on future directions for extending the current data-driven research into the nano realm.

Language: English

Citations

43

AdsorbML: a leap in efficiency for adsorption energy calculations using generalizable machine learning potentials (Creative Commons)

Janice Lan, Aini Palizhati, Muhammed Shuaibi

et al.

npj Computational Materials, Journal Year: 2023, Volume and Issue: 9(1)

Published: Sept. 22, 2023

Computational catalysis is playing an increasingly significant role in the design of catalysts across a wide range of applications. A common task for many computational methods is the need to accurately compute the adsorption energy for an adsorbate and a catalyst surface of interest. Traditionally, the identification of low-energy adsorbate-surface configurations relies on heuristic methods and researcher intuition. As the desire to perform high-throughput screening increases, it becomes challenging to use heuristics and intuition alone. In this paper, we demonstrate that machine learning potentials can be leveraged to identify low-energy adsorbate-surface configurations more efficiently. Our algorithm provides a spectrum of trade-offs between accuracy and efficiency, with one balanced option finding the lowest energy configuration 87.36% of the time, while achieving a ~2000× speedup in computation. To standardize benchmarking, we introduce the Open Catalyst Dense dataset containing nearly 1000 diverse surfaces and ~100,000 unique configurations.

Language: English

Citations

33