Science China Chemistry, Journal Year: 2024, Volume and Issue: 67(8), P. 2461 - 2496
Published: June 26, 2024
Language: English
Wiley Interdisciplinary Reviews Computational Molecular Science, Journal Year: 2022, Volume and Issue: 12(5)
Published: Feb. 18, 2022
Research in chemistry increasingly requires interdisciplinary work prompted by, among other things, advances in computing, machine learning, and artificial intelligence. Everyone working with molecules, whether chemist or not, needs an understanding of the representation of molecules in a machine-readable format, as this is central to computational chemistry. Four classes of representations are introduced: string, connection table, feature-based, and computer-learned representations. Three of the most significant representations are the simplified molecular-input line-entry system (SMILES), the International Chemical Identifier (InChI), and the MDL molfile, of which SMILES was the first to be successfully used in conjunction with a variational autoencoder (VAE) to yield a continuous representation of molecules. This is noteworthy because a continuous representation allows for efficient navigation of the immensely large chemical space of possible molecules. Since 2018, when the first model of this type was published, considerable effort has been put into developing novel and improved methodologies. Most, if not all, researchers in the community make their work easily accessible on GitHub, though discussion of computation time and domain of applicability is often overlooked. Herein, we present questions for consideration in future work which we believe will make chemical VAEs even more accessible. This article is categorized under: Data Science > Chemoinformatics
Language: English
Citations
205
Patterns, Journal Year: 2022, Volume and Issue: 3(10), P. 100588 - 100588
Published: Oct. 1, 2022
Artificial intelligence (AI) and machine learning (ML) are expanding in popularity for broad applications to challenging tasks in chemistry and materials science. Examples include the prediction of properties, the discovery of new reaction pathways, or the design of new molecules. The machine needs to read and write fluently in a chemical language for each of these tasks. Strings are a common tool to represent molecular graphs, and the most popular molecular string representation, SMILES, has powered cheminformatics since the late 1980s. However, in the context of AI and ML in chemistry, SMILES has several shortcomings; most pertinently, most combinations of symbols lead to invalid results with no valid chemical interpretation. To overcome this issue, a new language for molecules was introduced in 2020 that guarantees 100% robustness: SELF-referencing embedded string (SELFIES). SELFIES has since simplified and enabled numerous new applications in chemistry. In this perspective, we look to the future and discuss molecular string representations, along with their respective opportunities and challenges. We propose 16 concrete future projects for robust molecular representations. These involve the extension toward new chemical domains, exciting questions at the interface of AI and robust languages, and interpretability for both humans and machines. We hope that these proposals will inspire several follow-up works exploiting the full potential of molecular string representations.
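To make the notion of an "invalid" SMILES string concrete, the following is a minimal sketch of a purely syntactic check. It tests only two conditions, balanced branch parentheses and paired single-digit ring-closure labels, and skips anything inside bracket atoms. The function name `smiles_syntax_ok` is hypothetical and not taken from the article or from any cheminformatics library. A real toolkit also checks valence, aromaticity, and multi-digit ring labels, none of which are handled here.

```python
def smiles_syntax_ok(smiles: str) -> bool:
    """Rough syntactic check of a SMILES string (illustrative assumption only).

    Checks that branch parentheses are balanced and that every single-digit
    ring-closure label is opened and closed. Characters inside bracket atoms
    such as [13CH4] are skipped. Valence rules and %nn ring labels are ignored.
    """
    depth = 0            # current nesting level of branch parentheses
    open_rings = set()   # ring-closure digits that are opened but not yet closed
    in_bracket = False   # True while scanning inside a [...] bracket atom

    for ch in smiles:
        if in_bracket:
            if ch == "]":
                in_bracket = False
            continue
        if ch == "[":
            in_bracket = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:          # closing a branch that was never opened
                return False
        elif ch.isdigit():
            # A digit toggles its ring label: first use opens, second closes.
            if ch in open_rings:
                open_rings.remove(ch)
            else:
                open_rings.add(ch)

    return depth == 0 and not open_rings and not in_bracket
```

Under these assumptions, benzene written as `c1ccccc1` passes, while `c1ccccc` (unclosed ring) and `C(C` (unclosed branch) fail. Strings of the second kind are what a generative model can emit when it composes SMILES symbols freely.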
Language: English
Citations
156
Nature Reviews Chemistry, Journal Year: 2022, Volume and Issue: 6(9), P. 653 - 672
Published: Aug. 25, 2022
Language: English
Citations
90
Chemical Science, Journal Year: 2022, Volume and Issue: 14(2), P. 226 - 244
Published: Nov. 28, 2022
This review outlines several organic chemistry tasks for which predictive machine learning models have been and can be applied.
Language: English
Citations
79
The Journal of Chemical Physics, Journal Year: 2022, Volume and Issue: 156(8)
Published: Feb. 22, 2022
There is a perceived dichotomy between structure-based and descriptor-based molecular representations used for predictive chemistry tasks. Here, we study the performance, generalizability, and explainability of the quantum mechanics-augmented graph neural network (ml-QM-GNN) architecture as applied to the prediction of regioselectivity (classification) and of activation energies (regression). In our hybrid QM-augmented model architecture, structure-based representations are first used to predict a set of atom- and bond-level reactivity descriptors derived from density functional theory calculations. These estimated reactivity descriptors are combined with the original structure-based representation to make the final reactivity prediction. We demonstrate that our model architecture leads to significant improvements over structure-based GNNs in not only overall accuracy but also generalization to unseen compounds. Even when provided training sets of only a couple hundred labeled data points, the ml-QM-GNN outperforms other state-of-the-art structure-based architectures that have been applied to these tasks, as well as descriptor-based (linear) regressions. As a primary contribution of this work, we demonstrate a bridge between data-driven predictions and conceptual frameworks commonly used to gain qualitative insights into reactivity phenomena, taking advantage of the fact that our models are grounded in (but not restricted to) QM descriptors. This effort results in a productive synergy between theory and data science, wherein QM-augmented models provide a data-driven confirmation of previous qualitative analyses, and these analyses in turn facilitate insights into the decision-making process occurring within ml-QM-GNNs.
Language: English
Citations
73
Journal of the American Chemical Society, Journal Year: 2022, Volume and Issue: 145(1), P. 110 - 121
Published: Dec. 27, 2022
Optimization of the catalyst structure to simultaneously improve multiple reaction objectives (e.g., yield, enantioselectivity, and regioselectivity) remains a formidable challenge. Herein, we describe a machine learning workflow for the multi-objective optimization of catalytic reactions that employ chiral bisphosphine ligands. This was demonstrated through the optimization of two sequential reactions required in the asymmetric synthesis of an active pharmaceutical ingredient. To accomplish this, a density functional theory-derived database of >550 bisphosphine ligands was constructed, and a designer chemical space mapping technique was established. The protocol used classification methods to identify active catalysts, followed by linear regression to model reaction selectivity. This led to the prediction and validation of significantly improved ligands for all reaction outputs, suggesting a general strategy that can be readily implemented for reaction optimizations where performance is controlled by bisphosphine ligands.
Language: English
Citations
70
Nature Synthesis, Journal Year: 2023, Volume and Issue: 2(4), P. 321 - 330
Published: Jan. 30, 2023
Language: English
Citations
48
Industrial Chemistry and Materials, Journal Year: 2023, Volume and Issue: 2(2), P. 191 - 225
Published: Sept. 29, 2023
This review systematically summarizes the research progress of functional binders in lithium-ion batteries and elucidates the main functions of advanced binders in dealing with the challenges of high-specific-energy electrodes.
Language: English
Citations
43
ACS Nano, Journal Year: 2024, Volume and Issue: 18(23), P. 14791 - 14840
Published: May 30, 2024
We explore the potential of nanocrystals (a term used equivalently to nanoparticles) as building blocks for nanomaterials, and the current advances and open challenges for fundamental science developments and applications. Nanocrystal assemblies are inherently multiscale, and the generation of revolutionary material properties requires a precise understanding of the relationship between structure and function, the former being determined by classical effects and the latter often by quantum effects. With an emphasis on theory and computation, we discuss challenges that hamper current assembly strategies and to what extent nanocrystal assemblies represent thermodynamic equilibrium or kinetically trapped metastable states. We also examine dynamic effects and the optimization of assembly protocols. Finally, we discuss promising material functions and examples of their realization with nanocrystal assemblies.
Language: English
Citations
31
Chemical Science, Journal Year: 2021, Volume and Issue: 12(36), P. 12012 - 12026
Published: Jan. 1, 2021
Nuclear magnetic resonance (NMR) is one of the primary techniques used to elucidate the chemical structure, bonding, stereochemistry, and conformation of organic compounds. The distinct chemical shifts in an NMR spectrum depend upon each atom's local chemical environment and are influenced by both through-bond and through-space interactions with other atoms and functional groups. The in silico prediction of NMR chemical shifts using quantum mechanical (QM) calculations is now commonplace in aiding organic structural assignment, since spectra can be computed for several candidate structures and then compared with experimental values to find the best possible match. However, the computational demands of calculating multiple structural- and stereo-isomers, each of which may typically exist as an ensemble of rapidly-interconverting conformations, are expensive. Additionally, the QM predictions themselves may lack sufficient accuracy to identify a correct structure. In this work, we address both of these shortcomings by developing a rapid machine learning (ML) protocol to predict 1H and 13C chemical shifts through an efficient graph neural network (GNN) using 3D structures as input. Transfer learning with experimental data is used to improve the final prediction accuracy of a model trained using QM calculations. When tested on the CHESHIRE dataset, the proposed model predicts observed 13C chemical shifts with comparable accuracy to the best-performing DFT functionals (1.5 ppm) in around 1/6000 of the CPU time. An automated prediction webserver and graphical interface are accessible online at http://nova.chem.colostate.edu/cascade/. We further demonstrate the model in three applications: first, we use the model to decide the correct organic structure from candidates through experimental spectra, including complex stereoisomers; second, we automatically detect and revise incorrect chemical shift assignments in a popular NMR database, the NMRShiftDB; and third, we use NMR chemical shifts as descriptors for determination of the sites of electrophilic aromatic substitution.
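The idea of comparing predicted shifts for several candidate structures against experimental values can be sketched as below. This is a minimal illustration that assumes the shifts are already matched atom by atom and that the mean absolute error (MAE, in ppm) is the scoring metric; the abstract does not specify the comparison procedure the authors use. The function names, candidate labels, and all shift values are hypothetical.

```python
from statistics import mean


def mean_absolute_error(predicted, observed):
    """MAE in ppm between two equally long, atom-matched lists of shifts."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed shifts must have the same length")
    return mean(abs(p - o) for p, o in zip(predicted, observed))


def best_candidate(candidates, observed):
    """Return (name, mae) for the candidate whose predicted shifts fit best.

    `candidates` maps a candidate label to its list of predicted shifts.
    """
    scores = {name: mean_absolute_error(shifts, observed)
              for name, shifts in candidates.items()}
    name = min(scores, key=scores.get)
    return name, scores[name]


# Hypothetical 13C shifts (ppm), invented for illustration only.
observed_shifts = [20.0, 30.0, 128.0]
candidate_shifts = {
    "isomer_A": [21.0, 29.0, 130.0],
    "isomer_B": [25.0, 35.0, 120.0],
}

name, mae = best_candidate(candidate_shifts, observed_shifts)
print(f"best match: {name} (MAE = {mae:.2f} ppm)")
```

With these invented numbers the lower-error candidate is `isomer_A`. In practice the difference between candidates can be smaller than the prediction error, which is why the abstract stresses accuracy.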
Language: English
Citations
94