AI for organic and polymer synthesis DOI

Hong Xin, Qi Yang, Kuangbiao Liao

et al.

Science China Chemistry, Journal Year: 2024, Volume and Issue: 67(8), P. 2461 - 2496

Published: June 26, 2024

Language: English

A review of molecular representation in the age of machine learning DOI Creative Commons
Daniel Wigh, Jonathan M. Goodman, Alexei A. Lapkin

et al.

Wiley Interdisciplinary Reviews Computational Molecular Science, Journal Year: 2022, Volume and Issue: 12(5)

Published: Feb. 18, 2022

Research in chemistry increasingly requires interdisciplinary work prompted by, among other things, advances in computing, machine learning, and artificial intelligence. Everyone working with molecules, whether chemist or not, needs an understanding of the representation of molecules in a machine-readable format, as this is central to computational chemistry. Four classes of representations are introduced: string, connection table, feature-based, and computer-learned representations. The three most significant representations are the simplified molecular-input line-entry system (SMILES), the International Chemical Identifier (InChI), and the MDL molfile, of which SMILES was the first to be successfully used in conjunction with a variational autoencoder (VAE) to yield a continuous representation of molecules. This is noteworthy because a continuous representation allows for efficient navigation of the immensely large chemical space of possible molecules. Since 2018, when the first model of this type was published, considerable effort has been put into developing novel and improved methodologies. Most, if not all, researchers in the community make their work easily accessible on GitHub, though discussion of computation time and domain of applicability is often overlooked. Herein, we present questions for consideration in future work which we believe will make chemical VAEs even more accessible. This article is categorized under: Data Science > Chemoinformatics

Language: English

Citations

205

SELFIES and the future of molecular string representations DOI Creative Commons
Mario Krenn, Qianxiang Ai, Senja Barthel

et al.

Patterns, Journal Year: 2022, Volume and Issue: 3(10), P. 100588 - 100588

Published: Oct. 1, 2022

Artificial intelligence (AI) and machine learning (ML) are expanding in popularity for broad applications to challenging tasks in chemistry and materials science. Examples include the prediction of properties, the discovery of new reaction pathways, or the design of new molecules. The machine needs to read and write fluently in a chemical language for each of these tasks. Strings are a common tool to represent molecular graphs, and the most popular molecular string representation, SMILES, has powered cheminformatics since the late 1980s. However, in the context of AI and ML in chemistry, SMILES has several shortcomings: most pertinently, most combinations of symbols lead to invalid results with no valid chemical interpretation. To overcome this issue, a new language for molecules was introduced in 2020 that guarantees 100% robustness: SELF-referencing embedded strings (SELFIES). SELFIES has since simplified and enabled numerous new applications in chemistry. In this perspective, we look to the future and discuss molecular string representations, along with their respective opportunities and challenges. We propose 16 concrete future projects for robust molecular representations. These involve the extension toward new chemical domains, exciting questions at the interface of AI and robust languages, and interpretability for both humans and machines. We hope that these proposals will inspire several follow-up works exploiting the full potential of molecular string representations.

Language: English

Citations

156

Extending machine learning beyond interatomic potentials for predicting molecular properties DOI
Nikita Fedik, R.I. Zubatyuk, Maksim Kulichenko

et al.

Nature Reviews Chemistry, Journal Year: 2022, Volume and Issue: 6(9), P. 653 - 672

Published: Aug. 25, 2022

Language: English

Citations

90

Predictive chemistry: machine learning for reaction deployment, reaction development, and reaction discovery DOI Creative Commons
Zhengkai Tu, Thijs Stuyver, Connor W. Coley

et al.

Chemical Science, Journal Year: 2022, Volume and Issue: 14(2), P. 226 - 244

Published: Nov. 28, 2022

This review outlines several organic chemistry tasks for which predictive machine learning models have been and can be applied.

Language: English

Citations

79

Quantum chemistry-augmented neural networks for reactivity prediction: Performance, generalizability, and explainability DOI Creative Commons
Thijs Stuyver, Connor W. Coley

The Journal of Chemical Physics, Journal Year: 2022, Volume and Issue: 156(8)

Published: Feb. 22, 2022

There is a perceived dichotomy between structure-based and descriptor-based molecular representations used for predictive chemistry tasks. Here, we study the performance, generalizability, and explainability of the quantum mechanics-augmented graph neural network (ml-QM-GNN) architecture as applied to the prediction of regioselectivity (classification) and of activation energies (regression). In our hybrid QM-augmented model architecture, structure-based representations are first used to predict a set of atom- and bond-level reactivity descriptors derived from density functional theory calculations. These estimated reactivity descriptors are combined with the original structure-based representation to make the final reactivity prediction. We demonstrate that our model architecture leads to significant improvements over structure-based GNNs in not only overall accuracy but also in generalization to unseen compounds. Even when provided training sets of only a couple of hundred labeled data points, the ml-QM-GNN outperforms other state-of-the-art structure-based architectures that have been applied to these tasks, as well as descriptor-based (linear) regressions. As a primary contribution of this work, we demonstrate a bridge between data-driven predictions and conceptual frameworks commonly used to gain qualitative insights into reactivity phenomena, taking advantage of the fact that our models are grounded in (but not restricted to) QM descriptors. This effort results in a productive synergy between theory and data science, wherein QM-augmented models provide a data-driven confirmation of previous qualitative analyses, and these analyses in turn facilitate insights into the decision-making process occurring within ml-QM-GNNs.

Language: English

Citations

73

Data-Driven Multi-Objective Optimization Tactics for Catalytic Asymmetric Reactions Using Bisphosphine Ligands DOI

Jordan J. Dotson, Lucy van Dijk, Jacob C. Timmerman

et al.

Journal of the American Chemical Society, Journal Year: 2022, Volume and Issue: 145(1), P. 110 - 121

Published: Dec. 27, 2022

Optimization of the catalyst structure to simultaneously improve multiple reaction objectives (e.g., yield, enantioselectivity, and regioselectivity) remains a formidable challenge. Herein, we describe a machine learning workflow for the multi-objective optimization of catalytic reactions that employ chiral bisphosphine ligands. This was demonstrated through the optimization of two sequential reactions required in the asymmetric synthesis of an active pharmaceutical ingredient. To accomplish this, a density functional theory-derived database of >550 bisphosphine ligands was constructed, and a designer chemical space mapping technique was established. The protocol used classification methods to identify active catalysts, followed by linear regression to model reaction selectivity. This led to the prediction and validation of significantly improved ligands for all reaction outputs, suggesting a general strategy that can be readily implemented for reaction optimizations where performance is controlled by bisphosphine ligands.

Language: English

Citations

70

Enantioselectivity prediction of pallada-electrocatalysed C–H activation using transition state knowledge in machine learning DOI
Li‐Cheng Xu, Johanna Frey, Xiaoyan Hou

et al.

Nature Synthesis, Journal Year: 2023, Volume and Issue: 2(4), P. 321 - 330

Published: Jan. 30, 2023

Language: English

Citations

48

Design of functional binders for high-specific-energy lithium-ion batteries: from molecular structure to electrode properties DOI Creative Commons
Qin Tian, Haoyi Yang, Quan Li

et al.

Industrial Chemistry and Materials, Journal Year: 2023, Volume and Issue: 2(2), P. 191 - 225

Published: Sept. 29, 2023

This review systematically summarizes the research progress of functional binders in lithium-ion batteries and elucidates main functions advanced to deal with challenges high-specific-energy electrodes.

Language: English

Citations

43

Nanocrystal Assemblies: Current Advances and Open Problems DOI
Carlos L. Bassani, Greg van Anders, Uri Banin

et al.

ACS Nano, Journal Year: 2024, Volume and Issue: 18(23), P. 14791 - 14840

Published: May 30, 2024

We explore the potential of nanocrystals (a term used equivalently to nanoparticles) as building blocks for nanomaterials, and the current advances and open challenges for fundamental science developments and applications. Nanocrystal assemblies are inherently multiscale, and the generation of revolutionary material properties requires a precise understanding of the relationship between structure and function, the former being determined by classical effects and the latter often by quantum effects. With an emphasis on theory and computation, we discuss challenges that hamper current assembly strategies and to what extent nanocrystal assemblies represent thermodynamic equilibrium or kinetically trapped metastable states. We also examine dynamic effects and the optimization of assembly protocols. Finally, we discuss promising material functions and examples of their realization with nanocrystal assemblies.

Language: English

Citations

31

Real-time prediction of 1H and 13C chemical shifts with DFT accuracy using a 3D graph neural network DOI Creative Commons
Yanfei Guan, S. V. Shree Sowndarya, Liliana C. Gallegos

et al.

Chemical Science, Journal Year: 2021, Volume and Issue: 12(36), P. 12012 - 12026

Published: Jan. 1, 2021

Nuclear magnetic resonance (NMR) is one of the primary techniques used to elucidate the chemical structure, bonding, stereochemistry, and conformation of organic compounds. The distinct chemical shifts in an NMR spectrum depend upon each atom's local chemical environment and are influenced by both through-bond and through-space interactions with other atoms and functional groups. The in silico prediction of NMR chemical shifts using quantum mechanical (QM) calculations is now commonplace in aiding organic structural assignment, since spectra can be computed for several candidate structures and then compared with experimental values to find the best possible match. However, the computational demands of calculating multiple structural- and stereo-isomers, each of which may typically exist as an ensemble of rapidly-interconverting conformations, are expensive. Additionally, the QM predictions themselves may lack sufficient accuracy to identify a correct structure. In this work, we address both of these shortcomings by developing a rapid machine learning (ML) protocol to predict 1H and 13C chemical shifts through an efficient graph neural network (GNN) using 3D structures as input. Transfer learning with experimental data is used to improve the final prediction accuracy of a model trained using QM calculations. When tested on the CHESHIRE dataset, the proposed model predicts observed 13C chemical shifts with comparable accuracy to the best-performing DFT functionals (1.5 ppm) in around 1/6000 of the CPU time. An automated prediction webserver and graphical interface are accessible online at http://nova.chem.colostate.edu/cascade/. We further demonstrate the model in three applications: first, we use the model to decide the correct organic structure from candidates through experimental spectra, including complex stereoisomers; second, we automatically detect and revise incorrect chemical shift assignments in a popular NMR database, the NMRShiftDB; and third, we use NMR chemical shifts as descriptors for determination of the sites of electrophilic aromatic substitution.

Language: English

Citations

94