Fast uncertainty estimates in deep learning interatomic potentials DOI Open Access
Albert Zhu, Simon Batzner, Albert Musaelian et al.

The Journal of Chemical Physics, Year: 2023, Volume 158(16)

Published: April 27, 2023

Deep learning has emerged as a promising paradigm to give access to highly accurate predictions of molecular and material properties. A common shortcoming shared by current approaches, however, is that neural networks only give point estimates of their predictions and do not come with predictive uncertainties associated with these estimates. Existing uncertainty quantification efforts have primarily leveraged the standard deviation of predictions across an ensemble of independently trained neural networks. This incurs a large computational overhead in both training and prediction, resulting in order-of-magnitude more expensive predictions. Here, we propose a method to estimate the predictive uncertainty based on a single neural network without the need for an ensemble. This allows us to obtain uncertainty estimates with virtually no additional computational overhead over standard training and inference. We demonstrate that the quality of the uncertainty estimates matches those obtained from deep ensembles. We further examine the uncertainty estimates of our methods and deep ensembles across the configuration space of our test system and compare the uncertainties to the potential energy surface. Finally, we study the efficacy of the method in an active learning setting and find the results to match an ensemble-based strategy at order-of-magnitude reduced computational cost.

Language: English
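
The ensemble baseline mentioned in the abstract above (standard deviation of predictions across independently trained networks) can be sketched roughly as follows. This illustrates only the baseline, not the single-network method the paper proposes; the function name `ensemble_mean_and_std` and the nested-list input layout are assumptions made for this example.

```python
from statistics import mean, stdev

def ensemble_mean_and_std(predictions):
    """predictions[m][i] is model m's prediction for sample i (hypothetical layout).

    Returns per-sample means and sample standard deviations across models.
    Needs at least two models, since stdev is undefined for a single value.
    """
    per_sample = list(zip(*predictions))   # regroup values by sample
    means = [mean(p) for p in per_sample]  # ensemble prediction
    stds = [stdev(p) for p in per_sample]  # ensemble disagreement as uncertainty
    return means, stds

# Three hypothetical models, two samples.
means, stds = ensemble_mean_and_std([[1.0, 2.0], [1.0, 4.0], [1.0, 6.0]])
```

Every prediction requires one forward pass per ensemble member, which is the overhead the abstract refers to.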

Leveraging molecular structure and bioactivity with chemical language models for de novo drug design DOI Creative Commons
Michaël Moret, Irène Pachón-Angona, Leandro Cotos et al.

Nature Communications, Year: 2023, Volume 14(1)

Published: Jan. 7, 2023

Generative chemical language models (CLMs) can be used for de novo molecular structure generation by learning from a textual representation of molecules. Here, we show that hybrid CLMs can additionally leverage the bioactivity information available for the training compounds. To computationally design ligands of phosphoinositide 3-kinase gamma (PI3Kγ), a collection of virtual molecules was created with a generative CLM. This virtual compound library was refined using a CLM-based classifier for bioactivity prediction. This second hybrid CLM was pretrained with patented molecular structures and fine-tuned with known PI3Kγ ligands. Several of the computer-generated molecular designs were commercially available, enabling fast prescreening and preliminary experimental validation. A new PI3Kγ ligand with sub-micromolar activity was identified, highlighting the method's scaffold-hopping potential. Chemical synthesis and biochemical testing of two of the top-ranked de novo designed molecules and their derivatives corroborated the model's ability to generate PI3Kγ ligands with medium to low nanomolar activity for hit-to-lead expansion. The most potent compounds led to pronounced inhibition of PI3K-dependent Akt phosphorylation in a medulloblastoma cell model, demonstrating efficacy of PI3Kγ ligands in PI3K/Akt pathway repression in human tumor cells. The results positively advocate hybrid CLMs for virtual compound screening and activity-focused molecular design.

Language: English

Cited by: 97

Characterizing Uncertainty in Machine Learning for Chemistry DOI Creative Commons
Esther Heid, Charles J. McGill, Florence H. Vermeire et al.

Journal of Chemical Information and Modeling, Year: 2023, Volume 63(13), pp. 4012 - 4029

Published: June 20, 2023

Characterizing uncertainty in machine learning models has recently gained interest in the context of machine learning reliability, robustness, safety, and active learning. Here, we separate the total uncertainty into contributions from noise in the data (aleatoric) and shortcomings of the model (epistemic), further dividing epistemic uncertainty into model bias and variance contributions. We systematically address the influence of noise, model bias, and model variance in the context of chemical property predictions, where the diverse nature of target properties and the vast chemical space give rise to many different distinct sources of prediction error. We demonstrate that different sources of error can each be significant in different contexts and must be individually addressed during model development. Through controlled experiments on data sets of molecular properties, we show important trends in model performance associated with the level of noise in the data set, size of the data set, model architecture, molecule representation, ensemble size, and data set splitting. In particular, we show that 1) noise in the test set can limit a model's observed performance when the actual performance is much better, 2) using size-extensive model aggregation structures is crucial for extensive property prediction, and 3) ensembling is a reliable tool for uncertainty quantification and improvement specifically for the contribution of model variance. We develop general guidelines on how to improve an underperforming model when falling into different uncertainty contexts.

Language: English

Cited by: 43
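
One common way to make the aleatoric/epistemic split from the abstract above concrete is the law of total variance applied to an ensemble whose members each predict a mean and a noise variance. The sketch below assumes that setup; it captures only the variance part of the epistemic term (model bias is not visible to an ensemble), and the name `decompose_uncertainty` is hypothetical rather than taken from the paper.

```python
from statistics import mean, pvariance

def decompose_uncertainty(member_means, member_variances):
    """Split total predictive variance for one sample (assumed ensemble setup).

    member_means[m]     : mean predicted by ensemble member m
    member_variances[m] : noise variance predicted by ensemble member m
    """
    aleatoric = mean(member_variances)  # average predicted data noise
    epistemic = pvariance(member_means) # disagreement between members
    return aleatoric, epistemic, aleatoric + epistemic

# Two hypothetical members predicting one molecular property.
aleatoric, epistemic, total = decompose_uncertainty([1.0, 3.0], [0.5, 1.5])
```

Adding more ensemble members can shrink the epistemic term but leaves the aleatoric term unchanged, which matches the abstract's point that ensembling specifically addresses model variance.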

Graph neural networks DOI
Gabriele Corso, H. Stärk, Stefanie Jegelka et al.

Nature Reviews Methods Primers, Year: 2024, Volume 4(1)

Published: March 7, 2024

Language: English

Cited by: 43

Pareto optimization to accelerate multi-objective virtual screening DOI Creative Commons
Jenna C. Fromer, David Graff, Connor W. Coley et al.

Digital Discovery, Year: 2024, Volume 3(3), pp. 467 - 481

Published: Jan. 1, 2024

Pareto optimization is suited to multi-objective problems when the relative importance of objectives is not known a priori. We report an open source tool to accelerate docking-based virtual screening with strong empirical performance.

Language: English

Cited by: 16
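
The idea behind Pareto optimization in the abstract above is to keep every candidate that no other candidate beats on all objectives at once. A minimal non-dominated filter might look like this, assuming every objective is minimized (for example, docking scores where lower is better). It is not the reported tool's acquisition logic, and the function names are illustrative.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Return the points not dominated by any other point (minimization)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (score_target_1, score_target_2) pairs for four molecules.
front = pareto_front([(1, 5), (2, 2), (3, 3), (5, 1)])
```

This quadratic-time filter is fine for small candidate sets; large libraries would call for a more efficient algorithm.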

Deep learning and knowledge-based methods for computer-aided molecular design—toward a unified approach: State-of-the-art and future directions DOI
Abdulelah S. Alshehri, Rafiqul Gani, Fengqi You et al.

Computers & Chemical Engineering, Year: 2020, Volume 141, p. 107005

Published: July 2, 2020

Language: English

Cited by: 100

Accelerating high-throughput virtual screening through molecular pool-based active learning DOI Creative Commons
David Graff, Eugene I. Shakhnovich, Connor W. Coley et al.

Chemical Science, Year: 2021, Volume 12(22), pp. 7866 - 7881

Published: Jan. 1, 2021

Structure-based virtual screening is an important tool in early stage drug discovery that scores the interactions between a target protein and candidate ligands. As virtual libraries continue to grow (in excess of 10^8 molecules), so too do the resources necessary to conduct exhaustive virtual screening campaigns on these libraries. However, Bayesian optimization techniques, previously employed in other scientific discovery problems, can aid in their exploration: a surrogate structure-property relationship model trained on the predicted affinities of a subset of the library can be applied to the remaining library members, allowing the least promising compounds to be excluded from evaluation. In this study, we explore the application of these techniques to computational docking datasets and assess the impact of surrogate model architecture, acquisition function, and acquisition batch size on optimization performance. We observe significant reductions in computational costs; for example, using a directed-message passing neural network we can identify 94.8% or 89.3% of the top-50 000 ligands in a 100M member library after testing only 2.4% of candidate ligands using an upper confidence bound or greedy acquisition strategy, respectively. Such model-guided searches mitigate the increasing computational costs of screening increasingly large virtual libraries and can accelerate high-throughput virtual screening campaigns with applications beyond docking.

Language: English

Cited by: 95
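
The two acquisition strategies named in the abstract above, greedy and upper confidence bound (UCB), differ only in how they rank candidates from the surrogate model's output. The sketch below assumes the surrogate returns a predicted mean and standard deviation per ligand and that higher scores are better (for example, negated docking scores). The `beta` weight, the tuple layout, and the function names are assumptions for illustration, not the paper's implementation.

```python
def greedy_score(mean, std):
    """Rank purely by predicted value (exploitation only)."""
    return mean

def ucb_score(mean, std, beta=2.0):
    """Reward uncertain candidates too; beta is an assumed trade-off weight."""
    return mean + beta * std

def select_batch(candidates, score_fn, batch_size):
    """candidates: list of (name, predicted_mean, predicted_std) tuples."""
    ranked = sorted(candidates, key=lambda c: score_fn(c[1], c[2]), reverse=True)
    return [name for name, _, _ in ranked[:batch_size]]

# Hypothetical surrogate predictions for three ligands.
pool = [("a", 1.0, 0.1), ("b", 0.8, 0.5), ("c", 0.5, 0.0)]
greedy_pick = select_batch(pool, greedy_score, 1)
ucb_pick = select_batch(pool, ucb_score, 1)
```

In a pool-based loop, the selected batch would be docked, added to the training set, and the surrogate retrained before the next round.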

Comparative analysis of molecular fingerprints in prediction of drug combination effects DOI Creative Commons
Bulat Zagidullin, Ziyan Wang, Yuanfang Guan et al.

Briefings in Bioinformatics, Year: 2021, Volume 22(6)

Published: Aug. 9, 2021

Application of machine and deep learning methods in drug discovery and cancer research has gained a considerable amount of attention in the past years. As the field grows, it becomes crucial to systematically evaluate the performance of novel computational solutions in relation to established techniques. To this end, we compare rule-based and data-driven molecular representations in prediction of drug combination sensitivity and drug synergy scores using standardized results of 14 high-throughput screening studies, comprising 64 200 unique combinations of 4153 molecules tested in 112 cancer cell lines. We evaluate the clustering performance of molecular representations and quantify their similarity by adapting the Centered Kernel Alignment metric. Our work demonstrates that to identify an optimal molecular representation type, it is necessary to supplement quantitative benchmark results with qualitative considerations, such as model interpretability and robustness, which may vary between and throughout preclinical drug development projects.

Language: English

Cited by: 74
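
For readers unfamiliar with the Centered Kernel Alignment (CKA) metric named in the abstract above, the standard linear form compares two representation matrices of the same molecules (rows are molecules, columns are features). The sketch below implements that textbook linear CKA in plain Python. It is not the adaptation used in the paper, and the helper names are illustrative.

```python
import math

def center_columns(m):
    """Subtract each column's mean so every feature has zero mean."""
    n = len(m)
    col_means = [sum(row[j] for row in m) / n for j in range(len(m[0]))]
    return [[v - mu for v, mu in zip(row, col_means)] for row in m]

def cross_frobenius_sq(a, b):
    """Squared Frobenius norm of a^T b for row-aligned matrices a and b."""
    total = 0.0
    for i in range(len(a[0])):
        for j in range(len(b[0])):
            s = sum(ra[i] * rb[j] for ra, rb in zip(a, b))
            total += s * s
    return total

def linear_cka(x, y):
    """Linear CKA: ||X^T Y||_F^2 / (||X^T X||_F * ||Y^T Y||_F) on centered data."""
    xc, yc = center_columns(x), center_columns(y)
    denom = math.sqrt(cross_frobenius_sq(xc, xc) * cross_frobenius_sq(yc, yc))
    return cross_frobenius_sq(xc, yc) / denom

# Two hypothetical fingerprints of the same three molecules.
x = [[1.0, 0.0], [0.0, 1.0], [2.0, 3.0]]
y = [[2.0 * v for v in row] for row in x]  # a rescaled copy of x
similarity = linear_cka(x, y)
```

The score is invariant to isotropic rescaling of either representation, so a rescaled copy is as similar as an identical one.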

Multi-fidelity prediction of molecular optical peaks with deep learning DOI Creative Commons
Kevin P. Greenman, William H. Green, Rafael Gómez‐Bombarelli et al.

Chemical Science, Year: 2022, Volume 13(4), pp. 1152 - 1162

Published: Jan. 1, 2022

Optical properties are central to molecular design for many applications, including solar cells and biomedical imaging. A variety of

Language: English

Cited by: 59

Critical assessment of AI in drug discovery DOI
W. Patrick Walters, Regina Barzilay

Expert Opinion on Drug Discovery, Year: 2021, Volume 16(9), pp. 937 - 947

Published: April 19, 2021

Introduction: Artificial Intelligence (AI) has become a component of our everyday lives, with applications ranging from recommendations on what to buy to the analysis of radiology images. Many of the techniques originally developed for AI in other fields such as language translation and computer vision are now being applied in drug discovery. AI has enabled multiple aspects of drug discovery including the analysis of high content screening data, and the design and synthesis of new molecules. Areas covered: This perspective provides an overview of the application of AI in several areas relevant to drug discovery including property prediction, molecule generation, image analysis, and organic synthesis planning. Expert opinion: While a variety of machine learning methods are now being routinely used to predict biological activity and ADME properties, methods of representing molecules continue to evolve. Molecule generation methods are relatively unproven but hold the potential to access new, unexplored areas of chemical space. The application of AI in drug discovery will benefit from dedicated research, as well as developments in other fields. With this pairing of algorithmic advancements and high-quality data, the impact of AI in drug discovery will continue to grow in the coming years.

Language: English

Cited by: 58

Uncertainty-aware prediction of chemical reaction yields with graph neural networks DOI Creative Commons
Youngchun Kwon, Dongseon Lee, Youn-Suk Choi et al.

Journal of Cheminformatics, Year: 2022, Volume 14(1)

Published: Jan. 10, 2022

In this paper, we present a data-driven method for the uncertainty-aware prediction of chemical reaction yields. The reactants and products in a chemical reaction are represented as a set of molecular graphs. The predictive distribution of the yield is modeled as a graph neural network that directly processes a set of graphs with permutation invariance. Uncertainty-aware learning and inference are applied to the model to make accurate predictions and to evaluate their uncertainty. We demonstrate the effectiveness of the proposed method on benchmark datasets with various settings. Compared to the existing methods, the proposed method improves the prediction and uncertainty quantification performance in most settings.

Language: English

Cited by: 47