Materials property prediction with uncertainty quantification: A benchmark study
Daniel Varivoda, Rongzhi Dong, Sadman Sadeed Omee

et al.

Applied Physics Reviews, Journal Year: 2023, Volume and Issue: 10(2)

Published: May 23, 2023

Uncertainty quantification (UQ) has increasing importance in the building of robust, high-performance, and generalizable materials property prediction models. It can also be used in active learning to train better models by focusing on gathering new training data from uncertain regions. There are several categories of UQ methods, each considering different types of uncertainty sources. Here, we conduct a comprehensive evaluation of UQ methods for graph neural network-based materials property prediction and evaluate how they truly reflect the uncertainty that we want in error bound estimation or active learning. Our experimental results over four crystal materials datasets (including formation energy, adsorption energy, total energy, and bandgap properties) show that the popular ensemble methods for uncertainty estimation are NOT always the best choice for UQ in materials property prediction. For the convenience of the community, all the source code can be accessed freely at https://github.com/usccolumbia/materialsUQ.

Language: English

A survey of uncertainty in deep neural networks
Jakob Gawlikowski, Cedrique Rovile Njieutcheu Tassi, Mohsin Ali

et al.

Artificial Intelligence Review, Journal Year: 2023, Volume and Issue: 56(S1), P. 1513 - 1589

Published: July 29, 2023

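Over the last decade, neural networks have reached almost every field of science and become a crucial part of various real world applications. Due to the increasing spread, confidence in neural network predictions has become more and more important. However, basic neural networks do not deliver certainty estimates or suffer from over- or under-confidence, i.e. are badly calibrated. To overcome this, many researchers have been working on understanding and quantifying uncertainty in a neural network’s prediction. As a result, different types and sources of uncertainty have been identified and various approaches to measure and quantify uncertainty in neural networks have been proposed. This work gives a comprehensive overview of uncertainty estimation in neural networks, reviews recent advances in the field, highlights current challenges, and identifies potential research opportunities. It is intended to give anyone interested in uncertainty estimation in neural networks a broad overview and introduction, without presupposing prior knowledge in this field. For that, a comprehensive introduction to the most crucial sources of uncertainty is given and their separation into reducible model uncertainty and irreducible data uncertainty is presented. The modeling of these uncertainties based on deterministic neural networks, Bayesian neural networks (BNNs), ensembles of neural networks, and test-time data augmentation approaches is introduced, and different branches of these fields as well as the latest developments are discussed. For a practical application, we discuss different measures of uncertainty, approaches for calibrating neural networks, and give an overview of existing baselines and available implementations. Different examples from the wide spectrum of challenges in the fields of medical image analysis, robotics, and earth observation give an idea of the needs and challenges regarding uncertainties in the practical applications of neural networks. Additionally, the practical limitations of uncertainty quantification methods in neural networks for mission- and safety-critical real world applications are discussed and an outlook on the next steps towards a broader usage of such methods is given.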

Language: English

Citations

492

ADMETlab 3.0: an updated comprehensive online ADMET prediction platform enhanced with broader coverage, improved performance, API functionality and decision support
Li Fu, Shaohua Shi, Jiacai Yi

et al.

Nucleic Acids Research, Journal Year: 2024, Volume and Issue: 52(W1), P. W422 - W431

Published: April 4, 2024

ADMETlab 3.0 is the second updated version of the web server that provides a comprehensive and efficient platform for evaluating ADMET-related parameters as well as physicochemical properties and medicinal chemistry characteristics involved in the drug discovery process. This new release addresses the limitations of the previous version and offers broader coverage, improved performance, API functionality, and decision support. For supporting data and endpoints, this version includes 119 features, an increase of 31 compared to the previous version. The updated number of entries is 1.5 times larger than the previous version, with over 400 000 entries. ADMETlab 3.0 incorporates a multi-task DMPNN architecture coupled with molecular descriptors, a method that not only guaranteed calculation speed for each endpoint simultaneously, but also achieved superior performance in terms of accuracy and robustness. In addition, an API has been introduced to meet the growing demand for programmatic access to large amounts of data in ADMETlab 3.0. Moreover, this version includes uncertainty estimates in the prediction results, aiding in the confident selection of candidate compounds for further studies and experiments. ADMETlab 3.0 is publicly available without the need for registration at: https://admetlab3.scbdd.com.

Language: English

Citations

173

Artificial intelligence for natural product drug discovery
Michael W. Mullowney, Katherine Duncan, Somayah S. Elsayed

et al.

Nature Reviews Drug Discovery, Journal Year: 2023, Volume and Issue: 22(11), P. 895 - 916

Published: Sept. 11, 2023

Language: English

Citations

149

A Guide to In Silico Drug Design
Yiqun Chang, Bryson A. Hawkins, Jonathan J. Du

et al.

Pharmaceutics, Journal Year: 2022, Volume and Issue: 15(1), P. 49 - 49

Published: Dec. 23, 2022

The drug discovery process is a rocky path that is full of challenges, with the result that very few candidates progress from hit compound to a commercially available product, often due to factors such as poor binding affinity, off-target effects, or physicochemical properties, such as solubility or stability. This process is further complicated by high research and development costs and time requirements. It is thus important to optimise every step of the process in order to maximise the chances of success. As a result of the recent advancements in computer power and technology, computer-aided drug design (CADD) has become an integral part of modern drug discovery to guide and accelerate the process. In this review, we present an overview of the important CADD methods and applications, such as in silico structure prediction, refinement, modelling and target validation, that are commonly used in this area.

Language: English

Citations

121

Leveraging molecular structure and bioactivity with chemical language models for de novo drug design
Michaël Moret, Irène Pachón-Angona, Leandro Cotos

et al.

Nature Communications, Journal Year: 2023, Volume and Issue: 14(1)

Published: Jan. 7, 2023

Generative chemical language models (CLMs) can be used for de novo molecular structure generation by learning from a textual representation of molecules. Here, we show that hybrid CLMs can additionally leverage the bioactivity information available for the training compounds. To computationally design ligands of phosphoinositide 3-kinase gamma (PI3Kγ), a collection of virtual molecules was created with a generative CLM. This virtual compound library was refined using a CLM-based classifier for bioactivity prediction. This second hybrid CLM was pretrained with patented molecular structures and fine-tuned with known PI3Kγ ligands. Several of the computer-generated molecular designs were commercially available, enabling fast prescreening and preliminary experimental validation. A new PI3Kγ ligand with sub-micromolar activity was identified, highlighting the method's scaffold-hopping potential. Chemical synthesis and biochemical testing of two of the top-ranked de novo designed molecules and their derivatives corroborated the model's ability to generate PI3Kγ ligands with medium to low nanomolar activity for hit-to-lead expansion. The most potent compounds led to pronounced inhibition of PI3K-dependent Akt phosphorylation in a medulloblastoma cell model, demonstrating efficacy of PI3Kγ ligands in PI3K/Akt pathway repression in human tumor cells. The results positively advocate hybrid CLMs for virtual compound screening and activity-focused molecular design.

Language: English

Citations

97

Scoring Functions for Protein-Ligand Binding Affinity Prediction Using Structure-based Deep Learning: A Review
Rocco Meli, Garrett M. Morris, Philip C. Biggin

et al.

Frontiers in Bioinformatics, Journal Year: 2022, Volume and Issue: 2

Published: June 17, 2022

The rapid and accurate in silico prediction of protein-ligand binding free energies or binding affinities has the potential to transform drug discovery. In recent years, there has been a rapid growth of interest in deep learning methods for the prediction of protein-ligand binding affinities based on the structural information of protein-ligand complexes. These structure-based scoring functions often obtain better results than classical scoring functions when applied within their applicability domain. Here we review structure-based scoring functions for binding affinity prediction based on deep learning, focussing on different types of architectures, featurization strategies, data sets, methods for training and evaluation, and the role of explainable artificial intelligence in building useful models for real drug-discovery applications.

Language: English

Citations

78

Applied machine learning as a driver for polymeric biomaterials design
Samantha M. McDonald, Emily K. Augustine, Quinn Lanners

et al.

Nature Communications, Journal Year: 2023, Volume and Issue: 14(1)

Published: Aug. 10, 2023

Polymers are ubiquitous to almost every aspect of modern society and their use in medical products is similarly pervasive. Despite this, the diversity in commercial polymers used in medicine is stunningly low. Considerable time and resources have been extended over the years towards the development of new polymeric biomaterials which address unmet needs left by the current generation of medical-grade polymers. Machine learning (ML) presents an unprecedented opportunity in this field to bypass the need for trial-and-error synthesis, thus reducing the time and resources invested into new discoveries critical for advancing medical treatments. Current efforts pioneering applied ML in polymer design have employed combinatorial and high throughput experimental design to address data availability concerns. However, the lack of available and standardized characterization of parameters relevant to medicine, including degradation time and biocompatibility, represents a nearly insurmountable obstacle to ML-aided design of biomaterials. Herein, we identify a gap at the intersection of applied ML and biomedical polymer design, highlight current works at this junction more broadly, and provide an outlook on challenges and future directions.

Language: English

Citations

55

Characterizing Uncertainty in Machine Learning for Chemistry
Esther Heid, Charles J. McGill, Florence H. Vermeire

et al.

Journal of Chemical Information and Modeling, Journal Year: 2023, Volume and Issue: 63(13), P. 4012 - 4029

Published: June 20, 2023

Characterizing uncertainty in machine learning models has recently gained interest in the context of machine learning reliability, robustness, safety, and active learning. Here, we separate the total uncertainty into contributions from noise in the data (aleatoric) and shortcomings of the model (epistemic), further dividing epistemic uncertainty into model bias and variance contributions. We systematically address the influence of noise, model bias, and model variance in the context of chemical property predictions, where the diverse nature of target properties and the vast chemical space give rise to many different distinct sources of prediction error. We demonstrate that different sources of error can each be significant in different contexts and must be individually addressed during model development. Through controlled experiments on data sets of molecular properties, we show important trends in model performance associated with the level of noise in the data set, size of the data set, model architecture, molecule representation, ensemble size, and data set splitting. In particular, we show that 1) noise in the test set can limit a model's observed performance when the actual performance is much better, 2) using size-extensive model aggregation structures is crucial for extensive property prediction, and 3) ensembling is a reliable tool for uncertainty quantification and improvement specifically for the contribution of model variance. We develop general guidelines on how to improve an underperforming model when falling into different uncertainty contexts.
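
The split into aleatoric and epistemic contributions can be illustrated with a minimal sketch. It is not taken from the paper: it assumes a hypothetical ensemble whose members each return a predicted mean and a predicted noise variance for one input, and it applies the law of total variance. The function name `decompose_uncertainty` and the numbers are illustrative assumptions. The spread across members captures only the variance part of the epistemic term; model bias is not visible to an ensemble.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(means, variances):
    """Split predictive variance for one input (hypothetical helper).

    means:     predicted mean from each ensemble member
    variances: predicted noise variance from each ensemble member
    """
    if not means or len(means) != len(variances):
        raise ValueError("need one mean and one variance per ensemble member")
    # Aleatoric part: average of the noise variances the members predict.
    aleatoric = fmean(variances)
    # Epistemic (variance) part: disagreement between the members' means.
    epistemic = pvariance(means)
    return {
        "aleatoric": aleatoric,
        "epistemic": epistemic,
        "total": aleatoric + epistemic,
    }


# Three hypothetical ensemble members predicting one molecular property.
result = decompose_uncertainty([1.0, 1.2, 0.8], [0.04, 0.05, 0.03])
print(result)
```

Under these assumptions, adding ensemble members refines the estimate of the epistemic variance term but leaves the aleatoric term, which comes from noise in the data, unchanged.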

Language: English

Citations

43

Empowering biomedical discovery with AI agents
Shanghua Gao, Ada Fang, Yepeng Huang

et al.

Cell, Journal Year: 2024, Volume and Issue: 187(22), P. 6125 - 6151

Published: Oct. 1, 2024

We envision "AI scientists" as systems capable of skeptical learning and reasoning that empower biomedical research through collaborative agents that integrate AI models and biomedical tools with experimental platforms. Rather than taking humans out of the discovery process, biomedical AI agents combine human creativity and expertise with AI's ability to analyze large datasets, navigate hypothesis spaces, and execute repetitive tasks. AI agents are poised to be proficient in various tasks, planning discovery workflows and performing self-assessment to identify and mitigate gaps in their knowledge. These agents use large language models and generative models to feature structured memory for continual learning and use machine learning tools to incorporate scientific knowledge, biological principles, and theories. AI agents can impact areas ranging from virtual cell simulation, programmable control of phenotypes, and the design of cellular circuits to developing new therapies.

Language: English

Citations

21

Bayesian optimization of nanoporous materials
Aryan Deshwal, Cory M. Simon, Janardhan Rao Doppa

et al.

Molecular Systems Design & Engineering, Journal Year: 2021, Volume and Issue: 6(12), P. 1066 - 1086

Published: Jan. 1, 2021

In Bayesian optimization, we efficiently search for an optimal material by iterating between (i) conducting experiment on a material, (ii) updating our knowledge, and (iii) selecting the next experiment.
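
The three-step loop described above can be sketched in a few lines. This is an illustrative toy and not the authors' implementation. It assumes a hypothetical one-dimensional material descriptor, a stand-in `run_experiment` function in place of a real measurement, a small hand-written Gaussian process surrogate with an RBF kernel, and an upper-confidence-bound selection rule. All names and parameter values are assumptions.

```python
import math


def rbf(a, b, length=0.2):
    """RBF kernel between two scalar descriptors (assumed length scale)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def cholesky(A):
    """Lower-triangular Cholesky factor of a positive definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def forward(L, b):
    """Solve L y = b by forward substitution."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y


def backward(L, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def fit_gp(xs, ys, noise=1e-4):
    """Update the surrogate with all observations so far (zero-mean prior)."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    return L, backward(L, forward(L, ys))


def predict(xs, L, alpha, x_new):
    """Posterior mean and standard deviation at an untested candidate."""
    k_star = [rbf(a, x_new) for a in xs]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = forward(L, k_star)
    var = max(1.0 - sum(vi * vi for vi in v), 0.0)
    return mean, math.sqrt(var)


def run_experiment(x):
    """Hypothetical stand-in for measuring a material property."""
    return -(x - 0.7) ** 2


candidates = [i / 10 for i in range(11)]  # hypothetical pool of materials
xs = [candidates[0]]
ys = [run_experiment(xs[0])]              # (i) conduct a first experiment

for _ in range(5):                        # assumed experiment budget
    L, alpha = fit_gp(xs, ys)             # (ii) update our knowledge
    remaining = [c for c in candidates if c not in xs]

    def ucb(c):
        mean, std = predict(xs, L, alpha, c)
        return mean + 2.0 * std           # trade off exploitation and exploration

    x_next = max(remaining, key=ucb)      # (iii) select the next experiment
    xs.append(x_next)
    ys.append(run_experiment(x_next))     # (i) conduct it

best_y = max(ys)
best_x = xs[ys.index(best_y)]
print(best_x, best_y)
```

The upper confidence bound is only one possible selection rule; expected improvement or other acquisition functions would slot into step (iii) in the same way.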

Language: English

Citations

69