Large Language Models for Inorganic Synthesis Predictions
Seong-Min Kim, Yousung Jung, Joshua Schrier

et al.

Journal of the American Chemical Society, Journal Year: 2024, Volume and Issue: 146(29), P. 19654 - 19659

Published: July 11, 2024

We evaluate the effectiveness of pretrained and fine-tuned large language models (LLMs) for predicting the synthesizability of inorganic compounds and the selection of precursors needed to perform synthesis. The predictions of LLMs are comparable to, and sometimes better than, recent bespoke machine learning models for these tasks but require only minimal user expertise, cost, and time to develop. Therefore, this strategy can serve both as an effective and strong baseline for future studies of various chemical applications and as a practical tool for experimental chemists.

Language: English

Machine Learning-Guided Protein Engineering (Creative Commons)
Petr Kouba, Pavel Kohout, Faraneh Haddadi

et al.

ACS Catalysis, Journal Year: 2023, Volume and Issue: 13(21), P. 13863 - 13895

Published: Oct. 13, 2023

Recent progress in engineering highly promising biocatalysts has increasingly involved machine learning methods. These methods leverage existing experimental and simulation data to aid in the discovery and annotation of enzymes, as well as in suggesting beneficial mutations for improving known targets. The field of machine learning for protein engineering is gathering steam, driven by recent success stories and notable progress in other areas. It already encompasses ambitious tasks such as understanding and predicting protein structure and function, catalytic efficiency, enantioselectivity, protein dynamics, stability, solubility, aggregation, and more. Nonetheless, the field is still evolving, with many challenges to overcome and questions to address. In this Perspective, we provide an overview of ongoing trends in this domain, highlight recent case studies, and examine the current limitations of machine learning-based methods. We emphasize the crucial importance of thorough experimental validation of emerging models before their use for rational protein design. We present our opinions on the fundamental problems and outline the potential directions for future research.

Language: English

Citations

91

Revolutionizing drug formulation development: The increasing impact of machine learning
Zeqing Bao, Jack Bufton, Riley J. Hickman

et al.

Advanced Drug Delivery Reviews, Journal Year: 2023, Volume and Issue: 202, P. 115108

Published: Sept. 27, 2023

Language: English

Citations

53

Reinvent 4: Modern AI–driven generative molecule design (Creative Commons)
Hannes H. Loeffler, Jiazhen He, Alessandro Tibo

et al.

Journal of Cheminformatics, Journal Year: 2024, Volume and Issue: 16(1)

Published: Feb. 21, 2024

REINVENT 4 is a modern open-source generative AI framework for the design of small molecules. The software utilizes recurrent neural networks and transformer architectures to drive molecule generation. These generators are seamlessly embedded within the general machine learning optimization algorithms: transfer learning, reinforcement learning, and curriculum learning. REINVENT 4 enables and facilitates de novo design, R-group replacement, library design, linker design, scaffold hopping, and molecule optimization. This contribution gives an overview of the software and describes its design. Algorithms and their applications are discussed in detail. REINVENT 4 is a command line tool which reads a user configuration in either TOML or JSON format. The aim of this release is to provide reference implementations for some of the most common algorithms in AI-based molecule generation. An additional goal with the release is to create a framework for education and future innovation in AI-based molecular design. The software is available from https://github.com/MolecularAI/REINVENT4 and released under the permissive Apache 2.0 license. Scientific contribution. The software provides an open-source reference implementation for generative molecular design, where the software is also being used in production to support in-house drug discovery projects. The publication of the most common machine learning algorithms in one code, and full documentation thereof, will increase transparency of AI and foster innovation, collaboration, and education.

Language: English

Citations

51

In Pursuit of the Exceptional: Research Directions for Machine Learning in Chemical and Materials Science
Joshua Schrier, Alexander J. Norquist, Tonio Buonassisi

et al.

Journal of the American Chemical Society, Journal Year: 2023, Volume and Issue: 145(40), P. 21699 - 21716

Published: Sept. 27, 2023

Exceptional molecules and materials with one or more extraordinary properties are both technologically valuable and fundamentally interesting, because they often involve new physical phenomena or new compositions that defy expectations. Historically, exceptionality has been achieved through serendipity, but recently, machine learning (ML) and automated experimentation have been widely proposed to accelerate target identification and synthesis planning. In this Perspective, we argue that the data-driven methods commonly used today are well-suited for optimization but not for the realization of new exceptional materials or molecules. Finding such outliers should be possible using ML, but only by shifting away from using traditional ML approaches that tweak the composition, crystal structure, or reaction pathway. We highlight case studies of high-Tc oxide superconductors and superhard materials to demonstrate the challenges of ML-guided discovery and discuss the limitations of automation for this task. We then provide six recommendations for the development of ML methods capable of exceptional materials discovery: (i) Avoid the tyranny of the middle and focus on extrema; (ii) When data are limited, qualitative predictions that provide direction are more valuable than interpolative accuracy; (iii) Sample what can be made and how to make it, and defer optimization; (iv) Create room (and look) for the unexpected while pursuing your goal; (v) Try to fill-in-the-blanks of the input and output space; (vi) Do not confuse human understanding with model interpretability. We conclude with a description of how these recommendations can be integrated into automated discovery workflows, which should enable the discovery of exceptional molecules and materials.

Language: English

Citations

46

Machine learning in preclinical drug discovery

Denise B. Catacutan, Jeremie Alexander, Autumn Arnold

et al.

Nature Chemical Biology, Journal Year: 2024, Volume and Issue: 20(8), P. 960 - 973

Published: July 19, 2024

Language: English

Citations

41

Embracing data science in catalysis research
Manu Suvarna, Javier Pérez‐Ramírez

Nature Catalysis, Journal Year: 2024, Volume and Issue: 7(6), P. 624 - 635

Published: April 23, 2024

Language: English

Citations

27

AI for targeted polypharmacology: The next frontier in drug discovery
Anna Cichońska, Balaguru Ravikumar, Rayees Rahman

et al.

Current Opinion in Structural Biology, Journal Year: 2024, Volume and Issue: 84, P. 102771

Published: Jan. 11, 2024

Language: English

Citations

22

Invalid SMILES are beneficial rather than detrimental to chemical language models (Creative Commons)
Michael A. Skinnider

Nature Machine Intelligence, Journal Year: 2024, Volume and Issue: 6(4), P. 437 - 448

Published: March 29, 2024

Generative machine learning models have attracted intense interest for their ability to sample novel molecules with desired chemical or biological properties. Among these, language models trained on SMILES (Simplified Molecular-Input Line-Entry System) representations have been subject to the most extensive experimental validation and have been widely adopted. However, these models have what is perceived to be a major limitation: some fraction of the SMILES strings that they generate are invalid, meaning that they cannot be decoded to a chemical structure. This perceived shortcoming has motivated a remarkably broad spectrum of work designed to mitigate the generation of invalid SMILES or correct them post hoc. Here I provide causal evidence that the ability to produce invalid outputs is not harmful but is instead beneficial to chemical language models. I show that the generation of invalid outputs provides a self-corrective mechanism that filters low-likelihood samples from the language model output. Conversely, enforcing valid outputs produces structural biases in the generated molecules, impairing distribution learning and limiting generalization to unseen chemical space. Together, these results refute the prevailing assumption that invalid SMILES are a shortcoming of chemical language models and reframe them as a feature, not a bug.

Language: English

Citations

17

Recent Advances in Machine Learning‐Assisted Multiscale Design of Energy Materials (Creative Commons)
Bohayra Mortazavi

Advanced Energy Materials, Journal Year: 2024, Volume and Issue: unknown

Published: Dec. 10, 2024

This review highlights recent advances in machine learning (ML)‐assisted design of energy materials. Initially, ML algorithms were successfully applied to screen materials databases by establishing complex relationships between atomic structures and their resulting properties, thus accelerating the identification of candidates with desirable properties. Recently, the development of highly accurate ML interatomic potentials and generative models has not only improved the robust prediction of physical properties but also significantly accelerated the discovery of materials. In the past couple of years, ML methods have enabled high‐precision first‐principles predictions of electronic and optical properties for large systems, providing unprecedented opportunities in materials science. Furthermore, ML‐assisted microstructure reconstruction and physics‐informed solutions for partial differential equations have facilitated the understanding of microstructure–property relationships. Most recently, the seamless integration of various ML platforms has led to the emergence of autonomous laboratories that combine quantum mechanical calculations, large language models, and experimental validations, fundamentally transforming the traditional approach to novel materials synthesis. While highlighting the aforementioned recent advances, existing challenges are also discussed. Ultimately, ML is expected to fully integrate atomic‐scale simulations, reverse engineering, process optimization, and device fabrication, empowering autonomous and generative energy system design. This will drive transformative innovations in energy conversion, storage, and harvesting technologies.

Language: English

Citations

17

Allosteric drugs: New principles and design approaches
Wei-Ven Tee, Igor N. Berezovsky

Current Opinion in Structural Biology, Journal Year: 2024, Volume and Issue: 84, P. 102758

Published: Jan. 2, 2024

Language: English

Citations

16