Featurization strategies for polymer sequence or composition design by machine learning
Roshan Patel, Carlos H. Borca, Michael Webb, et al.

Molecular Systems Design & Engineering, Journal Year: 2022, Volume and Issue: 7(6), P. 661 - 676

Published: Jan. 1, 2022

In this work, we present, evaluate, and analyze strategies for representing polymer chemistry to machine learning models for the advancement of data-driven sequence or composition design of macromolecules.

Language: English

Machine learning in materials science: From explainable predictions to autonomous design
Ghanshyam Pilania

Computational Materials Science, Journal Year: 2021, Volume and Issue: 193, P. 110360 - 110360

Published: March 10, 2021

Language: English

Citations

180

Benchmarking Machine Learning Models for Polymer Informatics: An Example of Glass Transition Temperature
Lei Tao, Vikas Varshney, Ying Li, et al.

Journal of Chemical Information and Modeling, Journal Year: 2021, Volume and Issue: 61(11), P. 5395 - 5413

Published: Oct. 18, 2021

In the field of polymer informatics, utilizing machine learning (ML) techniques to evaluate the glass transition temperature Tg and other properties of polymers has attracted extensive attention. This data-centric approach is much more efficient and practical than laborious experimental measurements when faced with a daunting number of polymer structures. Various ML models are demonstrated to perform well for Tg prediction. Nevertheless, they are trained on different data sets, using different structure representations, and based on different feature engineering methods. Thus, a critical question arises on selecting a proper ML model to better handle Tg prediction with generalization ability. To provide a fair comparison and examine the key factors that affect model performance, we carry out a systematic benchmark study by compiling 79 different ML models and training them on a large and diverse data set. The three major components in setting up an ML model are structure representations, feature representations, and ML algorithms. In terms of structure representation, we consider the monomer, the repeat unit, and the oligomer with longer chain structure. Based on that, the feature representation is calculated, including Morgan fingerprinting with or without substructure frequency, RDKit descriptors, molecular embedding, molecular graph, etc. Afterward, the obtained feature input is trained using different ML algorithms, such as deep neural networks, convolutional neural networks, random forest, support vector machine, LASSO regression, and Gaussian process regression. We evaluate the performance of these ML models using a holdout test set and an extra unlabeled data set from high-throughput molecular dynamics simulation. The model's generalization ability on the unlabeled data set is especially focused on, and the model's sensitivity to topology and molecular weight is also taken into consideration. This benchmark study provides not only a guideline for the Tg prediction task but also a useful reference for other polymer informatics tasks.

Language: English

Citations

136

Machine Learning on a Robotic Platform for the Design of Polymer–Protein Hybrids
Matthew Tamasi, Roshan Patel, Carlos H. Borca, et al.

Advanced Materials, Journal Year: 2022, Volume and Issue: 34(30)

Published: May 20, 2022

Polymer–protein hybrids are intriguing materials that can bolster protein stability in non‐native environments, thereby enhancing their utility in diverse medicinal, commercial, and industrial applications. One stabilization strategy involves designing synthetic random copolymers with compositions attuned to the protein surface, but rational design is complicated by the vast chemical and composition space. Here, a strategy is reported to design protein‐stabilizing copolymers based on active machine learning, facilitated by automated material synthesis and characterization platforms. The versatility and robustness of the approach is demonstrated by the successful identification of copolymers that preserve, or even enhance, the activity of three chemically distinct enzymes following exposure to thermal denaturing conditions. Although systematic screening results in mixed success, active learning appropriately identifies unique and effective copolymer chemistries for the stabilization of each enzyme. Overall, this work broadens the capabilities to design fit‐for‐purpose synthetic copolymers that promote or otherwise manipulate protein activity, with extensions toward the design of robust polymer–protein hybrid materials.

Language: English

Citations

119

Bioactive Synthetic Polymers
Kenward Jung, Nathaniel Corrigan, Edgar H. H. Wong, et al.

Advanced Materials, Journal Year: 2021, Volume and Issue: 34(2)

Published: Oct. 5, 2021

Synthetic polymers are omnipresent in society as textiles and packaging materials, in construction and medicine, among many other important applications. Alternatively, natural polymers play a crucial role in sustaining life and allowing organisms to adapt to their environments by performing key biological functions such as molecular recognition and transmission of genetic information. In general, the synthetic and natural polymer worlds are completely separated due to the inability of synthetic polymers to perform specific biological functions; in some cases, synthetic polymers cause uncontrolled and unwanted biological responses. However, owing to the advancement of polymerization techniques in recent years, new synthetic polymers have emerged that provide specific biological functions such as targeted molecular recognition of peptides, or present antiviral, anticancer, and antimicrobial activities. In this review, the emergence of this generation of bioactive synthetic polymers and their bioapplications are summarized. Finally, the future opportunities in this area are discussed.

Language: English

Citations

113

Machine-Learning-Guided Discovery of 19F MRI Agents Enabled by Automated Copolymer Synthesis

Marcus H. Reis, Filipp Gusev, Nicholas G. Taylor, et al.

Journal of the American Chemical Society, Journal Year: 2021, Volume and Issue: 143(42), P. 17677 - 17689

Published: Oct. 12, 2021

Modern polymer science suffers from the curse of multidimensionality. The large chemical space imposed by including combinations of monomers into a statistical copolymer overwhelms polymer synthesis and characterization technology and limits the ability to systematically study structure–property relationships. To tackle this challenge in the context of 19F magnetic resonance imaging (MRI) agents, we pursued a computer-guided materials discovery approach that combines synergistic innovations in automated flow synthesis and machine learning (ML) method development. A software-controlled, continuous polymer synthesis platform was developed to enable iterative experimental–computational cycles that resulted in the synthesis of 397 unique copolymer compositions within a six-variable compositional space. The nonintuitive design criteria identified by ML, which were accomplished by exploring <0.9% of the overall compositional space, lead to the identification of >10 copolymer compositions that outperformed state-of-the-art materials.

Language: English

Citations

111

polyBERT: a chemical language model to enable fully machine-driven ultrafast polymer informatics

Christopher Kuenneth, Rampi Ramprasad

Nature Communications, Journal Year: 2023, Volume and Issue: 14(1)

Published: July 11, 2023

Polymers are a vital part of everyday life. Their chemical universe is so large that it presents unprecedented opportunities as well as significant challenges to identify suitable application-specific candidates. We present a complete end-to-end machine-driven polymer informatics pipeline that can search this space for suitable candidates at unprecedented speed and accuracy. This pipeline includes a polymer chemical fingerprinting capability called polyBERT (inspired by Natural Language Processing concepts), and a multitask learning approach that maps the polyBERT fingerprints to a host of properties. polyBERT is a chemical linguist that treats the chemical structure of polymers as a chemical language. The present approach outstrips the best presently available concepts for polymer property prediction based on handcrafted fingerprint schemes in speed by two orders of magnitude while preserving accuracy, thus making it a strong candidate for deployment in scalable architectures including cloud infrastructures.

Language: English

Citations

101

A graph representation of molecular ensembles for polymer property prediction
Matteo Aldeghi, Connor W. Coley

Chemical Science, Journal Year: 2022, Volume and Issue: 13(35), P. 10486 - 10498

Published: Jan. 1, 2022

Synthetic polymers are versatile and widely used materials. Similar to small organic molecules, a large chemical space of such materials is hypothetically accessible. Computational property prediction and virtual screening can accelerate polymer design by prioritizing candidates expected to have favorable properties. However, in contrast to organic molecules, polymers are often not well-defined single structures but an ensemble of similar molecules, which poses unique challenges to traditional chemical representations and machine learning approaches. Here, we introduce a graph representation of molecular ensembles and an associated graph neural network architecture that is tailored to polymer property prediction. We demonstrate that this approach captures critical features of polymeric materials, like chain architecture, monomer stoichiometry, and degree of polymerization, and achieves superior accuracy to off-the-shelf cheminformatics methodologies. While doing so, we built a dataset of simulated electron affinity and ionization potential values for >40k polymers with varying composition, which may be used in the development of other machine learning approaches. The models presented in this work pave the path toward new classes of algorithms for polymer informatics and, more broadly, introduce a framework for the modeling of molecular ensembles.

Language: English

Citations

90

Emerging Trends in Machine Learning: A Polymer Perspective
Tyler B. Martin, Debra J. Audus

ACS Polymers Au, Journal Year: 2023, Volume and Issue: 3(3), P. 239 - 258

Published: Jan. 18, 2023

In the last five years, there has been tremendous growth in machine learning and artificial intelligence as applied to polymer science. Here, we highlight the unique challenges presented by polymers and how the field is addressing them. We focus on emerging trends with an emphasis on topics that have received less attention in the review literature. Finally, we provide an outlook for the field, outline important areas for polymer science, and discuss advances from the greater materials community.

Language: English

Citations

87

TransPolymer: a Transformer-based language model for polymer property predictions
Changwen Xu, Yuyang Wang, Amir Barati Farimani, et al.

npj Computational Materials, Journal Year: 2023, Volume and Issue: 9(1)

Published: April 22, 2023

Accurate and efficient prediction of polymer properties is of great significance in polymer design. Conventionally, expensive and time-consuming experiments or simulations are required to evaluate polymer functions. Recently, Transformer models, equipped with self-attention mechanisms, have exhibited superior performance in natural language processing. However, such methods have not been investigated in polymer sciences. Herein, we report TransPolymer, a Transformer-based language model for polymer property prediction. Our proposed polymer tokenizer with chemical awareness enables learning representations from polymer sequences. Rigorous experiments on ten polymer property prediction benchmarks demonstrate the superior performance of TransPolymer. Moreover, we show that TransPolymer benefits from pretraining on a large unlabeled dataset via Masked Language Modeling. Experimental results further manifest the important role of self-attention in modeling polymer sequences. We highlight this model as a promising computational tool for promoting rational polymer design and understanding structure-property relationships from a data science view.

Language: English

Citations

87

A general-purpose material property data extraction pipeline from large polymer corpora using natural language processing
Pranav Shetty, Arunkumar Chitteth Rajan, Chris Kuenneth, et al.

npj Computational Materials, Journal Year: 2023, Volume and Issue: 9(1)

Published: April 5, 2023

The ever-increasing number of materials science articles makes it hard to infer chemistry-structure-property relations from literature. We used natural language processing methods to automatically extract material property data from the abstracts of polymer literature. As a component of our pipeline, we trained MaterialsBERT, a language model, using 2.4 million materials science abstracts, which outperforms other baseline models in three out of five named entity recognition datasets. Using this pipeline, we obtained ~300,000 material property records from ~130,000 abstracts in 60 hours. The extracted data was analyzed for a diverse range of applications such as fuel cells, supercapacitors, and polymer solar cells to recover non-trivial insights. The data extracted through our pipeline is made available at polymerscholar.org, which can be used to locate material property data recorded in abstracts. This work demonstrates the feasibility of an automatic pipeline that starts from published literature and ends with extracted material property information.

Language: English

Citations

69