Nature Machine Intelligence, Year: 2024, Issue: unknown
Published: Nov. 27, 2024
Language: English
Coordination Chemistry Reviews, Year: 2023, Issue: 484, Pages: 215112 - 215112
Published: March 21, 2023
The reticular chemistry of metal–organic frameworks (MOFs) allows for the generation of an almost boundless number of materials, some of which can be a substitute for the traditionally used porous materials in various fields including gas storage and separation, catalysis, and drug delivery. MOFs and their potential applications are growing so quickly that, when novel MOFs are synthesized, testing them for all possible applications is not practical. High-throughput computational screening approaches based on molecular simulations have been widely used to investigate MOFs and identify the optimal MOFs for a specific application. Despite the growing computational resources, given the enormous MOF material space, identification of promising materials requires more efficient approaches in terms of time and effort. Leveraging data-driven science techniques can offer key benefits such as accelerated MOF design and discovery pathways via the establishment of machine learning (ML) models and interpretation of complex structure-performance relationships that reach beyond expert intuition. In this review, we present the scientific breakthroughs that propelled computational modeling of MOFs and discuss the state-of-the-art extending from molecular simulations to ML algorithms. Finally, we provide our perspective on the opportunities and challenges for the future of big data science in MOF design and discovery.
Language: English
Cited by: 115
npj Computational Materials, Year: 2023, Issue: 9(1)
Published: April 22, 2023
Accurate and efficient prediction of polymer properties is of great significance in polymer design. Conventionally, expensive and time-consuming experiments or simulations are required to evaluate polymer functions. Recently, Transformer models, equipped with self-attention mechanisms, have exhibited superior performance in natural language processing. However, such methods have not been investigated in polymer sciences. Herein, we report TransPolymer, a Transformer-based language model for polymer property prediction. Our proposed polymer tokenizer with chemical awareness enables learning representations from polymer sequences. Rigorous experiments on ten polymer property prediction benchmarks demonstrate the superior performance of TransPolymer. Moreover, we show that TransPolymer benefits from pretraining on a large unlabeled dataset via Masked Language Modeling. Experimental results further manifest the important role of self-attention in modeling polymer sequences. We highlight this model as a promising computational tool for promoting rational polymer design and understanding structure-property relationships from a data science view.
Language: English
Cited by: 85
Nature Communications, Year: 2024, Issue: 15(1)
Published: March 1, 2024
Gas separation is crucial for industrial production and environmental protection, with metal-organic frameworks (MOFs) offering a promising solution due to their tunable structural properties and chemical compositions. Traditional simulation approaches, such as molecular dynamics, are complex and computationally demanding. Although feature engineering-based machine learning methods perform better, they are susceptible to overfitting because of limited labeled data. Furthermore, these methods are typically designed for single tasks, such as predicting gas adsorption capacity under specific conditions, which restricts the utilization of comprehensive datasets including all adsorption capacities. To address these challenges, we propose Uni-MOF, an innovative framework for large-scale, three-dimensional MOF representation learning, designed for multi-purpose gas prediction. Specifically, Uni-MOF serves as a versatile gas adsorption estimator for MOF materials, employing pure three-dimensional representations learned from over 631,000 collected MOF and COF structures. Our experimental results show that Uni-MOF can automatically extract structural representations and predict adsorption capacities under various operating conditions using a single model. For simulated data, Uni-MOF exhibits remarkably high predictive accuracy across all datasets. Additionally, the values predicted by Uni-MOF correspond with the outcomes of adsorption experiments. Uni-MOF demonstrates considerable potential for broad applicability in predicting a wide array of other properties.
Language: English
Cited by: 33
Journal of Molecular Structure, Year: 2024, Issue: 1304, Pages: 137687 - 137687
Published: Feb. 2, 2024
Language: English
Cited by: 18
Coordination Chemistry Reviews, Year: 2024, Issue: 514, Pages: 215888 - 215888
Published: May 8, 2024
Language: English
Cited by: 18
Nano Letters, Year: 2024, Issue: 24(10), Pages: 2953 - 2960
Published: March 4, 2024
Porous membranes, either polymeric or two-dimensional materials, have been extensively studied because of their outstanding performance in many applications such as water filtration. Recently, inspired by the significant success of machine learning (ML) in many areas of scientific discovery, researchers have started to tackle the problem in the field of membrane design using data-driven ML tools. In this Mini Review, we summarize research efforts on three types of applications of ML in membrane design, including (1) membrane property prediction using ML, (2) gaining physical insight and drawing quantitative relationships between membrane properties and performance using explainable artificial intelligence, and (3) ML-guided design, optimization, or virtual screening of membranes. On top of the review of previous research, we discuss the challenges associated with applying ML for membrane design and potential future directions.
Language: English
Cited by: 17
Nature Water, Year: 2024, Issue: 2(8), Pages: 706 - 718
Published: Aug. 8, 2024
Language: English
Cited by: 16
ACS Catalysis, Year: 2023, Issue: 13(24), Pages: 16032 - 16044
Published: Nov. 30, 2023
Efficient catalyst screening necessitates predictive models for adsorption energy, which is a key descriptor of reactivity. Prevailing methods, notably graph neural networks (GNNs), demand precise atomic coordinates for constructing graph representations, while the integration of observable attributes remains challenging. This research introduces CatBERTa, an energy prediction Transformer model that uses textual inputs. Built on a Transformer encoder pretrained for language modeling purposes, CatBERTa processes human-interpretable text, incorporating target features. Attention score analysis reveals CatBERTa's focus on tokens related to adsorbates, bulk composition, and their interacting atoms. Moreover, interacting atoms emerge as effective descriptors for adsorption configurations, while factors such as the bond length and atomic properties of these atoms offer limited predictive contributions. In predicting adsorption energy from textual representations of initial structures, CatBERTa exhibits a precision comparable to that of conventional GNNs. Notably, in subsets recognized for their high accuracy with GNNs, CatBERTa consistently achieves a mean absolute error of 0.35 eV. Furthermore, the subtraction of the CatBERTa-predicted energies effectively cancels out their systematic errors by as much as 19.3% for chemically similar systems, surpassing the error reduction observed in GNNs. This outcome highlights its potential to enhance the accuracy of energy difference predictions. This research establishes a fundamental framework for text-based catalyst property prediction without relying on graph representations, while also unveiling intricate feature–property relationships.
Language: English
Cited by: 23
Journal of the American Chemical Society, Year: 2024, Issue: 146(29), Pages: 20333 - 20348
Published: July 10, 2024
Metal-organic frameworks (MOFs) are porous materials with applications in gas separations and catalysis, but a lack of water stability often limits their practical use given the ubiquity of water. Consequently, it is useful to predict whether a MOF is water-stable before investing time and resources into synthesis. Existing heuristics for designing water-stable MOFs lack generality and limit the diversity of explored chemistry due to narrowly defined criteria. Machine learning (ML) models offer the promise to improve the generality of predictions but require data. In an improvement on previous efforts, we enlarge the available training data for MOF water stability prediction by over 400%, adding 911 MOFs with water stability labels assigned through semiautomated manuscript analysis to curate the new data set WS24. The additional data are shown to improve ML model performance (test ROC-AUC > 0.8) for diverse MOFs in both water and harsher acidic conditions. We illustrate how the expanded data set and ML models can be used with previously developed activation stability ML models in combination with genetic algorithms to quickly screen ∼10,000 MOFs from a space of hundreds of thousands for candidates with multivariate stability (upon activation, in water, and in acid). We uncover metal- and geometry-specific design rules for robust MOFs. The data set and ML models developed in this work, which we disseminate through an easy-to-use web interface, are expected to contribute toward the accelerated discovery of novel, water-stable MOFs for applications such as direct air capture and water treatment.
Language: English
Cited by: 15
Journal of Chemical Information and Modeling, Year: 2024, Issue: 64(13), Pages: 4958 - 4965
Published: March 26, 2024
Along with the development of machine learning, deep learning, and large language models (LLMs) such as GPT-4 (GPT: Generative Pre-Trained Transformer), artificial intelligence (AI) tools have been playing an increasingly important role in chemical and material research to facilitate material screening and design. Despite the exciting progress of GPT-4 based AI research assistance, open-source LLMs have not gained much attention from the scientific community. This work primarily focused on metal–organic frameworks (MOFs) as a subdomain of chemistry and evaluated six top-rated open-source LLMs with a comprehensive set of tasks including MOFs knowledge, basic chemistry knowledge, in-depth chemistry knowledge, knowledge extraction, database reading, predicting material property, experiment design, computational scripts generation, guiding experiment, data analysis, and scientific paper polishing, which covers the basic units of MOFs research. In general, these LLMs were capable of most of the tasks. In particular, Llama2-7B and ChatGLM2-6B were found to perform particularly well with moderate computational resources. Additionally, the performance of different parameter versions of the same model was compared, which revealed the superior performance of the higher parameter versions.
Language: English
Cited by: 14