Integrating Machine Learning and Large Language Models to Advance Exploration of Electrochemical Reactions
Zhiling Zheng, Federico Florit, Brooke Jin, et al.

Published: Aug. 28, 2024

Electrochemical C-H oxidation reactions offer a sustainable route to functionalize hydrocarbons, yet the identification of competent substrates and the optimization of their synthesis remain challenging. Here, we report an integrated approach combining machine learning (ML) and large language models (LLMs) to streamline the exploration of electrochemical C-H oxidation reactions. Utilizing a batch rapid screening platform, we evaluated a wide range of reactions, initially classifying substrates by their reactivity, while LLMs text-mined literature data to augment the training set. The resulting ML models, one for reactivity prediction and the other for site selectivity, both achieved high accuracy (>90%) and enabled virtual screening of a set of commercially available molecules. To optimize the reaction conditions for substrates of interest upon screening, LLMs were prompted to generate code to iteratively improve yield, lowering the barrier for scientists to access ML programs, and this strategy efficiently identified high-yield conditions for eight drug-like substances or intermediates. Notably, we benchmarked the accuracy and reliability of 10 different LLMs, including llama, Claude, and GPT-4, on generating and executing code related to ML based on natural language prompts given by chemists, to showcase their tool-making and tool-using capabilities and their potential for accelerating research across four diverse tasks. In addition, we collected an experimental benchmark dataset comprising 1071 reaction conditions and yields, and our findings revealed that integrating LLMs and ML outperformed using either method alone. We envision that this combined approach offers a robust and generalizable pathway for advancing synthetic chemistry research.
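The abstract does not say which optimizer the generated code used, so the following is only a minimal sketch of what "iteratively improving yield" over reaction conditions could look like. It uses a plain greedy coordinate search over a small discrete grid, which is a swapped-in technique and not the authors' method. The condition names, grid values, and the `mock_yield` function are all hypothetical placeholders standing in for real experiments.

```python
# Hypothetical condition grid; names and values are illustrative only.
GRID = {
    "current_mA": [5, 10, 15, 20],
    "temperature_C": [25, 40, 60],
    "electrolyte_M": [0.05, 0.1, 0.2],
}


def mock_yield(cond):
    """Stand-in for a wet-lab yield measurement (made-up function, not real data)."""
    return max(
        0.0,
        90.0
        - 0.5 * (cond["current_mA"] - 15) ** 2
        - 0.02 * (cond["temperature_C"] - 40) ** 2
        - 400.0 * (cond["electrolyte_M"] - 0.1) ** 2,
    )


def coordinate_search(grid, measure, start, max_rounds=10):
    """Greedy coordinate search: vary one condition at a time, keep any improvement."""
    best = dict(start)
    best_yield = measure(best)
    for _ in range(max_rounds):
        improved = False
        for name, values in grid.items():
            for value in values:
                trial = {**best, name: value}
                trial_yield = measure(trial)
                if trial_yield > best_yield:
                    best, best_yield, improved = trial, trial_yield, True
        if not improved:
            break  # no single-variable change helps any more
    return best, best_yield


start = {"current_mA": 5, "temperature_C": 25, "electrolyte_M": 0.05}
best, best_yield = coordinate_search(GRID, mock_yield, start)
print(best, best_yield)
```

In practice each call to `measure` would correspond to running one experiment, so a sample-efficient method such as Bayesian optimization would likely be preferred; the greedy loop is used here only because it is short and dependency-free.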

Language: English

Citations: 1