Elucidating structures from spectra using multimodal embeddings and discrete optimization

A.H. Mirza, Kevin Maik Jablonka

Published: Nov. 25, 2024

Structure elucidation, determining molecular structures from spectroscopic data, remains one of chemistry's most fundamental and challenging tasks, essential for advancing fields from drug discovery to materials science. While machine learning approaches have attempted to automate this process, they typically focus on single techniques and lack crucial confidence metrics, limiting their practical utility. Here, we present spec2struct, a framework that synergistically combines multimodal embeddings, contrastive learning, and evolutionary algorithms to mimic how expert chemists approach structure determination. By aligning encoders for diverse spectra with molecular representations, our system can simultaneously interpret multiple types of evidence. This alignment guides genetic algorithms to evolve chemically valid candidates that best match the experimental data. spec2struct not only outperforms existing methods but also provides calibrated and contextualized confidence estimates. We demonstrate its real-world impact by identifying several published structures incorrectly assigned in the literature. The combination of performance, reliability, and versatility positions spec2struct as a powerful tool for accelerating chemical discovery.

Language: English
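
The abstract above describes an embedding alignment that guides a genetic algorithm toward candidates matching the experimental data. The following is a minimal toy sketch of that general idea only, not the spec2struct implementation: candidates are strings over a made-up alphabet rather than molecules, `embed` is a hypothetical stand-in for a learned encoder, and the fitness is the cosine similarity between a candidate's embedding and a target embedding that is assumed to come from an aligned spectrum encoder. All names and parameters are illustrative assumptions.

```python
import math
import random

ALPHABET = "CNOF"  # toy symbols; an assumed stand-in for real molecular graphs


def embed(candidate: str) -> list:
    """Hypothetical stand-in for a learned encoder: symbol counts as a vector."""
    return [float(candidate.count(symbol)) for symbol in ALPHABET]


def cosine(u, v) -> float:
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def mutate(candidate: str, rng: random.Random) -> str:
    """Replace one randomly chosen position with a random symbol."""
    position = rng.randrange(len(candidate))
    return candidate[:position] + rng.choice(ALPHABET) + candidate[position + 1:]


def evolve(target_embedding, length=8, population_size=30, generations=40, seed=0):
    """Evolve candidates whose embedding best matches the target embedding."""
    rng = random.Random(seed)
    population = [
        "".join(rng.choice(ALPHABET) for _ in range(length))
        for _ in range(population_size)
    ]

    def fitness(candidate: str) -> float:
        return cosine(embed(candidate), target_embedding)

    for _ in range(generations):
        # Keep the best half unchanged, then refill with mutated survivors.
        population.sort(key=fitness, reverse=True)
        survivors = population[: population_size // 2]
        children = [mutate(rng.choice(survivors), rng) for _ in survivors]
        population = survivors + children

    best = max(population, key=fitness)
    return best, fitness(best)


if __name__ == "__main__":
    # Pretend this vector came from a spectrum encoder aligned with `embed`.
    target = embed("CCCCNNOF")
    best, score = evolve(target)
    print(best, round(score, 3))
```

Because the best half of each generation is carried over unchanged, the best score never decreases between generations. A realistic system would replace the string mutation with chemically valid graph edits and the count-based `embed` with trained encoders.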

Accurate and Efficient Structure Elucidation from Routine One-Dimensional NMR Spectra Using Multitask Machine Learning
Frank Hu, Michael S. Chen, Grant M. Rotskoff

et al.

ACS Central Science, Journal Year: 2024, Volume and Issue: 10(11), P. 2162 - 2170

Published: Nov. 13, 2024

Rapid determination of molecular structures can greatly accelerate workflows across many chemical disciplines. However, elucidating structure using only one-dimensional (1D) NMR spectra, the most readily accessible data, remains an extremely challenging problem because of the combinatorial explosion of the number of possible molecules as the number of constituent atoms is increased. Here, we introduce a multitask machine learning framework that predicts the molecular structure (formula and connectivity) of an unknown compound solely based on its 1D 1H and/or 13C NMR spectra. First, we show how a transformer architecture can be constructed to efficiently solve the task, traditionally performed by chemists, of assembling large numbers of molecular fragments into molecular structures. Integrating this capability with a convolutional neural network, we build an end-to-end model for predicting structure from spectra that is fast and accurate. We demonstrate the effectiveness of this framework on molecules with up to 19 heavy (non-hydrogen) atoms, a size for which there are trillions of possible structures. Without relying on any prior chemical knowledge such as the molecular formula, we show that our approach predicts the exact molecule 69.6% of the time within the first 15 predictions, reducing the search space by 11 orders of magnitude.

Language: English

Citations

4
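
The "11 orders of magnitude" figure in the abstract above can be sanity-checked with simple arithmetic. The sketch below assumes that "trillions" means a search space of roughly 1e12 structures; that value is an assumption, since the abstract does not give an exact count. The helper name `orders_of_magnitude_reduction` is illustrative.

```python
import math


def orders_of_magnitude_reduction(search_space_size: float, shortlist_size: int) -> float:
    """Return log10 of the factor by which a shortlist shrinks the search space."""
    return math.log10(search_space_size / shortlist_size)


# Assumption: "trillions" of candidate structures is taken as 1e12.
# The shortlist size of 15 comes from "within the first 15 predictions".
reduction = orders_of_magnitude_reduction(1e12, 15)
print(round(reduction, 1))  # 10.8
```

Under this assumption the reduction is about 10.8, which rounds to the 11 orders of magnitude quoted; a larger search space (several trillion structures) would push the figure slightly higher.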

Nuclear Magnetic Resonance and Artificial Intelligence
Stefan Kühn, Rômulo Pereira de Jesus, Ricardo M. Borges

et al.

Encyclopedia, Journal Year: 2024, Volume and Issue: 4(4), P. 1568 - 1580

Published: Oct. 18, 2024

This review explores the current applications of artificial intelligence (AI) in nuclear magnetic resonance (NMR) spectroscopy, with a particular emphasis on small molecule chemistry. Applications of AI techniques, especially machine learning (ML) and deep learning (DL), in the areas of shift prediction, spectral simulations, spectral processing, structure elucidation, mixture analysis, and metabolomics are demonstrated. The review also shows where progress is limited.

Language: English

Citations

1

Chemical shift prediction in 13C NMR spectroscopy using ensembles of message passing neural networks (MPNNs)
D.S. Williamson, S. Lopez-Ponte, Isabel Iglesias

et al.

Journal of Magnetic Resonance, Journal Year: 2024, Volume and Issue: 368, P. 107795 - 107795

Published: Oct. 30, 2024

This study reports a deep learning approach that utilises message passing neural networks (MPNNs) for predicting chemical shifts in

Language: English

Citations

1
