DeepSPInN – deep reinforcement learning for molecular structure prediction from infrared and 13C NMR spectra

Bibliographic Details
Published in: Digital Discovery 2024-04, Vol. 3 (4), p. 818-829
Main Authors: Devata, Sriram; Sridharan, Bhuvanesh; Mehta, Sarvesh; Pathak, Yashaswi; Laghuvarapu, Siddhartha; Varma, Girish; Priyakumar, U. Deva
Format: Article
Language: English
Description
Summary: Molecular spectroscopy studies the interaction of molecules with electromagnetic radiation, and interpreting the resulting spectra is invaluable for deducing molecular structures. However, predicting the molecular structure from spectroscopic data is a strenuous task that requires highly specific domain knowledge. DeepSPInN is a deep reinforcement learning method that predicts the molecular structure from infrared and 13C nuclear magnetic resonance (NMR) spectra. It formulates molecular structure prediction as a Markov decision process (MDP) and employs Monte Carlo tree search to explore and choose actions in that MDP. On the QM9 dataset, DeepSPInN predicts the correct molecular structure for 91.5% of the input spectra in an average time of 77 seconds for molecules with fewer than 10 heavy atoms. This study is the first of its kind to use only infrared and 13C NMR spectra for molecular structure prediction without referring to any pre-existing spectral databases or molecular fragment knowledge bases, and is a leap forward in automated molecular spectral analysis.
ISSN: 2635-098X
DOI: 10.1039/D4DD00008K
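
Illustrative note: the summary describes casting structure prediction as an MDP that is explored with Monte Carlo tree search. The sketch below shows only that general idea on a toy problem and is not the authors' implementation. Everything in it is an assumption made for illustration: the state is a tuple of atom symbols, an action appends one atom from a hypothetical set `ATOMS`, the "spectrum" is a stand-in (element counts), and the reward is a simple similarity to the target. No neural network, real spectra, or bond-building actions are involved.

```python
import math
import random
from collections import Counter

ATOMS = ("C", "N", "O")  # hypothetical action set: which atom to append next


def toy_spectrum(state):
    """Stand-in for a real spectrum (assumption): element counts of the state."""
    return Counter(state)


def reward(state, target_spectrum):
    """Similarity in (0, 1]; 1.0 means the toy spectra match exactly."""
    predicted = toy_spectrum(state)
    distance = sum(abs(predicted[a] - target_spectrum[a]) for a in ATOMS)
    return 1.0 / (1.0 + distance)


class Node:
    def __init__(self, state, parent=None):
        self.state = state        # tuple of atom symbols chosen so far
        self.parent = parent
        self.children = {}        # action -> Node
        self.visits = 0
        self.total_reward = 0.0


def uct_child(node, c=1.4):
    """Return the child with the highest UCB1 score."""
    def score(child):
        mean = child.total_reward / child.visits
        return mean + c * math.sqrt(math.log(node.visits) / child.visits)
    return max(node.children.values(), key=score)


def search(root_state, target_spectrum, n_atoms, iterations, rng):
    """Run MCTS from root_state and return the most-visited first action."""
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded and non-terminal.
        while len(node.state) < n_atoms and len(node.children) == len(ATOMS):
            node = uct_child(node)
        # Expansion: add one untried action unless the node is terminal.
        if len(node.state) < n_atoms:
            action = rng.choice([a for a in ATOMS if a not in node.children])
            child = Node(node.state + (action,), parent=node)
            node.children[action] = child
            node = child
        # Simulation: random rollout to a terminal state.
        state = node.state
        while len(state) < n_atoms:
            state += (rng.choice(ATOMS),)
        value = reward(state, target_spectrum)
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.total_reward += value
            node = node.parent
    return max(root.children, key=lambda a: root.children[a].visits)


def predict(target_spectrum, n_atoms, iterations=400, seed=0):
    """Build a state one action at a time, re-running the search at each step."""
    rng = random.Random(seed)
    state = ()
    while len(state) < n_atoms:
        state += (search(state, target_spectrum, n_atoms, iterations, rng),)
    return state


target = toy_spectrum(("C", "C", "O", "N"))
result = predict(target, n_atoms=4)
print(result, reward(result, target))
```

In this toy setting, several atom orderings share the same element counts, so any of them counts as a match. A real system would need a state and action space that encodes bonds and a reward derived from predicted versus observed spectra.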