NSF
Spectral analysis decodes the structures and properties of unknown molecules by analyzing the wavelengths at which they interact with electromagnetic radiation. It is crucial for scientific discovery and for practical applications in fields including materials manufacturing, drug design, food safety, explosive detection, and non-invasive diagnosis. Spectral analysis identifies unknown molecules rapidly, sensitively, non-destructively, and cost-effectively through their characteristic numerical signals, and it outperforms traditional chemical analysis. However, translating numerical signals into molecular structures is currently resource-intensive and not user-friendly, because it often requires extensive trial and error and specialized training. This project aims to make spectral analysis automatic, accelerated, and accurate using state-of-the-art artificial intelligence (AI). The project will treat spectral signals and molecular structures as two different "languages". Models developed in this project will automatically express spectral signals and molecular structures as descriptions of molecules in the two languages and enable rapid conversion between the two descriptions using advanced AI-powered language translation tools. The resulting universal toolkit will simplify and streamline spectral analysis in practical scenarios and benefit applications in scientific research, national healthcare, national security, educational activities, and other domains.
The primary intellectual contribution of this project is the development of a novel chemistry-informed, multi-modal, powerful, and flexible deep learning framework to realize automatic, accelerated, and accurate end-to-end spectrum-to-structure translation. Investigators will adapt and leverage foundation models from the frontier of AI, especially pre-trained large language models (LLMs) built on the Transformer architecture.
This project will design an encoder-decoder architecture in which the spectrum encoder converts the input numerical spectral signals (e.g., wavenumber-absorbance pairs from infrared (IR) spectra and chemical shift-intensity pairs from nuclear magnetic resonance (NMR) spectra) into context vectors, and the structure decoder transforms these context vectors into an output molecular fingerprint containing the two-dimensional (2D) topological structures and three-dimensional (3D) spatial conformations of the target molecules. The encoder and decoder will be pre-trained on high-quality data sets of molecular spectra from experimental measurements and theoretical calculations, then fine-tuned to boost performance. The project will pursue three fundamental thrusts: (a) developing natural-language representations for both spectral signals and molecular structures that align with the architecture of foundation models, (b) designing multi-modal learning frameworks that use pre-trained foundation models as backbones and inject chemical constraints as domain-specific knowledge for end-to-end spectrum-to-structure translation, and (c) tailoring these multi-modal learning frameworks with chemistry- and data-informed schemes so that they fit the practical instrumental analysis pipeline used in real-life scenarios. This project will demonstrate significance across a broad range of disciplines where spectral analysis is essential for identifying or recognizing single molecules or molecular mixtures, including chemistry, biology, medicine, pharmacology, astronomy, security, materials science, food science, and environmental science. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
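As a rough illustration of thrust (a), the sketch below turns wavenumber-absorbance pairs into a short sequence of discrete tokens that a sequence model could read like words in a sentence. It is not the project's method: the function name `spectrum_to_tokens`, the binning scheme, the token format, and the sample peak values are all assumptions made up for this example.

```python
from typing import List, Tuple


def spectrum_to_tokens(
    peaks: List[Tuple[float, float]],
    wavenumber_bin: float = 10.0,   # assumed bin width in cm^-1
    intensity_levels: int = 4,      # assumed number of intensity classes
    min_absorbance: float = 0.05,   # assumed noise cutoff
) -> List[str]:
    """Map (wavenumber, absorbance) pairs to discrete tokens (hypothetical scheme)."""
    # Drop peaks below the noise cutoff.
    kept = [(w, a) for w, a in peaks if a >= min_absorbance]
    if not kept:
        return []
    max_a = max(a for _, a in kept)
    tokens = []
    # Emit tokens in order of increasing wavenumber.
    for w, a in sorted(kept):
        # Round the wavenumber down to the start of its bin.
        w_bin = int(w // wavenumber_bin * wavenumber_bin)
        # Scale absorbance by the strongest peak and bucket it into a level.
        level = min(int(a / max_a * intensity_levels), intensity_levels - 1)
        tokens.append(f"W{w_bin}_I{level}")
    return tokens


# Illustrative, made-up IR peaks: (wavenumber in cm^-1, absorbance).
example_ir = [(1715.3, 0.92), (2961.8, 0.40), (1052.0, 0.02), (1238.6, 0.61)]
tokens = spectrum_to_tokens(example_ir)
print(" ".join(tokens))
```

In an encoder-decoder setup like the one described above, a token sequence of this kind would be the encoder's input, while the decoder would emit a corresponding sequence describing the molecular structure. The bin width and number of intensity levels trade resolution against vocabulary size and would need to be chosen from real data.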
Up to $500K
2027-06-30