NSF
Scientists are increasingly incorporating machine learning (ML) and artificial intelligence (AI) techniques into their applications to accelerate and enhance scientific research and discovery across a wide range of disciplines. For example, machine learning has been successfully integrated into tools for weather forecasting, earth sciences, astronomy, high-resolution imaging, genomics, and molecular biology. However, the ever-growing size of scientific datasets results in prohibitive hardware resource costs, significantly complicating the deployment of these applications on high-performance computing platforms at scale. Lossy compression — a data reduction technique that significantly reduces dataset size by removing redundant or less important information — has proven effective for many scientific datasets, including those from cosmology and structural biology.

Despite its promise, integrating lossy compression into AI-driven scientific applications remains a non-trivial challenge, requiring broad expertise in data compression and machine learning, as well as a deep understanding of application requirements, system considerations, and their interactions. These complexities hinder the adoption of this powerful data reduction technique in scientific applications.

The overarching goal of this project is to address this gap by providing a cyberinfrastructure that seamlessly and adaptively integrates lossy compression into deep learning pipelines within scientific applications. This integration will reduce memory usage and communication overhead, enabling AI-for-Science applications to scale to massive datasets. The design includes several key innovations. First, it features a user-friendly interface that allows users to define accuracy requirements — which may evolve during application execution — and to instantiate different compressors, supporting both customization and extensibility. Second, it provides a software layer that integrates with popular deep learning frameworks, such as PyTorch, enabling compression to be applied to existing neural network models with minimal code modifications. Third, it incorporates an adaptive execution engine that dynamically selects the appropriate compressors and error bounds based on the desired accuracy, data characteristics (e.g., smoothness, value range, sparsity), model structure, and system configuration.

The cyberinfrastructure will support both existing and emerging machine learning accelerators and will be released as open-source software, accompanied by documentation and training materials to promote adoption within the scientific and computing communities. Ultimately, this project has the potential to benefit the broader community by enabling scalable, AI-driven scientific discovery. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
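To make the "error-bounded lossy compression" idea concrete, here is a minimal sketch of a uniform scalar quantizer that guarantees a user-specified absolute error bound on every reconstructed value. This is purely illustrative: the `compress`/`decompress` names, the quantization scheme, and the NumPy-based interface are assumptions for exposition, not the project's actual design, and a real scientific compressor would additionally entropy-code the quantized values to achieve actual size reduction.

```python
import numpy as np

def compress(data, error_bound):
    # Uniform scalar quantization: map each value to the nearest
    # multiple of 2*error_bound. This guarantees that every
    # reconstructed value differs from the original by at most
    # error_bound.
    return np.round(data / (2.0 * error_bound)).astype(np.int32)

def decompress(codes, error_bound):
    # Reconstruct by scaling the integer codes back.
    return codes * (2.0 * error_bound)

# Demonstrate the error-bound guarantee on synthetic data.
rng = np.random.default_rng(0)
data = rng.normal(size=1000).astype(np.float32)
eps = 1e-2

codes = compress(data, eps)
recon = decompress(codes, eps)
max_err = float(np.max(np.abs(data - recon)))
print(f"max reconstruction error: {max_err:.4e} (bound: {eps:.4e})")
```

In a production compressor the integer codes would be passed through an entropy coder (e.g., Huffman or arithmetic coding), where the reduced symbol alphabet created by quantization is what yields the compression ratio; the quantization step alone only establishes the accuracy guarantee.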
Up to $300K
2028-06-30
Research Infrastructure: National Geophysical Facility (NGF): Advancing Earth Science Capabilities through Innovation - EAR Scope
NSF — up to $26.6M
AmLight: The Next Frontier Towards Discovery in the Americas and Africa
NSF — up to $9M
CREST Phase II Center for Complex Materials Design
NSF — up to $7.5M
EPSCoR CREST Phase I: Center for Energy Technologies
NSF — up to $7.5M
EPSCoR CREST Phase I: Center for Post-Transcriptional Regulation
NSF — up to $7.5M
EPSCoR CREST Phase I: Center for Semiconductors Research
NSF — up to $7.5M