NSF
Today's data science systems, ranging from batch jobs to interactive interfaces, are surprisingly fragile. Data scientists typically use dozens of libraries, but a single bug in any one of them can destroy hours or even days of computation. This issue has been widely discussed in the data science community and the academic literature, yet no principled mechanism has been proposed to address it. That may puzzle database researchers, because existing databases implement checkpointing to periodically save changes in data for future recovery. Why haven't data science systems adopted checkpointing? What unique properties of data science systems make adoption difficult? This project will answer these questions and bring checkpointing to data science systems with zero modifications to existing libraries and programs. If successful, it can enable checkpointing, for the first time, in today's data science ecosystems, supporting recovery from crashes, execution “undos”, and suspension of cloud resources without losing data, among other uses.
This project first identifies a critical challenge: data science systems lack mechanisms for detecting changes in data, an important premise of checkpointing. Existing databases achieve this with centralized buffer pools. In contrast, data science systems intentionally omit centralized data spaces, allowing individual libraries to manage data using shared memory, GPUs, and remote machines for high performance. Changes to this library-managed data must be identified before they can be checkpointed. This project will achieve this identification by developing a nonintrusive state manager that acts like a conventional buffer pool without forcing data into a central location. The key idea is to build a mathematical map of library-managed data, including dependencies between data, using graphs. These graphs will enable new algorithms to identify changes, save them partially, and restore states correctly.
This project will develop an open-source system, Kishu, to benefit all data practitioners. This project will also pursue its adoption into the National Center for Supercomputing Applications. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
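The graph-based change detection described in the abstract can be illustrated with a small sketch. This is not Kishu's implementation or API; every function name here (`reachable_ids`, `components`, `incremental_checkpoint`, `restore`) is hypothetical, and the sketch assumes that state lives in a plain Python dictionary, that only lists, dicts, and sets are traversed, and that all values can be pickled. Variables that share a reachable object are grouped together, each group is fingerprinted, and only groups whose fingerprint changed are saved.

```python
import hashlib
import pickle


def reachable_ids(obj, seen=None):
    """Collect id()s of mutable containers reachable from obj (simplified: list/dict/set only)."""
    if seen is None:
        seen = set()
    if isinstance(obj, (list, dict, set)):
        if id(obj) in seen:
            return seen
        seen.add(id(obj))
        children = obj.values() if isinstance(obj, dict) else obj
        for child in children:
            reachable_ids(child, seen)
    return seen


def components(namespace):
    """Group variable names whose reachable objects overlap (connected components)."""
    groups = []  # list of (names, ids) pairs
    for name, value in namespace.items():
        names, ids = {name}, reachable_ids(value)
        remaining = []
        for g_names, g_ids in groups:
            if ids & g_ids:  # shared object: merge the two groups
                names |= g_names
                ids |= g_ids
            else:
                remaining.append((g_names, g_ids))
        remaining.append((names, ids))
        groups = remaining
    return [frozenset(names) for names, _ in groups]


def incremental_checkpoint(namespace, previous):
    """Return (index, saved); saved holds serialized bytes only for changed groups."""
    index, saved = {}, {}
    for names in components(namespace):
        # Pickling a group in one call preserves aliasing between its members.
        payload = pickle.dumps([(n, namespace[n]) for n in sorted(names)])
        digest = hashlib.sha256(payload).hexdigest()
        index[names] = digest
        if previous.get(names) != digest:
            saved[names] = payload
    return index, saved


def restore(saved):
    """Rebuild a namespace from saved group payloads."""
    namespace = {}
    for payload in saved.values():
        namespace.update(dict(pickle.loads(payload)))
    return namespace


# Usage: "a" and "b" share a list, so they form one group; "c" is independent.
shared = [1, 2]
ns = {"a": shared, "b": {"ref": shared}, "c": [10]}
index, first = incremental_checkpoint(ns, {})
ns["c"].append(11)  # only the group containing "c" changes
index2, second = incremental_checkpoint(ns, index)
restored = restore(first)
```

In this sketch the second checkpoint stores only the group containing `c`, and restoring the first checkpoint keeps `a` and `b["ref"]` pointing at the same list because the group was serialized together. A real system would also have to track data held in GPU memory, shared memory, or on remote machines, which this example does not attempt.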
Up to $381K
2030-05-31
Canada Foundation for Innovation — Innovation Fund
Canada Foundation for Innovation — up to $50M
Human Frontier Science Program 2025-2027
NSF — up to $21.2M
Entrepreneurial Fellowships to Enhance U.S. Competitiveness
NSF — up to $15.0M
MATERNAL, INFANT AND EARLY CHILDHOOD HOMEVISITING GRANT PROGRAM - PROJECT ADDRESS: 1500 JEFFERSON STREET SE, OLYMPIA, WA...
Department of Health and Human Services — up to $12.0M
MATERNAL, INFANT AND EARLY CHILDHOOD HOMEVISITING GRANT PROGRAM - PROJECT ABSTRACT PROJECT TITLE: MATERNAL, INFANT A...
Department of Health and Human Services — up to $10.9M
Genome Canada — Large-Scale Genomics Research
Genome Canada — up to $10M