NSF
Whole Genome Sequencing (WGS) is a powerful tool for uncovering genetic variants linked to diseases, understanding evolutionary processes, and tracing population histories. Given that over a million human genomes have been sequenced to date, the sheer volume of data requires advanced computational solutions for efficient analysis. Our research uses the Genotype Representation Graph (GRG) to significantly improve the performance of WGS data analysis. By optimizing this data structure and using modern parallel computing architectures and techniques, we aim to reduce the time required for complex genomic analyses and enable the fast and efficient processing of large datasets such as those housed in the UK Biobank. This project aims to develop tools and infrastructure that will enable researchers to advance our understanding of human genetics and improve the accuracy of population genetic studies, which will ultimately contribute to better health outcomes and greater scientific knowledge. In addition, this approach to representing large, complex datasets and manipulating them effectively will serve as a proxy for modern computing approaches, to guide the design of advanced parallel computing architectures and techniques.
This research focuses on improving the efficiency of WGS data analysis through two main objectives. The first objective is to optimize the GRG for modern parallel computing architectures, particularly GPUs, to handle the dynamic nature of genomic computations. Additionally, a matrix abstraction of the GRG will be developed to enable efficient computation on architectures beyond GPUs by using sparse matrices for near-linear scaling on distributed memory machines. The second objective is to use the improved GRG to perform accurate Ancestral Recombination Graph (ARG) inference, a critical step in population genetics. By implementing and testing these approaches on the large-scale UK Biobank data, the scalability and accuracy of the novel methodologies will be demonstrated. This interdisciplinary project will combine high-performance computing advances with innovative data structures to answer key questions in population genetics and provide insights for future high-performance systems in the post-Moore’s Law era.
This award is co-funded by the Directorate for Computer and Information Science and Engineering and by the Directorate for Biological Sciences. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Up to $500K
2027-03-31
Category I: CloudBank 2: Accelerating Science and Engineering Research in the Commercial Cloud
NSF — up to $24M
Category I: Nexus: A Confluence of High-Performance AI and Scientific Computing with Seamless Scaling from Local to National Resources
NSF — up to $24.0M
Research Infrastructure: Mid-scale RI-1 (MI:IP): Dual-Doppler 3D Mobile Ka-band Rapid-Scanning Volume Imaging Radar for Earth System Science
NSF — up to $20.0M
A Scientific Ocean Drilling Coordinating Office for the US Community
NSF — up to $17.6M
Category I: AMA27: Sustainable Cyber-infrastructure for Expanding Participation
NSF — up to $13.8M
Graduate Research Fellowship Program (GRFP)
NSF — up to $9.0M