NSF
Machine learning (ML) is increasingly used to combat cyberthreats. ML enables tools known as security classifiers to identify potential cyberthreats, e.g., to detect malicious software ("malware") or a network intrusion. Such classifiers are typically developed by collecting data on threats (e.g., malware samples) and benign entities (e.g., legitimate software), then building an ML model that learns patterns in the gathered training data that suggest the presence of threats. The model is then used in real systems to help identify new, undetected threats. However, for many security problems, good training data is hard to find. Threats may be relatively rare, or not shared by the people and companies that experience them. This leads to unbalanced datasets that contain mostly benign cases, which ML models often struggle with. Threats also change over time, as malicious software is constantly evolving, and models may quickly go out of date. This project will address these data challenges by developing methods for Generative Artificial Intelligence (GenAI) tools to create synthetic but useful data for network and application security tasks. Through this, the project will advance knowledge of both GenAI systems and more practical, effective defenses against cyberthreats. The project team will also create novel educational resources on AI and security topics, provide educational opportunities for pre-college teachers and students, and offer research opportunities for undergraduate students.

The project's goal is to boost and maintain the performance of a security task by addressing training data challenges. The work is structured around three research thrusts. The first thrust focuses on conducting an in-depth study to evaluate the effectiveness of existing GenAI schemes in addressing data challenges in ML-based network and application security tasks, highlighting cases where they fall short and where there are opportunities for improvement. The second thrust is to develop a novel GenAI framework called Aura, which will be purpose-built for the security domain to generate high-quality synthetic data, even when training data are limited, biased, or have noisy labels. The third thrust will extend Aura to support security operations after deployment by designing novel techniques to mitigate concept drift and by enabling continual learning against evolving security threats. Aura will also provide novel model interpretation schemes to attribute predictions to synthetic data in the training set. Beyond the contributions to the specific problem of generating useful synthetic data, the project will also provide a case study of the larger goal of leveraging AI-based techniques to support security and privacy, an area of high interest to the research community. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Up to $320K
2027-09-30
Research Infrastructure: National Geophysical Facility (NGF): Advancing Earth Science Capabilities through Innovation - EAR Scope
NSF — up to $26.6M
AmLight: The Next Frontier Towards Discovery in the Americas and Africa
NSF — up to $9M
CREST Phase II Center for Complex Materials Design
NSF — up to $7.5M
EPSCoR CREST Phase I: Center for Energy Technologies
NSF — up to $7.5M
EPSCoR CREST Phase I: Center for Post-Transcriptional Regulation
NSF — up to $7.5M
EPSCoR CREST Phase I: Center for Semiconductors Research
NSF — up to $7.5M