NSF
This project focuses on designing new methods to facilitate the evaluation of artificial intelligence (AI) agents. In an era where AI agents are rapidly proliferating, with increasingly capable new systems, it is crucial to thoroughly understand their performance capabilities and limitations—a prerequisite for both safe deployment and continuous improvement. Traditional evaluation methods require running AI agents in live environments to collect performance data, but this approach can be resource-intensive and pose significant safety risks. This project addresses these challenges by developing innovative evaluation methods that dramatically reduce the need for expensive and potentially hazardous live testing, thereby accelerating the safe deployment of current AI systems and enabling the development of next-generation AI agents. Additionally, the project will train future AI researchers, helping to expand access to AI research opportunities across the United States.

This project pioneers three research thrusts to fulfill different evaluation needs. First, the project delivers methods to efficiently evaluate an AI agent in a holistic manner with a scalar performance metric by reimagining Monte Carlo methods. The key innovation involves repurposing offline data to inform the online sampling process of Monte Carlo methods, thereby reducing the sample size required for accurate performance estimation. Second, the project develops methods to efficiently evaluate an AI agent in a fine-grained manner across different initial conditions by reinvigorating value function learning. The approach identifies the statistical metrics most indicative of, and influential on, the performance of the learned value function, then optimizes those metrics during online data collection to maximize evaluation efficiency.
Third, the project delivers methods to efficiently evaluate an AI agent according to human feedback by developing transformative techniques that substantially improve reward model quality while minimizing human annotation requirements. Through these research activities, the project aims to significantly enhance current methodologies for evaluating AI agents, ultimately accelerating the development pipeline of AI systems. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
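The first thrust's core idea—using offline data to reduce the number of live rollouts a Monte Carlo estimate needs—can be illustrated with a classic control-variate estimator. The sketch below is illustrative only, not the project's actual method: the toy environment, the `rollout` function, and the `offline_baseline` model (assumed to be pre-fit on offline logs) are all hypothetical.

```python
import random

random.seed(0)

# Toy environment (an assumption for illustration): the agent's return
# from start state s is 2*s plus small noise. Live rollouts are the
# "expensive" operation we want fewer of.
def rollout(s):
    """Run one online episode from start state s and return its reward."""
    return 2.0 * s + random.gauss(0.0, 0.1)

# Baseline return model assumed to have been fit on offline logs;
# it is deliberately a little off (1.9 instead of 2.0) to show the
# estimator tolerates an imperfect baseline.
def offline_baseline(s):
    return 1.9 * s

start_states = [random.uniform(0.0, 1.0) for _ in range(200)]

# Plain Monte Carlo: average the online returns directly.
mc_estimate = sum(rollout(s) for s in start_states) / len(start_states)

# Control-variate Monte Carlo: subtract the offline baseline from each
# sample, then add the baseline's average back. The residuals vary far
# less than the raw returns, so the same rollout budget yields a
# lower-variance estimate of the agent's mean performance.
residuals = [rollout(s) - offline_baseline(s) for s in start_states]
baseline_mean = sum(offline_baseline(s) for s in start_states) / len(start_states)
cv_estimate = sum(residuals) / len(residuals) + baseline_mean
```

Because the baseline's average is added back over the same start states, the control-variate estimate stays unbiased even when the offline model is imperfect; only the variance depends on how well the baseline tracks true returns.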
Up to $507K
2030-08-31
Canada Foundation for Innovation — Innovation Fund
Canada Foundation for Innovation — up to $50M
Human Frontier Science Program 2025-2027
NSF — up to $21.2M
Entrepreneurial Fellowships to Enhance U.S. Competitiveness
NSF — up to $15.0M
Maternal, Infant and Early Childhood Home Visiting Grant Program - Project Address: 1500 Jefferson Street SE, Olympia, WA...
Department of Health and Human Services — up to $12.0M
Maternal, Infant and Early Childhood Home Visiting Grant Program - Project Abstract Project Title: Maternal, Infant a...
Department of Health and Human Services — up to $10.9M
Genome Canada — Large-Scale Genomics Research
Genome Canada — up to $10M