CAREER: Safety, Human-(mis)alignment, and Interaction-awareness in Control
Autonomous control systems in many domains, such as transportation, healthcare, assistance, and manufacturing, operate with humans in increasingly uncertain and interactive environments. Their design introduces unprecedented challenges in ensuring performance, safety, and human alignment due to a broad range of risks. First, existing techniques in risk quantification and control often assume an observable state, known system dynamics, and in-distribution data containing sufficient risk events. These assumptions frequently break down in real-world scenarios. Second, inferring and reducing human-perceived risk requires characterizing safe actions from sparse feedback, a complexity existing techniques often cannot handle. Finally, in nonstationary interactions, control methods that ignore the opponents’ adaptation can unintentionally encourage aggressive or exploitative behaviors by humans. In this proposal, we study how to provide long-term, lifelong assurances against these risks in the control of autonomous systems operating in uncertain and interactive environments. The outcomes of this research will be broadly disseminated through seminars and tutorials and integrated into the PI’s classes and K-12 outreach programs. To enrich the learning experience, we will create a virtual game that lets students explore key concepts by interacting with others. These efforts will train future researchers and engineers and facilitate the transfer of insights across domains.
The goal of this proposal is to develop techniques that mitigate various risks—visible or latent, actual or perceived—across single and repeated interactions. In Thrust 1, we quantify long-term risks despite latent variables and limited data. In Thrust 2, we develop efficient control techniques that provide long-term assurance against latent and human-perceived risks. One of our core innovations is probabilistic invariance, which flexibly handles unknowns, improves the tradeoff between assurance time horizon and computation, and allows many risk-inference methods to be leveraged in real-time control. In Thrust 3, we study designs that avoid adverse adaptation or facilitate desirable behaviors in nonstationary interactions. Thrusts 1 and 2 will offer complementary insights into how short-term observations can be used to infer long-term risk, and how controllers with short planning horizons can mitigate long-term risks. Thrusts 1-3 will provide a unified perspective across two pairs of topics that are often studied separately—control within a single interaction vs. adaptation across repeated interactions, and alignment vs. deliberate misalignment. This unified perspective will enable us to proactively harness uncertainty, adaptation, and (mis)alignment to design safer, higher-performing, and more human-aligned autonomous control systems.
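To make the notion of probabilistic invariance concrete, the sketch below shows one standard way such a quantity can be estimated: the probability that trajectories of a stochastic system remain inside a safe set over a finite horizon, approximated by Monte Carlo simulation. This is a generic illustration, not the proposal's method; the dynamics, safe set, and all function names here are hypothetical choices made for the example.

```python
import numpy as np

def estimate_invariance_probability(x0, step, safe, horizon, n_samples=2000, seed=0):
    """Monte Carlo estimate of P(trajectory stays in the safe set for `horizon` steps).

    x0      : initial state
    step    : function (state, rng) -> next state (stochastic dynamics)
    safe    : function (state) -> bool (membership in the safe set)
    horizon : number of steps over which invariance is assessed
    """
    rng = np.random.default_rng(seed)
    stayed_safe = 0
    for _ in range(n_samples):
        x = np.array(x0, dtype=float)
        ok = True
        for _ in range(horizon):
            x = step(x, rng)
            if not safe(x):
                ok = False
                break
        stayed_safe += ok
    return stayed_safe / n_samples

# Illustrative stochastic dynamics: a stable linear system with Gaussian noise.
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])

def step(x, rng):
    return A @ x + rng.normal(scale=0.05, size=2)

def safe(x):
    # Safe set: the unit box around the origin.
    return np.linalg.norm(x, ord=np.inf) <= 1.0

p_hat = estimate_invariance_probability(x0=[0.2, 0.1], step=step, safe=safe, horizon=50)
```

For this contracting system with small noise, the estimated invariance probability is close to one; shrinking the safe set or increasing the noise scale lowers it, illustrating the horizon-vs-assurance tradeoff the abstract refers to.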
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.