AAAI 2025 Tutorial on

User-Driven Capability Assessment of
Taskable AI Systems


Philadelphia, PA, USA

February 26, 2025
(8:30 AM - 12:30 PM)

Overview

This tutorial will cover approaches for assessing the safety and functionality of AI systems designed to learn continuously and complete tasks in a user's environment. AI systems are increasingly interacting with non-expert users, leading to growing calls from users, governments, and industry for better safety assessment and regulation. While recent AI developments have made it easier to develop taskable AI systems, ensuring their safety presents unique challenges. Unlike traditional engineered systems, where limited functionality yields safety, taskable AI systems are designed to adapt to user-specific tasks and environments, invalidating conventional approaches to safety assurance. These challenges cannot be addressed by simply extending existing verification and validation paradigms.

This tutorial is essential for researchers working on AI safety and will interest those in robotics, planning, and human-robot interaction. Participants will learn about foundational topics like active and passive action-model learning and assessment of black-box AI systems in stationary and adaptive settings. The tutorial covers novel capability discovery and assessment techniques, with applications in real-world scenarios like household robotics, digital assistants, autonomous vehicles, and healthcare systems. Specifically, we address three main areas: (i) why conventional verification and validation approaches fall short, (ii) specific requirements and promising research directions for formal assessment of AI systems, and (iii) solutions developed for restricted settings.

By exploring these challenges and research directions, the tutorial will give both junior and senior researchers the foundation to contribute to the continual assessment of AI systems that can learn, plan, and act. It emphasizes the interdisciplinary nature of AI assessment, which combines formal methods, human-AI interaction, and AI safety.

Please send tutorial-related queries to pulkitv@mit.edu and siddharths@asu.edu.

Organizers



Pulkit Verma
Postdoctoral Associate, Massachusetts Institute of Technology, USA
Pulkit Verma is a Postdoctoral Associate at the Interactive Robotics Group at the Massachusetts Institute of Technology, where he works with Julie Shah. His research focuses on the safe and reliable behavior of taskable AI agents. He investigates the minimal set of requirements in an AI system that would enable a user to assess and understand the limits of its safe operability. He received his Ph.D. in Computer Science from the School of Computing and Augmented Intelligence, Arizona State University, where he worked with Siddharth Srivastava. Before that, he completed his M.Tech. in Computer Science and Engineering at the Indian Institute of Technology Guwahati with Pradip K. Das. He was awarded the Graduate College Completion Fellowship at ASU in 2023 and the Post Graduation Scholarship from the Government of India in 2013 and 2014, and he received the Best Demo Award at the International Conference on Autonomous Agents and Multiagent Systems (AAMAS) in 2022.


Siddharth Srivastava
Associate Professor, Arizona State University, USA
Siddharth Srivastava is an Associate Professor in the School of Computing and Augmented Intelligence at Arizona State University. Srivastava was a Staff Scientist at the United Technologies Research Center in Berkeley before joining ASU. Prior to that, he was a postdoctoral researcher working with Stuart Russell and Pieter Abbeel at the University of California, Berkeley. Srivastava received his PhD in Computer Science from the University of Massachusetts Amherst, working with Shlomo Zilberstein and Neil Immerman, and a (4+1) MS in Mathematics from the Indian Institute of Technology (IIT), Kanpur. Srivastava is a recipient of the NSF CAREER award, the Top 5% Faculty Award from the Fulton Schools of Engineering at ASU, a Best Paper Award at the International Conference on Automated Planning and Scheduling (ICAPS), an Outstanding Dissertation award from the Department of Computer Science at UMass Amherst, a Best Final Year Thesis award from the Department of Mathematics at IIT Kanpur, and the National Board of Higher Mathematics Scholarship in India. He served as conference Co-Chair for ICAPS 2019. He currently serves as Chair of the ICAPS Awards Committee and as Associate Editor for the Journal of AI Research.