GenPlan 2023: Seventh Workshop on Generalization in Planning


Room 238-239, New Orleans Ernest N. Morial Convention Center

New Orleans, USA

December 16, 2023

Overview

Humans are good at solving sequential decision-making problems, generalizing from a few examples, and learning skills that can be transferred to solve unseen problems. Achieving these capabilities, however, remains a long-standing open problem in AI.

This workshop will feature a synthesis of the best ideas on the topic from multiple highly active research communities. On the one hand, recent advances in deep reinforcement learning have led to data-driven methods that provide strong short-horizon reasoning and planning, with open problems regarding sample efficiency, generalizability, and transferability. On the other hand, advances and open questions in the AI planning community have been complementary, featuring robust analytical methods that provide sample-efficient generalizability and transferability for long-horizon sequential decision making, with open problems in short-horizon control and in the design and modeling of representations.

We welcome submissions addressing the problem of generalizable and transferable learning in all forms of sequential decision making. This event is the seventh edition of the recurring GenPlan workshop series.

Please feel free to send workshop-related queries to genplan23.neurips@gmail.com.

Call for Papers

The workshop will focus on research related to all aspects of learning, generalization, and transfer in sequential decision-making (SDM). This topic features technical problems that are of interest not only in multiple sub-fields of AI research (including reinforcement learning, automated planning, and learning for knowledge representation) but also in other fields of research, including formal methods and program synthesis. We will welcome submissions that address formal as well as empirical issues on topics such as:


Submission Guidelines

Submissions can describe either work in progress or mature work that would be of interest to researchers working on generalization in planning. We also welcome "highlights" papers that summarize results from multiple recent papers by the same authors. Preference will be given to new work (including highlights) and work in progress rather than exact resubmissions of previously published work.

Submissions of papers being reviewed at other venues (AAAI, ICRA, ICLR, AAMAS, CVPR, etc.) are welcome since GenPlan is a non-archival venue and we will not require a transfer of copyright. If such papers are currently under blind review, please anonymize the submission.

Two types of papers can be submitted:

Submissions may include as many pages of appendices (after the references) as needed, but reviewers are not required to read them. Submissions should use the NeurIPS paper format and adhere to the NeurIPS Code of Conduct and the NeurIPS policy on the use of LLMs in writing. Papers can be submitted via OpenReview at https://openreview.net/group?id=NeurIPS.cc/2023/Workshop/GenPlan.

Important Dates

Announcement and call for submissions: July 25, 2023
Paper submission deadline: October 2, 2023 (11:59 PM UTC-12)
Author notification: October 27, 2023
Accepted papers available online: December 3, 2023
Workshop: December 16, 2023

Invited Talks

Giuseppe De Giacomo
University of Oxford


Logic, Automata, and Games in Linear Temporal Logics on Finite Traces
Temporal logics on finite traces (LTLf, LDLf, PPLTL, etc.) are increasingly attracting the interest of the scientific community. These logics are variants of the temporal logics used for specifying dynamic properties in Formal Methods, but they focus on finite though unbounded traces. They are becoming popular in several areas, including AI planning for expressing temporally extended goals, reactive synthesis for automatically synthesizing interactive programs, reinforcement learning for expressing non-Markovian rewards and dynamics, and Business Process Modeling for declaratively specifying processes. These logics can express general safety and guarantee (reachability) properties, though, unlike more traditional temporal logics on infinite traces, they cannot talk about behaviors in the limit. The key characteristic of these logics is that they can be reduced to equivalent regular automata, which, once determinized, can in turn be reduced to two-player games on graphs. This gives them unprecedented computational effectiveness and scalability. In this talk, we will look at these logics, their corresponding automata, and the resulting games, and show their relevance in service composition. In particular, we show how they can be used to automatically synthesize orchestrators for advanced forms of goal-oriented synthesis.
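To make the logic-to-automata-to-games pipeline concrete, here is a standard textbook-style illustration (ours, not taken from the talk): the finite-trace semantics of the two basic temporal operators, and a small "response" property whose equivalent DFA has just two states.

% LTLf semantics over a finite trace \pi = s_0 s_1 \cdots s_n, position 0 <= i <= n:
\begin{align*}
\pi, i \models \Diamond\varphi &\iff \exists j.\; i \le j \le n \text{ and } \pi, j \models \varphi \\
\pi, i \models \Box\varphi     &\iff \forall j.\; i \le j \le n \text{ implies } \pi, j \models \varphi
\end{align*}
% The "response" property: every request is granted within the trace.
\[ \varphi_{\mathrm{resp}} \;=\; \Box(\mathit{request} \rightarrow \Diamond\,\mathit{grant}) \]
% Equivalent DFA: state q_0 (no pending request, accepting) and q_1 (pending request);
% reading "request without grant" moves q_0 to q_1, reading "grant" moves q_1 back to
% q_0, and a finite trace satisfies the formula iff its run ends in q_0. Synthesis
% then amounts to solving a reachability game on this arena.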


Bio: Giuseppe De Giacomo is a Professor of Computer Science at the Department of Computer Science of the University of Oxford. He was previously a Professor at the Department of Computer, Control, and Management Engineering of the University of Roma "La Sapienza". His research activity concerns theoretical, methodological, and practical aspects of different areas of AI and CS, most prominently Knowledge Representation, Reasoning about Actions, Generalized Planning, Autonomous Agents, Reactive Synthesis and Verification, Service Composition, Business Process Modeling, and Data Management and Integration. He is an AAAI Fellow, ACM Fellow, and EurAI Fellow. He received an ERC Advanced Grant for the project WhiteMech: White-box Self Programming Mechanisms (2019-2024). He was the Program Chair of ECAI 2020 and KR 2014. He is on the Board of EurAI and chairs the steering committee of the new EurAI yearly summer school, ESSAI.



Hector Geffner
RWTH Aachen University

Learning General Policies and Sketches
Recent progress in deep learning and deep reinforcement learning (DRL) has been truly remarkable, yet two problems remain: structural policy generalization and policy reuse. The first is about obtaining policies that generalize in a reliable way; the second is about obtaining policies that can be reused and combined in a flexible, goal-oriented manner. The two problems are studied in DRL, but only experimentally, and the results are not clear and crisp. In our work, we have tackled these problems in a slightly different way, developing languages for expressing general policies and methods for learning them using combinatorial and DRL approaches. We have also developed languages for expressing and learning general subgoal structures (sketches) and hierarchical policies, which are based on the notion of planning width. In the talk, I'll present the main ideas and results.

This is joint work with Blai Bonet, Simon Ståhlberg, Dominik Drexler, and other members of the RLeap team.
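For readers unfamiliar with general policies, the following toy sketch (our illustration in the spirit of this line of work, not the speaker's code; the state encoding and helper names are ours) shows the key idea: a policy stated as rules over a few domain features applies unchanged to every instance of the domain, regardless of size.

# General policy for a Blocksworld-style goal "clear(x)": nothing on block x.
# Features: H = "holding a block", n = "number of blocks above x".
# Rules: if not H and n > 0, pick the topmost block above x; if H, put it on
# the table. These two rules clear x in ANY instance, with any number of blocks.

def features(state, x):
    """Abstract a concrete state into the features (H, n)."""
    n, y = 0, state["on_top_of"].get(x)
    while y is not None:
        n, y = n + 1, state["on_top_of"].get(y)
    return state["holding"] is not None, n

def top_above(state, x):
    """Topmost block in the pile sitting on x (None if x is already clear)."""
    y, top = state["on_top_of"].get(x), None
    while y is not None:
        top, y = y, state["on_top_of"].get(y)
    return top

def general_policy(state, x):
    """Same feature-based rules for every instance; None means goal reached."""
    holding, n = features(state, x)
    if not holding and n > 0:
        return ("pick", top_above(state, x))
    if holding:
        return ("put-on-table", state["holding"])
    return None

def step(state, action):
    """Minimal simulator for the two actions used by the policy."""
    kind, b = action
    if kind == "pick":
        for under, above in state["on_top_of"].items():
            if above == b:
                state["on_top_of"][under] = None  # lift b off its support
        state["holding"] = b
    else:  # put-on-table
        state["holding"] = None
    return state

# Tower b-on-a-on-x; the same policy would clear a 100-block tower.
state = {"holding": None, "on_top_of": {"x": "a", "a": "b", "b": None}}
while (action := general_policy(state, "x")) is not None:
    state = step(state, action)
print(features(state, "x"))  # (False, 0): x is clear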


Bio: Hector Geffner is an Alexander von Humboldt Professor at RWTH Aachen University, Germany, and a Guest Wallenberg Professor at Linköping University, Sweden. Before joining RWTH, he was an ICREA Research Professor at the Universitat Pompeu Fabra in Barcelona, Spain. Hector obtained a Ph.D. in Computer Science at UCLA and then worked at the IBM T.J. Watson Research Center in New York and at the Universidad Simon Bolivar in Caracas. Distinctions for his work and the work of his team include the 1990 ACM Dissertation Award and three ICAPS Influential Paper Awards. Hector leads a project on representation learning for acting and planning (RLeap), funded by an ERC grant, and is currently hiring.



Roberta Raileanu
FAIR, Meta AI


In-Context Learning of Sequential Decision-Making Tasks
Training autonomous agents that can learn new tasks from only a handful of demonstrations is a long-standing problem in machine learning. Recently, transformers have been shown to learn new language or vision tasks without any weight updates from only a few examples, also referred to as in-context learning. However, the sequential decision-making setting poses additional challenges and has a lower tolerance for errors, since the environment's stochasticity or the agent's actions can lead to unseen, and sometimes unrecoverable, states. In this talk, I will show that naively applying transformers to this setting does not enable in-context learning of new tasks. I will then show how different design choices, such as model size, data diversity, environment stochasticity, and trajectory burstiness, affect in-context learning of sequential decision-making tasks. Finally, I will show that by training on large, diverse offline datasets, transformers are able to learn entirely new tasks with unseen states, actions, dynamics, and rewards, using only a handful of demonstrations and no weight updates. I will end my talk with a discussion of the limitations of offline learning approaches in sequential decision-making and some directions for future work.
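As a rough sketch of the in-context setup (our simplification with hypothetical names, not the speaker's architecture or training code): demonstration (state, action) pairs are serialized into a prompt, the current state is appended, and a causal transformer predicts the action, so a new task is specified entirely through the prompt with no weight updates.

import torch
import torch.nn as nn

class InContextPolicy(nn.Module):
    """Causal transformer over a prompt of demo (state, action) tokens plus the
    current state; action logits are read off the last token."""
    def __init__(self, state_dim, num_actions, d_model=64, n_heads=4,
                 n_layers=2, max_len=128):
        super().__init__()
        self.state_proj = nn.Linear(state_dim, d_model)
        self.action_emb = nn.Embedding(num_actions, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=4 * d_model,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, num_actions)

    def forward(self, demo_states, demo_actions, query_state):
        # demo_states: (B, K, state_dim), demo_actions: (B, K), query_state: (B, state_dim)
        B, K, _ = demo_states.shape
        s, a = self.state_proj(demo_states), self.action_emb(demo_actions)
        tokens = torch.stack([s, a], dim=2).reshape(B, 2 * K, -1)  # s0 a0 s1 a1 ...
        tokens = torch.cat([tokens, self.state_proj(query_state).unsqueeze(1)], dim=1)
        T = tokens.size(1)
        tokens = tokens + self.pos_emb(torch.arange(T, device=tokens.device))
        causal = torch.triu(torch.full((T, T), float("-inf"),
                                       device=tokens.device), 1)
        h = self.encoder(tokens, mask=causal)
        return self.head(h[:, -1])  # (B, num_actions)

# Five demo steps of a new task sit in the prompt; no gradient update is needed.
model = InContextPolicy(state_dim=4, num_actions=3)
logits = model(torch.randn(2, 5, 4), torch.randint(0, 3, (2, 5)), torch.randn(2, 4))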


Bio: Roberta Raileanu is a Research Scientist at Meta and an Honorary Lecturer at UCL. Her research focuses on designing machine learning algorithms that can make robust sequential decisions in complex environments. In particular, Roberta works in the area of deep reinforcement learning, with a focus on generalization, adaptation, continual learning, and open-ended learning. Roberta holds a PhD in Computer Science from NYU and a B.A. in Astrophysics from Princeton University.


Peter Stone
The University of Texas at Austin and Sony AI

Causal Dynamics Learning for Task-Independent State Abstraction
Learning accurate dynamics models is an important goal for model-based reinforcement learning (MBRL), but most MBRL methods learn a dense dynamics model that is vulnerable to spurious correlations and therefore generalizes poorly to unseen states. In this work, we introduce Causal Dynamics Learning for Task-Independent State Abstraction (CDL), which first learns a theoretically grounded causal dynamics model that removes unnecessary dependencies between state variables and the action, and thus generalizes well to unseen states. A state abstraction can then be derived from the learned dynamics, which not only improves sample efficiency but also applies to a wider range of tasks than existing state abstraction methods. Evaluated on two simulated environments and downstream tasks, both the dynamics model and the policies learned by the proposed method generalize well to unseen states, and the derived state abstraction improves sample efficiency compared to learning without it.
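The gist can be caricatured in a few lines. CDL itself identifies dependencies via conditional-independence tests; the sketch below (ours, not the authors' code) swaps in a cruder device, learned sigmoid gates with a sparsity penalty, just to show how pruning spurious parents of each state variable yields a sparse dynamics model and an implied abstraction.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedDynamics(nn.Module):
    """Predict each next-state variable from a gated subset of [state, action].
    Gates driven to zero by the sparsity penalty mark absent causal edges."""
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.gate_logits = nn.Parameter(torch.zeros(state_dim, state_dim + action_dim))
        self.nets = nn.ModuleList(
            nn.Sequential(nn.Linear(state_dim + action_dim, hidden),
                          nn.ReLU(), nn.Linear(hidden, 1))
            for _ in range(state_dim))

    def forward(self, s, a):
        x = torch.cat([s, a], dim=-1)
        gates = torch.sigmoid(self.gate_logits)  # (state_dim, state_dim + action_dim)
        preds = [net(x * gates[i]) for i, net in enumerate(self.nets)]
        return torch.cat(preds, dim=-1)          # predicted next state

    def sparsity(self):
        return torch.sigmoid(self.gate_logits).mean()

# One training step on placeholder data. After training, the surviving gates
# say which parents matter per variable; variables irrelevant to the task can
# then be dropped to form a state abstraction.
model = MaskedDynamics(state_dim=6, action_dim=2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
s, a, s_next = torch.randn(32, 6), torch.randn(32, 2), torch.randn(32, 6)
loss = F.mse_loss(model(s, a), s_next) + 1e-3 * model.sparsity()
opt.zero_grad(); loss.backward(); opt.step()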


Bio: Peter Stone holds the Truchard Foundation Chair in Computer Science at the University of Texas at Austin. He is Associate Chair of the Computer Science Department, as well as Director of Texas Robotics. In 2013 he was awarded the University of Texas System Regents' Outstanding Teaching Award and in 2014 he was inducted into the UT Austin Academy of Distinguished Teachers, earning him the title of University Distinguished Teaching Professor. Professor Stone's research interests in Artificial Intelligence include machine learning (especially reinforcement learning), multiagent systems, and robotics. Professor Stone received his Ph.D. in Computer Science in 1998 from Carnegie Mellon University. From 1999 to 2002 he was a Senior Technical Staff Member in the Artificial Intelligence Principles Research Department at AT&T Labs - Research. He is an Alfred P. Sloan Research Fellow, Guggenheim Fellow, AAAI Fellow, IEEE Fellow, AAAS Fellow, ACM Fellow, Fulbright Scholar, and 2004 ONR Young Investigator. In 2007 he received the prestigious IJCAI Computers and Thought Award, given biennially to the top AI researcher under the age of 35, and in 2016 he was awarded the ACM/SIGAI Autonomous Agents Research Award. Professor Stone co-founded Cogitai, Inc., a startup company focused on continual learning, in 2015, and currently serves as Executive Director of Sony AI America.



Amy Zhang
The University of Texas at Austin and Meta AI


Value-Based Abstractions for Planning
As reinforcement learning continues to advance, integrating efficient planning algorithms with powerful representation learning becomes crucial for solving long-horizon tasks. We address key challenges in planning, reward learning, and representation learning through the objective of learning value-based abstractions. We explore this idea via goal-conditioned reinforcement learning, which lets us learn generalizable value functions, and via action-free pre-training. By leveraging self-supervised reinforcement learning and efficient planning algorithms, these approaches collectively contribute to the advancement of decision-making systems capable of learning and adapting to diverse tasks in real-world environments.
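As a generic sketch of the goal-conditioned ingredient (ours; it is textbook goal-conditioned Q-learning, not the specific methods in the talk): conditioning a single value function on a goal lets one network generalize across tasks, and a sparse "goal reached" reward supplies the training signal.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GoalConditionedQ(nn.Module):
    """Q(s, g) -> action values: one network shared across all goals."""
    def __init__(self, state_dim, goal_dim, num_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim + goal_dim, hidden),
                                 nn.ReLU(), nn.Linear(hidden, num_actions))
    def forward(self, s, g):
        return self.net(torch.cat([s, g], dim=-1))

# One TD(0) step on placeholder data. (Hindsight relabeling would additionally
# replace g with goals actually reached along the trajectory.)
q = GoalConditionedQ(state_dim=4, goal_dim=4, num_actions=3)
opt = torch.optim.Adam(q.parameters(), lr=1e-3)
s, s_next, g = torch.randn(32, 4), torch.randn(32, 4), torch.randn(32, 4)
a = torch.randint(0, 3, (32,))
r = ((s_next - g).norm(dim=-1) < 0.5).float()  # sparse reward: 1 iff goal reached
done, gamma = r, 0.99                           # episodes end at the goal
with torch.no_grad():
    target = r + gamma * (1 - done) * q(s_next, g).max(dim=-1).values
loss = F.mse_loss(q(s, g).gather(1, a.unsqueeze(1)).squeeze(1), target)
opt.zero_grad(); loss.backward(); opt.step()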


Bio: Amy Zhang is an assistant professor at UT Austin in the Chandra Family Department of Electrical and Computer Engineering and visiting faculty at Meta AI - FAIR. Her work focuses on improving generalization in reinforcement learning through bridging theory and practice in learning and utilizing structure in real world problems. Previously, she was a postdoctoral fellow at UC Berkeley and obtained her PhD from McGill University and the Mila Institute.

Program

Video recordings are now available on the NeurIPS workshop page.
08:15 AM Workshop Opening
08:20 AM Invited Talk: Peter Stone
Causal Dynamics Learning for Task-Independent State Abstraction
08:55 AM Session Chair: Pulkit Verma
Contributed Talks
09:25 AM Invited Talk: Amy Zhang
Value-Based Abstractions for Planning
10:00 AM Coffee Break
10:30 AM Invited Talk: Hector Geffner
Learning General Policies and Sketches
11:05 AM Session Chair: Anders Jonsson
Paper Talks
11:55 AM Lunch Break
01:30 PM Invited Talk: Giuseppe De Giacomo
Logic, Automata, and Games in Linear Temporal Logics on Finite Traces
02:05 PM Session Chair: Shlomo Zilberstein
Contributed Talks
02:25 PM Poster Session 1: List of Posters
03:00 PM Coffee Break
03:30 PM Poster Session 2: List of Posters
04:00 PM Invited Talk: Roberta Raileanu
In-Context Learning of Sequential Decision-Making Tasks
04:35 PM Session Chair: Matthew Taylor
Contributed Talks
04:55 PM Panel Discussion
Moderator: Blai Bonet
Panelists: Giuseppe De Giacomo, Hector Geffner, Anders Jonsson, Matthew Taylor, and Shlomo Zilberstein.
05:25 PM Closing


Accepted Papers

Committees

Organizing Committee


Pulkit Verma
Arizona State University, USA


Siddharth Srivastava
Arizona State University, USA


Aviv Tamar
Technion - Israel Institute of Technology, Israel


Felipe Trevizan
Australian National University, Australia




Advisory Board



Program Committee