Humans are not limited to a fixed set of innate or preprogrammed tasks. We learn quickly through language and other forms of natural interaction, and we improve our performance and teach others what we have learned. Understanding the mechanisms that underlie the acquisition of new tasks through natural interaction is an ongoing challenge. Advances in artificial intelligence, cognitive science, and robotics are leading us to future systems with human-like capabilities. A huge gap exists, however, between the highly specialized niche capabilities of current machine learning systems and the generality, flexibility, and in situ robustness of human instruction and learning. Drawing on expertise from multiple disciplines, this Strüngmann Forum Report explores how humans and artificial agents can quickly learn completely new tasks through natural interactions with each other.
The contributors consider functional knowledge requirements, the ontology of interactive task learning, and the representation of task knowledge at multiple levels of abstraction. They explore natural forms of interactions among humans as well as the use of interaction to teach robots and software agents new tasks in complex, dynamic environments. They discuss research challenges and opportunities, including ethical considerations, and make proposals to further understanding of interactive task learning and create new capabilities in assistive robotics, healthcare, education, training, and gaming.
Contributors Tony Belpaeme, Katrien Beuls, Maya Cakmak, Joyce Y. Chai, Franklin Chang, Ropafadzo Denga, Marc Destefano, Mark d'Inverno, Kenneth D. Forbus, Simon Garrod, Kevin A. Gluck, Wayne D. Gray, James Kirk, Kenneth R. Koedinger, Parisa Kordjamshidi, John E. Laird, Christian Lebiere, Stephen C. Levinson, Elena Lieven, John K. Lindstedt, Aaron Mininger, Tom Mitchell, Shiwali Mohan, Ana Paiva, Katerina Pastra, Peter Pirolli, Roussell Rahman, Charles Rich, Katharina J. Rohlfing, Paul S. Rosenbloom, Nele Russwinkel, Dario D. Salvucci, Matthew-Donald D. Sangster, Matthias Scheutz, Julie A. Shah, Candace L. Sidner, Catherine Sibert, Michael Spranger, Luc Steels, Suzanne Stevenson, Terrence C. Stewart, Arthur Still, Andrea Stocco, Niels Taatgen, Andrea L. Thomaz, J. Gregory Trafton, Han L. J. van der Maas, Paul Van Eecke, Kurt VanLehn, Anna-Lisa Vollmer, Janet Wiles, Robert E. Wray III, Matthew Yee-King
Kevin A. Gluck and John E. Laird, Chairs
Program Advisory Committee
Kenneth M. Ford, Kevin A. Gluck, John E. Laird, Elena Lieven, Julia R. Lupp, Luc Steels, and Niels Taatgen
Goals of the Forum
Understanding the acquisition of new tasks through natural interaction is a fundamental unsolved problem. It is an inherently multidisciplinary challenge, and progress has been impeded by the fractionated state of the relevant scientific and technical disciplines. This Forum will be a catalyzing event: the insights gained will provide a foundational reference and organizing framework for global research and development in interactive task learning.
Context
The stability of social systems depends critically on realizing sustainable methods of “collaboration,” yet how and by what means collaboration is achieved is not clearly understood; neither are the conditions or processes that lead to its breakdown or failure. [For context, collaboration is understood here as cooperation between agents toward mutually constructed goals.] Part of the reason for our lack of understanding is that collaboration is, by nature, a highly multidisciplinary phenomenon, and effective research into its complexities has been difficult to achieve across the broad range of scientific and technical disciplines involved.
The need for a fundamental understanding of collaboration, however, has become increasingly pressing. Not only does humankind demand answers as it attempts to address critical challenges at multiple scales (e.g., climate change, migration, enhanced automation, social and economic inequality), but ever-increasing technological and economic means of interconnecting people and societies are disrupting long-established, familiar patterns of how we interact. Ongoing radical technological changes have the potential to reshape collaboration in ways that are currently hard to predict or influence (e.g., by altering configurations of interaction, information creation, and modes of communication). On one hand, such changes could disrupt hitherto stable forms of collaboration by affecting critical communication channels and traditional roles, as can be observed in the rapidly changing patterns of governance, commerce, and social interaction. On the other, technology could lead to the emergence of novel, successful forms of collaboration that deviate from traditional “hierarchical” architectures. Evidence of this can be seen in areas as diverse as highly automated manufacturing plants, the open science movement, collaborative software repositories, user-centered services, and sharing-economy-based modes of organization. Without a fundamental understanding of the mechanisms, processes, and boundary conditions of collaboration, it is not possible to evaluate or predict which of these possible scenarios are sustainable or even plausible.
Remedying this knowledge gap requires a comprehensive research program. At its core, a theoretical framework must link pertinent aspects of collaboration across spatiotemporal scales and contexts. This is a tall order, yet given current pressures on human–human, human–machine, and future machine–machine collaboration, we believe a first survey must be attempted.
Background and Motivation for the Forum
Humans are not limited to a fixed set of innate or preprogrammed tasks. We quickly learn novel tasks through language and other forms of natural communication, and once we have learned them, we learn to perform them better. We learn to play new games in just a few minutes; we learn how to use new devices such as smartphones, computers, and industrial machinery; and we can learn how to help a disabled family member with everyday tasks, adapting to their needs over time. Advances in artificial intelligence, cognitive science, and robotics are leading us to future systems with impressive cognitive and physical capabilities. However, because of the dynamic, nonstationary environments in which such systems will need to operate, it is impossible to anticipate and preprogram all of the knowledge required for these systems to meet their functional requirements. We want systems that are partners or teammates, not merely tools.
How will these future systems learn the unanticipated and evolving complex tasks we want them to perform?
How can this endless variety of new requirements be learned quickly through natural interaction with people?
Currently, only isolated research is being conducted on this problem. Most related work sidesteps the reality that we need more basic research on the fundamental nature of interactive task learning. Our objective is to catalyze the global research community to pursue the science and technology necessary for interactive task learning: to understand how humans learn new tasks from each other, and to develop intelligent artificial agents that likewise learn and teach new tasks through natural interaction with humans. This is an extremely ambitious problem to tackle, but recent progress in many related fields suggests that now is the time to make a cooperative, coordinated push toward interactive task learning.
Understanding the acquisition of new tasks through natural interaction is a fundamental, unsolved problem. Pursuing it will increase our understanding of how both humans and artificial agents convert an externally communicated description or demonstration into efficient, executable procedural knowledge that is incrementally and dynamically integrated with existing knowledge. This requires both the assimilation of new knowledge into existing knowledge and the accommodation of existing knowledge to the new. Extending our understanding of the computational processes involved when humans learn new tasks will be a major advance for cognitive modeling; it will also provide insight into how teaching can be structured to make task learning easier and faster for people, suggesting improved methods for training and education. It will further increase our understanding of how a broad range of capabilities we associate with cognition work together, including extracting task-relevant meaning from perception, task-relevant action, grounded language processing, dialog and interaction management, integrated knowledge-rich relational reasoning, problem solving, learning, and metacognition. This emphasis on integration contrasts with the general trend in many relevant fields toward increasing fragmentation and a focus on narrow problems.
The time is ripe for a deep exploration of how humans and artificial agents can quickly learn completely new tasks through natural interaction. This research problem must draw from many disciplines that are often isolated from one another by different, although related, goals and very different methodologies. The Forum offers a unique opportunity to establish a foundational reference and organizing framework for global research and development in interactive task learning.
Group 1: Interaction for Task Instruction and Learning
Central Question: What are the most effective and natural methods for humans, robots, and AI agents to interact in support of instruction and learning?
Here the focus is on interactive task learning, where a person, robot, or AI agent learns a task from another entity, typically a human teacher, through natural interaction. The interaction can include the teacher describing the goals and steps of the task in natural language, possibly leading the learner through the task, or the teacher demonstrating the task while the student acquires new task knowledge and skill by observing what the teacher does. It can also include sketching, gesturing, or other visual aids and nonverbal communication, all of which must be interpreted in context and with regard to task goals. These modes can be combined, as when the instructor provides a demonstration accompanied by natural language commentary. Interactive learning can also involve the student asking questions or requesting clarifications or additional examples in order to refine and improve understanding.
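To make these interaction modes concrete, here is a minimal illustrative sketch of a learner that handles both verbal instruction and demonstration, and asks a clarifying question when its knowledge is incomplete. It is purely a sketch of the idea, not any group's system; all class and method names are hypothetical.

```python
# Illustrative sketch only: a learner that accepts mixed instruction modes
# (language, demonstration) and asks clarifying questions. All names are
# hypothetical; grounding and generalization are stubbed out.

class TaskLearner:
    def __init__(self):
        self.task_knowledge = {}  # goals, steps, and constraints learned so far

    def process(self, event):
        """Dispatch one instructional event to the appropriate interpreter."""
        if event["mode"] == "language":
            self.interpret_utterance(event["utterance"])
        elif event["mode"] == "demonstration":
            self.generalize_from_demo(event["state_action_trace"])

    def interpret_utterance(self, utterance):
        # Ground the words against the current situation (stubbed here).
        words = utterance.split()
        self.task_knowledge.setdefault("steps", []).append(
            {"verb": words[0], "args": words[1:]})

    def generalize_from_demo(self, trace):
        # Infer one step from each observed (state, action) pair.
        for state, action in trace:
            self.task_knowledge.setdefault("steps", []).append(
                {"verb": action, "context": state})

    def clarification_needed(self):
        """Ask a question when the learned steps have no stated goal."""
        if "goal" not in self.task_knowledge:
            return "What is the goal of this task?"
        return None

learner = TaskLearner()
learner.process({"mode": "language", "utterance": "place block-A on block-B"})
learner.process({"mode": "demonstration",
                 "state_action_trace": [("block-A on table", "pick-up block-A")]})
print(learner.clarification_needed())  # -> "What is the goal of this task?"
```

The point of the sketch is that instruction arrives as a heterogeneous event stream, and the learner's question is generated from gaps in its own partial task representation.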
Group 2: Task Knowledge
Central Question: What knowledge needs to be learned to acquire a novel task, and what existing background knowledge does an agent need so that it can effectively use that newly acquired knowledge?
The purpose of interactive task learning is for an agent to learn the knowledge necessary to perform well on a new task. Task competence is directly related to the concept of “understanding” as defined by Simon (1977): “S understands task T if S has the knowledge and procedures needed to perform T.” Learning to perform new tasks does not occur in isolation: we build on prior knowledge of subtasks and skills when learning new tasks, and we have existing knowledge about the structure of tasks in general, so we already know in the abstract what must be learned for a new task. Once a task's knowledge has been acquired, general task-independent reasoning, problem-solving, and planning capabilities can be marshaled to apply it in performing the task. Task competence also includes general task-management abilities, such as pursuing multiple tasks, interrupting low-priority tasks for higher-priority ones, and resuming suspended tasks. Furthermore, being able to perform a task is only the first step toward mastering it: an agent should be able to acquire additional knowledge through instruction, as well as through its own experience, until it achieves mastery of the task.
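As one way to picture the task-management abilities just mentioned, consider the minimal sketch below, in which tasks are interrupted and resumed by priority. The heap-based design is an assumption made for illustration, not a claim about how any particular agent architecture works.

```python
# Illustrative sketch only: multiple tasks, priority interruption, and
# resumption, modeled with a priority queue. A hypothetical design choice.

import heapq

class TaskManager:
    def __init__(self):
        self._queue = []   # entries: (negated priority, insertion order, task)
        self._count = 0

    def add(self, task, priority):
        heapq.heappush(self._queue, (-priority, self._count, task))
        self._count += 1

    def run_next(self):
        """Return the highest-priority pending task; lower-priority tasks
        remain suspended on the queue and resume when they surface again."""
        if not self._queue:
            return None
        _, _, task = heapq.heappop(self._queue)
        return task

mgr = TaskManager()
mgr.add("fold laundry", priority=1)
mgr.add("answer doorbell", priority=5)  # interrupts the lower-priority task
print(mgr.run_next())                   # -> "answer doorbell"
print(mgr.run_next())                   # -> "fold laundry"
```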
Another fundamental challenge for a task-learning agent is that, although it might know about the structure of tasks in general, it still needs to learn the specifics of many different tasks, about whose details it may know little, if anything. It therefore has to be able to learn diverse types of concepts (objects, categories, relations), procedures (hierarchical, recursive, interruptible), and goals (achievement, maintenance, process); a representation along these lines is sketched below. Although specialized agents may be capable of learning specific types of tasks (such as puzzles and games, or procedure-based tasks), the ultimate goal is to understand what is required for an agent to learn all types of tasks.
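The following sketch shows one possible representation of those three kinds of task knowledge. The enumeration values mirror the text above; everything else (names, fields, the example task) is a hypothetical design choice for illustration.

```python
# Illustrative sketch only: concepts, procedures, and goals as data structures.
# The goal types mirror the text; all other details are assumptions.

from dataclasses import dataclass, field
from enum import Enum

class GoalType(Enum):
    ACHIEVEMENT = "achievement"  # reach a state, then stop
    MAINTENANCE = "maintenance"  # keep a condition true over time
    PROCESS = "process"          # carry out an activity correctly

@dataclass
class Concept:
    name: str
    kind: str                    # "object", "category", or "relation"

@dataclass
class Procedure:
    name: str
    steps: list = field(default_factory=list)  # steps may name other
                                               # Procedures (hierarchical,
                                               # recursive structure)
    interruptible: bool = True

@dataclass
class TaskGoal:
    description: str
    goal_type: GoalType

# A newly learned task bundles all three kinds of knowledge.
tic_tac_toe = {
    "concepts": [Concept("mark", "object"),
                 Concept("three-in-a-row", "relation")],
    "procedures": [Procedure("take-turn", steps=["choose-cell", "place-mark"])],
    "goal": TaskGoal("form three marks in a row", GoalType.ACHIEVEMENT),
}
```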
Group 3: Learning Task Knowledge
Central Question: What are the computational processes for assimilating and accommodating the diversity of new task knowledge through natural interaction with a human?
The primary goal of an interactive task learner is to learn a task from its interactions with a teacher and from its own experiences. It must have the reasoning and learning capabilities necessary to interpret instructions, map them onto the current situation, extract information about the task, generalize from examples and demonstrations, store experiences in its memories for future use, and retrieve them when appropriate. What makes this especially challenging, in comparison to most research on machine learning, is that the learning does not occur within the confines of a specific task, where the learning mechanisms can be optimized to learn specific types of knowledge. In learning a new task, an agent must learn many different types of knowledge from different types of interactions with an instructor, and it must learn quickly. No human instructor will stand for giving scores of examples: the agent must extract the relevant knowledge during the interactive session, not through extensive offline analysis. Moreover, learning must be online and integrated with the agent's ongoing activities, so that at any time it can learn new aspects of tasks as well as interrupt learning based on the demands of its existing goals and tasks.
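The contrast with batch machine learning can be made concrete with a minimal sketch of online, one-shot learning during a session: each instruction is stored as it arrives and is immediately usable, with no offline pass over a corpus of examples. All names here are hypothetical stubs, not a real system's interface.

```python
# Illustrative sketch only: online learning interleaved with execution,
# in contrast to offline batch learning. All names are hypothetical stubs.

class Memory:
    def __init__(self):
        self.rules = {}            # state pattern -> learned step

    def store(self, state, step):
        self.rules[state] = step   # one-shot: stored immediately, no batch pass

    def retrieve(self, state):
        return self.rules.get(state, "ask-teacher")

def interactive_session(instructions, memory):
    """Interpret and store each instruction as it arrives; continue acting
    with whatever has been learned so far."""
    trace = []
    for state, step in instructions:          # a stream, not a corpus
        memory.store(state, step)             # learn during the session
        trace.append(memory.retrieve(state))  # immediately usable
    return trace

memory = Memory()
print(interactive_session([("empty-board", "place-mark-center")], memory))
# -> ['place-mark-center']
```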
Group 4: Task Instruction
Central Question: What instructional principles enable and improve interactive task learning?
Interactive task learning differs from traditional instructional contexts, such as education and training, where the stability of the domains affords the development of detailed, carefully crafted curricula. Teachers, trainers, and tutors are usually required to have a relevant advanced degree or expert proficiency before being allowed to instruct in those environments. By contrast, the point of interactive task learning is to acquire competence quickly from an available instructor who may be skilled in the task but not in instruction. The task may also be new to the instructor, as when a person creates a new game, a new technology becomes available, or a creative new means of accomplishing a known task is discovered and needs to be transmitted to a learner. This greater dynamism and in situ flexibility may impose special requirements on how instruction unfolds, or it may create affordances that either the learner or the instructor can exploit. We expect to draw on lessons learned from research in education, training, expert systems, and intelligent tutoring.
Allen, J., N. Chambers, G. Ferguson, L. Galescu, H. Jung, M. Swift, and W. Taysom. 2007. PLOW: A Collaborative Task Learning Agent. In: Proc. Conf. on Artificial Intelligence (AAAI), vol. 22, p. 1514. Menlo Park: AAAI Press.
Argall, B. D., S. Chernova, M. Veloso, and B. Browning. 2009. A Survey of Robot Learning from Demonstration. Robotics and Autonomous Systems 57(5):469–483.
Cakmak, M., and A. L. Thomaz. 2012. Designing Robot Learners that Ask Good Questions. In: Proc. 7th Annual ACM/IEEE Intl. Conf. on Human-Robot Interaction, pp. 17–24.
Gluck, K. A., and R. W. Pew, eds. 2005. Modeling Human Behavior with Integrated Cognitive Architectures: Comparison, Evaluation, and Validation. Mahwah, NJ: Erlbaum.
Hinrichs, T. R., and K. D. Forbus. 2014. X Goes First: Teaching Simple Games through Multimodal Interaction. Advances in Cognitive Systems 3:31–46.
Kaiser, Ł. 2012. Learning Games from Videos Guided by Descriptive Complexity. In: Proc. 26th AAAI Conf. on Artificial Intelligence, pp. 963–970. AAAI Press.
Kirk, J., and J. E. Laird. 2013. Learning Task Formulations through Situated Interactive Instruction. In: Proc. 2nd Conf. on Advances in Cognitive Systems, pp. 219–236. Baltimore, MD: ACS.
Petit, M., S. Lallee, J.-D. Boucher, G. Pointeau, P. Cheminade, D. Ognibene, E. Chinellato, U. Pattacini, I. Gori, U. Martinez-Hernandez, H. Barron-Gonzalez, M. Inderbitzin, A. Luvizotto, V. Vouloutsi, Y. Demiris, G. Metta, and P. F. Dominey. 2013. The Coordinating Role of Language in Real-Time Multimodal Learning of Cooperative Tasks. IEEE Transactions on Autonomous Mental Development 5(1):3–17.
Simon, H. A. 1977. Artificial Intelligence Systems That Understand. In: Proc. 5th Intl. Joint Conf. on Artificial Intelligence, vol. 2, pp. 1059–1073. San Francisco: Morgan Kaufmann.
Taatgen, N. A. 2013. The Nature and Transfer of Cognitive Skills. Psychological Review 120(3):439–471.
VanLehn, K. 2011. The Relative Effectiveness of Human Tutoring, Intelligent Tutoring Systems, and Other Tutoring Systems. Educational Psychologist 46(4):197–221.