Bi-level Optimization for Learning Cost Functions from Demonstration

Title: Bi-level Optimization for Learning Cost Functions from Demonstration
Publication Type: Conference Paper
Year of Publication: 2007
Authors: Mills-Price, C., W-K. Wong, P. Tadepalli, and E. W. Dereszynski
Conference Name: 2007 AAAI Workshop on Acquiring Planning Knowledge via Demonstration
Date Published: 07/2007
Conference Location: Vancouver, BC

An effective way for a novice to learn a new complex task is to observe an expert demonstrate how the task should be accomplished. While the expert demonstration provides all the necessary information for solving that particular instance of the task, the novice needs to be able to generalize from the demonstration in order to accomplish similar tasks in different settings. One way for the novice to generalize to other situations is to learn the expert’s preferences over goal configurations. In many domains, there may be multiple goal states for a given task. However, an expert prefers one goal state over the alternatives and thus demonstrates a particular sequence of steps leading to the preferred goal state. If each goal state is characterized as a vector of features, the expert’s internal cost function can map the feature values to a cost. Since the expert chose a particular goal state during the demonstration, we assume that this chosen goal configuration has an optimal cost with respect to the other alternatives considered by the expert.
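
The optimality assumption above can be made concrete with a small sketch. It is illustrative only and is not the bi-level optimization method of the paper: it assumes a linear cost function (a weighted sum of feature values) and substitutes a simple perceptron-style update to find weights under which the demonstrated goal is cheaper than every alternative. The names `cost`, `is_consistent`, and `fit_weights`, as well as the feature vectors in the usage example, are hypothetical.

```python
from typing import List, Sequence


def cost(weights: Sequence[float], features: Sequence[float]) -> float:
    """Assumed linear cost: weighted sum of a goal state's feature values."""
    return sum(w * f for w, f in zip(weights, features))


def is_consistent(
    weights: Sequence[float],
    chosen: Sequence[float],
    alternatives: Sequence[Sequence[float]],
) -> bool:
    """True if the chosen goal costs no more than any alternative."""
    chosen_cost = cost(weights, chosen)
    return all(chosen_cost <= cost(weights, alt) for alt in alternatives)


def fit_weights(
    chosen: Sequence[float],
    alternatives: Sequence[Sequence[float]],
    epochs: int = 100,
    lr: float = 0.1,
) -> List[float]:
    """Perceptron-style search (an assumption, not the paper's method).

    Whenever the chosen goal is not strictly cheaper than an alternative,
    shift the weights so that the alternative becomes more expensive
    relative to the chosen goal. Convergence is not guaranteed if no
    linear cost function explains the demonstration.
    """
    weights = [0.0] * len(chosen)
    for _ in range(epochs):
        updated = False
        for alt in alternatives:
            if cost(weights, chosen) >= cost(weights, alt):
                weights = [
                    w + lr * (a - c) for w, a, c in zip(weights, alt, chosen)
                ]
                updated = True
        if not updated:
            break
    return weights


# Hypothetical example: two features per goal state.
demonstrated_goal = [1.0, 0.0]
other_goals = [[0.0, 1.0], [2.0, 0.5]]
learned = fit_weights(demonstrated_goal, other_goals)
print(is_consistent(learned, demonstrated_goal, other_goals))
```

Under these assumptions, each alternative goal contributes one inequality constraint on the weights, so a single demonstration already restricts the set of cost functions the expert could be using.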