Graduation Semester and Year

2016

Language

English

Document Type

Thesis

Degree Name

Master of Science in Computer Science

Department

Computer Science and Engineering

First Advisor

Manfred Huber

Abstract

Imitation learning is the learning of advanced behavior whereby an agent acquires a skill by observing another agent's behavior while performing the same skill. The main objective of imitation learning is to make robots usable for a variety of tasks without programming them, but by simply demonstrating new tasks. The power of this approach arises because end users of such robots will frequently not know how to program the robot, might not understand the dynamics and behavioral capabilities of the system, and might not know how to program these robots to get different or new tasks done. Several challenges exist in achieving imitation capabilities, including the difference in state spaces: the robot observes task demonstrations in terms of features different from the ones describing the space in which it acts. The approach to imitation learning proposed in this thesis allows a robot to learn new tasks just by observing someone performing them. To achieve this, the robot system uses two models. The first is an internal model which represents all behavioral capabilities of the robot and consists of all possible states, actions, and the effects of executing the actions. The second is a demonstration model which represents the perception of the task demonstration and is a continuous-time, discrete-event model consisting of a stream of state behavior sequences. Examples of perceived behavior include a rolling behavior or a falling behavior of objects. The approach proposed here then learns the similarity between states of the internal model and states of the demonstration model using a neural network function approximator and reinforcement learning, with a reward feedback signal provided by the demonstrator. Using this similarity function, a heuristic search algorithm finds the action sequence that leads to the state and action sequence most similar to the observed task demonstration. In this way, the robot learns to map its internal states to the sequence of observed states, yielding a policy for performing the corresponding task.
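
The core loop the abstract describes (learn a state-similarity function from demonstrator reward, then use it as the heuristic of a search over the internal model) can be sketched briefly. The following Python sketch is illustrative only and is not taken from the thesis: the names SimilarityNet and best_first_match, the toy internal model, and the simulated demonstrator reward are all assumptions, and the thesis's actual network architecture, search procedure, and reward protocol may differ.

```python
import heapq
import numpy as np

rng = np.random.default_rng(0)

class SimilarityNet:
    """One-hidden-layer network scoring similarity in [0, 1] between an
    internal-model state and a demonstrated state (both feature vectors).
    Hypothetical stand-in for the thesis's function approximator."""
    def __init__(self, d_int, d_demo, hidden=16, lr=0.1):
        self.W1 = rng.normal(0.0, 0.5, (hidden, d_int + d_demo))
        self.w2 = rng.normal(0.0, 0.5, hidden)
        self.lr = lr

    def forward(self, s_int, s_demo):
        x = np.concatenate([s_int, s_demo])
        h = np.tanh(self.W1 @ x)
        y = 1.0 / (1.0 + np.exp(-(self.w2 @ h)))  # sigmoid similarity score
        return y, (x, h)

    def update(self, s_int, s_demo, target):
        # Squared-error gradient step pulling the score toward a
        # reward-derived target (reinforcement via demonstrator feedback).
        y, (x, h) = self.forward(s_int, s_demo)
        g = (y - target) * y * (1.0 - y)
        grad_w2 = g * h
        grad_W1 = np.outer(g * self.w2 * (1.0 - h ** 2), x)
        self.w2 -= self.lr * grad_w2
        self.W1 -= self.lr * grad_W1


def best_first_match(net, start, transitions, features, demo, max_expansions=500):
    """Best-first search over the internal model for the action sequence
    whose visited states score highest under `net` against the demo stream."""
    frontier = [(0.0, start, 0, [])]  # (-total similarity, state, demo idx, plan)
    best_score, best_plan = float("-inf"), []
    seen = set()
    while frontier and max_expansions > 0:
        max_expansions -= 1
        neg, s, i, plan = heapq.heappop(frontier)
        if i == len(demo):  # matched the whole demonstration
            if -neg > best_score:
                best_score, best_plan = -neg, plan
            continue
        if (s, i) in seen:
            continue
        seen.add((s, i))
        for action, s_next in transitions.get(s, []):
            sim, _ = net.forward(features[s_next], demo[i])
            heapq.heappush(frontier, (neg - sim, s_next, i + 1, plan + [action]))
    return best_score, best_plan


# Toy world: a 4-state chain; the demo shows an object moving through states 1..3.
features = {s: np.array([float(s)]) for s in range(4)}
transitions = {0: [("wait", 0), ("push", 1)], 1: [("push", 2)], 2: [("push", 3)]}
demo = [np.array([1.0]), np.array([2.0]), np.array([3.0])]

net = SimilarityNet(d_int=1, d_demo=1)
for episode in range(200):
    _, plan = best_first_match(net, 0, transitions, features, demo)
    # Simulated demonstrator feedback: 1 if the executed plan reproduces the
    # demonstrated task, 0 otherwise (stands in for the human reward signal).
    reward = 1.0 if plan == ["push", "push", "push"] else 0.0
    s = 0
    for i, action in enumerate(plan):
        s = dict(transitions[s])[action]
        net.update(features[s], demo[i], reward)

print(best_first_match(net, 0, transitions, features, demo))
```

In this sketch the demonstrator's scalar reward is the only supervision on similarity: every state pair along the executed matching is pushed toward the episode reward, a deliberately crude credit-assignment scheme chosen to keep the example short.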

Keywords

Functional imitation, Goal based imitation, Learning from demonstration, Programming by demonstration, Action behavior mapping

Disciplines

Computer Sciences | Physical Sciences and Mathematics

Comments

Degree granted by The University of Texas at Arlington
