Graduation Semester and Year




Document Type


Degree Name

Doctor of Philosophy in Electrical Engineering


Electrical Engineering

First Advisor

Frank Lewis


This work addresses research conducted at the Automation & Robotics Research Institute in the areas of UAV quadrotor control and heterogeneous multi-vehicle cooperation. A robot can achieve autonomy under the following conditions: it must be able to acquire knowledge about the environment and itself, and it must be able to reason under uncertainty. The control system must react quickly to immediate challenges, but it must also adapt slowly and improve based on accumulated knowledge. The major contribution of this work is the transfer of ADP algorithms from a purely theoretical environment to complex real-world robotic platforms that work in real time and in uncontrolled environments. Many solutions are adopted from those present in nature because they have been proven to be close to optimal in very different settings.
For the control of a single platform, reinforcement learning algorithms are used to design suboptimal controllers for a class of complex systems that can be conceptually split into local loops with simpler dynamics and relatively weak coupling to the rest of the system. Optimality is enforced by a global critic, but the curse of dimensionality is avoided by using local actors and intelligent pre-processing of the information used for learning the optimal controllers. The system model is used to construct the structure of the control system; on top of that, the adaptive neural networks that form the actors use the knowledge acquired during normal operation to get closer to optimal control. In real-world experiments, efficient learning is a strong requirement for success. This is accomplished by using an approximation of the system model to focus the learning for equivalent configurations of the state space. Because only local data is available for training, neural networks with local activation functions are implemented.
For the control of a formation of robots subject to dynamic communication constraints, game theory is used in addition to reinforcement learning. Each node maintains an extra set of state variables about all the other nodes that it can communicate with. The most important of these are trust and predictability, which incorporate knowledge acquired in the past into the control decisions taken by each node. The trust variable provides a simple mechanism for the implementation of reinforcement learning. For robot formations, potential field based control algorithms are used to generate the control commands. The formation structure changes due to the environment and due to the decisions of the nodes. The problem is one of building a graph and coalitions through distributed decisions while still reaching globally optimal behavior.
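
To illustrate the potential field based formation control mentioned in the abstract, the following is a minimal sketch and not the dissertation's actual algorithm. It assumes a spring-like potential V = 0.5 * gain * (distance - desired_dist)^2 between neighboring robots and simple kinematic robots in the plane that move along the negative gradient of that potential. The function names, the gain, the time step, and the neighbor table are all hypothetical choices made for this example.

```python
import math


def formation_step(positions, neighbors, desired_dist, gain=0.5, dt=0.1):
    """One gradient-descent step on an assumed spring-like inter-robot potential.

    positions: list of (x, y) tuples, one per robot.
    neighbors: dict mapping a robot index to the indices it can communicate with.
    """
    new_positions = []
    for i, (xi, yi) in enumerate(positions):
        ux, uy = 0.0, 0.0
        for j in neighbors[i]:
            xj, yj = positions[j]
            dx, dy = xi - xj, yi - yj
            dist = math.hypot(dx, dy)
            if dist < 1e-9:
                continue  # coincident robots: direction undefined, skip this pair
            # Gradient of 0.5 * gain * (dist - desired_dist)^2 with respect to robot i
            scale = gain * (dist - desired_dist) / dist
            ux -= scale * dx
            uy -= scale * dy
        # Each robot uses only its own neighbors' positions (a distributed decision)
        new_positions.append((xi + dt * ux, yi + dt * uy))
    return new_positions


def simulate(positions, neighbors, desired_dist, steps=500):
    """Apply formation_step repeatedly and return the final positions."""
    for _ in range(steps):
        positions = formation_step(positions, neighbors, desired_dist)
    return positions


if __name__ == "__main__":
    start = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
    links = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
    for x, y in simulate(start, links, desired_dist=1.0):
        print(round(x, 3), round(y, 3))
```

In this sketch, changing the neighbor table changes which robots attract or repel each other, which is one simple way to picture a formation whose structure depends on the communication graph. Trust, predictability, and coalition building are not modeled here.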


Electrical and Computer Engineering | Engineering


Degree granted by The University of Texas at Arlington