Graduation Semester and Year




Document Type


Degree Name

Doctor of Philosophy in Electrical Engineering


Department

Electrical Engineering

First Advisor

Frank Lewis


Abstract

Over the last few decades, strong connections between reinforcement learning (RL) and optimal control have prompted a major effort toward developing online, model-free RL algorithms that learn the solutions to optimal control problems. Although RL algorithms have been widely used to solve optimal regulation problems, few results have considered the optimal tracking control problem (OTCP), despite the fact that most real-world control applications are tracking problems. Moreover, existing methods for solving the OTCP require complete knowledge of the system dynamics.

This research begins by developing an adaptive optimal algorithm for the linear quadratic tracking (LQT) problem. A discounted performance function is introduced for the LQT problem, and a discounted algebraic Riccati equation (ARE) is derived whose solution solves the LQT problem. The integral reinforcement learning (IRL) technique and the off-policy RL technique are used to learn the solution to the discounted ARE online, without requiring complete knowledge of the system dynamics. The proposed idea is then extended to optimal tracking control for nonlinear systems, where input constraints are also taken into account.

In the next step, the proposed method is extended to solve the continuous-time (CT) two-player zero-sum game arising in the H∞ tracking control problem. An off-policy RL algorithm is developed that finds the solution to the H∞ tracking control problem online, in real time, without requiring the disturbance to be adjustable, which is impractical for most real systems.

The next results show how to design dynamic output-feedback (OPFB) controllers for CT linear systems with unknown dynamics. To this end, it is first shown that the system state can be reconstructed from limited measurements of the system output over a window of the system's history.
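As a minimal numerical illustration of the discounted ARE idea (the system matrices below are hypothetical, not taken from the dissertation), note that the discounted ARE is equivalent to a standard continuous-time ARE with the state matrix shifted by half the discount factor, so an off-the-shelf solver can verify a solution:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical 2-state system (illustrative values only)
A = np.array([[0.0, 1.0], [-1.0, -2.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
gamma = 0.1  # discount factor of the performance function

# Discounted ARE:  A'P + PA - gamma*P + Q - P B R^{-1} B' P = 0
# This equals a standard CARE with the shifted matrix A - (gamma/2) I.
P = solve_continuous_are(A - (gamma / 2) * np.eye(2), B, Q, R)

# Check the discounted-ARE residual directly
residual = (A.T @ P + P @ A - gamma * P + Q
            - P @ B @ np.linalg.inv(R) @ B.T @ P)
print(np.max(np.abs(residual)))  # near zero
```

In the dissertation's model-free setting, the same P is learned from measured data rather than computed from (A, B) as done here; this sketch only shows what equation that learning process targets.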
A Bellman equation is then developed that evaluates a control policy and finds an improved policy simultaneously, using only limited measurements of the system output. Using this Bellman equation, a model-free IRL-based OPFB controller is developed.

Next, a model-free approach is developed for output synchronization of heterogeneous multi-agent systems, in which both the leader's and the followers' dynamics are assumed to be unknown. First, a distributed adaptive observer is designed to estimate the leader's state for each agent. A model-free off-policy RL algorithm is then developed to solve the optimal output synchronization problem online, in real time. It is shown that this distributed RL approach implicitly solves the output regulation equations, without explicitly solving them and without requiring knowledge of the leader's or the agents' dynamics.

Finally, a model-free RL-based method is designed for the human-robot interaction system to help the robot adapt to the human operator's skill level. This assists the human operator in performing a given task with minimum workload demands and optimizes the overall human-robot system performance. First, a robot-specific neuro-adaptive controller is designed to make the unknown nonlinear robot behave like a prescribed robot impedance model. Then, a task-specific outer-loop controller is designed to find the optimal parameters of the prescribed robot impedance model, online in real time.
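The simultaneous policy-evaluation / policy-improvement loop that such Bellman equations carry out from measured data can be sketched in its classical model-based form, Kleinman's policy iteration for the LQR case (matrices and the initial gain below are illustrative assumptions, not values from the dissertation):

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

# Hypothetical system (illustrative only); A here is already Hurwitz,
# so the zero gain is a valid stabilizing initial policy.
A = np.array([[0.0, 1.0], [-1.0, -2.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

K = np.zeros((1, 2))  # initial stabilizing policy
for _ in range(20):
    Acl = A - B @ K
    # Policy evaluation: solve  Acl'P + P Acl + Q + K'RK = 0
    P = solve_continuous_lyapunov(Acl.T, -(Q + K.T @ R @ K))
    # Policy improvement:  K <- R^{-1} B' P
    K = np.linalg.solve(R, B.T @ P)

# The iteration converges to the ARE solution
P_are = solve_continuous_are(A, B, Q, R)
print(np.max(np.abs(P - P_are)))
```

The IRL and off-policy methods in the dissertation replace the Lyapunov-equation step, which needs (A, B), with an equivalent data-based Bellman equation evaluated along measured trajectories, which is what makes them model-free.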


Disciplines

Electrical and Computer Engineering | Engineering


Degree granted by The University of Texas at Arlington