Graduation Semester and Year




Document Type


Degree Name

Doctor of Philosophy in Electrical Engineering


Electrical Engineering

First Advisor

Frank Lewis


Optimal feedback control design has been responsible for much of the successful performance of engineered systems in aerospace, manufacturing, industrial processes, vehicles, ships, robotics, and elsewhere. Although most control design methods are concerned only with the stability of the controlled system, stability is a bare minimum requirement, and it is desirable to design a controller by optimizing some predefined performance criteria. However, classical optimal control methods rely on offline solutions to complicated Hamilton-Jacobi-Bellman (HJB) equations, which require complete knowledge of the system dynamics. Therefore, they cannot cope with uncertainties and changes in the dynamics. This research presents adaptive control structures based on reinforcement learning (RL) for computing online the solutions to H-2 optimal tracking and H-infinity control of single-agent systems and optimal coordination of multi-agent systems. A family of adaptive controllers is designed that converges in real time to optimal control and game-theoretic solutions by using data measured along the system trajectories. First, an alternative approach for formulating the optimal tracking problem in a causal manner is developed, which makes it possible to use RL to find the optimal solutions. On-policy RL is used to solve linear and nonlinear H-2 optimal control problems. In contrast to existing methods, the proposed approach for nonlinear systems takes the input constraints into account in the optimization problem by using a nonquadratic performance function. Then, a new model-free off-policy learning method is presented to find the solution to the H-infinity control problem online in real time. The proposed method has two main advantages over other model-free methods. First, the disturbance input does not need to be adjusted in a specific manner. This makes it more practical, as the disturbance cannot be specified in most real-world applications. Second, no bias results from adding a probing noise to the control input to maintain the persistence of excitation (PE) condition. Finally, an optimal model-free solution to the output synchronization problem of heterogeneous discrete-time systems is developed. It is shown that this control protocol implicitly solves the output regulator equations. The relationship between the solution to the output regulator equations and the proposed solution is also shown.


Reinforcement learning, Optimal tracking


Electrical and Computer Engineering | Engineering


Degree granted by The University of Texas at Arlington