Graduation Semester and Year

2007

Language

English

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Electrical Engineering

Department

Electrical Engineering

First Advisor

Frank Lewis

Abstract

In this work, approximate dynamic programming (ADP) designs based on adaptive critic structures are developed to solve discrete-time optimal control problems in which the state and action spaces are continuous. This work considers linear discrete-time systems as well as nonlinear discrete-time systems that are affine in the input. The research has produced forward-in-time reinforcement learning algorithms that converge to the solution of the Generalized Algebraic Riccati Equation (GARE) for linear systems. For the nonlinear case, a forward-in-time reinforcement learning algorithm is presented that converges to the solution of the associated Hamilton-Jacobi-Bellman (HJB) equation. The results in the linear case can be viewed as a way to solve the GARE of the well-known discrete-time optimal control problem forward in time. Four design algorithms are developed: Heuristic Dynamic Programming (HDP), Dual Heuristic Dynamic Programming (DHP), Action-Dependent Heuristic Dynamic Programming (ADHDP), and Action-Dependent Dual Heuristic Dynamic Programming (ADDHP). The significance of these algorithms is that for some of them, particularly ADHDP, no a priori knowledge of the plant model is required to solve the dynamic programming problem. Another major outcome of this work is a convergent policy iteration scheme based on the HDP algorithm that allows neural networks to approximate the value function of the discrete-time HJB equation to arbitrary accuracy. This online algorithm can be implemented in a way that requires only partial knowledge of the model of the nonlinear dynamical system. The dissertation includes detailed proofs of convergence for the proposed algorithms: HDP, DHP, ADHDP, ADDHP, and the nonlinear HDP. Practical numerical examples demonstrate the effectiveness of the developed optimization algorithms.
For nonlinear systems, a comparison with methods based on the State-Dependent Riccati Equation (SDRE) is also presented. In all of the provided examples, parametric structures such as neural networks are used to find compact representations of the value function and the optimal policies for the corresponding optimal control problems.
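For the linear case, the forward-in-time iteration described in the abstract can be sketched as follows: with a quadratic value function V_i(x) = xᵀP_i x, each HDP value-iteration update V_{i+1}(x) = min_u [xᵀQx + uᵀRu + V_i(Ax + Bu)] reduces to a Riccati-like recursion on P_i that converges to the solution of the discrete-time algebraic Riccati equation. This is a minimal illustrative sketch, not the dissertation's implementation; the system matrices and iteration count are assumed example values.

```python
import numpy as np

def hdp_value_iteration(A, B, Q, R, iters=500):
    """Sketch of HDP value iteration for a discrete-time LQR problem.

    Starting from P_0 = 0, iterate
        P_{i+1} = Q + A'P_i A - A'P_i B (R + B'P_i B)^{-1} B'P_i A,
    which is the quadratic-value-function form of the HDP update.
    Returns the converged cost matrix P and the greedy feedback gain K.
    """
    n = A.shape[0]
    P = np.zeros((n, n))
    K = np.zeros((B.shape[1], n))
    for _ in range(iters):
        # Greedy (argmin) policy for the current value estimate.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Value update under that policy.
        P = Q + A.T @ P @ A - A.T @ P @ B @ K
    return P, K

# Illustrative stable 2-state system (assumed values, not from the text).
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P, K = hdp_value_iteration(A, B, Q, R)
```

The converged P can be checked against the discrete-time algebraic Riccati equation; the residual should be near zero, mirroring the convergence results proved in the dissertation for the linear HDP algorithm.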

Disciplines

Electrical and Computer Engineering | Engineering

Comments

Degree granted by The University of Texas at Arlington
