Using Approximate Dynamic Programming to Control an Electric Vehicle Charging Station System

Ying Chen

Degree granted by The University of Texas at Arlington

Abstract

Dynamic programming (DP), a mathematical programming approach for optimizing a system that evolves over time, has been applied to multi-stage optimization problems in many areas, such as manufacturing systems and environmental engineering. Due to the “curses of dimensionality”, traditional DP methods can solve only low-dimensional problems or problems under very limiting restrictions. Approximate dynamic programming (ADP) was proposed so that DP can be applied to high-dimensional, complex practical systems. Several versions of ADP have been introduced in the literature. This study uses the design and analysis of computer experiments (DACE) approach, which discretizes the state space via design of experiments and builds the value function with statistical tools; the result is referred to as the DACE-based ADP approach. In this research, the author first uses support vector regression (SVR) to build the value function, in place of previously used techniques such as neural networks and multivariate adaptive regression splines, and explores the performance of SVR in value function approximation relative to those techniques. Next, a 45-degree line correspondence stopping criterion is specified as an algorithm. The author then formulates the complex electric vehicle (EV) charging station system in the Dallas-Fort Worth (DFW) metropolitan area in Texas as a Markov decision process (MDP) and applies the DACE-based infinite-horizon ADP algorithm with SVR to solve this high-dimensional, continuous-state, infinite-horizon problem. The specified 45-degree line correspondence criterion is used to stop the DP iterations and select the ADP policy. A greedy algorithm is proposed as a benchmark and compared with the selected ADP policy through a paired t-test.
The results demonstrate that the DACE-based infinite-horizon ADP algorithm can solve high-dimensional, large-scale, complex DP problems over continuous spaces, and that the quantified 45-degree line correspondence rule stops the DP iterations at a reasonable point and selects a high-quality ADP policy.
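
As a rough illustration of the idea behind a 45-degree line correspondence check, the sketch below compares value estimates from two successive DP iterations at the same design points and tests whether the pairs fall close to the line y = x. The function name, the two tolerances, and the use of a least-squares slope together with a relative mean gap are assumptions made here for illustration only; the abstract does not state the exact quantified rule used in the thesis.

```python
def on_45_degree_line(v_prev, v_curr, slope_tol=0.05, rel_gap_tol=0.05):
    """Hypothetical check: do the points (v_prev[i], v_curr[i]) lie near y = x?

    v_prev, v_curr: value estimates at the same design points from two
    successive iterations. Both tolerances are illustrative assumptions.
    """
    n = len(v_prev)
    if n != len(v_curr) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")

    mean_x = sum(v_prev) / n
    mean_y = sum(v_curr) / n
    sxx = sum((x - mean_x) ** 2 for x in v_prev)
    if sxx == 0:
        raise ValueError("previous values must not all be identical")

    # Least-squares slope of current values regressed on previous values.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(v_prev, v_curr))
    slope = sxy / sxx

    # Mean absolute gap from the line y = x, relative to the value scale.
    scale = max(abs(x) for x in v_prev) or 1.0
    mean_gap = sum(abs(y - x) for x, y in zip(v_prev, v_curr)) / n

    return abs(slope - 1.0) <= slope_tol and mean_gap / scale <= rel_gap_tol


# Toy numbers (not from the thesis): an early iteration whose values are
# still shifting, and a later iteration whose values have nearly settled.
early = on_45_degree_line([10, 20, 30, 40], [16, 29, 43, 58])
late = on_45_degree_line([10, 20, 30, 40], [10.2, 19.9, 30.1, 40.3])
print("stop after early iteration:", early)
print("stop after late iteration:", late)
```

In an ADP loop, a check of this kind would be evaluated after each value function refit, with the iteration stopped and the policy selected once the check passes.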