ORCID Identifier(s)


Graduation Semester and Year




Document Type


Degree Name

Doctor of Philosophy in Biomedical Engineering



First Advisor

Yujie Chi


Cancer is a leading cause of premature death (death before the age of 70). The five leading cancers for men and women combined are lung, colorectal, pancreatic, breast, and prostate cancer. In 2020 there were an estimated 19.3 million new cases and an estimated 9.9 million deaths. The cancer burden is expected to grow to 28.4 million new cases by the year 2040. Surgery, chemotherapy, and radiotherapy are the three pillars of cancer treatment in the modern clinic. In radiotherapy, ionizing radiation particles travel through the patient's body, deposit energy along the way, and damage DNA structure. A balance must be struck between killing tumor cells and sparing healthy tissue. Intensity Modulated Radiation Therapy (IMRT) made it possible to better focus ionizing radiation deposition on tumors by using multi-leaf collimators (MLC). Stereotactic body radiotherapy (SBRT) further differentiated the radiation response between tumors and normal tissues by delivering a much higher dose per fraction, in fewer fractions, than conventional radiotherapy for tumors that are sensitive to fractionation. Yet the complexity of the procedure in the radiation clinic, which includes imaging, planning, treatment simulation, and patient setup, can hinder the effectiveness of IMRT and SBRT. Hindering factors include organ motion, which introduces large uncertainties between the delivered dose and the planned dose, and treatment planning, in which the quality of each plan depends heavily on the time and skill of the human planner. In our research, we retrospectively investigated inter-fractional and intra-fractional motion using patient data collected from a clinical trial of SBRT for high-risk prostate cancer. Our investigation revealed that the relative inter-fractional motion between the pelvis and the prostate has a small impact on pelvic target dose coverage when the patient is set up with the prostate site aligned. This was mainly due to the strict bladder-filling protocol followed before treatment.
As for intra-fractional prostate motion, the dose to the prostate dropped by 6.5% on average because of the motion, which indicates the importance of effective motion intervention during treatment. We also investigated the possibility of automating the treatment planning procedure with reinforcement learning. With the use of MLC, treatment planning for radiotherapy is a challenging task because it requires solving an inverse optimization problem with millions of possible solutions. Consequently, the current treatment planning process relies heavily on human labor to tune the planning parameters, which can be tedious, time consuming, and not easily reproducible. With the rapid development of Artificial Intelligence (AI), there have been increasing efforts to automate the treatment planning process. One attractive AI technique, reinforcement learning (RL), makes it possible to build a virtual treatment planner (VTP) that mimics human decision making when tuning the treatment planning parameters. In this dissertation, I investigated two RL techniques for automatic treatment planning: Q learning and Actor-Critic. Q learning is a value-based learning method, and I applied it to construct a VTP that can operate an in-house dose-volume constrained treatment planning system (TPS) for prostate cancer IMRT treatment planning. The VTP was successfully trained with 10 prostate cancer patient cases and tested with an additional 50 cases. One problem with Q learning is that it is hard to train on complex treatment planning tasks because it does not specify an exploration mechanism. To solve this problem, I implemented the Actor-Critic algorithm, in which exploration is specified by the action probabilities of the actor. The preliminary results indicate that the Actor-Critic network is more powerful in generating high-quality treatment plans.
Furthermore, to enable the selection of actions from a continuous space (that is, to tune a treatment planning parameter continuously), I am in the process of implementing Proximal Policy Optimization 2 (PPO2) for automatic treatment planning, in the hope that it can provide more powerful tuning of the treatment planning parameters. Overall, I expect that my work in motion management and automatic treatment planning will lead to technical advances in the radiation clinic for cancer treatment.
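
The difference in exploration between Q learning and Actor-Critic described in the abstract can be illustrated with a minimal Python sketch. This is not the dissertation's implementation: the tabular Q table, the function names (q_update, epsilon_greedy, actor_sample), and the hyperparameters alpha, gamma, and epsilon are all assumptions chosen for illustration. The Q learning update only defines how action values are learned, so exploration has to be supplied separately (here, epsilon-greedy), whereas an actor explores by sampling from its own action probabilities.

```python
import math
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q learning step (illustrative; alpha and gamma are assumed values).

    Moves Q(s, a) toward the target r + gamma * max_a' Q(s', a').
    """
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


def epsilon_greedy(q_row, epsilon, rng):
    """Exploration bolted onto Q learning: random action with probability epsilon."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    # Otherwise act greedily with respect to the current value estimates.
    return max(range(len(q_row)), key=q_row.__getitem__)


def softmax(logits):
    """Convert actor outputs (logits) into action probabilities."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def actor_sample(logits, rng):
    """Actor-style exploration: sample an action from the policy's own probabilities."""
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical 2-state, 2-action table; an action could stand for
    # "raise" or "lower" a planning parameter.
    q = [[0.0, 0.0], [1.0, 2.0]]
    q_update(q, state=0, action=1, reward=1.0, next_state=1)
    print("greedy action:", epsilon_greedy(q[0], epsilon=0.0, rng=rng))
    print("sampled action:", actor_sample([0.2, 1.5], rng))
```

In this sketch the amount of exploration in the Q learning agent is fixed by the hand-chosen epsilon, while the actor's exploration changes as its logits are trained, which is one way to read the abstract's remark that the actor "specifies" exploration.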


Deep learning, Reinforcement learning, Q learning


Biomedical Engineering and Bioengineering | Engineering


Degree granted by The University of Texas at Arlington