Graduation Semester and Year

Summer 2024

Language

English

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Mathematics

Department

Mathematics

First Advisor

Suvra Pal

Second Advisor

Shan Sun-Mitchell

Third Advisor

Andrzej Korzeniowski

Fourth Advisor

Xinlei (Sherry) Wang

Fifth Advisor

Souvik Roy

Abstract

Recent advancements in medical treatments have significantly enhanced the rates of recovery for numerous chronic illnesses. This progress has sparked growing interest in developing suitable statistical models capable of handling survival data that includes substantial cure fractions. The mixture cure model finds extensive application in analyzing survival data when there exists a cured subgroup. Standard logistic regression-based approaches for modeling the incidence part of the mixture cure model may suffer from poor predictive accuracy, especially in the presence of high-dimensional covariates and/or non-linear covariate effects. To overcome this limitation, we propose integrating distinct machine learning algorithms with the mixture cure model, which offers improved capability in capturing non-linear patterns in the data. For scenarios where interpretability is crucial, we propose a novel mixture cure model that utilizes a decision tree-based classifier for modeling the incidence part, while preserving either the proportional hazards or the accelerated failure time structure for the latency part.

Moreover, this research extends the scope of mixture cure models to analyze mixed-case interval-censored data with a cured subgroup. A novel two-component framework is introduced, integrating a support vector machine approach for estimating the probability of being cured and a Cox proportional hazards structure for modeling the survival distribution of uncured individuals. An expectation maximization algorithm is developed for parameter estimation. The evaluation comprises diverse simulation studies and real-world data, including cutaneous melanoma data, NASA's Hypobaric Decompression Sickness Data, and leukemia data, to demonstrate the superiority and benefit of incorporating machine learning models into cure rate estimation.
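
For readers unfamiliar with the terminology, the two-part structure described in the abstract can be sketched in its standard textbook form. The notation below is an assumption introduced for illustration and is not taken from the dissertation: \pi(z) is the probability of being uncured given incidence covariates z, S_u(t | x) is the survival function of uncured individuals given latency covariates x, \delta_i is the event indicator, and w_i is the E-step weight for right-censored data. The dissertation replaces the logistic form of \pi(z) with machine learning classifiers, and its interval-censored E-step is not given in the abstract, so the last line shows only the simpler right-censored case.

```latex
\begin{align}
  % Population survival: cured subjects never fail, uncured follow S_u
  S_{\mathrm{pop}}(t \mid x, z) &= 1 - \pi(z) + \pi(z)\, S_u(t \mid x), \\
  % Standard logistic incidence model (the baseline the abstract improves on)
  \pi(z) &= \frac{\exp(z^{\top} b)}{1 + \exp(z^{\top} b)}, \\
  % Proportional hazards latency with baseline survival S_0
  S_u(t \mid x) &= S_0(t)^{\exp(x^{\top} \beta)}, \\
  % E-step weight (right-censored case): conditional probability of being uncured
  w_i &= \delta_i + (1 - \delta_i)\,
         \frac{\pi(z_i)\, S_u(t_i \mid x_i)}
              {1 - \pi(z_i) + \pi(z_i)\, S_u(t_i \mid x_i)}.
\end{align}
```

In this sketch, a subject with an observed event has w_i = 1, while a censored subject receives a fractional weight that the M-step uses when refitting the incidence and latency parts.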

Keywords

Cure rate models, Machine learning, Survival analysis, EM-algorithm, Simulation studies

Disciplines

Data Science | Statistics and Probability

Available for download on Wednesday, July 29, 2026
