Fan Yang

ORCID Identifier(s)


Graduation Semester and Year




Document Type


Degree Name

Doctor of Philosophy in Computer Science


Computer Science and Engineering

First Advisor

Hwa Won Kim

Second Advisor

Gautam Das

Third Advisor

Dajiang Zhu

Fourth Advisor

Amal Isaiah


With recent advances in and widespread adoption of imaging technologies, clinical practitioners and scientists can easily acquire and store large amounts of data from various neuroimaging modalities, such as Diffusion Tensor Imaging (DTI), Magnetic Resonance Imaging (MRI), resting-state functional MRI (rs-fMRI), and Positron Emission Tomography (PET). These imaging data sources capture a rich set of factors that influence patients' cognitive health, offer an unprecedented, objective, multi-resolution view of patients for understanding brain structure and function, and have significant potential to improve healthcare by supporting better decision-making in diagnosing, monitoring, and treating diseases. Machine Learning methods have emerged as the state of the art for learning from large-scale neuroimaging data. While their use in medical applications is interesting and insightful, it is often very challenging in practice. Some of the major challenges we encounter in adopting Machine Learning methods for neuroscience tasks are the following: examining the association between socioeconomic characteristics and clinical brain measurements is difficult given the subtle variations between groups with different socioeconomic status; effectively characterizing the early symptoms of Alzheimer’s disease (AD) is in many cases not possible; forecasting and capturing the disease-related dynamics of clinical measurements is necessary to better understand the progression of AD; and modelling the dynamic associations between lengthy sequences of multivariate variables for brain connectivity analysis is computationally expensive.
To address these challenges, we propose multiple novel Machine Learning methods: one that provides a multi-scale representation of the original measurements to enhance the sensitivity of downstream statistical analysis, one that integrates graph structure and diagnostic label information to characterize early symptoms of AD, one that incorporates time-dependent label information to better understand the progression of AD, and one that efficiently estimates and predicts dynamic covariances on large-scale time series data. We demonstrate the developed methods on challenging real-world data from various clinical studies in the neuroscience domain, including the Adolescent Brain Cognitive Development (ABCD) study, the Alzheimer’s Disease Neuroimaging Initiative (ADNI) study, and the Human Connectome Project (HCP). Our contributions advance the state of the art in leveraging Machine Learning methods for neuroscience applications and highlight how artificial intelligence applied to large-scale neuroimaging data can improve healthcare through better decision-making.


Machine learning, Neuroimaging, Statistical analysis, Representation learning


Computer Sciences | Physical Sciences and Mathematics


Degree granted by The University of Texas at Arlington