Graduation Semester and Year
Spring 2026
Language
English
Document Type
Thesis
Degree Name
Doctor of Philosophy in Computer Science
Department
Computer Science and Engineering
First Advisor
Junzhou Huang
Second Advisor
Dajiang Zhu
Third Advisor
Jean Gao
Fourth Advisor
Meng Ye
Abstract
In the evolving field of artificial intelligence, the efficacy of deep learning models is often limited by the quality of their training and the clarity of their decision-making processes. This dissertation addresses these challenges in two areas: enhancing pre-training strategies and improving model interpretability. Our approach is twofold: we integrate novel pre-training methodologies that embed domain-specific knowledge early in model training, and we develop techniques for disentangling and clarifying the decision-making mechanisms within these models.
The first direction of our research employs MoDNA, a motif-oriented pre-training framework designed for DNA language models. By leveraging self-supervised learning, MoDNA harnesses the vast amounts of unlabeled genomic data while infusing biological priors into the training process, significantly boosting the model’s performance on downstream regulatory tasks such as promoter prediction and transcription factor binding site identification. This method demonstrates how targeted pre-training can overcome the limitations posed by sparse labeled data in genomics.
The second direction focuses on the interpretability of graph neural networks (GNNs), which are crucial for analyzing structured data. Here, we introduce Interpretable Graph Neural Networks with Disentangled Subgraph (IGNN-DS), a method that disentangles the causal and spurious factors influencing model predictions. By formalizing the interactions among graph structures, their labels, and the derived subgraphs through a Structural Causal Model (SCM), this approach clarifies how predictions are made while enhancing the model’s robustness to distribution shifts and out-of-distribution data. Building upon this foundation, we extend the framework by dividing the SCM into two modes: Fully Informative Invariant Features (FIIF) and Partially Informative Invariant Features (PIIF). We then introduce Causal Subgraphs and Information Bottlenecks (CSIB), which integrates invariance principles based on the graph information bottleneck to guide the generation of causal subgraphs, achieving superior performance in out-of-distribution scenarios.
Together, these strategies make deep learning models more reliable and understandable, thereby increasing their applicability and trustworthiness in critical domains such as genomics and structural data analysis. By advancing both pre-training and interpretability, this research supports further progress in machine learning and its applications.
Keywords
DNA Language Models; Graph Neural Networks; Model Interpretability
Disciplines
Computer Engineering
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Recommended Citation
An, Weizhi, "PRE-TRAINING AND INTERPRETABILITY IN DEEP LEARNING MODELS" (2026). Computer Science and Engineering Dissertations. 11.
https://mavmatrix.uta.edu/cse_dissertations2/11