Graduation Semester and Year

Spring 2026

Language

English

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Computer Science

Department

Computer Science and Engineering

First Advisor

Jean Gao

Second Advisor

Dajiang Zhu

Third Advisor

Qilian Liang

Fourth Advisor

Junzhou Huang

Abstract

The rapid growth of single-cell RNA sequencing and transcriptomic datasets has created major computational challenges in causal discovery, representation learning, and biologically faithful data generation. To address these challenges, this dissertation presents three complementary deep learning frameworks for the analysis and modeling of transcriptomic data. Together, these methods form an integrative computational toolkit for understanding complex biological systems from high-dimensional and heterogeneous gene expression data.

First, this dissertation introduces DAG-VAERL, a causal discovery framework that integrates variational autoencoders, graph neural networks, reinforcement learning, and attention mechanisms to infer directed acyclic graphs for gene regulatory network analysis. DAG-VAERL improves causal structure learning in nonlinear and high-dimensional settings and demonstrates strong performance on both synthetic datasets and Alzheimer’s disease transcriptomic data, enabling more accurate discovery of causal relationships among lncRNAs and disease-related genes.

Second, this dissertation proposes the Transcriptome Graph Transformer (TGT), an unsupervised graph Transformer framework for transcriptomic representation learning. By modeling heterogeneous biological graphs composed of gene, pathway, and virtual nodes, TGT learns generalizable transcriptomic representations through pretraining. The model demonstrates strong performance across multiple downstream tasks, including Alzheimer’s disease classification, tumor transcriptomic classification, biomarker and pathway discovery, and zero-shot clustering of both transcriptomic and spatial transcriptomic data, while also providing improved interpretability and cross-dataset generalization.

Finally, this dissertation presents TransFlow, a Transformer-enhanced flow matching framework for in silico generation of single-cell RNA expression profiles. By learning biologically meaningful continuous-time transport dynamics under sparsity and non-negativity constraints, TransFlow generates realistic and cell-type-specific synthetic transcriptomic data. Experimental results on PBMC and Alzheimer’s disease datasets show that the framework better preserves manifold structure, correlation patterns, sparsity characteristics, and biologically relevant lncRNA-associated functional programs, supporting applications in data augmentation, benchmarking, and simulation of cellular states.

Overall, the methods presented in this dissertation advance transcriptomic analysis across the three core tasks of causal inference, unsupervised representation learning, and generative modeling. These contributions provide practical computational approaches for uncovering disease mechanisms, identifying biomarkers, and modeling complex cellular states, thereby supporting future advances in systems biology and precision medicine.

Keywords

Single-Cell Multi-Omics, Transcriptomics, Gene Regulatory Networks, Causal Discovery, Directed Acyclic Graphs, Graph Neural Networks, Graph Transformer, Deep Generative Models, Alzheimer’s Disease, Data Integration, Computational Biology

Disciplines

Biomedical Informatics | Other Computer Sciences

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Recommended Citation

Long, Teng, "INTEGRATIVE APPROACHES AND DATA ANALYSIS FOR SINGLE-CELL RNA SEQUENCING DATA" (2026). Computer Science and Engineering Dissertations. 1.
https://mavmatrix.uta.edu/cse_dissertations2/1

Included in

Biomedical Informatics Commons, Other Computer Sciences Commons

Computer Science and Engineering Dissertations

INTEGRATIVE APPROACHES AND DATA ANALYSIS FOR SINGLE-CELL RNA SEQUENCING DATA
