Graduation Semester and Year
Spring 2026
Language
English, British
Document Type
Thesis
Degree Name
Doctor of Philosophy in Computer Science
Department
Computer Science and Engineering
First Advisor
Dajiang Zhu
Abstract
The complexity of human disease arises from biological processes that unfold across multiple scales, from molecular variation through cellular function and tissue organisation to brain phenotypes, each of which is associated with distinct measurement modalities, regularities, and characteristics. Contemporary biomedical artificial intelligence has brought the opportunity to reveal the complexity within; however, its methodological default, in which models are trained on the most readily available modality, does not adequately engage with the connected multi-scale structure by which biological meaning is constituted. The research area of multi-omics and multi-modal AI for biomedicine remains at an early exploratory stage, and the work presented in this dissertation and its accompanying review is offered as one contribution to advancing that exploration. Specifically, it advances the argument, supported by methodological demonstrations, that the next generation of biomedical AI must be redesigned around biological multi-scale networks of bio-factors rather than developed in abstraction from them.
This argument is developed through a coordinated programme of representation learning across three complementary directions. The first concerns brain phenotyping, advancing the principle that computational phenotypes should be anatomically meaningful and individualised rather than imposed by arbitrary atlases. The second concerns the representation of multi-omics bio-factors, showing that large language models adapted to curated biomedical evidence can have their internal embedding spaces reorganised according to tissue, organ, and mechanistic structure, thereby transforming general-purpose models into instruments by which the relational architecture of biomedical knowledge is encoded. The third concerns multi-omics integration, proposing diverse methodologies for integrating genomics and brain phenotypes. The accompanying shift from fusion-as-combination to fusion-as-modulation more faithfully reflects the regulatory relationship between genotype and phenotype.
Taken together, these contributions converge on a methodological thesis: progress in biomedical AI depends on the encoding of biological structure, i.e., anatomical, evidentiary, and regulatory structure, within the architecture of the model. The literature review situates this thesis within the broader landscape of biomedical foundation models, examining the structural limitations that constrain current systems, including the gap between predictive accuracy and biological meaning, the landscape and characteristics of biomedical data, the persistent challenge of cross-scale integration, and the insufficiency of current tokenisation schemes for context-dependent bio-factors. It further articulates the trajectories along which the field is likely to advance. The author's efforts are presented as one early-stage exploration in a research area that will require sustained and collective work over the coming decade. The central throughline is a reframing of biomedical artificial intelligence as a scientific instrument for cross-scale, long-context biological understanding, one that ultimately contributes to the understanding of complex human disease.
Keywords
Deep learning, Multi-omics, Genomics, Genetics, Biomedicine, Artificial Intelligence, Multi-modality, Brain imaging, Neurodegenerative Diseases, Aging
Disciplines
Computational Biology | Data Science | Genetics | Genomics | Numerical Analysis and Scientific Computing | Theory and Algorithms
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
Recommended Citation
Lyu, Yanjun, "TOWARD INTERPRETABLE MULTI-OMICS MULTIMODAL BIOMEDICAL ARTIFICIAL INTELLIGENCE" (2026). Computer Science and Engineering Dissertations. 10.
https://mavmatrix.uta.edu/cse_dissertations2/10
Included in
Computational Biology Commons, Data Science Commons, Genetics Commons, Genomics Commons, Numerical Analysis and Scientific Computing Commons, Theory and Algorithms Commons