Graduation Semester and Year

Spring 2026

Language

English, British

Document Type

Thesis

Degree Name

Doctor of Philosophy in Computer Science

Department

Computer Science and Engineering

First Advisor

Dajiang Zhu

Abstract

The complexity of human disease arises from biological processes that unfold across multiple scales, from molecular variation through cellular function and tissue organisation to brain phenotypes, each of which is associated with distinct measurement modalities, regularities, and characteristics. Contemporary biomedical artificial intelligence offers an opportunity to reveal this complexity; however, its methodological default, in which models are trained on the most readily available modality, does not adequately engage with the multi-scale, connected structure by which biological meaning is constituted. The research area of multi-omics and multi-modal AI for biomedicine remains at an early exploratory stage, and the work presented in this dissertation and accompanying review is offered as one contribution to advancing that exploration. Specifically, it advances the argument, supported by methodological demonstrations, that the next generation of biomedical AI must be redesigned around biological multi-scale networks of bio-factors rather than developed in abstraction from them.

This argument is developed through a coordinated programme of representation learning across three complementary directions. The first concerns brain phenotyping, advancing the principle that computational phenotypes should be anatomically meaningful and individualised rather than imposed by arbitrary atlases. The second concerns the representation of multi-omics bio-factors, showing that large language models adapted to curated biomedical evidence can have their internal embedding spaces reorganised according to tissue, organ, and mechanistic structure, thereby transforming general-purpose models into instruments by which the relational architecture of biomedical knowledge is encoded. The third concerns multi-omics integration, proposing diverse methodologies for integrating genomics and brain phenotypes. These methods shift from fusion-as-combination to fusion-as-modulation, which more faithfully reflects the regulatory relationship between genotype and phenotype.

Taken together, these contributions converge on a methodological thesis: progress in biomedical AI depends on the encoding of biological structure, i.e., anatomical, evidentiary, and regulatory structure, within the architecture of the model. The literature review situates this thesis within the broader landscape of biomedical foundation models, examining the structural limitations that constrain current systems, including the gap between predictive accuracy and biological meaning, the landscape and characteristics of biomedical data, the persistent challenge of cross-scale integration, and the insufficiency of current tokenisation schemes for context-dependent bio-factors. It further articulates the trajectories along which the field is likely to advance. The author's efforts are presented as one early-stage exploration in a research area that will require sustained and collective work over the coming decade. The central throughline is a reframing of biomedical artificial intelligence as a scientific instrument for cross-scale, long-context biological understanding, ultimately contributing to the understanding of complex human disease.

Keywords

Deep learning, Multi-omics, Genomics, Genetics, Biomedicine, Artificial Intelligence, Multi-modality, Brain imaging, Neurodegenerative Diseases, Aging

Disciplines

Computational Biology | Data Science | Genetics | Genomics | Numerical Analysis and Scientific Computing | Theory and Algorithms
