Graduation Semester and Year




Document Type


Degree Name

Doctor of Philosophy in Computer Science


Computer Science and Engineering

First Advisor

Sajal Das


The annotation of the human genome in the early years of the third millennium has enabled scientists to interlink the processes of life at the molecular level. The coincidence of this breakthrough with advances in computational technology and high-throughput experimental techniques has promoted the emergence of numerous -omics data resources. For years before this discovery, scientists believed that cellular processes are the product of interactions between genes and gene products; however, any effort to build a comprehensive picture of cellular processes was hindered by a knowledge gap that prevented correlating these processes at the lowest level. Having the organism's blueprint in hand has encouraged many researchers to study a biological process as part of a whole rather than in isolation. From an engineering point of view, biologists' interests now revolve around comprehending the system-level behavior of a biological process in a complex biological network. Studying biological systems demands modeling and simulation tools that can capture the dynamics of these systems in time and space. Many variants of these tools have been proposed elsewhere, all of which try to approximate the Chemical Master Equation (CME). These modeling and simulation tools are broadly classified as deterministic or stochastic based on their temporal evolution. In the former class, tools that project a biological system onto a set of Ordinary Differential Equations (ODEs) are the most prevalent. However, it has been shown elsewhere that these models cannot capture the nonlinearity and the deviant effects that exist in biological processes due to the inherently random environment of the cell. In the latter class, the majority of tools are variants of the Gillespie algorithm, in which the system is mapped onto sets of chemical kinetic equations that evolve in Monte Carlo steps.
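As an illustration of the Monte Carlo stepping mentioned above, the following is a minimal sketch of the Gillespie direct method, not the implementation used in the dissertation. The function name `gillespie_direct`, the species names, and the rate constants are assumptions chosen for the example, and the propensity formula assumes every reaction has distinct reactant species.

```python
import random


def gillespie_direct(counts, reactions, t_end, rng):
    """Gillespie direct method (illustrative sketch).

    counts    -- dict mapping species name to molecule count
    reactions -- list of (rate_constant, reactants, products) tuples
    Assumes each reaction's reactants are distinct species, so the
    propensity is the rate constant times the product of their counts.
    """
    x = dict(counts)  # work on a copy so the caller's dict is untouched
    t = 0.0
    trajectory = [(t, dict(x))]
    while True:
        # Propensity of each reaction in the current state.
        props = []
        for k, reactants, _ in reactions:
            a = k
            for s in reactants:
                a *= x[s]
            props.append(a)
        total = sum(props)
        if total == 0:
            break  # no reaction can fire any more
        # Waiting time to the next reaction is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Pick which reaction fires, with probability proportional to propensity.
        r = rng.random() * total
        acc = 0.0
        chosen = len(reactions) - 1
        for i, a in enumerate(props):
            acc += a
            if r < acc:
                chosen = i
                break
        _, reactants, products = reactions[chosen]
        for s in reactants:
            x[s] -= 1
        for s in products:
            x[s] += 1
        trajectory.append((t, dict(x)))
    return trajectory


# Hypothetical reversible binding L + R <-> LR with made-up rate constants.
example_reactions = [
    (0.01, ("L", "R"), ("LR",)),
    (0.10, ("LR",), ("L", "R")),
]
trajectory = gillespie_direct({"L": 50, "R": 30, "LR": 0},
                              example_reactions, 10.0, random.Random(0))
```

Each reaction here carries a single kinetic rate constant, which is the abstraction the abstract goes on to criticize.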
The main problem that limits the use of these simulation tools is their temporal complexity. A common drawback of both Gillespie- and ODE-based approaches is their oversimplification in abstracting the physiology of a process, which is represented by an equation along with a single kinetic rate constant. In this dissertation we first elucidate how Stochastic Discrete Event Simulation (SDES) can be applied to capture the behavior of biological processes as sets of biological events (bioevents) with random holding times. We then introduce the architecture of 'eukaSimBioSys', which is designed for system-wide simulation of a eukaryotic cell. The model repository, one of the essential components of the proposed architecture, comprises reusable modules of parametric models. Each of these parametric models, once coupled with a proper parameter set, is applied to capture the holding time of a specific bioevent. These are physicochemical models that attempt to abstract bimolecular interactions (i.e., modifications, associations, translocations, localizations, etc.) into a parametric probability distribution function of time. Typical interactions include reaction, receptor-ligand binding, protein-protein binding, chromatin remodeling, transcription, translation, and splicing. Previous researchers have already started building this model library, and in this work we add four new models: (i) ligand-receptor binding, (ii) DNA fluctuations, (iii) chromatin remodeling, and (iv) splicing. For the first, we have developed both eukaryotic and prokaryotic variants of the model, whereas the rest are specific to eukaryotes. These models have been validated against published experimental data where empirical results were available. Cell activity is the product of an intricate interaction among three main cellular networks: the Signal Transduction Network (STN), the Transcription Regulatory Network (TRN), and the Metabolic Network (MTN).
Each cellular function is composed of one or more edges within or across these networks. Hence, a system-wide study of a cell requires a clear and explicit definition of these networks. We have incorporated the semantics of these networks into 'eukaSimBioSys' by designing an object-oriented database to hold the layout of the three networks along with their interrelationships. We have populated these databases for 'human BCell' and 'human cardiac myocyte' from data available in the literature and other databases. Despite advances in health science and the discovery of new drugs, heart disease is still the most life-threatening disease in both industrial and developing countries. Cardiac myocytes are the main players in the perpetual contraction of the heart and are among the most energy-consuming tissues in the body. Any change in their normal metabolism can lead to severe consequences for an individual. Glucose and fatty acids are the major sources of energy for myocardial cells; the interplay between these two sources is predominantly controlled by insulin. As the ultimate goal of this dissertation, we have incorporated all the models developed here and elsewhere into 'eukaSimBioSys' and used it to conduct unique in-silico experiments studying the effects of insulin on the metabolism of heart muscle. We exploit the features and capacities of our software by conducting six in-silico experiments, in which we demonstrated its potential for reproducing experimental data and performing hypothesis testing by applying the experimental conditions in silico. The biological facts that we validated in silico include the plasticity of cardiac myocytes, the contributions of exogenous glucose and fatty acids to myocardial energetics, transcription regulation of insulin, and the effect of genetic null mutations on metabolic pathways.
One of the unique features of 'eukaSimBioSys', demonstrated in an in-silico experiment, is its ability to perform system-wide simulation of myocardial cellular networks for a prolonged time (48 hours). To construct the STN, TRN, and MTN for the experiment, we incorporated information from three major databases (i.e., KEGG, BiGG, and HumanCyc) along with data from exhaustive literature searches. 'eukaSimBioSys' has a variety of promising applications in biology and health science. It could be applied to suggest the more promising experimental conditions to experimentalists or to help investigate new pathways and regulatory mechanisms. Another very important application of this software is reconstructing disease scenarios such as hyperglycemia, diabetes, hypertension, and ischemia. In-silico investigation of the effects and side effects of a new drug is another potential application of this emerging software. Note that using 'eukaSimBioSys' for the above purposes may require certain case-based enhancements to the current version of the software.
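The discrete-event view described in the abstract, in which each bioevent completes after a random holding time and then triggers downstream bioevents, can be sketched as follows. This is a hedged illustration only: `run_sdes`, the `BIOEVENTS` table, the choice of exponential and gamma distributions, and every parameter value are assumptions made for the example and are not taken from 'eukaSimBioSys' or its model repository.

```python
import heapq
import random

# Hypothetical bioevents: name -> (holding-time sampler, downstream bioevents).
# The distributions and parameters below are placeholders, not fitted models.
BIOEVENTS = {
    "ligand_receptor_binding": (lambda rng: rng.expovariate(1.0 / 2.0),
                                ["transcription"]),
    "transcription": (lambda rng: rng.gammavariate(4.0, 30.0), ["splicing"]),
    "splicing": (lambda rng: rng.gammavariate(2.0, 15.0), ["translation"]),
    "translation": (lambda rng: rng.gammavariate(3.0, 20.0), []),
}


def run_sdes(bioevents, start, t_end, rng):
    """Run a minimal stochastic discrete-event simulation.

    A bioevent is scheduled to complete after a holding time drawn from its
    own distribution; on completion it schedules its downstream bioevents.
    Returns a list of (completion_time, bioevent_name) in time order.
    """
    queue = []   # min-heap ordered by completion time
    counter = 0  # tie-breaker so the heap never has to compare names
    sampler, _ = bioevents[start]
    heapq.heappush(queue, (sampler(rng), counter, start))
    log = []
    while queue:
        t, _, name = heapq.heappop(queue)
        if t > t_end:
            break  # the earliest pending completion is past the horizon
        log.append((t, name))
        for nxt in bioevents[name][1]:
            counter += 1
            next_sampler, _ = bioevents[nxt]
            heapq.heappush(queue, (t + next_sampler(rng), counter, nxt))
    return log


event_log = run_sdes(BIOEVENTS, "ligand_receptor_binding", 1e9, random.Random(1))
```

Swapping a sampler for a different parametric distribution changes one bioevent's timing without touching the scheduler, which mirrors the idea of reusable parametric models coupled with a parameter set.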


Computer Sciences | Physical Sciences and Mathematics


Degree granted by The University of Texas at Arlington