Author

Dongchul Kim

Graduation Semester and Year

2014

Language

English

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Computer Science

Department

Computer Science and Engineering

First Advisor

Jean Gao

Abstract

Inferring biological networks from high-throughput bioinformatics data is one of the most interesting areas of systems biology research, as it helps elucidate cellular and physiological mechanisms. In this thesis, network inference methods are proposed to solve biological problems. We first investigated how exposure to low-dose ionizing radiation (IR) affects the human body by observing the signaling pathway associated with Ataxia Telangiectasia mutated, using Reverse Phase Protein Array and isogenic human Ataxia Telangiectasia cells under different amounts and durations of IR exposure. Pathways triggered by DNA damage are derived by learning Bayesian networks integrated with prior knowledge, such as protein-protein interactions and signaling pathways from well-known databases. The experimental results show which proteins are involved in signaling pathways under IR, how the inferred pathways differ between low and high doses of IR, and how the selected proteins regulate each other in the inferred pathways.
In network inference research, there are two issues to solve. First, depending on the structural or computational model of the inference method, performance tends to be inconsistent because each method has innately different advantages and limitations. Second, sparse linear regression penalized by a regularization parameter and bootstrapping-based sparse linear regression methods have been suggested as the state of the art in recent related work on network inference. However, they are not effective for data with a small sample size, and a true regulator can be missed if the target gene is strongly affected by an indirect regulator with high correlation or by another true regulator. To overcome the limitations of bootstrapping, a lasso-based random feature selection algorithm is proposed to achieve better performance.
In order to elucidate the overall relationships between gene expression and genetic perturbations, we propose a network inference method that infers a gene regulatory network in which a Single Nucleotide Polymorphism (SNP) acts as a regulator of genes. In most such methods, referred to as SNP-Gene Regulatory Network (SGRN) inference, SNP-gene pairs are obtained by separately performing expression Quantitative Trait Loci (eQTL) mapping. An SGRN inference method that requires no pre-defined eQTL information is proposed, under the assumption that a gene is regulated by at most one SNP.
We also studied how a medicine can be customized to individual patients by considering their biological features, i.e., personalized medicine. Our goal is to predict the drug sensitivity levels of cancer patients in order to provide each patient with an optimal drug and avoid wasting time on ineffective treatments. To assign patients to their optimal drug, we employed a Bayesian Network Classifier (BNC), which consists of two components: parameters and network structure. Because the network of a BNC represents dependencies among proteins, the networks of the BNCs learned for multiple drugs also provide important information about relationships between proteins, and integrating these networks makes it possible to identify biomarkers of a target cancer.

Disciplines

Computer Sciences | Physical Sciences and Mathematics

Comments

Degree granted by The University of Texas at Arlington
