Graduation Semester and Year

2008

Language

English

Document Type

Thesis

Degree Name

Master of Science in Computer Science

Department

Computer Science and Engineering

First Advisor

Nikola Stojanovic

Abstract

Genome-wide association studies of the genetic underpinnings of complex phenotypes, and human diseases in particular, have been steadily gaining momentum over the past several years. Yet the number of polymorphic sites in the human genome, including but not limited to Single Nucleotide Polymorphisms (SNPs), is so large that identifying the few whose combination has a significant effect on the condition of interest remains an overwhelming task. The goal of this thesis work was to identify biologically and medically relevant SNPs which could be the best possible candidates for further association studies. In this thesis we present a new networked solution to the genome-wide computational identification and ranking of SNPs likely to be relevant for the phenotype of interest, together with GeneNAB, a program implementing it. The architecture of our system is similar to that of the Distributed Annotation System (DAS), proposed by a team of prominent bioinformaticians several years ago. However, not all of the resources we use could follow the DAS protocol, so we employed a variety of methods for accessing the different Internet resources needed for our study. We start with a gene or a cellular pathway previously associated with the condition of interest and find SNPs in all other genes participating in the same pathway. We then rank these SNPs according to their likelihood of being biomedically relevant for the condition and report the ranked list, flagging the top entries as candidates for further experimental work. We have applied our system to the Toll-like receptor pathway, which provides a mechanism for the development of inflammatory reactions in a variety of conditions, from infection to cell damage. Although many of our top-scoring SNPs still need experimental validation, we have successfully located several which have been previously confirmed as medically relevant. We expect that the output of our software will be useful to guide further laboratory and clinical studies of groups of SNPs affecting any condition of interest.
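
The pathway-to-ranked-list workflow described in the abstract can be illustrated with a minimal sketch. This is not the GeneNAB implementation: the abstract does not state the ranking criteria, so the functional classes, the weights in `CLASS_WEIGHT`, the SNP identifiers, and the `top_n` cutoff below are all assumptions made for illustration, and the in-memory dictionaries stand in for the remote resources the system would query.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for remote pathway and SNP resources.
# The SNP identifiers and their functional classes are made up.
PATHWAY_GENES = {"toll_like_receptor": ["TLR4", "MYD88", "IRAK1"]}

GENE_SNPS = {
    "TLR4": [("snp_001", "nonsynonymous"), ("snp_002", "intronic")],
    "MYD88": [("snp_003", "regulatory")],
    "IRAK1": [("snp_004", "synonymous")],
}

# Assumed relevance weights per functional class (not from the thesis).
CLASS_WEIGHT = {
    "nonsynonymous": 3.0,
    "regulatory": 2.0,
    "synonymous": 1.0,
    "intronic": 0.5,
}


@dataclass(frozen=True)
class RankedSnp:
    snp_id: str
    gene: str
    score: float
    candidate: bool  # True for the top entries flagged for follow-up


def rank_pathway_snps(pathway: str, top_n: int = 2) -> list[RankedSnp]:
    """Collect SNPs from every gene in the pathway, score and rank them."""
    scored = []
    for gene in PATHWAY_GENES[pathway]:
        for snp_id, snp_class in GENE_SNPS.get(gene, []):
            scored.append((CLASS_WEIGHT.get(snp_class, 0.0), snp_id, gene))
    # Highest score first; ties broken by SNP identifier for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [
        RankedSnp(snp_id, gene, score, candidate=(rank < top_n))
        for rank, (score, snp_id, gene) in enumerate(scored)
    ]


if __name__ == "__main__":
    for snp in rank_pathway_snps("toll_like_receptor"):
        flag = "*" if snp.candidate else " "
        print(f"{flag} {snp.snp_id}\t{snp.gene}\t{snp.score}")
```

In a networked setting the two dictionary lookups would be replaced by queries to pathway and SNP annotation services, while the scoring and flagging steps would stay local.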

Disciplines

Computer Sciences | Physical Sciences and Mathematics

Comments

Degree granted by The University of Texas at Arlington
