Graduation Semester and Year




Document Type


Degree Name

Master of Science in Computer Science


Computer Science and Engineering

First Advisor

Nikola Stojanovic


Rapid advances in engineering and technology have taken research in areas such as molecular biology and genetics to a new level, and have thus given birth to multidisciplinary fields like bioinformatics and computational biology. These fields apply expertise from one or more areas, including computer science, information science, statistics, chemistry and mathematics, to solve problems in biology in general and in molecular biology and genetics in particular. It is a challenging research area, which requires one to grasp diverse fields with almost equal rigor, and at the same time it provides new insights into the fascinating field of biology. Recombination is a natural phenomenon which occurs in germ cells during meiotic division in higher organisms. Much about meiotic recombination is already known; however, the genetic mechanism of its regulation remains elusive. Our research team was interested in studying the genetic control of this phenomenon by analyzing the variability of recombination rates among several strains of inbred mice. To meet that goal we have assembled a large amount of data and performed computational analysis of linkage maps of different mouse strains. Genome data are available from various public databases, which are complex and geographically dispersed. These databases tend to evolve rapidly, due to the fast generation and inclusion of new or more accurate genomic data. In addition, the interfaces to these databases are usually designed to provide information in a way most suitable for viewing by human investigators, and are hence limited in functionality. These issues add up to make data collection and preprocessing a rather complex problem, albeit one which rarely features prominently in any scientific report. In this thesis we discuss two different approaches to estimating the variability of recombination rates in mice.
We present efficient computational tools we have designed to gather data from public databases, and discuss the preprocessing dictated by the nature of our analysis. We also describe the software tools we have used for the analysis of recombination rates. Thereafter, we discuss the algorithms, scripts and programs we have designed to arrive at the many intermediate results required for the statistical analysis. Now that we have delivered the results described in this thesis, it is up to our biology collaborators to use them to study the mechanisms driving recombination in mice, the original motivation for this work. In that sense, this project is still very much a work in progress, and we expect that we, from the computer science side, have paved the way for a major discovery in life sciences.


Computer Sciences | Physical Sciences and Mathematics


Degree granted by The University of Texas at Arlington