Graduation Semester and Year
2005
Language
English
Document Type
Thesis
Degree Name
Master of Science in Computer Science
Department
Computer Science and Engineering
First Advisor
David Levine
Abstract
In the last few years, many computational and laboratory improvements in the production and analysis of DNA sequences have made it possible to sequence whole genomes completely. This provides a wealth of raw genomes that need to be processed and annotated. Repetitive DNA, consisting of transposable elements and tandem repeats, makes up 5% to 80% of eukaryotic genomes and needs to be identified, classified, and annotated in order to sequence and annotate the entire genome accurately. Existing tools can identify and annotate transposable elements (TEs), but no tool exists for their classification. This thesis introduces REPCLASS, an automated tool for the classification of transposable elements that are identified de novo in new genomes. REPCLASS is a workflow that combines several methods to provide a tentative classification of TE consensus sequences. REPCLASS is also a distributed application that uses high-performance cluster computing to perform the computationally intensive task of classification.
Disciplines
Computer Sciences | Physical Sciences and Mathematics
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Recommended Citation
Ranganathan, Nirmal, "REPCLASS: Cluster And Grid Enabled Automatic Classification Of Transposable Elements Identified De Novo In Genome Sequences" (2005). Computer Science and Engineering Theses. 50.
https://mavmatrix.uta.edu/cse_theses/50
Comments
Degree granted by The University of Texas at Arlington