ORCID Identifier(s)

0000-0003-3252-950X

Graduation Semester and Year

2019

Language

English

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Computer Science

Department

Computer Science and Engineering

First Advisor

Gergely Zaruba

Abstract

Finding candidate genes that could cause specific diseases has been the subject of many studies. This is an important research task, but in the biological experimentation domain it can be very expensive and time-consuming. An alternative is to obtain gene expression values from partial measurements and predict the rest. Computational methods can estimate these relationships statistically, faster and more efficiently, giving domain experts suggestions on which likely relationships to explore. One common computational approach models the gene expression data as a matrix in which each row represents a gene and each column a subject; the entries of the matrix can then be mRNA measurements that indicate the level of gene expression. Because the entries are based on partial measurements, the dataset has missing values, and the problem is to estimate those missing values and thereby recover the full matrix from the known values. The main aim of this research is to investigate matrix completion methods for predicting gene expression values.
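
To make the matrix completion setting in the abstract concrete, the following is a minimal sketch and not the dissertation's method. It assumes the expression matrix is approximately low rank and fills the missing entries by repeatedly applying a truncated SVD. The function name `complete_matrix`, the parameters `rank` and `n_iters`, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def complete_matrix(observed, mask, rank=2, n_iters=200):
    """Fill missing entries by iterating a truncated SVD (illustrative only).

    observed: 2-D array (genes x subjects); entries where mask is False are ignored.
    mask:     boolean array of the same shape, True where an entry was measured.
    """
    # Start by filling the missing entries with the mean of the measured ones.
    filled = np.where(mask, observed, observed[mask].mean())
    for _ in range(n_iters):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        # Best rank-`rank` approximation of the current filled matrix.
        low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
        # Keep measured values; overwrite only the missing entries.
        filled = np.where(mask, observed, low_rank)
    return filled

# Synthetic example (assumed data): a rank-2 "genes x subjects" matrix
# with roughly 30% of its entries hidden.
rng = np.random.default_rng(0)
truth = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 20))
mask = rng.random(truth.shape) < 0.7
observed = np.where(mask, truth, np.nan)

estimate = complete_matrix(observed, mask, rank=2)
rmse_missing = np.sqrt(np.mean((estimate - truth)[~mask] ** 2))
print("RMSE on hidden entries:", rmse_missing)
```

In this sketch the measured entries are never altered, so any error is confined to the hidden entries. With real expression data the appropriate rank is unknown and would have to be chosen, for example by holding out some measured entries.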

Keywords

Gene expression, Matrix completion, Matrix factorization, Micro RNA, Long non-coding RNA

Disciplines

Computer Sciences | Physical Sciences and Mathematics

Comments

Degree granted by The University of Texas at Arlington
