Graduation Semester and Year
2016
Language
English
Document Type
Thesis
Degree Name
Master of Science in Computer Science
Department
Computer Science and Engineering
First Advisor
Leonidas Fegaras
Abstract
Non-negative matrix factorization is a well-known, complex machine learning algorithm that is also used in collaborative filtering. Collaborative filtering techniques are used in recommendation systems, where they aim to predict the missing values in a user-item association matrix. In such a matrix, each row corresponds to a user, each column corresponds to a movie, and each value is the rating a user gave to the respective movie. These matrices are large, contain missing values, and need parallel processing. Map-Reduce Query Language (MRQL) is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop, Spark, Hama, and Flink. Large-scale matrix operations require proper scaling and optimization in distributed systems. Therefore, this work analyzes the performance of MRQL on complex matrix operations, and the ease of scaling these operations, using different sparse matrix datasets in Spark mode. We performed simple matrix operations such as multiplication, division, addition, and subtraction, as well as complex operations such as factorization. Two algorithms, Gaussian non-negative matrix factorization and stochastic gradient descent based matrix factorization, are tested in the Spark and Flink modes of MRQL on a dataset of movie ratings. The performance analysis in these experiments is intended to help readers understand MRQL and evaluate its performance.
Keywords
Matrix factorization, Map reduce query language, MRQL
Disciplines
Computer Sciences | Physical Sciences and Mathematics
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Recommended Citation
Ulde, Ahmed Abdul Hameed A, "PERFORMANCE EVALUATION OF MATRIX OPERATIONS ON MAP-REDUCE QUERY LANGUAGE" (2016). Computer Science and Engineering Theses. 425.
https://mavmatrix.uta.edu/cse_theses/425
Comments
Degree granted by The University of Texas at Arlington