Author

Rohit Bhawal

ORCID Identifier(s)

0000-0002-0682-7549

Graduation Semester and Year

2017

Language

English

Document Type

Thesis

Degree Name

Master of Science in Computer Science

Department

Computer Science and Engineering

First Advisor

Leonidas Fegaras

Abstract

In today’s world, where data is collected without limit from IoT devices, social media platforms, and other big data applications, there is a need for systems that process it efficiently and effortlessly. Analyzing the data to identify trends, detect patterns, and find other valuable information is critical for any business application. Presenting the analyzed data in a visual format such as a graph makes it easier to grasp difficult concepts and identify new patterns. MRQL is an SQL-like query language for large-scale data analysis, built on top of Apache Hadoop, Spark, Flink, and Hama, that allows users to write queries for big data analysis. In this thesis, the MRQL language has been enhanced with a JSON generator that allows the language to output query results in JSON format. The structure of the generated JSON output is query dependent: it mirrors the structure of the query specification. This functionality makes it possible to integrate MRQL with any external system. In this context, a web application has been developed that uses the JSON output to generate a graphical visualization of query results. This visualizer is an example of integrating MRQL with an external system via the JSON data generator, and it provides vital visual information about the analyzed data. The web application allows a user to submit an MRQL query on their big data stored on a distributed file system and then visualize the query result as a graph. The application currently supports MapReduce and Spark as platforms for running MRQL queries, using in-memory, local, or distributed mode, depending on the environment in which it has been deployed. It thereby enables a user to perform data analysis in MRQL and then visualize the result.

Keywords

Big data, MapReduce, MRQL, D3JS, JSON, Hadoop, Visualization, Integration

Disciplines

Computer Sciences | Physical Sciences and Mathematics

Comments

Degree granted by The University of Texas at Arlington
