Graduation Semester and Year
2016
Language
English
Document Type
Thesis
Degree Name
Master of Science in Computer Engineering
Department
Computer Science and Engineering
First Advisor
Leonidas Fegaras
Abstract
The use of real-time data processing has grown in recent years with the increase in data captured by social media platforms, IoT, and other big data applications. Real-time processing has become important in applications ranging from finding trends on the internet to detecting fraud in banking transactions. Finding relevant information in a large amount of data has always been a difficult problem. MRQL is a query language that runs on top of different big data platforms, such as Apache Hadoop, Flink, Hama, and Spark, and enables professionals with database query knowledge to write queries that run as programs on these computational systems. In this work, we attempt to integrate the MRQL query language with a newer real-time big data computational system, Apache Storm. This system was developed by Twitter to analyze trending topics on social media and is widely used in industry today. A query written in MRQL is converted into a physical plan that involves the execution of different functions, such as MapReduce and Aggregation, each of which has to be executed by the platform in its own execution plan. This work implements MapReduce for Storm, which covers the execution of important physical query plans such as Select and Group By. MapReduce is also an important part of every big data processing platform. This project is a starting point for the implementation of MRQL on Apache Storm, and the implementation can be extended to support various query plans involving MapReduce.
Keywords
MRQL, Real-time systems, Query language, Apache Storm, Apache MRQL, MapReduce
Disciplines
Computer Sciences | Physical Sciences and Mathematics
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Recommended Citation
Paudel, Achyut, "INTEGRATION OF APACHE MRQL QUERY LANGUAGE WITH APACHE STORM REALTIME COMPUTATIONAL SYSTEM" (2016). Computer Science and Engineering Theses. 423.
https://mavmatrix.uta.edu/cse_theses/423
Comments
Degree granted by The University of Texas at Arlington