Graduation Semester and Year
2016
Language
English
Document Type
Thesis
Degree Name
Master of Science in Computer Science
Department
Computer Science and Engineering
First Advisor
Chengkai Li
Abstract
Querying graph data can be difficult because it requires the user to know both the underlying schema and the query language. Visual query builders remove this requirement: users formulate the intended query by drawing the nodes and edges of a query graph, which is then translated into a database query. To the best of our knowledge, none of the currently available visual query builders suggests to users which nodes or edges to include in their query graph. We provide such suggestions using machine learning algorithms to help users formulate their intended query. No readily available dataset can be used directly to train our algorithms, so we simulate training data using Freebase, DBpedia, and Wikipedia. We also compare the performance of four machine learning algorithms, namely Naïve Bayes (NB), Random Forest (RF), Classification based on Association Rules (CAR), and a recommendation system based on SVD (SVD), in suggesting the edges that can be added to the query graph. On average, CAR requires 67 suggestions to complete a query graph on Freebase, while the other algorithms require 83-160 suggestions. On DBpedia, Naïve Bayes requires an average of 134 suggestions to complete a query graph, while the other algorithms require 150-171 suggestions.
Keywords
Machine learning, Data mining, Querying data graphs, Visual query builders
Disciplines
Computer Sciences | Physical Sciences and Mathematics
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Recommended Citation
Bhoopalam, Rohit Ravi Kumar, "COMPARISON OF MACHINE LEARNING ALGORITHMS IN SUGGESTING CANDIDATE EDGES TO CONSTRUCT A QUERY ON HETEROGENEOUS GRAPHS" (2016). Computer Science and Engineering Theses. 389.
https://mavmatrix.uta.edu/cse_theses/389
Comments
Degree granted by The University of Texas at Arlington