Author

Ankur Goyal

Graduation Semester and Year

2015

Language

English

Document Type

Thesis

Degree Name

Master of Science in Computer Science

Department

Computer Science and Engineering

First Advisor

Sharma Upendranath Chakravarthy

Abstract

Graphs have become one of the preferred ways to store structured data for various applications such as social network graphs, complex molecular structure, etc. Proliferation of graph databases has resulted in a growing need for effective querying methods to retrieve desired information. Querying has been widely studied in relational databases where the query optimizer finds a sequence of query execution steps (or plans) for efficient execution of the given query. Until now, most of the work on graph databases has concentrated on mining. For querying graph databases, users have to either learn a graph query language for posing their queries or use provided customized searches of specific substructures. Hence, there is a clear need for posing queries using graphs, consider alternative plans, and select a plan that can be processed efficiently on the graph database. In this thesis, we propose an approach to generate plans from a query using a cost-based approach that is tailored to the characteristics of the graph database. We collect metadata pertaining to the graph database and use cost estimates to evaluate the cost of execution of each plan. We use a branch and bound algorithm to limit the state space generated for identifying a good plan. Extensive experiments on different types of queries over two graph databases (IMDB and DBLP) are performed to validate our approach. Subdue – a graph mining algorithm has been modified to process a query plan instead of performing mining.

Keywords

Query optimizer, Graph databases, Querying system

Disciplines

Computer Sciences | Physical Sciences and Mathematics

Comments

Degree granted by The University of Texas at Arlington

Share

COinS