Graduation Semester and Year

2007

Language

English

Document Type

Thesis

Degree Name

Master of Science in Computer Science

Department

Computer Science and Engineering

First Advisor

Gautam Das

Abstract

Data today is rarely stored in a single centralized location, both because of the enormous volume of information involved and because distribution improves the reliability, availability, and performance of the system. The same data may be stored in different formats in different companies' databases, and it may also be partitioned or replicated. We consider several distributed-database scenarios, including horizontal fragmentation, vertical fragmentation, and attribute overlap. Giving end users access to integrated information from these multiple datasets can provide more accurate and complete results. We study how to query these distributed databases efficiently to retrieve the top-k elements under a ranking function supplied by the user. We also discuss applying a top-k algorithm hierarchically and the limitations of that approach for our problem. We propose four algorithms based on the NRA (No Random Access) algorithm to solve this problem efficiently, and we compare and contrast them. Once the combination of data sources has been identified, we use our algorithms to retrieve the top elements from that combination and process them to obtain the top-k elements according to the user's ranking function.
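
For readers unfamiliar with the NRA algorithm mentioned in the abstract, the following is a minimal sketch of the generic technique, not of the four algorithms proposed in the thesis. It assumes each data source exposes a list of (object, score) pairs sorted by descending score, that scores are non-negative, that an object missing from a source scores 0 there, and that the ranking function is a plain sum. The function name `nra_top_k` and the sample data are hypothetical.

```python
def nra_top_k(lists, k):
    """Generic NRA sketch: sorted access only, sum as the ranking function.

    lists: non-empty list of sources, each a list of (object_id, score)
           pairs sorted by score descending, with scores >= 0 (assumption).
    Returns up to k (object_id, lower_bound) pairs, best first.
    """
    m = len(lists)
    seen = {}  # object_id -> {source_index: score seen so far}
    # Last score read from each source; bounds what unread entries can add.
    last = [lst[0][1] if lst else 0 for lst in lists]
    max_depth = max(len(lst) for lst in lists)

    def worst(obj):
        # Lower bound: unknown scores are assumed to be 0.
        return sum(seen[obj].values())

    def best(obj):
        # Upper bound: unknown scores are assumed to equal the last score read.
        return sum(seen[obj].get(i, last[i]) for i in range(m))

    for depth in range(max_depth):
        # One round of sorted access across all sources.
        for i, lst in enumerate(lists):
            if depth < len(lst):
                obj, score = lst[depth]
                seen.setdefault(obj, {})[i] = score
                last[i] = score
            else:
                last[i] = 0  # exhausted source contributes nothing more

        top = sorted(seen, key=worst, reverse=True)[:k]
        if len(top) < k:
            continue
        threshold = worst(top[-1])
        unseen_best = sum(last)  # best possible total of a never-seen object
        others_best = max(
            (best(o) for o in seen if o not in top), default=float("-inf")
        )
        # Stop once no object outside the current top k can overtake it.
        if threshold >= max(unseen_best, others_best):
            return [(o, worst(o)) for o in top]

    # All sources exhausted: lower bounds are now exact totals.
    top = sorted(seen, key=worst, reverse=True)[:k]
    return [(o, worst(o)) for o in top]


# Hypothetical sample data: two sources ranking the same four objects.
source_a = [("a", 9), ("b", 8), ("c", 3), ("d", 1)]
source_b = [("b", 9), ("a", 7), ("d", 4), ("c", 2)]
top_two = nra_top_k([source_a, source_b], 2)
```

In this sample the sketch stops after reading two entries from each source, because by then neither an unread object nor any other seen object can exceed the lower bounds of "b" and "a". The returned scores are lower bounds, which may be smaller than the true totals when the algorithm stops early.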

Disciplines

Computer Sciences | Physical Sciences and Mathematics

Comments

Degree granted by The University of Texas at Arlington
