Graduation Semester and Year

2018

Language

English

Document Type

Thesis

Degree Name

Master of Science in Computer Science

Department

Computer Science and Engineering

First Advisor

Gautam Das

Abstract

Deep web databases are pillars of today’s internet services, hidden behind HTML forms and Top-K search interfaces. While Top-K search interfaces provide a good way to retrieve information, they fall short in addressing the diverse preferences of users. Because of the query rate limit constraint (i.e., the maximum number of k-nearest-neighbor queries a user or IP address can issue over a specific period of time), it is often impossible to access all the tuples in the backend database. With this constraint in mind, our motivation is twofold: (i) enable users to obtain individual records from these databases and rank them according to their own preferences, and (ii) enable users to access aggregate information over these databases. We introduce QR2 and DBLoc. Both systems access hidden databases via their public search interfaces and operate without any knowledge of the underlying system ranking function. QR2 supports ranked retrieval of individual tuples, while DBLoc supports aggregating information over Location Based Services (LBS). QR2 enables on-the-fly processing of queries with any user-specified ranking function (with or without selection conditions), regardless of whether the database supports that ranking function. Using DBLoc, users can perform density-based clustering over the backend database of a Location Based Service; DBLoc thus aims to mine from the LBS a cluster assignment function f(·). We have developed both systems to be efficient, scalable, reliable, and secure. Both systems also support multi-user access, and we illustrate how to deploy them efficiently in industry.

Keywords

Information retrieval, Query processing, Data exploration

Disciplines

Computer Sciences | Physical Sciences and Mathematics

Comments

Degree granted by The University of Texas at Arlington

