Graduation Semester and Year
2006
Language
English
Document Type
Thesis
Degree Name
Master of Science in Computer Science
Department
Computer Science and Engineering
First Advisor
Gautam Das
Abstract
Top-k queries on large multi-attribute data sets are fundamental operations in information retrieval and ranking applications. In this thesis, we initiate research on the anytime behavior of top-k algorithms on exact and fuzzy data. In particular given specific topk algorithms we are interested in studying their progress towards identification of the correct result at any point of the algorithms' execution. We adopt a probabilistic approach where we seek to report at any point the scores of the top-k results the algorithm has identified, as well as associate a confidence with this prediction. Such functionality can be a valuable asset when one is interested to reduce the runtime cost of top-k computations. We show analytically that such probability and confidence are monotone in expectation. We present a thorough experimental evaluation to validate our techniques using both synthetic and real data sets.
Disciplines
Computer Sciences | Physical Sciences and Mathematics
License
This work is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 4.0 International License.
Recommended Citation
Chaudhari, Bhushan P., "Anytime Top-k Queries On Exact And Fuzzy Data" (2006). Computer Science and Engineering Theses. 116.
https://mavmatrix.uta.edu/cse_theses/116
Comments
Degree granted by The University of Texas at Arlington