Graduation Semester and Year
2005
Language
English
Document Type
Thesis
Degree Name
Master of Science in Computer Engineering
Department
Computer Science and Engineering
First Advisor
Ramez Elmasri
Abstract
The growing use and popularity of semi-structured data has received considerable attention, and much current research addresses its efficient storage and retrieval. A popular model and language for semi-structured data is XML. In this thesis we focus on structure-based indexing of XML. As part of an ongoing XML indexing project, we study and implement the A(k)-index, a structure-based indexing technique, and propose the use of (offset, length) pairs to retrieve nodes of interest. We record the offset and length of every node using the SAX parser, and then use a random-access file to retrieve nodes from an XML file through the A(k)-index. The index accurately supports all path expressions of length up to k, and results are retrieved directly from the XML file. We also compare the performance of the indexing technique for different values of k.
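The retrieval step described above can be illustrated with a minimal sketch. It assumes that the index stores, for each node, a byte offset and a byte length into the XML file, and that Java's `java.io.RandomAccessFile` is used for the lookup. The class name `OffsetLengthLookup`, the method `readNode`, and the sample document are hypothetical and not taken from the thesis. For brevity the pair is computed with `indexOf` on an ASCII-only string instead of being recorded during SAX parsing.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class OffsetLengthLookup {

    /** Reads `length` bytes starting at byte `offset` and decodes them as UTF-8. */
    public static String readNode(Path xmlFile, long offset, int length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(xmlFile.toFile(), "r")) {
            byte[] buf = new byte[length];
            raf.seek(offset);      // jump straight to the node, no re-parsing
            raf.readFully(buf);    // read exactly `length` bytes
            return new String(buf, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // ASCII-only sample, so character positions equal byte positions.
        String xml = "<lib><book><title>XML</title></book></lib>";
        Path file = Files.createTempFile("sample", ".xml");
        Files.write(file, xml.getBytes(StandardCharsets.UTF_8));

        // Hypothetical index entry for the <title> node (assumption: the
        // thesis records this pair while parsing; here it is computed directly).
        long offset = xml.indexOf("<title>");
        int length = "<title>XML</title>".length();

        System.out.println(readNode(file, offset, length)); // prints <title>XML</title>
        Files.delete(file);
    }
}
```

Because the lookup seeks directly to the stored offset, its cost depends on the size of the retrieved node and not on the size of the document.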
Disciplines
Computer Sciences | Physical Sciences and Mathematics
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Recommended Citation
Manandhar, Niroj, "Structure Based XML Indexing" (2005). Computer Science and Engineering Theses. 347.
https://mavmatrix.uta.edu/cse_theses/347
Comments
Degree granted by The University of Texas at Arlington