Graduation Semester and Year

2005

Language

English

Document Type

Thesis

Degree Name

Master of Science in Computer Engineering

Department

Computer Science and Engineering

First Advisor

Ramez Elmasri

Abstract

Semi-structured data has grown in use and popularity, and its efficient storage and retrieval is an active area of research. XML is a popular model and language for semi-structured data. In this thesis we focus on structure-based indexing of XML. As part of an ongoing XML indexing project, we study and implement the A(k)-index, a structure-based indexing technique, and propose using an (offset, length) pair to retrieve nodes of interest. We record the offset and length of every node using the SAX parser, and then use a random access file to retrieve the nodes selected by the A(k)-index from the XML file. The A(k)-index accurately supports all path expressions of length up to k, and the results are retrieved directly from the XML file. We also compare the performance of the indexing technique for different values of k.
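The approach described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it is written in Python rather than with the SAX parser and Random Access File named above, it uses the standard `xml.parsers.expat` module to obtain byte offsets, and it assumes tree-shaped data, where grouping nodes by the last k+1 labels of their root path gives the same partition as k-bisimilarity. The names `build_index` and `fetch` and the sample document are hypothetical.

```python
import os
import tempfile
import xml.parsers.expat
from collections import defaultdict


def build_index(xml_bytes, k):
    """Group elements by the last k+1 labels of their root path.

    Each group stores (offset, length) extents into xml_bytes.
    Assumption: tree-shaped XML, so this grouping stands in for
    the k-bisimilarity partition of the A(k)-index.
    """
    index = defaultdict(list)
    open_elements = []  # stack of (tag, byte offset of the start tag)
    parser = xml.parsers.expat.ParserCreate()

    def on_start(tag, attrs):
        open_elements.append((tag, parser.CurrentByteIndex))

    def on_end(tag):
        path = tuple(t for t, _ in open_elements)
        _, offset = open_elements.pop()
        # CurrentByteIndex is the '<' of the end tag; the element
        # ends just after the next '>'.
        end = xml_bytes.index(b">", parser.CurrentByteIndex) + 1
        index[path[-(k + 1):]].append((offset, end - offset))

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.Parse(xml_bytes, True)
    return index


def fetch(path, offset, length):
    """Read one node straight from the file, without re-parsing it."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


# Hypothetical sample document, written to a temporary file.
doc = b"<lib><book><title>A</title></book><cd><title>B</title></cd></lib>"
with tempfile.NamedTemporaryFile(suffix=".xml", delete=False) as tmp:
    tmp.write(doc)

index = build_index(doc, k=1)
titles_in_books = [fetch(tmp.name, off, n) for off, n in index[("book", "title")]]
print(titles_in_books)
os.remove(tmp.name)
```

With k = 0 the two `title` elements fall into one group, while k = 1 separates them by parent label. A larger k therefore yields a finer index that answers longer path expressions exactly, at the cost of more groups.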

Disciplines

Computer Sciences | Physical Sciences and Mathematics

Comments

Degree granted by The University of Texas at Arlington
