Graduation Semester and Year
2022
Language
English
Document Type
Thesis
Degree Name
Master of Science in Computer Science
Department
Computer Science and Engineering
First Advisor
Vassilis Athitsos
Abstract
Given a database of objects and a query object, it’s possible to retrieve a number of the closest neighbors to the query object. This operation is important to a number of diverse fields such as computer vision, content-based information retrieval, and chemistry. However, the distance measures used to determine neighbors can make queries computationally expensive, either because the distance measure is complex or because it is nonmetric and precludes efficient indexing methods. This work presents novel methods of triplet mining that enable neural networks using triplet loss to learn the manifold on which the data resides. These neural networks can learn to embed arbitrary data with arbitrary distance measures between them. Experiments are performed on an offline digit dataset, speech commands, and offline and online sign language data. Results demonstrate effectiveness over a baseline when a network architecture suited to the particular dataset is trained. When compared to other methods of topology preserving embeddings, the neural-network-based method outperforms them on all but one dataset. Results show that no particular method of triplet mining vastly outperforms the others, and the best method likely depends on the problem being addressed.
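As a rough illustration of the ideas named in the abstract, the sketch below shows a standard hinge-style triplet loss and a brute-force way to form triplets from an arbitrary distance measure. It is not the thesis's mining method: the function names (`euclidean`, `triplet_loss`, `mine_triplets`), the margin value, and the exhaustive enumeration are all assumptions made for illustration only.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embedded vectors (assumed embedding-space metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on already-embedded vectors.

    The loss is zero once the positive is closer to the anchor than the
    negative by at least `margin` (a hypothetical default of 1.0).
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def mine_triplets(objects, distance):
    """Enumerate index triplets (a, p, n) with distance(a, p) < distance(a, n).

    `distance` may be any callable, including an expensive or nonmetric one.
    This exhaustive O(n^3) enumeration is an illustrative stand-in, not one
    of the mining strategies proposed in the thesis.
    """
    triplets = []
    count = len(objects)
    for a in range(count):
        for p in range(count):
            for n in range(count):
                if len({a, p, n}) != 3:
                    continue  # anchor, positive, and negative must be distinct
                if distance(objects[a], objects[p]) < distance(objects[a], objects[n]):
                    triplets.append((a, p, n))
    return triplets


# Toy usage: three scalar "objects" with absolute difference as the original distance.
toy_objects = [0.0, 1.0, 5.0]
toy_triplets = mine_triplets(toy_objects, lambda x, y: abs(x - y))
```

A network trained to minimize this loss over such triplets would be pushed to keep the relative ordering of neighbors from the original distance measure in the embedding space, which is the sense in which an embedding can be topology preserving.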
Keywords
Triplet mining, Triplet loss, Neural networks, Dimensionality reduction
Disciplines
Computer Sciences | Physical Sciences and Mathematics
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Recommended Citation
Lary, Mason, "LEARNING TOPOLOGY PRESERVING EMBEDDINGS FOR SPEEDING UP NEAREST NEIGHBOR RETRIEVAL" (2022). Computer Science and Engineering Theses. 412.
https://mavmatrix.uta.edu/cse_theses/412
Comments
Degree granted by The University of Texas at Arlington