Mathematics Dissertations - Archive

NEIGHBORHOOD EMBEDDINGS AND SCALABLE LEARNING FOR OPTIMAL TRANSPORT AND UNBALANCED OPTIMAL TRANSPORT

Muhammad S. Rana, University of Texas at ArlingtonFollow

ORCID Identifier(s)

0009-0000-3352-2495

Graduation Semester and Year

Spring 2026

Language

English

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Mathematics

Department

Mathematics

First Advisor

Dr. Keaton Hamm

Second Advisor

Dr. Gaik Ambartsoumian

Third Advisor

Dr. Andrzej Korzeniowski

Fourth Advisor

Dr. Wei Jiang

Abstract

Dimensionality reduction techniques are developed from the assumption that high-dimensional data often arises from low-dimensional structures embedded into the high-dimensional ambient space. Classical dimensionality reduction methods rely on the Euclidean distance, which may fail to capture the geometric structures of the datasets. This dissertation includes alternative metrics for dimensionality reduction techniques and challenges in applying these techniques.

First, we investigate the Wasserstein distance based neighbor embeddings for dimensionality reduction methods and compare the classification and clustering performance with the classical Euclidean based methods. The Wasserstein distance models the data as probability distributions and compares two distributions applying optimal transport (OT) through mass transportation. In this work, we show that the Wasserstein distance based methods provide a more meaningful representation and often demonstrates better performance in downstream machine learning tasks such as classification and clustering.

Second, though the Wasserstein distance based neighbor embeddings often shows improved machine learning tasks performance over classical Euclidean methods, the limitations of the Wasserstein distance based methods are that the data are considered as probability distributions with equal mass 1. We propose neighbor em- beddings using unbalanced optimal transport (UOT) that relax the constraint of mass conservation. This relaxation enables robust comparisons between distributions with uneven mass. The study shows that on the MedMNIST datasets, UOT outperforms classical OT in classification and clustering 87.5% and 83.33% of the time and UOT outperforms both OT and Euclidean based methods 55.5% of the time.

Finally, we study the computational challenges associated with the Wasser- stein distance in large scale settings. We propose two algorithms that estimate the Wasserstein distance from a sample of a few entries. The first algorithm uniformly samples entries from the upper triangular Wasserstein distance matrix and uses iterative methods (Barzilai-Borwein descent) to estimate the matrices. The second algorithm samples O(d log d) random columns of the Wasserstein distance matrices and applies the Nystr¨om method for matrix completion. We show that the Nystr¨om method (Algorithm 2) can outperform the iterative method (Algorithm 1) and the Nystr¨om method is stable for the classification task even for only 10% of the sample columns.

Keywords

Unbalanced optimal transport, Hellinger– Kantorovich distance, manifold learning, measure-valued data, image processing, machine learning

Disciplines

Data Science

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Comments

I would like to express my deepest gratitude to my supervisor Dr. Keaton Hamm for his continuous support and guidance throughout my doctoral studies. Without his patience and exceptional mentorship, this thesis would not have been possible. I am thankful for his kindness, genuine care during my PhD journey. I am also grateful to Dr. Andrzej Korzeniowski, Dr. Gaik Ambartsoumian and Dr. Wei Jiang for their invaluable time and constructive suggestions.

I would like to thank Dr. Hristo V. Kojouharov, former graduate advisor and current chair of Mathematics department, for his availability and support from the beginning of my PhD studies. I am also grateful to the Mathematics department of The University of Texas at Arlington for funding my studies.

Special thanks to my collaborators for their collaboration and contribution to the results presented in this thesis. Thanks to Dr. Mohammad Atiqul Islam of the Computer Science and Engineering department of The University of Texas at Arlington and Dr. Zahidur Rahim Talukder, then a doctoral student of the Computer Science and Engineering of The University of Texas at Arlington and now an assistant professor of Texas Lutheran University, for the opportunity to collaborate on their project and develop the theoretical analysis of the project. Thanks to Dr. Abiy Tasissa of the department of Mathematical Sciences of Tufts University and Dr. Hanqin Cai of the School of Data, Mathematical, and Statistical Sciences of University of Central Florida for their invaluable guidance and collaboration. I also thank Phuong Trinh, Ryan Bui and Yakov Gavriyelov for their contribution in different research projects during my PhD studies through undergraduate capstone project.

The Army Research Office is acknowledged for their financial support and accomplished this work under Grant Number W911NF-23-1-0213.

Recommended Citation

Rana, Muhammad S., "NEIGHBORHOOD EMBEDDINGS AND SCALABLE LEARNING FOR OPTIMAL TRANSPORT AND UNBALANCED OPTIMAL TRANSPORT" (2026). Mathematics Dissertations - Archive. 273.
https://mavmatrix.uta.edu/math_dissertations/273

Download

Included in

Data Science Commons

COinS

Mathematics Dissertations - Archive

NEIGHBORHOOD EMBEDDINGS AND SCALABLE LEARNING FOR OPTIMAL TRANSPORT AND UNBALANCED OPTIMAL TRANSPORT

ORCID Identifier(s)

Graduation Semester and Year

Language

Document Type

Degree Name

Department

First Advisor

Second Advisor

Third Advisor

Fourth Advisor

Abstract

Keywords

Disciplines

License

Comments

Recommended Citation

Included in

Search

Browse

Author & Creator Corner

Links

Mathematics Dissertations - Archive

NEIGHBORHOOD EMBEDDINGS AND SCALABLE LEARNING FOR OPTIMAL TRANSPORT AND UNBALANCED OPTIMAL TRANSPORT

Author

ORCID Identifier(s)

Graduation Semester and Year

Language

Document Type

Degree Name

Department

First Advisor

Second Advisor

Third Advisor

Fourth Advisor

Abstract

Keywords

Disciplines

License

Comments

Recommended Citation

Included in

Share

Search

Browse

Author & Creator Corner

Links