Graduation Semester and Year
2019
Language
English
Document Type
Dissertation
Degree Name
Doctor of Philosophy in Electrical Engineering
Department
Electrical Engineering
First Advisor
Ioannis D Schizas
Abstract
This work addresses the unsupervised clustering of signals/data vectors based on their information content. The clustering problem is approached from a correlation-based perspective, relying on the high correlation between data vectors from the same class rather than on the position of the vectors in the data space. In the past, correlation-based clustering has been formulated using a canonical correlation framework or as a matrix factorization problem, and has been solved with different variants of gradient descent. This work focuses on improving clustering performance by modifying the framework to utilize non-linear associations or correlations. To this end, kernelized variants of both correlation-based frameworks are presented. We also propose an unsupervised kernel learning framework that performs on par with state-of-the-art supervised kernel learning methods. The proposed method uses a novel eigenvalue maximization framework to learn the convex combination of a dictionary of kernels that is best suited to the correlation-based clustering approach. A joint non-negative matrix factorization based clustering and kernel learning framework is also proposed. Under certain assumptions, the joint formulation is guaranteed to find the ideal combination of kernels for correlation-based clustering. We also establish the convergence of the proposed formulation to a stationary point. Going beyond kernel-based non-linear maps/associations, we propose two unsupervised deep learning methods that map the data vectors from the data space to a feature space in which within-class vectors are highly correlated while vectors across classes are uncorrelated. As part of this work, we use different optimization approaches, such as mixed integer programming (MIP) and majorize-minimize (MM) algorithms, to solve the resulting non-convex problems.
The methods developed as part of this research have been applied to a variety of datasets, including data from wireless sensor networks (WSN), remote sensing, and human activity classification, and the results have been compared with state-of-the-art algorithms.
Keywords
Correlation analysis, Kernel learning, Clustering, Matrix factorization
Disciplines
Electrical and Computer Engineering | Engineering
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Recommended Citation
Malhotra, Akshay, "KERNELS AND BEYOND FOR DATA SIMILARITY LEARNING IN DATA MINING" (2019). Electrical Engineering Dissertations. 336.
https://mavmatrix.uta.edu/electricaleng_dissertations/336
Comments
Degree granted by The University of Texas at Arlington