Author

Shuai Zheng

Graduation Semester and Year

2017

Language

English

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Computer Science

Department

Computer Science and Engineering

First Advisor

Chris Ding

Abstract

Machine learning is now widely used in engineering, science, finance, healthcare, and other fields. In this dissertation, we make several advances in machine learning for high-dimensional data analysis, image data classification, recommender systems, and classification algorithms.
In the big data era, much data is high dimensional and therefore difficult to analyze. We propose two efficient Linear Discriminant Analysis (LDA) based methods to reduce data to low dimensions. Kernel alignment measures the degree of similarity between two kernels. We propose kernel alignment inspired LDA, which finds a subspace that maximizes the alignment between the subspace-transformed data kernel and the class indicator kernel. Classical LDA uses the arithmetic mean of all between-class distances, which has two limitations. First, a large between-class distance can dominate the arithmetic mean. Second, the arithmetic mean does not consider pairwise between-class distances, so some classes may overlap with each other in the subspace. We propose harmonic mean based LDA to overcome these limitations of classical LDA.
Low-rank models can capture the correlations between data. We propose an efficient low-rank regression model for image and website classification and a regularized Singular Value Decomposition (SVD) model for recommender systems. Real-life data often includes information from different channels; these different aspects or channels of the same object are called multi-view data. In this work, we propose a multi-view low-rank regression model that imposes low-rank constraints on multi-view data, and we provide a closed-form solution to this model. Recommender systems are very important for online advertising, online shopping, social networks, and similar applications, and regularization is increasingly used in recent applications. We present a regularized SVD (RSVD) model for recommender systems that improves on standard SVD based models.
Support Vector Machine (SVM) is an efficient classification approach that finds a hyperplane to separate data from different classes. This hyperplane is determined by support vectors, and the number of support vectors is a measure of generalization error. In existing SVM formulations, the objective function uses the L2 norm or L1 norm on slack variables. In this work, we propose a Minimal SVM, which uses the L0.5 norm on slack variables. The resulting model further reduces the number of support vectors and improves classification performance.
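As a rough illustration of two ideas named in the abstract, the sketch below computes the standard kernel alignment score (Frobenius inner product of two kernel matrices divided by the product of their Frobenius norms) between a linear data kernel and a class indicator kernel, and then contrasts the arithmetic and harmonic means of a set of distances. It is not taken from the dissertation: the function names `kernel_alignment` and `class_indicator_kernel`, the toy data, and the choice of a linear kernel are all assumptions made for this example, and the dissertation's actual LDA objectives may differ.

```python
import numpy as np

def kernel_alignment(K1, K2):
    """Standard kernel alignment: <K1, K2>_F / (||K1||_F * ||K2||_F)."""
    inner = np.sum(K1 * K2)  # Frobenius inner product
    return inner / (np.linalg.norm(K1) * np.linalg.norm(K2))

def class_indicator_kernel(labels):
    """Hypothetical helper: entry (i, j) is 1 if samples i and j share a class."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

# Toy data (assumed for illustration): two well-separated classes in 2-D.
X = np.array([[0.0, 1.0], [0.1, 0.9], [1.0, 0.0], [0.9, 0.1]])
y = [0, 0, 1, 1]

K_data = X @ X.T                       # linear kernel on the data
K_class = class_indicator_kernel(y)    # class indicator kernel
alignment = kernel_alignment(K_data, K_class)

# Arithmetic vs. harmonic mean of made-up between-class distances:
# one large distance dominates the arithmetic mean but not the harmonic mean.
distances = np.array([1.0, 1.0, 100.0])
arithmetic_mean = distances.mean()
harmonic_mean = len(distances) / np.sum(1.0 / distances)

print(f"alignment = {alignment:.3f}")
print(f"arithmetic mean = {arithmetic_mean:.3f}, harmonic mean = {harmonic_mean:.3f}")
```

In this toy setting the single large distance pulls the arithmetic mean far above the two small distances, while the harmonic mean stays close to them, which is the kind of behavior the abstract's first limitation refers to.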

Keywords

Machine learning, Linear discriminant analysis, Multi-view data, Low-rank, Regression, Singular value decomposition, Recommender system, Support vector machines

Disciplines

Computer Sciences | Physical Sciences and Mathematics

Comments

Degree granted by The University of Texas at Arlington
