Jia Chen

ORCID Identifier(s)


Graduation Semester and Year




Document Type


Degree Name

Doctor of Philosophy in Electrical Engineering


Electrical Engineering

First Advisor

Ioannis D Schizas


In many data acquisition applications, such as sensor networks, the acquired sensor measurements contain information about multiple sources placed at different spatial locations. Such sources could correspond to, e.g., thermal sources or transmitters located at different positions inside the sensed field. Before performing any statistical inference task, it is essential to identify which groups of sensors acquire observations that contain information about the same sources, so as to avoid "mixing" observations that contain information about uncorrelated sources. In this thesis, the goal is to cluster sensors into different groups based on their information content about the field sources, and to isolate sensors acquiring only noise. Two scenarios are considered: in the first, the number of sources is known to the sensors; in the second, the number of sources is unknown.

Toward this end, for the first scenario, a novel canonical correlation analysis (CCA) framework equipped with sparsity-inducing norm-one regularization is introduced to identify correlated sensor measurements and informative groups of sensors. It is established that the novel framework is capable of clustering sensors correctly (with probability one) based on their source content, even in nonlinear settings and when sources do not overlap. Block coordinate descent (BCD) techniques are employed to derive a centralized algorithm that solves the sparsity-aware CCA formulation. This formulation is then recast as a separable optimization program, which is tackled in a distributed fashion via the alternating direction method of multipliers (ADMM). A computationally efficient distributed algorithm capable of processing sensor data online is further derived. Extensive numerical tests corroborate that the novel techniques outperform existing alternatives.

Furthermore, for the second scenario, where the number of sources is not available, two strategies are provided. In the first strategy, the traditional CCA framework is equipped with norm-one and norm-two regularization terms in order to cluster the sensor data while determining the number of field sources. ADMM and BCD techniques are utilized to derive centralized and distributed algorithms tackling the proposed regularized CCA framework. The ability of the novel regularized CCA algorithm to cluster sensors correctly is verified in heterogeneous sensing systems, which consist of sensors with different sensing capabilities and thereby offer flexibility and provide multiple views of the sensed field by acquiring different types of measurements. In the second strategy, principal component analysis (PCA) combined with moving-average (MA) filtering is utilized to eliminate the sensing noise variance and extract the number of principal components in the sensor data covariance corresponding to the uncorrelated sources. Given the estimated number of sources, two applications are considered. In the first application, a novel communication-efficient scheme for reconstructing a field sensed by spatially scattered sensors is put forth, which relies on norm-one regularized CCA, PCA, and normalized least mean-square adaptive filtering. In the second application, a multiset CCA (M-CCA) framework is proposed to uncover information in multiple heterogeneous sensor data sets and cluster sensors according to their source content.


Canonical correlation analysis, Clustering sensor data, Sparsity, Field reconstruction, Heterogeneous sensor network, ADMM, BCD, Distributed fashion, Online processing, Adaptive implementation.


Electrical and Computer Engineering | Engineering


Degree granted by The University of Texas at Arlington