ORCID Identifier(s)

0000-0003-4488-2706

Graduation Semester and Year

2020

Language

English

Document Type

Thesis

Degree Name

Master of Science in Computer Science

Department

Computer Science and Engineering

First Advisor

Sharma Chakravarthy

Abstract

Substructure discovery is well-researched for single graphs (both simple and attribute) as it is an important component of knowledge discovery for many applications such as finding the core substructure in a protein, important concept in a large graph, etc. However, multilayer networks or MLNs (instead of attribute graphs) have been shown to be better for modeling complex data sets that have multiple entity and feature types. This model provides more clarity on semantics of data sets as well as the ability to use an arbitrary subset of layers for analysis. However, the challenge is that many algorithms such as community and centrality detection as well as substructure discovery need to be extended to MLN representation. With the representation of complex data sets as MLNs brings new challenges in terms of fi nding substructures in MLNs or a subset of MLNs. A naive approach would be to collapse (or aggregate) all (or a subset of) layers into a single attribute graph and use extant algorithms. There are a number of substructure discovery algorithms ranging from memory-based, disk-based, SQL-based, and partitioned using map/reduce framework. While substructure discovery has been widely used for the analysis of single networks, attribute graphs, and forests, the development of an efficient substructure discovery methods for multilayer networks without aggregation is currently not available. This thesis proposes a new decoupling approach-based substructure discovery algorithm for homogeneous MLNs (or HoMLNs). HoMLNs are MLNs where each layer has the same set of nodes but different connectivity in each layer. In this dissertation, we propose a decoupling approach-based algorithm for HoMLNs where aggregation is not needed. Further, the algorithm has been implemented using the Map/Reduce framework in order to handle arbitrary number of layers and improving the response time through parallelism. Each layer is processed individually/separately in parallel, but the substructures generated for each layer are combined after each iteration to identify substructures across layers (or MLN). The focus is on correctness of the algorithm and resource utilization with respect to number of layers. The proposed algorithm is validated analytically as well through extensive experimental analysis on large real-world and synthetic graphs.

Keywords

Substructure discovery, Multilayer networks, Graph mining, MapReduce

Disciplines

Computer Sciences | Physical Sciences and Mathematics

License

This work is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 4.0 International License.

Comments

Degree granted by The University of Texas at Arlington

Recommended Citation

Rai, Anish, "MLN-Subdue: Decoupling Approach-Based Substructure Discovery In Multilayer Networks (MLNs)" (2020). Computer Science and Engineering Theses - Archive. 445.
https://mavmatrix.uta.edu/cse_theses/445

29135-2.zip (2967 kB)

Download

Included in

Computer Sciences Commons

COinS

Computer Science and Engineering Theses - Archive

MLN-Subdue: Decoupling Approach-Based Substructure Discovery In Multilayer Networks (MLNs)