Graduation Semester and Year
2018
Language
English
Document Type
Thesis
Degree Name
Master of Science in Computer Science
Department
Computer Science and Engineering
First Advisor
Chengkai Li
Abstract
Many applications rely on large entity graphs. Because such graphs are available from numerous data sources and serve a wide range of applications, selecting a single entity graph for a particular need is challenging. A preview table offers a compact, comprehensible overview of an entity graph; each preview table represents a single entity type in the dataset. To show the coverage of a dataset, we need to find representative entities for a given entity type. In this thesis, we propose a method to find these representative entities from the entity graph. Each entity of a given type is represented by a multi-dimensional label vector built from its neighborhood nodes. We apply the k-means clustering algorithm to the label vectors of entities of the same type, dividing the entities into k disjoint clusters. The entity nearest to the centroid of each cluster is chosen as a representative entity for that type. In experiments on the Freebase dataset, the method produced diverse and important representative entities for the TV, film, and location domains. These representative entities can be used to generate preview tables, which help data workers understand the coverage of a particular entity type in the dataset.
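The pipeline described in the abstract (label vectors, k-means, entity nearest to each centroid) can be illustrated with a minimal sketch. The abstract does not spell out how the label vectors are built, so this sketch assumes each vector simply counts the edge labels in an entity's neighborhood. The toy graph `GRAPH`, the function names, and the choice to seed the centroids with the first k points are all hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter

# Hypothetical toy graph: each film entity maps to (edge label, neighbor) pairs.
GRAPH = {
    "f1": [("actor", "p1"), ("actor", "p2"), ("director", "d1")],
    "f2": [("actor", "p3"), ("director", "d2")],
    "f3": [("actor", "p1"), ("actor", "p3"), ("actor", "p4"), ("director", "d1")],
    "f4": [("award", "a1"), ("award", "a2"), ("director", "d3")],
    "f5": [("award", "a3"), ("director", "d2")],
    "f6": [("award", "a1"), ("award", "a3"), ("award", "a4"), ("director", "d3")],
}

def label_vectors(graph):
    """Assumed construction: count each edge label in an entity's neighborhood."""
    labels = sorted({label for edges in graph.values() for label, _ in edges})
    vectors = {}
    for entity, edges in graph.items():
        counts = Counter(label for label, _ in edges)
        vectors[entity] = [float(counts[label]) for label in labels]
    return vectors

def kmeans(points, k, iterations=100):
    """Plain Lloyd's k-means; the first k points seed the centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    assignment = None
    for _ in range(iterations):
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # assignments are stable, so the centroids are too
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment

def representatives(graph, k):
    """Return, for each non-empty cluster, the entity nearest to its centroid."""
    vectors = label_vectors(graph)
    entities = sorted(vectors)
    points = [vectors[e] for e in entities]
    centroids, assignment = kmeans(points, k)
    reps = []
    for c, centroid in enumerate(centroids):
        members = [e for e, a in zip(entities, assignment) if a == c]
        if members:
            reps.append(min(members, key=lambda e: math.dist(vectors[e], centroid)))
    return reps

print(representatives(GRAPH, 2))  # ['f1', 'f4']
```

On this toy input the two clusters separate actor-heavy films from award-heavy films, and one film from each group is returned. A real entity graph would need a more careful vector construction and centroid initialization than this sketch assumes.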
Keywords
Representative entities, Entity similarity, Graph mining, Entity graph
Disciplines
Computer Sciences | Physical Sciences and Mathematics
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Recommended Citation
Shingavi, Ankit Anil, "Finding Representative Entities From Entity Graph By Using Neighborhood Based Entity Similarity" (2018). Computer Science and Engineering Theses. 458.
https://mavmatrix.uta.edu/cse_theses/458
Comments
Degree granted by The University of Texas at Arlington