Graduation Semester and Year

2014

Language

English

Document Type

Thesis

Degree Name

Master of Science in Computer Science

Department

Computer Science and Engineering

First Advisor

Chengkai Li

Abstract

Entity linking combines data collected from multiple sources into a single global dataset that can then be queried. It also enables knowledge discovery on this global dataset, which may reveal interesting facts and information. Microsoft Academic Search (MAS) is a free public search engine for academic papers; it contains bibliographic information for papers published in journals and conference proceedings, along with their citations. As of February 2014, it had indexed over 40 million publications and 20 million authors. LinkedIn is a social networking service for professional networking, with an estimated 259 million users worldwide. Linking an author in MAS to the corresponding person on LinkedIn produces a larger dataset, from which we can derive further information about an author, such as the author's educational institutions, previous work experience, and social groups. In effect, we collect the pieces of information about an author that are missing from MAS and obtain them from LinkedIn to form a more extensive dataset. In the process, we resolve the ambiguity that arises when multiple people share the author's name by classifying the candidates. Our experimental results indicate that precision rises with the threshold, reaching 98% at a threshold of 2.8.

Disciplines

Computer Sciences | Physical Sciences and Mathematics

Comments

Degree granted by The University of Texas at Arlington
