Graduation Semester and Year
2015
Language
English
Document Type
Thesis
Degree Name
Master of Science in Information Systems
Department
Information Systems and Operations Management
First Advisor
Riyaz Sikora
Abstract
Exploratory data analysis (EDA) is an iterative process in which analysts repeatedly ‘ask questions’ of data and extract knowledge from it. EDA is becoming increasingly important in modern data analysis, such as business analytics and business intelligence, because it greatly relaxes the statistical assumptions required by its counterpart, confirmatory data analysis (CDA), and involves analysts directly in the data mining process. However, exploratory visual analysis, the central part of EDA, requires heavy data manipulation and tedious visual specification, which can impede the EDA process if the analyst has no guidelines to follow. In this paper, we present a framework for visual data exploration organized by the type of variable given, using the effectiveness and expressiveness rules of visual encoding design developed by Munzner [1] as guidelines, in order to facilitate the EDA process. A classification problem on the Titanic data is also provided to demonstrate how visual exploratory analysis facilitates the data mining process by increasing prediction accuracy. In addition, we classify prevailing data visualization technologies, including the layered grammar of ggplot2 [2], the VizQL of Tableau [3], d3 [4] and Shiny [5], as grammar-based or web-based, and review their adaptability for EDA, since EDA is discovery-oriented and analysts must be able to quickly change both what data they are viewing and how they are viewing it.
Keywords
Exploratory data analysis, Data visualization, Data mining
Disciplines
Business | Management Information Systems
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Recommended Citation
Mao, Yingsen, "DATA VISUALIZATION IN EXPLORATORY DATA ANALYSIS: AN OVERVIEW OF METHODS AND TECHNOLOGIES" (2015). Information Systems & Operations Management Theses. 3.
https://mavmatrix.uta.edu/infosystemsopmanage_theses/3