Graduation Semester and Year
Spring 2024
Language
English
Document Type
Dissertation
Degree Name
Doctor of Philosophy in Computer Science
Department
Computer Science and Engineering
First Advisor
Dr. Cesar Torres
Second Advisor
David Levine
Third Advisor
Dr. Manfred Huber
Fourth Advisor
Dr. Ming Li
Abstract
Benchmark datasets are critical to the evolution of AI efforts, yet they often embed unintended biases that influence the models that drive human-AI interactions. Closer inspection and greater awareness of data are needed to understand the biases datasets may contain. In this dissertation, I introduce the Tag-and-Release method, inspired by wildlife research, which treats data as an organism and examines how different environments (i.e., CNNs) select for unique traits or characteristics that ultimately affect the data's survival. Using the canonical MNIST handwritten digit dataset as a case study, I describe how the Tag-and-Release method can be used to analyze how dataset imbalance biases propagate into different neural architectures. I demonstrate how the technique can be scaled to coordinate data inspection efforts with crowd workers who annotate the dataset. Using the tagged data, I developed explainable AI interventions through a user study with machine learning students. I present my findings for developing balanced and fair datasets, stimulate discussion about models as ecosystems, and advocate for a data conservatory for coordinated efforts to support explainable AI initiatives within intelligent systems.
Keywords
explainable AI, model analysis, data labeling, tagging, MNIST, ML education, ML intervention
Disciplines
Artificial Intelligence and Robotics | Graphics and Human Computer Interfaces
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Recommended Citation
Zaman, Akib, "Living Datasets: Towards Data-Centric AI Explainability and Bias Mitigation" (2024). Computer Science and Engineering Dissertations. 3.
https://mavmatrix.uta.edu/cse_dissertations/3