ORCID Identifier(s)

ORCID 0009-0009-8746-0600

Graduation Semester and Year

Fall 2024

Language

English

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Computer Science

Department

Computer Science and Engineering

First Advisor

Jeff Lei

Abstract

Machine learning (ML) algorithms are changing many aspects of modern life by analyzing data, identifying patterns, and making predictive decisions across industries such as healthcare, transportation, finance, and e-commerce. However, ML models often operate as "black boxes," making it difficult to interpret their decision-making processes. This lack of transparency creates challenges in testing, debugging, and understanding model behavior, which undermines user trust and raises concerns about accountability, reliability, and fairness in high-stakes applications.

Explainable Artificial Intelligence (XAI) aims to address these challenges by providing tools and methods that explain the decision-making processes of ML models in a way that is understandable to humans. XAI helps by answering questions about model behavior, such as why a decision was made and what changes could affect future decisions. Interpretability is essential for validating model reliability, detecting biases, and supporting user trust. Testing and debugging ML models are important for maintaining reliable performance, as these models depend on the quality of their training data. Data issues, such as feature noise or mislabeled data (class noise), can degrade model accuracy and lead to misclassifications. Identifying and addressing these data problems is necessary for effective model performance.
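
As a concrete illustration of the class-noise problem described above, the following sketch (not part of the dissertation) flips a fraction of training labels in a synthetic scikit-learn dataset and measures the resulting test accuracy; the dataset, classifier, and noise rates are illustrative assumptions.

```python
# Minimal sketch (illustrative only): injecting class noise (label flips)
# into a training set and observing the effect on test accuracy.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

def accuracy_with_label_noise(noise_rate, rng=np.random.default_rng(0)):
    """Flip a fraction of training labels (class noise) and report test accuracy."""
    y_noisy = y_tr.copy()
    n_flip = int(noise_rate * len(y_noisy))
    idx = rng.choice(len(y_noisy), size=n_flip, replace=False)
    y_noisy[idx] = 1 - y_noisy[idx]          # binary labels: flip 0 <-> 1
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_noisy)
    return accuracy_score(y_te, model.predict(X_te))

for rate in (0.0, 0.1, 0.3):
    print(f"label noise {rate:.0%}: test accuracy = {accuracy_with_label_noise(rate):.3f}")
```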

Testing ML models involves evaluating their performance, reliability, and robustness by analyzing behavior across diverse inputs and scenarios. Unlike traditional software with explicitly defined rules, ML models rely on data-driven abstractions to make decisions, so testing approaches must capture essential feature interactions and assess model responses comprehensively. These feature interactions play a crucial role in shaping model outcomes, and their learned nature adds complexity to testing, demanding adaptive methods that go beyond fault identification to evaluate how interactions affect predictions, as sketched below. Consequently, ML testing requires specialized techniques to validate these systems effectively. This dissertation explores approaches for enhancing transparency and reliability in ML-based AI systems, with a focus on testing, debugging, and explainability.
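
The notion of covering feature interactions can be made concrete with a small, hypothetical example. The sketch below (illustrative only, not the dissertation's technique) enumerates the 2-way value combinations that a combinatorial test set over discretized model inputs would need to cover; the feature names and value domains are invented for illustration.

```python
# Minimal sketch (illustrative only): the 2-way (pairwise) value combinations
# that a combinatorial test set for an ML model's discretized inputs must cover.
from itertools import combinations, product

# Hypothetical discretized feature domains for a loan-approval model.
features = {
    "age_bucket":   ["<30", "30-60", ">60"],
    "income_level": ["low", "medium", "high"],
    "has_default":  ["yes", "no"],
}

# Every pair of features, and every combination of their values, must appear
# in at least one test input for 2-way (pairwise) coverage.
required = {
    ((f1, v1), (f2, v2))
    for f1, f2 in combinations(features, 2)
    for v1, v2 in product(features[f1], features[f2])
}
print(f"{len(required)} two-way combinations must be covered")

def covered(test_inputs):
    """Return the required 2-way combinations that a set of test inputs covers."""
    hit = set()
    for t in test_inputs:
        for f1, f2 in combinations(features, 2):
            hit.add(((f1, t[f1]), (f2, t[f2])))
    return required & hit

tests = [
    {"age_bucket": "<30", "income_level": "low", "has_default": "yes"},
    {"age_bucket": "30-60", "income_level": "high", "has_default": "no"},
]
print(f"{len(covered(tests))}/{len(required)} combinations covered by 2 tests")
```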

This dissertation is presented in an article-based format and comprises five research papers. The first paper reports on a software debugging-based approach, DeltaExplainer, for generating counterfactual explanations of ML model predictions. The second paper reports on Proxima, a proxy model-based method for influence analysis. The third paper evaluates the effects of various model compression techniques on the accuracy and efficiency of influence analysis. The fourth paper presents DeltaRepair, an approach for debugging ML models by identifying mislabeled training instances through influence analysis and delta debugging. The fifth paper explores surrogate model construction using combinatorial testing and active learning to efficiently capture the essential feature interactions that drive model predictions.
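
To make the recurring notion of influence analysis concrete, the following sketch (not Proxima or DeltaRepair, which are designed to avoid this brute-force cost) estimates the influence of each training point on a single test prediction by leave-one-out retraining of a scikit-learn classifier; the dataset, model, and sign convention are illustrative assumptions.

```python
# Minimal sketch (illustrative only): leave-one-out influence of each training
# point on one test prediction. Real influence-analysis methods approximate
# this without retraining the model once per training point.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=10, random_state=1)
x_test, y_test = X[-1], y[-1]
X_tr, y_tr = X[:-1], y[:-1]

def test_loss(model):
    # Negative log-likelihood of the true label for the held-out test point.
    p = model.predict_proba(x_test.reshape(1, -1))[0, y_test]
    return -np.log(p)

base = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
base_loss = test_loss(base)

influences = []
for i in range(len(X_tr)):
    keep = np.arange(len(X_tr)) != i
    m = LogisticRegression(max_iter=1000).fit(X_tr[keep], y_tr[keep])
    # Positive value: removing point i lowers the test loss, i.e. the point
    # was hurting this prediction -- a candidate mislabeled instance.
    influences.append(base_loss - test_loss(m))

suspects = np.argsort(influences)[::-1][:5]
print("most harmful training points:", suspects)
```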

Keywords

Machine Learning (ML), Explainable Artificial Intelligence (XAI), Testing and Debugging ML Models, Interpretability, AI-Based Software Systems, Trustworthy AI, Robustness in ML Systems, Transparency in AI Systems

Disciplines

Artificial Intelligence and Robotics | Software Engineering
