Graduation Semester and Year
Fall 2024
Language
English
Document Type
Dissertation
Degree Name
Doctor of Philosophy in Computer Science
Department
Computer Science and Engineering
First Advisor
Jeff Lei
Abstract
Machine learning (ML) algorithms are changing many aspects of modern life by analyzing data, identifying patterns, and making predictive decisions across industries such as healthcare, transportation, finance, and e-commerce. However, ML models often operate as "black boxes," making it difficult to interpret their decision-making processes. This lack of transparency complicates testing, debugging, and understanding model behavior, undermining user trust and raising concerns about accountability, reliability, and fairness in high-stakes applications.
Explainable Artificial Intelligence (XAI) aims to address these challenges by providing tools and methods that explain the decision-making processes of ML models in a way that humans can understand. XAI answers questions about model behavior, such as why a particular decision was made and what changes would lead to a different outcome. Interpretability is necessary for validating model reliability, uncovering biases, and supporting user trust. Testing and debugging ML models are equally important for maintaining reliable performance, because these models depend heavily on the quality of their training data. Data issues such as feature noise or mislabeled data (class noise) can degrade model accuracy and lead to misclassifications, so identifying and addressing these data problems is essential for effective model performance.
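A minimal sketch of the class-noise effect described above, using an illustrative scikit-learn setup that is assumed for this example and is not drawn from the dissertation's experiments: a fraction of training labels is flipped and held-out accuracy is compared against the clean-label baseline.

```python
# Illustrative sketch only: synthetic data, logistic regression, and a 25% flip
# rate are assumptions chosen for demonstration, not the dissertation's setup.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Introduce class noise: mislabel 25% of the training instances.
rng = np.random.default_rng(0)
noisy = y_tr.copy()
flip = rng.choice(len(noisy), size=int(0.25 * len(noisy)), replace=False)
noisy[flip] = 1 - noisy[flip]

clean_acc = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).score(X_te, y_te)
noisy_acc = LogisticRegression(max_iter=1000).fit(X_tr, noisy).score(X_te, y_te)
print(f"clean-label accuracy: {clean_acc:.3f}, with 25% class noise: {noisy_acc:.3f}")
```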
Testing ML models involves evaluating their performance, reliability, and robustness by analyzing behavior across diverse inputs and scenarios. Unlike traditional software with explicitly defined rules, ML models rely on data-driven abstractions to make decisions, so testing approaches must capture essential feature interactions and assess model responses comprehensively. These interactions play a crucial role in shaping model outcomes, and because they are learned rather than specified, testing must go beyond fault identification to evaluate how they affect predictions. Consequently, ML testing demands specialized techniques to validate these systems effectively. This dissertation explores approaches for enhancing transparency and reliability in ML-based AI systems, with a focus on testing, debugging, and explainability.
This dissertation is presented in an article-based format and comprises five research papers. The first paper reports on DeltaExplainer, a software debugging-based approach for generating counterfactual explanations of ML model predictions. The second paper reports on Proxima, a proxy model-based method for influence analysis. The third paper evaluates the effects of various model compression techniques on the accuracy and efficiency of influence analysis. The fourth paper presents DeltaRepair, an approach that debugs ML models with respect to mislabeled training instances by combining influence analysis with delta debugging. The fifth paper explores surrogate model construction using combinatorial testing and active learning to efficiently capture the essential feature interactions that drive model predictions.
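As a high-level illustration of the counterfactual-explanation idea that the first paper builds on, the sketch below searches for the smallest single-feature change that flips a trained classifier's prediction. The dataset, model, and exhaustive grid search over perturbations are assumptions made for this sketch; it is not the DeltaExplainer algorithm described in the dissertation.

```python
# Generic counterfactual search sketch (illustrative assumptions throughout):
# for a given input, try increasing perturbations of each feature and report
# the smallest single-feature change that changes the model's prediction.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)
model = LogisticRegression(max_iter=1000).fit(X, y)

def single_feature_counterfactual(x, step=0.1, max_delta=3.0):
    """Return (original prediction, (feature index, signed change)) or (pred, None)."""
    original = model.predict(x.reshape(1, -1))[0]
    best = None  # smallest flipping change found so far
    for j in range(x.shape[0]):
        found = False
        for delta in np.arange(step, max_delta + step, step):
            for sign in (1.0, -1.0):
                candidate = x.copy()
                candidate[j] += sign * delta
                if model.predict(candidate.reshape(1, -1))[0] != original:
                    if best is None or delta < abs(best[1]):
                        best = (j, sign * delta)
                    found = True
                    break
            if found:
                break  # smallest flipping delta for this feature; try next feature
    return original, best

pred, cf = single_feature_counterfactual(X[0])
print(f"prediction={pred}, minimal single-feature counterfactual={cf}")
```

The printed tuple reads as "changing feature j by this amount flips the prediction," which is the kind of actionable, instance-level explanation that counterfactual methods aim to provide.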
Keywords
Machine Learning (ML), Explainable Artificial Intelligence (XAI), Testing and Debugging ML Models, Interpretability, AI-Based Software Systems, Trustworthy AI, Robustness in ML Systems, Transparency in AI Systems
Disciplines
Artificial Intelligence and Robotics | Software Engineering
License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Recommended Citation
Shree, Sunny, "Leveraging Software Testing Techniques to Explain, Analyze, and Debug Machine Learning Models" (2024). Computer Science and Engineering Dissertations. 400.
https://mavmatrix.uta.edu/cse_dissertations/400