ORCID Identifier(s)

0000-0001-9166-2365

Graduation Semester and Year

2023

Language

English

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Computer Science

Department

Computer Science and Engineering

First Advisor

Gautam Das

Abstract

With the advent of advanced computational models, we are being constantly judged by AI systems, complex algorithmic systems based on data that has been collected about us. These analysis are critical as they span many wide spread areas of our lives. For instance, these systems have been shown to find effective ways to fight back and make informed decisions during the COVID-19 pandemic. The wide spread use of these models naturally give rise to a few questions regarding explaining decisions that these systems have made, fairness questions in various parts of these systems. In this dissertation, we present three important problems pertaining to interpret-ability and explain-ability in interactive black-box systems. User preference is often complex to express and quantify. In our first work in the dissertation, we present the problem of finding a small set of items from a data-set. These set of items minimize the user's regret of missing out on the top item from the whole data-set. Such a model also presents an opportunity for interactively understanding the user's preference and providing the user with the top item in the data-set. Our work presents the novel framework to deal with a class of regret measures. Use of data which is riddled with representative biases leads to unfair outcomes. In our second word, we present the problem of integrating fairness into range queries. More importantly, our goal is to eliminate representative biases from the output data which often is consumed by complex systems which can make critical decisions. In our work, we present a innovative index structure to deal with the single predicate range queries and a neighborhood based search to deal with multi-predicate queries. In our third work in this dissertation, we present the problem of explain-ability in black box matching systems. Our approach relies on Shapley values based methods for explanations. To speed the slow nature of Shapley algorithm, we present a sampling based approximate approach with theoretical guarantees. We present a few common problems that arise in these bipartite black-box matching systems. For each of these problems, we illustrate the design of the Shapley value function. For each of these problems, we discuss our approach to the problem, analyze its behavior and empirically evaluate our procedure. Our extensive experimental results show the efficacy and efficiency of our techniques.

Keywords

Fairness, Responsible data management

Disciplines

Computer Sciences | Physical Sciences and Mathematics

License

This work is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 4.0 International License.

Comments

Degree granted by The University of Texas at Arlington

Recommended Citation

Shetiya, Suraj, "COMPACT REPRESENTATIVES OF DATABASES AND RESPONSIBLE DATA MANAGEMENT" (2023). Computer Science and Engineering Dissertations - Archive. 292.
https://mavmatrix.uta.edu/cse_dissertations/292

31725-2.zip (2111 kB)

Download

Included in

Computer Sciences Commons

COinS

Computer Science and Engineering Dissertations - Archive

COMPACT REPRESENTATIVES OF DATABASES AND RESPONSIBLE DATA MANAGEMENT